Are you planning to implement ITIL? Do you handle Change Requests? Are you measuring the risk associated with a given RFC (Request for change)?
IT leaders across the world struggle to find the right methodology for quantifying the "Risk Element" associated with a change request. Customers use tools such as CA Service Desk, Service Now, ServiceDesk Plus, LanDesk, BOSS, BMC Remedy, or other products to manage their ITIL change requests.
Primary Attributes of RFC
Irrespective of the tool, or even if you track changes manually in XLS sheets, you need to look for 5 primary attributes in every RFC:
- Change Impact
- Impact if change is not performed
- Testing Status
- Outage period
- Backout period
Each attribute carries a "weight" based on its importance and helps in calculating the Risk Score. During CAB (Change Advisory Board) or CMC (Change Management Committee) meetings, this Risk Score can be used to make informed decisions.
Risk Score Calculation
Our Magic Formula calculates the Risk Score by multiplying the Risk Rating of each primary attribute by the weight of that attribute and adding up the results. You can tune the Risk Score by changing the weights.
1. Change Impact — [weight: 3]
Every RFC has some kind of impact on the environment, affecting anywhere from a few individuals to everyone. We have to capture the risk rating during the Change Request input process, for example when a functional manager requests a change or a technician moves an incident to the change queue.
Change Impact values could be:
- Minimal Impact (1)
- Significant – but not critical (2)
- Critical business systems impacted (3)
- Major Impact (4)
The number in parentheses is the Risk Rating. The weight of Change Impact is 3.
Change Impact Score = Weight x Risk Rating
In the case of Minimal Impact, Change Impact Score = 3 x 1 = 3; in the case of Major Impact, Change Impact Score = 3 x 4 = 12.
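As a minimal sketch of the per-attribute calculation above, the following Python snippet multiplies a weight by a Risk Rating. The function name `attribute_score` and the rating check are illustrative assumptions; they are not part of any service desk product.

```python
def attribute_score(weight: float, risk_rating: int) -> float:
    """Return weight x risk rating for a single RFC attribute."""
    # Assumption: ratings are restricted to the 1-4 scale used in this article.
    if risk_rating not in (1, 2, 3, 4):
        raise ValueError("risk rating must be between 1 and 4")
    return weight * risk_rating


CHANGE_IMPACT_WEIGHT = 3  # weight given above for Change Impact

minimal = attribute_score(CHANGE_IMPACT_WEIGHT, 1)  # Minimal Impact
major = attribute_score(CHANGE_IMPACT_WEIGHT, 4)    # Major Impact
print(minimal, major)  # prints: 3 12
```

The same function applies to the other attributes by passing their own weights.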
2. Impact if Change is not performed — [weight: 1.5]
This attribute is another key factor in determining the need for a given change. Sometimes an RFC is pushed through even though it carries a high Risk Score, because the 'System would not function' without the change. This factor also helps to spot pet projects being pushed through the queue.
Values of 'Impact if Change is not performed' could be:
- No Impact (1)
- Minimum performance loss (2)
- Substantial performance loss (3)
- System not usable (4)
In the case of No Impact, the attribute score = 1.5 x 1 = 1.5; in the 'System not usable' scenario, the score = 1.5 x 4 = 6.
3. Testing Status — [weight: 2]
IT leaders of large organizations are paranoid about making a change that has not been validated thoroughly in an R&D setup. Inability to test the RFC in a QA environment implies High Risk. Financial institutions and healthcare firms demand "Filing / Review of Test Results" before they will even consider a change.
Testing Status values could be:
- Fully tested in production (1)
- Partially tested in production (2)
- Tested in a controlled environment (3)
- Not able to test until live (4)
If you can fully test the change, the attribute score = 2 x 1 = 2; if you cannot test until live, the attribute score = 2 x 4 = 8.
4. Outage Period — [weight: 1.5]
System Availability is an important measure that leaders use to evaluate the functioning of a Service Desk team, so any scheduled or unscheduled downtime is carefully reviewed before an RFC is approved. Any outage that lasts more than 4 hours can have devastating consequences. The weight for this attribute would be higher if you are running a healthcare environment.
Outage Period values could be:
- Less than 1 hour (1)
- Between 1 and 2 hours (2)
- Between 2 and 4 hours (3)
- Up to a Day (4)
An outage of more than a day can cause significant business loss, but it can happen for various reasons such as lack of planning, limited support, natural calamities, resource limitations, etc.
Applying the same formula as above, Outage Period scores can range between 1.5 and 6.
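If the RFC form captures the expected outage in hours, the Risk Rating could be derived automatically. The sketch below is one possible mapping; the function names are hypothetical, and the handling of boundary values (exactly 1, 2 or 4 hours) and of outages longer than a day are assumptions, since the scale above does not define them.

```python
OUTAGE_WEIGHT = 1.5  # weight given above for Outage Period


def outage_rating(outage_hours: float) -> int:
    """Map an expected outage in hours to the 1-4 Risk Rating scale."""
    if outage_hours < 0:
        raise ValueError("outage cannot be negative")
    if outage_hours < 1:
        return 1  # Less than 1 hour
    if outage_hours <= 2:
        return 2  # Between 1 and 2 hours (assumption: 2.0 counts here)
    if outage_hours <= 4:
        return 3  # Between 2 and 4 hours (assumption: 4.0 counts here)
    if outage_hours <= 24:
        return 4  # Up to a day
    # Assumption: the scale has no rating beyond a day, so reject the input.
    raise ValueError("outages longer than a day are not covered by the scale")


def outage_score(outage_hours: float) -> float:
    """Return weight x risk rating for the Outage Period attribute."""
    return OUTAGE_WEIGHT * outage_rating(outage_hours)


print(outage_score(0.5), outage_score(3), outage_score(24))  # prints: 1.5 4.5 6.0
```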
5. Backout Period — [weight: 2]
Despite all the planning, a change can fail, and Disaster Planning / Recovery is a key component of any Risk Management plan. Every RFC should capture the backout period (generally in hours) as a mandatory field. It is better to be safe than sorry!
Values of Backout Period:
- Backout is easy < 1 hour (1)
- Backout is complex – 4 hours (2)
- Backout is very difficult – 1 Day (3)
- Backout is not possible (4)
Backout Period score can range between 2 and 8
Risk Score Analysis
RFC Risk Score = Sum of [Weight x Risk Rating] over all attributes
If you use the weights and Risk Ratings defined above, the weights add up to 10, so the RFC Risk Score can range between 10 (every rating is 1) and 40 (every rating is 4).
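The total can be sketched in a few lines of Python. The weights are the ones listed in this article; the dictionary keys and the function name `rfc_risk_score` are illustrative assumptions and do not correspond to fields in any particular tool.

```python
# Weights as listed in this article; adjust them to suit your environment.
WEIGHTS = {
    "change_impact": 3.0,
    "impact_if_not_performed": 1.5,
    "testing_status": 2.0,
    "outage_period": 1.5,
    "backout_period": 2.0,
}


def rfc_risk_score(ratings: dict) -> float:
    """Sum weight x risk rating over all five primary attributes."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


lowest = rfc_risk_score({name: 1 for name in WEIGHTS})
highest = rfc_risk_score({name: 4 for name in WEIGHTS})
print(lowest, highest)  # prints: 10.0 40.0
```

Keeping the weights in one table means a change of weight, as described earlier, is a one-line edit.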
Change Requests with corresponding Risk Score -- BOSS
Change Management becomes easier once we associate a Risk Score with every RFC. Companies can define their own workflow rules to allow technicians to work on RFCs with a Risk Score below a threshold level. For example, technicians can work on change requests with a score of less than 18 without approval from the CAB.
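A workflow rule of this kind could look like the sketch below. The threshold of 18 is only the example value from the text, and the function name `needs_cab_approval` is hypothetical; real tools express such rules in their own workflow configuration.

```python
CAB_THRESHOLD = 18  # example threshold from the text; set your own


def needs_cab_approval(risk_score: float, threshold: float = CAB_THRESHOLD) -> bool:
    """Return True when the RFC must go to the CAB before work starts."""
    # Technicians may proceed on their own only when the score is below the threshold.
    return risk_score >= threshold


print(needs_cab_approval(14.5))  # prints: False
print(needs_cab_approval(18))    # prints: True
```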
The Risk Score helps the CAB focus on important changes. Users may consider additional factors such as application dependency, environment, and connectivity to make prudent decisions.