ITIL, the body of IT service management (ITSM) best practice guidance, defines an incident as an unplanned interruption to, or reduction in the quality of, an IT service. These incidents are typically reported to and managed by the IT team through an IT service desk. But not all incidents are created equal: the term “major incident” describes incidents that have a major effect on business operations and outcomes.
Examples of the most common business-affecting incidents handled by an IT service desk include:
- Network outages
- Hardware failures
- Capacity issues
- Business application issues
- The business impact of unplanned maintenance
- Release deployment issues
- Data centre outages
- Cyberattacks/DDoS (Distributed Denial of Service) attacks
The cost of major incidents can be significant – from lost operations and revenues, through the costs of corrective work, to the impact on the corporate brand (and potentially its share price).
Time is precious; time is money
27 minutes is the average time it takes an organisation to assemble a response team in the event of a major incident. Every single minute is costly, and the bigger the company, the bigger the cost.
Then the clock keeps ticking, and the cost keeps rising, as these people and others work together to remedy the root cause(s) of the major incident.
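To make the arithmetic concrete, here is a minimal sketch of how that cost accrues. The 27-minute figure comes from the text above; the per-minute rate is a hypothetical value chosen purely for illustration, so substitute your own organisation's estimate.

```python
# Average time to assemble a response team (figure quoted in the article).
ASSEMBLY_MINUTES = 27

# Hypothetical cost of downtime per minute, for illustration only.
HYPOTHETICAL_COST_PER_MINUTE = 1_000.0


def downtime_cost(minutes: float, cost_per_minute: float) -> float:
    """Cost accrued over a period of downtime at a flat per-minute rate."""
    return minutes * cost_per_minute


# Cost incurred before anyone has started working on the fix.
assembly_cost = downtime_cost(ASSEMBLY_MINUTES, HYPOTHETICAL_COST_PER_MINUTE)
print(f"Cost before the team is assembled: {assembly_cost:,.0f}")
```

A flat rate is a simplification; in practice the cost per minute is likely to vary with the time of day and the services affected.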
Your organisation would prefer that major incidents don’t happen but, when they do, there’s a need for all the interested parties to work together as efficiently as possible toward the required solution. So, how can you better manage the responses to the major incidents that occur within your business?
Improve your communications
Your initial major incident management communications can’t afford to rely on the same enterprise application used to manage virtually every activity within a business, where the important communications get lost in a sea of far less important messages. Instead, the internal business communications related to major incidents need to be fast, efficient, and representative of the modern age of digital technology. The most critical incidents need to be communicated in ways that cannot be missed or ignored, and this is even more relevant given the increase in remote working and globally distributed teams.
So, what needs to happen to improve the effectiveness of major incident communications?
It’s not just the active participants that need to be notified and engaged collaboratively during major incidents; to maintain a positive customer experience, everyone affected must be informed of the major incident and its anticipated resolution time. This has traditionally been done via email but, as the rise of social media channels has shown, there are now better ways of meeting different types of communication need.
For example, a proactive messaging workflow that automates the initial response and subsequent updates is a logical and efficient way of keeping different stakeholders consistently informed as the major incident is managed. This will not only notify your employees of any potential obstacles and threats (using the most current information), but also help those involved in the resolution efforts to focus on driving the actions that solve the major incident in the most effective and timely way possible.
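As a rough illustration of such a workflow, the sketch below generates a consistent message for each stakeholder group from a single incident record, both for the initial alert and for later updates. The `Incident` class, the audience names, and the message labels are all hypothetical; a real workflow would hand these messages to whatever alerting channels your organisation uses.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    reference: str
    summary: str
    expected_resolution: str  # e.g. "07:00"
    updates: list[str] = field(default_factory=list)


# Hypothetical stakeholder groups and the label each one sees.
AUDIENCES = {
    "responders": "ACTION REQUIRED",
    "affected_users": "SERVICE NOTICE",
}


def build_messages(incident: Incident) -> dict[str, str]:
    """Build one message per audience from the latest incident status."""
    status = incident.updates[-1] if incident.updates else incident.summary
    return {
        audience: (
            f"[{label}] {incident.reference}: {status} "
            f"Expected resolution: {incident.expected_resolution}."
        )
        for audience, label in AUDIENCES.items()
    }


def record_update(incident: Incident, status: str, expected_resolution: str) -> dict[str, str]:
    """Store a new status and resolution time, then rebuild every message."""
    incident.updates.append(status)
    incident.expected_resolution = expected_resolution
    return build_messages(incident)


incident = Incident("INC-1", "Email service is unavailable.", "07:00")
initial = build_messages(incident)
revised = record_update(incident, "Issue is more severe than first thought.", "15:00")
for message in revised.values():
    print(message)
```

Because every message is derived from the same record, the responders and the affected users cannot be told different resolution times.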
Reduce/replace email use
Email has its place in every organisation. It’s arguably still the most widely connected messaging stream used within businesses. It’s how you contact colleagues, how your enterprise platforms contact you, how you contact your customers, and how you receive industry information. Spam or not, virtually all communications in business can reach you via email, and that’s part of the issue.
Email is notoriously overloaded and ineffective, yet roughly 83% of companies rely heavily on email to engage the major incident response team. Email carries no inherent urgency: it doesn’t match the pressures and priorities of major incidents, nor does it prompt action the way automated, actionable alerts would.
Email is a one-size-fits-all option for work-based communication. However, if you want to get the responses your organisation needs, you need more streamlined, efficient communication within your enterprise. The primary reason is that major incident alerts must not be missed or convey out-of-date information. Think about it: an initial major incident alert email is sent out at 01:00 that says services will be restored by 07:00. Another is sent out at 03:00 stating that the issue is more severe than originally thought and that resolution is now not expected until 15:00. Someone might read the first email at 06:00, overlook the second one (from 03:00), and completely miss the fact that the major incident is going to severely disrupt most of their business day. Where critical communications are concerned, there’s a need to ensure that recipients see the most recent (and important) messages first.
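One way to picture the "most recent first" requirement is a view that shows only the latest alert for each incident, so a superseded estimate can never be read in isolation. The sketch below uses the 01:00 and 03:00 alerts from the example above; the `Alert` class, the incident reference, and the date are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Alert:
    incident: str
    sent_at: datetime
    expected_resolution: str


def current_alerts(alerts: list[Alert]) -> list[Alert]:
    """Return the most recently sent alert per incident, newest first."""
    latest: dict[str, Alert] = {}
    for alert in alerts:
        known = latest.get(alert.incident)
        if known is None or alert.sent_at > known.sent_at:
            latest[alert.incident] = alert
    return sorted(latest.values(), key=lambda a: a.sent_at, reverse=True)


day = datetime(2024, 1, 1)  # arbitrary date, for illustration only
inbox = [
    Alert("INC-1", day.replace(hour=1), "07:00"),  # initial estimate
    Alert("INC-1", day.replace(hour=3), "15:00"),  # revised estimate
]

for alert in current_alerts(inbox):
    print(f"{alert.incident}: resolution expected by {alert.expected_resolution}")
```

Someone checking at 06:00 would see only the 15:00 estimate, rather than having to notice that a later email overrides an earlier one.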
Reduce the manual work
Major incident management is often a hands-on task, but far too much of the initial response process is reliant on manual administrative tasks. Checking directories for colleague contact information. Consulting various office calendars and time sheets to see who’s online and available. These and other tasks all add high-cost minutes to the initial response.
Many of these tasks can be easily automated as part of your major incident management response process. As soon as a major incident is identified, you want the relevant employees and users automatically notified. Ultimately, automation helps to optimise the entire major incident management process: creating immediate awareness of an incident, facilitating the resolution (and workarounds/contingency arrangements), and saving time and money (and potentially protecting the corporate brand).
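As an example of automating one of those manual lookups, the sketch below picks out the people who have the right skill and are currently on shift, replacing the check of directories and calendars. The roster, names, skills, and shift times are all hypothetical; in practice this data would come from your own directory or scheduling system.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Responder:
    name: str
    skills: frozenset[str]
    shift_start: time
    shift_end: time

    def on_shift(self, now: time) -> bool:
        """True if `now` falls within this responder's shift."""
        if self.shift_start <= self.shift_end:
            return self.shift_start <= now < self.shift_end
        # The shift wraps past midnight, e.g. 22:00 to 06:00.
        return now >= self.shift_start or now < self.shift_end


def select_responders(roster: list[Responder], needed_skill: str, now: time) -> list[str]:
    """Names of responders with the needed skill who are on shift now."""
    return [r.name for r in roster if needed_skill in r.skills and r.on_shift(now)]


# Hypothetical roster, for illustration only.
ROSTER = [
    Responder("Asha", frozenset({"network"}), time(22, 0), time(6, 0)),
    Responder("Ben", frozenset({"network", "database"}), time(9, 0), time(17, 0)),
    Responder("Chen", frozenset({"database"}), time(22, 0), time(6, 0)),
]

# A network outage is detected at 01:00.
print(select_responders(ROSTER, "network", time(1, 0)))
```

The same selection could feed straight into the notification step, so that the right people are alerted the moment the incident is logged.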
Optimise your major incident management response capabilities
When major incident management is done ineffectively, it has expensive consequences. As the figures above highlight, the initial process of gathering a team is a time-sensitive task and one that many companies are currently failing to optimise. This is why, in the age of digital transformation, where further incidents are expected, businesses need to invest in smarter major incident response processes that employ more effective communication mechanisms, getting the organisation running again as quickly as possible.