One thing is certain about technology: sooner or later disaster will strike, and as your client’s IT shop, it will be up to you to fix it. You need a plan in place that outlines how to fix common problems so that you can get things back up and running as quickly as possible.
Until recently, companies typically relied on a few knowledgeable people who knew the tricks for getting things going again. However, if you are an MSP, leaving disaster recovery to a “super user” can be dangerous for your clients, and there should be a more automated way to document these kinds of processes.
The good news is that tools are now available that can alert you when there is a problem and then help you get back up and running. In fact, a number of startups have popped up in recent years to put these kinds of tools, which were previously within reach of only the largest tech companies, into the hands of smaller businesses.
Getting a grip
The thing about disaster recovery, like so much advanced technology, is that the biggest companies have the best tools, and the smaller ones are often left on their own to fix their messes.
Thanks to the cloud, startups can now take these advanced concepts, package them up and deliver them as a software service. Such is the case with two disaster recovery startups, Fire Hydrant and Transposit.
Both companies use the concept of run books. These are essentially playbooks that outline what you need to do to fix a problem. They aren’t notebooks sitting on a shelf, though. They are electronic and provide instructions on how to proceed: for example, opening a Jira ticket, sending a message to the disaster channel in Slack and emailing the team to let them know what’s going on.
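To make the idea concrete, here is a minimal sketch of an electronic run book as an ordered list of steps. It is an illustration only, not how either startup implements its product. The names Step, Runbook, open_ticket, notify_channel and email_team are hypothetical, and the three actions are stand-ins that only return a description of what they would do; a real version would call your ticketing, chat and email services.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    """One run book step: a description plus the action that carries it out."""
    description: str
    action: Callable[[str], str]


@dataclass
class Runbook:
    """An ordered list of steps to follow when an incident occurs."""
    name: str
    steps: List[Step] = field(default_factory=list)

    def run(self, incident: str) -> List[str]:
        """Execute every step in order and return a log of what happened."""
        log = []
        for number, step in enumerate(self.steps, start=1):
            result = step.action(incident)
            log.append(f"{number}. {step.description}: {result}")
        return log


# Hypothetical stand-ins (assumptions for illustration): each one only
# describes the action instead of calling a real ticketing, chat or email API.
def open_ticket(incident: str) -> str:
    return f"ticket opened for '{incident}'"


def notify_channel(incident: str) -> str:
    return f"posted '{incident}' to the disaster channel"


def email_team(incident: str) -> str:
    return f"emailed the team about '{incident}'"


outage_runbook = Runbook(
    name="Service outage",
    steps=[
        Step("Open a ticket", open_ticket),
        Step("Alert the chat channel", notify_channel),
        Step("Email the team", email_team),
    ],
)

if __name__ == "__main__":
    for line in outage_runbook.run("database unreachable"):
        print(line)
```

Keeping the steps as data rather than as knowledge in one person's head means anyone on the team can run the same sequence, and the returned log doubles as a record of what was done during the incident.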
Transposit wants to take this concept a step further by using machine learning to transform the run books from static documents into living documents, which can learn from the responses to an incident and adjust the playbook over time.
Whichever direction you take, just having this kind of structure in place can help you deal with disaster whenever it strikes. The run book concept can help you bring the right people into the mix to make sure you fix the problem as quickly as possible. Over time, you can probably automate aspects of this, depending on what service you are using.
Ultimately, the goal is to put some structure and discipline in place so that you have a plan when things are chaotic and crazy.
Photo: turgaygundogdu / Shutterstock