Uh-oh. Your computer screen is black. This could mean that a disaster such as severe weather, a cyberattack, or system failure has knocked out your IT infrastructure. The resulting downtime can bring your operations to a standstill. You lose money, time, data or customers.
With a disaster recovery plan, you can resume mission-critical IT functions faster. You don’t have a plan? Or you have one… somewhere? Then it’s high time to read our guide on how to manage risk and mitigate disruption by being prepared.
1. What is meant by disaster recovery?
An event such as a cyberattack, natural disaster, power outage, or human error can cause network failure and seriously disrupt operations. Disaster recovery, then, is the ability to maintain or quickly restore your IT infrastructure and services in a disaster.
Any disaster recovery plan definition should relate to business continuity. Your aim is to minimize business loss (productivity, revenue, data), and liability. Downtime derails processes or shuts down customer communication. So, the speed of recovery is critical.
What is in a disaster recovery plan? It contains instructions, processes and procedures that serve to prevent or anticipate likely events, and to set up alternative systems. Using the plan, you reestablish function and regain access to data. Most disaster recovery plans use the metrics of recovery time objective (RTO) and recovery point objective (RPO):
Recovery Time Objective (RTO) is the maximum time you can take to get operations running before business continuity suffers.
Recovery Point Objective (RPO) is how much data loss you can afford, usually expressed as the span of time between your last usable backup and the disaster. This parameter determines how often you need to back up your data.
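To make the two metrics concrete, here is a minimal sketch that checks a backup schedule and a restore estimate against RTO and RPO targets. It assumes both objectives are expressed as time spans; the names `RecoveryObjectives` and `meets_objectives` and all of the numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryObjectives:
    rto: timedelta  # maximum tolerable time to restore operations
    rpo: timedelta  # maximum tolerable window of lost data


def meets_objectives(objectives, backup_interval, estimated_restore_time):
    """Return (rpo_ok, rto_ok) for a backup schedule and a restore estimate."""
    # Worst case, the disaster strikes just before the next backup,
    # so up to one full backup interval of data is lost.
    rpo_ok = backup_interval <= objectives.rpo
    rto_ok = estimated_restore_time <= objectives.rto
    return rpo_ok, rto_ok


# Hypothetical targets: back online within 4 hours, lose at most 1 hour of data.
targets = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(hours=1))

# Nightly backups with a 3-hour restore: the RTO is met, the RPO is not.
print(meets_objectives(targets, timedelta(hours=24), timedelta(hours=3)))
```

In this example the restore is fast enough, but nightly backups could lose up to a day of data, so the backup interval would need to shrink to an hour or less.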
Let’s look at a contained disaster recovery example. Imagine that a lightning strike at one of your sites causes a power outage, and a fire that melts your mainframe. A good disaster recovery strategy includes having a cloud solution or remote data center. Backup files and programs are stored there, or you even have duplicate systems that can take over.
Operations can be diverted quickly, reducing disruption. This is why it’s so essential to be prepared, react faster and be proactive.
2. What are five major elements of a disaster recovery plan?
To get started, break down your disaster recovery plan into who, what, which, when/where, and how.
Build a DRP team. Ask yourself: Who will create or maintain the plan? Who do we contact in an emergency? Who works with which systems? Do we have full-time specialists who manage backup and recovery? Create your taskforce first.
Identify risks and causes. For example, what are the chances of natural disasters (hurricane, flood, earthquake) or man-made risks (explosions, blackouts)? What about technology incidents (cyberattacks, system failure, data sabotage)? Analyze how likely each of these is, and how severe the impact would be, so you can allocate resources and develop your response.
Determine critical resources (applications, files, documents, access). Which of these must be saved or restored first because business operations depend on them? Which generate or secure revenue? Take inventory—and classify threats in terms of business priority. This tells you where to start.
Specify disaster recovery procedures for backup and off-site storage. How often do I need to back up data and where do I store it? Make task lists specific, yet as simple as possible. Include details of how to handle critical offline communication and sensitive data.
Test the plan and maintain it. Does everyone know how to react? Will your plan rescue essential data? Through various disaster recovery drills, you can test the effectiveness. Then adjust, review and renew the process regularly.
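The risk analysis step above weighs how likely each event is against how severe its impact would be. One common way to do this is a simple likelihood-times-impact score, sketched below. The 1 to 5 scales, the `Risk` class, and the sample ratings are assumptions for illustration, not values from a real assessment.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # 1 (rare) to 5 (almost certain), hypothetical scale
    impact: int      # 1 (minor) to 5 (severe), hypothetical scale

    @property
    def score(self) -> int:
        # Simple likelihood x impact product; other weightings are possible.
        return self.likelihood * self.impact


def rank_risks(risks):
    """Sort risks from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Illustrative ratings only; real values come from your own assessment.
register = [
    Risk("flood", likelihood=2, impact=5),
    Risk("cyberattack", likelihood=4, impact=4),
    Risk("power outage", likelihood=3, impact=2),
]

for risk in rank_risks(register):
    print(f"{risk.name}: {risk.score}")
```

The ranked list gives the team a starting order for allocating resources. The scores are only as good as the ratings behind them.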
Through disaster recovery planning, you can improve risk awareness. Real-time monitoring alerts you to potential threats. You learn immediately of data breaches, natural disasters, or power outages, among others. You can react faster to minimize damage in your supply network.
3. What is a disaster recovery site (DR site)?
As part of a disaster recovery plan, a company temporarily relocates its systems and services to alternative servers at a geographically separate disaster recovery site. Many companies use cloud-based disaster recovery services or set up a twin infrastructure at a remote backup disaster recovery site. Companies can establish cold, hot and warm (or split) disaster recovery sites, depending on available timeframe, cost and resources.
A cold site is a bare-bones office or datacenter space without servers installed. Having physical facilities offers some protection, but a cold site is time-consuming to activate.
Warm sites (or split sites) offer office or datacenter space and have some pre-installed server hardware. Getting everything running takes moderate time and effort, so warm sites offer a good balance of cost and recovery time.
The fastest option is a hot site that mirrors your files, databases and infrastructure in real time. This lets you get back online fast, but is the most expensive option.
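The trade-off between the three site types can be sketched as a lookup: pick the cheapest site whose activation time still fits your recovery time objective. The activation times and relative costs below are hypothetical placeholders. Only the ordering (cold is slowest and cheapest, hot is fastest and most expensive) comes from the descriptions above.

```python
from datetime import timedelta

# Hypothetical activation times and relative costs, for illustration only.
SITES = [
    {"type": "cold", "activation": timedelta(days=7), "relative_cost": 1},
    {"type": "warm", "activation": timedelta(hours=12), "relative_cost": 3},
    {"type": "hot", "activation": timedelta(minutes=5), "relative_cost": 10},
]


def cheapest_site_for(rto):
    """Return the cheapest site type whose activation time fits within the RTO."""
    candidates = [s for s in SITES if s["activation"] <= rto]
    if not candidates:
        return None  # no site type is fast enough for this RTO
    return min(candidates, key=lambda s: s["relative_cost"])["type"]


print(cheapest_site_for(timedelta(days=1)))   # warm
print(cheapest_site_for(timedelta(hours=1)))  # hot
```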
4. What are the types of disaster recovery?
Disaster recovery scenarios include specifics on how and where to back up data. Many plans draw on disaster recovery cloud computing or software as a service. Using such services saves companies the cost of doubling up on infrastructure. Solutions are scalable and geographically distant, a big advantage in regional natural disaster recovery.
External providers of cloud-based disaster recovery services may use virtual machine (VM) files, software that behaves like computers. For additional security, you might have two datacenters, one in the cloud and one remote or local. Trailer-like mobile disaster recovery sites are an option when enterprises have bandwidth or security concerns.
What is the difference between a backup and disaster recovery? A backup is simply a copy of your data. You always need backups, for installing on new devices, for example. Yet having a backup does not replace vital IT systems, nor does it guarantee that you can get these up and running quickly.
This is why failover is important for disaster preparedness. Failover means having processes and hardware that allow you to switch over to an alternative location automatically. You might say it is like having a spare tire in your car. It lets you get back on the road quickly. Similarly, failback is when you can switch workloads back to the primary site (the “real” tire).
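The failover and failback logic can be illustrated with a small state sketch. It assumes a single boolean health check for the primary site. The class name `FailoverController` and the site labels are hypothetical, and the sketch ignores real-world concerns such as data resynchronization.

```python
class FailoverController:
    """Routes work to the primary site, or to the DR site when the primary is down."""

    def __init__(self, primary="primary", secondary="dr-site"):
        self.primary = primary
        self.secondary = secondary
        self.active = primary

    def update(self, primary_healthy: bool) -> str:
        """Apply one health-check result and return the currently active site."""
        if not primary_healthy and self.active == self.primary:
            self.active = self.secondary  # failover: switch to the spare
        elif primary_healthy and self.active == self.secondary:
            self.active = self.primary    # failback: return to the primary
        return self.active


controller = FailoverController()
# Hypothetical sequence of health-check results for the primary site.
history = [controller.update(healthy) for healthy in [True, False, False, True]]
print(history)  # ['primary', 'dr-site', 'dr-site', 'primary']
```

This sketch fails back as soon as the primary reports healthy. In practice, teams often treat failback as a deliberate step taken only after verifying the primary site.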
5. How to write a disaster recovery plan
Your disaster recovery procedures follow naturally from your risk assessment. It is now clear which systems and data are mission critical, and which employees are responsible for each task.
To make your disaster recovery process go smoothly, you want to document all significant IT processes, people, and inventory. Consider scope, scenarios, locations, and all the people involved, along with their equipment. You must make contact lists, and instructions for each area of responsibility. It’s essential to also record authorization codes or keys. And double-check that everyone has access to all required software and systems. Have backup and failover information readily available to your team.
When creating specific disaster recovery plan steps, here are three initial action points:
- Begin immediate response to the emergency
- Know who will access and activate your backup
- Initiate activities that will restore data quickly
Although your plan rests on thorough documentation, you could distill key tasks into simple checklists. This supports fast action during an emergency.
As part of your disaster recovery plan, double up on staff members who perform critical processes. In this way, essential duties are covered at all times. These employees should record any changes to hardware and software, and keep track of backup schedules.
6. Why is disaster recovery important?
Few enterprises today can afford to be offline for days or even hours. Enterprises can bleed large amounts of money in a very short period of time when their IT systems go down. Of course, securing data and IT systems can save your business in an emergency. Yet the benefits of disaster recovery extend to day-to-day operations. With disaster recovery planning, you:
+ Establish good documenting processes
+ Manage inventory and networks
+ Install task redundancy: with two trained people per task, you always have a human backup
+ Identify and reduce bottlenecks, which saves time and money
And what is the difference between disaster recovery and business continuity planning? A disaster recovery plan is usually embedded in a business continuity plan. The business continuity plan includes guidelines and procedures on how to prevent risk events, respond to them, and recover all core business operations. The primary goals are to keep people safe and to resume normal operations quickly following a disaster. Business continuity planning should always include good risk management.
7. How to test your disaster recovery plan
According to the online trade publication TechRepublic, nearly one-fourth (23%) of companies never test their disaster recovery plans. Yet testing is critically important. Even planning the test can reveal gaps. Through testing, you validate the processes you have established and verify their value.
Actors and musicians would never go on stage without practice, and your enterprise, too, must rehearse. How should disaster recovery drills be executed? Methods include:
- Tabletop or walkthrough: groups proceed step by step to make sure they understand everything
- Simulation: members of the group act out their tasks individually and collectively
- Parallel test: you test your recovery systems while your main systems continue running
- Cut over: you gain practice running full operations on the recovery systems
Through testing your disaster recovery plan, you instill a culture of care and risk awareness. And because you train specifically for adverse events, you learn how to avoid downtime, minimize interruptions, and mitigate damage. Factors such as these are also what make supply chain risk management so valuable. It gives you competitive advantage, even without disruption.
8. What is the Disaster Recovery Reform Act (DRRA)?
Another form of disaster recovery planning also helps citizens and societies deal with emergencies. For example, in 2018, the US Congress passed the Disaster Recovery Reform Act (DRRA). This streamlined how the Federal Emergency Management Agency (FEMA) prepares for and responds to natural disasters. Notably this FEMA disaster recovery plan strengthened building codes and modified administrative policies. When disaster response is faster, community resilience increases.
Yet is rebuilding what was destroyed really the best option? Natural disasters such as hurricanes or earthquakes happen with some regularity because the region is vulnerable. The same is true of cyberattacks that take advantage of weak spots. To strengthen resilience at every level, disaster recovery emphasizes preparedness and mitigation.
We’ll say it again: Being prepared is what having a comprehensive supply chain risk management solution is all about. As a risk-aware enterprise with proactive planning, you go from black to back online faster after disaster.
riskmethods was acquired by Sphera in October 2022. This content originally appeared on the riskmethods website and was slightly modified for sphera.com.