The 4-Step Guide to Disaster Recovery Planning
IT is a critical component of every successful business on the planet. With everything from company websites to phone systems to resource management systems hosted in enterprise data centers, a disaster, whether it is man-made or natural, can bring a company to a shuddering halt. Any organization wishing to avoid losing critical data and systems needs a disaster recovery plan in place.
Planning for disaster recovery is more than standing ready for the worst: it can also highlight inefficiencies and reveal which systems and data are truly mission-critical. This quick guide will help you on the way to disaster preparedness.
1) Get Executives on Board and Launch the Planning Committee
Executives must lead the charge on disaster planning. Without leadership high up the chain, an emergency situation can quickly devolve into finger pointing or shirked responsibilities. With input from leadership, a disaster planning committee should be formed to assess IT operations and develop the plan.
This committee should first perform a risk assessment and a business impact analysis to determine the effect of losing IT for each arm of the organization. What data and applications are necessary to continue operations in marketing vs. sales? What about product development or customer support? Document the necessary systems, databases, and software in daily use for each department to help determine the Recovery Point Objective (RPO) and Recovery Time Objective (RTO).
2) Estimate the RPO and RTO to Create a Recovery Strategy
RPO states the goal restore point: how far back in time the restored files and apps may date from while business operations still continue normally. Put another way, it is the largest window of recent data, measured in time, that the business can afford to lose. A system with a one-hour RPO needs backups or replication at least hourly, while an e-mail archive from several years ago changes so rarely that it can tolerate a much longer RPO. RTO is the maximum time that a process can be down before the business is unacceptably impacted. E-mail service would likely have a short RTO, as it has become essential for daily operations in virtually every company.
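To make the two objectives concrete, here is a minimal Python sketch. It assumes RPO is read as the longest window of recent data the business can afford to lose and RTO as the longest tolerable downtime. The class, the function names, and the e-mail figures are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class ServiceObjectives:
    name: str
    rpo: timedelta  # longest tolerable window of lost data
    rto: timedelta  # longest tolerable downtime


def meets_rpo(service: ServiceObjectives, backup_interval: timedelta) -> bool:
    # Worst case, a failure strikes just before the next backup runs,
    # so the data lost equals one full backup interval.
    return backup_interval <= service.rpo


def meets_rto(service: ServiceObjectives, estimated_restore: timedelta) -> bool:
    # The estimated restore time must fit inside the tolerable downtime.
    return estimated_restore <= service.rto


# Hypothetical objectives for an e-mail service.
email = ServiceObjectives("email", rpo=timedelta(hours=1), rto=timedelta(hours=4))
print(meets_rpo(email, backup_interval=timedelta(hours=24)))   # False
print(meets_rto(email, estimated_restore=timedelta(hours=2)))  # True
```

In this example a nightly backup would fail a one-hour RPO even though the restore time fits comfortably inside the RTO, which shows why the two objectives need to be estimated separately.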
A disaster recovery strategy can be built from the RPO and RTO of each service, application, or arm of the organization. The planning committee must consider facility operations, hardware in use, software in use, communications tools, data files, customer-facing processes, and user-facing processes. Combing through each of these categories can help identify inefficient daily use and reveal which components must be replaced or restored during a disaster event.
Besides deciding on which systems must come online first in a disaster, the recovery strategy should include the methods of restoration: public cloud instances? Private cloud hosted in a data center? In-house data center at a remote site? Or backup equipment? Depending on the type of disaster, budget available, and security or compliance concerns, one of these options may be more attractive.
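One simple way to decide which systems come online first is to sort them by RTO, shortest first. The Python sketch below assumes that rule; the service names and RTO values are hypothetical, and a real plan would also weigh dependencies between systems.

```python
from datetime import timedelta

# Hypothetical services and RTO values, for illustration only.
rto_by_service = {
    "email": timedelta(hours=4),
    "customer_portal": timedelta(hours=1),
    "marketing_archive": timedelta(days=3),
}


def recovery_order(rtos: dict[str, timedelta]) -> list[str]:
    # Shortest tolerable downtime first; ties are broken by name
    # so the resulting order is stable from run to run.
    return sorted(rtos, key=lambda name: (rtos[name], name))


print(recovery_order(rto_by_service))
# ['customer_portal', 'email', 'marketing_archive']
```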
3) Write the Disaster Recovery Plan and Test, Test, Test
After documenting which essential business technology must be included in the plan, the committee should produce written strategies for specific recovery steps including:
- Contract terms with outside service providers and backup systems
- Backup goals for each application/system/database – constant mirror backups, daily, weekly, etc.?
- Regular testing dates and reports
- Security concerns – what happens when workers must log in through new VPNs, onto new servers, or from home?
- Compatibility – new equipment or replacement virtual machines must work with previous settings and configurations
- Availability – can backup systems handle daily business loads until primary systems are back online?
The written plan should also include a complete inventory of equipment, software, databases, network and configuration settings.
Testing of the recovery plan should be thorough and should begin immediately after company leadership approves the plan. After that initial test, testing should happen at least once per year to confirm that the plan remains feasible and to revise it as needed. A successful disaster recovery test includes complete shutdown of existing systems, customer and user notifications, switch to failover systems, support for users and customers, and restoration of equipment and operations.
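The yearly testing rule is easy to track in a small script. The following Python sketch flags systems whose last recovery test is more than a year old; the system names and dates are hypothetical, and the 365-day threshold is one reading of "at least once per year."

```python
from datetime import date, timedelta

MAX_TEST_GAP = timedelta(days=365)  # assumed meaning of "once per year"


def overdue_tests(last_tested: dict[str, date], today: date) -> list[str]:
    # Return the systems whose last recovery test is older than the gap.
    return sorted(
        system
        for system, tested in last_tested.items()
        if today - tested > MAX_TEST_GAP
    )


# Hypothetical test history.
history = {
    "crm_database": date(2023, 1, 10),
    "phone_system": date(2023, 11, 2),
}
print(overdue_tests(history, today=date(2024, 3, 1)))  # ['crm_database']
```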
4) Recovering from the Real Deal
The first step in a disaster event is generally to power off equipment, if possible, to avoid further damage. From there, the teams assigned during the planning process take over their operations, such as backup, restoration, and support.
Operations can be pushed to a “hot site,” where servers and a live backup of all virtual machines and data stand ready in case of disaster. For mission-critical files, a quick configuration change at a hot site can keep operations from skipping a beat. A “warm site” contains hardware and configuration settings ready to receive software and data for operation restoration. Finally, a “cold site” is basically just an empty data center or reserved cloud computing space where equipment must be set up, software installed, and settings configured before systems are up and running.
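The difference between the three site types comes down to how much work remains at failover time. The Python sketch below encodes the descriptions above as a lookup table; the step wording and the names are illustrative assumptions, not a standard checklist.

```python
from enum import Enum


class Site(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"


# Work still outstanding at failover time, paraphrasing the site
# descriptions in the text. The exact steps are illustrative.
REMAINING_STEPS = {
    Site.HOT: ["switch configuration to the hot site"],
    Site.WARM: ["install software", "restore data"],
    Site.COLD: [
        "set up equipment",
        "install software",
        "configure settings",
        "restore data",
    ],
}


def steps_before_live(site: Site) -> list[str]:
    # Return a copy so callers cannot modify the shared table.
    return list(REMAINING_STEPS[site])


for site in Site:
    print(site.value, len(steps_before_live(site)))
```

The more steps a site leaves for the day of the disaster, the longer the recovery takes, so services with short RTOs are the natural candidates for a hot site.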
Automatic backup software and disaster recovery applications let failover be configured in advance and automated, which makes recovery less stressful. These programs offer data deduplication, so that only new or changed data is stored again during regular backups, saving on storage and network usage. Once set up, they automatically back up data and replicate it within minutes to hours.
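The idea of backing up only changed files can be illustrated with content hashes. This is a simplified Python sketch of file-level change detection, not how any particular backup product works; commercial tools often deduplicate at the block level instead. File contents are held in memory here to keep the example self-contained, and all names are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    # SHA-256 fingerprint of a file's contents.
    return hashlib.sha256(data).hexdigest()


def files_to_back_up(
    current: dict[str, bytes], previous_digests: dict[str, str]
) -> list[str]:
    # A file needs backing up if it is new or its content hash changed.
    return sorted(
        path
        for path, data in current.items()
        if previous_digests.get(path) != digest(data)
    )


# Digests recorded by the last backup run (hypothetical files).
previous = {"a.txt": digest(b"hello"), "b.txt": digest(b"world")}
current = {"a.txt": b"hello", "b.txt": b"world!", "c.txt": b"new"}
print(files_to_back_up(current, previous))  # ['b.txt', 'c.txt']
```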
After the backup system is powered on and tested, employees have been notified, and business seems to be operational, the recovery team should work to restore the original systems, if possible.
Green House Data offers a variety of disaster recovery software backed by highly available data center infrastructure and expert technicians. We can help you design a disaster recovery plan and Recovery Service Level (RSL) to determine which mission-critical systems need to be instantly restored in the cloud to keep your operations running smoothly.
Posted By: Joe Kozlowicz