What is a Disaster Recovery plan, and why is it needed?

Planning for Disaster Recovery is a key step in propelling your company into the future. Let's see why.


Risks associated with downtime


Italian companies vary widely in their circumstances, but they undoubtedly share one difficulty: coping with downtime due to external causes.

Regardless of where you are and what your business is, the recent pandemic has made it clear that companies are sometimes forced to deal with force majeure, such as natural disasters or man-made damage.

In addition to the repercussions on the social fabric, one cannot overlook the serious economic consequences of an interruption in production activities.

Even without thinking of large-scale disasters, just one hour of downtime can cost most businesses large sums of money.


For those working in the IT sector, this is an issue that needs to be frequently addressed. It is thanks to IT and technology infrastructures that these situations can be anticipated and contained. How so?


Technology to prevent damage


Most companies now have backup systems that periodically 'take a snapshot' of the current state and can, if necessary, restore that state as it was when the last backup was made.

The Disaster Recovery process involves: backing up, testing that everything is correct, restoring data, and finally rebuilding the most up-to-date snapshot.


What factors have driven this growing need for technological solutions that ensure the organisational resilience of companies?


We have identified 3 main factors:

Technological innovation has become indispensable for companies, and with it comes the difficulty of managing systems. This requires more qualified personnel, and in most cases companies have opted to call in specialists directly;
The demand for fast recovery is growing rapidly;
For companies, the ability to restore their infrastructure means not losing their business and being able to reopen after a major data loss.

Even companies with all the necessary infrastructure must, however, consider all possible damage scenarios and carefully assess their own particular characteristics.

Whether it be cyber attacks, incidents caused by human error, environmental disasters, power cuts, equipment failures, etc., it is indeed necessary to assess on a case-by-case basis what is the most efficient way of restoring infrastructure or data systems. This way applications will be back up and running as soon as possible.


What would cause the most damage in case of an interruption of activities depends on the type of company and on the type of structure.

In order to determine the best strategy and to avoid unpleasant losses of data, time and money, it is necessary to provide a system that is able to ensure immediate recovery and that allows work to resume with minimal damage in the event of a disaster of any kind.


This system is precisely Disaster Recovery.

Let's see what it consists of...


What is disaster recovery and what benefits does it offer?


First and foremost, you will need to analyse the characteristics of your infrastructure to prioritise which are the most critical applications and which can afford to wait for backup recovery times.


More specifically, in order to implement an effective Disaster Recovery plan, it is necessary to:


Make a thorough analysis of the infrastructure, its policies and its procedures;
Classify the data and systems to be protected into 4 categories: #Critical, #Vital, #Delicate and #Non-Critical;
Prepare for the recovery of Critical infrastructure and systems, meaning those systems or applications that have a unique functionality that cannot be replaced by other systems or applications, and that cannot be substituted by manual actions;
Focus on the technology and information systems that support the continuity of critical activities;
Define the restoration of Vital systems, meaning those systems that can be managed manually for a limited period with significantly lower impact than Critical systems and that, if restored within a few days, result in contained damage;
Evaluate the process to be followed for Delicate systems or applications (which can be managed manually but may cause inefficiencies by tying up too many resources) and Non-Critical ones, for which even long downtime does not cause major damage and whose restoration is in most cases easy;
Identify the type of possible disaster: natural (fire, flood, earthquake, etc.) or human (mistakes, sabotage, malicious attacks, etc.). In the former case, the analysis has to be based on what is happening in the surrounding environment; in the latter, prevention is more difficult because the cause could be an error by a person within the organisation or a targeted attack.
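
To make the four-category classification more concrete, here is a minimal Python sketch of how it could be expressed as a simple decision rule. It is only an illustration: the `System` fields, the 3-day threshold for "a limited period" and the reading of Delicate as "manageable by hand for longer, but at a high resource cost" are our own assumptions, and the system names are made up.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Recovery categories; a lower value means 'recover earlier'."""
    CRITICAL = 1
    VITAL = 2
    DELICATE = 3
    NON_CRITICAL = 4


# Assumption: "a limited period" of manual operation means at most 3 days.
VITAL_MAX_MANUAL_DAYS = 3


@dataclass
class System:
    name: str
    long_downtime_tolerable: bool  # a long outage causes no major damage
    manual_fallback_days: int      # days the work can be done by hand (0 = impossible)


def classify(system: System) -> Tier:
    """Assign a system to one of the four categories."""
    if system.long_downtime_tolerable:
        return Tier.NON_CRITICAL
    if system.manual_fallback_days == 0:
        # No substitute system and no manual workaround.
        return Tier.CRITICAL
    if system.manual_fallback_days <= VITAL_MAX_MANUAL_DAYS:
        return Tier.VITAL
    return Tier.DELICATE


# Hypothetical inventory, printed in recovery-priority order.
inventory = [
    System("intranet wiki", long_downtime_tolerable=True, manual_fallback_days=0),
    System("payroll", long_downtime_tolerable=False, manual_fallback_days=10),
    System("payment gateway", long_downtime_tolerable=False, manual_fallback_days=0),
    System("invoicing", long_downtime_tolerable=False, manual_fallback_days=2),
]
for system in sorted(inventory, key=classify):
    print(f"{classify(system).name:<12} {system.name}")
```

In a real analysis the inputs would come from interviews with the people who run each process, and the thresholds would be agreed with the business rather than hard-coded.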

Metropolitan city destroyed by an explosion


After gathering all the necessary information to get a complete picture of the situation, a Disaster Recovery plan has to be put in place. This can provide significant benefits.


Companies that implement a Disaster Recovery plan ensure:


Quick restoration of essential, critical tasks;
Competitive advantage over companies that do not have a plan in place, which would have to shut down operations in the event of a natural disaster or other crisis;
Preservation of everything essential to business continuity, even in the event of significant disruptions;
Reduced costs and losses in the event of system downtime, which can sometimes be fatal;
Last but not least, reliability and credibility in the eyes of internal and external collaborators, and above all of customers and suppliers, since long downtime could damage the image of the brand that you have so painstakingly built up.


Possible solutions for a disaster recovery plan


Having understood why it is necessary to implement a DRP (Disaster Recovery Plan), and what advantages it offers, let us now look in detail at how to define it. The starting point is time, or more specifically two time objectives, RTO and RPO:

RTO - Recovery Time Objective - is the time between interruption and restoration of activities that is considered tolerable;
RPO - Recovery Point Objective - indicates the maximum time that should elapse between backups, thus the maximum amount of data that you are willing to lose in the event of a system shutdown.
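
As a rough illustration of how these two objectives are used, the Python sketch below checks a backup schedule against an RPO and an estimated restore time against an RTO. The function names and all the figures are hypothetical examples, not recommendations.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """In the worst case everything since the last backup is lost,
    so the interval between backups must not exceed the RPO."""
    return backup_interval <= rpo


def meets_rto(estimated_restore_time: timedelta, rto: timedelta) -> bool:
    """The time needed to get back to work must not exceed the RTO."""
    return estimated_restore_time <= rto


# Hypothetical objectives agreed with the business.
rpo = timedelta(hours=4)
rto = timedelta(hours=8)

# Hypothetical current situation: nightly backups, 6-hour restore.
backup_interval = timedelta(hours=24)
estimated_restore_time = timedelta(hours=6)

print("RPO met:", meets_rpo(backup_interval, rpo))
print("RTO met:", meets_rto(estimated_restore_time, rto))
```

With these example numbers the nightly backup does not satisfy the 4-hour RPO, while the 6-hour restore fits within the 8-hour RTO, so it is the backup frequency that would need to change.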

Once the RPO and RTO have been defined, the next step is to choose the most appropriate type of solution. The market offers several:

Cold site: A solution with secondary off-site storage that remains idle until needed. Recovery times are quite long and can last up to 5 days, which makes it unfeasible for organisations that cannot afford extended downtime;
Hot site: A service which, unlike a cold site, is permanently active, with almost instantaneous replication and immediate recovery. It is not a solution for every organisation because it virtually doubles all infrastructure costs: hardware, management, licences, connectivity, etc.;
Virtualisation: Replication of the entire IT environment off-site, so that VMs (Virtual Machines) are available at all times. The VMs are always on and do not require any further configuration. They can also be used at any time because they are independent of the hardware;
DRaaS (Disaster Recovery as a Service): The last option relies on a Cloud infrastructure on which a company replicates its own infrastructure. When needed, the previously defined systems and applications can be restored by turning on the dormant infrastructure. It is an extremely flexible solution that lets companies choose between a monthly fee and a pay-per-use model, depending on their needs.


The Disaster Recovery Experience with CloudFire

At CloudFire we have supported partners and customers in defining Disaster Recovery plans, analysing their characteristics together and proposing the best solution according to the requirements and resources of each organisation. Choosing the right people for such a project is essential.

The system we have most frequently implemented is DRaaS (Disaster Recovery as a Service), which replicates systems and applications on infrastructure in the Cloud so that the failure of a site can be tolerated.


The whole operation is carried out through replication jobs that can be classified according to the nature of the systems and applications, for which RTOs and RPOs can be differentiated, thus deciding in what order and with what timing to restart in the event of an outage.
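
The idea of differentiating RTOs and RPOs per replication job and deriving a restart order from them can be sketched in a few lines of Python. This is a generic illustration, not the interface of any DRaaS console: the `ReplicationJob` class, the `restart_order` function and the job names and values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ReplicationJob:
    name: str
    rto: timedelta  # how quickly the system must be back
    rpo: timedelta  # how much recent data may be lost


def restart_order(jobs):
    """Restart the tightest RTO first; ties are broken by the tightest RPO."""
    return sorted(jobs, key=lambda job: (job.rto, job.rpo))


# Hypothetical jobs, grouped by the nature of the systems they protect.
jobs = [
    ReplicationJob("file-archive", rto=timedelta(days=2), rpo=timedelta(hours=24)),
    ReplicationJob("erp-database", rto=timedelta(hours=1), rpo=timedelta(minutes=15)),
    ReplicationJob("mail-server", rto=timedelta(hours=4), rpo=timedelta(hours=1)),
]

for position, job in enumerate(restart_order(jobs), start=1):
    print(position, job.name, "| RTO:", job.rto, "| RPO:", job.rpo)
```

Sorting by RTO is only one possible policy; dependencies between systems (for example, a database that must be up before the application that uses it) may also need to be taken into account.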


The DRaaS solution provides a console from which the failover plan can be managed and the service characteristics modified at any time. This makes the solution flexible with regard to structural changes and able to resume operations when necessary.

However, this flexibility also means that the way the service is configured can be decisive for the success of the project. It is therefore essential to rely on experts who know how to set up the procedures correctly and who offer concrete and prompt support.


We hope we have helped you better understand how a Disaster Recovery plan works. Contact us to implement your own!