
Risk Mitigation through DR

Timely information is the key to business success, and most organizations today depend entirely on IT-enabled services to achieve their business goals. However, IT carries risks such as viruses, malware, worms and hacker attacks. It is therefore crucial for organizations to protect the information and critical business data held in their IT systems. These risks are managed by implementing comprehensive disaster recovery plans.

Risks in IT result from vulnerabilities and the threats that act on them. Vulnerabilities are weaknesses in a system or in the infrastructure. For example, not having an adequate backup of data is a vulnerability, because important data could be lost in the event of a hardware or software failure. Similarly, not using up-to-date virus scanners is a vulnerability. A threat is a source or event that has the potential to accidentally trigger a misuse of IT systems or to intentionally exploit a specific vulnerability. Password theft by hackers, viruses, worms and spam emails are all threats. Threats also include natural and environmental events such as storms, power outages, high-voltage surges caused by lightning, floods, fires and earthquakes, to which IT systems are equally vulnerable.

Risk in IT is therefore a function of a threat and the vulnerability it can exploit, and a risk that materializes has an adverse impact on the organization. To limit that impact, risk management must be commensurate with the organization’s strategic objectives and focus on securing data and systems (hardware and software). Vulnerabilities and threats can be handled and mitigated by developing a comprehensive disaster recovery management plan for the organization. IT risks are managed effectively through security planning together with disaster recovery (DR) planning.
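The idea that risk is a function of threat and vulnerability can be illustrated with a small scoring sketch. This is only one possible way to do it: the 1 to 5 scales, the multiplicative formula and the example entries are assumptions made for illustration and do not come from any particular standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    threat_likelihood: int  # 1 (rare) .. 5 (frequent), hypothetical scale
    vulnerability: int      # 1 (well protected) .. 5 (exposed), hypothetical scale
    impact: int             # 1 (minor) .. 5 (severe), hypothetical scale

    def score(self) -> int:
        # Assumed formula: risk grows with each of the three factors.
        return self.threat_likelihood * self.vulnerability * self.impact


def rank_risks(risks):
    """Return the risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score(), reverse=True)


# Illustrative entries only; real values would come from an assessment.
RISKS = [
    Risk("No tested data backup", 3, 5, 5),
    Risk("Outdated virus scanner", 4, 3, 3),
    Risk("Flood at primary site", 1, 4, 5),
]

for risk in rank_risks(RISKS):
    print(f"{risk.score():>3}  {risk.name}")
```

A ranking like this is mainly useful for deciding which risks the DR plan should address first.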

DR plans implement policies, procedures and actions that minimize disruption to the business in the event of a disaster and ensure business continuity. The first step in DR planning is to establish processes for business impact analysis (BIA). BIA helps to identify specific risks and to analyze their impact on all IT-enabled business processes. With BIA as the key input, other important elements to consider while developing a DR plan include:

  • Clarity in organizational responsibilities: Many organizations fall short in defining roles and responsibilities for DR. DR is more than restoring data on servers or replicating databases; a DR plan ensures that applications and systems are able to support business functions. Non-IT members must therefore take part in developing DR plans so that the impact on business units is understood.
  • Define application recovery service levels: Application recovery services can be catalogued according to the recovery levels identified by the BIA. DR provides insurance for data and critical information; however, efficient application recovery is just as important. The DR plan must align applications with the recovery levels obtained from the BIA and restore them in order of their importance to business functions. For instance, restoring product feature data immediately after a disaster is important for the sales and marketing units.
  • Apply a cost model for DR: IT service levels are strongly influenced by cost. The cost model can include items such as hardware, software maintenance, support, personnel and facilities. A carefully developed cost model helps keep IT services running efficiently. Cost models are a must when organizations hire IT services from data centers, which offer different cost models based on business requirements.
  • Establish secondary facilities and involve experts: Many organizations maintain an additional backup facility in case the primary facility experiences a disaster. People with the skills to restore IT services are needed to help business users recover their data, applications and services with minimal disruption.
  • Establish standardized procedures: Without DR planning, day-to-day operations can be disrupted, resulting in heavy losses for the organization. There are instances where organizations have compromised their mission-critical data during a disaster. Adequate documentation of the risk analysis for key IT-enabled business processes cannot be overlooked. Many organizations have implemented standard IT service management frameworks such as ITIL to improve their chances of mitigating risks from disasters (man-made or natural).
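The alignment of applications with BIA recovery levels can be sketched as a small catalogue that yields a restore order. The tier numbers, RTO values and application names below are hypothetical; a real catalogue would take them from the organization's own BIA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Application:
    name: str
    business_unit: str
    tier: int          # 1 = restore first; assumed to be assigned by the BIA
    rto_hours: float   # target time to restore, hypothetical values


def restore_order(apps):
    """Order applications by tier, then by the tightest recovery target."""
    return sorted(apps, key=lambda a: (a.tier, a.rto_hours))


# Illustrative catalogue only.
APPS = [
    Application("Internal wiki", "All", 3, 72),
    Application("Product catalogue", "Sales and marketing", 1, 4),
    Application("Payroll", "Finance", 2, 24),
    Application("Order entry", "Sales and marketing", 1, 2),
]

for app in restore_order(APPS):
    print(f"Tier {app.tier}  RTO {app.rto_hours:>4}h  {app.name} ({app.business_unit})")
```

Sorting on the tier first keeps business priority ahead of purely technical considerations, which matches the point that non-IT members should help set the recovery levels.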

Large organizations that depend on IT cannot tolerate downtime of their business-critical applications. DR plans help organizations restore applications and data efficiently, and provision IT services so that business continues without long disruptions. The objective of implementing DR is to mitigate threats, but a DR plan, once implemented, does not provide all the protection required against new types of threats or attacks. DR plans are dynamic and must be updated and validated regularly, and whenever new types of threats arise.

Best Practices in Disaster Recovery and Business Continuity

Business enterprises depend on information to survive and to ensure business continuity. However, protecting information from disasters is a constant challenge in IT. Even with adequate data backup and storage, companies face business disruptions and lose useful data. Thoughtful planning and collaboration among people at all levels of the organization help to develop comprehensive disaster recovery and business continuity plans that adequately protect critical business data from loss.

The term disaster recovery is relative, because disasters take many forms and occur unexpectedly. IT systems, servers, data and applications are vulnerable to disasters such as floods, hurricanes, power outages, hardware failures, attacks on servers or networks, and human error. Inadequate technology and business planning can compromise critical business data and cause substantial financial loss. Most of these issues can be mitigated by developing comprehensive DR plans that restore business operations within a few hours instead of the days it once took. DR plans are aimed at protecting the data and vital assets of the organization. It is important to note that DR planning is unique to every organization.

Consider a scenario in which important notices or information must be sent to a specific set of people at a given time. If IT is down, the downtime can cost the company its standing with customers, a decline in its stock price and the confidence of its stakeholders. DR plans should therefore identify the potential risks that exist within the organization’s IT environment and define steps to mitigate those risks in order to ensure business continuity.

Business continuity (BC) and DR planning is often defined as an ongoing process that is integrated with day-to-day operations. For example, a C-level executive finding that email is down, or being unable to generate a report needed for a decision, cannot be tolerated. A DR plan includes certain key elements to ensure efficient and effective restoration of critical business functions in the event of an unplanned disruption. The key elements to consider in BC and DR planning include:

  • Assessment of critical applications
  • Procedures for backup, restore and data recovery
  • Procedures for implementation, testing and maintenance

Some of the best practices which can be considered while defining DR plans are:

  • Catalogue systems and identify all impacts: Business continuity and disaster recovery (BCDR) planning starts with classifying applications and data by their criticality and their cost of downtime. The recovery point objective (RPO) and recovery time objective (RTO) for each component must also be understood. For example, the negative impact of losing critical customer records or network connectivity must be understood and planned for. All the systems and services that participate in business operations (servers, storage disks, network components, etc.) must be catalogued, with the RTO for each component understood and a plan defined for it.
  • Involve people in BCDR: DR responsibility normally falls to IT or to a single person in the company. If that person is unavailable during a disaster, the company has its recovery plans but no one to restore the systems. Several people from different departments should therefore be trained in the recovery procedures. It is best to also train people outside the primary data center region, or a second team in another region, in application and data recovery, particularly in a hosted environment.
  • Ensure redundancies to protect mission-critical data: The plans must ensure that adequate resources are available for backing up data and applications. When using cloud services from data centers, the company must have appropriate SLAs in place for disaster recovery so that applications and data remain available. SLAs must also define infrastructure redundancy, such as replicating data to another location so that it stays available if the primary data center is down. Infrastructure redundancies to consider in the plans include power, cooling, telecom, network and other related hardware.
  • Include changes to the DR plan as and when they occur: BCDR plans must cover all critical business processes and their associated applications, SLAs and data sources, along with the steps to recover them within their recovery point and recovery time objectives. For example, if a new application is implemented, or if applications and data are moved to a cloud, the earlier plans become outdated. It is best to keep the plans aligned with changes in the operating environment and fully documented.
  • Periodically evaluate BCDR plans: BCDR plans should be evaluated periodically by running vulnerability tests and applying the appropriate patches to protect systems and infrastructure. New technologies should also be evaluated for their benefits in ensuring business continuity and for developing a new set of SLAs. The evaluation and testing of plans must consider future requirements and ensure a fail-safe infrastructure for the organization.
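The cataloguing and periodic evaluation practices above can be combined in a simple check that compares the results of a recovery drill with each component's objectives. This is a minimal sketch: the component names, the minute values and the `find_gaps` helper are all hypothetical and are not taken from any tool or standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Objective:
    component: str
    rto_minutes: int  # maximum tolerable time to restore the component
    rpo_minutes: int  # maximum tolerable window of lost data


@dataclass(frozen=True)
class DrillResult:
    component: str
    restore_minutes: int    # time the restore took during the drill
    data_loss_minutes: int  # age of the newest recoverable data


def find_gaps(objectives, results):
    """List components that were not tested or that missed an objective."""
    by_component = {r.component: r for r in results}
    gaps = []
    for obj in objectives:
        result = by_component.get(obj.component)
        if result is None:
            gaps.append(f"{obj.component}: not tested")
            continue
        if result.restore_minutes > obj.rto_minutes:
            gaps.append(f"{obj.component}: RTO missed "
                        f"({result.restore_minutes} > {obj.rto_minutes} min)")
        if result.data_loss_minutes > obj.rpo_minutes:
            gaps.append(f"{obj.component}: RPO missed "
                        f"({result.data_loss_minutes} > {obj.rpo_minutes} min)")
    return gaps


# Illustrative values only.
OBJECTIVES = [
    Objective("Customer database", 60, 15),
    Objective("Email", 240, 60),
    Objective("File server", 480, 240),
]
RESULTS = [
    DrillResult("Customer database", 45, 30),
    DrillResult("Email", 300, 10),
]

for gap in find_gaps(OBJECTIVES, RESULTS):
    print(gap)
```

Reporting untested components alongside missed objectives helps catch the case where a newly added system never made it into the plan.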

Even with diligent planning and continuous evaluation of BCDR, IT failures remain common, which is why DR is a continuous process. Companies can apply these best practices in BCDR planning to avoid IT disaster nightmares.