Recover Faster! New Approaches To Disaster Recovery With Oracle Cloud Infrastructure (OCI)
By Arun Kota, HEXstream senior director analytics
A well-designed disaster-recovery system is a vital pillar for utility organizations, ensuring swift and efficient recovery in the aftermath of catastrophes. In the dynamic landscape of IT, disasters take various forms, ranging from network outages and equipment failures to application glitches and natural calamities, all of which pose risks to the integrity of your applications.
Because we can’t predict when a disaster will strike, the best option is to have a plan for recovering quickly.
An important step in disaster-recovery planning is determining the Recovery Time Objective (RTO) and Recovery Point Objective (RPO) for an application.
Recovery Time Objective (RTO) serves as a crucial metric, reflecting the maximum allowable downtime for a specific application following a disaster, with a general rule that emphasizes shorter RTOs for applications of higher criticality.
Recovery Point Objective (RPO) is the maximum amount of data loss, measured as a period of time leading up to the disaster, that an application can endure without negatively affecting the enterprise. For example, an RPO of one hour means that losing up to the last hour of data is tolerable. This underscores the need for recovery strategies tailored to data sensitivity and business-continuity priorities.
In crafting a robust disaster-recovery plan, it is imperative to factor in both RTO and RPO, built upon a strategic alignment that harmonizes the goal of timely application recovery with a tolerance for data loss. This dual consideration facilitates the development of a comprehensive, cost-effective, and resilient framework for disaster preparedness.
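To make the two metrics concrete, here is a minimal sketch of how a backup schedule could be checked against them. The class and function names, and the figures in the usage example, are illustrative assumptions and not part of any OCI API. The sketch also assumes periodic backups, so the worst-case data loss equals the full interval between two backups.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Hypothetical container for an application's targets."""
    rto: timedelta  # maximum tolerable downtime after a disaster
    rpo: timedelta  # maximum tolerable data loss, expressed as time


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    # Assumption: if the disaster strikes just before the next backup,
    # everything written since the previous backup is lost.
    return backup_interval


def meets_objectives(
    objectives: RecoveryObjectives,
    backup_interval: timedelta,
    estimated_restore_time: timedelta,
) -> bool:
    """Return True if the plan satisfies both the RPO and the RTO."""
    rpo_ok = worst_case_data_loss(backup_interval) <= objectives.rpo
    rto_ok = estimated_restore_time <= objectives.rto
    return rpo_ok and rto_ok


if __name__ == "__main__":
    # Illustrative targets: 4 hours of downtime, 1 hour of data loss.
    targets = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(hours=1))

    # Nightly backups can lose up to 24 hours of data, so the RPO is missed.
    print(meets_objectives(targets, timedelta(hours=24), timedelta(hours=3)))

    # Backups every 30 minutes with a 3-hour restore satisfy both targets.
    print(meets_objectives(targets, timedelta(minutes=30), timedelta(hours=3)))
```

In this sketch a plan passes only when both conditions hold, which mirrors the point that RTO and RPO have to be considered together.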
Multiple Approaches for Disaster Recovery in OCI
Oracle Cloud Infrastructure (OCI) supports four approaches to disaster recovery.
Selecting the appropriate disaster-recovery solution for an OCI application hinges on various factors such as availability needs, data durability, as well as the specific requirements for both RTO and RPO, necessitating a thoughtful evaluation to align the chosen solution with the organization’s unique considerations.
1. Backup and Restore–This approach backs up critical data and systems to offsite storage at regular intervals. In the event of a disaster, data can be restored from these backups to bring the systems back to their previous states. This is the least expensive approach, but recovery usually takes longer. The backup-and-restore approach is usually used for noncritical applications.
2. Pilot Light–In this approach, a minimal version of the environment is always running in the cloud. In the event of a disaster, the infrastructure can quickly scale up to full capacity. The RPO and RTO are shorter than with the backup-and-restore tactic, but the cost is somewhat higher. The Pilot Light approach is usually used for critical applications.
3. Active/Passive or Warm Standby–In this approach, a fully installed and configured application is on standby and ready to run when required. This setup reduces the time required to switch to a fully operational state in the event of a disaster. The RPO is measured in seconds and the RTO in minutes; the cost is significantly higher than for Pilot Light. The Active/Passive or Warm Standby approach is usually used for critical applications.
4. Active/Active–In this approach, the applications are deployed in Active/Active mode and are actively serving. Every transaction is registered in both applications. Both RPO and RTO are almost zero. This approach has the highest cost, as the applications are running in parallel. The Active/Active approach is used for extremely critical applications.
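The trade-offs in the list above can be sketched as a small lookup table. This is only an illustration of the comparison. The names `Criticality`, `DrApproach`, `APPROACHES` and `candidates` are hypothetical, and `cost_rank` is an ordinal ranking taken from the relative costs described above, not a real price.

```python
from dataclasses import dataclass
from enum import Enum


class Criticality(Enum):
    NONCRITICAL = "noncritical"
    CRITICAL = "critical"
    EXTREMELY_CRITICAL = "extremely critical"


@dataclass(frozen=True)
class DrApproach:
    name: str
    cost_rank: int  # 1 = cheapest, 4 = most expensive (ordinal only)
    typical_use: Criticality


# Ordered as in the list above, from cheapest to most expensive.
APPROACHES = [
    DrApproach("Backup and Restore", 1, Criticality.NONCRITICAL),
    DrApproach("Pilot Light", 2, Criticality.CRITICAL),
    DrApproach("Active/Passive (Warm Standby)", 3, Criticality.CRITICAL),
    DrApproach("Active/Active", 4, Criticality.EXTREMELY_CRITICAL),
]


def candidates(criticality: Criticality) -> list[DrApproach]:
    """Return the approaches typically used at this criticality, cheapest first."""
    matches = [a for a in APPROACHES if a.typical_use is criticality]
    return sorted(matches, key=lambda a: a.cost_rank)


if __name__ == "__main__":
    for approach in candidates(Criticality.CRITICAL):
        print(approach.name)
```

For a critical application, the sketch returns Pilot Light ahead of Warm Standby, which reflects the cost ordering. The final choice between them still depends on how tight the RTO and RPO need to be.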
In OCI, a disaster-recovery setup can span multiple regions. Choosing the right DR approach for an application is very important. The criticality of the application, data consistency, cost, and complexity all need to be factored in when deciding on the DR approach.