Pilot Light DR for WebCenter Content on AWS

Terry Wang
Sep 8, 2023 11:54:48 AM

Disaster recovery (DR) is a key part of the business continuity plan for an application. Over the years, we have implemented many DR solutions for WebCenter Content. After moving some WebCenter Content workloads to AWS, we designed and implemented a Pilot Light DR solution for WebCenter Content environments on the cloud platform.

Built on cloud-native technology, this Pilot Light DR solution strikes a balance between operational efficiency and cost-effectiveness. It has the following advantages.

  1. Rapid Recovery. DR resources are provisioned automatically, on the fly, from AWS CloudFormation templates. This reduces human error and shortens recovery time.
  2. Cost Efficiency. This Pilot Light solution minimizes costs during normal operation because most of the resources in the WebCenter Content environment are provisioned only on demand.
  3. Near-Zero Data Loss. For WebCenter Content, the transactional data are kept in the system database and the content store file system. This Pilot Light DR solution uses Oracle Data Guard to protect the database and AWS EFS for the native content store file system.

The diagram below illustrates the components of the Pilot Light DR solution:

Elastic Load Balancer (ELB)

The ELB is configured to forward traffic to pre-allocated web server IP addresses in both Availability Zones (AZs). During normal operations, user traffic goes to the primary site. If a DR failover or switchover occurs, it is transparently redirected to the DR site.

Web and Application Servers

In the Pilot Light DR solution, no web or application servers are running on the DR site during normal operation. These servers (shown in the dotted line box) are provisioned on the fly using the latest images during DR.
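As a rough illustration of on-demand provisioning, the sketch below builds a minimal CloudFormation template for a single DR server as a Python dictionary. It is not the template used in the actual solution. The function name, parameter names, tag, and default instance type are all hypothetical. Only the resource type `AWS::EC2::Instance`, its properties, and the parameter types come from CloudFormation itself.

```python
import json


def build_dr_server_template(role: str) -> dict:
    """Return a minimal CloudFormation template for one DR server.

    All parameter names and the default instance type are illustrative
    assumptions, not values from the real environment.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Pilot Light DR {role} server (illustrative sketch)",
        "Parameters": {
            # ID of the most recent server image, supplied at failover time.
            "LatestImageId": {"Type": "AWS::EC2::Image::Id"},
            # Subnet in the DR Availability Zone.
            "DrSubnetId": {"Type": "AWS::EC2::Subnet::Id"},
            # Pre-allocated IP address that the load balancer already targets.
            "ReservedPrivateIp": {"Type": "String"},
            "InstanceType": {"Type": "String", "Default": "m5.xlarge"},
        },
        "Resources": {
            "Server": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": {"Ref": "LatestImageId"},
                    "InstanceType": {"Ref": "InstanceType"},
                    "SubnetId": {"Ref": "DrSubnetId"},
                    "PrivateIpAddress": {"Ref": "ReservedPrivateIp"},
                    "Tags": [{"Key": "Role", "Value": role}],
                },
            }
        },
        "Outputs": {"InstanceId": {"Value": {"Ref": "Server"}}},
    }


if __name__ == "__main__":
    # Print the template as JSON, ready to be passed to CloudFormation.
    print(json.dumps(build_dr_server_template("web"), indent=2))
```

Because the image ID is a parameter, the same template can be launched with whichever image is the latest at the time of the DR event, and the fixed private IP address lets the server take over an address the load balancer already forwards to.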

Database Servers

Oracle Data Guard is set up to run a standby database server in the DR site. This is the only DR server that exists during normal operations. This part can be further simplified by running the system database on AWS RDS with a Multi-AZ deployment.

Elastic File System (EFS)

EFS is used for the vault file system of WebCenter Content. Any change to the EFS file system is automatically stored across, and becomes available in, all Availability Zones in the region.

FSx for Lustre File System

FSx for Lustre is used for the web layout file system of WebCenter Content. No FSx for Lustre file system is kept on the DR site during normal operations. Leveraging its fast restore feature, we provision the FSx for Lustre file system (shown in the dotted line box) on the fly from the latest backup only during DR.
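Restoring from the latest backup first requires finding that backup. The sketch below shows one possible way to pick it, assuming the backup records have the shape returned by the FSx `DescribeBackups` API (`BackupId`, `Lifecycle`, `CreationTime`, and a nested `FileSystem.FileSystemId`). The function name and all IDs in the sample data are hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional


def latest_available_backup(backups: List[Dict], file_system_id: str) -> Optional[Dict]:
    """Return the newest usable backup of the given file system, or None."""
    candidates = [
        b for b in backups
        # Only completed backups can be restored.
        if b.get("Lifecycle") == "AVAILABLE"
        # Ignore backups that belong to other file systems.
        and b.get("FileSystem", {}).get("FileSystemId") == file_system_id
    ]
    return max(candidates, key=lambda b: b["CreationTime"], default=None)


if __name__ == "__main__":
    # Hypothetical sample records, for illustration only.
    sample = [
        {"BackupId": "backup-aaa", "Lifecycle": "AVAILABLE",
         "CreationTime": datetime(2023, 9, 1, tzinfo=timezone.utc),
         "FileSystem": {"FileSystemId": "fs-weblayout"}},
        {"BackupId": "backup-bbb", "Lifecycle": "AVAILABLE",
         "CreationTime": datetime(2023, 9, 7, tzinfo=timezone.utc),
         "FileSystem": {"FileSystemId": "fs-weblayout"}},
        {"BackupId": "backup-ccc", "Lifecycle": "CREATING",
         "CreationTime": datetime(2023, 9, 8, tzinfo=timezone.utc),
         "FileSystem": {"FileSystemId": "fs-weblayout"}},
    ]
    chosen = latest_available_backup(sample, "fs-weblayout")
    print(chosen["BackupId"] if chosen else "no usable backup found")
```

The selected `BackupId` would then be passed to the FSx `CreateFileSystemFromBackup` API, along with a subnet in the DR Availability Zone, to create the new file system.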

At a high level, the tasks below occur during a DR failover/switchover event. For the most part, these tasks can be automated for fast recovery.

  1. Switch over the database to the Data Guard standby site.
  2. Restore the web and application servers in the DR site using the latest images.
  3. Restore the FSx for Lustre file system in the DR site using the latest backup.
  4. Update internal DNS records such that all server names and FSx file system names will resolve to the resources in the DR site.
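The DNS update in the last step lends itself well to automation. The sketch below assumes the internal DNS is hosted in a Route 53 private hosted zone, which the solution description does not state. It builds a change batch in the format accepted by the Route 53 `ChangeResourceRecordSets` API. The function name, host names, and TTL are hypothetical.

```python
from typing import Dict


def build_dns_failover_batch(name_to_dr_target: Dict[str, str], ttl: int = 60) -> dict:
    """Build a Route 53 change batch that repoints each name at its DR target.

    name_to_dr_target maps an internal record name to the DNS name of the
    matching resource in the DR site.
    """
    changes = []
    for name, target in sorted(name_to_dr_target.items()):
        changes.append({
            # UPSERT creates the record if it is missing, else overwrites it.
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": name,
                "Type": "CNAME",
                "TTL": ttl,
                "ResourceRecords": [{"Value": target}],
            },
        })
    return {"Comment": "Pilot Light DR failover", "Changes": changes}


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    batch = build_dns_failover_batch({
        "wcc-app1.internal.example.com": "wcc-app1-dr.internal.example.com",
        "weblayout.internal.example.com": "fs-dr.internal.example.com",
    })
    for change in batch["Changes"]:
        record = change["ResourceRecordSet"]
        print(record["Name"], "->", record["ResourceRecords"][0]["Value"])
```

Keeping the TTL short on these records limits how long clients continue to resolve the primary site's addresses after the change is applied.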

Depending on the complexity and size of the implementation, this DR solution typically achieves a near-zero RPO and an RTO of between 1 and 8 hours. The cost of this DR solution is kept to a minimum, because only the database server resources are duplicated during normal operations.
