
Cross-Region Disaster Recovery
Last updated:2025-12-24 15:00:42

Message middleware serves as a critical component in the technical architecture of business systems. TDMQ for Apache Pulsar natively supports multi-availability zone (AZ) disaster recovery. To ensure that customers can promptly migrate business and maintain business continuity in the event of a region-level disaster, we introduced the cross-region disaster recovery solution.
The following figure shows the cross-region disaster recovery solution.

Under normal conditions, the business accesses the TDMQ for Apache Pulsar cluster in Region A. To prepare for disaster recovery, users need to complete two main actions:
1. Inter-city network connection. Cross-region VPC networks are connected through Cloud Connect Network (CCN).
2. Metadata synchronization. Cluster metadata across two regions is synchronized through the TDMQ for Apache Pulsar console. The metadata includes namespaces, topics, subscriptions, and roles.
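As a sanity check on the metadata synchronization described above, the two clusters' metadata can be compared before relying on the backup. The sketch below is illustrative only: it assumes the namespaces, topics, subscriptions, and roles of each cluster have already been exported into plain Python sets (how they are exported is not covered here), and the function name diff_metadata is hypothetical.

```python
from typing import Dict, Set

# The four metadata kinds that the console synchronizes.
METADATA_KINDS = ("namespaces", "topics", "subscriptions", "roles")


def diff_metadata(
    source: Dict[str, Set[str]], target: Dict[str, Set[str]]
) -> Dict[str, Dict[str, Set[str]]]:
    """Return, per metadata kind, the names missing from or extra in the target."""
    report: Dict[str, Dict[str, Set[str]]] = {}
    for kind in METADATA_KINDS:
        src = source.get(kind, set())
        dst = target.get(kind, set())
        missing = src - dst  # present in Region A only
        extra = dst - src    # present in Region B only
        if missing or extra:
            report[kind] = {"missing_in_target": missing, "extra_in_target": extra}
    return report
```

An empty result means the two listings agree for all four kinds; any other result names the entries to investigate.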
When an exception occurs, the TDMQ for Apache Pulsar console provides the domain name resolution switching feature, which redirects the domain name originally used in Region A to the cluster in Region B (disaster recovery region). This eliminates the need for clients to modify the access point address, implements disaster recovery for the cluster in Region B (disaster recovery region), and ensures business continuity.
After Region A resumes normal operation, users first need to determine whether to write back the messages generated in Region B to Region A to ensure message integrity. If write-back is required, contact the after-sales team for assistance. Then, users switch the domain name resolution of the access point of the cluster in Region B back to Region A. After the switchback operation is completed, clients can normally access Region A.

Operation Guide

2. Create a pro cluster in the backup region. On the Cluster Purchase page, enable Cross-Region Replication and select the cluster to be backed up.

3. In the left sidebar, choose Cross-Region Replication, click New Link, and then configure the cloud data synchronization link for the cluster.
Copy Link Name: Define a name for the synchronization link. It cannot be empty. It can contain digits, letters, and characters "-_=:.". The length cannot exceed 128 characters.
Link Type: Select metadata.
Source Cluster: Select the TDMQ for Apache Pulsar cluster for which you want to provide disaster recovery.
Select Target Cluster: Select an existing target cluster in another region. Only clusters with the same cluster ID are displayed here.
Replication Level: It supports cluster level, namespace level, and topic level.
The cluster level is suitable for cluster-level replication.
The namespace level is suitable for scenarios where clusters in both regions are active on a daily basis, with different namespaces distributed across regions.
4. Click Save to complete the creation.
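The Copy Link Name rule in step 3 can be checked before the form is submitted. The following is a minimal sketch of that rule, assuming "letters" means ASCII letters A-Z and a-z (the console may accept more); the function name is_valid_link_name is hypothetical and not part of any TDMQ SDK.

```python
import re

# Non-empty, at most 128 characters, only digits, letters, and - _ = : .
# Assumption: "letters" is limited to ASCII letters.
_LINK_NAME = re.compile(r"[A-Za-z0-9_=:.\-]{1,128}")


def is_valid_link_name(name: str) -> bool:
    """Return True if the name satisfies the Copy Link Name rule."""
    return _LINK_NAME.fullmatch(name) is not None
```

For example, is_valid_link_name("dr-link_01:gz=sh.v1") passes, while an empty string, a name containing a space, or a 129-character name is rejected.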

CCN-based Network Connectivity

Connect the network between the production region and the backup region through CCN to set up a network access channel. This way, clients in the production region can access the backup cluster across regions in the event of a disaster.
For more information about the configuration, see Operation Guide of CCN. Then, perform the following operations:

In the Event of a Disaster

Based on the user's decision, switch client access to the backup region:
1. Initiate the switch of domain name resolution in the console (if available).
2. If the console is unavailable, customers can contact after-sales architects to initiate the switch from the TDMQ service side.

After Disaster Recovery

Based on the user's decision, switch client access back to the cluster in the original region:
1. Users determine whether to perform message write-back. If write-back is required, contact the after-sales team for assistance.
2. Initiate domain name switchback in the console. The clients can normally access the original region.

Must-Knows

1. Supported Scope

This feature is supported only by pro clusters.

2. Message Write-Back

Message write-back is a prerequisite decision when users switch traffic back to the original region. This aims to avoid data loss and ensure data integrity. Domain name switchback can be performed only after users determine whether to perform write-back.
Information provided by users:
The list of topics to be migrated. Example: cluster ID, namespace, or a specific topic list.
The start time and end time. Messages in the topic whose sending time falls within this time range are the data to be migrated. The reference field is publishTime in the message header.
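To make the time range concrete, the sketch below shows how a message's publishTime could be tested against a start and end time. It assumes publishTime is expressed in milliseconds since the Unix epoch and that both boundaries are inclusive; the actual boundary handling used by the write-back service is not stated here, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    if dt.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    # Integer division of timedeltas avoids floating-point rounding.
    return (dt - _EPOCH) // timedelta(milliseconds=1)


def in_write_back_window(publish_time_ms: int, start: datetime, end: datetime) -> bool:
    """Return True if publish_time_ms falls within [start, end] (assumed inclusive)."""
    return to_epoch_ms(start) <= publish_time_ms <= to_epoch_ms(end)
```

Specifying the range with explicit time zones helps avoid an off-by-hours window when the two regions or the operators use different local times.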
Impact of message write-back:
A large number of duplicate messages. By design, the server does not maintain the complex state machine needed to synchronize checkpoints between the source and target clusters. It treats every migrated message as a new message, so a migrated message and an identical message already present in the historical data are regarded as two different messages. If duplicate messages affect your business, we recommend implementing idempotent processing on the clients.
A small number of out-of-order messages.
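The idempotent processing recommended above can be sketched as follows. Because written-back messages are treated as new messages, the sketch deduplicates on a business key carried by the application (for example, an order number) rather than on the message ID. The class name, the in-memory store, and the max_keys limit are all assumptions made for illustration; a production consumer would more likely keep the seen keys in a durable store.

```python
from collections import OrderedDict
from typing import Callable


class IdempotentHandler:
    """Run a handler at most once per business key, remembering recent keys."""

    def __init__(self, handler: Callable[[str, bytes], None], max_keys: int = 100_000):
        self._handler = handler
        self._seen: "OrderedDict[str, None]" = OrderedDict()
        self._max_keys = max_keys

    def process(self, business_key: str, payload: bytes) -> bool:
        """Return True if the handler ran, False if the key was a duplicate."""
        if business_key in self._seen:
            self._seen.move_to_end(business_key)  # keep hot keys from being evicted
            return False
        self._handler(business_key, payload)
        self._seen[business_key] = None
        if len(self._seen) > self._max_keys:
            self._seen.popitem(last=False)  # evict the least recently seen key
        return True
```

The key is recorded only after the handler returns, so a handler that raises will be retried on redelivery. The bounded memory means a duplicate arriving after its key has been evicted is processed again, which is the trade-off of keeping the store in memory.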

3. Roles

The source cluster must contain at least one role; the role does not need to be bound to a namespace. This ensures that roles and tokens in the source cluster remain consistent with those in the disaster recovery cluster during synchronization.

4. CCN Configuration

When you configure CCN, ensure that the VPC CIDR blocks created in the two regions do not overlap. For example, use 10.0.0.0/16 for the VPC in Guangzhou and 10.1.0.0/16 for the VPC in Shanghai. This way, CCN can connect the two VPCs without IP address conflicts.
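The non-overlap requirement can be verified before the VPCs are attached to CCN. The sketch below uses Python's standard ipaddress module; the function name find_overlaps and the region labels are illustrative only.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List, Tuple


def find_overlaps(cidrs: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return every pair of labels whose CIDR blocks overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]
```

With the CIDR blocks from the example, find_overlaps({"Guangzhou": "10.0.0.0/16", "Shanghai": "10.1.0.0/16"}) returns an empty list, which means the two VPCs can be connected without address conflicts.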

5. Effective Time of Domain Name Switch

It takes approximately 5 seconds to 5 minutes for the domain name switch to take effect. The process includes two phases. First, the domain name resolution is switched. Then, clients disconnect from the cluster in the original region and reconnect to brokers in the new cluster.
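Because the switch can take up to about 5 minutes, operational scripts or health checks may want to poll until clients can reach the new cluster instead of failing on the first attempt. The following is a generic sketch of polling with exponential backoff under a deadline. The check callable is a placeholder for whatever connectivity test you use (for example, creating a client against the access point), and the default delays are assumptions rather than values recommended by TDMQ.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout_s: float = 300.0,      # upper bound of the stated 5-minute window
    initial_delay_s: float = 1.0,  # assumed starting delay
    max_delay_s: float = 30.0,     # assumed cap on the delay between attempts
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll check() with exponential backoff; return False once the deadline passes."""
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        if check():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        sleep(min(delay, remaining))  # never sleep past the deadline
        delay = min(delay * 2, max_delay_s)
```

The sleep and clock parameters exist only so the loop can be exercised in tests without waiting in real time.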

6. After the Switch During a Disaster

In the event of a disaster, after traffic is switched to the disaster recovery cluster, avoid modifying metadata in the backup cluster, for example by changing namespace properties or creating topics.