
Cluster High Reliability
Last updated: 2025-09-19 14:32:46
TDMQ for CKafka provides cluster-level high availability. Through multi-AZ deployment and the availability zone migration feature, it effectively improves service stability and disaster recovery capability.

Cluster-Level High Availability Description

Cross-AZ deployment
Applicable versions: Advanced Edition, Pro Edition
Description: When purchasing a CKafka instance in a region with three or more AZs, you can choose multiple availability zones for deployment. Partition replicas are forcibly distributed across nodes in different AZs, so the instance can continue to provide services even if a single availability zone becomes unavailable.
Pro Edition supports up to four AZs.
Advanced Edition supports up to two AZs.

Modify instance AZ
Applicable versions: Pro Edition
Description: Migrate the instance to another AZ within the same region. After the migration, all instance attributes, configurations, and the connection address remain unchanged. Typical scenarios:
Migrate from one availability zone to another: the current availability zone is under full load, or other situations impact instance performance.
Migrate from one availability zone to multiple availability zones: improve the disaster recovery capability of the instance and implement cross-IDC disaster recovery.

Cross-AZ Deployment

Deployment Architecture


The cross-availability zone deployment architecture of CKafka is divided into the network layer, the data layer, and the control layer.

Network Layer

CKafka exposes a VIP to clients. After connecting to the VIP, clients obtain partition metadata (this metadata usually maps address information to different ports under the same VIP). The VIP can fail over to another availability zone at any time: when an AZ becomes unavailable, the VIP automatically drifts to another available node in the region, thereby achieving cross-availability-zone disaster recovery.
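As an illustration, a client only needs the instance VIP as its bootstrap address; the partition metadata it fetches through that address is what changes when the VIP drifts. Below is a minimal sketch using the Java client, assuming a hypothetical VIP 10.0.0.10:9092 and a topic named test-topic:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.serialization.StringSerializer;

public class MetadataProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        // The CKafka VIP exposed to clients; address and port are placeholders.
        props.put("bootstrap.servers", "10.0.0.10:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // partitionsFor() triggers a metadata fetch through the VIP;
            // the Leader reported for each partition may change after a failover.
            for (PartitionInfo p : producer.partitionsFor("test-topic")) {
                System.out.printf("partition=%d leader=%s%n", p.partition(), p.leader());
            }
        }
    }
}
```

The same call made after a failover may return different Leader nodes, since the client simply sees whatever node currently serves each partition.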

Data Layer

The CKafka data layer uses the same distributed deployment method as Native Kafka, with multiple replicas distributed across Broker nodes in different AZs. For each partition, the replicas on these nodes follow a Leader-Follower relationship. If the Leader encounters an exception or goes offline, the cluster control node (Controller) elects a new partition Leader to handle requests for that partition.
From the client's perspective, when an AZ becomes unavailable and the Leader of a Topic partition is located on a Broker node in that AZ, established connections may time out or be closed. After the partition Leader node fails, the Controller (if the Controller node itself fails, the remaining nodes first elect a new Controller) elects a new Leader to provide services. The Leader switchover completes within seconds (the exact switchover time is proportional to the number of cluster nodes and the metadata size). Clients then refresh the Topic partition metadata on schedule and connect to the new Leader node for production and consumption.
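Because the switchover completes within seconds and clients recover via metadata refresh, it is usually enough to let the producer retry long enough to ride the switchover out. The sketch below shows the relevant Java producer settings; the values are illustrative assumptions rather than official CKafka recommendations, and the VIP address is a placeholder:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public final class ResilientProducerProps {
    public static Properties build() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "10.0.0.10:9092"); // placeholder VIP
        // Keep retrying sends that fail while the new partition Leader is being elected.
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);
        props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 500);
        // Upper bound on how long a single send may keep retrying; it should exceed the switchover time.
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120_000);
        // Cap the metadata age so the client refreshes and finds the new Leader promptly.
        props.put(ProducerConfig.METADATA_MAX_AGE_MS_CONFIG, 60_000);
        return props;
    }
}
```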

Control Layer

The CKafka control layer implements the same technical solution as Native Kafka, relying on ZooKeeper for Broker node service discovery and cluster Controller election. CKafka instances that support cross-availability zone deployment have ZooKeeper cluster nodes (zk nodes for short) deployed across three AZs (or IDCs). If the zk node in one AZ is disconnected due to a failure, the entire ZooKeeper cluster can still provide services normally.



Pros and Cons of Cross-AZ Deployment

Advantages

Cross-AZ deployment significantly enhances cluster disaster recovery capability. When a single AZ suffers unexpected network fluctuations, power-failure restarts, or other force majeure risks, clients can resume message production and consumption after a short wait for reconnection.

Disadvantages

With cross-AZ deployment, partition replicas are distributed across multiple AZs, so message replication incurs additional cross-AZ network latency compared with a single AZ. This latency directly adds to the write time observed by producers (when the client ACK parameter is greater than 1 or equal to -1/all). Currently, the cross-AZ latency in major regions such as Guangzhou, Shanghai, and Beijing is generally 10 ms to 40 ms.
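The effect is only visible to the producer when acknowledgement from the replicas is required. The following rough sketch measures the acknowledged send latency with acks=all using the Java client; the address and topic name are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.StringSerializer;

public class AckLatencyProbe {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "10.0.0.10:9092"); // placeholder VIP
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // acks=all (-1): the send is acknowledged only after in-sync replicas in other
        // AZs have replicated it, so the cross-AZ latency is added to the observed time.
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            long start = System.nanoTime();
            producer.send(new ProducerRecord<>("test-topic", "key", "value")).get();
            System.out.printf("acked in %.1f ms%n", (System.nanoTime() - start) / 1e6);
        }
    }
}
```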

Cross-AZ Deployment Scenario Analysis

Single AZ Unavailable

After a single AZ becomes unavailable, as explained earlier, clients will disconnect and reconnect, and after reconnection the service can still be provided normally.
Since the management API service currently does not support cross-AZ deployment, if a single AZ becomes unavailable you may be unable to create topics, configure ACL policies, or view monitoring data in the console. This does not affect the production and consumption of existing services.

Network Isolation between Two AZs

If network isolation occurs between two AZs so that they cannot communicate with each other, a cluster split-brain may occur. In this case, nodes in both AZs continue to provide services, but data written on one side may be deemed dirty data after the cluster recovers.
Consider the following scenario: the cluster Controller node and one zk node of the ZooKeeper cluster are network-isolated from the other nodes. The remaining nodes will re-elect a new Controller (since the majority of ZooKeeper cluster nodes can still communicate normally, the election can succeed), but the isolated node still considers itself the Controller, resulting in a split-brain in the cluster.
In this situation, client writes need to be considered case by case. For example, when the client's ACK policy is -1 or all and the number of replicas is 2, assuming the cluster has 3 nodes, the split-brain may produce a 2:1 node distribution. Writes to partitions whose original Leader is on the single isolated node will report errors, while writes on the other side proceed normally. However, if there are 3 replicas and ACK = -1 or all is configured, neither side can write successfully. In this case, the handling approach must be determined based on the specific parameter configuration.
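On the side that can no longer gather enough in-sync replicas, acks=-1 sends fail with retriable errors such as NotEnoughReplicas. Below is a hedged sketch of inspecting those failures in a producer send callback; the handling shown is illustrative only:

```java
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.errors.NotEnoughReplicasException;
import org.apache.kafka.common.errors.RetriableException;

public class SplitBrainAwareCallback implements Callback {
    @Override
    public void onCompletion(RecordMetadata metadata, Exception exception) {
        if (exception == null) {
            return; // write acknowledged by the required number of replicas
        }
        if (exception instanceof NotEnoughReplicasException) {
            // Not enough in-sync replicas are reachable from this side of the isolation;
            // the producer keeps retrying until delivery.timeout.ms expires.
            System.err.println("waiting for replicas: " + exception.getMessage());
        } else if (exception instanceof RetriableException) {
            System.err.println("transient error, will be retried: " + exception.getMessage());
        } else {
            System.err.println("non-retriable send failure: " + exception.getMessage());
        }
    }
}
```

An instance of this callback would be passed as the second argument to producer.send(record, callback).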
After the cluster network recovers, clients can resume production and consumption without any action. However, because the server normalizes the data again, the data on the node that was split off will be truncated directly. With multi-replica, cross-AZ data storage, this truncation does not cause data loss.

Multi-AZ Disaster Recovery Limit

Capacity Constraint

CKafka implements multi-AZ disaster recovery through the geographic distribution of underlying resources. When an AZ fails, CKafka switches the Partition Leaders to Broker nodes in an available AZ, so fewer underlying resources carry the traffic and capacity constraints arise. The resources available after a single-AZ failure are as follows (a minimal sketch of the arithmetic follows the list):
For a 2-AZ disaster recovery instance, usable capacity = instance capacity / 2. To ensure proper usage, the customer needs to reserve 100% redundant resources.
For a 3-AZ disaster recovery instance, usable capacity = instance capacity / 3 × 2. To ensure proper usage, the customer needs to reserve 50% redundant resources.
For a 4-AZ disaster recovery instance, usable capacity = instance capacity / 4 × 3. To ensure proper usage, the customer needs to reserve 33.3% redundant resources.
For an n-AZ disaster recovery instance, usable capacity = instance capacity / n × (n - 1). To ensure proper usage, the customer needs to reserve 1 / (n - 1) redundant resources.
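In other words, usable capacity = instance capacity × (n - 1) / n and the redundancy to reserve = 1 / (n - 1). The following minimal sketch of that arithmetic uses a 300 MB/s instance purely as an example figure:

```java
public final class MultiAzCapacity {
    /** Usable capacity after one AZ fails, for an instance spread across n AZs. */
    public static double usableCapacity(double instanceCapacity, int azCount) {
        return instanceCapacity * (azCount - 1) / azCount;
    }

    /** Redundancy ratio to reserve so a single-AZ failure stays within usable capacity. */
    public static double redundancyRatio(int azCount) {
        return 1.0 / (azCount - 1);
    }

    public static void main(String[] args) {
        // Example: a 300 MB/s instance deployed across 3 AZs.
        System.out.println(usableCapacity(300, 3)); // 200.0 MB/s usable if one AZ fails
        System.out.println(redundancyRatio(3));     // 0.5 -> reserve 50% redundancy
    }
}
```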

Parameters and Configuration Limits

CKafka allows modifying the Topic-level configuration `min.insync.replicas` in the console. This configuration takes effect only when the client sets acks=-1, and it ensures that a message is considered successfully produced only after it has been synchronized to at least `min.insync.replicas` replicas (for example, with Topic replicas = 3 and `min.insync.replicas` = 2, a message must be synchronized to at least 2 replicas to be considered successfully produced).
Therefore, to ensure production availability when an AZ fails, the Topic must be configured with `min.insync.replicas` < number of AZs; to ensure production availability when a single node fails, it must also satisfy `min.insync.replicas` < Topic replicas. The overall rule is: `min.insync.replicas` < Topic replicas <= number of AZs. Common combinations are listed below, followed by a short creation sketch.
2-AZ instance, Topic replicas = 2, min.insync.replicas = 1
3-AZ instance, Topic replicas = 2, min.insync.replicas = 1
3-AZ instance, Topic replicas = 3, min.insync.replicas <= 2
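As a concrete illustration of the rule, a 3-AZ instance could use 3 replicas with `min.insync.replicas` = 2, which tolerates the loss of one AZ while acks=-1 writes still succeed. Below is a hedged sketch that creates such a topic with the Java AdminClient; the address and topic name are placeholders, and the same setting can also be changed per Topic in the console:

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

public class CreateHaTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "10.0.0.10:9092"); // placeholder VIP

        try (AdminClient admin = AdminClient.create(props)) {
            // 3 partitions, 3 replicas, min.insync.replicas = 2:
            // acks=-1 writes keep succeeding if one replica (or one AZ) is lost.
            NewTopic topic = new NewTopic("test-topic", 3, (short) 3)
                    .configs(Map.of(TopicConfig.MIN_IN_SYNC_REPLICAS_CONFIG, "2"));
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}
```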
