TDMQ for CKafka (CKafka) provides traffic throttling protection at both the cluster level and the topic level. This document describes how to configure each.
Configuring Cluster-Level Traffic Throttling Rules
After you purchase an instance with a given peak bandwidth specification, the system monitors business traffic usage in real time and uses the traffic throttling mechanism to determine whether the cluster exceeds that specification. If the traffic continuously exceeds the threshold of the purchased specification, the system automatically triggers traffic throttling protection, and no special configuration is required. For detailed information about the traffic throttling mechanism, see Traffic Throttling Mechanism.
If you purchase a Pro Edition instance, you can enable elastic bandwidth to cope with burst traffic and prevent business interruption caused by traffic throttling. For specific steps, see Enabling Elastic Bandwidth.
Configuring Traffic Throttling Rules for Topics
You can set traffic throttling rules for topics to prevent excessive traffic on a single topic from affecting other topics.
Note
You can set traffic throttling rules for topics only when the broker version is 2.11-1.1.1_1.1.4 or later, 2.12-2.4.2_1.1.1 or later, or 2.8.1_1.0.8 or later.
Operation Steps
1. Log in to the CKafka console.
2. In the left sidebar, select Instance List. Click the ID of the target instance to go to the basic instance information page.
3. On the basic instance information page, select the Topic List tab, and choose More > Traffic Throttling in the Operation column of the target topic.
4. Enable traffic throttling in the pop-up window and configure traffic throttling rules.
Max Topic Production Traffic: With replica traffic excluded, the value ranges from 1 MB/s to the maximum bandwidth purchased for the instance divided by the number of replicas for this topic.
Max Topic Consumption Traffic: The value ranges from 1 MB/s to the maximum bandwidth purchased for the instance.
Since traffic throttling is implemented at the broker level, the actual traffic throttling value (which equals an integer multiple of the number of brokers) may differ slightly from the configured value. For details about the soft traffic throttling mechanism, see Traffic Throttling Mechanism.
Use Cases for CKafka Traffic Throttling
Planning for the Number of Partitions and Partial Traffic Throttling
CKafka instances are deployed across multiple nodes that together provide write and consumption services, and each node is assigned a fixed quota for read and write traffic throttling. To improve traffic utilization, keep the number of partitions a multiple of the number of nodes, in which case CKafka tries to store the same number of partitions on each node, and keep traffic evenly distributed across partitions. By default, CKafka clients try to distribute traffic evenly across partitions when sending to servers, but special scenarios such as specified message keys may cause uneven write traffic. Following both practices helps prevent local traffic throttling caused by hot spots on individual nodes.
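The partition planning rule above can be checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the round-robin placement is an assumption made for illustration, not a description of how CKafka actually assigns partitions to nodes.

```python
from collections import Counter


def partitions_per_node(num_partitions: int, num_nodes: int) -> list[int]:
    """Count partitions per node under an assumed round-robin placement."""
    counts = Counter(p % num_nodes for p in range(num_partitions))
    return [counts.get(node, 0) for node in range(num_nodes)]


def is_multiple_of_nodes(num_partitions: int, num_nodes: int) -> bool:
    """True when the partition count is a multiple of the node count."""
    return num_partitions % num_nodes == 0


def next_balanced_partition_count(num_partitions: int, num_nodes: int) -> int:
    """Smallest multiple of num_nodes that is >= num_partitions."""
    return -(-num_partitions // num_nodes) * num_nodes


print(partitions_per_node(10, 4))            # [3, 3, 2, 2]
print(is_multiple_of_nodes(10, 4))           # False
print(next_balanced_partition_count(10, 4))  # 12
```

With 10 partitions on 4 nodes, two nodes carry one more partition than the others and are likely to reach their per-node quota first; raising the count to 12 evens out the placement.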
Number of Traffic Throttling Events and Monitoring of Response Packet Latency for Instances
The number of traffic throttling events for a CKafka instance is the sum of the traffic throttling events on all nodes. It does not reflect the overall write and consumption performance of the instance, nor does it indicate that traffic throttling has been triggered on every node. Therefore, when traffic throttling events are reported but the traffic remains below the specification value, check the per-node traffic throttling event statistics in Advanced Monitoring to identify which node has been throttled.
To address such issues, adjust the number of partitions to a multiple of the number of nodes to improve the overall bandwidth utilization of the instance. Because CKafka currently throttles by delaying response packets, also pay close attention to the response packet latency metric in the Advanced Monitoring section of Pro Edition when traffic throttling occurs.
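The relationship between the instance-level count and the per-node counts can be illustrated as follows. This is a sketch only: the node names and counts are made up, and the dictionary stands in for values read manually from Advanced Monitoring rather than for any CKafka API.

```python
def summarize_throttle_events(events_by_node: dict[str, int]) -> dict:
    """Derive the instance-level count and the list of throttled nodes."""
    throttled = sorted(node for node, count in events_by_node.items() if count > 0)
    return {
        # The instance-level figure is the sum over all nodes.
        "instance_total": sum(events_by_node.values()),
        # Only nodes with a non-zero count were actually throttled.
        "throttled_nodes": throttled,
        "all_nodes_throttled": len(throttled) == len(events_by_node) and bool(throttled),
    }


# Hypothetical per-node counts for a three-node instance.
summary = summarize_throttle_events({"node-1": 0, "node-2": 37, "node-3": 0})
print(summary["instance_total"])       # 37
print(summary["throttled_nodes"])      # ['node-2']
print(summary["all_nodes_throttled"])  # False
```

A non-zero instance total with a single throttled node points to a local hot spot rather than to the instance as a whole reaching its specification.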
Differences Between Write and Consumption Traffic Throttling Models
Since production involves replica synchronization, division by the number of replicas is required. In comparison, consumption involves pulling data only from the leader. Accordingly, division by the number of replicas is not required.
Maximum write traffic per node = Specification value / Number of nodes / Number of replicas
Maximum consumption traffic per node = Specification value / Number of nodes
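The two formulas above translate directly into code. The sketch below uses hypothetical function names and example figures (a 120 MB/s specification, 3 nodes, 2 replicas) that do not come from this document.

```python
def max_write_per_node(spec_mb_s: float, nodes: int, replicas: int) -> float:
    """Write quota per node: replica synchronization consumes write bandwidth."""
    return spec_mb_s / nodes / replicas


def max_consumption_per_node(spec_mb_s: float, nodes: int) -> float:
    """Consumption quota per node: consumers pull from the leader only."""
    return spec_mb_s / nodes


print(max_write_per_node(120, 3, 2))     # 20.0
print(max_consumption_per_node(120, 3))  # 40.0
```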
Number of Occasional Traffic Throttling Events
Since replicas consume write throttling bandwidth, more replicas result in lower available write traffic. Traffic throttling is executed at the node level, and the traffic throttling system evaluates metrics at the second level, whereas the monitoring metrics displayed to customers are aggregated at the minute level. This is why traffic throttling can appear more sensitive than the displayed metrics suggest. When overall write traffic exceeds 70% of the specification value (after division by the number of replicas), local traffic throttling may briefly occur on certain nodes at the second level. To see by how much the traffic exceeded the limit, view the node traffic throttling metric in Advanced Monitoring. To leave room for such bursts, reserve a 30% buffer in specifications.
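The 70% guideline can be turned into a simple headroom check. The sketch below assumes that the 70% applies to the specification value divided by the number of replicas, which is one reading of the text above; the function names and example figures are hypothetical.

```python
def write_warning_threshold(spec_mb_s: float, replicas: int,
                            buffer_percent: int = 30) -> float:
    """Write rate above which occasional node-level throttling becomes likely."""
    usable_write = spec_mb_s / replicas  # replicas consume write bandwidth
    return usable_write * (100 - buffer_percent) / 100


def needs_more_headroom(observed_write_mb_s: float, spec_mb_s: float,
                        replicas: int) -> bool:
    """True when observed write traffic is inside the 30% buffer zone."""
    return observed_write_mb_s > write_warning_threshold(spec_mb_s, replicas)


# Example: 100 MB/s specification with 2 replicas.
print(write_warning_threshold(100, 2))  # 35.0
print(needs_more_headroom(40, 100, 2))  # True
print(needs_more_headroom(30, 100, 2))  # False
```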
Number of Persistent Traffic Throttling Events
If traffic throttling events are reported for an instance, Advanced Monitoring shows persistent traffic throttling on every node, and instance traffic is more than 10% below the specification value even after the impact of topic-level traffic throttling rules is excluded, the behavior does not meet expectations. In this case, submit a ticket for support.
FAQs
Why Is the Actual Traffic Throttling Value Lower than the Configured Threshold?
The configured traffic throttling value is distributed across brokers, and throttling is enforced on each broker separately. For an instance, each broker's limit is the configured traffic throttling value divided by the number of brokers. A topic-level limit is divided in the same way.
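The per-broker division can be sketched as follows. The first function follows the formula stated above. The second is an assumption added for illustration: it supposes that a topic can only use the share of brokers that actually host its partitions, which is one way the observed value can end up below the configured one. All names and figures are hypothetical.

```python
def per_broker_limit(configured_mb_s: float, num_brokers: int) -> float:
    """Share of the configured limit enforced on each broker."""
    return configured_mb_s / num_brokers


def reachable_limit(configured_mb_s: float, num_brokers: int,
                    brokers_with_partitions: int) -> float:
    """Assumed ceiling when only some brokers host the topic's partitions."""
    active = min(brokers_with_partitions, num_brokers)
    return per_broker_limit(configured_mb_s, num_brokers) * active


# Example: a 40 MB/s limit on a 4-broker instance.
print(per_broker_limit(40, 4))    # 10.0
print(reachable_limit(40, 4, 2))  # 20.0
print(reachable_limit(40, 4, 4))  # 40.0
```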
Premature traffic throttling is especially likely when the number of topic partitions is smaller than the number of brokers or when partition traffic is imbalanced. In such cases, increase the number of partitions or rebalance partition traffic. For specific steps, see Changing Instance Specifications and Configuring Topic Partition Balancing.
Why Is Traffic Throttling Triggered When the Displayed Production/Consumption Traffic Is Below the Instance Specifications?
Traffic throttling is executed at the millisecond level, whereas the monitoring platform in the console collects data at the second level and aggregates it (as maximum or average values) at the minute level.
According to the token bucket principle, a single bucket does not limit traffic forcibly. If instance A has a bandwidth specification of 100 MB/s, the traffic throttling threshold per 100 ms time bucket is 100 MB/10, or 10 MB per bucket. Assume the production traffic of instance A reaches 30 MB (3 times the threshold for a time bucket) in the first 100 ms time bucket of a given second. This triggers the broker's traffic throttling policy, which increases response packet latency, for example by extending the original 100 ms TCP return time by an additional 500 ms. The traffic within that second is then 30 MB x 1 + 0 MB x 5 + 10 MB x 4 = 70 MB: one bucket carrying the burst, five buckets with no traffic during the added delay, and four buckets at the threshold. The traffic rate within that second is therefore 70 MB/s, which is below the instance specification of 100 MB/s even though traffic throttling was triggered.
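The arithmetic in this example can be reproduced with a short Python sketch. The function names are hypothetical, and the per-bucket values are taken directly from the example above rather than produced by a real throttling algorithm.

```python
def bucket_threshold_mb(spec_mb_s: float, bucket_ms: int = 100) -> float:
    """Traffic allowed per time bucket for a given bandwidth specification."""
    buckets_per_second = 1000 // bucket_ms
    return spec_mb_s / buckets_per_second


def traffic_in_second(per_bucket_mb: list[float]) -> float:
    """Total traffic over one second, given the traffic in each bucket."""
    return sum(per_bucket_mb)


threshold = bucket_threshold_mb(100)  # 10.0 MB per 100 ms bucket

# One 30 MB burst, five buckets blocked by the added delay,
# then four buckets at the threshold.
buckets = [30] + [0] * 5 + [10] * 4

print(threshold)                       # 10.0
print(buckets[0] > threshold)          # True: throttling is triggered
print(traffic_in_second(buckets))      # 70
print(traffic_in_second(buckets) < 100)  # True: still below the specification
```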
Why Can Peak Production/Consumption Traffic Be Above the Instance Specifications?
As mentioned above, if instance A has a bandwidth specification of 100 MB/s, the traffic throttling threshold per 100 ms time bucket is 10 MB. Assume the production traffic of instance A reaches 70 MB (7 times the threshold for a time bucket) in the first 100 ms time bucket of a given second. This triggers the broker's traffic throttling policy, which increases response packet latency, for example by extending the original 100 ms TCP return time by an additional 800 ms. After the packet returns at 900 ms, the client immediately sends another 70 MB of traffic in the 10th time bucket. The traffic within that second is then 70 MB x 1 + 0 MB x 8 + 70 MB x 1 = 140 MB. The traffic rate within that second is therefore 140 MB/s, which is above the instance specification of 100 MB/s.
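This scenario can be simulated with a simplified client model. The sketch assumes a single client that sends a fixed-size burst, waits for the delayed response, and then immediately sends the next burst; the function and parameter names are hypothetical and the model is not CKafka's implementation.

```python
def simulate_burst_client(burst_mb: float, base_rtt_ms: int, extra_delay_ms: int,
                          bucket_ms: int = 100, window_ms: int = 1000) -> list[float]:
    """Return the traffic per time bucket over one window for a bursty client."""
    buckets = [0.0] * (window_ms // bucket_ms)
    send_time_ms = 0
    while send_time_ms < window_ms:
        # The whole burst lands in the bucket in which it is sent.
        buckets[send_time_ms // bucket_ms] += burst_mb
        # The client blocks until the (delayed) response comes back.
        send_time_ms += base_rtt_ms + extra_delay_ms
    return buckets


# 70 MB bursts, 100 ms normal return time, 800 ms added by throttling.
buckets = simulate_burst_client(70, base_rtt_ms=100, extra_delay_ms=800)
print(buckets[0], buckets[9])  # 70.0 70.0
print(sum(buckets))            # 140.0
```

Because the second burst falls into the last bucket of the same second, the per-second total reaches 140 MB even though the client spent most of that second waiting.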
Why Does the Number of Traffic Throttling Events Surge?
Statistics on the number of traffic throttling events are based on TCP requests. If instance A exceeds the traffic threshold in the first time bucket of a second, all TCP requests during the remaining time of that bucket will be throttled and counted as traffic throttling events.
How Does CKafka Perform Traffic Throttling?
To ensure service stability, CKafka implements traffic control on both message production and consumption.
When the sum of traffic of all your replicas exceeds the peak traffic purchased, traffic throttling is performed.
When traffic throttling occurs on the producer, CKafka will extend the response time of a TCP connection. The delay time depends on how much the user's instantaneous traffic exceeds the limit. This is similar to the road traffic control principle. The more the traffic exceeds the threshold, the greater the delay value calculated by the algorithm, with a maximum delay of 5 minutes.
When traffic throttling occurs on the consumer, CKafka will reduce the value of fetch.request.max.bytes each time the consumer pulls data from it to control the traffic on the consumer.
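The producer-side behavior (a delay that grows with the excess and is capped at 5 minutes) can be sketched as below. The linear formula and the one-second window are assumptions chosen for illustration only; this document does not specify the algorithm CKafka actually uses. Only the 5-minute cap comes from the text above.

```python
MAX_DELAY_MS = 5 * 60 * 1000  # maximum delay of 5 minutes, as stated above


def response_delay_ms(observed_mb_s: float, limit_mb_s: float,
                      window_ms: int = 1000) -> int:
    """Hypothetical delay model: proportional to the relative excess, capped."""
    if observed_mb_s <= limit_mb_s:
        return 0  # within the limit, no added delay
    excess_ratio = (observed_mb_s - limit_mb_s) / limit_mb_s
    return min(int(excess_ratio * window_ms), MAX_DELAY_MS)


print(response_delay_ms(80, 100))      # 0
print(response_delay_ms(150, 100))     # 500
print(response_delay_ms(100000, 100))  # 300000 (capped at 5 minutes)
```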
How Do I Determine Whether CKafka Is Performing Traffic Throttling?
1. The instance list displays the health status of each cluster. When the health status displays Alarm, you can hover your mouse pointer over it to view the detailed data in the pop-up window. The data displays the current user's peak traffic and the number of traffic throttling events. You can use the data to determine whether the instance has experienced traffic throttling.
2. You can check the maximum traffic on the monitoring data page. If the value of the maximum traffic multiplied by the number of replicas is greater than the peak bandwidth purchased, it indicates that at least one traffic throttling event has occurred. You can configure a traffic throttling alarm to determine whether traffic throttling has occurred. For how to configure alarm rules, see Configuring an Alarm Policy.
3. On the monitoring page of the CKafka console, check the monitoring data of the instance. If the number of traffic throttling events is greater than 0, it indicates that traffic throttling has occurred. For specific steps, see Viewing Monitoring Data.
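The two numeric checks in the steps above can be expressed as small helper functions. This is a sketch with hypothetical names; the input values are assumed to be read manually from the console monitoring pages, not fetched through any API.

```python
def exceeds_purchased_bandwidth(max_traffic_mb_s: float, replicas: int,
                                peak_bandwidth_mb_s: float) -> bool:
    """Check from step 2: maximum traffic x replicas > purchased peak bandwidth."""
    return max_traffic_mb_s * replicas > peak_bandwidth_mb_s


def throttling_occurred(throttle_event_count: int) -> bool:
    """Check from step 3: any throttling event means throttling has occurred."""
    return throttle_event_count > 0


# Example: 40 MB/s maximum traffic, 3 replicas, 100 MB/s purchased bandwidth.
print(exceeds_purchased_bandwidth(40, 3, 100))  # True
print(exceeds_purchased_bandwidth(30, 3, 100))  # False
print(throttling_occurred(0))                   # False
```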