As the business grows and data traffic increases, producers and consumers that produce or consume large amounts of data, or that send requests at extremely high rates, can cause excessive resource consumption on the Broker, resulting in network I/O saturation and other issues. To keep excessive resource consumption on one node from affecting the entire business, CKafka applies a fine-grained traffic throttling scheme at the node level as a self-protection mechanism.
Overview
Take an instance with the 20 MB/s specification as an example:
To meet customer scenarios and requirements, CKafka is deployed with at least 3 nodes. For 20 MB/s of traffic, each node is therefore designed to handle about 6.67 MB/s of read and write traffic (20 MB/s divided by 3 nodes). We recommend setting the number of partitions to 2-3 times the number of nodes to balance request traffic and maximize performance.
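The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the even split across nodes is taken directly from the example rather than from any CKafka API.

```python
def per_node_limit(purchased_mb_s: float, nodes: int = 3) -> float:
    """Split the instance bandwidth evenly across broker nodes (MB/s)."""
    if nodes < 1:
        raise ValueError("an instance needs at least one node")
    return purchased_mb_s / nodes


def recommended_partitions(nodes: int = 3) -> range:
    """Partition counts from 2x to 3x the node count, inclusive."""
    return range(2 * nodes, 3 * nodes + 1)


if __name__ == "__main__":
    # 20 MB/s instance on the minimum 3-node deployment.
    print(f"{per_node_limit(20, 3):.2f} MB/s per node")
    print("recommended partition counts:", list(recommended_partitions(3)))
```

For the 20 MB/s example this gives roughly 6.67 MB/s per node and a recommended partition count between 6 and 9.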
Currently, traffic throttling capabilities of CKafka are provided at two levels, as described below:
Cluster-level traffic throttling
The overall write traffic limit is 20 MB/s, meaning that the maximum write traffic (including replicas) a customer can measure in testing is around 20 MB/s. With three nodes, each node handles 6.67 MB/s, so the maximum write traffic for an individual partition is 6.67 MB/s. If a node hosts two partitions, the maximum write traffic per partition is 3.33 MB/s, with replica traffic counted.
The overall read traffic limit is 20 MB/s, meaning that the maximum read traffic (excluding replicas) a customer can measure in testing is around 20 MB/s.
Topic-level traffic throttling
Customers can configure traffic throttling for Topics. For example, for a Topic with 2 replicas, the production traffic can be limited to 7 MB/s (including replicas), and the consumption traffic to 20 MB/s.
Production traffic throttling: 7 MB/s
Consumption traffic throttling: 20 MB/s
Traffic Throttling Mechanism Explanation
CKafka uses soft throttling: when user traffic exceeds the quota, the server delays its response packets instead of returning an error to the client.
Take API traffic throttling as an example:
Hard throttling: Assume that the call frequency limit is 100 times/s. The server returns an error when client calls exceed 100 times per second, and the client must handle the error according to its business logic.
Soft throttling: Assume that the call frequency limit is 100 times/s and that a request normally takes 10 ms. When client calls exceed 100 times per second:
The request takes 20 ms if the frequency exceeds 110 times per second.
The request takes 50 ms if the frequency exceeds 200 times per second.
This is friendly to the client. Traffic surges or fluctuations do not trigger errors or alarms, and the business keeps running normally.
Therefore, soft throttling is better for users in high-traffic scenarios like Kafka.
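The difference between the two approaches can be sketched as follows. This is a minimal illustration that reuses the figures from the example above; the function names and the `ThrottledError` exception are hypothetical, and the latency for rates between 100 and 110 calls per second is an assumption because the example gives no figure for that range.

```python
class ThrottledError(Exception):
    """Raised by the hard-throttling sketch when the quota is exceeded."""


QUOTA_PER_S = 100        # call frequency limit from the example
NORMAL_LATENCY_MS = 10   # normal request time from the example


def hard_throttle(calls_per_s: int) -> int:
    """Return the latency in ms, or raise once the quota is exceeded."""
    if calls_per_s > QUOTA_PER_S:
        raise ThrottledError(f"{calls_per_s} calls/s exceeds {QUOTA_PER_S}")
    return NORMAL_LATENCY_MS


def soft_throttle(calls_per_s: int) -> int:
    """Return the latency in ms; never raises, only slows responses down."""
    if calls_per_s > 200:
        return 50
    if calls_per_s > 110:
        return 20
    # Assumption: no extra delay is modeled at or below 110 calls/s.
    return NORMAL_LATENCY_MS


if __name__ == "__main__":
    for rate in (90, 150, 250):
        try:
            hard = f"{hard_throttle(rate)} ms"
        except ThrottledError:
            hard = "error"
        print(f"{rate} calls/s -> hard: {hard}, soft: {soft_throttle(rate)} ms")
```

With hard throttling the client has to catch and handle the error. With soft throttling it only sees slower responses.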
Relationship between the purchased bandwidth and production/consumption bandwidth:
Maximum production bandwidth (per second) = purchased bandwidth / number of replicas
Maximum consumption bandwidth (per second) = purchased bandwidth
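As a worked example of the two formulas, the sketch below computes both values for a 20 MB/s instance with 2 replicas. The function names are hypothetical and chosen only for illustration.

```python
def max_production_bandwidth(purchased_mb_s: float, replicas: int) -> float:
    """Maximum production bandwidth = purchased bandwidth / number of replicas."""
    if replicas < 1:
        raise ValueError("a topic needs at least one replica")
    return purchased_mb_s / replicas


def max_consumption_bandwidth(purchased_mb_s: float) -> float:
    """Maximum consumption bandwidth = purchased bandwidth."""
    return purchased_mb_s


if __name__ == "__main__":
    purchased, replicas = 20, 2
    print("production:", max_production_bandwidth(purchased, replicas), "MB/s")
    print("consumption:", max_consumption_bandwidth(purchased), "MB/s")
```

With 2 replicas, each produced message is written twice, so a 20 MB/s instance can accept at most 10 MB/s from producers while still serving up to 20 MB/s to consumers.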
Delayed Response Packet Traffic Throttling Principles
The underlying traffic throttling mechanism of CKafka instances is based on the token bucket principle.
The throttling policy divides each second (1,000 ms) into several time buckets, measured in milliseconds. For example, with 10 time buckets, each bucket is 100 ms long and its traffic limit is 1/10 of the total instance traffic. If the traffic of a TCP request in a given time bucket exceeds that bucket's limit, the response to that request is delayed by an amount determined by the internal throttling algorithm. Because the client cannot quickly receive the TCP response, its traffic is throttled.
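The time-bucket accounting described above can be sketched as follows. The internal throttling algorithm is not documented here, so the delay rule in this sketch (a delay proportional to how far the bucket is over its limit) is purely an assumption for illustration, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BucketThrottle:
    instance_limit_bytes_per_s: float
    buckets_per_second: int = 10
    used: float = 0.0          # bytes counted in the current time bucket
    current_bucket: int = -1   # index of the current time bucket

    @property
    def bucket_ms(self) -> float:
        return 1000 / self.buckets_per_second

    @property
    def bucket_limit(self) -> float:
        # Each time bucket gets an equal share of the per-second limit.
        return self.instance_limit_bytes_per_s / self.buckets_per_second

    def response_delay_ms(self, now_ms: float, request_bytes: float) -> float:
        """Account for a request and return the extra response delay in ms."""
        bucket = int(now_ms // self.bucket_ms)
        if bucket != self.current_bucket:
            # A new time bucket starts with an empty counter.
            self.current_bucket = bucket
            self.used = 0.0
        self.used += request_bytes
        excess = self.used - self.bucket_limit
        if excess <= 0:
            return 0.0
        # Assumed rule: delay in proportion to the excess over the limit.
        return excess / self.bucket_limit * self.bucket_ms


if __name__ == "__main__":
    throttle = BucketThrottle(instance_limit_bytes_per_s=20_000_000)
    for t_ms, size in [(0, 1_500_000), (50, 1_500_000), (120, 1_000_000)]:
        delay = throttle.response_delay_ms(t_ms, size)
        print(f"t={t_ms} ms, {size} bytes -> delay {delay} ms")
```

In this run the second request pushes the first 100 ms bucket over its 2,000,000-byte limit and is delayed, while the third request falls into a new bucket and is answered immediately.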