Kafka's traffic throttling mechanism is soft throttling; that is, when the user's traffic exceeds the quota, Kafka delays the response instead of returning an error to the client.
Take API traffic throttling as an example:
- Hard traffic throttling: if the call rate limit is 100 calls/sec, then when the client calls more than 100 times per second, the server returns an error, and the client must handle it in its business logic.
- Soft traffic throttling: if the call rate limit is 100 calls/sec and the normal response duration is 10 ms, then when the client calls more than 100 times per second, the server does not return an error; instead, it delays the response (for example, lengthening it well beyond 10 ms) so that the effective call rate falls back within the quota.
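The difference can be sketched with a toy rate limiter (a hypothetical illustration, not CKafka's implementation): the hard variant rejects over-quota calls with an error, while the soft variant sleeps to delay them.

```python
import time

class SoftLimiter:
    """Toy soft rate limiter: over-quota calls are delayed, never rejected."""
    def __init__(self, max_calls_per_sec):
        self.max_calls = max_calls_per_sec
        self.window_start = time.monotonic()
        self.count = 0

    def acquire(self):
        now = time.monotonic()
        if now - self.window_start >= 1.0:
            self.window_start, self.count = now, 0
        self.count += 1
        if self.count > self.max_calls:
            # Soft throttling: wait out the rest of the window instead of erroring.
            time.sleep(max(1.0 - (now - self.window_start), 0))

class HardLimiter(SoftLimiter):
    """Toy hard rate limiter: over-quota calls raise an error."""
    def acquire(self):
        now = time.monotonic()
        if now - self.window_start >= 1.0:
            self.window_start, self.count = now, 0
        self.count += 1
        if self.count > self.max_calls:
            # Hard throttling: return an error the client must handle.
            raise RuntimeError(f"quota exceeded: {self.max_calls} calls/sec")
```

With soft throttling the caller never sees an error; it simply observes longer response times, which is the behavior described above.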
Relationship between purchased bandwidth and production/consumption bandwidth:
- Maximum production bandwidth (per second) = purchased bandwidth/number of replicas
- Maximum consumption bandwidth (per second) = purchased bandwidth
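The two formulas above can be checked with a quick calculation (100 MB/sec purchased bandwidth and 2 replicas are illustrative values, not defaults):

```python
def max_production_bandwidth(purchased_mb_per_sec, replica_count):
    # Each produced byte is written to every replica, so production
    # bandwidth is the purchased bandwidth divided by the replica count.
    return purchased_mb_per_sec / replica_count

def max_consumption_bandwidth(purchased_mb_per_sec):
    # Consumption reads from a single replica, so the full purchased
    # bandwidth is available.
    return purchased_mb_per_sec

print(max_production_bandwidth(100, 2))  # → 50.0
print(max_consumption_bandwidth(100))    # → 100
```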
The underlying traffic throttling mechanism of a CKafka instance is implemented based on the token bucket algorithm. Each second is divided into multiple millisecond-level time buckets.
The traffic throttling policy divides each second (1,000 ms) into several time buckets. For example, if one second is divided into 10 time buckets, each bucket lasts 100 ms, and each bucket's throttling threshold is 1/10 of the instance's total specified speed. If the TCP request traffic in a time bucket exceeds this threshold, the response to the request is delayed according to the internal throttling algorithm, so the client cannot quickly receive the TCP response, which throttles the traffic over a period of time.
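The bucketing arithmetic described above can be sketched as follows (the delay algorithm itself is internal to CKafka; the 10-buckets-per-second split is the example value from the text):

```python
def bucket_duration_ms(buckets_per_sec=10):
    """Duration of each time bucket in milliseconds."""
    return 1000 / buckets_per_sec

def bucket_threshold(instance_mb_per_sec, buckets_per_sec=10):
    """Per-bucket throttling threshold in MB: an equal share of the
    instance's per-second specification."""
    return instance_mb_per_sec / buckets_per_sec

print(bucket_duration_ms())   # → 100.0 (ms per bucket)
print(bucket_threshold(100))  # → 10.0 (MB per bucket)
```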
As mentioned above, traffic throttling is measured in milliseconds, but the monitoring platform in the console collects data at the second level and aggregates the data at the minute level (maximum or average value).
Under the token bucket principle, a single bucket does not hard-cap the traffic. Suppose the bandwidth specification of instance A is 100 MB/sec; then the throttling threshold of each 100-ms time bucket is 100 MB/10 = 10 MB per bucket. If the production traffic of instance A reaches 30 MB in the first 100-ms time bucket of a certain second (3 times the threshold), the broker's throttling policy is triggered to increase the response delay. Suppose the normal TCP response time is 100 ms; once the threshold is exceeded, the response may be delayed by an extra 500 ms. The final traffic in this second is 30 MB * 1 + 0 MB * 5 + 10 MB * 4 = 70 MB, so the traffic speed in this second (70 MB/sec) is lower than the instance specification (100 MB/sec).
Suppose again that the bandwidth specification of instance A is 100 MB/sec, so the throttling threshold of each 100-ms time bucket is 10 MB. If the production traffic of instance A reaches 70 MB in the first 100-ms time bucket of a certain second (7 times the threshold), the broker's throttling policy is triggered to increase the response delay. Suppose the normal TCP response time is 100 ms; once the threshold is exceeded, the response may be delayed by an extra 800 ms. After the response is returned at the 900th ms, the client immediately sends another 70 MB of traffic, which lands in the 10th time bucket. The final traffic in this second is 70 MB * 1 + 0 MB * 8 + 70 MB * 1 = 140 MB, so the traffic speed in this second (140 MB/sec) is higher than the instance specification (100 MB/sec).
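The per-second totals in the two scenarios above can be verified by summing the traffic over the 10 buckets of one second:

```python
def second_total(bucket_traffic_mb):
    """Total traffic (MB) in one second, given per-bucket traffic for
    the 10 time buckets of that second."""
    assert len(bucket_traffic_mb) == 10
    return sum(bucket_traffic_mb)

# Scenario 1: 30 MB burst, ~500 ms added delay empties buckets 2-6,
# then traffic resumes at the 10 MB/bucket threshold.
s1 = second_total([30] + [0] * 5 + [10] * 4)

# Scenario 2: 70 MB burst, ~800 ms added delay empties buckets 2-9;
# the client re-sends 70 MB as soon as the delayed response arrives,
# landing it in the 10th bucket.
s2 = second_total([70] + [0] * 8 + [70])

print(s1, s2)  # → 70 140
```

This is why the second scenario overshoots the 100 MB/sec specification within that one second, even though throttling was applied.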
The number of traffic throttling events is counted based on TCP requests. If instance A exceeds the traffic threshold in the first time bucket of a certain second, every TCP request arriving in the remaining time of that time bucket is throttled and counted as a traffic throttling event.
To ensure service stability, CKafka implements network traffic control policies on both inbound and outbound message traffic.
For example, outbound traffic is controlled based on the `fetch.request.max.bytes` parameter of each fetch request to control the traffic.