
Message Compression
Last updated: 2025-12-24 15:03:04

Background Description

In TDMQ for Apache Pulsar, the maximum message size is 5 MB, and a message with an oversized body fails to be sent. To send larger messages (for example, 20 MB), the client needs to compress them first.

Large Message Handling in TDMQ for Apache Pulsar

In TDMQ for Apache Pulsar, the maximum message size is 5 MB by default. If a producer attempts to send a message exceeding 5 MB, the message will fail to be sent. When the client sends a message that exceeds this limit, we can adopt the following two methods to handle it:
Chunking messages: TDMQ for Apache Pulsar provides the chunking messages feature. When the chunking mechanism is enabled, the client can automatically split large messages and ensure message integrity, and the consumer can automatically reassemble the messages.
Message compression: Identical character sequences in message data are replaced to reduce the message size. TDMQ for Apache Pulsar supports four compression algorithms: LZ4, ZLIB, ZSTD, and SNAPPY.
It is recommended that large messages be compressed.
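To make the chunking idea concrete, the following is a minimal sketch of splitting a payload into pieces no larger than the limit and reassembling them in order. It is an illustration only and is not the Pulsar client's chunking implementation; the class name `ChunkingSketch`, its methods, and the interpretation of 5 MB as 5 × 1024 × 1024 bytes are assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Illustrative only: NOT the Pulsar client's chunking implementation.
class ChunkingSketch {
    // Assumption: the 5 MB default limit is taken as 5 * 1024 * 1024 bytes.
    static final int MAX_MESSAGE_BYTES = 5 * 1024 * 1024;

    // Split a payload into consecutive chunks of at most maxChunkBytes each.
    static List<byte[]> split(byte[] payload, int maxChunkBytes) {
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < payload.length; offset += maxChunkBytes) {
            int end = Math.min(payload.length, offset + maxChunkBytes);
            chunks.add(Arrays.copyOfRange(payload, offset, end));
        }
        return chunks;
    }

    // Concatenate chunks in order, as a consumer would when reassembling.
    static byte[] reassemble(List<byte[]> chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            out.write(chunk, 0, chunk.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] payload = new byte[12 * 1024 * 1024]; // hypothetical 12 MB message
        new Random(42).nextBytes(payload);

        List<byte[]> chunks = split(payload, MAX_MESSAGE_BYTES);
        byte[] restored = reassemble(chunks);

        System.out.println("chunks: " + chunks.size());
        System.out.println("intact: " + Arrays.equals(payload, restored));
    }
}
```

Chunking keeps every piece under the limit regardless of content, whereas compression only helps when the message body is compressible.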

Compression Algorithm Analysis and Comparison

Algorithm Introduction

LZ4
LZ4 is a lossless data compression algorithm that delivers extremely fast compression and decompression speeds with minimal CPU consumption.
ZLIB
ZLIB is a widely used lossless data compression library that reduces the size of data sent and received, improving network transmission efficiency. It is based on a variant of the Lempel-Ziv algorithm (DEFLATE, which combines LZ77 with Huffman coding) and can often compress data to less than half of its original size.
ZSTD
ZSTD (Zstandard) is a lossless compression algorithm that combines LZ77-style matching with entropy coding such as Huffman coding. It is designed for real-time compression and handles different types of data efficiently. Compared with the other algorithms listed here, ZSTD achieves a higher compression ratio while keeping a balanced compression speed.
SNAPPY
SNAPPY is a lossless compression technique based on the LZ77 principle. Its core idea is that whenever a string repeats in the data stream, the later occurrence is replaced by a shorter code that refers to the earlier one, which reduces the size of the data stream.
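As a small illustration of what "lossless" means for the algorithms above, the sketch below compresses a repetitive text with the JDK's built-in ZLIB support (`java.util.zip.Deflater` and `Inflater`) and checks that decompression restores the original bytes exactly. LZ4, ZSTD, and SNAPPY are not part of the JDK, so only ZLIB is shown; the class name `ZlibRoundTrip` and the sample text are assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class ZlibRoundTrip {
    // Compress with ZLIB at the default compression level.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompress ZLIB data; malformed or truncated input raises an unchecked exception.
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("incomplete ZLIB stream");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("invalid ZLIB stream", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Hypothetical message body containing many repeated strings.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("pulsar-message ");
        }
        byte[] original = sb.toString().getBytes(StandardCharsets.UTF_8);

        byte[] compressed = compress(original);
        byte[] restored = decompress(compressed);

        System.out.println("original bytes:   " + original.length);
        System.out.println("compressed bytes: " + compressed.length);
        System.out.println("lossless:         " + Arrays.equals(original, restored));
    }
}
```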

Algorithm Comparison

| Compression Algorithm | Compression Ratio | Compression Speed | Decompression Speed |
|---|---|---|---|
| ZLIB 1.2.11-1 | 2.743 | 110 MB/s | 400 MB/s |
| LZ4 1.8.1 | 2.101 | 750 MB/s | 3,700 MB/s |
| ZSTD 1.3.4-1 | 2.877 | 470 MB/s | 1,380 MB/s |
| SNAPPY 1.1.4 | 2.091 | 530 MB/s | 1,800 MB/s |

Throughput: LZ4 > SNAPPY > ZSTD > ZLIB
Compression ratio: ZSTD > ZLIB > LZ4 > SNAPPY
Network bandwidth consumption: The SNAPPY algorithm consumes the most network bandwidth, and the ZSTD algorithm consumes the least.
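The comparison figures can be turned into rough estimates with simple arithmetic: estimated compressed size is the original size divided by the compression ratio, and estimated compression time is the original size divided by the compression speed. The sketch below applies this to a hypothetical 20 MB message using the ratios and speeds from the comparison above. These are generic benchmark figures, so the results are only indicative; actual values depend on the message body and the client hardware. The class and method names are assumptions made for the example.

```java
class CompressionEstimate {
    // Estimated compressed size in MB: original size divided by the compression ratio.
    static double compressedSizeMb(double originalMb, double ratio) {
        return originalMb / ratio;
    }

    // Estimated time in milliseconds to compress originalMb at speedMbPerSec.
    static double timeMs(double originalMb, double speedMbPerSec) {
        return originalMb / speedMbPerSec * 1000.0;
    }

    public static void main(String[] args) {
        // Ratios and compression speeds (MB/s) copied from the comparison above.
        String[] names = {"ZLIB", "LZ4", "ZSTD", "SNAPPY"};
        double[] ratios = {2.743, 2.101, 2.877, 2.091};
        double[] compressSpeeds = {110, 750, 470, 530};

        double messageMb = 20.0; // hypothetical message size

        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-6s size ~%.2f MB, compression time ~%.1f ms%n",
                    names[i],
                    compressedSizeMb(messageMb, ratios[i]),
                    timeMs(messageMb, compressSpeeds[i]));
        }
    }
}
```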

Test of Various Compression Algorithms

Test Results

Note:
The following test results are for reference only. The compression effect needs to be verified based on the specific message body content.
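| Message Size | Message Body | Compression Algorithm | Topic Monitoring Message Size | Client Message Compression Time | Message Sending Time |
|---|---|---|---|---|---|
| 5 MB | Random message body | LZ4 (threshold: 5 MB) | 9.95 MB | 31 ms | 0.049 ms |
| 5 MB | Random message body | ZLIB | 7.26 MB | 31 ms | 0.038 ms |
| 5 MB | Random message body | ZSTD | 8.20 MB | 31 ms | 0.039 ms |
| 5 MB | Random message body | SNAPPY (threshold: 5 MB) | 9.70 MB | 33 ms | 0.046 ms |
| 6 MB | Random message body | ZLIB (threshold: 6 MB) | 8.71 MB | 35 ms | 0.044 ms |
| 6 MB | Random message body | ZSTD (threshold: 6 MB) | 9.84 MB | 35 ms | 0.046 ms |
| 20 MB | Same message body | LZ4 | 0.16 MB | 41 ms | 0.006 ms |
| 20 MB | Same message body | ZLIB | 0.20 MB | 42 ms | 0.006 ms |
| 20 MB | Same message body | ZSTD | 0.01 MB | 42 ms | 0.003 ms |
| 20 MB | Same message body | SNAPPY | 2.47 MB | 41 ms | 0.021 ms |
| 40 MB | Same message body | LZ4 | 0.32 MB | 123 ms | 0.008 ms |
| 40 MB | Same message body | ZLIB | 0.39 MB | 122 ms | 0.008 ms |
| 40 MB | Same message body | ZSTD | 0.01 MB | 124 ms | 0.004 ms |
| 40 MB | Same message body | SNAPPY | 4.95 MB | 123 ms | 0.036 ms |
| 80 MB | Same message body | LZ4 | 0.63 MB | 241 ms | 0.009 ms |
| 80 MB | Same message body | ZLIB | 0.39 MB | 244 ms | 0.01 ms |
| 80 MB | Same message body | ZSTD | 0.01 MB | 243 ms | 0.004 ms |
| 80 MB | Same message body | SNAPPY (threshold: 80 MB) | 9.9 MB | 243 ms | 0.056 ms |
| 160 MB | Same message body | LZ4 | 1.26 MB | 484 ms | 0.013 ms |
| 160 MB | Same message body | ZLIB | 1.56 MB | 479 ms | 0.016 ms |
| 160 MB | Same message body | ZSTD | 0.03 MB | 481 ms | 0.004 ms |
| 320 MB | Same message body | LZ4 | 2.5 MB | 1,035 ms | 0.03 ms |
| 320 MB | Same message body | ZLIB | 3.1 MB | 1,008 ms | 0.027 ms |
| 320 MB | Same message body | ZSTD | 0.03 MB | 949 ms | 0.004 ms |
| 585 MB | Same message body | LZ4 | 4.59 MB | 1,705 ms | 0.027 ms |
| 585 MB | Same message body | ZLIB | 5.67 MB | 1,733 ms | 0.03 ms |
| 585 MB | Same message body | ZSTD | 0.11 MB | 1,722 ms | 0.006 ms |
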
Summary:
In purely random data streams, the compression efficiency of the four algorithms is not high. When the message size exceeds 5 MB, none of the four compression algorithms can compress the message below 5 MB.
In data streams with a lot of duplicate data, all four compression algorithms achieve high compression ratios. In the tests above, LZ4, ZLIB, and ZSTD compressed messages of close to 600 MB (585 MB) to roughly 5 MB or less, while SNAPPY reached its threshold at 80 MB.
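The contrast between random and repetitive message bodies can be reproduced locally before choosing a compression type. The sketch below is a minimal check that uses only the JDK's ZLIB implementation (`java.util.zip.Deflater`); the other three algorithms would need third-party libraries. The class name `CompressibilityDemo`, the 1 MB body size, and the sample pattern are assumptions made for the example, and the sizes it prints are not comparable to the topic monitoring values in the test results.

```java
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

class CompressibilityDemo {
    // Return the number of bytes ZLIB produces for the given input.
    static int compressedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[64 * 1024];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // Body filled with pseudo-random bytes (little or no redundancy).
    static byte[] randomBody(int size, long seed) {
        byte[] body = new byte[size];
        new Random(seed).nextBytes(body);
        return body;
    }

    // Body made of one short pattern repeated until the size is reached.
    static byte[] repeatedBody(int size) {
        byte[] pattern = "same-message-body ".getBytes(StandardCharsets.US_ASCII);
        byte[] body = new byte[size];
        for (int i = 0; i < size; i++) {
            body[i] = pattern[i % pattern.length];
        }
        return body;
    }

    public static void main(String[] args) {
        int size = 1024 * 1024; // 1 MB test bodies
        System.out.println("random body:   " + size + " -> "
                + compressedSize(randomBody(size, 1L)) + " bytes");
        System.out.println("repeated body: " + size + " -> "
                + compressedSize(repeatedBody(size)) + " bytes");
    }
}
```

Running a check like this against a sample of real message bodies gives a quick indication of whether compression alone is likely to bring them under the size limit.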

Message Compression Demo and Usage Test

For details about the message compression demo, see tdmq-sdk-Demo.

Usage Test

Producer-side calling parameters:
java -jar tdmq-sdk-demo-1.0-SNAPSHOT-jar-with-dependencies.jar pulsar://xxxx:6650 \
eyJrZXlJZCI6ImRlZmF1bHRfa2V5SWQiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJzdXBlcnVzZXIifQ.dYcCfp4XrdWRKdKaWylobY-_xEExfRCi1pMvNyZXbqU \
pulsar-78ra8ownxb7d/BigMSGSpace/BigMSGTopic subname 1 500 0 1 20480 1 0
Consumer-side calling parameters:
java -jar tdmq-sdk-demo-1.0-SNAPSHOT-jar-with-dependencies.jar pulsar://xxxx:6650 \
eyJrZXlJZCI6ImRlZmF1bHRfa2V5SWQiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJzdXBlcnVzZXIifQ.dYcCfp4XrdWRKdKaWylobY-_xEExfRCi1pMvNyZXbqU \
pulsar-92d7w2mjwmv9/BigMessSpace/BigMessTopic subname 1 500 1
