TDMQ for Apache Pulsar

    Message Compression

    Last updated: 2025-12-24 15:03:04

    Background Description

    In TDMQ for Apache Pulsar, the maximum message size is 5 MB, and a message body that exceeds this limit fails to be sent. To send a larger message, for example a 20 MB message, the client needs to compress it first.

    Large Message Handling in TDMQ for Apache Pulsar

    By default, the maximum message size in TDMQ for Apache Pulsar is 5 MB, and a producer that attempts to send a larger message fails. When a client needs to send a message that exceeds this limit, either of the following two methods can be used:
    Chunking messages: TDMQ for Apache Pulsar provides the chunking messages feature. When the chunking mechanism is enabled, the client can automatically split large messages and ensure message integrity, and the consumer can automatically reassemble the messages.
    Message compression: Identical character sequences in message data are replaced to reduce the message size. TDMQ for Apache Pulsar supports four compression algorithms: LZ4, ZLIB, ZSTD, and SNAPPY.
    It is recommended that large messages be compressed.

    Compression Algorithm Analysis and Comparison

    Algorithm Introduction

    LZ4
    LZ4 is a lossless data compression algorithm that delivers extremely fast compression and decompression speeds with minimal CPU consumption.
    ZLIB
    ZLIB is a widely used lossless data compression library that reduces the size of sent and received data, which improves network transmission efficiency. It implements the DEFLATE algorithm, which combines LZ77 (a Lempel-Ziv variant) with Huffman coding. On compressible data, it can reduce the payload to less than half its original size.
    ZSTD
    ZSTD (Zstandard) is a lossless compression algorithm that combines LZ77-style matching with entropy coding (Huffman coding and finite state entropy). It compresses many types of data efficiently and is fast enough for real-time use. Compared with the other algorithms listed here, ZSTD achieves a higher compression ratio while keeping a balanced compression speed.
    SNAPPY
    SNAPPY is a lossless compression algorithm based on the LZ77 principle: when a string repeats one that appeared earlier in the data stream, the repeat is replaced with a shorter code that refers back to the earlier occurrence, which reduces the size of the data stream.
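The LZ77 principle shared by these algorithms can be illustrated with a toy encoder. This is a deliberately simplified sketch, not the actual LZ4, ZLIB, ZSTD, or SNAPPY format: the function names, the token layout, and the `window` and `min_match` parameters are all assumptions made for illustration.

```python
def lz77_encode(data: bytes, window: int = 255, min_match: int = 3) -> list:
    """Toy LZ77: emit literal byte values and (distance, length) back-references."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Search the sliding window for the longest earlier match.
        for j in range(max(0, i - window), i):
            length = 0
            # A match may run past position i (overlap), as in real LZ77.
            while (i + length < len(data)
                   and length < 255
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            tokens.append((best_dist, best_len))  # shorter code for a repeat
            i += best_len
        else:
            tokens.append(data[i])  # literal byte
            i += 1
    return tokens


def lz77_decode(tokens: list) -> bytes:
    """Rebuild the original bytes from literals and back-references."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            dist, length = tok
            for _ in range(length):
                out.append(out[-dist])  # byte-by-byte copy handles overlap
        else:
            out.append(tok)
    return bytes(out)


tokens = lz77_encode(b"abcabcabcabc")
print(tokens)  # [97, 98, 99, (3, 9)]
print(lz77_decode(tokens) == b"abcabcabcabc")  # True
```

Twelve input bytes become three literals and one back-reference. Real implementations add entropy coding, hashing for fast match search, and compact binary token formats on top of this idea.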

    Algorithm Comparison

    | Compression Algorithm | Compression Ratio | Compression Speed | Decompression Speed |
    | --- | --- | --- | --- |
    | ZLIB 1.2.11-1 | 2.743 | 110 MB/s | 400 MB/s |
    | LZ4 1.8.1 | 2.101 | 750 MB/s | 3,700 MB/s |
    | ZSTD 1.3.4-1 | 2.877 | 470 MB/s | 1,380 MB/s |
    | SNAPPY 1.1.4 | 2.091 | 530 MB/s | 1,800 MB/s |

    Throughput: LZ4 > SNAPPY > ZSTD > ZLIB
    Compression ratio: ZSTD > ZLIB > LZ4 > SNAPPY
    Network bandwidth consumption: The SNAPPY algorithm consumes the most network bandwidth, and the ZSTD algorithm consumes the least.
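The figures in the comparison table can be turned into a rough back-of-the-envelope estimate. The sketch below is only arithmetic on the table values: `estimate` and `max_uncompressed_mb` are hypothetical helpers, and the ratios come from a benchmark corpus, so the result for a real message body may differ considerably.

```python
# Values copied from the comparison table:
# (compression ratio, compression speed in MB/s, decompression speed in MB/s)
BENCHMARKS = {
    "ZLIB": (2.743, 110, 400),
    "LZ4": (2.101, 750, 3700),
    "ZSTD": (2.877, 470, 1380),
    "SNAPPY": (2.091, 530, 1800),
}


def estimate(algorithm: str, message_mb: float) -> dict:
    """Estimate compressed size and codec time for a message of message_mb MB."""
    ratio, comp_speed, decomp_speed = BENCHMARKS[algorithm]
    return {
        "compressed_mb": message_mb / ratio,
        "compress_ms": message_mb / comp_speed * 1000,
        # Assumption: decompression speed is measured against the original size.
        "decompress_ms": message_mb / decomp_speed * 1000,
    }


def max_uncompressed_mb(algorithm: str, limit_mb: float = 5.0) -> float:
    """Largest original size that would fit under limit_mb at the table's ratio."""
    return limit_mb * BENCHMARKS[algorithm][0]


by_speed = sorted(BENCHMARKS, key=lambda a: BENCHMARKS[a][1], reverse=True)
by_ratio = sorted(BENCHMARKS, key=lambda a: BENCHMARKS[a][0], reverse=True)
print(by_speed)  # ['LZ4', 'SNAPPY', 'ZSTD', 'ZLIB']
print(by_ratio)  # ['ZSTD', 'ZLIB', 'LZ4', 'SNAPPY']

for name in BENCHMARKS:
    print(name, estimate(name, 20), round(max_uncompressed_mb(name), 2))
```

The two sorted lists reproduce the throughput and compression-ratio rankings above. At these benchmark ratios, a message much larger than roughly 10 to 14 MB would not fit under a 5 MB limit, which is why the actual message content needs to be tested.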

    Test of Various Compression Algorithms

    Test Results

    Note:
    The following test results are for reference only. The compression effect needs to be verified based on the specific message body content.
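    | Message Size | Message | Compression Algorithm | Topic Monitoring Message Size | Client Message Compression Time | Message Sending Time |
    | --- | --- | --- | --- | --- | --- |
    | 5 MB | Random message body | LZ4 (threshold: 5 MB) | 9.95 MB | 31 ms | 0.049 ms |
    | 5 MB | Random message body | ZLIB | 7.26 MB | 31 ms | 0.038 ms |
    | 5 MB | Random message body | ZSTD | 8.20 MB | 31 ms | 0.039 ms |
    | 5 MB | Random message body | SNAPPY (threshold: 5 MB) | 9.70 MB | 33 ms | 0.046 ms |
    | 6 MB | Random message body | ZLIB (threshold: 6 MB) | 8.71 MB | 35 ms | 0.044 ms |
    | 6 MB | Random message body | ZSTD (threshold: 6 MB) | 9.84 MB | 35 ms | 0.046 ms |
    | 20 MB | Same message body | LZ4 | 0.16 MB | 41 ms | 0.006 ms |
    | 20 MB | Same message body | ZLIB | 0.20 MB | 42 ms | 0.006 ms |
    | 20 MB | Same message body | ZSTD | 0.01 MB | 42 ms | 0.003 ms |
    | 20 MB | Same message body | SNAPPY | 2.47 MB | 41 ms | 0.021 ms |
    | 40 MB | Same message body | LZ4 | 0.32 MB | 123 ms | 0.008 ms |
    | 40 MB | Same message body | ZLIB | 0.39 MB | 122 ms | 0.008 ms |
    | 40 MB | Same message body | ZSTD | 0.01 MB | 124 ms | 0.004 ms |
    | 40 MB | Same message body | SNAPPY | 4.95 MB | 123 ms | 0.036 ms |
    | 80 MB | Same message body | LZ4 | 0.63 MB | 241 ms | 0.009 ms |
    | 80 MB | Same message body | ZLIB | 0.39 MB | 244 ms | 0.01 ms |
    | 80 MB | Same message body | ZSTD | 0.01 MB | 243 ms | 0.004 ms |
    | 80 MB | Same message body | SNAPPY (threshold: 80 MB) | 9.9 MB | 243 ms | 0.056 ms |
    | 160 MB | Same message body | LZ4 | 1.26 MB | 484 ms | 0.013 ms |
    | 160 MB | Same message body | ZLIB | 1.56 MB | 479 ms | 0.016 ms |
    | 160 MB | Same message body | ZSTD | 0.03 MB | 481 ms | 0.004 ms |
    | 320 MB | Same message body | LZ4 | 2.5 MB | 1,035 ms | 0.03 ms |
    | 320 MB | Same message body | ZLIB | 3.1 MB | 1,008 ms | 0.027 ms |
    | 320 MB | Same message body | ZSTD | 0.03 MB | 949 ms | 0.004 ms |
    | 585 MB | Same message body | LZ4 | 4.59 MB | 1,705 ms | 0.027 ms |
    | 585 MB | Same message body | ZLIB | 5.67 MB | 1,733 ms | 0.03 ms |
    | 585 MB | Same message body | ZSTD | 0.11 MB | 1,722 ms | 0.006 ms |
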
    Summary:
    In purely random data streams, none of the four algorithms compresses well. When a random message exceeds 5 MB, none of the four compression algorithms can compress it to below 5 MB.
    In data streams with a lot of duplicate data, all four compression algorithms achieve high compression ratios. LZ4, ZLIB, and ZSTD can compress messages of up to about 600 MB to within 5 MB.
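The contrast between random and repetitive message bodies is easy to reproduce locally before choosing an algorithm. The following is a minimal sketch using the standard-library `zlib` module as a stand-in for the ZLIB option; `compression_ratio` is a hypothetical helper, and the 1 MB payloads are far smaller than the bodies used in the tests above so that the check runs quickly.

```python
import os
import zlib


def compression_ratio(payload: bytes) -> float:
    """Original size divided by the zlib-compressed size."""
    return len(payload) / len(zlib.compress(payload))


SIZE = 1024 * 1024  # 1 MB sample bodies

pattern = b"TDMQ for Apache Pulsar "
random_body = os.urandom(SIZE)                   # no repeated structure
repeated_body = pattern * (SIZE // len(pattern))  # highly repetitive

print(f"random body:   ratio {compression_ratio(random_body):.3f}")
print(f"repeated body: ratio {compression_ratio(repeated_body):.1f}")
```

A ratio at or slightly below 1 for the random body means compression gives no benefit, while the repeated body shrinks by orders of magnitude. Running the same measurement on a representative sample of real messages gives a more useful number than either extreme.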

    Message Compression Demo and Usage Test

    For details about the message compression demo, see tdmq-sdk-Demo.

    Usage Test

    Producer-side calling parameters:
    java -jar tdmq-sdk-demo-1.0-SNAPSHOT-jar-with-dependencies.jar pulsar://xxxx:6650 \
    eyJrZXlJZCI6ImRlZmF1bHRfa2V5SWQiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJzdXBlcnVzZXIifQ.dYcCfp4XrdWRKdKaWylobY-_xEExfRCi1pMvNyZXbqU \
    pulsar-78ra8ownxb7d/BigMSGSpace/BigMSGTopic subname 1 500 0 1 20480 1 0
    Consumer-side calling parameters:
    java -jar tdmq-sdk-demo-1.0-SNAPSHOT-jar-with-dependencies.jar pulsar://xxxx:6650 \
    eyJrZXlJZCI6ImRlZmF1bHRfa2V5SWQiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJzdXBlcnVzZXIifQ.dYcCfp4XrdWRKdKaWylobY-_xEExfRCi1pMvNyZXbqU \
    pulsar-92d7w2mjwmv9/BigMessSpace/BigMessTopic subname 1 500 1
    
