tencent cloud

Testing Scheme Introduction

Download
Focus Mode
Font Size
Last updated: 2026-05-27 10:47:00

Overview

This document introduces how to use the Star Schema dataset to conduct performance tests on Tencent Cloud TCHouse-C and provides a reference solution for data import and performance testing.

Preparations

Instance Purchasing

Refer to Quick Start to purchase a Tencent Cloud TCHouse-C instance. You can select the Standard, High-Performance, or Large-Storage type. For the kernel version, version 22.8 or later is recommended.

Prepare Test Machine

Prepare a Linux machine that can access the Tencent Cloud TCHouse-C service and install the clickhouse client tool on it. The test machine must be able to access the Tencent Cloud TCHouse-C service and requires at least 1.5 TB of storage space. For instructions on installing the clickhouse client tool on the test machine, see the Installation Guide.
After an instance is purchased, adjust the following parameters in the console:
Parameter Name
Configuration File
Feature
Recommended Value
max_threads
users.xml
Maximum number of threads allowed for a single query
Number of CPU Cores
max_insert_threads
users.xml
Maximum number of threads allowed for a single write operation
Number of CPU Cores
max_memory_usage | users.xml
Maximum memory allowed for a single query
Total memory
10GB
background_pool_size
users.xml
Size of the background task thread pool for the MergeTree engine
Number of CPU Cores * 2 
max_thread_pool_size
config.xml
Maximum number of threads that can be allocated in the global thread pool
20000 
max_open_files
config.xml
Maximum number of file handles that a process is allowed to open
1000000 
mark_cache_size
config.xml 
Size of the mark file cache
10737418240
For specific parameter adjustments, see Parameter Configuration.
Note:
After the adjustments are complete, restart the cluster.

Test Steps

Verifying the Software Version

Use the clickhouse client to access the Tencent Cloud TCHouse-C service and view the software version.
clickhouse client --host $HOST --port $PORT -q "select version()"
Note:
Ensure that the software version is greater than 22.8.*.

Preparing Data Generation Tools

$ git clone git@github.com:vadimtk/ssb-dbgen.git
$ cd ssb-dbgen
$ make

Generating Test Data

The ssb-dbgen tool supports data at two scales. Using the parameter -s 100 can generate approximately 600 million rows of data, while using the parameter -s 1000 can generate approximately 6 billion rows of data. The following approach is recommended:
# Generate approximately 6 billion rows of data (-s 1000 indicates the scale factor)
./dbgen -s 1000 -T c # Generates data for the customer table
./dbgen -s 1000 -T l # Generates data for the lineorder table
./dbgen -s 1000 -T p # Generates data for the part table
./dbgen -s 1000 -T s # Generates data for the supplier table

Creating Database Tables

On the Tencent Cloud TCHouse-C console, obtain the service entry information: the access IP address and service port. Record them as HOST and PORT respectively. Use the clickhouse client tool to connect to the Tencent Cloud TCHouse-C service and execute the following SQL:
CREATE TABLE customer
(
        C_CUSTKEY       UInt32,
        C_NAME          String,
        C_ADDRESS       String,
        C_CITY          LowCardinality(String),
        C_NATION        LowCardinality(String),
        C_REGION        LowCardinality(String),
        C_PHONE         String,
        C_MKTSEGMENT    LowCardinality(String)
)
ENGINE = MergeTree ORDER BY (C_CUSTKEY);

CREATE TABLE lineorder
(
    LO_ORDERKEY             UInt32,
    LO_LINENUMBER           UInt8,
    LO_CUSTKEY              UInt32,
    LO_PARTKEY              UInt32,
    LO_SUPPKEY              UInt32,
    LO_ORDERDATE            Date,
    LO_ORDERPRIORITY        LowCardinality(String),
    LO_SHIPPRIORITY         UInt8,
    LO_QUANTITY             UInt8,
    LO_EXTENDEDPRICE        UInt32,
    LO_ORDTOTALPRICE        UInt32,
    LO_DISCOUNT             UInt8,
    LO_REVENUE              UInt32,
    LO_SUPPLYCOST           UInt32,
    LO_TAX                  UInt8,
    LO_COMMITDATE           Date,
    LO_SHIPMODE             LowCardinality(String)
)
ENGINE = MergeTree PARTITION BY toYear(LO_ORDERDATE) ORDER BY (LO_ORDERDATE, LO_ORDERKEY);

CREATE TABLE part
(
        P_PARTKEY       UInt32,
        P_NAME          String,
        P_MFGR          LowCardinality(String),
        P_CATEGORY      LowCardinality(String),
        P_BRAND         LowCardinality(String),
        P_COLOR         LowCardinality(String),
        P_TYPE          LowCardinality(String),
        P_SIZE          UInt8,
        P_CONTAINER     LowCardinality(String)
)
ENGINE = MergeTree ORDER BY P_PARTKEY;

CREATE TABLE supplier
(
        S_SUPPKEY       UInt32,
        S_NAME          String,
        S_ADDRESS       String,
        S_CITY          LowCardinality(String),
        S_NATION        LowCardinality(String),
        S_REGION        LowCardinality(String),
        S_PHONE         String
)
ENGINE = MergeTree ORDER BY S_SUPPKEY;

Importing Test Data

First, import the base table data:
$ clickhouse client --host $HOST --port $PORT --query "INSERT INTO customer FORMAT CSV" < customer.tbl
$ clickhouse client --host $HOST --port $PORT --query "INSERT INTO part FORMAT CSV" < part.tbl
$ clickhouse client --host $HOST --port $PORT --query "INSERT INTO supplier FORMAT CSV" < supplier.tbl
$ clickhouse client --host $HOST --port $PORT --query "INSERT INTO lineorder FORMAT CSV" < lineorder.tbl
Next, generate the wide table data based on the base table data.
Note that the max_memory_usage and max_insert_threads parameters have been adjusted.
CREATE TABLE lineorder_flat
ENGINE = MergeTree ORDER BY (LO_ORDERDATE, LO_ORDERKEY)
AS SELECT
    l.LO_ORDERKEY AS LO_ORDERKEY,
    l.LO_LINENUMBER AS LO_LINENUMBER,
    l.LO_CUSTKEY AS LO_CUSTKEY,
    l.LO_PARTKEY AS LO_PARTKEY,
    l.LO_SUPPKEY AS LO_SUPPKEY,
    l.LO_ORDERDATE AS LO_ORDERDATE,
    l.LO_ORDERPRIORITY AS LO_ORDERPRIORITY,
    l.LO_SHIPPRIORITY AS LO_SHIPPRIORITY,
    l.LO_QUANTITY AS LO_QUANTITY,
    l.LO_EXTENDEDPRICE AS LO_EXTENDEDPRICE,
    l.LO_ORDTOTALPRICE AS LO_ORDTOTALPRICE,
    l.LO_DISCOUNT AS LO_DISCOUNT,
    l.LO_REVENUE AS LO_REVENUE,
    l.LO_SUPPLYCOST AS LO_SUPPLYCOST,
    l.LO_TAX AS LO_TAX,
    l.LO_COMMITDATE AS LO_COMMITDATE,
    l.LO_SHIPMODE AS LO_SHIPMODE,
    c.C_NAME AS C_NAME,
    c.C_ADDRESS AS C_ADDRESS,
    c.C_CITY AS C_CITY,
    c.C_NATION AS C_NATION,
    c.C_REGION AS C_REGION,
    c.C_PHONE AS C_PHONE,
    c.C_MKTSEGMENT AS C_MKTSEGMENT,
    s.S_NAME AS S_NAME,
    s.S_ADDRESS AS S_ADDRESS,
    s.S_CITY AS S_CITY,
    s.S_NATION AS S_NATION,
    s.S_REGION AS S_REGION,
    s.S_PHONE AS S_PHONE,
    p.P_NAME AS P_NAME,
    p.P_MFGR AS P_MFGR,
    p.P_CATEGORY AS P_CATEGORY,
    p.P_BRAND AS P_BRAND,
    p.P_COLOR AS P_COLOR,
    p.P_TYPE AS P_TYPE,
    p.P_SIZE AS P_SIZE,
    p.P_CONTAINER AS P_CONTAINER
FROM lineorder AS l
INNER JOIN customer AS c ON c.C_CUSTKEY = l.LO_CUSTKEY
INNER JOIN supplier AS s ON s.S_SUPPKEY = l.LO_SUPPKEY
INNER JOIN part AS p ON p.P_PARTKEY = l.LO_PARTKEY;

Optimizing Queries (Optional)

Tencent Cloud TCHouse-C provides precomputation capabilities to accelerate execution. Here, PROJECTION is used to accelerate queries. Execute the following SQL:
    ALTER TABLE lineorder_flat
    ADD PROJECTION p1
    (
        SELECT 
            toYear(LO_ORDERDATE) AS year,
            sum(LO_REVENUE)
        GROUP BY 
            year,
            P_BRAND,
            P_CATEGORY,
            S_REGION
    );
    ALTER TABLE lineorder_flat
    ADD PROJECTION p2
    (
        SELECT 
            toYear(LO_ORDERDATE) AS year,
            sum(LO_REVENUE)
        GROUP BY 
            year,
            C_NATION,
            S_NATION,
            C_REGION,
            S_REGION
    );
    ALTER TABLE lineorder_flat
    ADD PROJECTION p3
    (
        SELECT 
            toYear(LO_ORDERDATE) AS year,
            sum(LO_REVENUE)
        GROUP BY 
            year,
            C_CITY,
            S_CITY
    );
    ALTER TABLE lineorder_flat
    ADD PROJECTION p4
    (
        SELECT 
            toYear(LO_ORDERDATE) AS year,
            sum(LO_REVENUE)
        GROUP BY 
            year,
            C_NATION,
            C_CITY,
            S_NATION,
            S_CITY
    );
    ALTER TABLE lineorder_flat
    ADD PROJECTION p5
    (
        SELECT 
            toYear(LO_ORDERDATE) AS year,
            sum(LO_REVENUE - LO_SUPPLYCOST) AS profit
        GROUP BY 
            year,
            C_NATION,
            C_REGION,
            S_REGION,
            P_MFGR,
            P_MFGR
    );
    ALTER TABLE lineorder_flat
    ADD PROJECTION p6
    (
        SELECT 
            toYear(LO_ORDERDATE) AS year,
            sum(LO_REVENUE - LO_SUPPLYCOST)
        GROUP BY 
            year,
            S_NATION,
            P_CATEGORY,
            C_REGION,
            S_REGION,
            P_MFGR
    );
    ALTER TABLE lineorder_flat
    ADD PROJECTION p7
    (
        SELECT 
            toYear(LO_ORDERDATE) AS year,
            sum(LO_REVENUE - LO_SUPPLYCOST)
        GROUP BY 
            year,
            S_CITY,
            P_BRAND,
            S_NATION,
            P_CATEGORY
    );    
After the preceding SQL is executed, you need to process the existing data to make the PROJECTION effective on it. Execute the following SQL:
ALTER TABLE lineorder_flat MATERIALIZE PROJECTION p1;
ALTER TABLE lineorder_flat MATERIALIZE PROJECTION p2;
ALTER TABLE lineorder_flat MATERIALIZE PROJECTION p3;
ALTER TABLE lineorder_flat MATERIALIZE PROJECTION p4;
ALTER TABLE lineorder_flat MATERIALIZE PROJECTION p5;
ALTER TABLE lineorder_flat MATERIALIZE PROJECTION p6;
ALTER TABLE lineorder_flat MATERIALIZE PROJECTION p7;
Note:
This step is optional. When the optimization is applied, the performance improvement is very significant.

Executing Test SQL and Collecting Execution Time Data

Q1.1
SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
FROM lineorder_flat
WHERE toYear(LO_ORDERDATE) = 1993 AND LO_DISCOUNT BETWEEN 1 AND 3 AND LO_QUANTITY < 25;
Q1.2

SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
FROM lineorder_flat
WHERE toYYYYMM(LO_ORDERDATE) = 199401 AND LO_DISCOUNT BETWEEN 4 AND 6 AND LO_QUANTITY BETWEEN 26 AND 35;
Q1.3
SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
FROM lineorder_flat
WHERE toISOWeek(LO_ORDERDATE) = 6 AND toYear(LO_ORDERDATE) = 1994
  AND LO_DISCOUNT BETWEEN 5 AND 7 AND LO_QUANTITY BETWEEN 26 AND 35;
Q2.1
SELECT
    sum(LO_REVENUE),
    toYear(LO_ORDERDATE) AS year,
    P_BRAND
FROM lineorder_flat
WHERE P_CATEGORY = 'MFGR#12' AND S_REGION = 'AMERICA'
GROUP BY
    year,
    P_BRAND
ORDER BY
    year,
    P_BRAND;
Q2.2
SELECT
    sum(LO_REVENUE),
    toYear(LO_ORDERDATE) AS year,
    P_BRAND
FROM lineorder_flat
WHERE P_BRAND >= 'MFGR#2221' AND P_BRAND <= 'MFGR#2228' AND S_REGION = 'ASIA'
GROUP BY
    year,
    P_BRAND
ORDER BY
    year,
    P_BRAND;
Q2.3
SELECT
    sum(LO_REVENUE),
    toYear(LO_ORDERDATE) AS year,
    P_BRAND
FROM lineorder_flat
WHERE P_BRAND = 'MFGR#2239' AND S_REGION = 'EUROPE'
GROUP BY
    year,
    P_BRAND
ORDER BY
    year,
    P_BRAND;
Q3.1
SELECT
    C_NATION,
    S_NATION,
    toYear(LO_ORDERDATE) AS year,
    sum(LO_REVENUE) AS revenue
FROM lineorder_flat
WHERE C_REGION = 'ASIA' AND S_REGION = 'ASIA' AND year >= 1992 AND year <= 1997
GROUP BY
    C_NATION,
    S_NATION,
    year
ORDER BY
    year ASC,
    revenue DESC;
Q3.2
SELECT
    C_CITY,
    S_CITY,
    toYear(LO_ORDERDATE) AS year,
    sum(LO_REVENUE) AS revenue
FROM lineorder_flat
WHERE C_NATION = 'UNITED STATES' AND S_NATION = 'UNITED STATES' AND year >= 1992 AND year <= 1997
GROUP BY
    C_CITY,
    S_CITY,
    year
ORDER BY
    year ASC,
    revenue DESC;
Q3.3
SELECT
    C_CITY,
    S_CITY,
    toYear(LO_ORDERDATE) AS year,
    sum(LO_REVENUE) AS revenue
FROM lineorder_flat
WHERE (C_CITY = 'UNITED KI1' OR C_CITY = 'UNITED KI5') AND (S_CITY = 'UNITED KI1' OR S_CITY = 'UNITED KI5') AND year >= 1992 AND year <= 1997
GROUP BY
    C_CITY,
    S_CITY,
    year
ORDER BY
    year ASC,
    revenue DESC;
Q3.4
SELECT
    C_CITY,
    S_CITY,
    toYear(LO_ORDERDATE) AS year,
    sum(LO_REVENUE) AS revenue
FROM lineorder_flat
WHERE (C_CITY = 'UNITED KI1' OR C_CITY = 'UNITED KI5') AND (S_CITY = 'UNITED KI1' OR S_CITY = 'UNITED KI5') AND toYYYYMM(LO_ORDERDATE) = 199712
GROUP BY
    C_CITY,
    S_CITY,
    year
ORDER BY
    year ASC,
    revenue DESC;
Q4.1
SELECT
    toYear(LO_ORDERDATE) AS year,
    C_NATION,
    sum(LO_REVENUE - LO_SUPPLYCOST) AS profit
FROM lineorder_flat
WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND (P_MFGR = 'MFGR#1' OR P_MFGR = 'MFGR#2')
GROUP BY
    year,
    C_NATION
ORDER BY
    year ASC,
    C_NATION ASC;
Q4.2
SELECT
    toYear(LO_ORDERDATE) AS year,
    S_NATION,
    P_CATEGORY,
    sum(LO_REVENUE - LO_SUPPLYCOST) AS profit
FROM lineorder_flat
WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND (year = 1997 OR year = 1998) AND (P_MFGR = 'MFGR#1' OR P_MFGR = 'MFGR#2')
GROUP BY
    year,
    S_NATION,
    P_CATEGORY
ORDER BY
    year ASC,
    S_NATION ASC,
    P_CATEGORY ASC;
Q4.3
SELECT
    toYear(LO_ORDERDATE) AS year,
    S_CITY,
    P_BRAND,
    sum(LO_REVENUE - LO_SUPPLYCOST) AS profit
FROM lineorder_flat
WHERE S_NATION = 'UNITED STATES' AND (year = 1997 OR year = 1998) AND P_CATEGORY = 'MFGR#14'
GROUP BY
    year,
    S_CITY,
    P_BRAND
ORDER BY
    year ASC,
    S_CITY ASC,
    P_BRAND ASC;

Summary

Performance testing, as a critical step before Tencent Cloud TCHouse-C service onboarding, serves as a key basis for performance evaluation and resource assessment.
When a business faces multiple system options, it typically conducts performance comparison tests. During the comparison testing process, pay attention to the following points:
The performance of Tencent Cloud TCHouse-C is affected by some of its key parameters. You must adjust them appropriately to fully leverage its performance advantages.
Resources need to be aligned. For example, in a one-time distributed query, only 1/2 of the nodes in Tencent Cloud TCHouse-C participate in computation, whereas all nodes in other systems do. Under these conditions and with equivalent data volumes, Tencent Cloud TCHouse-C may not show superior performance metrics. In this case, you can adjust the cluster backup policy to involve all nodes in computation. 
Tencent Cloud TCHouse-C incorporates many unique performance optimization mechanisms. Enabling these mechanisms can significantly improve query performance.


Help and Support

Was this page helpful?

Help us improve! Rate your documentation experience in 5 mins.

Feedback