Release Notes

Product Introduction

Overview

Scenarios

Product Architecture

Instance Types

Compatibility Notes

Usage specification recommendations

Kernel Features

Kernel Overview

Kernel Version Release Notes

Functionality Features

Performance Features

Billing

Billing Overview

Purchase Method

Pricing Details

Renewal

Overdue Payments

Refund

Getting Started

Creating an Instance

Connect to Instances

User Guide

Data Migration

Data Subscription

Instance Management

Parameter Configuration

Account Management

Security Group

Backup and Restoration

Database Auditing

Tag Management

Use Cases

Technical Evolution and Usage Practices of Online DDL

Lock Mechanism Analysis and Troubleshooting Practices

Data Intelligent Scheduling and Related Practices for Performance Optimization

TDSQL Boundless Selection Guide and Practical Tutorial

Developer Guide

Developer Guide (MySQL Compatibility Mode)

Developer Guide (HBase Compatibility Mode)

Performance Tuning

Performance Tuning Overview

SQL Tuning

DDL Tuning

Performance White Paper

Performance Overview

TPC-C Test

Sysbench Test

API Documentation

History

Introduction

API Category

Making API Requests

Instance APIs

Security Group APIs

Task APIs

Backup APIs

Rollback APIs

Parameter APIs

Database APIs

Data Types

Error Codes

General Reference

System Architecture

SQL Reference

Database Parameter Description

TPC-H benchmark data model reference

Error Code Information

Security and Compliance

FAQs

Agreements

Service Level Agreement

Data Processing And Security Agreement

Glossary

Creating a Partitioned Table

PDF

聚焦模式

字号

最后更新时间： 2026-03-06 18:48:24

Overview
Partitioned tables (Partition Table) are the core means for TDSQL Boundless to handle massive data and achieve horizontal scaling. Unlike single tables, partitioned tables distribute data across multiple storage nodes in a cluster through partition keys (Partition Key). This enables the database to leverage the overall computing power and storage space of the cluster, breaking through the performance bottlenecks of a single machine.
When Should Partitioned Tables Be Used?
If your business meets any one of the following characteristics, be sure to choose partitioned tables:
1. Large data volumes or high-concurrency writes: requiring extremely high write throughput (TPS) beyond the capacity of a single machine.
2. Unlimited growth: The business is in a rapid development phase, unable to estimate the future data upper limit.
Select a Partitioned Strategy
TDSQL Boundless supports four main partitioning policies. Selecting the correct policy is the most critical step in table creation.
1. Partition using HASH
Introduction: Performs modulo operation on the partition key value to randomly and evenly scatter the data.
Applicable scenarios: The most recommended default choice. Suitable for scenarios where the partition key is an integer type (INT/BIGINT), such as user IDs or purely numeric order numbers.
Advantages: Excellent load balancing; high write performance.
2. Partition using KEY
Introduction: Similar to HASH partitioning but supports multiple data types except BLOB/TEXT. The system processes data using an internal hash function.
Applicable scenarios: Scenarios where the partition key is a string type (VARCHAR/CHAR), such as UUIDs, ID card numbers, or business serial numbers (String).
Advantages: Solves the issue where strings cannot be directly processed by HASH modulo operations, eliminating the need for conversion at the business layer (application layer).
3. RANGE partitioning
Introduction: Data is divided based on continuous ranges of key values (such as time periods and ID ranges).
Applicable scenarios: Time-series data (logs, transactions, monitoring).
Advantages: Fast range queries, and expired historical data can be instantly deleted via DROP PARTITION.
4. LIST partitioning
Introduction: Divides data based on enumerated values (discrete list).
Applicable scenarios: Data with clear categorical attributes, such as regions (provinces/cities) or tenant IDs.
Advantages: Clear data categorization facilitates isolated management by category.
Risk: Data skew. If the data volume of a specific category (such as "Guangdong Province") far exceeds that of other categories, the corresponding node will become a bottleneck.
Syntax and Example
Scenario A: Standard E-Commerce Orders (Using HASH)
The partition key is a purely numeric ID.
CREATE TABLE `mall`.`orders` (
  `order_id` bigint NOT NULL,
  `user_id` bigint NOT NULL,
  `amount` decimal(10,2),
  `create_time` datetime DEFAULT CURRENT_TIMESTAMP,
  -- The primary key must include the partition key
  PRIMARY KEY (`order_id`)
) 
PARTITION BY HASH(`order_id`) 
PARTITIONS 64; 
Scenario B: User Center (Using KEY)
The partition key is a UUID string, and KEY partitioning automatically handles hashing. 
CREATE TABLE `user_center`.`user_profiles` (
  `user_uuid` varchar(36) NOT NULL,
  `nick_name` varchar(50),
  `register_time` datetime,
  PRIMARY KEY (`user_uuid`)
) 
PARTITION BY KEY(`user_uuid`) 
PARTITIONS 64;
Scenario C: Operation Logs (Using RANGE)
Storing logs monthly facilitates regular cleanup of old data.
CREATE TABLE `sys`.`audit_logs` (
  `log_id` bigint NOT NULL,
  `content` text,
  `log_time` datetime NOT NULL,
  PRIMARY KEY (`log_id`, `log_time`)
) 
PARTITION BY RANGE COLUMNS(`log_time`) (
    PARTITION p202301 VALUES LESS THAN ('2023-02-01'),
    PARTITION p202302 VALUES LESS THAN ('2023-03-01'),
    PARTITION p_future VALUES LESS THAN (MAXVALUE)
);
Scenario D: By Geographic Distribution (Using LIST)
Businesses are explicitly partitioned by city, with data isolation between different cities.
CREATE TABLE `crm`.`customers` (
  `cust_id` bigint NOT NULL,
  `city_id` int NOT NULL, 
  `name` varchar(50),
  PRIMARY KEY (`cust_id`, `city_id`)
) 
PARTITION BY LIST(`city_id`) (
    PARTITION p_bj VALUES IN (10),
    PARTITION p_sh VALUES IN (20),
    PARTITION p_others VALUES IN (30)
);
Critical Constraints
Partitioned tables have a mandatory hard constraint that must be adhered to: violating this constraint will result in table creation failure.
The primary key and unique index must include the partition key.
Error example:
-- Error: The partition key user_id is not included in the primary key
CREATE TABLE `error_table` (
  `id` bigint NOT NULL PRIMARY KEY,
  `user_id` bigint NOT NULL
) PARTITION BY HASH(`user_id`);
Correct example:
-- Correct: Add user_id to the composite primary key
CREATE TABLE `correct_table` (
  `id` bigint NOT NULL,
  `user_id` bigint NOT NULL,
  PRIMARY KEY (`id`, `user_id`)
) PARTITION BY HASH(`user_id`);
Use Cases
1. the number of partitions
Policy: Pre-sharding should not only consider the current number of physical nodes, but also account for future cluster scale and expected total data volume.
Recommended value: It is recommended to set it to the maximum expected future number of nodes * 2, which ensures sufficient data dispersion.
2. Avoid modifying the partition key.
Try to avoid using UPDATE statements to modify the value of the partition key. This may cause data rows to migrate between physical nodes (Delete + Insert) at a high cost.
3. Query Optimization
Include the partition key: When querying, always include the partition key in the WHERE clause (for example, WHERE user_id = ?). This allows TDSQL Boundless to directly route the query to the specific node, delivering optimal performance.
Avoid full table scans: Queries without the partition key will trigger broadcast scans (Scatter-Gather), querying all nodes and aggregating the results, which consumes significant resources.
Frequently Asked Questions (FAQ)
What Is the Difference Between HASH and KEY Partitioning?
A: HASH primarily handles integers, with a simple and straightforward algorithm; KEY primarily handles strings (also supports integers), internally using a complex hash function. For string IDs, directly use KEY partitioning.
Q: Can I Use Multiple Columns as the Partition Key?
A: Yes, for example PARTITION BY HASH(col1 + col2)  or KEY(col1, col2). However, this increases complexity, and it is generally recommended to use a single high-cardinality column as the partition key.
Q: Can the partition key or partitioning type be modified after table creation?
A: Strongly discouraged. Although syntactically supported, this essentially constitutes a "reshuffle" of the entire dataset, consuming massive system resources (IO/CPU) and potentially causing prolonged business disruption.

帮助和支持

本页内容是否解决了您的问题？

您也可以联系销售或提交工单以寻求帮助。

填写满意度调查问卷，共创更好文档体验。

文档反馈

tencent cloud

TDSQL Boundless

Creating a Partitioned Table

Overview

When Should Partitioned Tables Be Used?

Select a Partitioned Strategy

Syntax and Example

Scenario A: Standard E-Commerce Orders (Using HASH)

Scenario B: User Center (Using KEY)

Scenario C: Operation Logs (Using RANGE)

Scenario D: By Geographic Distribution (Using LIST)

Critical Constraints

Use Cases

Frequently Asked Questions (FAQ)

What Is the Difference Between HASH and KEY Partitioning?

Q: Can I Use Multiple Columns as the Partition Key?

Q: Can the partition key or partitioning type be modified after table creation?

帮助和支持