Tencent Cloud

TDSQL-C for MySQL


Elastic Anti-Jitter Capabilities

Last updated: 2025-12-29 10:07:01
The Serverless service provides elastic anti-jitter capabilities that help it adapt to different workloads and performance requirements. This document describes the elastic anti-jitter capabilities of TDSQL-C for MySQL instances in a Serverless cluster.

Background

In the InnoDB storage engine of TDSQL-C for MySQL, the Buffer Pool is a critical memory area that caches data and indexes. When data is queried, InnoDB first searches the Buffer Pool; if the data is cached there, InnoDB can return the result immediately, avoiding the overhead of disk I/O. Correctly sizing the Buffer Pool is therefore particularly important for database performance.

The Serverless architecture decouples compute from the purchased instance specification: the database can be automatically started, stopped, and scaled based on its actual load, achieving extreme elasticity in computing resources. Computing resources are measured in CCUs (CPU + memory). CPU can be limited by technologies such as cgroups or Docker, while memory is allocated to the database process, with the majority used by the Buffer Pool to cache user data. Allocating and releasing Buffer Pool memory involves redistributing and relocating user data as well as contending for global resources in the kernel, so correctly sizing the Buffer Pool enables the TDSQL-C for MySQL Serverless service to provide more stable elastic scaling.
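The cache-hit behavior described above can be observed on any running MySQL instance with standard status variables; the statements below are illustrative, using only stock MySQL variables.

```sql
-- Illustrative: estimate the Buffer Pool hit ratio.
-- A high Innodb_buffer_pool_read_requests count relative to
-- Innodb_buffer_pool_reads (reads that had to go to disk) means
-- most lookups are served from memory.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';

-- Current Buffer Pool size, in MB.
SELECT @@innodb_buffer_pool_size / 1024 / 1024 AS buffer_pool_mb;
```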
Official MySQL supports dynamic resizing of the Buffer Pool by adjusting the parameter innodb_buffer_pool_size; the resize task completes in the background, and if the parameter is changed again before the task finishes, the new change is ignored. The scaling-out (growing) logic is relatively simple, while the scaling-in (shrinking) logic is more complex and is where bottlenecks tend to occur, such as I/O bottlenecks, free/lru list mutex bottlenecks, and global lock bottlenecks. The kernel team has therefore made a series of kernel-level optimizations for the TDSQL-C for MySQL Serverless service, enabling the database to scale elastically with greater stability.
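The online resize flow can be sketched with the following statements; the 4 GB target here is illustrative, while the variable names are standard MySQL ones.

```sql
-- Illustrative: trigger an online Buffer Pool resize. The requested size
-- is rounded to a multiple of
-- innodb_buffer_pool_chunk_size * innodb_buffer_pool_instances.
SET GLOBAL innodb_buffer_pool_size = 4 * 1024 * 1024 * 1024;  -- 4 GB

-- The resize runs in the background; poll its progress here. A new
-- SET issued before completion is ignored.
SHOW STATUS LIKE 'Innodb_buffer_pool_resize_status';
```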

Bottleneck Analysis and Optimization Scheme

I/O Bottleneck

While testing against official MySQL 8.0, the kernel team found that the main bottleneck in scaling in is flushing the lru list: in most scenarios the first scan of the free list cannot satisfy the reclaim requirement, and the scan depth is determined by the number of blocks to be reclaimed, which may be large. buf_flush_do_batch then requires the page cleaner to persist pages, during which the lru mutex is frequently acquired and released, competing with user threads and producing glitches. Page persistence involves I/O operations, which are the main bottleneck. This issue can be avoided entirely in the TDSQL-C for MySQL architecture, because pages in the distributed storage are generated asynchronously by applying redo logs at the storage layer: compute nodes do not need a page cleaner and can simply discard the pages selected for eviction. The product architecture is shown in the figure below.


Free/lru List Mutex Bottleneck

In the main scaling-in process, each loop traverses the free list and the lru list while holding the corresponding mutex. During this time, user threads performing reads or writes may need to acquire the free/lru list mutex and are blocked. All blocks in the Buffer Pool are maintained on these two linked lists, so the traversal has O(N) time complexity (N being the number of blocks). N may be very large, so the mutex is held continuously for a long time, causing glitches for users.

Optimization Scheme

The optimization is to traverse, by address, only the blocks in the chunks being reclaimed. The number of blocks traversed then depends on the size of the shrink rather than on the size of the entire Buffer Pool, and the locking granularity changes from the entire lru list to a single block, reducing both the lock scope and the lock hold time.
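The complexity difference can be illustrated with a small sketch. This is not TXSQL source code; the names (Block, shrink_by_list_scan, shrink_by_chunk_address) are invented for illustration, under the assumption that blocks carry their own lightweight locks.

```python
# Sketch: contrasts the baseline list-scan shrink (one long critical
# section over ALL blocks) with the per-chunk, per-block-lock shrink.
import threading
from dataclasses import dataclass, field

@dataclass
class Block:
    in_use: bool = False
    lock: threading.Lock = field(default_factory=threading.Lock)

def shrink_by_list_scan(all_blocks, list_mutex, n_to_reclaim):
    """Baseline: hold the free/lru list mutex while scanning the whole
    list -- O(N) in the total number of blocks, blocking user threads
    for the entire traversal."""
    reclaimed = 0
    with list_mutex:                       # one long critical section
        for blk in all_blocks:             # N = all blocks in the pool
            if not blk.in_use and reclaimed < n_to_reclaim:
                reclaimed += 1
    return reclaimed

def shrink_by_chunk_address(chunk_blocks):
    """Optimized: visit only the blocks of the chunk being reclaimed,
    by address, locking one block at a time -- cost scales with the
    shrink size, and each lock is held only briefly."""
    reclaimed = 0
    for blk in chunk_blocks:               # only this chunk's blocks
        with blk.lock:                     # per-block lock, short hold
            if not blk.in_use:
                reclaimed += 1
    return reclaimed
```

The key design point is that the optimized path never takes a list-wide mutex at all, so concurrent user threads are only ever delayed by one block-level lock at a time.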

Global Lock Bottleneck

Whether scaling out or scaling in, one step requires acquiring the global lock of the Buffer Pool, during which the Buffer Pool is almost unavailable to users. If this step takes too long, users experience a temporary performance drop, known as a "glitch". Analysis shows that three stages of this process are the main time consumers:
Reclaiming chunk memory and freeing block mutexes.
Allocating chunk memory and initializing blocks.
Resize Hash.

Optimization Scheme

For the first two stages, the kernel team delays the release of chunks and pre-allocates chunks in advance, so that the main work is done outside the buffer pool mutex. The original complexity is O(N) with N being the number of blocks (block initialization and freeing dominate); after optimization, N becomes the number of chunks, which keeps the cost controllable.

The Resize Hash stage is essentially a rehash problem. The fundamental solution would be an algorithmic change, such as a lock-free Hash or consistent Hash allocation, but the Hash Table is a fundamental component of InnoDB, so changing it carries high risk and a long development cycle. If the Hash Table is too large, space is wasted and many cells go unused; if it is too small, Hash collisions multiply and performance suffers. The kernel team therefore trades space for time: the trigger frequency is made configurable, and Resize Hash is reduced or not triggered within a certain scaling range.
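The chunk-level optimization for the first two stages can be sketched as follows. Again this is illustrative pseudocode rather than TXSQL internals; the function names and dict-based block descriptors are assumptions.

```python
# Sketch: expensive O(blocks) work is moved outside the global buffer
# pool mutex; the critical section only links/unlinks whole chunks,
# which is O(1) per chunk.
import threading

buffer_pool_mutex = threading.Lock()

def init_blocks(n_blocks):
    # Expensive O(blocks) work: build block descriptors up front.
    return [{"id": i, "free": True} for i in range(n_blocks)]

def grow_buffer_pool(pool_chunks, n_blocks_per_chunk):
    # 1. Pre-allocate and initialize the chunk BEFORE taking the mutex.
    new_chunk = init_blocks(n_blocks_per_chunk)   # O(blocks), unlocked
    # 2. Under the mutex, only link the chunk in: O(1).
    with buffer_pool_mutex:
        pool_chunks.append(new_chunk)

def shrink_buffer_pool(pool_chunks):
    # 1. Under the mutex, only unlink the chunk: O(1).
    with buffer_pool_mutex:
        retired = pool_chunks.pop()
    # 2. Free the chunk's blocks AFTER releasing the mutex (delayed
    #    release), so user threads are not blocked by the teardown.
    retired.clear()                               # O(blocks), unlocked
```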

Optimization Effect

In the test, sysbench oltp_read_only was run with long_query_time = 0.1, and the number of slow queries was counted. The comparison before and after optimization is as follows.
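The measurement setup can be reproduced with standard MySQL statements; only the workload itself (sysbench oltp_read_only) runs outside the database.

```sql
-- Illustrative: count queries slower than 0.1s during the benchmark.
SET GLOBAL long_query_time = 0.1;

-- ... run the sysbench oltp_read_only workload, then:
SHOW GLOBAL STATUS LIKE 'Slow_queries';
```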


Usage Instructions

For the bottlenecks above, in addition to the optimizations performed directly in the kernel, we provide a parameter for the Resize Hash stage that lets users choose whether the Hash Table is resized, in order to prevent glitches. Since the main cause of glitches in this stage is resizing the Hash Table, glitches can be avoided by not updating the Hash Table size. The parameter is described in detail below.
Parameter Name: innodb_ncdb_decrease_buffer_pool_hash_factor
Restart Required: Yes
Global Parameter: No
Default Value: 2
Value Range: 0 | 2
Description: Controls whether the Buffer Pool shrink thread reduces the Hash Table size.
A value of 2 indicates the disabled state: the Hash Table size is updated when the Buffer Pool shrinks by more than half.
A value of 0 indicates the enabled state: the Hash Table size is not updated.
Note:
When this parameter is in the enabled state, changing the upper limit of the node computing power will restart the instance.

Kernel Versions Supported by Parameters

Kernel version TXSQL 5.7 2.1.12 and later.
Kernel version TXSQL 8.0 3.1.14 and later.

Parameter Setting Method

For the setting method, see Setting Instance Parameters.

