Backup and Recovery Solution
TDSQL Boundless provides efficient and reliable backup and recovery capabilities that protect data and support business continuity in large-scale cluster scenarios. Data is backed up to COS on a regular schedule, and a backup can be restored to a selected point in time on a new instance. Backups consist of a full part and an incremental part: recovery first restores the most recent full backup taken at or before the selected point in time, then replays the incremental logs to bring the data forward to that point. Combining full and incremental backups allows rollback to a specific point in time without frequent full backups, which saves storage space and compute resources.
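As an illustration of how the base backup for a point-in-time recovery could be chosen, the following minimal Python sketch picks the latest successful full backup whose timestamp does not exceed the target time. The `FullBackup` record, its fields, and `pick_base_backup` are hypothetical names used only for this example; they are not part of the TDSQL Boundless interface.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class FullBackup:
    # Hypothetical record describing one full backup stored in COS.
    backup_id: str
    timestamp: int  # backup timestamp, e.g. seconds since epoch
    status: str     # "success" when every Region was backed up


def pick_base_backup(backups: Sequence[FullBackup], target_ts: int) -> Optional[FullBackup]:
    """Return the most recent successful full backup taken at or before target_ts."""
    candidates = [
        b for b in backups
        if b.status == "success" and b.timestamp <= target_ts
    ]
    # max() with default=None returns None when no backup qualifies.
    return max(candidates, key=lambda b: b.timestamp, default=None)
```

If no full backup qualifies, the function returns `None`, which corresponds to a target time that cannot be reached from the available backups.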
Full Backup
Full backup uploads the SST files corresponding to the consistent snapshot of current data to COS.
1. Preparation Phase
Disable the cluster Region split task and block DDL operations.
Obtain full Region metadata (including member list), global log sequence number (GLSV), and the committed Raft index for each Region.
Generate an exclusive backup timestamp to ensure TDSQL Boundless does not purge snapshot data at that point in time.
2. Backup Execution
Create a backup directory in cloud storage (such as COS) to store Region metadata.
Distribute backup tasks to each node; each node hard-links the corresponding SST files into a local temporary directory.
Monitor the task status and upload the SST files to cloud storage.
3. Finalization Phase
Verify the backup status of all Regions.
Re-enable Region splitting and DDL operations, and update the task status to "Completed".
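The hard-link step in the backup execution phase can be sketched as follows. This is a minimal illustration, assuming the SST files and the temporary directory are on the same file system (a requirement of hard links); `stage_sst_files` and its parameters are hypothetical names, not TDSQL Boundless functions.

```python
import os
from pathlib import Path
from typing import Iterable, List


def stage_sst_files(sst_paths: Iterable[Path], staging_dir: Path) -> List[Path]:
    """Hard-link each SST file into staging_dir and return the staged paths."""
    staging_dir.mkdir(parents=True, exist_ok=True)
    staged: List[Path] = []
    for src in sst_paths:
        dst = staging_dir / src.name
        # A hard link adds a second name for the same file data: no bytes are
        # copied, and the data stays reachable if the original name is removed.
        os.link(src, dst)
        staged.append(dst)
    return staged
```

Because a hard link does not copy data, staging is fast and uses almost no extra disk space, and the staged files remain available for upload even if the original file names are removed in the meantime.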
Incremental Backup
Incremental backup is an ongoing daily task that periodically scans Raft incremental logs and uploads them to COS.
1. Reading Logs: The incremental backup thread periodically scans the local Raft logs and backs up the committed entries.
2. Streaming Log Upload: Upload the logs to COS according to path rules, splitting files by date and Region.
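The two steps above can be sketched in Python as follows. The actual path rules are not specified here, so the key layout (`prefix/date/region-<id>/<index>.log`), the `LogEntry` fields, and both function names are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List


@dataclass(frozen=True)
class LogEntry:
    # Hypothetical shape of one Raft log entry.
    region_id: int
    index: int      # Raft log index
    timestamp: int  # seconds since epoch (assumed field)
    payload: bytes


def committed_entries(entries: Iterable[LogEntry], committed_index: int,
                      last_backed_up: int) -> List[LogEntry]:
    """Keep entries that are committed and have not been backed up yet."""
    return [e for e in entries if last_backed_up < e.index <= committed_index]


def object_key(prefix: str, entry: LogEntry) -> str:
    """Build an object key split by date and Region (assumed layout)."""
    day = datetime.fromtimestamp(entry.timestamp, tz=timezone.utc).strftime("%Y-%m-%d")
    # Zero-padded index so that lexicographic order matches log order.
    return f"{prefix}/{day}/region-{entry.region_id}/{entry.index:020d}.log"
```

Filtering on the committed index keeps uncommitted entries, which may still be discarded by Raft, out of the backup.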
Full Recovery
Full recovery downloads backup files from COS and uses them to initialize a new instance, restoring it to the full backup point in time.
1. Preparation Phase
Create a new target cluster (with the same number of replicas as the backup).
Select the backup data and verify its integrity: the backup must cover all Regions and its status must be "success".
Check the node capacity (must be greater than or equal to 120% of the total backup data size).
2. Restore Execution
Restore cluster metadata (GLSV, write barrier information).
Distribute restore tasks, with each node downloading SST files from cloud storage to local storage.
Load SST files to the corresponding Region to complete data recovery.
3. Finalization Phase
Verify the recovery status of all Regions, update the task status to "Completed", and output the recovery results.
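The integrity and capacity checks from the preparation phase can be expressed as simple predicates. The sketch below is illustrative only: the function names and the input shapes are hypothetical, and it assumes the 120% rule is evaluated against a single available-capacity figure.

```python
from typing import Dict, Set


def backup_is_complete(region_status: Dict[int, str], expected_regions: Set[int]) -> bool:
    """True when every expected Region is present with status "success"."""
    return all(region_status.get(r) == "success" for r in expected_regions)


def capacity_is_sufficient(available_bytes: int, backup_bytes: int) -> bool:
    """True when capacity is at least 120% of the total backup data size."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return available_bytes * 100 >= backup_bytes * 120
```

For example, a backup of 100 GB requires at least 120 GB of capacity before the restore is allowed to start.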
Incremental Recovery
Incremental recovery restores cluster data to a specified point in time by replaying the Raft logs (which record data changes) from incremental backups on top of the data restored from the full-backup SST files.
1. Full Recovery: Initiate full recovery, load the foundational data, and wait for its completion.
2. Pull Incremental Logs: Pull incremental backup logs from the full backup timestamp to the specified recovery timestamp.
3. Incremental Replay: Replay the Raft logs continuously, grouped by Region, until the data reaches the specified point in time. DDL-related changes and Region splits and merges are processed automatically during replay.
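A simplified view of steps 2 and 3 is sketched below: select the logs that fall between the full backup timestamp and the recovery timestamp, group them by Region, and apply them in Raft index order. The `RaftLog` fields and the key-value state are hypothetical simplifications; the sketch does not model DDL changes or Region splits and merges, and treating the lower time bound as exclusive is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class RaftLog:
    # Hypothetical, simplified log record: one key-value write per entry.
    region_id: int
    index: int
    timestamp: int
    key: str
    value: str


def plan_replay(logs: Iterable[RaftLog], full_backup_ts: int,
                target_ts: int) -> Dict[int, List[RaftLog]]:
    """Group logs in (full_backup_ts, target_ts] by Region, ordered by index."""
    grouped: Dict[int, List[RaftLog]] = defaultdict(list)
    for log in logs:
        if full_backup_ts < log.timestamp <= target_ts:
            grouped[log.region_id].append(log)
    for region_logs in grouped.values():
        region_logs.sort(key=lambda entry: entry.index)
    return dict(grouped)


def replay(state: Dict[str, str], plan: Dict[int, List[RaftLog]]) -> Dict[str, str]:
    """Apply each Region's logs in order on top of the restored state."""
    for region_id in sorted(plan):
        for log in plan[region_id]:
            state[log.key] = log.value
    return state
```

Logs newer than the recovery timestamp are never applied, which is what limits the restored data to the selected point in time.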