Architecture of the TDStore Distributed Storage Engine
TDStore is the storage engine that powers TDSQL Boundless. It is a distributed storage engine built on RocksDB. By adding a data sharding module and a distributed transaction module in its upper layers, and integrating a Raft consensus protocol module in its lower layer, TDStore transforms single-machine RocksDB into a distributed KV storage engine. This architecture delivers high scalability, high availability, distributed transactions, automatic data balancing and scheduling, and strong consistency across multiple replicas.
Single-Machine KV Storage Engine (RocksDB)
Receives KV requests passed from the computing layer.
Stores data using a Log-Structured Merge-Tree (LSM-Tree) structure.
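The LSM-Tree idea can be illustrated with a small sketch. This is not RocksDB or TDStore code: the class TinyLSM, its method names, and the memtable_limit parameter are hypothetical, and the sketch leaves out the write-ahead log, levels, and bloom filters that a real engine uses. It only shows the core pattern of buffering writes in memory, flushing them as sorted immutable runs, and reading the newest data first.

```python
import bisect


class TinyLSM:
    """Toy LSM-tree: an in-memory memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}      # most recent writes, held in memory
        self.sstables = []      # sorted (key, value) runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Freeze the memtable into a sorted, immutable run (an "SSTable").
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Search the newest run first so newer values shadow older ones.
        for run in reversed(self.sstables):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one, keeping only the newest value per key.
        merged = {}
        for run in self.sstables:   # oldest to newest, later runs overwrite
            merged.update(run)
        self.sstables = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
db.put("a", "1")
db.put("b", "2")    # memtable is full, so it is flushed to a sorted run
db.put("a", "3")    # newer value stays in the memtable for now
print(db.get("a"), db.get("b"))
```

Writes never modify an existing run in place. Stale versions are only discarded when runs are merged, which is what compact() stands in for here.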
Logical Data Sharding
Supports scheduling and migration by data shard.
Implements flexible scheduling of data across nodes in the cluster.
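Shard-level scheduling can be sketched as a routing table that maps each shard to a node. The source does not say how TDStore partitions keys, so range-based sharding is an assumption here, and ShardMap, split_keys, and the node names are hypothetical. The point is that migrating a shard changes only its owner, while the key-to-shard mapping stays the same.

```python
import bisect


class ShardMap:
    """Toy routing table: key ranges map to shards, shards map to nodes."""

    def __init__(self, split_keys, nodes):
        # Assumption: range sharding. With sorted split keys [s0, s1, ...],
        # shard 0 covers keys < s0, shard 1 covers [s0, s1), and so on.
        self.split_keys = sorted(split_keys)
        shard_count = len(self.split_keys) + 1
        self.owner = {i: nodes[i % len(nodes)] for i in range(shard_count)}

    def shard_for(self, key):
        return bisect.bisect_right(self.split_keys, key)

    def node_for(self, key):
        return self.owner[self.shard_for(key)]

    def migrate(self, shard_id, target_node):
        # Scheduling moves a whole shard; the key ranges are untouched.
        self.owner[shard_id] = target_node


shards = ShardMap(split_keys=["g", "n"], nodes=["node-1", "node-2"])
before = shards.node_for("z")
shards.migrate(shards.shard_for("z"), "node-3")
after = shards.node_for("z")
print(before, "->", after)
```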
Distributed Transaction
Maintains participant context and transaction states for distributed transactions across various data shards.
Supports co-locating related data on the same data shard. This strategy converts most distributed transactions into non-distributed transactions, thereby improving overall transaction performance.
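The effect of co-location on transactions can be shown with a small sketch. Everything in it is an assumption for illustration: the "table:owner_id:row_id" key format, the NUM_SHARDS value, and the shard_of and is_distributed helpers are hypothetical and do not describe how TDStore places data. The sketch treats a transaction as distributed when its keys span more than one shard, and shows that routing on a shared owner id keeps related rows together.

```python
import zlib

NUM_SHARDS = 8  # hypothetical shard count


def shard_of(key, colocate=False):
    """Map a key to a shard id.

    Assumed key format: "table:owner_id:row_id". With colocate=True only
    the owner_id is hashed, so all rows of one owner share a shard.
    """
    routing_key = key.split(":")[1] if colocate else key
    return zlib.crc32(routing_key.encode()) % NUM_SHARDS


def is_distributed(keys, colocate=False):
    """A transaction is distributed if its keys touch more than one shard."""
    return len({shard_of(k, colocate) for k in keys}) > 1


txn_keys = ["customer:42:0", "order:42:1001", "order:42:1002"]
print(is_distributed(txn_keys, colocate=True))
```

With co-location enabled, every key in txn_keys routes on the same owner id, so the transaction can commit on a single shard without cross-shard coordination.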
Multi-Replica Disaster Recovery Layer (Multi-Raft)
Creates multiple replicas for each data shard across different TDStore nodes, with each shard and its replicas forming a Raft Group.
Each data shard is synchronized as an independent log stream.
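The Multi-Raft layout can be sketched as one replication group per shard, each with its own log. This is a heavy simplification and not a Raft implementation: there are no terms, leader election, or log repair. RaftGroup, place_replicas, the round-robin placement, and the replication factor of 3 are hypothetical. The sketch shows only two ideas from the list above: replicas of a shard sit on different nodes, and each group appends to an independent log once a majority acknowledges.

```python
class RaftGroup:
    """Toy replication group for one shard (majority-ack only, no elections)."""

    def __init__(self, shard_id, replicas):
        self.shard_id = shard_id
        self.replicas = list(replicas)
        self.log = []   # this shard's own log stream

    def quorum(self):
        return len(self.replicas) // 2 + 1

    def append(self, entry, acks):
        # Commit only if a majority of this group's replicas acknowledged.
        if len(set(acks) & set(self.replicas)) >= self.quorum():
            self.log.append(entry)
            return True
        return False


def place_replicas(shard_ids, nodes, replication_factor=3):
    """Assumed round-robin placement of each shard's replicas on distinct nodes."""
    if replication_factor > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    groups = {}
    for i, shard_id in enumerate(shard_ids):
        replicas = [nodes[(i + r) % len(nodes)] for r in range(replication_factor)]
        groups[shard_id] = RaftGroup(shard_id, replicas)
    return groups


groups = place_replicas([0, 1], ["n1", "n2", "n3", "n4"])
committed = groups[0].append("put k1", acks=["n1", "n2"])    # 2 of 3 acked
rejected = groups[0].append("put k2", acks=["n1"])           # 1 of 3 acked
print(committed, rejected, groups[0].log, groups[1].log)
```

Because each group has its own log, a write to shard 0 does not touch the log of shard 1.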
Low-Cost Mass Storage
The TDStore storage layer leverages an LSM-Tree and SSTable architecture to store and manage data. This design achieves a very high compression ratio, significantly reducing storage costs for massive datasets. A single instance can support petabyte-scale (PB-scale) storage.
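One reason sorted, immutable files tend to compress well is that neighboring keys share long prefixes. The sketch below illustrates this with Python's zlib module on synthetic rows. It is not a measurement of TDStore, and the row format and the compression_ratio helper are hypothetical. Real ratios depend on the data and on the compression algorithm configured.

```python
import zlib


def compression_ratio(rows):
    """Compress a sorted block of rows; return (ratio, original, compressed)."""
    block = b"\n".join(sorted(rows))
    compressed = zlib.compress(block, 6)
    return len(block) / len(compressed), block, compressed


# Synthetic rows with a shared key prefix, as in a sorted SSTable block.
rows = [f"user:{i:08d}:status=active".encode() for i in range(1000)]
ratio, block, compressed = compression_ratio(rows)
print(f"{len(block)} bytes -> {len(compressed)} bytes")
```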