Feature Introduction
The LibraDB engine is a read-only extension built for analytical queries, processing complex SQL in real time and at high performance. Its columnar storage, vectorized parallel execution engine, and distributed query optimizer let users run high-performance analysis directly within the database. LibraDB's columnar storage is also optimized for high-QPS updates and ACID transactions, which keeps data consistent in real time.
Principles
The LibraDB engine kernel automatically converts data from the primary node into columnar storage and synchronizes the binlog in real time, so the columnar data stays consistent with the primary node. It then uses parallel processing and a vectorized execution engine to accelerate complex SQL queries.
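To make the row-to-column conversion concrete, the following is a minimal sketch of pivoting row-oriented records into a columnar layout. It is not LibraDB's implementation. The table, the column names, and the `rows_to_columns` helper are hypothetical and exist only for illustration.

```python
from typing import Any


def rows_to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row-oriented records into one list per column.

    Assumes every row carries the same set of column names.
    """
    columns: dict[str, list[Any]] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


# Hypothetical rows as they might look on the row-store side.
rows = [
    {"order_id": 1, "region": "east", "amount": 120},
    {"order_id": 2, "region": "west", "amount": 80},
    {"order_id": 3, "region": "east", "amount": 200},
]

columns = rows_to_columns(rows)

# An aggregate over one column only touches that column's values.
total_amount = sum(columns["amount"])
```

Because each column is stored contiguously, an aggregate such as the sum above reads only the `amount` values and never touches `order_id` or `region`, which is the main reason columnar layouts suit analytical queries.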
Supported Features
The LibraDB engine kernel supports the following features.
1. Parallel Computing Capability
The LibraDB engine uses the full computing resources of its nodes to execute SQL queries. A query processed by the LibraDB engine can be divided into multiple subtasks that run concurrently across multiple processors. This multi-threaded execution raises CPU utilization, makes better use of I/O resources, and accelerates processing.
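The split-and-merge structure described above can be sketched as follows. This is an illustrative model, not LibraDB code: the `parallel_sum` and `partial_sum` names are hypothetical, and CPython threads do not run CPU-bound work in parallel, so the example shows only how one aggregate is divided into subtasks and recombined.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk: list[int]) -> int:
    """Subtask: aggregate one partition of the column."""
    return sum(chunk)


def parallel_sum(values: list[int], workers: int = 4) -> int:
    """Split a column into partitions, aggregate each one, then merge."""
    if not values:
        return 0
    # Ceiling division so that at most `workers` partitions are created.
    chunk_size = max(1, (len(values) + workers - 1) // workers)
    chunks = [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    # Final merge step: combine the partial aggregates.
    return sum(partials)


amounts = list(range(1, 1001))
total = parallel_sum(amounts, workers=4)
```

The same pattern applies to any aggregate whose partial results can be merged, such as counts, minimums, and maximums. An average would need each subtask to return both a sum and a count.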
2. Vectorized Execution Engine
In the LibraDB engine, data is not only stored by column but also processed by column. Traditional OLTP engines such as TXSQL typically compute row by row, because transactions mostly consist of point queries, reads, and writes. The LibraDB engine, however, targets analytical processing, where a single SQL query can involve a very large amount of computation. LibraDB therefore implements a vectorized execution mode that processes in-memory columnar data in batches and applies SIMD instructions to each batch, which reduces function call overhead and cache misses. The data parallelism of SIMD instructions further shortens computation time significantly.
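The difference between row-at-a-time and batch-at-a-time processing can be sketched as follows. This is a simplified model under stated assumptions: the function names, the columns, and the tiny batch size are hypothetical, and pure Python cannot issue SIMD instructions, so only the control flow is shown. In a native engine, each per-batch step would be a tight loop over contiguous values that a compiler or hand-written kernel can map onto SIMD.

```python
from typing import Iterator

BATCH_SIZE = 4  # kept tiny for illustration


def batches(column: list[int], size: int = BATCH_SIZE) -> Iterator[list[int]]:
    """Yield consecutive fixed-size slices of a column."""
    for start in range(0, len(column), size):
        yield column[start:start + size]


def sum_row_at_a_time(prices: list[int], quantities: list[int], min_price: int) -> int:
    """Evaluate filter, expression, and aggregate once per row."""
    total = 0
    for price, qty in zip(prices, quantities):
        if price >= min_price:
            total += price * qty
    return total


def sum_batched(prices: list[int], quantities: list[int], min_price: int) -> int:
    """Evaluate each operator once per batch instead of once per row."""
    total = 0
    for price_batch, qty_batch in zip(batches(prices), batches(quantities)):
        # Step 1: the predicate runs over the whole batch and yields a mask.
        mask = [p >= min_price for p in price_batch]
        # Step 2: the expression runs over the whole batch.
        products = [p * q for p, q in zip(price_batch, qty_batch)]
        # Step 3: the aggregate consumes only the selected positions.
        total += sum(v for v, keep in zip(products, mask) if keep)
    return total


prices = [5, 12, 7, 30, 18, 2, 9, 25, 11]
quantities = [1, 2, 3, 1, 2, 5, 1, 2, 3]

row_result = sum_row_at_a_time(prices, quantities, min_price=10)
batch_result = sum_batched(prices, quantities, min_price=10)
```

Both functions return the same result. The batched version pays the per-operator overhead once per batch rather than once per row, which is where the reduction in function call overhead comes from.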
3. Support for Columnar Storage in High-Speed Change Scenarios
The primary instance of TencentDB for MySQL supports online data operations at more than one million QPS. Because the LibraDB engine is designed for real-time data analysis, it must keep data consistent even when data changes rapidly. Traditional columnar storage handles high-volume writes well but performs poorly on large-scale deletions and updates. A common practice in conventional real-time data warehouses is to convert each update into a delete followed by an insert, and to batch changes at the data synchronization layer. Even so, deletions remain a performance bottleneck for columnar storage. As a result, traditional columnar storage inevitably incurs high data latency and falls short of real-time data analysis.
The LibraDB engine keeps data consistent under highly concurrent changes through optimizations at the storage layer, so analysis is not delayed by the data latency that frequent updates on read-write instances would otherwise cause.
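The conventional delete-and-insert practice described above can be sketched with a toy append-only column store. This models the general technique only and says nothing about how LibraDB's storage layer is implemented. The `ColumnTable` class and all of its members are hypothetical.

```python
from typing import Any


class ColumnTable:
    """Toy append-only column store with delete marks (illustrative only)."""

    def __init__(self, column_names: list[str]) -> None:
        self.columns: dict[str, list[Any]] = {name: [] for name in column_names}
        self.deleted: list[bool] = []       # True once a position is no longer visible
        self.pk_index: dict[Any, int] = {}  # primary key -> position of the live version

    def insert(self, pk: Any, row: dict[str, Any]) -> None:
        if pk in self.pk_index:
            raise KeyError(f"duplicate primary key: {pk}")
        position = len(self.deleted)
        for name, values in self.columns.items():
            values.append(row[name])
        self.deleted.append(False)
        self.pk_index[pk] = position

    def delete(self, pk: Any) -> None:
        # Mark the old version instead of rewriting the column data.
        position = self.pk_index.pop(pk)
        self.deleted[position] = True

    def update(self, pk: Any, row: dict[str, Any]) -> None:
        # An update becomes a delete of the old version plus an insert.
        self.delete(pk)
        self.insert(pk, row)

    def scan(self, name: str) -> list[Any]:
        # Every scan has to skip positions that carry a delete mark.
        return [v for v, gone in zip(self.columns[name], self.deleted) if not gone]


table = ColumnTable(["id", "amount"])
table.insert(1, {"id": 1, "amount": 100})
table.insert(2, {"id": 2, "amount": 50})
table.update(1, {"id": 1, "amount": 130})
table.delete(2)

live_amounts = table.scan("amount")
```

In this toy model, superseded values stay in the columns until some form of compaction removes them, so a delete-heavy workload leaves scans filtering an ever larger share of dead positions. This is one way to see why deletions are costly for columnar storage.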
4. Targeted Data Loading Capability
In TencentDB for MySQL, not all data has analytical value, so not every object needs to be loaded into columnar storage. The LibraDB engine lets you specify which objects to load, either through the data loading console or with SQL commands on the command line.