tencent cloud

Data Model
Last updated:2026-02-10 10:54:56

HBase Model

HBase Data Model

HBase's data model is a multidimensional mapping that can be represented as:
(row_key, column family, column, version) → value
row_key: the unique identifier of a row. Rows are sorted lexicographically by row key.
column family: a group of columns and the basic unit of physical storage. A column family can hold any number of columns, and new columns can be added dynamically. A single column can hold multiple versions of its data.
column: a specific field within a column family, identified by the column family name plus the column qualifier (Qualifier).
version: the version number of data, usually represented by a timestamp.
value: the actual stored data.
cell: the smallest unit of data storage, uniquely identified by (row_key, column family, column, version).
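
The mapping above can be illustrated with a small in-memory model. This is only a sketch of the logical view, not how HBase stores data: the class name CellMap and its methods are hypothetical, and returning the newest version when no version is given is an assumption of this example.

```python
from collections import defaultdict


class CellMap:
    """Toy model of (row_key, column family, column, version) -> value."""

    def __init__(self):
        # (row_key, family, qualifier) -> {version: value}
        self._cells = defaultdict(dict)

    def put(self, row_key, family, qualifier, version, value):
        self._cells[(row_key, family, qualifier)][version] = value

    def get(self, row_key, family, qualifier, version=None):
        """Return the value at `version`, or at the newest version if omitted."""
        versions = self._cells.get((row_key, family, qualifier))
        if not versions:
            return None
        if version is None:
            version = max(versions)
        return versions.get(version)

    def scan(self):
        """Yield cells ordered by row key, family, qualifier, newest version first."""
        for key in sorted(self._cells):
            for version in sorted(self._cells[key], reverse=True):
                yield (*key, version, self._cells[key][version])


cells = CellMap()
cells.put(b"row1", b"cf1", b"b", 100, b"v2")
cells.put(b"row1", b"cf1", b"b", 110, b"v3")
cells.put(b"row1", b"cf1", b"a", 100, b"v1")
```

Here cells.get(b"row1", b"cf1", b"b") resolves to the value written at version 110, while passing version=100 selects the older value of the same column.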

HBase Storage Structure

Namespace: a logical container for tables, used to organize and manage tables. Tables belong to a specific Namespace.
Table: the basic unit of data storage in HBase. A table is split into multiple Regions by row key (Row Key) range, and each Region stores one portion of the data.
Region: each Region contains multiple Stores, one per column family (Column Family).
Store: the basic unit of data storage within a Region, corresponding one-to-one with a column family. Each Store maintains an independent LSM structure consisting of a MemStore and StoreFiles. The number of column families should be kept small; it is recommended not to exceed 3 to 5.
MemStore: an in-memory write cache that provides high write performance.
StoreFile: an on-disk storage file in the HFile format that supports efficient reads and compression.
Block: the basic storage unit within a StoreFile, supporting efficient random reads.
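
The relationship between a Store, its MemStore, and its StoreFiles can be sketched as a toy LSM structure. This is a simplified illustration under stated assumptions, not HBase's implementation: the Store class and the entry-count flush_threshold are hypothetical, and write-ahead logging, block indexes, and compaction are left out.

```python
class Store:
    """Toy LSM store for one column family: a MemStore plus immutable StoreFiles."""

    def __init__(self, flush_threshold=2):
        # Hypothetical threshold, counted in entries for simplicity.
        self.flush_threshold = flush_threshold
        self.memstore = {}    # in-memory write buffer: key -> value
        self.storefiles = []  # oldest first; each is a sorted list of (key, value)

    def put(self, key, value):
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write the MemStore out as a new sorted, immutable StoreFile."""
        if self.memstore:
            self.storefiles.append(sorted(self.memstore.items()))
            self.memstore = {}

    def get(self, key):
        """Check the MemStore first, then StoreFiles from newest to oldest."""
        if key in self.memstore:
            return self.memstore[key]
        for storefile in reversed(self.storefiles):
            for k, v in storefile:
                if k == key:
                    return v
        return None


store = Store(flush_threshold=2)
store.put(b"row1", b"v1")
store.put(b"row2", b"v2")      # second entry triggers a flush
store.put(b"row1", b"v1-new")  # newer value stays in the MemStore
```

Writes only touch memory until a flush, which is why the MemStore gives high write performance; reads have to consult the MemStore and then the files, newest first, so that the most recent value wins.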
HBase's storage structure can be represented by the following hierarchical relationships:
Table
├── Region (divided by row key range)
│   ├── Store (each column family corresponds to one Store)
│   │   ├── MemStore (in-memory write cache)
│   │   └── StoreFile (on-disk storage file)
│   │       └── Block (data block in the file)
│   └── ...
└── ...
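
Because Regions are divided by row key range, locating the Region for a row reduces to a search over sorted start keys. The following is a minimal sketch of that idea; the split points, Region names, and the find_region helper are made up for illustration and do not reflect how HBase clients actually look up Regions.

```python
import bisect

# Hypothetical split points: each Region covers [start key, next start key).
# The first Region starts at the empty key, so every row key falls in some Region.
region_start_keys = [b"", b"row3", b"row6"]
region_names = ["region-0", "region-1", "region-2"]


def find_region(row_key):
    """Return the name of the Region whose key range contains row_key."""
    index = bisect.bisect_right(region_start_keys, row_key) - 1
    return region_names[index]
```

With these split points, find_region(b"row1") falls in the first Region and find_region(b"row3") in the second, since a start key belongs to the Region it opens.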

TDSQL Boundless Data Model

Role Mapping

HBase Master is analogous to the MC (Metadata Cluster) in TDSQL Boundless.
HBase Region Server is analogous to the TDStore node in TDSQL Boundless.

Table Mapping Rules

One-to-many mapping: One HBase Table corresponds to multiple tables in TDSQL Boundless.
Column family mapping: Each Column Family corresponds to one TDSQL Boundless table, named in the format <HBase table name>_<column family name>.
Column mapping: Each version of a column in HBase corresponds to one row of data in the TDSQL Boundless table.
Primary key design: The primary key of the TDSQL Boundless table is HBase Row Key + Column Qualifier + Version.
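
The mapping rules can be expressed as a short transformation. The sketch below is illustrative only: the map_cells_to_tables function is hypothetical, plain strings stand in for binary keys, and the sample cells are the ones used in the mapping example in this section.

```python
from collections import defaultdict


def map_cells_to_tables(hbase_table, cells):
    """Group HBase cells into per-column-family tables keyed by (K, Q, T)."""
    tables = defaultdict(dict)
    for row_key, family, qualifier, version, value in cells:
        # One table per column family: <HBase table name>_<column family name>
        table_name = f"{hbase_table}_{family}"
        # Primary key is Row Key + Column Qualifier + Version
        tables[table_name][(row_key, qualifier, version)] = value
    return dict(tables)


cells = [
    ("row1", "cf1", "a", 100, "v1"),
    ("row1", "cf1", "b", 100, "v2"),
    ("row1", "cf1", "b", 110, "v3"),
    ("row1", "cf2", "c", 120, "v4"),
    ("row2", "cf1", "d", 120, "v5"),
    ("row2", "cf2", "d", 130, "v6"),
]
tables = map_cells_to_tables("ht1", cells)
```

The result holds two tables, ht1_cf1 with four rows and ht1_cf2 with two, and the column family no longer appears in the key because it is encoded in the table name.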

Mapping Example

Suppose the HBase table ht1 contains two column families, cf1 and cf2. TDSQL Boundless then creates two tables internally: ht1_cf1 and ht1_cf2.
HBase data example
row key  column family  column  version  value
row1     cf1            a       100      v1
row1     cf1            b       100      v2
row1     cf1            b       110      v3
row1     cf2            c       120      v4
row2     cf1            d       120      v5
row2     cf2            d       130      v6
TDSQL Boundless table data
create table ht1_cf1 (
    K varbinary(1024),
    Q varbinary(256),
    T bigint,
    V MediumBlob NOT NULL,
    primary key(K, Q, T)) HBase;

create table ht1_cf2 (
    K varbinary(1024),
    Q varbinary(256),
    T bigint,
    V MediumBlob NOT NULL,
    primary key(K, Q, T)) HBase;
table ht1_cf1
Primary Key (K + Q + T)  Value (V)
row1 + a + 100           v1
row1 + b + 100           v2
row1 + b + 110           v3
row2 + d + 120           v5

table ht1_cf2
Primary Key (K + Q + T)  Value (V)
row1 + c + 120           v4
row2 + d + 130           v6
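
With the version stored as part of the primary key, reading the current value of a column means selecting the row with the largest T for a given K and Q. The sketch below shows this against the ht1_cf1 data using Python's built-in sqlite3 module as a stand-in. It is an assumption-laden illustration: SQLite is not TDSQL Boundless, BLOB and INTEGER approximate the varbinary, MediumBlob, and bigint types, the trailing HBase table option is omitted, and the latest_cell and latest_row helpers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite stand-in for ht1_cf1; column types are approximations.
conn.execute(
    "CREATE TABLE ht1_cf1 ("
    " K BLOB, Q BLOB, T INTEGER, V BLOB NOT NULL,"
    " PRIMARY KEY (K, Q, T))"
)
conn.executemany(
    "INSERT INTO ht1_cf1 (K, Q, T, V) VALUES (?, ?, ?, ?)",
    [
        (b"row1", b"a", 100, b"v1"),
        (b"row1", b"b", 100, b"v2"),
        (b"row1", b"b", 110, b"v3"),
        (b"row2", b"d", 120, b"v5"),
    ],
)


def latest_cell(conn, row_key, qualifier):
    """Return (T, V) for the newest version of one column, or None."""
    return conn.execute(
        "SELECT T, V FROM ht1_cf1 WHERE K = ? AND Q = ?"
        " ORDER BY T DESC LIMIT 1",
        (row_key, qualifier),
    ).fetchone()


def latest_row(conn, row_key):
    """Return {qualifier: value} with the newest version of every column in a row."""
    rows = conn.execute(
        "SELECT Q, V FROM ht1_cf1 AS c WHERE K = ?"
        " AND T = (SELECT MAX(T) FROM ht1_cf1 WHERE K = c.K AND Q = c.Q)",
        (row_key,),
    ).fetchall()
    return dict(rows)
```

For row1, column b has versions 100 and 110, so latest_cell picks the row with T = 110, and latest_row returns one entry per column, each taken from its newest version.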