tencent cloud

TDSQL Boundless

Release Notes
Product Introduction
Overview
Scenarios
Product Architecture
Instance Types
Compatibility Notes
Kernel Features
Kernel Overview
Kernel Version Release Notes
Functionality Features
Performance Features
Billing
Billing Overview
Purchase Method
Pricing Details
Renewal
Overdue Payments
Refund
Getting Started
Creating an Instance
Connect to Instances
User Guide
Data Migration
Data Subscription
Instance Management
Configuration Change
Parameter Configuration
Account Management
Security Group
Backup and Restoration
Database Auditing
Tag Management
Use Cases
Technical Evolution and Usage Practices of Online DDL
Lock Mechanism Analysis and Troubleshooting Practices
Data Intelligent Scheduling and Related Practices for Performance Optimization
TDSQL Boundless Selection Guide and Practical Tutorial
Developer Guide
Developer Guide (MySQL Compatibility Mode)
Developer Guide (HBase Compatibility Mode)
Performance Tuning
Performance Tuning Overview
SQL Tuning
DDL Tuning
Performance White Paper
Performance Overview
TPC-C Test
Sysbench Test
API Documentation
History
Introduction
API Category
Making API Requests
Instance APIs
Security Group APIs
Task APIs
Backup APIs
Rollback APIs
Parameter APIs
Database APIs
Data Types
Error Codes
General Reference
System Architecture
SQL Reference
Database Parameter Description
TPC-H benchmark data model reference
Error Code Information
Security and Compliance
FAQs
Agreements
Service Level Agreement
Terms of Service
Privacy Policy
Data Processing And Security Agreement
Contact Us
Glossary

Single-Table Queries

PDF
Mode fokus
Ukuran font
Terakhir diperbarui: 2026-03-26 14:54:46
This document describes how to perform efficient queries on single-table data in TDSQL Boundless databases, including basic query syntax, query optimization techniques, and TDSQL Boundless-specific optimization mechanisms.

Before You Begin

Before a single-table query is performed, ensure that:
1. TDSQL Boundless instance has been created.
2. Test data has been imported: This document uses the TPC-H benchmarking dataset as an example. For details, see TPC-H Benchmark Data Model Reference.
3. Database connection has been established: Connect to TDSQL Boundless via MySQL client or application.

Simple queries

The most basic query operation involves reading specific columns from a table.

Basic SELECT Statement

Query basic customer information:
SELECT c_custkey, c_name, c_mktsegment FROM customer;
Query key fields of orders:
SELECT o_orderkey, o_custkey, o_orderdate, o_totalprice FROM orders;

Query all columns

Using * allows you to query all columns in a table:
SELECT * FROM customer;
Hands-on Tutorial:
In production environments, avoid using SELECT * and explicitly specify required columns.
Reason: Reduce network traffic, improve query performance, and avoid reading unnecessary large fields (such as c_comment).

Use aliases

Assigning aliases to columns or tables can improve SQL readability:
SELECT
c_custkey AS customer_id,
c_name AS customer_name,
c_acctbal AS account_balance
FROM customer AS c;

Filter the results

Using the WHERE clause allows you to filter query results and return only data that meets the criteria.

Equality condition

Query customers in specific market segments:
SELECT c_custkey, c_name, c_acctbal
FROM customer
WHERE c_mktsegment = 'AUTOMOBILE';
Query orders with specific order status:
SELECT o_orderkey, o_orderdate, o_totalprice
FROM orders
WHERE o_orderstatus = 'O'; -- O: Open, F: Finished, P: Pending

Comparison condition queries

Query customers with account balance greater than 5000:
SELECT c_custkey, c_name, c_acctbal
FROM customer
WHERE c_acctbal > 5000.00;
Query orders after 1995:
SELECT o_orderkey, o_orderdate, o_totalprice
FROM orders
WHERE o_orderdate >= '1995-01-01';

Range query

Using BETWEEN for range queries:
SELECT o_orderkey, o_orderdate, o_totalprice
FROM orders
WHERE o_orderdate BETWEEN '1995-01-01' AND '1995-12-31';

Multiple condition combinations

Using AND and OR to combine multiple conditions:
SELECT c_custkey, c_name, c_acctbal, c_mktsegment
FROM customer
WHERE c_mktsegment = 'AUTOMOBILE'
AND c_acctbal > 5000.00;

Fuzzy search

Using LIKE for pattern matching:
SELECT c_custkey, c_name
FROM customer
WHERE c_name LIKE 'Customer%';

Sort the results

Using the ORDER BY clause allows you to sort query results.

Ascending order (default)

Arrange in ascending order by order date:
SELECT o_orderkey, o_orderdate, o_totalprice
FROM orders
ORDER BY o_orderdate;
Or explicitly specify ASC:
SELECT o_orderkey, o_orderdate, o_totalprice
FROM orders
ORDER BY o_orderdate ASC;

Descending Order

Query customers with the highest account balance:
SELECT c_custkey, c_name, c_acctbal
FROM customer
ORDER BY c_acctbal DESC;
Query the most recent orders:
SELECT o_orderkey, o_orderdate, o_totalprice
FROM orders
ORDER BY o_orderdate DESC;

Sorting multiple columns

You can sort by multiple columns with priority from left to right:
SELECT o_orderkey, o_orderdate, o_totalprice, o_orderpriority
FROM orders
ORDER BY o_orderdate DESC, o_totalprice DESC;

Sorting Optimization Recommendations

TDSQL Boundless Optimization Key Points:
1. Index Optimization: Create indexes on columns specified in ORDER BY to avoid filesort.
2. Covering Index: If all columns queried are included in the index, TDSQL Boundless can read directly from the index, avoiding table lookup.
3. Index Order: The column order in ORDER BY should match the index definition order.
Example: Optimizing Sorting Performance
-- Create a composite index
CREATE INDEX idx_orderdate_totalprice ON orders(o_orderdate DESC, o_totalprice DESC);

-- This query can directly use the index to avoid sorting.
SELECT o_orderkey, o_orderdate, o_totalprice
FROM orders
ORDER BY o_orderdate DESC, o_totalprice DESC;
View the execution plan via EXPLAIN:
EXPLAIN SELECT o_orderkey, o_orderdate, o_totalprice
FROM orders
ORDER BY o_orderdate DESC;
If the execution plan does not include Using filesort, it indicates that index sorting is utilized, delivering optimal performance.

Limit the number of query results

Using the LIMIT clause can limit the number of result rows returned, commonly used for pagination queries or obtaining Top N results.

Obtain the top N records

Query the top 10 highest-value orders:
SELECT o_orderkey, o_orderdate, o_totalprice
FROM orders
ORDER BY o_totalprice DESC
LIMIT 10;
Query the top 20 customers with the highest account balance:
SELECT c_custkey, c_name, c_acctbal
FROM customer
ORDER BY c_acctbal DESC
LIMIT 20;

Paging Query

Using LIMIT offset, count or LIMIT count OFFSET offset for pagination:
Method 1:
-- Query the second page of data (with 20 entries per page)
SELECT c_custkey, c_name, c_acctbal
FROM customer
ORDER BY c_custkey
LIMIT 20, 20; -- offset=20, count=20
Method 2 (Recommended):
-- Clearer syntax
SELECT c_custkey, c_name, c_acctbal
FROM customer
ORDER BY c_custkey
LIMIT 20 OFFSET 20;

LIMIT optimization mechanism

TDSQL Boundless LIMIT Optimization:
1. Early Termination of Scan:
TDSQL Boundless stops immediately after scanning sufficient data without scanning the entire table.
For example, LIMIT 10 only needs to retrieve 10 rows of data before the results are returned.
2. Index Pushdown:
If both the query conditions and sorting fields are indexed, TDSQL Boundless will apply LIMIT directly at the storage engine layer.
Reduced data transmission and memory usage.
3. TopN Optimization:
ORDER BY ... LIMIT N uses the heapsort algorithm with a space complexity of O(N).
Avoids full table sorting, significantly improves performance.
Performance Comparison Example:
-- Without LIMIT: requires scanning and sorting the full table
SELECT o_orderkey, o_totalprice
FROM orders
ORDER BY o_totalprice DESC;
-- Rows scanned: 1,500,000 (full table)

-- With LIMIT: early termination
SELECT o_orderkey, o_totalprice
FROM orders
ORDER BY o_totalprice DESC
LIMIT 10;
-- Rows scanned: ~10 (using index) or with heapsort optimization

Pagination Optimization Recommendations:

Deep Pagination Problem:
-- Poor performance: requires scanning and skipping the first 1,000,000 rows
SELECT o_orderkey, o_orderdate
FROM orders
ORDER BY o_orderkey
LIMIT 1000000, 20;
Optimization Solution: Use range queries instead of OFFSET.
-- Method 1: Record the maximum ID from the last query
SELECT o_orderkey, o_orderdate
FROM orders
WHERE o_orderkey > 1000000 -- ID of the last record from the previous operation
ORDER BY o_orderkey
LIMIT 20;

-- Method 2: Use covering index + deferred join
SELECT o.o_orderkey, o.o_orderdate, o.o_totalprice
FROM orders o
INNER JOIN (
SELECT o_orderkey
FROM orders
ORDER BY o_orderkey
LIMIT 1000000, 20
) tmp ON o.o_orderkey = tmp.o_orderkey;

Aggregation Query

Aggregate queries are used to perform statistical analysis on data. Commonly used aggregate functions include COUNT, SUM, AVG, MAX, and MIN.

Basic aggregate functions

Count the total number of orders:
SELECT COUNT(*) AS total_orders
FROM orders;
Calculate the total sales amount:
SELECT SUM(o_totalprice) AS total_sales
FROM orders;
Query the maximum and minimum order amounts:
SELECT
MAX(o_totalprice) AS max_order_amount,
MIN(o_totalprice) AS min_order_amount,
AVG(o_totalprice) AS avg_order_amount
FROM orders;

grouping and aggregation

Using the GROUP BY clause to perform group statistics on data.
Count the number of customers in each market segment:
SELECT
c_mktsegment,
COUNT(*) AS customer_count
FROM customer
GROUP BY c_mktsegment
ORDER BY customer_count DESC;
Calculate the number of orders and total sales amount per year:
SELECT
YEAR(o_orderdate) AS order_year,
COUNT(*) AS order_count,
SUM(o_totalprice) AS total_sales,
AVG(o_totalprice) AS avg_order_amount
FROM orders
GROUP BY YEAR(o_orderdate)
ORDER BY order_year DESC;
Count the number of orders per order status:
SELECT
o_orderstatus,
COUNT(*) AS status_count,
SUM(o_totalprice) AS total_amount
FROM orders
GROUP BY o_orderstatus;

HAVING clause

HAVING is used to filter the results after grouping is performed (as opposed to WHERE, which filters raw rows).
Query customers with more than 1000 orders:
SELECT
o_custkey,
COUNT(*) AS order_count,
SUM(o_totalprice) AS total_spent
FROM orders
GROUP BY o_custkey
HAVING COUNT(*) > 1000
ORDER BY total_spent DESC;
Query years where the average order amount is greater than 10000:
SELECT
YEAR(o_orderdate) AS order_year,
AVG(o_totalprice) AS avg_amount
FROM orders
GROUP BY YEAR(o_orderdate)
HAVING AVG(o_totalprice) > 10000
ORDER BY order_year;

Count distinct

Using DISTINCT for deduplicated counting:
Count the number of different market segments:
SELECT COUNT(DISTINCT c_mktsegment) AS segment_count
FROM customer;
Count the number of customers with orders:
SELECT COUNT(DISTINCT o_custkey) AS active_customers
FROM orders;

Aggregation Query Optimization

TDSQL Boundless Aggregate Query Optimization Key Points:
1. Index Accelerated Grouping:
-- Create an index on GROUP BY columns
CREATE INDEX idx_mktsegment ON customer(c_mktsegment);

-- This query can utilize indexes to accelerate grouping
SELECT c_mktsegment, COUNT(*)
FROM customer
GROUP BY c_mktsegment;
2. Loose Index Scan (Loose Index Scan):
When the GROUP BY columns are the prefix of the index, TDSQL Boundless can skip irrelevant index entries.
Check via EXPLAIN whether Using index for group-by is utilized.
3. Covering Index Optimization:
-- Create a covering index
CREATE INDEX idx_orderdate_totalprice ON orders(o_orderdate, o_totalprice);

-- This query avoids table lookup and reads directly from the index.
SELECT
YEAR(o_orderdate) AS order_year,
SUM(o_totalprice) AS total_sales
FROM orders
GROUP BY YEAR(o_orderdate);
4. Partitioned Table Acceleration:
If a table is partitioned by date, aggregate queries can be executed in parallel across partitions.
TDSQL Boundless automatically performs partition pruning and only scans relevant partitions.
-- Assume the orders table is partitioned by year
-- This query only scans the 1995 partition
SELECT
COUNT(*) AS order_count,
SUM(o_totalprice) AS total_sales
FROM orders
WHERE o_orderdate BETWEEN '1995-01-01' AND '1995-12-31';

Bantuan dan Dukungan

Apakah halaman ini membantu?

masukan