tencent cloud

TDSQL Boundless

Release Notes
Product Introduction
Overview
Scenarios
Product Architecture
Instance Types
Compatibility Notes
Kernel Features
Kernel Overview
Kernel Version Release Notes
Functionality Features
Performance Features
Billing
Billing Overview
Purchase Method
Pricing Details
Renewal
Overdue Payments
Refund
Getting Started
Creating an Instance
Connect to Instances
User Guide
Data Migration
Data Subscription
Instance Management
Configuration Change
Parameter Configuration
Account Management
Security Group
Backup and Restoration
Database Auditing
Tag Management
Use Cases
Technical Evolution and Usage Practices of Online DDL
Lock Mechanism Analysis and Troubleshooting Practices
Data Intelligent Scheduling and Related Practices for Performance Optimization
TDSQL Boundless Selection Guide and Practical Tutorial
Developer Guide
Developer Guide (MySQL Compatibility Mode)
Developer Guide (HBase Compatibility Mode)
Performance Tuning
Performance Tuning Overview
SQL Tuning
DDL Tuning
Performance White Paper
Performance Overview
TPC-C Test
Sysbench Test
API Documentation
History
Introduction
API Category
Making API Requests
Instance APIs
Security Group APIs
Task APIs
Backup APIs
Rollback APIs
Parameter APIs
Database APIs
Data Types
Error Codes
General Reference
System Architecture
SQL Reference
Database Parameter Description
TPC-H benchmark data model reference
Error Code Information
Security and Compliance
FAQs
Agreements
Service Level Agreement
Terms of Service
Privacy Policy
Data Processing And Security Agreement
Contact Us
Glossary

Subquery

PDF
포커스 모드
폰트 크기
마지막 업데이트 시간: 2026-03-26 14:56:10
This document describes how to use subqueries in TDSQL Boundless databases. A subquery is an SQL query nested within another query, allowing you to utilize the results of one query in a single statement.

Categories of Subqueries

In TDSQL Boundless, subqueries typically exist in certain forms:

Scalar Subqueries

Scalar subqueries return a value from a single row and single column, and can be used in SELECT lists, WHERE clauses, or any location requiring a single value. Their key characteristic is that the subquery result is equivalent to a constant value.
SELECT
c_name,
c_acctbal,
(SELECT AVG(c_acctbal) FROM customer) AS avg_balance
FROM customer
LIMIT 5;

Derived table

Derived tables are subqueries placed in the FROM clause, serving as a temporary table used in subsequent queries. Their key characteristic is that the subquery must be enclosed in parentheses and given an alias.
SELECT seg.c_mktsegment, seg.cnt
FROM (
SELECT c_mktsegment, COUNT(*) AS cnt
FROM customer
GROUP BY c_mktsegment
) seg
ORDER BY seg.cnt DESC;

Existence test

Through keywords such as EXISTS, NOT EXISTS, IN, and NOT IN, it determines whether the subquery returns data, resulting in a Boolean value. Its key characteristic is that it does not concern itself with the specific values returned by the subquery, only whether any rows exist.
-- EXISTS: Checks whether matching rows exist
SELECT c_name FROM customer c
WHERE EXISTS (
SELECT 1 FROM orders o WHERE o.o_custkey = c.c_custkey
);

-- IN: Checks whether a value exists within the result set
SELECT c_name FROM customer
WHERE c_nationkey IN (SELECT n_nationkey FROM nation WHERE n_name = 'CHINA');

Set comparison

Using keywords such as ANY, ALL, and SOME to compare a value with the result set returned by a subquery. Its key characteristic is the combined use of comparison operators (=, >, <, and so on) with ANY/ALL.
-- = ANY is equivalent to IN
SELECT c_name, c_acctbal FROM customer
WHERE c_acctbal > ANY (
SELECT o_totalprice FROM orders WHERE o_orderstatus = 'F'
);

-- > ALL: means greater than all values returned by the subquery
SELECT c_name, c_acctbal FROM customer
WHERE c_acctbal > ALL (
SELECT AVG(c_acctbal) FROM customer GROUP BY c_mktsegment
);

as an operand for comparison operators

Subqueries serve directly as one operand of comparison operators (>, <, =, >=, <=, <>). Their key characteristic is that the subquery must return a single row and single column (that is, scalar value). This differs from scalar subqueries in that it appears within comparison conditions of WHERE/HAVING clauses.
SELECT o_orderkey, o_totalprice
FROM orders
WHERE o_totalprice > (
SELECT AVG(o_totalprice) FROM orders
);

Subquery correlation

Depending on whether the subquery references columns from the outer query, it can be categorized into two types: non-correlated subqueries and correlated subqueries.

non-correlated subquery

Non-correlated subqueries do not reference any columns from the outer query, and their results are independent of the outer query. TDSQL Boundless executes the inner subquery first and substitutes the result as a constant into the outer query.
Query customers whose account balance is higher than the average account balance of all customers:
SELECT c_custkey, c_name, c_acctbal
FROM customer
WHERE c_acctbal > (
SELECT AVG(c_acctbal) FROM customer
);
When processing this query, TDSQL Boundless will first execute the inner subquery:
SELECT AVG(c_acctbal) FROM customer;
Assuming the computation result is 4990.51, then the outer query is equivalent to:
SELECT c_custkey, c_name, c_acctbal
FROM customer
WHERE c_acctbal > 4990.51;
Find customers who have placed orders using IN subqueries:
SELECT c_custkey, c_name, c_mktsegment
FROM customer
WHERE c_custkey IN (
SELECT DISTINCT o_custkey FROM orders
WHERE o_orderdate >= '1995-01-01'
);
The inner subquery executes independently and returns a set of o_custkey values; the outer query matches against this set.

Correlated subqueries

Correlated subqueries reference columns from the outer query, thus the result of the inner subquery depends on the row currently being processed in the outer query. Logically, a correlated subquery must be re-executed for each row in the outer query.
Query the highest-value order for each customer:
SELECT o_orderkey, o_custkey, o_totalprice, o_orderdate
FROM orders o1
WHERE o_totalprice = (
SELECT MAX(o2.o_totalprice)
FROM orders o2
WHERE o2.o_custkey = o1.o_custkey
);
The inner subquery references the outer o1.o_custkey. For each row in the outer query, the subquery calculates the maximum order amount for that customer, then retains only the rows where the amount equals this maximum value.
Query customers whose account balance is higher than the average account balance of customers in the same market segment:
SELECT c1.c_custkey, c1.c_name, c1.c_mktsegment, c1.c_acctbal
FROM customer c1
WHERE c1.c_acctbal > (
SELECT AVG(c2.c_acctbal)
FROM customer c2
WHERE c2.c_mktsegment = c1.c_mktsegment
);
TDSQL Boundless optimizer attempts to perform Unnesting optimization on correlated subqueries, rewriting them as equivalent JOIN queries to improve performance. For example, the above query may be rewritten as:
SELECT c1.c_custkey, c1.c_name, c1.c_mktsegment, c1.c_acctbal
FROM customer c1
INNER JOIN (
SELECT c_mktsegment, AVG(c_acctbal) AS avg_acctbal
FROM customer
GROUP BY c_mktsegment
) c2 ON c1.c_mktsegment = c2.c_mktsegment
WHERE c1.c_acctbal > c2.avg_acctbal;
The rewritten query only needs to scan the customer table twice (once for aggregation and once for joining), rather than executing a subquery for each customer, resulting in significantly improved performance.

Common subquery scenarios

EXISTS subquery

EXISTS is used to determine whether the subquery returns at least one row of data, commonly employed for existence checks.
Query customers who have placed high-value orders (amount > 300000):
SELECT c.c_custkey, c.c_name, c.c_acctbal
FROM customer c
WHERE EXISTS (
SELECT 1 FROM orders o
WHERE o.o_custkey = c.c_custkey
AND o.o_totalprice > 300000
);
Query customers who have not placed orders (NOT EXISTS):
SELECT c.c_custkey, c.c_name, c.c_phone
FROM customer c
WHERE NOT EXISTS (
SELECT 1 FROM orders o
WHERE o.o_custkey = c.c_custkey
);
NOT EXISTS is semantically equivalent to LEFT JOIN ... WHERE ... IS NULL, but their execution efficiency may differ in certain scenarios; this can be compared using EXPLAIN.

IN subquery

IN is used to determine whether a value exists within the result set returned by a subquery.
Query customers from countries in the ASIA region:
SELECT c_custkey, c_name, c_nationkey
FROM customer
WHERE c_nationkey IN (
SELECT n_nationkey FROM nation
WHERE n_regionkey IN (
SELECT r_regionkey FROM region
WHERE r_name = 'ASIA'
)
);
Choice between IN and EXISTS: When the subquery returns a small result set, the performance difference between IN and EXISTS is negligible; when the outer table is small and the subquery result set is large, EXISTS is generally more efficient.

Scalar Subqueries

Scalar subqueries return a single value and can appear in the SELECT list, WHERE clause, or other locations.
Using scalar subqueries in the SELECT list — Query each order and its corresponding customer name:
SELECT
o_orderkey,
o_totalprice,
o_orderdate,
(SELECT c_name FROM customer WHERE c_custkey = o_custkey) AS customer_name
FROM orders
WHERE o_orderdate = '1995-03-15';
Scalar subqueries in the SELECT list logically execute once for each row. When the number of outer rows is large, it is recommended to rewrite them as JOIN:
SELECT
o.o_orderkey,
o.o_totalprice,
o.o_orderdate,
c.c_name AS customer_name
FROM orders o
INNER JOIN customer c ON o.o_custkey = c.c_custkey
WHERE o.o_orderdate = '1995-03-15';

Derived table (subquery in the FROM clause)

Subqueries placed in the FROM clause are used as temporary tables (derived tables).
Query customers with the highest account balance in each market segment:
SELECT c.c_custkey, c.c_name, c.c_mktsegment, c.c_acctbal
FROM customer c
INNER JOIN (
SELECT c_mktsegment, MAX(c_acctbal) AS max_acctbal
FROM customer
GROUP BY c_mktsegment
) seg_max ON c.c_mktsegment = seg_max.c_mktsegment
AND c.c_acctbal = seg_max.max_acctbal;
Statistics on the number of orders and average amount for each order priority, retaining only priorities with over 10000 orders:
SELECT *
FROM (
SELECT
o_orderpriority,
COUNT(*) AS order_count,
AVG(o_totalprice) AS avg_amount
FROM orders
GROUP BY o_orderpriority
) summary
WHERE order_count > 10000
ORDER BY order_count DESC;


도움말 및 지원

문제 해결에 도움이 되었나요?

피드백