tencent cloud

Data Lake Compute

Release Notes
Product Introduction
Overview
Strengths
Use Cases
Purchase Guide
Billing Overview
Refund
Payment Overdue
Configuration Adjustment Fees
Getting Started
Complete Process for New User Activation
DLC Data Import Guide
Quick Start with Data Analytics in Data Lake Compute
Quick Start with Permission Management in Data Lake Compute
Quick Start with Partition Table
Enabling Data Optimization
Cross-Source Analysis of EMR Hive Data
Standard Engine Configuration Guide
Configuring Data Access Policy
Operation Guide
Console Operation Introduction
Development Guide
Runtime Environment
SparkJar Job Development Guide
PySpark Job Development Guide
Query Performance Optimization Guide
UDF Function Development Guide
System Restraints
Client Access
JDBC Access
TDLC Command Line Interface Tool Access
Third-party Software Linkage
Python Access
Practical Tutorial
Accessing DLC Data with Power BI
Table Creation Practice
Using Apache Airflow to Schedule DLC Engine to Submit Tasks
Direct Query of DLC Internal Storage with StarRocks
Spark cost optimization practice
DATA + AI
Using DLC to Analyze CLS Logs
Using Role SSO to Access DLC
Resource-Level Authentication Guide
Implementing Tencent Cloud TCHouse-D Read and Write Operations in DLC
DLC Native Table
SQL Statement
SuperSQL Statement
Overview of Standard Spark Statement
Overview of Standard Presto Statement
Reserved Words
API Documentation
History
Introduction
API Category
Making API Requests
Data Table APIs
Task APIs
Metadata APIs
Service Configuration APIs
Permission Management APIs
Database APIs
Data Source Connection APIs
Data Optimization APIs
Data Engine APIs
Resource Group for the Standard Engine APIs
Data Types
Error Codes
General Reference
Error Codes
Quotas and limits
Operation Guide on Connecting Third-Party Software to DLC
FAQs
FAQs on Permissions
FAQs on Engines
FAQs on Features
FAQs on Spark Jobs
DLC Policy
Privacy Policy
Data Privacy And Security Agreement
Service Level Agreement
Contact Us

Collection Functions

PDF
Focus Mode
Font Size
Last updated: 2024-08-07 17:31:46

ARRAY

Function statement:
ARRAY(<e1> T, ..., <en> T)
Supported engines: SparkSQL and Presto
Usage instructions: Generate an array from the given elements.
Return type: array<T>
Example:
> SELECT array(1, 2, 3);
[1,2,3]

FILTER

Function statement:
FILTER(<expr> array<T>, <predicate> function(T[, integer])->boolean)
Supported engine: SparkSQL
Usage instructions: Filter the input array using the given predicate.
Return type: array<T>
Example:
> SELECT `filter`(array(1, 2, 3), x -> x % 2 == 1);
[1,3]
> SELECT `filter`(array(0, 2, 3), (x, i) -> x > i);
[2,3]
> SELECT `filter`(array(0, null, 2, 3, null), x -> x IS NOT NULL);
[0,2,3]

TRANSFORM

Function statement:
TRANSFORM(<expr> array<T>, <func> function(T[, integer])->U)
Supported engine: SparkSQL
Usage instructions: Use func to transform elements in an array.
Return type: array<U>
Example:
> SELECT transform(array(1, 2, 3), x -> x + 1);
[2,3,4]
> SELECT transform(array(1, 2, 3), (x, i) -> x + i);
[1,3,5]

ZIP_WITH

Function statement:
ZIP_WITH(<left> array<T>, <right> array<U>, <func> function(T, U)->R)
Supported engine: SparkSQL
Usage instructions: Use a function to merge two given arrays element-wise into a single array. If one array is shorter, nulls are appended at the end to match the length of the longer array before the function is applied.
Return type: array<R>
Example:
> SELECT zip_with(array(1, 2, 3), array('a', 'b', 'c'), (x, y) -> (y, x));
[{"y":"a","x":1},{"y":"b","x":2},{"y":"c","x":3}]
> SELECT zip_with(array(1, 2), array(3, 4), (x, y) -> x + y);
[4,6]
> SELECT zip_with(array('a', 'b', 'c'), array('d', 'e', 'f'), (x, y) -> concat(x, y));
["ad","be","cf"]

FORALL

Function statement:
FORALL(<expr> array<T>, <pred> function(T)->boolean)
Supported engine: SparkSQL
Usage instructions: Test whether a predicate applies to all elements in the array.
Return type: boolean
Example:
> SELECT forall(array(1, 2, 3), x -> x % 2 == 0);
false
> SELECT forall(array(2, 4, 8), x -> x % 2 == 0);
true
> SELECT forall(array(1, null, 3), x -> x % 2 == 0);
false
> SELECT forall(array(2, null, 8), x -> x % 2 == 0);
NULL

AGGREGATE

Function statement:
AGGREGATE(<expr> array<T>, <start> U, <merge> function(U, T)->U, <finish> function(U)->R)
Supported engine: SparkSQL
Usage instructions: Aggregate the elements in the array expr.
Return type: R
Example:
> SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x);
6
> SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x, acc -> acc * 10);
60

EXISTS

Function statement:
EXISTS(<expr> array<T>, <pred> function(T)->boolean)
Supported engine: SparkSQL
Usage instructions: Test whether a predicate applies to one or more elements in the array.
Return type: boolean
Example:
> SELECT exists(array(1, 2, 3), x -> x % 2 == 0);
true> SELECT exists(array(1, 2, 3), x -> x % 2 == 10);
false
> SELECT exists(array(1, null, 3), x -> x % 2 == 0);
NULL
> SELECT exists(array(0, null, 2, 3, null), x -> x IS NULL);
true
> SELECT exists(array(1, 2, 3), x -> x IS NULL);
false

ARRAY_CONTAINS

Function statement:
ARRAY_CONTAINS(<expr> array<T>, <value> T)
Supported engines: SparkSQL and Presto
Usage instructions: Return true if the array contains value.
Return type: boolean
Example:
> SELECT array_contains(array(1, 2, 3), 2);
true

ARRAYS_OVERLAP

Function statement:
ARRAYS_OVERLAP(<a> array<T>, <b> array<U>)
Supported engines: SparkSQL and Presto
Usage instructions: Return true if a contains at least one non-null element that also exists in b. If the arrays have no common elements, and they are non-empty, and either contains null elements, it returns null. Otherwise, it returns false.
Return type: boolean
Example:
> SELECT arrays_overlap(array(1, 2, 3), array(3, 4, 5));
true

ARRAY_INTERSECT

Function statement:
ARRAY_INTERSECT(<a> array<T>, <b> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an array of elements that are the intersection of a and b, without duplicates.
Return type: array<T>
Example:
> SELECT array_intersect(array(1, 2, 3), array(1, 3, 5));
[1,3]

ARRAY_JOIN

Function statement:
ARRAY_JOIN(<a> ARRAY<T>, <delimiter> string[, <nullReplacement> string])
Supported engines: SparkSQL and Presto
Usage instructions: Join the elements of the given array using a delimiter and an optional string to replace nulls. If nullReplacement is not set, all null values are filtered out.
Return type: string
Example:
> SELECT array_join(array('hello', 'world'), ' ');
hello world
> SELECT array_join(array('hello', null ,'world'), ' ');
hello world
> SELECT array_join(array('hello', null ,'world'), ' ', ',');
hello , world

ARRAY_POSITION

Function statement:
ARRAY_POSITION(<a> array<T>, <element> T)
Supported engines: SparkSQL and Presto
Usage instructions: Return the 1-based index of the first element of the array.
Return type: integer
Example:
> SELECT array_position(array(3, 2, 1), 1);
3

ARRAY_SORT

Function statement:
ARRAY_SORT(<a> array<T>[, <func> function(T, T)->integer])
Supported engines: SparkSQL and Presto
Usage instructions: Sort the input array. If func is omitted, the array is sorted in ascending order.
Return type: array<T>
Example:
> SELECT array_sort(array(5, 6, 1), (left, right) -> case when left < right then -1 when left > right then 1 else 0 end);
[1,5,6]
> SELECT array_sort(array('bc', 'ab', 'dc'), (left, right) -> case when left is null and right is null then 0 when left is null then -1 when right is null then 1 when left < right then 1 when left > right then -1 else 0 end);
["dc","bc","ab"]
> SELECT array_sort(array('b', 'd', null, 'c', 'a'));
["a","b","c","d",null]

ARRAY_EXCEPT

Function statement:
ARRAY_EXCEPT(<a> array<T>, <b> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an array of elements in a that are not in b, without duplicates.
Return type: array<T>
Example:
> SELECT array_except(array(1, 2, 3), array(1, 3, 5));
[2]

ARRAY_UNION

Function statement:
ARRAY_UNION(<a> array<T>, <b> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an array of elements that are the union of a and b, without duplicates.
Return type: array<T>
Example:
> SELECT array_union(array(1, 2, 3), array(1, 3, 5));
[1,2,3,5]

NAMED_STRUCT

Function statement:
NAMED_STRUCT(name1 K, val1 V, ...)
Supported engine: SparkSQL
Usage instructions: Create a struct using the given field names and values.
Return type: struct
Example:
> SELECT named_struct("a", 1, "b", 2, "c", 3);
{"a":1,"b":2,"c":3}

STRUCT

Function statement:
STRUCT(<col1> T1, <col2> T2, ...)
Supported engines: SparkSQL and Presto
Usage instructions: Create a struct using the given field names and values.
Return type: struct
Example:
> SELECT struct('a', 'b', 'c');
{"col1":"a","col2":"b","col3":"c"}

> SELECT struct('a', 'b', 'c', 1, 2);
{"col1":"a","col2":"b","col3":"c","col4":1,"col5":2}

SLICE

Function statement:
SLICE(<a> array<T>, <start> integer, <length> integer)
Supported engines: SparkSQL and Presto
Usage instructions: Return a subset of array a starting from the index start (array hidden index starts from 1, if start is negative, it counts from the end) with the specified length length.
Return type: array<T>
Example:
> SELECT slice(array(1, 2, 3, 4), 2, 2);
[2,3]
> SELECT slice(array(1, 2, 3, 4), -2, 2);
[3,4]

ARRAYS_ZIP

Function statement:
ARRAYS_ZIP(<a1> array<T>, ...)
Supported engine: SparkSQL
Usage instructions: Return a merged array where each element is of struct type, with the Nth struct containing the Nth value of each input array.
Return type: array<struct<string, T>>
Example:
> SELECT arrays_zip(array(1, 2, 3), array(2, 3, 4));
[{"0":1,"1":2},{"0":2,"1":3},{"0":3,"1":4}]
> SELECT arrays_zip(array(1, 2), array(2, 3), array(3, 4));
[{"0":1,"1":2,"2":3},{"0":2,"1":3,"2":4}]

SORT_ARRAY

Function statement:
SORT_ARRAY(<a> array<T>[, ascendingOrder boolean])
Supported engines: SparkSQL and Presto
Usage instructions: Sort the input array in ascending or descending order. For double/float types, NaN is considered greater than any non-NaN element. Null elements are placed at the beginning of the returned array in ascending order or at the end in descending order.
Return type: array<T>
Example:
> SELECT sort_array(array('b', 'd', null, 'c', 'a'), true);
[null,"a","b","c","d"]

SHUFFLE

Function statement:
SHUFFLE(<a> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Return a random permutation of the given array.
Return type: array<T>
Example:
> SELECT shuffle(array(1, 20, 3, 5));
[3,1,5,20]
> SELECT shuffle(array(1, 20, null, 3));
[20,null,3,1]

ARRAY_MAX

Function statement:
ARRAY_MAX(<a> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Return the maximum value in the array. For double/float types, NaN is considered greater than any non-NaN element. Null elements are skipped.
Return type: array<T>
Example:
> SELECT array_max(array(1, 20, null, 3));
20

ARRAY_MIN

Function statement:
ARRAY_MIN(<a> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Return the minimum value in the array. For double/float types, NaN is considered greater than any non-NaN element. Null elements are skipped.
Return type: array<T>
Example:
> SELECT array_min(array(1, 20, null, 3));
1

FLATTEN

Function statement:
FLATTEN(<aa> array<array<T>>
Supported engines: SparkSQL and Presto
Usage instructions: Convert a two-dimensional array into a one-dimensional array.
Return type: array<T>
Example:
> SELECT flatten(array(array(1, 2), array(3, 4)));
[1,2,3,4]

SEQUENCE

Function statement:
SEQUENCE(<start> integer|date|timestamp, end integer|date|timestamp[, step integer|interval])
Supported engines: SparkSQL and Presto
Usage instructions: Generate an array from start to end (inclusive), incrementing by step. The type of the returned elements matches the type of start and end. The start and stop expressions must parse to the same type. If the start and stop expressions are parsed to the date or timestamp type, then step must be parsed to an interval, year-month interval, or day-time interval type; otherwise, it must be parsed to the same type as start and end.
Return type: same as start
Example:
> SELECT sequence(1, 5);
[1,2,3,4,5]
> SELECT sequence(5, 1);
[5,4,3,2,1]
> SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval 1 month);
[2018-01-01,2018-02-01,2018-03-01]
> SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval '0-1' year to month);
[2018-01-01,2018-02-01,2018-03-01]

ARRAY_REPEAT

Function statement:
ARRAY_REPEAT(<element> T, <count> integer)
Supported engines: SparkSQL and Presto
Usage instructions: Return an array containing count occurrences of element.
Return type: array<T>
Example:
> SELECT array_repeat('123', 2);
["123","123"]

ARRAY_REMOVE

Function statement:
ARRAY_REMOVE(<a> array<T>, <element> T)
Supported engines: SparkSQL and Presto
Usage instructions: Delete all elements equal to the element from the array.
Return type: array<T>
Example:
> SELECT array_remove(array(1, 2, 3, null, 3), 3);
[1,2,null]

ARRAY_DISTINCT

Function statement:
ARRAY_DISTINCT(<a> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Delete duplicate values from the array.
Return type: array<T>
Example:
> SELECT array_distinct(array(1, 2, 3, null, 3));
[1,2,3,null]

ELEMENT_AT

Function statement:
ELEMENT_AT(<a> array<T>, <index> integer)
ELEMENT_AT(<m> map<K, V>, <key> K)
Supported engines: SparkSQL and Presto
Usage instructions: Return the array element at the given (1-based) index or the value for the given key.
Return type: T and V
Example:
> SELECT element_at(array(1, 2, 3), 2);
2
> SELECT element_at(map(1, 'a', 2, 'b'), 2);
b

MAP

Function statement:
MAP(<k1> K, <v1> V, ...)
Supported engines: SparkSQL and Presto
Usage instructions: Generate a map from the given elements.
Return type: MAP<K, V>
Example:
> SELECT map(1.0, '2', 3.0, '4');
{1.0:"2",3.0:"4"}

MAP_FROM_ARRAYS

Function statement:
MAP_FROM_ARRAYS(<keys> array<K>, <values> array<V>)
Supported engines: SparkSQL and Presto
Usage instructions: Create a map using the given key/value arrays. None of the elements in keys should be null.
Return type: map<K, V>
Example:
> SELECT map_from_arrays(array(1.0, 3.0), array('2', '4'));
{1.0:"2",3.0:"4"}

MAP_KEYS

Function statement:
MAP_KEYS(<m> map<K, V>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an unordered array containing the keys of the map.
Return type: array<K>
Example:
> SELECT map_keys(map(1, 'a', 2, 'b'));
[1,2]

MAP_VALUES

Function statement:
MAP_VALUES(<m> map<K, V>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an unordered array containing the values of the map.
Return type: array<V>
Example:
> SELECT map_values(map(1, 'a', 2, 'b'));
["a","b"]

MAP_ENTRIES

Function statement:
MAP_ENTRIES(<m> map<K, V>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an unordered array of all entries in the given map.
Return type: array<struct<K, V>>
Example:
> SELECT map_entries(map(1, 'a', 2, 'b'));
[{"key":1,"value":"a"},{"key":2,"value":"b"}]

MAP_FROM_ENTRIES

Function statement:
MAP_FROM_ENTRIES(<entries> array<struct<K, V>>)
Supported engines: SparkSQL and Presto
Usage instructions: Return a map created from the given array of entries.
Return type: map<K, V>
Example:
> SELECT map_from_entries(array(struct(1, 'a'), struct(2, 'b')));
{1:"a",2:"b"}

MAP_CONCAT

Function statement:
MAP_CONCAT(map1 map<K, V>, ...)
Supported engines: SparkSQL and Presto
Usage instructions: Return the union of all given maps.
Return type: map<K, V>
Example:
> SELECT map_concat(map(1, 'a', 2, 'b'), map(3, 'c'));
{1:"a",2:"b",3:"c"}

MAP_FILTER

Function statement:
MAP_FILTER(<m> map<K, V>, <func> function(K, V)->boolean)
Supported engine: SparkSQL
Usage instructions: Filter the entries in m using func.
Return type: map<K, V>
Example:
> SELECT map_filter(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
{1:0,3:-1}

MAP_ZIP_WITH

Function statement:
MAP_ZIP_WITH(<map1> map<K, V1>, <map2> map<K, V2>, <func> function(K, V1, V2)->R)
Supported engine: SparkSQL
Usage instructions: Merge two given maps into a single map using func. For keys that appear in only one map, their values will be set to NULL. If the input maps contain duplicate keys, only the first entry for each duplicate key is transmitted to func.
Return type: MAP<K, R>
Example:
> SELECT map_zip_with(map(1, 'a', 2, 'b'), map(1, 'x', 2, 'y'), (k, v1, v2) -> concat(v1, v2));
{1:"ax",2:"by"}

TRANSFORM_KEYS

Function statement:
TRANSFORM_KEYS(<m> map<K, V>, <func> function(K, V)->R)
Supported engine: SparkSQL
Usage instructions: Transform the keys in the map using func.
Return type: map<R, V>
Example:
> SELECT transform_keys(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + 1);
{2:1,3:2,4:3}
> SELECT transform_keys(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + v);
{2:1,4:2,6:3}

TRANSFORM_VALUES

Function statement:
TRANSFORM_VALUES(<m> map<K, V>, <func> function(K, V)->R)
Supported engine: SparkSQL
Usage instructions: Transform the values in the map using func.
Return type: map<K, R>
Example:
> SELECT transform_values(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> v + 1);
{1:2,2:3,3:4}
> SELECT transform_values(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + v);
{1:2,2:4,3:6}

SIZE

Function statement:
SIZE(<expr> array<T>|map<K, V>)
Supported engines: SparkSQL and Presto
Usage instructions: Return the size of an array or map.
Return type: integer
Example:
> SELECT size(array('b', 'd', 'c', 'a'));
4
> SELECT size(map('a', 1, 'b', 2));
2
> SELECT size(NULL);
-1

CARDINALITY

Function statement:
CARDINALITY(<expr> array<T>|map<K, V>)
Supported engines: SparkSQL and Presto
Usage instructions: Return the size of an array or map.
Return type: integer
Example:
> SELECT cardinality(array('b', 'd', 'c', 'a'));
4
> SELECT cardinality(map('a', 1, 'b', 2));
2
> SELECT cardinality(NULL);
-1

ANY_MATCH / FORALL

Function statement:
ANY_MATCH(<expr> array<T>, x -> lambda(x))
FORALL(<expr> array<T>, x -> lambda(x))
Supported engine: SparkSQL
Usage instructions: For each element in the array, execute the lambda expression in sequence. The function returns true if at least one element returns true; otherwise, it returns false.
If any element in the array is NULL, the function returns NULL.
Return type: boolean
Example:
> SELECT any_match(array(1, 2, 3), x -> x % 2 == 0);
false
> SELECT any_match(array(2, 4, 8), x -> x % 2 == 0);
true
> SELECT any_match(array(1, null, 3), x -> x % 2 == 0);
NULL



Help and Support

Was this page helpful?

Help us improve! Rate your documentation experience in 5 mins.

Feedback