Collection Functions

Download
Focus Mode
Font Size
Last updated: 2024-08-07 17:31:46
ARRAY
Function statement:
ARRAY(<e1> T, ..., <en> T)
Supported engines: SparkSQL and Presto
Usage instructions: Generate an array from the given elements.
Return type: array<T>
Example:
> SELECT array(1, 2, 3);
 [1,2,3]
FILTER
Function statement:
FILTER(<expr> array<T>, <predicate> function(T[, integer])->boolean)
Supported engine: SparkSQL
Usage instructions: Filter the input array using the given predicate.
Return type: array<T>
Example:
> SELECT `filter`(array(1, 2, 3), x -> x % 2 == 1);
 [1,3]
> SELECT `filter`(array(0, 2, 3), (x, i) -> x > i);
 [2,3]
> SELECT `filter`(array(0, null, 2, 3, null), x -> x IS NOT NULL);
 [0,2,3]
TRANSFORM
Function statement:
TRANSFORM(<expr> array<T>, <func> function(T[, integer])->U)
Supported engine: SparkSQL
Usage instructions: Use func to transform elements in an array.
Return type: array<U>
Example:
> SELECT transform(array(1, 2, 3), x -> x + 1);
 [2,3,4]
> SELECT transform(array(1, 2, 3), (x, i) -> x + i);
 [1,3,5]
ZIP_WITH
Function statement:
ZIP_WITH(<left> array<T>, <right> array<U>, <func> function(T, U)->R)
Supported engine: SparkSQL
Usage instructions: Use a function to merge two given arrays element-wise into a single array. If one array is shorter, nulls are appended at the end to match the length of the longer array before the function is applied.
Return type: array<R>
Example:
> SELECT zip_with(array(1, 2, 3), array('a', 'b', 'c'), (x, y) -> (y, x));
 [{"y":"a","x":1},{"y":"b","x":2},{"y":"c","x":3}]
> SELECT zip_with(array(1, 2), array(3, 4), (x, y) -> x + y);
 [4,6]
> SELECT zip_with(array('a', 'b', 'c'), array('d', 'e', 'f'), (x, y) -> concat(x, y));
 ["ad","be","cf"]
FORALL
Function statement:
FORALL(<expr> array<T>, <pred> function(T)->boolean)
Supported engine: SparkSQL
Usage instructions: Test whether a predicate applies to all elements in the array.
Return type: boolean
Example:
> SELECT forall(array(1, 2, 3), x -> x % 2 == 0);
 false
> SELECT forall(array(2, 4, 8), x -> x % 2 == 0);
 true
> SELECT forall(array(1, null, 3), x -> x % 2 == 0);
 false
> SELECT forall(array(2, null, 8), x -> x % 2 == 0);
 NULL
AGGREGATE
Function statement:
AGGREGATE(<expr> array<T>, <start> U, <merge> function(U, T)->U, <finish> function(U)->R)
Supported engine: SparkSQL
Usage instructions: Aggregate the elements in the array expr.
Return type: R
Example:
> SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x);
 6
> SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x, acc -> acc * 10);
 60
EXISTS
Function statement:
EXISTS(<expr> array<T>, <pred> function(T)->boolean)
Supported engine: SparkSQL
Usage instructions: Test whether a predicate applies to one or more elements in the array.
Return type: boolean
Example:
> SELECT exists(array(1, 2, 3), x -> x % 2 == 0);
 true> SELECT exists(array(1, 2, 3), x -> x % 2 == 10);
 false
> SELECT exists(array(1, null, 3), x -> x % 2 == 0);
 NULL
> SELECT exists(array(0, null, 2, 3, null), x -> x IS NULL);
 true
> SELECT exists(array(1, 2, 3), x -> x IS NULL);
 false
ARRAY_CONTAINS
Function statement:
ARRAY_CONTAINS(<expr> array<T>, <value> T)
Supported engines: SparkSQL and Presto
Usage instructions: Return true if the array contains value.
Return type: boolean
Example:
> SELECT array_contains(array(1, 2, 3), 2);
 true
ARRAYS_OVERLAP
Function statement:
ARRAYS_OVERLAP(<a> array<T>, <b> array<U>)
Supported engines: SparkSQL and Presto
Usage instructions: Return true if a contains at least one non-null element that also exists in b. If the arrays have no common elements, and they are non-empty, and either contains null elements, it returns null. Otherwise, it returns false.
Return type: boolean
Example:
> SELECT arrays_overlap(array(1, 2, 3), array(3, 4, 5));
 true
ARRAY_INTERSECT
Function statement:
ARRAY_INTERSECT(<a> array<T>, <b> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an array of elements that are the intersection of a and b, without duplicates.
Return type: array<T>
Example:
> SELECT array_intersect(array(1, 2, 3), array(1, 3, 5));
 [1,3]
ARRAY_JOIN
Function statement:
ARRAY_JOIN(<a> ARRAY<T>, <delimiter> string[, <nullReplacement> string])
Supported engines: SparkSQL and Presto
Usage instructions: Join the elements of the given array using a delimiter and an optional string to replace nulls. If nullReplacement is not set, all null values are filtered out.
Return type: string
Example:
> SELECT array_join(array('hello', 'world'), ' ');
 hello world
> SELECT array_join(array('hello', null ,'world'), ' ');
 hello world
> SELECT array_join(array('hello', null ,'world'), ' ', ',');
 hello , world
ARRAY_POSITION
Function statement:
ARRAY_POSITION(<a> array<T>, <element> T)
Supported engines: SparkSQL and Presto
Usage instructions: Return the 1-based index of the first element of the array.
Return type: integer
Example:
> SELECT array_position(array(3, 2, 1), 1);
 3
ARRAY_SORT
Function statement:
ARRAY_SORT(<a> array<T>[, <func> function(T, T)->integer])
Supported engines: SparkSQL and Presto
Usage instructions: Sort the input array. If func is omitted, the array is sorted in ascending order.
Return type: array<T>
Example:
> SELECT array_sort(array(5, 6, 1), (left, right) -> case when left < right then -1 when left > right then 1 else 0 end);
 [1,5,6]
> SELECT array_sort(array('bc', 'ab', 'dc'), (left, right) -> case when left is null and right is null then 0 when left is null then -1 when right is null then 1 when left < right then 1 when left > right then -1 else 0 end);
 ["dc","bc","ab"]
> SELECT array_sort(array('b', 'd', null, 'c', 'a'));
 ["a","b","c","d",null]
ARRAY_EXCEPT
Function statement:
ARRAY_EXCEPT(<a> array<T>, <b> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an array of elements in a that are not in b, without duplicates.
Return type: array<T>
Example:
> SELECT array_except(array(1, 2, 3), array(1, 3, 5));
 [2]
ARRAY_UNION
Function statement:
ARRAY_UNION(<a> array<T>, <b> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an array of elements that are the union of a and b, without duplicates.
Return type: array<T>
Example:
> SELECT array_union(array(1, 2, 3), array(1, 3, 5));
 [1,2,3,5]
NAMED_STRUCT
Function statement:
NAMED_STRUCT(name1 K, val1 V, ...)
Supported engine: SparkSQL
Usage instructions: Create a struct using the given field names and values.
Return type: struct
Example:
> SELECT named_struct("a", 1, "b", 2, "c", 3);
 {"a":1,"b":2,"c":3}
STRUCT
Function statement:
STRUCT(<col1> T1, <col2> T2, ...)
Supported engines: SparkSQL and Presto
Usage instructions: Create a struct using the given field names and values.
Return type: struct
Example:
> SELECT struct('a', 'b', 'c');
{"col1":"a","col2":"b","col3":"c"}
﻿
> SELECT struct('a', 'b', 'c', 1, 2);
{"col1":"a","col2":"b","col3":"c","col4":1,"col5":2}
SLICE
Function statement:
SLICE(<a> array<T>, <start> integer, <length> integer)
Supported engines: SparkSQL and Presto
Usage instructions: Return a subset of array a starting from the index start (array hidden index starts from 1, if start is negative, it counts from the end) with the specified length length.
Return type: array<T>
Example:
> SELECT slice(array(1, 2, 3, 4), 2, 2);
 [2,3]
> SELECT slice(array(1, 2, 3, 4), -2, 2);
 [3,4]
ARRAYS_ZIP
Function statement:
ARRAYS_ZIP(<a1> array<T>, ...)
Supported engine: SparkSQL
Usage instructions: Return a merged array where each element is of struct type, with the Nth struct containing the Nth value of each input array.
Return type: array<struct<string, T>>
Example:
> SELECT arrays_zip(array(1, 2, 3), array(2, 3, 4));
 [{"0":1,"1":2},{"0":2,"1":3},{"0":3,"1":4}]
> SELECT arrays_zip(array(1, 2), array(2, 3), array(3, 4));
 [{"0":1,"1":2,"2":3},{"0":2,"1":3,"2":4}]
SORT_ARRAY
Function statement:
SORT_ARRAY(<a> array<T>[, ascendingOrder boolean])
Supported engines: SparkSQL and Presto
Usage instructions: Sort the input array in ascending or descending order. For double/float types, NaN is considered greater than any non-NaN element. Null elements are placed at the beginning of the returned array in ascending order or at the end in descending order.
Return type: array<T>
Example:
> SELECT sort_array(array('b', 'd', null, 'c', 'a'), true);
 [null,"a","b","c","d"]
SHUFFLE
Function statement:
SHUFFLE(<a> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Return a random permutation of the given array.
Return type: array<T>
Example:
> SELECT shuffle(array(1, 20, 3, 5));
 [3,1,5,20]
> SELECT shuffle(array(1, 20, null, 3));
 [20,null,3,1]
ARRAY_MAX
Function statement:
ARRAY_MAX(<a> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Return the maximum value in the array. For double/float types, NaN is considered greater than any non-NaN element. Null elements are skipped.
Return type: array<T>
Example:
> SELECT array_max(array(1, 20, null, 3));
 20
ARRAY_MIN
Function statement:
ARRAY_MIN(<a> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Return the minimum value in the array. For double/float types, NaN is considered greater than any non-NaN element. Null elements are skipped.
Return type: array<T>
Example:
> SELECT array_min(array(1, 20, null, 3));
 1
FLATTEN
Function statement:
FLATTEN(<aa> array<array<T>>
Supported engines: SparkSQL and Presto
Usage instructions: Convert a two-dimensional array into a one-dimensional array.
Return type: array<T>
Example:
> SELECT flatten(array(array(1, 2), array(3, 4)));
 [1,2,3,4]
SEQUENCE
Function statement:
SEQUENCE(<start> integer|date|timestamp, end integer|date|timestamp[, step integer|interval])
Supported engines: SparkSQL and Presto
Usage instructions: Generate an array from start to end (inclusive), incrementing by step. The type of the returned elements matches the type of start and end.
The start and stop expressions must parse to the same type. If the start and stop expressions are parsed to the date or timestamp type, then step must be parsed to an interval, year-month interval, or day-time interval type; otherwise, it must be parsed to the same type as start and end.
Return type: same as start
Example:
> SELECT sequence(1, 5);
 [1,2,3,4,5]
> SELECT sequence(5, 1);
 [5,4,3,2,1]
> SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval 1 month);
 [2018-01-01,2018-02-01,2018-03-01]
> SELECT sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval '0-1' year to month);
 [2018-01-01,2018-02-01,2018-03-01]
ARRAY_REPEAT
Function statement:
ARRAY_REPEAT(<element> T, <count> integer)
Supported engines: SparkSQL and Presto
Usage instructions: Return an array containing count occurrences of element.
Return type: array<T>
Example:
> SELECT array_repeat('123', 2);
 ["123","123"]
ARRAY_REMOVE
Function statement:
ARRAY_REMOVE(<a> array<T>, <element> T)
Supported engines: SparkSQL and Presto
Usage instructions: Delete all elements equal to the element from the array.
Return type: array<T>
Example:
> SELECT array_remove(array(1, 2, 3, null, 3), 3);
 [1,2,null]
ARRAY_DISTINCT
Function statement:
ARRAY_DISTINCT(<a> array<T>)
Supported engines: SparkSQL and Presto
Usage instructions: Delete duplicate values from the array.
Return type: array<T>
Example:
> SELECT array_distinct(array(1, 2, 3, null, 3));
 [1,2,3,null]
ELEMENT_AT
Function statement:
ELEMENT_AT(<a> array<T>, <index> integer)
ELEMENT_AT(<m> map<K, V>, <key> K)
Supported engines: SparkSQL and Presto
Usage instructions: Return the array element at the given (1-based) index or the value for the given key.
Return type: T and V
Example:
> SELECT element_at(array(1, 2, 3), 2);
 2
> SELECT element_at(map(1, 'a', 2, 'b'), 2);
 b
MAP
Function statement:
MAP(<k1> K, <v1> V, ...)
Supported engines: SparkSQL and Presto
Usage instructions: Generate a map from the given elements.
Return type: MAP<K, V>
Example:
> SELECT map(1.0, '2', 3.0, '4');
 {1.0:"2",3.0:"4"}
MAP_FROM_ARRAYS
Function statement:
MAP_FROM_ARRAYS(<keys> array<K>, <values> array<V>)
Supported engines: SparkSQL and Presto
Usage instructions: Create a map using the given key/value arrays. None of the elements in keys should be null.
Return type: map<K, V>
Example:
> SELECT map_from_arrays(array(1.0, 3.0), array('2', '4'));
 {1.0:"2",3.0:"4"}
MAP_KEYS
Function statement:
MAP_KEYS(<m> map<K, V>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an unordered array containing the keys of the map.
Return type: array<K>
Example:
> SELECT map_keys(map(1, 'a', 2, 'b'));
 [1,2]
MAP_VALUES
Function statement:
MAP_VALUES(<m> map<K, V>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an unordered array containing the values of the map.
Return type: array<V>
Example:
> SELECT map_values(map(1, 'a', 2, 'b'));
 ["a","b"]
MAP_ENTRIES
Function statement:
MAP_ENTRIES(<m> map<K, V>)
Supported engines: SparkSQL and Presto
Usage instructions: Return an unordered array of all entries in the given map.
Return type: array<struct<K, V>>
Example:
> SELECT map_entries(map(1, 'a', 2, 'b'));
 [{"key":1,"value":"a"},{"key":2,"value":"b"}]
MAP_FROM_ENTRIES
Function statement:
MAP_FROM_ENTRIES(<entries> array<struct<K, V>>)
Supported engines: SparkSQL and Presto
Usage instructions: Return a map created from the given array of entries.
Return type: map<K, V>
Example:
> SELECT map_from_entries(array(struct(1, 'a'), struct(2, 'b')));
 {1:"a",2:"b"}
MAP_CONCAT
Function statement:
MAP_CONCAT(map1 map<K, V>, ...)
Supported engines: SparkSQL and Presto
Usage instructions: Return the union of all given maps.
Return type: map<K, V>
Example:
> SELECT map_concat(map(1, 'a', 2, 'b'), map(3, 'c'));
 {1:"a",2:"b",3:"c"}
MAP_FILTER
Function statement:
MAP_FILTER(<m> map<K, V>, <func> function(K, V)->boolean)
Supported engine: SparkSQL
Usage instructions: Filter the entries in m using func.
Return type: map<K, V>
Example:
> SELECT map_filter(map(1, 0, 2, 2, 3, -1), (k, v) -> k > v);
 {1:0,3:-1}
MAP_ZIP_WITH
Function statement:
MAP_ZIP_WITH(<map1> map<K, V1>, <map2> map<K, V2>, <func> function(K, V1, V2)->R)
Supported engine: SparkSQL
Usage instructions: Merge two given maps into a single map using func. For keys that appear in only one map, their values will be set to NULL. If the input maps contain duplicate keys, only the first entry for each duplicate key is transmitted to func.
Return type: MAP<K, R>
Example:
> SELECT map_zip_with(map(1, 'a', 2, 'b'), map(1, 'x', 2, 'y'), (k, v1, v2) -> concat(v1, v2));
 {1:"ax",2:"by"}
TRANSFORM_KEYS
Function statement:
TRANSFORM_KEYS(<m> map<K, V>, <func> function(K, V)->R)
Supported engine: SparkSQL
Usage instructions: Transform the keys in the map using func.
Return type: map<R, V>
Example:
> SELECT transform_keys(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + 1);
 {2:1,3:2,4:3}
> SELECT transform_keys(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + v);
 {2:1,4:2,6:3}
TRANSFORM_VALUES
Function statement:
TRANSFORM_VALUES(<m> map<K, V>, <func> function(K, V)->R)
Supported engine: SparkSQL
Usage instructions: Transform the values in the map using func.
Return type: map<K, R>
Example:
> SELECT transform_values(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> v + 1);
 {1:2,2:3,3:4}
> SELECT transform_values(map_from_arrays(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + v);
 {1:2,2:4,3:6}
SIZE
Function statement:
SIZE(<expr> array<T>|map<K, V>)
Supported engines: SparkSQL and Presto
Usage instructions: Return the size of an array or map.
Return type: integer
Example:
> SELECT size(array('b', 'd', 'c', 'a'));
 4
> SELECT size(map('a', 1, 'b', 2));
 2
> SELECT size(NULL);
 -1
CARDINALITY
Function statement:
CARDINALITY(<expr> array<T>|map<K, V>)
Supported engines: SparkSQL and Presto
Usage instructions: Return the size of an array or map.
Return type: integer
Example:
> SELECT cardinality(array('b', 'd', 'c', 'a'));
 4
> SELECT cardinality(map('a', 1, 'b', 2));
 2
> SELECT cardinality(NULL);
 -1
ANY_MATCH / FORALL
Function statement:
ANY_MATCH(<expr> array<T>, x -> lambda(x))
FORALL(<expr> array<T>, x -> lambda(x))
Supported engine: SparkSQL
Usage instructions: For each element in the array, execute the lambda expression in sequence. The function returns true if at least one element returns true; otherwise, it returns false.
If any element in the array is NULL, the function returns NULL.
Return type: boolean
Example:
> SELECT any_match(array(1, 2, 3), x -> x % 2 == 0); 
false
> SELECT any_match(array(2, 4, 8), x -> x % 2 == 0); 
true
> SELECT any_match(array(1, null, 3), x -> x % 2 == 0); 
NULL
﻿
﻿
Help and Support

Was this page helpful?
You can also Contact sales or Submit a Ticket for help.
Help us improve! Rate your documentation experience in 5 mins.
Feedback
tencent cloud

Data Lake Compute

Collection Functions

ARRAY

FILTER

TRANSFORM

ZIP_WITH

FORALL

AGGREGATE

EXISTS

ARRAY_CONTAINS

ARRAYS_OVERLAP

ARRAY_INTERSECT

ARRAY_JOIN

ARRAY_POSITION

ARRAY_SORT

ARRAY_EXCEPT

ARRAY_UNION

NAMED_STRUCT

STRUCT

SLICE

ARRAYS_ZIP

SORT_ARRAY

SHUFFLE

ARRAY_MAX

ARRAY_MIN

FLATTEN

SEQUENCE

ARRAY_REPEAT

ARRAY_REMOVE

ARRAY_DISTINCT

ELEMENT_AT

MAP

MAP_FROM_ARRAYS

MAP_KEYS

MAP_VALUES

MAP_ENTRIES

MAP_FROM_ENTRIES

MAP_CONCAT

MAP_FILTER

MAP_ZIP_WITH

TRANSFORM_KEYS

TRANSFORM_VALUES

SIZE

CARDINALITY

ANY_MATCH / FORALL

Help and Support