Type | Supported Operators | Unsupported Operators |
Source | FileSourceScanExec,HiveTableScanExec,BatchScanExec,InMemoryTableScanExec | - |
Sink | DataWritingCommandExec,InsertIntoHiveTable | - |
Common | FilterExec,ProjectExec,SortExec,UnionExec | - |
Aggregate | HashAggregateExec | SortAggregateExec,ObjectHashAggregateExec |
Join | BroadcastHashJoinExec,ShuffledHashJoinExec,SortMergeJoinExec,BroadcastNestedLoopJoinExec,CartesianProductExec | - |
Window | WindowExec | WindowGroupLimitExec |
Exchange | ShuffleExchangeExec,ReusedExchangeExec,BroadcastExchangeExec,CoalesceExec | CustomShuffleReaderExec |
Limit | GlobalLimitExec,LocalLimitExec,TakeOrderedAndProjectExec,CollectLimitExec | - |
Subquery | SubqueryBroadcastExec | - |
Other | ExpandExec,GenerateExec,CollectTailExec,RangeExec | RangeExec,SampleExec |
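As a rough illustration of how the operator table can be used, the Python sketch below checks a list of physical operator names against the operators listed as unsupported. The helper `find_unsupported` and the sample operator list are hypothetical and not part of Meson or Gluten. RangeExec is left out because the table lists it in both columns.

```python
# Operators listed in the "Unsupported Operators" column of the table above.
# RangeExec is omitted because the table lists it in both columns.
UNSUPPORTED_OPERATORS = {
    "SortAggregateExec",
    "ObjectHashAggregateExec",
    "WindowGroupLimitExec",
    "CustomShuffleReaderExec",
    "SampleExec",
}


def find_unsupported(operator_names):
    """Return the unsupported operators found, in input order, without duplicates."""
    seen = set()
    found = []
    for name in operator_names:
        if name in UNSUPPORTED_OPERATORS and name not in seen:
            seen.add(name)
            found.append(name)
    return found


# Hypothetical operator list, for example collected by hand from a query plan.
plan_operators = ["FileSourceScanExec", "FilterExec", "SortAggregateExec", "SampleExec"]
print(find_unsupported(plan_operators))
```

An empty result only means that none of the operators named in the table were found; it does not guarantee that the whole query runs natively.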
Type | Supported Functions |
Generator Functions | explode,explode_outer,inline,inline_outer,posexplode,posexplode_outer,stack |
Window Functions | cume_dist,dense_rank,lag,lead,nth_value,ntile,percent_rank,rank,row_number |
Aggregate Functions | any,any_value,approx_count_distinct,approx_percentile,array_agg,avg,bit_and,bit_or,bit_xor,bool_and,bool_or,collect_list,collect_set,corr,count,count_if,covar_pop,covar_samp,every,first,first_value,grouping,grouping_id,kurtosis,last,last_value,max,max_by,mean,median,min,min_by,percentile,percentile_approx,regr_avgx,regr_avgy,regr_count,regr_intercept,regr_r2,regr_slope,regr_sxx,regr_sxy,regr_syy,skewness,some,std,stddev,stddev_pop,stddev_samp,sum,try_avg,try_sum,var_pop,var_samp,variance |
Array Functions | array,array_append,array_compact,array_contains,array_distinct,array_except,array_insert,array_intersect,array_join,array_max,array_min,array_position,array_prepend,array_remove,array_repeat,array_union,arrays_overlap,arrays_zip,flatten,get,shuffle,slice,sort_array |
Bitwise Functions | &,^,bit_count,bit_get,getbit,shiftright,|,~ |
Collection Functions | array_size,cardinality,concat,reverse,size |
Conditional Functions | coalesce,if,ifnull,nanvl,nullif,nvl,nvl2,when |
Conversion Functions | bigint,binary,boolean,cast,date,decimal,double,float,int,smallint,string,timestamp,tinyint |
Date and Timestamp Functions | add_months,date_add,date_diff,date_format,date_from_unix_date,date_sub,date_trunc,dateadd,datediff,day,dayofmonth,dayofweek,dayofyear,extract,from_unixtime,from_utc_timestamp,hour,last_day,make_date,make_timestamp,make_ym_interval,minute,month,next_day,quarter,second,timestamp_micros,timestamp_millis,to_unix_timestamp,to_utc_timestamp,trunc,unix_date,unix_micros,unix_millis,unix_seconds,unix_timestamp,weekday,weekofyear,year |
Hash Functions | crc32,hash,md5,sha,sha1,sha2,xxhash64 |
JSON Functions | from_json,get_json_object,json_array_length,json_object_keys,json_tuple,schema_of_json,to_json |
Lambda Functions | aggregate,array_sort,exists,filter,forall,map_filter,map_zip_with,reduce,transform,transform_keys,transform_values,zip_with |
Map Functions | element_at,map,map_concat,map_contains_key,map_entries,map_keys,map_values,str_to_map,try_element_at |
Mathematical Functions | %,*,+,-,/,abs,acos,acosh,asin,asinh,atan,atan2,atanh,bin,cbrt,ceil,ceiling,conv,cos,cosh,cot,csc,degrees,e,exp,expm1,factorial,floor,greatest,hex,hypot,least,log,log10,log1p,log2,mod,negative,pi,pmod,positive,pow,power,rand,random,rint,round,sec,shiftleft,sign,signum,sinh,sqrt,try_add,unhex,width_bucket |
Misc Functions | assert_true,equal_null,spark_partition_id,uuid,version,|| |
Predicate Functions | !,!=,<,<=,<=>,<>,=,==,>,>=,and,between,case,ilike,in,isnan,isnotnull,isnull,like,not,or,regexp,regexp_like |
String Functions | ascii,base64,bit_length,btrim,char,char_length,character_length,chr,concat_ws,contains,endswith,find_in_set,format_number,format_string,initcap,instr,lcase,left,len,length,levenshtein,locate,lower,lpad,ltrim,luhn_check,mask,overlay,position,regexp_extract,regexp_extract_all,regexp_replace,repeat,replace,right,rpad,rtrim,soundex,split,split_part,startswith,substr,substring,substring_index,translate,trim,ucase,unbase64,upper |
Struct Functions | named_struct,struct |
URL Functions | url_decode,url_encode |
Parameter | Description |
spark.plugins | The plug-in loaded by Spark. Set the value to org.apache.gluten.GlutenPlugin. If spark.plugins is already configured, append org.apache.gluten.GlutenPlugin to the existing value, separated by a comma (,). |
spark.memory.offHeap.enabled | Set to true. Meson acceleration requires JVM off-heap memory. |
spark.memory.offHeap.size | Set the off-heap memory size according to your workload. For details, see the recommended executor memory configurations for different specifications. |
spark.shuffle.manager | The columnar shuffle manager used by Meson. Set the value to org.apache.spark.shuffle.sort.ColumnarShuffleManager. |
executor-cores | spark.executor.memory | spark.memory.offHeap.size |
2 | 2GB | 4GB |
4 | 3GB | 10GB |
8 | 6GB | 20GB |
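The parameter table and the memory recommendations above can be combined into one set of Spark properties. The following Python sketch shows one way to do that, assuming that executor-cores corresponds to the standard spark.executor.cores property. The helper names `merge_plugins` and `meson_conf` and the plug-in `com.example.OtherPlugin` are hypothetical and used only for illustration.

```python
GLUTEN_PLUGIN = "org.apache.gluten.GlutenPlugin"
COLUMNAR_SHUFFLE_MANAGER = "org.apache.spark.shuffle.sort.ColumnarShuffleManager"

# executor-cores -> (spark.executor.memory, spark.memory.offHeap.size),
# copied from the recommendation table above.
RECOMMENDED_MEMORY = {
    2: ("2g", "4g"),
    4: ("3g", "10g"),
    8: ("6g", "20g"),
}


def merge_plugins(existing):
    """Append the Gluten plug-in to an existing comma-separated spark.plugins value."""
    plugins = [p.strip() for p in (existing or "").split(",") if p.strip()]
    if GLUTEN_PLUGIN not in plugins:
        plugins.append(GLUTEN_PLUGIN)
    return ",".join(plugins)


def meson_conf(executor_cores, existing_plugins=None):
    """Build the Spark properties from the tables above for one executor size."""
    if executor_cores not in RECOMMENDED_MEMORY:
        raise ValueError(f"no recommendation listed for {executor_cores} executor cores")
    executor_memory, off_heap_size = RECOMMENDED_MEMORY[executor_cores]
    return {
        "spark.plugins": merge_plugins(existing_plugins),
        "spark.memory.offHeap.enabled": "true",
        "spark.memory.offHeap.size": off_heap_size,
        "spark.shuffle.manager": COLUMNAR_SHUFFLE_MANAGER,
        "spark.executor.cores": str(executor_cores),
        "spark.executor.memory": executor_memory,
    }


# com.example.OtherPlugin is a hypothetical, already-configured plug-in.
for key, value in meson_conf(4, existing_plugins="com.example.OtherPlugin").items():
    print(f"{key}={value}")
```

The sketch rejects executor sizes that the table does not list rather than interpolating, since the source gives no formula for other sizes.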
Parameter | Description |
spark.plugins | The plug-in loaded by Spark. Set the value to org.apache.gluten.GlutenPlugin. If spark.plugins is already configured, append org.apache.gluten.GlutenPlugin to the existing value, separated by a comma (,). |
spark.memory.offHeap.enabled | Set to true. Native acceleration requires JVM off-heap memory. |
spark.memory.offHeap.size | Set the off-heap memory size according to your workload. The initial size can be set to 1G. |
spark.shuffle.manager | The columnar shuffle manager used by Meson. Set the value to org.apache.spark.shuffle.sort.ColumnarShuffleManager. |
spark.driver.extraClassPath | The Gluten native JAR used by the Spark driver. The default path is /usr/local/service/spark/gluten. |
spark.executor.extraClassPath | The Gluten native JAR used by the Spark executors. The default path is /usr/local/service/spark/gluten. |
spark.executorEnv.LIBHDFS3_CONF | Path of the configuration file for the integrated HDFS cluster. The default path is /usr/local/service/hadoop/etc/hadoop/hdfs-site.xml. |
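The parameters in the table above can be passed to spark-submit as --conf options. The Python sketch below renders them that way, using the default paths and the 1G initial off-heap size from the table. The helper `to_submit_args` and the application file `your_app.py` are hypothetical. The table does not state a JAR file name or whether a classpath wildcard is needed, so the paths are used exactly as given.

```python
import shlex

# Values taken from the table above; adjust them to the actual cluster.
NATIVE_CONF = {
    "spark.plugins": "org.apache.gluten.GlutenPlugin",
    "spark.memory.offHeap.enabled": "true",
    "spark.memory.offHeap.size": "1g",
    "spark.shuffle.manager": "org.apache.spark.shuffle.sort.ColumnarShuffleManager",
    "spark.driver.extraClassPath": "/usr/local/service/spark/gluten",
    "spark.executor.extraClassPath": "/usr/local/service/spark/gluten",
    "spark.executorEnv.LIBHDFS3_CONF": "/usr/local/service/hadoop/etc/hadoop/hdfs-site.xml",
}


def to_submit_args(conf):
    """Turn a dict of Spark properties into a flat list of --conf arguments."""
    args = []
    for key, value in conf.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


# your_app.py is a placeholder for the application to submit.
command = ["spark-submit", *to_submit_args(NATIVE_CONF), "your_app.py"]
print(" ".join(shlex.quote(part) for part in command))
```

Building the argument list from a dictionary keeps the settings in one place, so the same values can also be written to spark-defaults.conf if preferred.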