
Function Development
Last updated: 2025-03-04 12:18:50
1. Log in to the WeData console.
2. Click Project List in the left menu and find the target project in which you want to use function development.
3. Select the project and click to enter the Data Development module.
4. Click Function Development in the left menu.

Function Overview

UDF files uploaded through the resource management feature can be registered in function development. After assigning a category, specifying the class name, and describing the usage, the function can be used in the data development process. Currently, Hive SQL, Spark SQL, and DLC SQL function types are supported.

Create a Function

1. On the function development page, click the create button to create a new Hive SQL function, Spark SQL function, or DLC SQL function. You can also click the create button on the right side of the target path under the corresponding function type to create a function of that type directly.



2. Configure the function in the pop-up window, then click Save and Submit to complete the function registration.

The configuration information is shown in the table below:
Function Type: The preset category under which the function is created. Categories include: analytical functions, encryption functions, aggregate functions, logical functions, date and time functions, math functions, conversion functions, string functions, IP and domain name functions, window functions, and other functions.
Class Name: The fully qualified class name of the function.
Function file: The source of the function file:
- Select resource file: select the function file from the jar or zip resources uploaded through the resource management feature.
- Specify COS path: obtain the function file from a platform COS bucket path.
Resource file: Shown when Function file is set to Select resource file; select the required function file from the resource management directory.
COS Path: Shown when Function file is set to Specify COS path; enter the path of the function file in the platform COS bucket.
Command Format: The format is function name(parameter). For example, the command format of the sum function is sum(col).
Usage Instructions: Instructions for the custom function. For example, the usage instructions for the sum function are: calculates the summarized value.
Parameter Description: Parameter description of the custom function. For example, the parameter description for the sum function is: col, required; the column value can be of type DOUBLE, DECIMAL, or BIGINT. If the input is of type STRING, it is implicitly converted to DOUBLE before the calculation.
Return Value: Description of the custom function's return value. For example, the sum function returns a DOUBLE value.
Example: Example description of the custom function. For example, the example for the sum function is: calculate the total sales amount of all products; the command example is select sum(sales) from table.
3. After a function is modified, the version feature saves a history record, including the version number, submitter, submission time, change type, and remarks. Rolling back to a historical version is also supported.


Function Example

Spark SQL Function Development Example

1. Create a project
Create a Maven project and add the hive-exec dependency. You can create the project with the mvn command line or with the IDEA tool; replace groupId and artifactId with your own names.
mvn archetype:generate -DgroupId=com.example -DartifactId=demo-hive -Dversion=1.0-SNAPSHOT -Dpackage=com.example -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

2. Write code
Add the hive-exec and junit test dependencies to pom.xml.
<dependencies>
  <dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>2.3.8</version>
    <exclusions>
      <exclusion>
        <groupId>org.pentaho</groupId>
        <artifactId>pentaho-aggdesigner-algorithm</artifactId>
      </exclusion>
      <exclusion>
        <groupId>javax.servlet</groupId>
        <artifactId>servlet-api</artifactId>
      </exclusion>
      <exclusion>
        <groupId>org.eclipse.jetty.orbit</groupId>
        <artifactId>javax.servlet</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.11</version>
    <scope>test</scope>
  </dependency>
</dependencies>
Create a Java class in the `src/main/java/com/example` directory, inherit from `org.apache.hadoop.hive.ql.exec.UDF`, and implement the `evaluate` method to define the specific behavior of the custom function, for example: converting the input string to uppercase.
package com.example;

import org.apache.hadoop.hive.ql.exec.UDF;

public class UppercaseUDF extends UDF {
    public String evaluate(String input) {
        // Return null for null input to avoid a NullPointerException on null column values.
        return input == null ? null : input.toUpperCase();
    }
}
3. Compilation and packaging
Add the Maven packaging plugins to pom.xml, then execute the mvn package command at the root path of the project to compile and package; the generated package is named demo-hive-1.0-SNAPSHOT.jar.
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <version>3.8.1</version>
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
      </configuration>
    </plugin>
    <!-- (start) package a jar with dependencies -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-assembly-plugin</artifactId>
      <version>3.0.0</version>
      <configuration>
        <archive>
          <!-- Class recorded in the jar manifest (optional for a UDF jar) -->
          <manifest>
            <mainClass>com.example.UppercaseUDF</mainClass>
          </manifest>
        </archive>
        <!-- jar-with-dependencies is a predefined descriptor; do not modify it -->
        <descriptorRefs>
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
        <appendAssemblyId>false</appendAssemblyId>
      </configuration>
      <executions>
        <execution>
          <id>make-assembly</id> <!-- used for inheritance merges -->
          <phase>package</phase> <!-- bind to the package phase -->
          <goals>
            <goal>single</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
    <!-- (end) package a jar with dependencies -->
  </plugins>
</build>

<repositories>
  <repository>
    <id>alimaven</id>
    <name>aliyun maven</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
  </repository>
</repositories>
Execute the `mvn package` command:
mvn package -Dmaven.test.skip=true
4. Use the function
Go to the WeData function development page and create a custom function. Fill in the fully qualified class name of the function: com.example.UppercaseUDF, and select the corresponding resource file, i.e., the jar package that implements the custom function. If no resource file exists yet, create one first.
4.1 Resource upload:
Upload the function package demo-hive-1.0-SNAPSHOT.jar through the resource management feature.

4.2 Function Creation:
Create a Spark SQL function through the function development feature.

Example Function Information:
Function Type: Other functions
Class Name: com.example.UppercaseUDF
Function file: Select resource file
Resource file: demo-hive-1.0-SNAPSHOT.jar
Command Format: UppercaseUDF(col)
Usage Instructions: Converts the input string to uppercase
Parameter Description: Input parameter of STRING type
Return Value: Returns the uppercase form of the input string
4.3 Function usage:
In the development space, create a new SQL file and use the newly created function to verify that it works.
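For example, a query such as the following can verify the registered function (the table and column names here are hypothetical; substitute your own):

```sql
-- sales_demo and product_name are hypothetical names used for illustration only.
SELECT product_name,
       UppercaseUDF(product_name) AS product_name_upper
FROM sales_demo
LIMIT 10;
```

If the function is registered correctly, each row of product_name_upper should contain the uppercase form of the corresponding product_name value.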


DLC SQL Function Development Example

You can directly reuse the above UppercaseUDF example to create and use a DLC SQL function.
