
Creating Processing Task
Last updated: 2025-12-02 18:03:32
This document describes how to create a data processing task.

Prerequisites

You have activated Cloud Log Service (CLS), created a source log topic, and successfully collected log data.
You have created a target log topic. It is advisable to use an empty topic as the target log topic so that processed data can be written to it.
Ensure that the current account has permission to configure data processing tasks. For details, see CLS Access Policy Template.

Operation Steps

1. Log in to the CLS console and select Data Processing in the left sidebar.
2. Click Create and configure the basic information:
Preprocessing of Data
Non-Data Preprocessing
Pre-processing usage scenario:
Logs are collected to CLS, processed (filtered and structured), and then written to a log topic.
Example: The log topic is named "test", and LogListener is used to collect data. You need to write only logs with loglevel="Error" to "test" and discard the other logs. You can create a pre-processing task named "my_transform" to do this (see the sketch after the note below).


The value of pre-processing:
Filtering logs during pre-processing effectively reduces log write traffic, index traffic, index storage, and log storage.
Structuring logs during pre-processing lets you use SQL to analyze logs, configure dashboards, and set up alarms after key-value indexes are enabled.
Note:
Each log topic can be configured with only one pre-processing task.
Pre-processing does not currently support distributing logs to multiple fixed log topics or to dynamic log topics, so the log_output and log_auto_output functions cannot be used in pre-processing tasks.
If you add or modify a collection configuration, the collected data changes, so you need to adjust the processing statement accordingly.
You cannot switch between pre-processing tasks and non-pre-processing tasks.
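For the "my_transform" example above (keep only loglevel="Error" logs in "test"), a minimal processing statement might look like the following sketch. It assumes the CLS DSL helpers log_keep, op_eq, and v are available with the semantics shown; check the processing function reference before use.
// Keep only logs whose loglevel field equals "Error"; all other logs are dropped and not written to "test".
log_keep(op_eq(v("loglevel"), "Error"))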

Configuration Item:

Configuration Item
Description
Task Name
Name of the data processing task, for example: my_transform.
Enabling Status
Start or stop the task. The task is started by default.
Preprocessing Data
Turn on the switch.
Entries to the pre-processing feature:
Entry 1: Toggle on the Preprocessing Data switch when creating a data processing task.
Entry 2: You can also click Data Processing at the bottom of the Collection Configuration page to enter the preprocessing data editing page.
Log Topic
Specify the log topic to write pre-processing results to.
External Data Source
Add an external data source. This is applicable to dimension table join scenarios. Currently, only TencentDB for MySQL is supported. See the res_rds_mysql function.
Region: The region where the TencentDB for MySQL instance is located.
TencentDB for MySQL instance: Select an instance from the drop-down menu.
Username: Enter your database username.
Password: Enter your database password.
Alias: The alias of your MySQL instance, which is used as a parameter in the res_rds_mysql function.
Note:
Currently, it is only available in the Shanghai region. If needed, please Contact Us.
Data processing service log
The run logs of the data processing task are saved in the cls_service_log log topic (free of charge). The alarm feature in the monitoring chart depends on this log topic and is enabled by default.
Upload Processing Failure Logs
When enabled, logs that fail to be processed are written to the target log topic. When disabled, such logs are discarded.
Field Name in Processing Failure Logs
If you choose to write processing failure logs to the target log topic, they are saved in this field, whose default name is ETLParseFailure.
Advanced Settings
Add environment variable: Add environment variables for the runtime of the data processing task.
For example, if you add a variable named ENV_MYSQL_INTERVAL with the value 300, you can write refresh_interval=ENV_MYSQL_INTERVAL in the res_rds_mysql function, and the task will parse it as refresh_interval=300 (see the sketch after this table).
Note:
Currently, it is only available in the Shanghai region. If needed, please Contact Us.
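The following minimal sketch only illustrates how the environment variable mentioned under Advanced Settings is referenced; the other parameters of res_rds_mysql are omitted here and should be taken from the res_rds_mysql function reference.
// Assuming ENV_MYSQL_INTERVAL=300 was added under Advanced Settings,
// the task substitutes the variable at runtime, so this line is parsed as refresh_interval=300.
res_rds_mysql(..., refresh_interval=ENV_MYSQL_INTERVAL)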
Non-pre-processing tasks are mainly used for log distribution scenarios.
Usage scenario of distributing to fixed log topics:
Use this option when the target log topics for distribution are determined. Example: Output logs with loglevel=warning from the source log topic to a log topic named WARNING, logs with loglevel=error to a log topic named ERROR, and logs with loglevel=info to a log topic named INFO. Please refer to the log_output function.
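A rough sketch of this scenario follows. It assumes that WARNING, ERROR, and INFO have been configured as target names, and that the CLS DSL branching helpers t_if, op_eq, and v behave as shown; see the log_output function reference for the exact syntax.
// Route each log to the target name that matches its loglevel field.
t_if(op_eq(v("loglevel"), "warning"), log_output("WARNING"))
t_if(op_eq(v("loglevel"), "error"), log_output("ERROR"))
t_if(op_eq(v("loglevel"), "info"), log_output("INFO"))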

Usage scenario of distributing to dynamic log topics:
Use this option when there are many target log topics or the targets cannot be determined in advance. Example: Your logs contain a key (field) named "AppName". Due to business needs, new values of this field keep appearing over time, and you need to distribute logs according to "AppName". Please refer to the log_auto_output function.
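A minimal sketch of this scenario, assuming log_auto_output accepts a field-value expression such as v("AppName") to name the generated topics (see the log_auto_output function reference for the actual signature):
// Distribute each log to a dynamically generated topic derived from its AppName field value.
log_auto_output(v("AppName"))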

Configuration Item:
Distribute to Fixed Log Topic
Distribute to Dynamic Log Topic
Configuration Item
Description
Task Name
Name of the data processing task, for example: my_transform.
Enabling Status
Start or stop the task. The task is started by default.
Preprocessing Data
Turn off the switch.
Source Log Topic
Data source of the data processing task.
External Data Source
Add an external data source. This is applicable to dimension table join scenarios. Currently, only TencentDB for MySQL is supported. See the res_rds_mysql function.
Region: The region where the TencentDB for MySQL instance is located.
TencentDB for MySQL instance: Select an instance from the drop-down menu.
Username: Enter your database username.
Password: Enter your database password.
Alias: The alias of your MySQL instance, which is used as a parameter in the res_rds_mysql function.
Note:
Currently, it is only available in the Shanghai region. If needed, please Contact Us.
Process Time Range
Specify the time range of logs to be processed.
Note:
Only data within the lifecycle of the log topic can be processed.
Target Log Topic
Select a fixed log topic:
Log topic: the destination log topic to which the processed data is written. You can configure one or more.
Target name: For example, in the source log topic, output loglevel=warning logs to Log Topic A, loglevel=error logs to Log Topic B, and loglevel=info logs to Log Topic C. You can configure the target names of Log Topics A, B, and C as warning, error, and info respectively.
Data processing service log
The run logs of the data processing task are saved in the cls_service_log log topic (free of charge). The alarm feature in the monitoring chart depends on this log topic and is enabled by default.
Upload Processing Failure Logs
When enabled, logs that fail to be processed are written to the target log topic. When disabled, such logs are discarded.
Field Name in Processing Failure Logs
If you choose to write processing failure logs to the target log topic, they are saved in this field, whose default name is ETLParseFailure.
Advanced Settings
Add environment variable: Add environment variables for the runtime of the data processing task.
For example, if you add a variable named ENV_MYSQL_INTERVAL with the value 300, you can write refresh_interval=ENV_MYSQL_INTERVAL in the res_rds_mysql function, and the task will parse it as refresh_interval=300.
Note:
Currently, it is only available in the Shanghai region. If needed, please Contact Us.
Configuration Item
Description
Task Name
Name of the data processing task, for example: my_transform.
Enabling Status
Start or stop the task. The task is started by default.
Preprocessing Data
Turn off the switch.
Source Log Topic
Data source of the data processing task.
External Data Source
Add an external data source. This is applicable to dimension table join scenarios. Currently, only TencentDB for MySQL is supported. See the res_rds_mysql function.
Region: The region where the TencentDB for MySQL instance is located.
TencentDB for MySQL instance: Select an instance from the drop-down menu.
Username: Enter your database username.
Password: Enter your database password.
Alias: The alias of your MySQL instance, which is used as a parameter in the res_rds_mysql function.
Note:
Currently, it is only available in the Shanghai region. If needed, please Contact Us.
Process Time Range
Specify the time range of logs to be processed.
Note:
Only data within the lifecycle of the log topic can be processed.
Target Log Topic
Select Dynamic Log Topic. No target log topic needs to be configured; target topics are generated automatically according to the value of the specified field.
Overrun Handling
When the number of log topics generated by your data processing task exceeds the product specification, you can choose one of the following:
Create a fallback logset and log topic and write logs to the fallback topic (created when the task is created).
Fallback logset: auto_undertake_logset (one per account per region).
Fallback topic: auto_undertake_topic_$(data processing task name). For example, if a user creates two data processing tasks etl_A and etl_B, two fallback topics will be created: auto_undertake_topic_etl_A and auto_undertake_topic_etl_B.
Discard log data: Discard logs directly without creating a fallback topic.
Data processing service log
The run logs of the data processing task are saved in the cls_service_log log topic (free of charge). The alarm feature in the monitoring chart depends on this log topic and is enabled by default.
Upload Processing Failure Logs
When enabled, logs that fail to be processed are written to the target log topic. When disabled, such logs are discarded.
Field Name in Processing Failure Logs
If you choose to write processing failure logs to the target log topic, they are saved in this field, whose default name is ETLParseFailure.
Advanced Settings
Add environment variable: Add environment variables for the runtime of the data processing task.
For example, if you add a variable named ENV_MYSQL_INTERVAL with the value 300, you can write refresh_interval=ENV_MYSQL_INTERVAL in the res_rds_mysql function, and the task will parse it as refresh_interval=300.
Note:
Currently, it is only available in the Shanghai region. If needed, please Contact Us.
3. After the configuration is complete, click Next.
4. Edit and debug the processing statement. At the bottom left of the page, there are two tabs: Raw Logs and Test Data. At the bottom right is the processing result. You can debug your processing statement by comparing the raw/test data with the processing result.
Raw logs: The system automatically loads raw log data, 100 records by default.
Test data: The system randomly loads raw data, which may not meet your debugging needs. You can click Add Test Data on the Raw Logs tab to add raw logs to the test data, and then modify the data on the Test Data tab to meet your debugging requirements.
You can choose Statement mode or Interaction mode to write processing statements.
Statement Mode
Interaction Mode
Method 1: Use AI-generated processing statements (recommended)
Click the AI icon in the upper-right corner of the raw log to add it to the dialog with the CLS AI assistant. Then describe your processing requirement in natural language, such as "retain logs with Loglevel=Error" or "structure this log", and click the Send icon. The AI assistant returns a processing statement, which you can copy into the processing statement edit box. If the statement is inaccurate, you can refine it through multiple rounds of dialogue. The following is an example.
Raw log to be processed:
{
"content": "[2021-11-24 11:11:08,232][328495eb-b562-478f-9d5d-3bf7e][INFO] curl -H 'Host: ' http://abc.com:8080/pc/api -d {\\"version\\": \\"1.0\\",\\"user\\": \\"CGW\\",\\"password\\": \\"123\\"}"
}
Multi-round dialogue with AI
Dialogue Turn
User Question
AI Assistant Reply
First-round dialogue
Structure this log
// extract time, request ID, and log level
ext_regex("content", regex="\\[(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})\\]\\[(\\w+-\\w+-\\w+-\\w+-\\w+)\\]\\[([A-Z]{4})\\]", output="time,requestid,level")
// extract request URL
ext_regex("content", regex="curl -H 'Host: ' (http://[^\\s]+)", output="url")
// extract JSON data in the request body
ext_json("content", prefix="request_")
Second-round dialogue
The content is not standard JSON. An error occurred when using ext_json. First extract the JSON part from the content, then extract the node from the JSON.
// Step 1: extract the JSON part
ext_regex("content", regex=" -d (\\{.*\\})", output="json_content")

// Step 2: parse fields from the extracted JSON
ext_json("json_content")

// Step 3: delete the intermediate field json_content
fields_drop("json_content")
Processing result:
{
"level":"INFO",
"password":"123",
"requestid":"328495eb-b562-478f-9d5d-3bf7e",
"time":"2021-11-24 11:11:08,232",
"user":"CGW",
"version":"1.0"
}
Method 2: Write statements manually
You may switch to Interaction Mode on the Edit Processing Statement page.

Add visualization function: Click the add icon and select the visualization function category and function name you need.
Debug visualization function: Click ▶️ in the top-right corner of the visualization function.
Delete visualization function: Click X in the top-right corner of the visualization function.
Edit visualization function: Click the edit icon in the top-right corner of the visualization function.
The currently supported visualization functions are as follows:
Function Category: Extract Key Value
Visualization functions:
JSON: Extract fields and field values from JSON nodes.
Separator: Extract field values based on a separator. It is advised that you enter the field names.
Regular Expression: Extract field values using regular expressions. You need to enter the field names.
Application scenario: Log structuring

Function Category: Log Processing
Visualization functions:
Filter Logs: Configure conditions for filtering out logs (multiple conditions are in an OR relationship). For example, if field A exists or field B does not exist, filter out the log.
Distribute Logs: Configure conditions for distributing logs. For example:
If status="error" and message contains "404", distribute to Topic A.
If status="running" and message contains "200", distribute to Topic B.
Retain Logs: Configure conditions for retaining logs.
Application scenario: Deleting/retaining logs

Function Category: Field Processing
Visualization functions:
Delete Field
Rename Field
Application scenario: Deleting/renaming fields
For detailed function use cases, see visualization function.
After writing the DSL processing statement, click Preview or Breakpoint Debug in the top-left corner of the page (or ▶️ in the top-right corner of the visualization function in interaction mode) to run and debug the DSL functions. The execution result is shown on the right. Adjust the DSL statements based on the result until they meet your requirements.
Note:
Statement mode: Supports all processing functions. It is recommended to use AI to write processing statements.
Interaction mode: Supports only some functions. For details, see Visualization Function. When the target log topic is configured as a dynamic log topic, interaction mode is not supported.
5. Click OK to submit the data processing task.