Tencent Cloud

Data Lake Compute

Historical Task Instances

Last updated: 2026-04-14 12:03:56
Historical Task Instances records and manages the various types of tasks users run in DLC, supporting subsequent tracking, review, and optimization. Through this feature, users can quickly view the execution status of each task, including start and end times, execution state (such as successful or failed), input and output details, and generated logs or error information. It makes auditing and retrieval convenient, helping users assess task health, identify potential issues, and optimize resource configuration.

Operation Steps

1. Enter the Historical Task Instances page. Administrators can view all historical tasks from the past 45 days; general users can query only their own tasks from the past 45 days.
2. Tasks can be filtered and viewed by task type, task status, creator, task time range, task name, ID, content, sub-channel, and more.
3. Click a task ID/name to view task details, including basic information, running result, task insights, and task logs.
4. Click to modify the task configuration and quickly enter the job details to adjust the configuration for optimization.

Historical Task Instances List

Note:
Fields marked with * are available only after the insight feature is enabled. For the enablement method, see How to Enable Insight Feature.
Field Name
Description
Task ID
Unique identifier of the task.
Task name
Format: prefix_yyyymmddhhmmss_<8-character UUID>, where yyyymmddhhmmss is the task execution time.
Prefix rule:
1. Job tasks submitted from the console are prefixed with the job name. For example, if the user-created job is customer_segmentation_job and it runs at 21:25:10 on November 26, 2024, the task name will be customer_segmentation_job_20241126212510_f2a65wk1. Due to the current data format restriction, the job name must be no longer than 100 characters.
2. SQL tasks submitted from the data exploration page are prefixed with sql_query. Example: sql_query_20241126212510_f2a65wk1.
3. Data optimization tasks are prefixed according to their sub-type:
3.1 The optimizer itself uses the prefix optimizer.
3.2 SQL-type optimization instances use optimizer_sql.
3.3 Batch-type optimization instances use optimizer_batch.
3.4 Configuration tasks created when configuring a data optimization policy use optimizer_config.
4. Import data tasks are prefixed with import, for example: import_20241126212510_f2a65wk1.
5. Export data tasks are prefixed with export, for example: export_20241126212510_f2a65wk1.
6. Wedata submissions are prefixed with wd, for example: wd_20241126212510_f2a65wk1.
7. Other API submissions are prefixed with customized, for example: customized_20241126212510_f2a65wk1.
8. Tasks created for metadata operations on the metadata management page are prefixed with metadata, for example: metadata_20241126212510_f2a65wk1.
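The naming rule above can be sketched as a small parser. This is a minimal sketch assuming task names follow the prefix_yyyymmddhhmmss_<8-character UUID> format exactly; the helper name and the character class for the UUID segment are illustrative, not part of the DLC API.

```python
import re
from datetime import datetime

# Illustrative helper: split a task name of the form
# <prefix>_<yyyymmddhhmmss>_<8-character uuid> into its parts.
TASK_NAME_RE = re.compile(r"^(?P<prefix>.+)_(?P<ts>\d{14})_(?P<uid>[0-9a-z]{8})$")

def parse_task_name(name: str) -> dict:
    m = TASK_NAME_RE.match(name)
    if m is None:
        raise ValueError(f"not a recognized task name: {name}")
    return {
        "prefix": m.group("prefix"),  # e.g. sql_query, import, wd
        "executed_at": datetime.strptime(m.group("ts"), "%Y%m%d%H%M%S"),
        "uuid": m.group("uid"),
    }

info = parse_task_name("customer_segmentation_job_20241126212510_f2a65wk1")
print(info["prefix"])  # customer_segmentation_job
```

Because the prefix itself may contain underscores (as in the job-name example), the pattern anchors on the fixed-width timestamp and UUID segments at the end rather than splitting on the first underscore.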
Task status
Starting
Executing
Queuing up
Successful
Failed
Canceled
Expired
Task run timeout
Task content
Detailed content of the task. For job type tasks, this is a hyperlink to the job details; for SQL type tasks, it is the complete SQL statement.
Task type
Tasks are divided into Job type and SQL type.
Task source
The origin of the task: data exploration, data job, data optimization, import, export, metadata management, Wedata, or API submission.
Sub-channel
Users can customize sub-channels when submitting tasks via the API.
Compute resource
The computing engine/resource group used to run the task.
Consumed CU*H
The CU*H consumed during task execution. Note that the final CU consumption is subject to the bill and the final figure may differ. In the Spark scenario, it is approximately the sum of Spark task execution durations (in seconds) divided by 3600.
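The approximation above can be expressed directly. This is a minimal sketch; the function name is illustrative, and the billed figure remains authoritative.

```python
def estimate_cu_hours(spark_task_durations_s):
    """Rough CU*H estimate per the rule above: sum of Spark task
    execution durations (in seconds) divided by 3600. The final
    CU consumption is subject to the bill and may differ."""
    return sum(spark_task_durations_s) / 3600

# Three Spark tasks totalling one hour of execution time
print(estimate_cu_hours([1800, 900, 900]))  # 1.0
```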
Total Execution Duration
The duration from the execution of the task to its completion, which may include waiting time caused by insufficient resources.
1. For a SparkSQL task, it is the platform scheduling time + queuing time within the engine + execution duration within the engine.
2. For a Spark job task, it is the platform scheduling time + engine startup duration + queuing time within the engine + execution duration within the engine.
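The two breakdowns above differ only by the engine startup term. This sketch makes that explicit; the function and parameter names are illustrative.

```python
def total_execution_duration_s(scheduling_s, queuing_s, engine_exec_s,
                               engine_startup_s=0):
    """Sum the duration components listed above. SparkSQL tasks omit
    engine startup; Spark job tasks also pay the engine startup cost."""
    return scheduling_s + engine_startup_s + queuing_s + engine_exec_s

# SparkSQL task: 2 s scheduling + 5 s queuing + 30 s execution
print(total_execution_duration_s(2, 5, 30))                       # 37
# Spark job task additionally includes a 60 s engine startup
print(total_execution_duration_s(2, 5, 30, engine_startup_s=60))  # 97
```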
* Engine Execution Duration
If the task has insight results, this is the execution duration within the engine. It reflects the actual time required for computation: the duration from when the first task of the Spark job starts executing until the task completes.
(This metric is only available after the task completes.)
Scanned data volume
The physical data volume this task reads from storage. In the Spark scenario, approximately equal to the sum of Stage Input Size in the Spark UI.
*Scanned data records
The number of physical data records this task reads from storage. In the Spark scenario, approximately equal to the sum of Stage Input Records in the Spark UI.
Creator
If it is a job type task, it refers to the creator of the job.
Executor
The user running the task.
Submitted at
The time when the user submits tasks.
*Engine execution time
The time when the task first obtains CPU and starts executing, i.e., the start time of the first task within the Spark engine.
*Number of output files
The collection of this metric requires upgrading the Spark engine kernel to a version later than 2024.11.16.
Total number of files written by the task through statements such as INSERT, regardless of task type.
*Output small-sized files
The collection of this metric requires upgrading the Spark engine kernel to a version later than 2024.11.16.
Small file definition: an output file smaller than 4 MB is counted as a small file (controlled by the parameter spark.dlc.monitorFileSizeThreshold, default 4 MB, configurable for the engine globally or at the task level).
This metric: total number of small files written by the task through statements such as INSERT, regardless of task type.
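The small-file rule above can be mirrored in a short check. The 4 MB default comes from spark.dlc.monitorFileSizeThreshold as described above; interpreting it as 4 * 1024 * 1024 bytes is an assumption made for this sketch, and the function name is illustrative.

```python
# Assumed byte interpretation of the documented 4 MB default threshold
SMALL_FILE_THRESHOLD = 4 * 1024 * 1024

def count_small_files(file_sizes_bytes, threshold=SMALL_FILE_THRESHOLD):
    """Count output files strictly smaller than the small-file threshold."""
    return sum(1 for size in file_sizes_bytes if size < threshold)

# The 1 MB and 3 MB files fall below the 4 MB threshold; the 5 MB file does not
print(count_small_files([1_000_000, 5_000_000, 3_000_000]))  # 2
```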
*Total output lines
The number of records this task outputs after processing data. In the Spark scenario, approximately equal to the sum of Stage Output Records in the Spark UI.
*Total output size
The size of the records this task outputs after processing data. In the Spark scenario, approximately equal to the sum of Stage Output Size in the Spark UI.
*Data shuffle lines
In the Spark scenario, approximately equal to the sum of Stage Shuffle Read Records in the Spark UI.
*Data shuffle size
In the Spark scenario, approximately equal to the sum of Stage Shuffle Read Size in the Spark UI.
*Health status
The task is analyzed to judge its health status and determine whether optimization is required. See Task Insight for details.

Historical Task Instances Details

Basic Info

1. Users can view specific task content in execution content. For SQL tasks, view the complete SQL statement; for job tasks, view job details and job parameters.
2. Users can view task resource information in Resource Consumption, including consumed CU*H, total execution duration, engine execution duration, scanned data volume, compute resource, kernel version, Driver resource, Executor resource, and count of Executors.
3. Users can view basic information of tasks in basic info, including task name, task ID, task type, task source, creator, executor, submission time, and engine execution time.
4. For tasks running on the SuperSQL SparkSQL or SuperSQL Presto engine, users can view the task running progress bar in query statistics, which includes the time taken for stages such as creating tasks, scheduling tasks, executing tasks, and obtaining results.

Running Result

After task completion, users can query the task result on the execution result page. There are two types of task results:
1. Write file information: For file-writing tasks running on the SuperSQL, standard, or Spark kernel engine, users can view write file information:
Average file size
Minimum file size
Maximum file size
Total file size
2. Execution result: For SQL query statements, the query result of the current task is displayed, and users can download the query results.

Task Insight

After task completion, users can view task insight results on the task insight page. It analyzes the aggregate metrics of each executed task and surfaces optimizable issues. Based on the actual execution of the current task, DLC task insight combines data analysis and algorithm rules to provide corresponding optimization suggestions. For details, see Task Insight.

Task Log

Users can view the logs of the current task on the task log page.
Note:
Only job type and BatchSQL type tasks support task log viewing.
SQL type tasks run on permanent clusters, so logs cannot be displayed at the task level.
Data optimization tasks are internal tasks and do not display cluster logs.
1. You can switch logs for different cluster nodes through the Pod name, including logs for the Driver, executor, and others.
2. You can switch log types, including All, Log4j, Stdout, and Stderr.
Note:
Spark engine images upgraded on or after July 4, 2025, support the display of Log4j, Stdout, and Stderr logs individually.
For historic version images of the Spark engine:
When "All" or "Stderr" is selected, all logs will be displayed.
When "Log4j" or "Stdout" is selected, no content will be displayed (the output will be empty).
To upgrade the engine, submit a ticket to contact after-sales support.
All: Displays all log content of the task for comprehensive troubleshooting.
Log4j: Displays logs generated by the Spark cluster itself, helping you understand the cluster's running status and internal information.
Stdout: Displays business logs, usually containing normal output information of user programs.
Stderr: Displays standard error logs, helping you quickly capture exceptions and error information.
3. The following three log levels are supported for filtering: All, Error, and WARN.
4. This page displays only the latest 1,000 log entries. To view all logs, export them.
5. The history and status of log export tasks can be viewed. In the log export history, you can save log files locally.

Resource Usage Statistics

The real-time fluctuation chart intuitively displays resource consumption during task execution, helping you dynamically monitor resource usage trends. The chart automatically updates with the latest data every minute (due to collection frequency limitations, no chart is displayed for tasks running less than 5 seconds).
Note:
This feature is supported only in Spark engine versions released on or after July 1, 2025.
SQL tasks: Statistics on the real-time usage of Executor cores (excluding Driver cores) are collected, accurately reflecting the computing resources consumption of SQL tasks.
Batch tasks: Statistics on all startup cores (including Driver cores) are collected, providing a comprehensive display of resource usage for batch tasks.
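The two counting rules above can be sketched as a single function; the function and parameter names are illustrative.

```python
def counted_cores(driver_cores, executor_cores, task_kind):
    """Cores counted in the fluctuation chart per the rules above:
    SQL tasks count executor cores only; batch tasks also count the Driver."""
    if task_kind == "sql":
        return executor_cores
    if task_kind == "batch":
        return driver_cores + executor_cores
    raise ValueError(f"unknown task kind: {task_kind}")

# 4 Driver cores and 32 Executor cores in total
print(counted_cores(4, 32, "sql"))    # 32
print(counted_cores(4, 32, "batch"))  # 36
```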
