
Task Alerts
Last updated: 2026-01-08 10:44:23

Alarm Rule

The rule page lets you configure alarm conditions and alarm notifications for the running status of projects, tasks, and workflows.

Creating a Rule

1. Log in to the WeData console.
2. Click the project list in the left menu and find the project for which you want to configure alarm rules.
3. After selecting the project, click to enter the Ops Center module.
4. Click Alarm Rules in the left menu.
5. On the alarm rule management page, click Create Rule and fill in the rule information.

Feature Description

Information
Description
Basic Info
Rule Name
Alarm rule name, 1-128 characters. Only Chinese characters, English letters, digits, and underscores are allowed.
Rule description
Alarm rule description, optional, no more than 500 characters.
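The Basic Info constraints above can be checked before a rule is submitted. The following is a minimal sketch, not the console's own validation: the function names are hypothetical, and "Chinese" is approximated by the CJK Unified Ideographs block (U+4E00 to U+9FFF), which may be wider or narrower than what the console accepts.

```python
import re

# Assumption: "Chinese" is approximated by the CJK Unified Ideographs block;
# the actual character set accepted by the console is not documented here.
RULE_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_]{1,128}")
MAX_DESCRIPTION_LENGTH = 500


def is_valid_rule_name(name: str) -> bool:
    """Return True if the name is 1-128 characters from the allowed set."""
    return RULE_NAME_PATTERN.fullmatch(name) is not None


def is_valid_rule_description(description: str) -> bool:
    """The description is optional, so an empty string is accepted."""
    return len(description) <= MAX_DESCRIPTION_LENGTH
```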
Monitoring Object
Select the object that the alarm rule monitors. Tasks and projects are currently supported.
When a task is the monitored object, alarm rules can be configured by task, by workflow, or by project.
When a project is the monitored object, alarm rules are configured by checking the project.
When selecting a task as the monitored object, configure alarm rules via:
Configure by task: Click Add and, in the pop-up dialog box, check the submitted, operational computing task nodes in the orchestration space to configure alarm rules for them.



Configure by workflow: Click Add and, in the pop-up dialog box, check the submitted, operational workflows in the orchestration space to configure alarm rules for the computing tasks within those workflows. An allowlist is available: computing tasks in the workflow that are added to the allowlist are not monitored by the alarm rule.



Configure by project: Click Add and, in the pop-up dialog box, check the submitted, operational computing tasks in the orchestration space of the current project to configure alarm rules for them. An allowlist is available: computing tasks added to the allowlist are not monitored by the alarm rule.



When selecting a project as the monitored object, configure alarm rules via:
Configure by project: Click Add and, in the pop-up dialog box, check the projects that the current user has joined as a non-visitor to set alarm rules. Multiple projects can be selected for monitoring, in which case the alarm conditions are evaluated against the sum of the corresponding monitored values of the selected projects.
Alarm condition (monitored object: project)
Failed instance count upward fluctuation rate (compared to the 7-day average)
The fluctuation rate and the day's cumulative failed instance count serve as the conditions. Whether an alarm is triggered depends on how many times, consecutively or cumulatively, the conditions are met within the specified number of periods.

The fluctuation rate calculation formula is:
(The number of failed instances on the current day at the current moment - The average number of failed instances at the same time in the past 7 days) / The average number of failed instances at the same time in the past 7 days
The alarm condition is met when both of the following hold:
1. The value is positive.
2. The value exceeds the set threshold.
Failed instance count downward fluctuation rate (compared to the 7-day average)
The fluctuation rate and the day's cumulative failed instance count serve as the conditions. Whether an alarm is triggered depends on how many times, consecutively or cumulatively, the conditions are met within the specified number of periods.

The fluctuation rate calculation formula is:
(The number of failed instances on the current day at the current moment - The average number of failed instances at the same time in the past 7 days) / The average number of failed instances at the same time in the past 7 days
The alarm condition is met when both of the following hold:
1. The value is negative.
2. The absolute value exceeds the set threshold.
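The two fluctuation-rate conditions above can be sketched as follows. This is an illustration under stated assumptions, not WeData's implementation: all function and parameter names are hypothetical, the threshold is expressed as a fraction (0.5 for 50%), and the case where the 7-day average is 0 is not defined by the formula, so the sketch simply reports no rate and no alarm.

```python
from typing import List, Optional


def fluctuation_rate(current_failed: int, avg_failed_7d: float) -> Optional[float]:
    """(current count - 7-day average) / 7-day average."""
    if avg_failed_7d == 0:
        # Assumption: the formula is undefined here, so return no rate.
        return None
    return (current_failed - avg_failed_7d) / avg_failed_7d


def meets_condition(rate: Optional[float], threshold: float, direction: str) -> bool:
    """direction is "up" or "down" (hypothetical labels for the two rule types)."""
    if rate is None:
        return False
    if direction == "up":
        return rate > 0 and rate > threshold
    return rate < 0 and abs(rate) > threshold


def should_trigger(met_per_period: List[bool], required: int, consecutive: bool) -> bool:
    """Decide on the cumulative or consecutive count of periods that met the condition."""
    if not consecutive:
        return sum(met_per_period) >= required
    run = 0
    for met in met_per_period:
        run = run + 1 if met else 0
        if run >= required:
            return True
    return False
```

For example, 15 failed instances against a 7-day average of 10 gives a rate of 0.5, which satisfies an upward rule with a threshold of 0.3.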
Alarm condition
Monitored object is an offline task.
Execution Failed
An alarm is triggered when an instance of the monitored task fails to run. The rule can be applied to cycle execution, data replenishment, or rerun execution. The trigger condition can be either "All retries still fail after completion" or "The first execution failed".
All retries still fail after completion: If failure retries are configured in the computing task's scheduling strategy, the alarm rule is triggered when every retry has failed and the instance ends in failure.
The first execution failed: Following the computing task's scheduling strategy, the alarm rule is triggered when the instance generated by the first run fails.


Running Timeout
An alarm is triggered when the scheduling or running of an instance of the monitored task exceeds the preset time. The rule can be applied to cycle execution, data replenishment, or rerun execution. Trigger conditions can be set on four time requirements: "task running time", "task completion time", "total task waiting time", and "task not completed within the period".
Task running time: With task running time (cycle), time is measured from the task cycle's start time. With task running time (rerun, replenishment), time is measured from the start of the replenishment or rerun. In either case, an alarm is triggered if the threshold is exceeded and the instance has not completed. Either a "fixed value" or a "historical average" can be used as the alarm condition.
Fixed value: The alarm rule is triggered if the instance does not complete within the specified duration (hours and minutes).
Historical average: Take the instance running times of the computing task's 10 most recent successful executions, remove the maximum and minimum values, and average the rest. If there are fewer than 10 executions, the setting is invalid.

Task completion time (cycle): Time is measured from the start of the task instance run. An alarm is triggered if the instance has not completed by the specified time point. Either a "fixed value" or a "historical average" can be used as the alarm condition.
Fixed value: If the instance is not completed by the designated hour/minute time point, the alarm rule is triggered.
Historical average: Calculate the average instance running time of the most recent 10 successful executions after removing the maximum and minimum values. The setting is invalid if there are fewer than 10 executions.

Total task waiting time: With total task waiting time (cycle), the waiting time is the interval from the task's scheduled time to the time the scheduled run starts. With total task waiting time (rerun, replenishment), it is the interval from the submission time of the rerun or replenishment instance to the time the scheduled run starts. In either case, an alarm is triggered if the threshold is exceeded and the instance is still not running. Either a "fixed value" or a "historical average" can be used as the alarm condition.
Fixed value: The alarm rule is triggered if the instance does not start running within the specified duration (HH:MM).
Historical average: Calculate the average waiting duration of the most recent 10 successful executions after removing the maximum and minimum values. The setting is invalid if there are fewer than 10 executions.
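The "historical average" used by the running time, completion time, and waiting time conditions is a trimmed mean over the last 10 successful executions. A minimal sketch follows, assuming durations are given in seconds in chronological order and that exactly one maximum and one minimum are removed when values are tied; the function name is hypothetical.

```python
from typing import List, Optional


def historical_average(durations: List[float]) -> Optional[float]:
    """Trimmed mean of the 10 most recent successful executions.

    durations: seconds per successful execution, oldest first.
    Returns None when fewer than 10 executions exist (the setting is invalid).
    """
    if len(durations) < 10:
        return None
    recent = sorted(durations[-10:])
    # Assumption: drop exactly one smallest and one largest value.
    trimmed = recent[1:-1]
    return sum(trimmed) / len(trimmed)
```

With ten durations of 10, 20, ..., 100 seconds, the values 10 and 100 are dropped and the remaining eight average to 55 seconds.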

Task not completed within the period (cycle): An alarm is triggered when a task instance does not complete within its current run cycle. Period = interval * period unit, for example:
Minute-based task: For a task with a 15-minute interval, the period is 15 minutes. If the task runs for more than 15 minutes without completing, an alarm is triggered.
Hourly task: If the specified hour or interval is 1, the period is 1 hr; if the interval is 2, the period is 2 hr, and so on.
Day, week, month, and year tasks: The period is 1 day.
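The period rules listed above can be expressed as a small lookup. This is a sketch only: the unit strings and function names are hypothetical and do not correspond to a WeData API.

```python
from datetime import timedelta


def task_period(unit: str, interval: int = 1) -> timedelta:
    """Period = interval * period unit; day and longer units use a 1-day period."""
    if unit == "minute":
        return timedelta(minutes=interval)
    if unit == "hour":
        return timedelta(hours=interval)
    if unit in ("day", "week", "month", "year"):
        return timedelta(days=1)
    raise ValueError(f"unknown period unit: {unit}")


def not_completed_within_period(elapsed: timedelta, unit: str, interval: int = 1) -> bool:
    """True when an unfinished instance has run longer than its period."""
    return elapsed > task_period(unit, interval)
```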

Run Successful
An alarm is triggered when an instance of the monitored task runs successfully. The rule can be applied to cycle execution, data replenishment, or rerun execution.
Alarm condition
Monitored object is a real-time task.
Data ingestion/egress interval
Real-time task alarms are supported only on the latest version of the standard Spark engine. To upgrade the engine, submit a DLC ticket.
When the data inflow/outflow of a real-time task (DLC Spark Streaming) stays at 0 throughout the preset time, the system triggers an alarm. The minimum interval between alarm notifications follows the configured alarm trigger frequency.
Alarm trigger frequency configurable range: 5-1440 minutes.

Number of retries
Real-time task alarms are supported only on the latest version of the standard Spark engine. To upgrade the engine, submit a DLC ticket.
When a real-time task (DLC Spark Streaming) exceeds the preset retry count within the preset time, the system triggers an alarm. The minimum interval between alarm notifications follows the configured alarm trigger frequency.
Alarm trigger frequency configurable range: 5-1440 minutes.

Alarm Notification
Alarm Severity
Alarm message content is differentiated by alarm severity. The available severity levels are ordinary, important, and critical.



Alert Method
The channels through which alarm information is sent after an alarm rule is triggered. Supported push methods are mail, SMS, WeChat, call, WeCom, HTTP, WeCom group, Lark Group, DingTalk Group, Slack Group, and Teams Group. Mobile phone, WeChat, and mail accounts are configured in Tencent Cloud Personal Center > Cloud Access Management > User, and WeCom accounts are configured in Tencent Cloud Personal Center > Cloud Access Management > Associated Account. Only one of WeCom group, Lark Group, DingTalk Group, Slack Group, and Teams Group can be selected.

Alarm Recipient
After an alarm rule is triggered, it will send alert messages to recipients. Currently supports three methods to set alert message recipients: "specified personnel", "task owner", and "duty schedule".
Specified personnel: You can specify any one or more users as alert message recipients.
Task owner: Set the computing task owner as the alert message recipient.
Duty schedule: Set a configured duty schedule as the recipient so that alarm messages are sent to the users on duty. Once one user in the duty schedule clicks Confirm Alarm for an alarm message, the other users in the same duty schedule no longer receive messages for that alarm.

Alarm escalation recipient
When specified personnel or task owner is selected as the alarm recipient, you can add alarm escalation recipients. Escalation recipients are chosen from project members in a drop-down list, and up to 5 can be added.
If the alarm recipient, or the escalation recipient at the previous level, does not confirm the alarm in the alarm information of the Ops Center within the alarm interval, the system sends the alarm to the next-level escalation recipient. Escalation levels follow the order in which the recipients are configured.
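The escalation order described above can be pictured as walking a list one level per unconfirmed alarm interval. The sketch below is illustrative only: the function and parameter names are hypothetical, and what happens after the last escalation level goes unconfirmed is not stated in this document, so the sketch returns None in that case.

```python
from typing import List, Optional

MAX_ESCALATION_RECIPIENTS = 5  # limit stated in this document


def notification_target(
    recipient: str, escalation_chain: List[str], unconfirmed_intervals: int
) -> Optional[str]:
    """Who receives the alarm after N alarm intervals pass without confirmation.

    0 intervals -> the original recipient, 1 -> the first escalation recipient,
    and so on, in the order the escalation recipients were configured.
    """
    if len(escalation_chain) > MAX_ESCALATION_RECIPIENTS:
        raise ValueError("at most 5 escalation recipients can be configured")
    if unconfirmed_intervals < 0:
        raise ValueError("unconfirmed_intervals must be non-negative")
    levels = [recipient] + escalation_chain
    if unconfirmed_intervals < len(levels):
        return levels[unconfirmed_intervals]
    # Assumption: behavior past the last level is unspecified.
    return None
```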

Notification Frequency
You can define how many times an alarm is sent and the interval between messages. If at least one alarm escalation recipient is configured, the notification frequency cannot be configured and only the alarm interval is displayed.

Do Not Disturb
You can set a do not disturb period, during which alarms are not sent. Users can still view the alarm records in the alarm information.
Do Not Disturb is configured by day of the week and time of day, and multiple do not disturb periods are supported.
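A do not disturb check over several periods might look like the sketch below. The data layout is an assumption made for illustration: each period is a set of weekdays (Monday = 0, as in Python's datetime.weekday()) plus a start and end time on the same day, and periods that cross midnight are not handled.

```python
from datetime import datetime, time
from typing import List, Set, Tuple

# Hypothetical representation: (weekdays, start, end), with Monday = 0.
DndPeriod = Tuple[Set[int], time, time]


def in_do_not_disturb(moment: datetime, periods: List[DndPeriod]) -> bool:
    """True if the moment falls inside any configured do not disturb period."""
    for weekdays, start, end in periods:
        if moment.weekday() in weekdays and start <= moment.time() < end:
            return True
    return False
```

An alarm raised while this returns True would be recorded but not sent.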

Add alarm reception configuration
Multiple groups of alarm reception configurations can be added so that different recipients use different alert methods. Click Add alarm reception configuration at the bottom of the alarm notification configuration to add a new group. A single rule supports up to 10 groups of alarm reception configurations.

View Alarm Rule List

After an alarm rule is created, it appears in the alarm rule list. The list shows the rule name, alarm type, alert method, recipient information, and so on, and provides features such as the rule switch and rule details to help users manage and maintain alarm rules.




Feature Description

Information
Description
Rule Name
Displays the alarm rule name and ID.
Monitoring Object
Displays the tasks, workflows, and projects where the alarm rule takes effect, and lets you view the computing tasks covered by the alarm rule under each monitored object.
Alarm Type
Displays the monitoring type of the alarm rule: fail, timeout, or success.
Alarm Severity
Displays the alarm level of the alarm rule: ordinary, important, or critical.
Alarm On-Off
Displays the current state of the alarm rule, which can be manually enabled or disabled. When disabled, the alarm rule does not take effect and no alarm information is generated.
Alert Method
Displays the channels through which the alarm rule sends alarm information.
Alarm Recipient
Displays the alarm message recipients and alarm escalation recipients configured in the alarm rule.
Created by
Displays the creator of the current alarm rule.

Operate Alarm Rule





Feature Description

Information
Description
Rule Details
View rule details to check parameters configured for the alarm rule, including rule name, monitored object, monitoring task, alarm condition, and alarm notification.
Alert Information
Go to the list of alarm information generated by this rule to view the details of each time the rule was triggered.
Delete
Delete this alarm rule.

Filter Alarm Rules

Enter the alarm rule name or ID in the search box to filter the list.




Alert Information

Alarm information generated when an alarm rule is triggered for a monitored object appears in the alarm information list. From the list you can view the basic information, details, and running logs of each alarm.

View Alarm Message List

1. Log in to the WeData console.
2. Click the project list in the left menu and find the target project.
3. After selecting the project, click to enter the data development module.
4. Click Alarm Information in the left menu to enter the alarm information management page.


Feature Description

Information
Description
Alarm time
Displays the time at which the alarm information was generated.
Alarm Entity
Displays the name and instance ID of the task instance that triggered the alarm. Click the instance name to go to the corresponding instance management page.



Alarm Cause
Displays what triggered the current alarm.
Alarm Severity
Displays the alarm level of the alarm information: ordinary, important, or critical.
Rule Name
Displays the alarm rule that triggered this alarm information. Click the rule name to go to the corresponding alarm rule management page.

Alert Method
Displays the notification channels used for the alarm information.
Alarm Recipient
Displays the alarm message recipients.

Operate Alarm Message



View Details

Click View Detail in the operation column. In the pop-up window, you can see the alarm object, alarm cause, and message sending status of the alarm information.


Feature Description

Information
Description
Alarm object
Displays the name and instance ID of the task instance that triggered the alarm.
Task name: Displays the name of the computing task that triggered the alarm. Click the task name to go to the page where the instance is located.
Instance ID: Displays the ID of the instance that triggered the alarm. Click View Logs to go to the log information page of the corresponding instance.
Alarm Cause
Displays what triggered the current alarm, based on the trigger condition configured in the alarm rule.
For example, if the alarm condition is set to execution timeout > estimated completion time, the alarm cause displayed after the rule is triggered is "expected completion time timeout".
Send Status
Displays the send time, recipient, and notification channel of the current alarm. The channel status shows whether sending succeeded on each channel.
Send time: Displays the time at which alarm messages were sent to recipients after the rule was triggered.
Recipient: Displays the alarm message recipient.
Notification channel: Different icons indicate the message sending status of each channel.

Confirm Alarm

If an alarm escalation recipient is configured in the alarm settings, a Confirm Alarm option appears in the operation column. After you click Confirm Alarm, the alarm information is no longer sent to the escalation recipients.

Filtering Alarm Information

Operation
Description(Optional)
Alarm time
Filter conditions: choose today, yesterday, last 7 days, last 30 days, all, or custom date range.
Alarm Cause
Filter alarms by cause: failed, successful, timed out, or fluctuation in the number of failed instances.
Task name/ID
Filter by task name/ID. Separate multiple task names/IDs with |.
Rule name/ID
Filter by rule name/ID. Separate multiple rule names/IDs with |.