Using TSA to Execute a Chaos Experiment on CFG

Last updated: 2026-03-31 21:54:17

1. Executing a Visualized Fault Injection Experiment in the Cloud Architecture

Chaotic Fault Generator (CFG) is one of the governance applications in TSA-Cloud Architecture, allowing users to simulate a visualized fault injection based on the business cloud architecture. The following uses the experiment scenario MySQL Primary Node Fault as an example to describe how to quickly create an experiment task on CFG in Cloud Architecture.

Step 1: Creating an Experiment

1. Log in to the Tencent Cloud Smart Advisor (TSA) console, choose Folder, and go to the business architecture diagram page. Select Governance Mode at the top of the page and click CFG at the bottom of the page. For details about how to build a cloud architecture, see Architecture Diagram Drawing.
2. Create an experiment task. Three methods are currently available:
Method 1: Quickly create an experiment. Click Create Experiment in the toolbar above the architecture dashboard, or click Create Experiment in the Experiment Task panel.
Method 2: Use the Fault Action Library panel. In the panel, select a fault action in the action library and click Create Experiment.
Method 3: Use the Experiment Template Library panel. In the panel, click an industry template card and click Use.
3. In the Create Experiment panel, complete the first step, Basic Information: select the region where the experiment resources are located, and fill in the experiment information and the personnel and organization information.
4. Complete the second step, experiment orchestration. First, select the resource object to which a fault needs to be injected, click Add Action, and select a fault injection mode.
Note:
In an action group, a fault injection can be performed only on objects of the same type. You can add multiple actions, and drag, drop, and orchestrate the experiment actions.
You can replicate and add multiple action groups.
Select the MySQL Primary Node Fault action.
In the Set Action Parameters step, set Wait Time Before Action and Wait Time After Action in the Common Parameters section to control the experiment process.
Click Add Instance and select the instance resource to which a fault needs to be injected. In this example, select a MySQL instance.
You can search for instances by instance type, instance name, and so on.
You can add instances in batches.
5. In the Global Configuration section, set Monitoring Metric for the experiment actions to facilitate observation of the experiment effect. In addition, set Guardrail Policy. If the policy is triggered, the experiment is suspended to ensure business security. After the settings are completed, click Submit.
6. After the experiment is created, click Experiment Details to review the experiment information and proceed to the next step to execute the experiment. If adjustments are required, click Modify (unavailable after the experiment starts). After the experiment details page is closed, you can also click Total Experiment Tasks in the toolbar above the architecture dashboard, search for the created experiment on the Experiment Tasks page, and click the experiment name to go to the experiment details page again.

Step 2: Executing the Experiment

1. In the Experiment Scenario section, click Execute to the right of the action name to execute the experiment. Alternatively, click Start Experiment in the quick operation bar above the cloud architecture diagram.
Note:
During experiment creation, if Execution Method is set to Automatic, the system automatically executes actions without manual intervention after you click Execute here. If Execution Method is set to Manual, you still need to click Start in the action group after you click Execute here.
If an action fails to execute, the system automatically switches to the Manual mode. In this case, manual intervention is required, and you can click Execute or Skip in the action group.
2. During experiment execution, you can click Pause Experiment or End Experiment above the architecture diagram. The cloud architecture node undergoing fault injection flashes to indicate the current fault injection status (Not Started/Injecting/Succeeded/Failed).
3. During experiment execution, click Experiment Log in the Experiment Scenario section to view logs and click the experiment action area to view the action execution status. In the Monitoring Metrics area, you can view real-time monitoring data during experiment execution to estimate the steady-state behavior of the system after the fault injection.

Step 3: Ending the Experiment

1. After the fault action is successfully executed, click End Experiment.
2. Set parameters in the Experiment Conclusion section, and document information such as the issues encountered and the emergency plans adopted during the experiment for subsequent review and analysis.

Step 4: Generating an Experiment Report

1. After the experiment ends, generate the experiment report. In the Experiment Details, Experiment Tasks, or Experiment Reports panel, click Generate Report to obtain a preview version of the online report. The report includes the business architecture, experiment conclusions, experiment overview, and execution details.
Note:
1. Online reports cannot be modified or downloaded. After the report is archived to digital assets, operations such as downloading and sharing are supported.
2. Experiment conclusions can be modified in the quick operation bar on the experiment details page. After the content is updated, the report needs to be regenerated. Currently, other content in an experiment report cannot be modified.
2. Click Archive Report to archive the report to the Digital Assets > Archived Reports page.
3. Go to the Digital Assets page and view the experiment report on the Archived Reports page.
4. After the report is archived, you can view, download, share, or delete it.

2. (Optional) Using CFG APIs to Execute a Fault Injection Experiment

This section demonstrates how to use CFG APIs to execute an experiment. The CFG APIs follow the standard specifications for Tencent Cloud APIs. For details about the common parameters and request methods, see API Documentation for TSA-CFG.
The experiment process is as follows:
1. Create Experiment
2. Query Experiment
3. Execute Experiment
4. Execute Action
5. End Experiment
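Each of the five steps is a POST request to the same host, distinguished by the X-TC-Action header. The Python sketch below assembles the headers and JSON body shown in this section for one call. It is illustrative only: the helper name build_request is hypothetical, the https scheme is an assumption, and the common request parameters (authentication, signature, region, version) are deliberately left out and must be added as described in the API documentation.

```python
import json

CFG_HOST = "cfg.tencentcloudapi.com"  # Host header used by the examples in this guide


def build_request(action: str, payload: dict) -> dict:
    """Assemble the method, headers, and JSON body for one CFG API call.

    Hypothetical helper. The common request parameters are NOT added here.
    """
    return {
        "method": "POST",
        "url": f"https://{CFG_HOST}/",  # assumption: HTTPS endpoint
        "headers": {
            "Host": CFG_HOST,
            "Content-Type": "application/json",
            "X-TC-Action": action,
        },
        "body": json.dumps(payload),
    }


# Example: the request used to start experiment 3256.
request = build_request("ExecuteTask", {"TaskId": 3256})
print(request["headers"]["X-TC-Action"])
print(request["body"])
```

Keeping the action name and the payload as the only inputs makes it easy to reuse the same function for every step of the process.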

Step 1: Creating an Experiment

Using a Template to Create an Experiment

1. Prepare the template ID.
Method 1: Obtain the ID from the console. Log in to the TSA console, choose CFG, click Template Library Management, select the template you want to use to create an experiment, and copy the template ID.
Method 2: Obtain the ID by using an API. Obtain the value of the TemplateId parameter that is required for creating an experiment.
2. API Request
The request is as follows:
POST / HTTP/1.1
Host: cfg.tencentcloudapi.com
Content-Type: application/json
X-TC-Action: CreateTaskFromTemplate
<Common request parameters>

{
"TemplateId": 626, # Template ID obtained in the previous step.
"TaskConfig": {
"TaskTitle": "This is an example of using an API to create an experiment.", # Experiment name. If not specified, the template name is used by default.
"TaskGroupsConfig": [
{
"TaskGroupInstances": [
"ins-xxxxxxxx" # Instance object ID associated with the action group, for example, resource IDs for Cloud Virtual Machine (CVM) and Cloud Load Balancer (CLB) instances.
]
}
]
}
}
3. API Response
{
"Response": {
"RequestId": "f0aee8ac-2ed3-4a7f-a25b-f0d7d228dd30",
"TaskId": 3256 # Experiment ID.
}
}
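The request body and the response above can also be handled programmatically. The following Python sketch builds the body for a single action group and reads TaskId from the response. The function names are hypothetical, and the handling of an "Error" object assumes the standard Tencent Cloud API error format rather than anything specific to this guide.

```python
import json


def build_template_payload(template_id: int, instance_ids: list, title: str = "") -> dict:
    """Build the JSON body for creating an experiment from a template."""
    task_config = {"TaskGroupsConfig": [{"TaskGroupInstances": list(instance_ids)}]}
    if title:
        # TaskTitle is optional; the template name is used when it is omitted.
        task_config["TaskTitle"] = title
    return {"TemplateId": template_id, "TaskConfig": task_config}


def extract_task_id(response_text: str) -> int:
    """Return the TaskId from a creation response, or raise on an error body."""
    response = json.loads(response_text)["Response"]
    if "Error" in response:
        # Assumption: failures carry the standard "Error" object with Code and Message.
        error = response["Error"]
        raise RuntimeError(f"{error.get('Code')}: {error.get('Message')}")
    return response["TaskId"]


payload = build_template_payload(626, ["ins-xxxxxxxx"], "Example experiment")
print(json.dumps(payload, indent=2))

sample = '{"Response": {"RequestId": "f0aee8ac-2ed3-4a7f-a25b-f0d7d228dd30", "TaskId": 3256}}'
print(extract_task_id(sample))
```

The returned TaskId is the value that the later query, execution, and end steps take as input.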

Using an Action to Create an Experiment

1. Prepare the resource object ID. Obtain the value of the ObjectTypeId parameter that is required for creating an experiment.
2. Prepare the action ID. Obtain the value of the ActionId parameter that is required for creating an experiment.
3. Prepare action parameters. Obtain the parameters required for the experiment action.
Note:
Two parameters carry the action settings when you create an experiment from an action:
TaskActionGeneralConfiguration: General parameters. This parameter is optional. If it is left empty, the default values for the action are used.
TaskActionCustomConfiguration: Custom parameters. If a custom parameter is optional and is not passed, its default value is used. If it is required, it must be passed explicitly.
Each parameter must be a map in the format of {"key1": "value1", "key2": "value2"} and must be serialized to a string before it is passed, for example, "{\"domain\": \"www.test.com\"}".
You can also view what each action parameter does in the console, which is more intuitive.
This API response is as follows:
{
"Response": {
"RequestId": "3e7fa74e-9045-4f01-88d4-ee158affe905",
"Common": [ # General parameter, which corresponds to TaskActionGeneralConfiguration used for subsequent experiment creation by using an action.
{
"ActionId": 466,
"ActionName": "DNS tampering.",
"ConfigDetail": [
{
"Type": "input",
"Lable": "Action alias.",
"Field": "AliasTitle", # Action parameter key.
"DefaultValue": "", # Action parameter default value.
"Config": "{}",
"Required": 0, # Specifies whether the parameter is required. Valid values: 0 (no) and 1 (yes).
"Validate": "{}",
"Visible": "{}"
},
{
"Type": "number",
"Lable": "Pre-action wait time (s).",
"Field": "PreTimeWait",
"DefaultValue": "0",
"Config": "{\"max\": 86400, \"min\": 0, \"tooltip\": \"This parameter is only used in Auto mode.\"}",
"Required": 0,
"Validate": "{}",
"Visible": "{}"
},
{
"Type": "number",
"Lable": "Post-action wait time (s).",
"Field": "AfterTimeWait",
"DefaultValue": "0",
"Config": "{\"max\": 86400, \"min\": 0, \"tooltip\": \"This parameter is only used in Auto mode.\"}",
"Required": 0,
"Validate": "{}",
"Visible": "{}"
},
{
"Type": "number",
"Lable": "Action timeout period (s).",
"Field": "ActionTimeout",
"DefaultValue": "1800",
"Config": "{\"max\": 86400, \"min\": 0, \"tooltip\": \"Action timeout period.\"}",
"Required": 0,
"Validate": "{}",
"Visible": "{\"op\": \"<\", \"type\": \"need_insert\", \"value\": 0, \"relatedField\": \"ActionTimeout\"}"
}
]
}
],
"Results": [ # Custom parameters, which correspond to TaskActionCustomConfiguration used for subsequent experiment creation by using an action.
{
"ActionId": 466,
"ActionName": "DNS tampering.",
"ConfigDetail": [
{
"Type": "number",
"Lable": "Duration (s).",
"Field": "duration",
"DefaultValue": "180",
"Config": "{\"max\": 1800, \"min\": 0}",
"Required": 1,
"Validate": "{}",
"Visible": "{}"
},
{
"Type": "input",
"Lable": "Domain name.",
"Field": "domain", # Action parameter key.
"DefaultValue": "", # Action parameter default value.
"Config": "{}",
"Required": 1, # Specifies whether the parameter is required. Valid values: 0 (no) and 1 (yes).
"Validate": "{}",
"Visible": "{}"
},
{
"Type": "input",
"Lable": "IP",
"Field": "ip",
"DefaultValue": "",
"Config": "{}",
"Required": 1,
"Validate": "{}",
"Visible": "{}"
}
]
}
],
"ResourceOffline": []
}
}
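The ConfigDetail entries in this response describe which action parameters exist and which are required. The Python sketch below uses them to check a set of custom values and serialize the result into the string form that the configuration parameters expect. The function name is hypothetical, the rule of omitting optional values so that the service default applies is an assumption of this sketch, and the IP address in the example is made up.

```python
import json


def build_custom_configuration(config_detail: list, values: dict) -> str:
    """Validate values against ConfigDetail entries and serialize them.

    Required fields must be supplied. Optional fields that are not supplied
    are omitted so that the default value of the action applies.
    """
    known = {entry["Field"] for entry in config_detail}
    unknown = set(values) - known
    if unknown:
        raise ValueError(f"unknown action parameters: {sorted(unknown)}")

    merged = {}
    for entry in config_detail:
        field = entry["Field"]
        if field in values:
            merged[field] = values[field]
        elif entry["Required"] == 1:
            raise ValueError(f"missing required action parameter: {field}")
    return json.dumps(merged)


# Trimmed copy of the custom parameter entries for the DNS tampering action.
dns_detail = [
    {"Field": "duration", "Required": 1, "DefaultValue": "180"},
    {"Field": "domain", "Required": 1, "DefaultValue": ""},
    {"Field": "ip", "Required": 1, "DefaultValue": ""},
]

serialized = build_custom_configuration(
    dns_detail,
    {"duration": 180, "domain": "www.test.com", "ip": "10.0.0.1"},  # illustrative values
)
print(serialized)
```

The returned string can be passed as the value of TaskActionCustomConfiguration. The same function works for the general parameters if it is given the entries from the Common list.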
4. API Request
The request is as follows:
Note:
For container resource objects, the four-tuple of {ClusterId} + {NodeName} + {NameSpace} + {PodName} is required to uniquely identify an instance. When such an instance is passed in the TaskInstances parameter, this map must be serialized to a string.
Example: "{\"ClusterId\":\"cls-xxxx\",\"PodName\":\"pod-xxxxxx\",\"NodeName\":\"xxxxxxxx\",\"NameSpace\":\"default-xxxxxx\"}".
POST / HTTP/1.1
Host: cfg.tencentcloudapi.com
Content-Type: application/json
X-TC-Action: CreateTaskFromAction
<Common request parameters>

{
"TaskActionId": 462, # Action ID.
"TaskInstances": ["ins-xxxxxxxx"], # Resource object instance ID.
"TaskTitle": "Packet loss.", # Experiment name.
"TaskDescription": "This is an experiment of creating an experiment via OpenAPI.", # Experiment description.
"TaskActionCustomConfiguration": "{\"interfaces\": \"eth0\"}" # Custom action parameter, which needs to be serialized.
}
5. API Response
{
"Response": {
"RequestId": "f0aee8ac-2ed3-4a7f-a25b-f0d7d228dd30",
"TaskId": 150
}
}
At this point, click Experiment Management in the console to view the created experiment or query the experiment by using an API.
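The serialization rules for this request are easy to get wrong by hand, so the following Python sketch builds the body with json.dumps. It covers both a plain instance ID and the serialized four-tuple used for container instances. The function names are hypothetical, and the sketch only uses the fields shown in the request above.

```python
import json


def serialize_pod_instance(cluster_id: str, node_name: str, namespace: str, pod_name: str) -> str:
    """Serialize the four-tuple that identifies a container instance."""
    return json.dumps({
        "ClusterId": cluster_id,
        "NodeName": node_name,
        "NameSpace": namespace,
        "PodName": pod_name,
    })


def build_action_payload(action_id: int, instances: list, title: str,
                         description: str, custom: dict = None) -> dict:
    """Build the JSON body for creating an experiment from a single action."""
    payload = {
        "TaskActionId": action_id,
        "TaskInstances": list(instances),
        "TaskTitle": title,
        "TaskDescription": description,
    }
    if custom is not None:
        # Custom action parameters are passed as a serialized string, not a nested object.
        payload["TaskActionCustomConfiguration"] = json.dumps(custom)
    return payload


body = build_action_payload(
    462, ["ins-xxxxxxxx"], "Packet loss.",
    "Experiment created via the API.", {"interfaces": "eth0"},
)
print(json.dumps(body, indent=2))

# For a container resource object, each TaskInstances entry is itself a serialized map.
print(serialize_pod_instance("cls-xxxx", "xxxxxxxx", "default-xxxxxx", "pod-xxxxxx"))
```

Serializing the whole body a second time with json.dumps produces the escaped inner quotes automatically.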

Step 2: Querying the Experiment

API Request

The request is as follows:
POST / HTTP/1.1
Host: cfg.tencentcloudapi.com
Content-Type: application/json
X-TC-Action: DescribeTask
<Common request parameters>

{
"TaskId": "3256" # Experiment ID returned during experiment creation.
}

API Response

{
"RequestId": "02185fc4-0e8f-49ed-a8d5-6d0788d0e60c",
"Task": {
"TaskId": 3256,
"TaskTitle": "This is an example of creating an experiment by using an API.",
"TaskDescription": "Test an empty action.",
"TaskTag": "",
"TaskStatus": 1002,
"TaskStatusType": 0,
"TaskProtectStrategy": null,
"TaskCreateTime": "2023-08-14 11:55:02",
"TaskUpdateTime": "2023-08-14 14:48:00",
"TaskStartTime": "2023-08-14 14:48:01",
"TaskEndTime": null,
"TaskExpect": null,
"TaskSummary": null,
"TaskMode": 1,
"TaskRegionId": 1,
"TaskPauseDuration": 60,
"TaskOwnerUin": "100032429988",
"TaskPlanId": null,
"TaskPlanTitle": null,
"TaskGroups": [
{
"TaskGroupActions": [
{
"TaskGroupInstances": [
{
"TaskGroupInstanceId": 24375, # Task action instance ID.
"TaskGroupInstanceObjectId": "ins-bfydnvta", # Resource object ID.
"TaskGroupInstanceStatus": 3001,
"TaskGroupInstanceStatusType": 0,
"TaskGroupInstanceExecuteLog": null,
"TaskGroupInstanceStartTime": null,
"TaskGroupInstanceEndTime": null,
"TaskGroupInstanceCreateTime": "2023-08-14 14:48:00",
"TaskGroupInstanceUpdateTime": "2023-08-14 14:48:00",
"TaskGroupInstanceIsRedo": false,
"TaskGroupInstanceExecuteTime": null
},
{
"TaskGroupInstanceId": 24376, # Task action instance ID.
"TaskGroupInstanceObjectId": "ins-ehxmry76", # Resource object ID.
"TaskGroupInstanceStatus": 3001,
"TaskGroupInstanceStatusType": 0,
"TaskGroupInstanceExecuteLog": null,
"TaskGroupInstanceStartTime": null,
"TaskGroupInstanceEndTime": null,
"TaskGroupInstanceCreateTime": "2023-08-14 14:48:00",
"TaskGroupInstanceUpdateTime": "2023-08-14 14:48:00",
"TaskGroupInstanceIsRedo": false,
"TaskGroupInstanceExecuteTime": null
}
],
"TaskGroupActionId": 11395, # Task action ID.
"ActionId": 12,
"ActionTitle": "Empty operation.",
"ActionApiType": 1,
"ActionType": "Platform.",
"ActionRisk": "Low risk.",
"ActionAttribute": 1,
"TaskGroupActionOrder": 1,
"TaskGroupActionGeneralConfiguration": "{\"AliasTitle\": \"\", \"PreTimeWait\": 0, \"ActionTimeout\": 1800, \"AfterTimeWait\": 0}",
"TaskGroupActionCustomConfiguration": "{}",
"TaskGroupActionStatus": 2002,
"TaskGroupActionStatusType": 0,
"TaskGroupActionRandomId": 156878,
"TaskGroupActionRecoverId": 193278,
"TaskGroupActionExecuteId": null,
"TaskGroupActionCreateTime": "2023-08-14 11:55:02",
"TaskGroupActionUpdateTime": "2023-08-14 14:48:00",
"IsExecuteRedo": false,
"TaskGroupActionExecuteTime": null
},
{
"TaskGroupInstances": [
{
"TaskGroupInstanceId": 24377, # Task action instance ID.
"TaskGroupInstanceObjectId": "ins-bfydnvta", # Resource object ID.
"TaskGroupInstanceStatus": 3001,
"TaskGroupInstanceStatusType": 0,
"TaskGroupInstanceExecuteLog": null,
"TaskGroupInstanceStartTime": null,
"TaskGroupInstanceEndTime": null,
"TaskGroupInstanceCreateTime": "2023-08-14 14:48:00",
"TaskGroupInstanceUpdateTime": "2023-08-14 14:48:00",
"TaskGroupInstanceIsRedo": false,
"TaskGroupInstanceExecuteTime": null
},
{
"TaskGroupInstanceId": 24378, # Task action instance ID.
"TaskGroupInstanceObjectId": "ins-ehxmry76", # Resource object ID.
"TaskGroupInstanceStatus": 3001,
"TaskGroupInstanceStatusType": 0,
"TaskGroupInstanceExecuteLog": null,
"TaskGroupInstanceStartTime": null,
"TaskGroupInstanceEndTime": null,
"TaskGroupInstanceCreateTime": "2023-08-14 14:48:00",
"TaskGroupInstanceUpdateTime": "2023-08-14 14:48:00",
"TaskGroupInstanceIsRedo": false,
"TaskGroupInstanceExecuteTime": null
}
],
"TaskGroupActionId": 11396, # Task action ID.
"ActionId": 13,
"ActionTitle": "Empty operation (rollback).",
"ActionApiType": 1,
"ActionType": "Platform.",
"ActionRisk": "Low risk.",
"ActionAttribute": 2,
"TaskGroupActionOrder": 2,
"TaskGroupActionGeneralConfiguration": "{\"PreTimeWait\": 0, \"ActionTimeout\": 1800, \"AfterTimeWait\": 0}",
"TaskGroupActionCustomConfiguration": "{}",
"TaskGroupActionStatus": 2001,
"TaskGroupActionStatusType": 0,
"TaskGroupActionRandomId": 193278,
"TaskGroupActionRecoverId": null,
"TaskGroupActionExecuteId": 156878,
"TaskGroupActionCreateTime": "2023-08-14 11:55:02",
"TaskGroupActionUpdateTime": "2023-08-14 11:55:02",
"IsExecuteRedo": false,
"TaskGroupActionExecuteTime": null
}
],
"TaskGroupId": 4684, # Action group ID.
"TaskGroupTitle": "abc",
"TaskGroupDescription": "abc",
"TaskGroupOrder": 1,
"TaskGroupMode": 1,
"TaskGroupInstanceList": [
"ins-bfydnvta",
"ins-ehxmry76"
],
"ObjectTypeId": 1,
"TaskGroupCreateTime": "2023-08-14 11:55:02",
"TaskGroupUpdateTime": "2023-08-14 11:55:02",
"TaskGroupInstancesExecuteRule": [
{
"TaskGroupInstancesExecuteMode": 1
}
],
"TaskGroupSelectedInstanceList": [
"ins-bfydnvta",
"ins-ehxmry76"
],
"TaskGroupDiscardInstanceList": []
}
],
"TaskMonitors": [],
"TaskPolicy": null,
"Tags": []
},
"ReportInfo": null
}
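The action execution step needs three IDs from this response: the action group ID (TaskGroupId), the task action ID (TaskGroupActionId), and the task action instance IDs (TaskGroupInstanceId). The Python sketch below collects them in execution order. The function name is hypothetical, and because the sample above is shown without a top-level "Response" object, the sketch accepts the body with or without that wrapper.

```python
def collect_action_targets(describe_task_body: dict) -> list:
    """List the IDs needed to execute each action of an experiment, in order."""
    # Accept the body with or without a top-level "Response" wrapper.
    body = describe_task_body.get("Response", describe_task_body)
    targets = []
    groups = sorted(body["Task"]["TaskGroups"], key=lambda g: g["TaskGroupOrder"])
    for group in groups:
        actions = sorted(group["TaskGroupActions"], key=lambda a: a["TaskGroupActionOrder"])
        for action in actions:
            targets.append({
                "TaskGroupId": group["TaskGroupId"],
                "TaskActionId": action["TaskGroupActionId"],
                "TaskInstanceIds": [
                    inst["TaskGroupInstanceId"] for inst in action["TaskGroupInstances"]
                ],
                "ActionTitle": action["ActionTitle"],
            })
    return targets


# Trimmed copy of the response above, keeping only the fields that are read.
sample = {
    "Task": {
        "TaskId": 3256,
        "TaskGroups": [{
            "TaskGroupId": 4684,
            "TaskGroupOrder": 1,
            "TaskGroupActions": [
                {"TaskGroupActionId": 11396, "TaskGroupActionOrder": 2,
                 "ActionTitle": "Empty operation (rollback).",
                 "TaskGroupInstances": [{"TaskGroupInstanceId": 24377},
                                        {"TaskGroupInstanceId": 24378}]},
                {"TaskGroupActionId": 11395, "TaskGroupActionOrder": 1,
                 "ActionTitle": "Empty operation.",
                 "TaskGroupInstances": [{"TaskGroupInstanceId": 24375},
                                        {"TaskGroupInstanceId": 24376}]},
            ],
        }],
    }
}

for target in collect_action_targets(sample):
    print(target)
```

Sorting by TaskGroupOrder and TaskGroupActionOrder keeps the fault action ahead of its rollback action even if the list arrives in a different order.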

Step 3: Executing the Experiment

API Request

The request is as follows:
POST / HTTP/1.1
Host: cfg.tencentcloudapi.com
Content-Type: application/json
X-TC-Action: ExecuteTask
<Common request parameters>

{
"TaskId": "3256"
}

API Response

{
"Response": {
"RequestId": "46924e75-a149-4130-aac0-853dbf0abea9"
}
}

Step 4: Executing the Action

API Request

The request is as follows:
POST / HTTP/1.1
Host: cfg.tencentcloudapi.com
Content-Type: application/json
X-TC-Action: ExecuteTaskInstance
<Common request parameters>

{
"TaskId": "3256",
"TaskActionId": "11396", # Task action ID (obtained from the experiment query response).
"TaskInstanceIds": [
"xxxxxxxx-01", # Task action instance ID (obtained from the experiment query response).
"xxxxxxxx-02"
],
"IsOperateAll": true, # Specifies whether to execute the entire task. If it is set to true, TaskInstanceIds is ignored. All instances passed during experiment creation are executed.
"ActionType": 2, # Action type. Valid values: 2 (execute), 3 (skip), and 5 (retry).
"TaskGroupId": 4684 # Action group ID (obtained from the experiment query response).
}

API Response

{
"Response": {
"RequestId": "6549ed1a-911f-46dd-b6cd-2c02d5bd180f"
}
}
Note:
The action executed in this step supports the skip and retry operations, which can be controlled by adjusting the value of ActionType. Valid values:
3: skip
5: retry
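Because ActionType is a bare number, a small enumeration makes the intent of each call explicit. The Python sketch below builds the body for this step from the IDs returned by the experiment query. The class and function names are hypothetical, and the sketch always sends both TaskInstanceIds and IsOperateAll, mirroring the request above.

```python
from enum import IntEnum


class ActionType(IntEnum):
    """Operations supported when executing an action (values from this guide)."""
    EXECUTE = 2
    SKIP = 3
    RETRY = 5


def build_execute_action_payload(task_id, group_id, action_id, instance_ids,
                                 action_type=ActionType.EXECUTE, operate_all=False) -> dict:
    """Build the JSON body for executing, skipping, or retrying one action."""
    return {
        "TaskId": task_id,
        "TaskGroupId": group_id,
        "TaskActionId": action_id,
        "TaskInstanceIds": list(instance_ids),
        # When IsOperateAll is true, TaskInstanceIds is ignored by the service.
        "IsOperateAll": operate_all,
        # ActionType(...) raises ValueError for any value other than 2, 3, or 5.
        "ActionType": int(ActionType(action_type)),
    }


# Execute the first action on both instances, then retry it on one instance.
print(build_execute_action_payload(3256, 4684, 11395, [24375, 24376]))
print(build_execute_action_payload(3256, 4684, 11395, [24376], ActionType.RETRY))
```

Validating the value locally catches a mistyped operation code before the request is sent.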

Step 5: Ending the Experiment

API Request

The request is as follows:
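POST / HTTP/1.1
Host: cfg.tencentcloudapi.com
Content-Type: application/json
X-TC-Action: ModifyTaskRunStatus
<Common request parameters>

{
"TaskId": 3256, # Experiment task ID.
"Status": 1004, # End status code, which does not need to be modified.
"Summary": "This experiment meets expectations.", # Experiment conclusion.
"IsExpect": true # Specifies whether the execution result meets expectations.
}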

API Response

{
"Response": {
"RequestId": "e38eca72-e4ae-4a86-9696-7df399e672bd"
}
}
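The end request has only two values that change between experiments: the conclusion text and whether the result met expectations. A minimal Python sketch, with a hypothetical function name, that fixes the status code and serializes the body:

```python
import json

END_STATUS = 1004  # End status code from the request above; it is not meant to be changed.


def build_end_payload(task_id: int, summary: str, is_expect: bool) -> dict:
    """Build the JSON body that ends an experiment and records its conclusion."""
    return {
        "TaskId": task_id,
        "Status": END_STATUS,
        "Summary": summary,
        "IsExpect": bool(is_expect),
    }


body = json.dumps(build_end_payload(3256, "This experiment meets expectations.", True))
print(body)
```

Building the body as a dictionary and serializing it with json.dumps guarantees well-formed JSON, with Python True emitted as the JSON literal true.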
