Troubleshooting

Last updated: 2024-01-27 17:45:42

Overview

Tencent Cloud Observability Platform provides several ways to detect resource exceptions and several channels to notify users of them as soon as possible.




Locating Exceptions

Detecting exceptions through alarms

Tencent Cloud uses monitoring and alarms to detect exceptions promptly and notify you automatically, so that you stay informed of exceptions in real time across all scenarios. To set this up, log in to the Tencent Cloud Observability Platform console and configure alarm policies for your resources. For more information, see Creating Alarm Policies.
If you have configured key performance metrics and events as alarm rules, you are notified through the alarm channels as soon as an exception occurs.
Alarms from policies configured with an alarm recipient group are sent to you via SMS, email, and other channels. Repeated alarms and alarm aggregation are also supported, which keeps you informed while avoiding unnecessary notifications.
You can also configure the callback API feature in the alarm channel to receive alarms promptly and process the alarm information in your own systems.
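As an illustration of the callback option, the following is a minimal sketch of an HTTP endpoint that could receive alarm callbacks. It assumes the callback arrives as an HTTP POST with a JSON body. The payload keys (`status`, `policy_name`, `object_id`) and the helper names are hypothetical placeholders, not the platform's documented schema, so check the callback documentation for the actual fields before relying on them.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_alarm(payload: dict) -> str:
    """Build a one-line summary from a callback payload.

    The keys read here are hypothetical placeholders; replace them with
    the fields that the real callback payload contains.
    """
    status = payload.get("status", "unknown")
    policy = payload.get("policy_name", "unnamed policy")
    obj = payload.get("object_id", "unknown object")
    return f"[{status}] {policy}: {obj}"


class AlarmCallbackHandler(BaseHTTPRequestHandler):
    """Accepts POST requests and prints a summary of each alarm."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length)
        try:
            payload = json.loads(raw or b"{}")
        except ValueError:
            payload = None
        if not isinstance(payload, dict):
            # Reject bodies that are not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        print(summarize_alarm(payload))
        self.send_response(200)
        self.end_headers()


def make_server(port: int = 8080) -> HTTPServer:
    """Create (but do not start) the callback server on the given port."""
    return HTTPServer(("0.0.0.0", port), AlarmCallbackHandler)
```

To run the receiver, call `make_server(8080).serve_forever()` and use the resulting URL as the callback address in the alarm channel. A production endpoint would also need HTTPS and some form of request verification, which this sketch omits.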

Detecting exceptions through monitoring charts

Locating exceptions through monitoring charts requires you to actively analyze the historical data and average trends of performance metrics. If an exception is hard to catch with alarm rules, or no alarm has been configured for it, you can use monitoring charts to locate it during daily health checks. Unlike alarms, monitoring charts also let you see the global impact of a resource exception. You can subscribe to key resources on the dashboard and configure monitoring charts to highlight exceptions in different scenarios.
For individual instances, you can subscribe to details views to compare the trends of instance performance data on the dashboard.
For resource clusters, you can subscribe to the aggregated data of a cluster to see the overall monitoring chart of the cluster on the dashboard and compare it with the chart of a single instance in that cluster.
For exceptions detected through monitoring charts, you can use the sorting feature to locate the specific resources related to an exception for further troubleshooting.
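The idea of comparing a cluster's aggregate with its individual instances, then sorting to find the outlier, can be sketched offline on exported metric data. This example is not the console's sorting feature. The function name `rank_by_deviation` and the input shape (instance name mapped to a list of metric samples) are assumptions made for illustration, and the cluster average here is simply the mean of the per-instance averages.

```python
from statistics import mean


def rank_by_deviation(series_by_instance: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Rank instances by how far their average is from the cluster average.

    Returns (instance, absolute deviation) pairs, largest deviation first.
    Instances with no samples are skipped.
    """
    instance_avg = {
        name: mean(values)
        for name, values in series_by_instance.items()
        if values
    }
    if not instance_avg:
        return []
    # Cluster aggregate: mean of the per-instance averages.
    cluster_avg = mean(instance_avg.values())
    deviations = [
        (name, abs(avg - cluster_avg)) for name, avg in instance_avg.items()
    ]
    return sorted(deviations, key=lambda item: item[1], reverse=True)
```

For example, with CPU samples of 10, 12, and 50 percent for three instances, the instance at 50 percent is ranked first because it sits furthest from the cluster average of 24 percent.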

Troubleshooting

Locating exception objects on monitoring overview page

If you receive an alarm during a daily health check, go to Monitoring Overview in the Tencent Cloud Observability Platform console.
1. On the overview page, go to the service health status module to view exceptions in each region and project. You can browse recent exceptions by clicking the status of each service.


2. Click the number of exception objects to access the service monitoring page.

The service monitoring page is automatically filtered to show the affected resource objects.
3. Click the ID of a specific object to go to the monitoring details page, where detailed information about its historical exceptions is provided.
The exception timeline allows you to view the current and historical information of the affected object. This helps you troubleshoot current exceptions based on historical alarms and status changes.
The monitoring data for resource performance allows you to compare the current and historical data of the same metric over a specified period of time or compare data changes of different metrics within the same period for troubleshooting.
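The period-over-period comparison described above can also be reproduced on exported data. The following minimal sketch computes the percentage change of one metric between two aligned time windows. The function name `percent_change` and the assumption that both windows are sampled at the same granularity are illustrative and not part of the platform.

```python
from typing import Optional


def percent_change(current: list[float], previous: list[float]) -> list[Optional[float]]:
    """Compare two aligned series of the same metric point by point.

    Returns the percentage change for each pair of samples, or None where
    the previous value is zero and the change is undefined. Extra samples
    in the longer series are ignored.
    """
    changes: list[Optional[float]] = []
    for cur, prev in zip(current, previous):
        if prev == 0:
            changes.append(None)
        else:
            changes.append((cur - prev) / prev * 100)
    return changes
```

A large positive or negative change at the time of the alarm, compared with a flat change in the surrounding samples, suggests that the exception is a departure from the metric's usual pattern rather than a recurring peak.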



Locating exception objects through dashboards

Log in to the Tencent Cloud Observability Platform console. On the left sidebar, click Dashboard to access the dashboard management page.
1. When you find an exceptional trend in the monitoring chart, click the time period when the exception occurs. A sorting list of corresponding instances is displayed below the chart. You can locate the specific exception objects based on the sorting list.
2. Click the name of an object in the sorting list to access its monitoring details page, where detailed information about its historical exceptions is provided.
The exception timeline allows you to view the current and historical information of the affected object. This helps you troubleshoot current exceptions based on historical alarms and status changes.
The monitoring data for resource performance allows you to compare the current and historical data of the same metric over a specified period of time or compare data changes of different metrics within the same period for troubleshooting.
