Tencent Cloud Observability Platform

Release Notes and Announcements
Release Notes
Product Introduction
Overview
Strengths
Basic Features
Basic Concepts
Use Cases
Use Limits
Purchase Guide
Tencent Cloud Product Monitoring
Application Performance Management
Mobile App Performance Monitoring
Real User Monitoring
Cloud Automated Testing
Prometheus Monitoring
Grafana
EventBridge
PTS
Quick Start
Monitoring Overview
Instance Group
Tencent Cloud Product Monitoring
Application Performance Management
Real User Monitoring
Cloud Automated Testing
Performance Testing Service
Prometheus Getting Started
Grafana
Dashboard Creation
EventBridge
Alarm Service
Cloud Product Monitoring
Tencent Cloud Service Metrics
Operation Guide
CVM Agents
Cloud Product Monitoring Integration with Grafana
Troubleshooting
Practical Tutorial
Application Performance Management
Product Introduction
Access Guide
Operation Guide
Practical Tutorial
Parameter Information
FAQs
Mobile App Performance Monitoring
Overview
Operation Guide
Access Guide
Practical Tutorial
Tencent Cloud Real User Monitoring
Product Introduction
Operation Guide
Connection Guide
FAQs
Cloud Automated Testing
Product Introduction
Operation Guide
FAQs
Performance Testing Service
Overview
Operation Guide
Practice Tutorial
JavaScript API List
FAQs
Prometheus Monitoring
Product Introduction
Access Guide
Operation Guide
Practical Tutorial
Terraform
FAQs
Grafana
Product Introduction
Operation Guide
Guide on Grafana Common Features
FAQs
Dashboard
Overview
Operation Guide
Alarm Management
Console Operation Guide
Troubleshooting
FAQs
EventBridge
Product Introduction
Operation Guide
Practical Tutorial
FAQs
Report Management
FAQs
General
Alarm Service
Concepts
Monitoring Charts
CVM Agents
Dynamic Alarm Threshold
CM Connection to Grafana
Documentation Guide
Related Agreements
Application Performance Management Service Level Agreement
APM Privacy Policy
APM Data Processing And Security Agreement
RUM Service Level Agreement
Mobile Performance Monitoring Service Level Agreement
Cloud Automated Testing Service Level Agreement
Prometheus Service Level Agreement
TCMG Service Level Agreements
PTS Service Level Agreement
PTS Use Limits
Cloud Monitor Service Level Agreement
API Documentation
History
Introduction
API Category
Making API Requests
Monitoring Data Query APIs
Alarm APIs
Legacy Alert APIs
Notification Template APIs
TMP APIs
Grafana Service APIs
Event Center APIs
TencentCloud Managed Service for Prometheus APIs
Monitoring APIs
Data Types
Error Codes
Glossary

Performance Profiling

Last updated: 2025-03-20 20:14:42
The performance profiling capability is built on async-profiler. It generates performance profiling flame graphs with extremely low overhead, helping users intuitively analyze the causes of CPU or memory spikes and quickly identify application performance bottlenecks. Compared with using the community async-profiler directly, the performance profiling capability provided by Application Performance Management (APM) eliminates complex operations such as tool installation, command execution, and downloading profiling results, significantly improving the efficiency of troubleshooting application performance issues.
In repeated performance tests, profiling data was collected from a typical microservice application. During collection, the CPU overhead stayed below 5% and the memory overhead below 50 MB, with a negligible impact on Transactions per Second (TPS) and response time.

Prerequisites

The application is connected via Tencent Cloud Enhanced Java Agent 1.16-2024030510 or later.
OpenJDK or another JDK built on the HotSpot JVM is installed in the runtime environment. Environments with only a JRE installed are not supported at this time.
Linux (x64 and ARM64) operating systems are supported. macOS and Windows are not supported at this time.
JDK versions earlier than Java 8u352 are not recommended, as they carry a risk of memory crashes.

Directions

1. Log in to the TCOP console.
2. In the left sidebar, select Application Performance Management > Application Diagnosis, and click to enter the performance profiling page.
3. In the instance list on the left side of the page, find the instance you want to profile and click Collect.
4. In the pop-up dialog, select the data collection duration and collection type, and click OK.
5. After the collection task is submitted, you can query the profiling results in the performance profiling records on the right side of the page. When the collection status is Collection Completed, click View Performance Analysis to view the performance profiling flame graph on the pop-up page.
Data Collection Duration: Specifies the duration for collecting profiling data, starting from when the application instance receives the collection task. Supported durations are 5 seconds, 10 seconds, 30 seconds, 3 minutes, and 5 minutes.
Collection Type: Determines the performance metrics represented by the profiling data. Three collection types are supported: CPU, duration, and memory.
- CPU: Measures the time the CPU spends executing a block of code.
- Duration: Measures the actual time elapsed from entering to exiting a method, based on wall clock time. All waiting, locking, and thread synchronization time is included, so the wall clock time is never shorter than the CPU time.
- Memory: Measures memory allocation.
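The distinction between CPU time and wall-clock duration can be illustrated with a minimal sketch (shown in Python for brevity; the APM agent itself profiles Java applications): a thread that is sleeping or waiting accumulates wall-clock time but almost no CPU time, which is why a method dominated by waiting shows up in a duration profile yet is nearly absent from a CPU profile.

```python
import time

def blocking_call():
    # Simulates I/O or lock waiting: consumes wall-clock time, not CPU.
    time.sleep(0.2)

wall_start = time.perf_counter()  # wall clock
cpu_start = time.process_time()   # CPU time of this process

blocking_call()

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start

# Wall-clock time includes the sleep; CPU time barely moves.
print(f"wall: {wall_elapsed:.3f}s, cpu: {cpu_elapsed:.3f}s")
```

This is also why the documentation notes that wall clock time is never shorter than CPU time: the duration profile is a superset of the work the CPU profile sees.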

How to Use Performance Profiling Flame Graphs?

Performance profiling collects data by sampling. Performance metrics are reflected by how frequently a function appears in the samples, not by absolute values, so you should focus on the relative proportions between methods (functions). In the flame graph, the Y-axis represents the depth of the method (function) call stack, and the X-axis represents the number of times a method (function) was sampled. For CPU profiling, for example, the wider a method (function) appears, the more CPU time is spent executing that block of code.
When analyzing the flame graph, start with the deepest methods (functions). The deeper the stack, the closer it appears to the bottom of the flame graph, that is, the tip of the "flame" (unlike natural flames, the flames in APM's flame graphs point downward).
Note:
The wider the flame, the greater the performance consumption. A wide flame is therefore often the root cause of a performance issue.
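The sampling model behind the flame graph can be sketched as follows (a toy illustration in Python; the stack strings and method names are hypothetical, and the semicolon-separated "collapsed stack" form is the one many sampling profilers emit): each sample records one call stack, and a frame's width in the graph is simply its share of all samples.

```python
from collections import Counter

# Hypothetical sampled call stacks, outermost frame first.
samples = [
    "main;handleRequest;parseJson",
    "main;handleRequest;parseJson",
    "main;handleRequest;parseJson",
    "main;handleRequest;writeResponse",
    "main;backgroundTask",
]

frame_hits = Counter()
for stack in samples:
    # Every frame on a sampled stack gets one hit; a frame's total
    # hits divided by the sample count is its width in the flame graph.
    frames = stack.split(";")
    for depth in range(1, len(frames) + 1):
        frame_hits[";".join(frames[:depth])] += 1

total = len(samples)
for frame, hits in sorted(frame_hits.items(), key=lambda kv: -kv[1]):
    print(f"{frame}: {hits}/{total} = {hits / total:.0%}")
```

A relative view like this is why the documentation stresses proportions rather than absolute values: doubling the sampling rate doubles every count but leaves each frame's width unchanged.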




