Overview
Through this hands-on practice, you can gain a detailed understanding of the following content:
You can also quickly experience using CLS to analyze CSS through the following Demo.
Feature Strengths
Real-time log analysis:
CLS collects and delivers CSS logs in real time so that log data can be searched, analyzed, and stored quickly. Mining this data supports data-driven Ops and business operations and helps you formulate operational policies quickly and accurately.
Out-of-the-box analysis reports:
In traditional CSS log analysis, offline logs must be downloaded, uploaded to a data warehouse, and then put through data cleansing and data model definition. This process is cumbersome and labor-intensive. CLS instead provides an out-of-the-box access analysis dashboard for CSS log analysis scenarios. The dashboard covers CSS user access distribution, traffic analysis, request error rate analysis, request latency analysis, and resource distribution.
Collect CSS logs
View CSS analysis dashboard
CSS log analysis dashboard: Visually displays CSS user access metrics (such as user UV, PV, geographic distribution) and request quality metrics (such as error rates, latency), assisting in business operations and Ops troubleshooting scenarios.
Search and analysis of CSS logs
1. After completing log collection, go to the CSS console, choose Monitoring > Log Service > Real-time Log Analysis, and enter real-time log analysis. View the log topics and click Search.
2. After clicking Search, you will be redirected to the log search page, where you can perform log search and analysis.
Log Field Description
Push Stream Logs
No. | Field | Description |
1 | time | Request Time |
2 | client_ip | Client IP Address |
3 | host | Accessed Domain |
4 | url | URL |
5 | size | Push Stream Byte Size |
6 | country_id | Country ID |
7 | prov | Province |
8 | isp | ISP |
9 | streamname | Stream ID |
10 | node_ip | Node IP |
11 | server_region | Server Region |
12 | server_country | Server Country |
Playback Logs
No. | Field | Description |
1 | type | Playback type: lvb or leb. |
2 | time | Request Time |
3 | client_ip | Client IP Address |
4 | host | Accessed Domain |
5 | url | URL |
6 | size | Bytes Accessed |
7 | country_id | Country ID |
8 | prov | Province |
9 | isp | ISP |
10 | http_code | HTTP status code |
11 | referer | Referer Information |
12 | process_time | Processing Time (ms) |
13 | ua | User-Agent Information |
14 | range | Range Parameter |
15 | method | HTTP Method |
16 | streamname | Stream ID |
17 | hit | Cache status: HIT or MISS |
18 | node_ip | Node IP address (because some CDN cluster node IP addresses cannot be obtained, this field may be left empty) |
19 | server_region | Server Region |
20 | server_country | Server Country |
21 | connect_fd | connect_fd (connection port number) |
22 | lost_rate | Packet loss rate (available only when type=leb). |
23 | rtt | Round-trip time (RTT); available only when type=leb. |
Note:
Special status codes in logs are described as follows:
0: Connection established.
4: Request timeout, authentication timeout, or response timeout.
5: Source disconnection or stream termination.
6: Client disconnection.
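When post-processing exported logs, the special codes above can be kept in a small lookup table. The following is a minimal Python sketch, not part of CLS or CSS: the names `SPECIAL_STATUS_CODES` and `describe_status` are hypothetical, and it assumes the special codes appear in the `http_code` field alongside ordinary HTTP status codes.

```python
# Special (non-HTTP) status codes, copied from the note above.
# Assumption: these values show up in the http_code field of playback logs.
SPECIAL_STATUS_CODES = {
    0: "Connection established",
    4: "Request timeout, authentication timeout, or response timeout",
    5: "Source disconnection or stream termination",
    6: "Client disconnection",
}


def describe_status(http_code):
    """Return a readable label for a status value taken from a log record.

    Accepts an int or a numeric string; anything else is reported as invalid.
    """
    try:
        code = int(http_code)
    except (TypeError, ValueError):
        return "invalid status code"
    if code in SPECIAL_STATUS_CODES:
        return SPECIAL_STATUS_CODES[code]
    # Any other value is treated as an ordinary HTTP status code.
    return f"HTTP {code}"
```

For example, `describe_status("5")` returns the stream-termination description, while `describe_status(200)` falls through to the ordinary HTTP label.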
Monitoring Alarm Case
You can configure monitoring alarms based on CSS logs to detect anomalies in CSS access traffic in real time. The following provides two cases.
Case 1: Trigger an alarm when P99 latency exceeds 100ms, and display the affected domain, url, client_ip in the alarm information to quickly identify error conditions.
2. On the Alarm Policy page, configure the following.
Basic information:
Alarm Name: CSS access latency alarm.
Enabling Status: Enable
Monitoring Object: Select the CSS log topic you created.
Monitoring Task:
Query Statement: Enter the following statement and select a time range of 15 minutes to calculate the P99 (99th percentile) latency over the past 15 minutes.
type:* | select approx_percentile(process_time, 0.99) as p99
Trigger Condition: Configure the condition so that the alarm is triggered when the P99 latency exceeds 100 ms.
Execution Cycle: Fixed frequency, executed once every 1 minute.
Multi-Dimensional Analysis: Displays affected domains, client IP address, and url in alarm information, helping developers quickly locate issues.
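To make the trigger logic of Case 1 concrete, the sketch below reproduces it offline in Python. It is an illustration under stated assumptions, not how CLS evaluates alarms: `approx_percentile` in CLS is an approximate aggregate, whereas this sketch uses the exact nearest-rank percentile. The function names and the sample latency values are hypothetical.

```python
def percentile_nearest_rank(values, percent):
    """Exact nearest-rank percentile.

    Returns the smallest value such that at least `percent` percent of the
    data is less than or equal to it. `percent` is an integer from 1 to 100.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    ordered = sorted(values)
    # Integer ceiling of percent * n / 100 gives the 1-based rank.
    rank = (percent * len(ordered) + 99) // 100
    return ordered[rank - 1]


def latency_alarm(latencies_ms, threshold_ms=100, percent=99):
    """Return (triggered, percentile_value) for one evaluation window."""
    value = percentile_nearest_rank(latencies_ms, percent)
    return value > threshold_ms, value


# Hypothetical latency samples (ms) from one 15-minute window.
window = [20, 25, 30, 22, 28, 35, 40, 18, 27, 250]
triggered, p99 = latency_alarm(window)
```

With these sample values the single slow request dominates the 99th percentile, so the alarm condition is met even though most requests are fast.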
Notification Channel Group: Associate a notification channel group to configure how notifications are sent and who receives them. Supported methods include SMS, email, phone, WeChat, WeCom, DingTalk, Feishu, and custom API callback (webhook). For details, see Manage Notification Channel Groups.
Case 2: When the resource access error rate or latency exceeds a certain threshold, an alarm notification is triggered.
Query Statement:
Enter statement 1 and select a time range of the last 15 minutes.
type:* | select url_extract_path(url) as url_path , round(count_if(try_cast( "http_code" as bigint) >= 400)*100.0/count(*),2) as "Request Error Rate (%)" group by "url_path" order by "Request Error Rate (%)" desc limit 100
Enter statement 2 and select a time range of the last 15 minutes.
type:* | select url_extract_path(url) as url_path , round(avg( "process_time" ), 1) as "Avg. processing time (ms)" group by "url_path" order by "Avg. processing time (ms)" desc limit 100
Trigger Condition:
Define the condition according to your business needs. For example, trigger an alarm when the resource access error rate exceeds 3% or the average latency exceeds 500 ms.
Multi-Dimensional Analysis:
Display resource URLs with error rates exceeding the threshold.
type:* | select url_extract_path(url) as url_path , round(count_if(try_cast( "http_code" as bigint) >= 400)*100.0/count(*),2) as "Request Error Rate (%)" group by "url_path" having "Request Error Rate (%)" >=3 order by "Request Error Rate (%)" desc limit 100
Display resource URLs with latency exceeding the threshold.
type:* | select url_extract_path(url) as url_path , round(avg( "process_time" ), 1) as "Avg. processing time (ms)" group by "url_path" having "Avg. processing time (ms)" >=500 order by "Avg. processing time (ms)" desc limit 100
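The aggregation performed by the Case 2 statements can be sketched offline in Python, which may help when validating thresholds against exported logs. This is a rough analogue under assumptions, not the CLS implementation: `urllib.parse.urlparse` stands in for `url_extract_path`, records are assumed to be dictionaries with string `url`, `http_code`, and `process_time` values, and the function names and sample records are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def per_path_stats(records):
    """Group records by URL path.

    Returns {path: (error_rate_percent, avg_process_time_ms)}.
    """
    totals = defaultdict(lambda: {"count": 0, "errors": 0, "time_sum": 0.0})
    for rec in records:
        path = urlparse(rec["url"]).path  # drop the query string and host
        bucket = totals[path]
        bucket["count"] += 1
        try:
            if int(rec["http_code"]) >= 400:
                bucket["errors"] += 1
        except (TypeError, ValueError):
            pass  # similar to try_cast: unparsable codes are not counted as errors
        bucket["time_sum"] += float(rec["process_time"])
    return {
        path: (
            round(b["errors"] * 100.0 / b["count"], 2),
            round(b["time_sum"] / b["count"], 1),
        )
        for path, b in totals.items()
    }


def paths_over_threshold(stats, min_error_pct=3.0, min_avg_ms=500.0):
    """Return the sorted paths whose error rate or average latency reaches a threshold."""
    return sorted(
        path
        for path, (error_pct, avg_ms) in stats.items()
        if error_pct >= min_error_pct or avg_ms >= min_avg_ms
    )


# Hypothetical playback log records.
sample = [
    {"url": "/live/a.flv?t=1", "http_code": "200", "process_time": "100"},
    {"url": "/live/a.flv?t=2", "http_code": "404", "process_time": "300"},
    {"url": "/live/b.flv", "http_code": "200", "process_time": "50"},
    {"url": "/live/c.flv", "http_code": "200", "process_time": "800"},
]
stats = per_path_stats(sample)
flagged = paths_over_threshold(stats)
```

The two threshold parameters mirror the `having` clauses in the multi-dimensional analysis statements (`>=3` and `>=500`), so the flagged paths correspond to the rows those statements would list.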