{"remote_addr": "124.78.118.162","method": "GET","upstream_addr": "169.254.128.14:60002","upstream_response_length": "48","body_bytes_sent": "59","time_local": "2021-12-18T05:16:06+00:00","version": "HTTP/1.1","url": "/_ignition/execute-solution","http_user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36","remote_user": "-","req_id": "37585e290318470aa57b148282aed99f","upstream_status": "400","request_time": "0.002","sys_address": "11.149.171.195","request_length": "272","http_referer": "-","sys_datasource": "gz.1.1.v1.2.19","proxy_upstream_name": "default-kubernetes-443","upstream_response_time": "0.000","time": "1706701448003","timestamp": "2024-01-31T19:44:08+08:00","status": "400"}
* | select count(*) as "Request count",urlgroup by urlorder by "Request count" desclimit 50
count(*): Count all log entries, namely, the number of requests.group by url: Group data by URL and count the number of requests for each URL.order by "Request count" desc: Sort by request count in descending order, prioritizing URLs with more requests.limit 50: Only return the top 50 query results, which are the 50 URLs with the highest request counts.* | select count(*) as "Request count",url_extract_path(url) as url_pathgroup by url_pathorder by "Request count" desclimit 50
url_extract_path(url): Remove the parameter part from the URL, only retaining the path to avoid statistics being affected by URL parameter changes.* | select round(count_if(status >= 400)*1.0 / count(*),2) as "Error request proportion"
count_if(status >= 400): Count the number of log entries that meet specified conditions, where HTTP status >= 400 represents an error request.*1.0: Multiply by 1.0 to convert the count_if result to a double type so that the precision is preserved in subsequent division operations. Otherwise, integer division would result in an integer, losing precision.count(*): Count all log entries, namely, the total number of requests.round: Retain a specified number of decimal places. Two decimal places are retained here.status >= 400 | select count(*) as "Error request count",urlgroup by urlorder by "Error request count" desclimit 50
status >= 400: The statement before the pipe symbol | is the search condition, which represents only logs with HTTP status >= 400 are used for statistical analysis.count(*): Count all log entries. Because there is already a search condition before the pipe symbol |, all error request log entries are actually counted here.group by url: Group data by URL and count the number of error requests for each URL.order by "Error request count" desc: Sort by error request count in descending order, prioritizing URLs with more error requests.limit 50: Only return the top 50 query results, which are the 50 URLs with the highest error request counts.* | select round(count_if(status >= 400)*1.0 / count(*),2) as "Error request proportion",histogram(__TIMESTAMP__,INTERVAL ${__interval}) as timegroup by time limit 10000
count_if(status >= 400): Count the number of log entries that meet specified conditions, where HTTP status >= 400 represents an error request.*1.0: Multiply by 1.0 to convert the count_if result to a double type so that the precision is preserved in subsequent division operations. Otherwise, integer division would result in an integer, losing precision.count(*): Count all log entries, namely, the total number of requests.round: Retain a specified number of decimal places. Two decimal places are retained here.histogram(__TIMESTAMP__,INTERVAL ${__interval}): To view metric trends over time, specify the interval between each data point in the trend chart. For example, to query the trend for the last 60 minutes, set the interval between data points to 30 seconds. Here:__TIMESTAMP__ represents the log time, namely, the time the request occurred.${__interval} represents the interval between data points. It is a variable that dynamically calculates the most appropriate interval based on the length of the query time range. You can also manually specify it. For example, change it to 1 MINUTE, representing an interval of 1 minute.group by time: After specifying the time interval using a histogram, group the data by this interval to count the error request proportion for each interval separately.limit 10000: When no limit is specified in the SQL, only 100 query results are returned by default. To get the complete query results, set the limit to a larger value.* | select count(*) as cnt,status,histogram(__TIMESTAMP__,interval ${__interval}) astimegroup bytime,status limit 10000
count(*): Count all log entries, namely, the total number of requests.histogram(__TIMESTAMP__,INTERVAL ${__interval}): To view metric trends over time, specify the interval between each data point in the trend chart. For example, to query the trend for the last 60 minutes, set the interval between data points to 30 seconds. Here:__TIMESTAMP__ represents the log time, namely, the time the request occurred.${__interval} represents the interval between data points. It is a variable that dynamically calculates the most appropriate interval based on the length of the query time range. You can also manually specify it. For example, change it to 1 MINUTE, representing an interval of 1 minute.group by time: After specifying the time interval using a histogram, group the data by this interval to count the error request proportion for each interval separately.limit 10000: When no limit is specified in the SQL, only 100 query results are returned by default. To get the complete query results, set the limit to a larger value.* | selecttime,status,cnt*1.0/sum(cnt) over (partition bytime) as pctfrom (select count(*) as cnt,status,histogram(__TIMESTAMP__,interval ${__interval}) astimegroup bytime,status)limit 10000
select count(*) as cnt......group by time,status: This query is consistent with the trend of request count by status code case, where the trend of request count is calculated by status code.sum(cnt) over (partition by time): Based on the subquery results, add the request counts of different status codes under each time interval to get the total number of requests in that time interval.cnt*1.0/sum(cnt) over (partition by time) as pct: Divide the request count of each status code by the total number of requests to get the request proportion. *1.0 is used to convert cnt to the double type to preserve precision in subsequent division operations. Otherwise, integer division would result in an integer, losing precision.* | select approx_percentile(request_time,0.95) as P95,urlgroup by urlorder by P95 desclimit 50
approx_percentile(request_time,0.95): Collect statistics on request_time to get its value at the 95th percentile, namely, 95% of request latencies are below this value.group by url: Group data by URL and calculate the P95 latency for each URL.order by P95 desc: Sort by P95 latency in descending order, prioritizing URLs with higher P95 latency.limit 50: Return only the top 50 query results, which are the 50 URLs with the highest P95 latency.* | select approx_percentile(request_time,0.95) as P95,histogram(__TIMESTAMP__,INTERVAL ${__interval}) as timegroup by time limit 10000
approx_percentile(request_time,0.95): Collect statistics on request_time to get its value at the 95th percentile, namely, 95% of request latencies are below this value.histogram(__TIMESTAMP__,INTERVAL ${__interval}): To view metric trends over time, specify the interval between each data point in the trend chart. For example, to query the trend for the last 60 minutes, set the interval between data points to 30 seconds. Here:__TIMESTAMP__ represents the log time, namely, the time the request occurred.${__interval} represents the interval between data points. It is a variable that dynamically calculates the most appropriate interval based on the length of the query time range. You can also manually specify it. For example, change it to 1 MINUTE, representing an interval of 1 minute.group by time: After specifying the time interval using a histogram, group data by this interval to count the P95 latency for each interval separately.limit 10000: When no limit is specified in the SQL, only 100 query results are returned by default. To get the complete query results, set the limit to a larger value.* | select approx_percentile(request_time,0.95) as P95,histogram(__TIMESTAMP__,INTERVAL ${__interval}) as time,urlwhere url in (select url group by url order by approx_percentile(request_time,0.95) desc limit 10)group by time,urlorder by P95 desc limit 10000
where url in (select url group by url order by approx_percentile(request_time,0.95) desc limit 10): When there are many URLs, directly viewing the latency trend for each URL will make the trend chart very messy. This filter condition is used to only view the top 10 URLs by latency.* | select sum(body_bytes_sent) as "Network traffic",urlgroup by urlorder by "Network traffic" desclimit 50
sum(body_bytes_sent): Sum body_bytes_sent to get the total network traffic.group by url: Group data by URL and calculate total network traffic for each URL.order by "Network traffic" desc: Sort by the P95 percentile traffic value in descending order, prioritizing URLs with high total network traffic.limit 50: Only return the top 50 query results, namely, the 50 URLs with the highest total network traffic.* | select *where array_join(transform(split(version, '.'), x -> lpad(x, 10, '0')), '.') > array_join(transform(split('2.0.0', '.'), x -> lpad(x, 10, '0')), '.')
split(version, '.'): Split the version number into multiple parts using the symbol . to get an array. For example, convert 2.0.1 to ["2","0","1"].transform(split(version, '.'), x -> lpad(x, 10, '0')): Use the transform function to zero-pad the left of each number in the array to make the length 10, avoiding inconsistencies in the length of each part of the version number affecting subsequent comparisons. For example, convert 2.0.1 to ["0000000002","0000000000","0000000001"].array_join: Use the symbol . to concatenate the array into a string. Since the transform function has already unified the length of each part of the version number to 10, the string can now be compared directly by character.2.0 while others are 2.0.0.2.0.0-alpha.* | select count(*) as PV, approx_distinct(remote_addr) as UV, urlgroup by urlorder by PV desclimit 50
approx_distinct(remote_addr): Perform deduplicated count on client IP remote_addr to get the UV.group by url: Group data by URL and count the PV and UV for each URL.order by PV desc: Sort by PV in descending order, prioritizing URLs with higher PV.limit 50: Return only the top 50 query results, which are the 50 URLs with the highest PV.* | select approx_distinct(remote_addr) as UV, ip_to_province(remote_addr) as provincewhere ip_to_province(remote_addr) != 'NULL' and ip_to_country(remote_addr) = '中国'group by provinceorder by UV desclimit 1000
approx_distinct(remote_addr): Perform deduplicated count on client IP remote_addr to get the UV.ip_to_province(remote_addr): Get the client's province based on the client IP address remote_addr.where ip_to_province(remote_addr) != 'NULL' and ip_to_country(remote_addr) = '中国': Filter out IP addresses without province information (such as private IP addresses) and only retain IP addresses in China.group by province: Group data by province and count the UV for each province.order by UV desc: Sort by UV in descending order, prioritizing provinces with higher PV.limit 50: Return only the top 50 query results, which are the 50 provinces with the highest UV.* |select compare[1] as today, compare[2] as yesterday, round(compare[3],4) as ratiofrom (select compare(UV, 86400) as comparefrom (select approx_distinct(remote_addr) as UV))
approx_distinct(remote_addr): Perform deduplicated count on client IP remote_addr to get the UV.compare(UV, 100) as compare: Calculate the period-over-period UV in the subquery. 86400 represents calculating the ratio compared to 86400 seconds (1 day) ago. The result is an array, sequentially containing current UV (compare[1]), UV from one day ago (compare[2]), and the ratio of current UV to UV from one day ago (compare[3]).* | selectcase whenlower(http_user_agent) like '%iphone%' ORlower(http_user_agent) like '%ipod%' ORlower(http_user_agent) like '%ipad%' ORlower(http_user_agent) like '%ios%'then 'IOS'whenlower(http_user_agent) like '%android%'then 'Android'else 'other'end as type ,count(*) as PVgroup by typeHAVING type != 'other'
case when: Classify into different device types (IOS, Android, and other) based on keywords in http_user_agent.lower(http_user_agent): Convert http_user_agent to lowercase for keyword matching.count(*) as PV: Count all log entries, namely, the PV.group by type: Group by device type and count the PV for each device type.HAVING type != 'other': Perform secondary filtering on the statistical results, removing device type of other.* | select count(*) as PV, remote_addr as IPgroup by IPorder by PV desclimit 50
count(*) as PV: Count all log entries, namely, the PV.group by IP: Group data by IP address and count the PV for each IP.order by PV desc: Sort by PV in descending order, prioritizing IP addresses with higher PV.limit 50: Only return the top 50 query results, which are the 50 IP addresses with the highest PV.Was this page helpful?
You can also Contact sales or Submit a Ticket for help.
Help us improve! Rate your documentation experience in 5 mins.
Feedback