Overview
If you need to integrate CKafka monitoring metrics into your self-owned ops platform, CKafka offers two ways to access Prometheus monitoring:
1. Integration via Tencent Cloud Observability Platform (TCOP)
Configure cloud service monitoring through the Prometheus + Grafana solution provided by TCOP, which supports exporting metrics collected by Cloud Monitor to Prometheus. For details, see Cloud Monitor.
2. Integration based on the open source standard Prometheus Exporter
For Pro Edition users, Message Queue CKafka provides an access method based on the open source standard Prometheus Exporter. Details are as follows:
Instance-level metrics: Pro Edition instances provide access to external monitoring services for all users by default. Through the provided access points, you can monitor CKafka instance metrics supported by open-source Kafka, including but not limited to unsynced replicas, topic read/write rates, and request response duration.
Node-level metrics: Pro Edition instances provide node metrics captured by Prometheus Exporter, including but not limited to basic monitoring metrics such as CPU usage, memory usage, and system load, as well as metrics exposed via broker JMX. After being collected by Prometheus, these metrics can be aggregated, displayed, and analyzed.
This document describes the steps for integrating Prometheus monitoring with CKafka based on the open source standard Prometheus Exporter.
Directions
Step 1: Obtaining Prometheus Monitoring Targets
1. Log in to the CKafka console.
2. Select Instance List on the left sidebar and click the ID of the target instance to enter its basic information page.
3. Click Get Monitoring Target in the top-right corner of the Prometheus Monitoring module and select the VPC and subnet.
4. Click Submit to obtain a set of monitoring targets.
5. In the Prometheus Monitoring module, click Retrieve again in the top-right corner to delete the current monitoring addresses under the existing network.
6. Click Confirm to retrieve the monitoring targets again.
Note:
If the instance undergoes a configuration adjustment that involves migration, or an AZ switch, the underlying broker migration may make Prometheus monitoring data unobtainable. In this case, click Retrieve again in the console to get the latest Exporter IP addresses and ports.
Step 2: Collecting Monitoring Metrics with Prometheus
1. Download Prometheus and configure the monitoring scrape address.
2. Enter the directory of the Prometheus package and run the following command to decompress it:
tar -vxf prometheus-2.30.3.linux-amd64.tar.gz
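If you have not downloaded the package yet, the release archive can be fetched from the official Prometheus GitHub releases; the version and platform below only mirror the example above:
wget https://github.com/prometheus/prometheus/releases/download/v2.30.3/prometheus-2.30.3.linux-amd64.tar.gz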
3. Modify the prometheus.yml configuration file by adding the jmx_exporter and node_exporter scrape tasks.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "broker-jmx-exporter"
    scrape_interval: 5s
    metrics_path: '/metrics'
    static_configs:
      - targets: ['10.x.x.0:60001','10.x.x.0:60003','10.x.x.0:60005']
        labels:
          application: 'broker-jmx'
  - job_name: "broker-node-exporter"
    scrape_interval: 10s
    metrics_path: '/metrics'
    static_configs:
      - targets: ['10.x.x.0:60002','10.x.x.0:60004','10.x.x.0:60006']
        labels:
          application: 'broker-node'
In this configuration, broker-jmx-exporter is the job name for scraping the brokers' JMX metrics, and targets lists the Exporter IP addresses and mapped ports obtained in Step 1.
broker-node-exporter is the job name for scraping the brokers' basic node metrics, and scrape_interval is the frequency at which metric data is scraped.
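Optionally, before starting Prometheus, you can validate the edited configuration with the promtool binary shipped in the same package:
./promtool check config prometheus.yml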
4. Start Prometheus.
./prometheus --config.file=prometheus.yml --web.enable-lifecycle
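Because --web.enable-lifecycle is enabled, later configuration changes (for example, after re-fetching monitoring targets) can be applied without restarting Prometheus by calling its reload endpoint:
curl -X POST http://localhost:9090/-/reload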
5. Open the Prometheus UI to check whether the connected targets are normal, for example by entering http://localhost:9090 in a browser.
6. Confirm that all targets are in UP status.
If a target is in DOWN status, check whether the network is accessible, or check the Error column at the end of the target's row for the cause.
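You can also check target health from the command line through the Prometheus HTTP API; the response reports each target's health and lastError fields:
curl -s http://localhost:9090/api/v1/targets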
7. Query the monitoring metric data.
Click Graph, enter a metric name such as node_memory_MemAvailable_bytes, and click Execute to view the monitoring data.
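For example, the following expressions use the application label from the sample configuration above (the label value is an assumption from that example) to query broker node memory and CPU usage:
node_memory_MemAvailable_bytes{application="broker-node"}
rate(node_cpu_seconds_total{application="broker-node", mode!="idle"}[5m])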
Alternatively, you can use a Tencent Cloud managed Prometheus instance to collect the metrics:
1. Create a Prometheus monitoring instance.
Note:
When creating the Prometheus instance, the bound VPC and subnet should be the same as the vpcId and subnetId selected in Step 1; otherwise the network may be unreachable.
2. In the Prometheus instance list, click the target instance to enter its details page. Select Data Acquisition > Integration Center, locate the scraping job to be configured, and click it to enter.
3. In the scrape task drawer that pops up, fill in the collection configuration based on the example code below and click Save.
Configure the metrics by major category as needed; both jmx-exporter and node-exporter are supported.
jmx-exporter configuration example:
job_name: broker-jmx-exporter
scrape_interval: 10s
metrics_path: /metrics
static_configs:
  - targets:
      - 10.0.x.x:60001
      - 10.0.x.x:60003
      - 10.0.x.x:60005
      - 10.0.x.x:60007
    labels:
      application: broker-jmx
node-exporter configuration example:
job_name: broker-node-exporter
scrape_interval: 10s
metrics_path: /metrics
static_configs:
  - targets:
      - 10.0.x.x:60002
      - 10.0.x.x:60004
      - 10.0.x.x:60006
      - 10.0.x.x:60008
    labels:
      application: broker-node
4. On the scrape task page, click the Integrated tab and wait 2-3 minutes. The running state changes to "deployed", and Targets shows the specific scrape targets. Click Metric Details to view the scraped broker metrics.