
Using Self-Built Prometheus to Collect Control Plane Monitoring Metrics

Last updated: 2025-04-29 10:58:03
This document describes how to configure self-built Prometheus to collect monitoring metrics of the control plane components (kube-apiserver, kube-scheduler, and kube-controller-manager) in a TKE managed cluster.

Prerequisites

The self-built Prometheus can access the API Server of the TKE cluster.
The self-built Prometheus can be deployed inside or outside the TKE cluster.

Prometheus Collection Configuration

To collect metrics of the core control plane components in a TKE cluster with self-built Prometheus, first configure the metric collection jobs in the Prometheus configuration file prometheus.yaml. The configuration file format is as follows:
global:
  scrape_interval: 15s # By default, scrape targets every 15 seconds.

# One scrape job per control plane component.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: kube-apiserver
    ......

  - job_name: kube-controller-manager
    ......

  - job_name: kube-scheduler
    ......
Each core component corresponds to one job. For the specific configuration of each job, see the section for the corresponding component below. For details on configuring prometheus.yaml in community Prometheus, see Configuration in the Prometheus documentation.

Monitoring Metric Collection Inside a TKE Cluster

The configurations in this section apply when Prometheus is deployed inside the TKE cluster.

kube-apiserver

For more information about the kube-apiserver component, see kube-apiserver component metric descriptions.
scrape_configs:
  - job_name: kube-apiserver
    honor_timestamps: true
    params:
      component:
        - apiserver
    scrape_interval: 15s
    metrics_path: /master/metrics
    scheme: http
    follow_redirects: true
    enable_http2: true
    relabel_configs:
      - source_labels: [job]
        separator: ;
        regex: (.*)
        target_label: __tmp_prometheus_job_name
        replacement: $1
        action: replace
      - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_managed_by, __meta_kubernetes_service_labelpresent_app_kubernetes_io_managed_by]
        separator: ;
        regex: (Helm);true
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_service_label_label_qcloud_app, __meta_kubernetes_service_labelpresent_label_qcloud_app]
        separator: ;
        regex: (cluster-monitor);true
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_service_label_label_qcloud_service, __meta_kubernetes_service_labelpresent_label_qcloud_service]
        separator: ;
        regex: (master-metrics-service);true
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_endpoint_port_name]
        separator: ;
        regex: http-metrics
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
        separator: ;
        regex: Node;(.*)
        target_label: node
        replacement: ${1}
        action: replace
      - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
        separator: ;
        regex: Pod;(.*)
        target_label: pod
        replacement: ${1}
        action: replace
      - source_labels: [__meta_kubernetes_namespace]
        separator: ;
        regex: (.*)
        target_label: namespace
        replacement: $1
        action: replace
      - source_labels: [__meta_kubernetes_service_name]
        separator: ;
        regex: (.*)
        target_label: service
        replacement: $1
        action: replace
      - separator: ;
        regex: (.*)
        target_label: endpoint
        replacement: http-metrics
        action: replace
    metric_relabel_configs:
      - source_labels: [pod]
        separator: ;
        regex: (.*)
        target_label: instance
        replacement: $1
        action: replace
    kubernetes_sd_configs:
      - role: endpoints
        kubeconfig_file: ""
        follow_redirects: true
        enable_http2: true
        namespaces:
          own_namespace: false
          names:
            - kube-system
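
The relabel_configs rules above decide which discovered endpoints are scraped and which labels they receive. As a rough illustration of how such rules are evaluated, the following Python sketch mimics the keep and replace actions: the values of source_labels are joined with the separator, and the regex must match the whole joined string. It is a simplified model for reasoning about the rules, not Prometheus code; the function names and the pod name master-metrics-0 are hypothetical.

```python
import re


def joined(labels, source_labels, separator=";"):
    """Join the values of source_labels; missing labels count as empty strings."""
    return separator.join(labels.get(name, "") for name in source_labels)


def keep(labels, source_labels, regex, separator=";"):
    """Model of `action: keep`: the target survives only on a full regex match."""
    return re.fullmatch(regex, joined(labels, source_labels, separator)) is not None


def replace(labels, source_labels, regex, target_label,
            replacement="$1", separator=";"):
    """Model of `action: replace`: write the expanded replacement to target_label."""
    match = re.fullmatch(regex, joined(labels, source_labels, separator))
    if match is None:
        return dict(labels)  # no match: labels stay unchanged
    # Translate $1 / ${1} references into Python's \g<1> template syntax.
    template = re.sub(r"\$\{?(\d+)\}?", r"\\g<\1>", replacement)
    result = dict(labels)
    result[target_label] = match.expand(template)
    return result


if __name__ == "__main__":
    discovered = {
        "__meta_kubernetes_service_label_label_qcloud_app": "cluster-monitor",
        "__meta_kubernetes_service_labelpresent_label_qcloud_app": "true",
        "__meta_kubernetes_endpoint_address_target_kind": "Pod",
        "__meta_kubernetes_endpoint_address_target_name": "master-metrics-0",
    }
    print(keep(discovered,
               ["__meta_kubernetes_service_label_label_qcloud_app",
                "__meta_kubernetes_service_labelpresent_label_qcloud_app"],
               "(cluster-monitor);true"))
    print(replace(discovered,
                  ["__meta_kubernetes_endpoint_address_target_kind",
                   "__meta_kubernetes_endpoint_address_target_name"],
                  "Pod;(.*)", "pod", "${1}")["pod"])
```

Under this model, a Service without the label_qcloud_app label produces the joined string ";", which does not match (cluster-monitor);true, so its endpoints are dropped.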

kube-scheduler

For more information about the kube-scheduler component, see kube-scheduler component metric descriptions.
scrape_configs:
  - job_name: kube-scheduler
    honor_labels: true
    honor_timestamps: true
    params:
      component:
        - scheduler
    scrape_interval: 30s
    scrape_timeout: 10s
    metrics_path: /master/metrics
    scheme: http
    follow_redirects: true
    enable_http2: true
    relabel_configs:
      - source_labels: [job]
        separator: ;
        regex: (.*)
        target_label: __tmp_prometheus_job_name
        replacement: $1
        action: replace
      - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_managed_by, __meta_kubernetes_service_labelpresent_app_kubernetes_io_managed_by]
        separator: ;
        regex: (Helm);true
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_service_label_label_qcloud_app, __meta_kubernetes_service_labelpresent_label_qcloud_app]
        separator: ;
        regex: (cluster-monitor);true
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_service_label_label_qcloud_service, __meta_kubernetes_service_labelpresent_label_qcloud_service]
        separator: ;
        regex: (master-metrics-service);true
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_endpoint_port_name]
        separator: ;
        regex: http-metrics
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
        separator: ;
        regex: Node;(.*)
        target_label: node
        replacement: ${1}
        action: replace
      - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
        separator: ;
        regex: Pod;(.*)
        target_label: pod
        replacement: ${1}
        action: replace
      - source_labels: [__meta_kubernetes_namespace]
        separator: ;
        regex: (.*)
        target_label: namespace
        replacement: $1
        action: replace
      - source_labels: [__meta_kubernetes_service_name]
        separator: ;
        regex: (.*)
        target_label: service
        replacement: $1
        action: replace
      - separator: ;
        regex: (.*)
        target_label: endpoint
        replacement: http-metrics
        action: replace
    metric_relabel_configs:
      - source_labels: [pod]
        separator: ;
        regex: (.*)
        target_label: instance
        replacement: $1
        action: replace
    kubernetes_sd_configs:
      - role: endpoints
        kubeconfig_file: ""
        follow_redirects: true
        enable_http2: true
        namespaces:
          own_namespace: false
          names:
            - kube-system

kube-controller-manager

For more information about the kube-controller-manager component, see kube-controller-manager component metric descriptions.
scrape_configs:
  - job_name: kube-controller-manager
    honor_labels: true
    honor_timestamps: true
    params:
      component:
        - controller-manager
    scrape_interval: 30s
    scrape_timeout: 10s
    metrics_path: /master/metrics
    scheme: http
    follow_redirects: true
    enable_http2: true
    relabel_configs:
      - source_labels: [job]
        separator: ;
        regex: (.*)
        target_label: __tmp_prometheus_job_name
        replacement: $1
        action: replace
      - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_managed_by, __meta_kubernetes_service_labelpresent_app_kubernetes_io_managed_by]
        separator: ;
        regex: (Helm);true
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_service_label_label_qcloud_app, __meta_kubernetes_service_labelpresent_label_qcloud_app]
        separator: ;
        regex: (cluster-monitor);true
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_service_label_label_qcloud_service, __meta_kubernetes_service_labelpresent_label_qcloud_service]
        separator: ;
        regex: (master-metrics-service);true
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_endpoint_port_name]
        separator: ;
        regex: http-metrics
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
        separator: ;
        regex: Node;(.*)
        target_label: node
        replacement: ${1}
        action: replace
      - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
        separator: ;
        regex: Pod;(.*)
        target_label: pod
        replacement: ${1}
        action: replace
      - source_labels: [__meta_kubernetes_namespace]
        separator: ;
        regex: (.*)
        target_label: namespace
        replacement: $1
        action: replace
      - source_labels: [__meta_kubernetes_service_name]
        separator: ;
        regex: (.*)
        target_label: service
        replacement: $1
        action: replace
      - separator: ;
        regex: (.*)
        target_label: endpoint
        replacement: http-metrics
        action: replace
    metric_relabel_configs:
      - source_labels: [pod]
        separator: ;
        regex: (.*)
        target_label: instance
        replacement: $1
        action: replace
    kubernetes_sd_configs:
      - role: endpoints
        kubeconfig_file: ""
        follow_redirects: true
        enable_http2: true
        namespaces:
          own_namespace: false
          names:
            - kube-system
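
The three jobs above use identical discovery and relabeling rules and differ only in the job name, the component parameter, and a few scrape settings. If you prefer to keep them in sync from one template, a generator along the following lines may help. This is a minimal sketch: the helper names are hypothetical, only the four keep rules are reproduced (the label-copying replace rules and metric_relabel_configs are left out for brevity), and the output is printed as JSON for inspection rather than written to prometheus.yaml.

```python
import json

# Per-component differences, taken from the three jobs above.
COMPONENTS = {
    "kube-apiserver": {
        "component": "apiserver",
        "settings": {"scrape_interval": "15s"},
    },
    "kube-scheduler": {
        "component": "scheduler",
        "settings": {"honor_labels": True, "scrape_interval": "30s",
                     "scrape_timeout": "10s"},
    },
    "kube-controller-manager": {
        "component": "controller-manager",
        "settings": {"honor_labels": True, "scrape_interval": "30s",
                     "scrape_timeout": "10s"},
    },
}


def service_label_keep_rule(label, value):
    """Keep only targets whose Service carries `label` with exactly `value`."""
    return {
        "source_labels": [
            f"__meta_kubernetes_service_label_{label}",
            f"__meta_kubernetes_service_labelpresent_{label}",
        ],
        "separator": ";",
        "regex": f"({value});true",
        "action": "keep",
    }


def build_job(job_name):
    """Build one scrape job from the shared template plus per-component settings."""
    spec = COMPONENTS[job_name]
    job = {
        "job_name": job_name,
        "params": {"component": [spec["component"]]},
        "metrics_path": "/master/metrics",
        "scheme": "http",
        "relabel_configs": [
            service_label_keep_rule("app_kubernetes_io_managed_by", "Helm"),
            service_label_keep_rule("label_qcloud_app", "cluster-monitor"),
            service_label_keep_rule("label_qcloud_service", "master-metrics-service"),
            {"source_labels": ["__meta_kubernetes_endpoint_port_name"],
             "regex": "http-metrics", "action": "keep"},
        ],
        "kubernetes_sd_configs": [
            {"role": "endpoints", "namespaces": {"names": ["kube-system"]}},
        ],
    }
    job.update(spec["settings"])
    return job


def build_scrape_configs():
    """Return a dict shaped like the scrape_configs section of prometheus.yaml."""
    return {"scrape_configs": [build_job(name) for name in COMPONENTS]}


if __name__ == "__main__":
    print(json.dumps(build_scrape_configs(), indent=2))
```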

Collecting via ServiceMonitor

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: master-metrics-exporter
  namespace: kube-system
spec:
  endpoints:
    - honorLabels: true
      interval: 15s
      metricRelabelings:
        - action: replace
          sourceLabels:
            - pod
          targetLabel: instance
      params:
        component:
          - apiserver
      path: /master/metrics
      port: http-metrics
      relabelings:
        - action: replace
          replacement: kube-apiserver
          targetLabel: job
    - honorLabels: true
      interval: 15s
      metricRelabelings:
        - action: replace
          sourceLabels:
            - pod
          targetLabel: instance
      params:
        component:
          - controller-manager
      path: /master/metrics
      port: http-metrics
      relabelings:
        - action: replace
          replacement: kube-controller-manager
          targetLabel: job
    - honorLabels: true
      interval: 15s
      metricRelabelings:
        - action: replace
          sourceLabels:
            - pod
          targetLabel: instance
      params:
        component:
          - scheduler
      path: /master/metrics
      port: http-metrics
      relabelings:
        - action: replace
          replacement: kube-scheduler
          targetLabel: job
  namespaceSelector: {}
  selector:
    matchLabels:
      app.kubernetes.io/managed-by: Helm
      label_qcloud_app: cluster-monitor
      label_qcloud_service: master-metrics-service

Monitoring Metric Collection Outside a TKE Cluster

This section uses a self-built Prometheus that runs outside the cluster but in the same VPC as the cluster as an example.

Prerequisites

Public network or private network access has been enabled for the target cluster.

Enabling Cross-Cluster Metric Collection

1. Log in to the TKE console, and click Cluster in the left sidebar.
2. Click the ID of the target cluster on the Cluster Management page to enter the cluster details page.
3. Click Add-on management in the left sidebar.
4. Click Update Configuration for the clustermonitor component.
5. On the component configuration update page, select the option to enable control plane component metric collection outside the cluster, and select the container subnet.

Note:
The container subnet cannot be changed after it is selected. To change it, first disable control plane component metric collection outside the cluster, change the subnet, and then enable the feature again.
6. Click Complete.

Collection Configuration

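For specific operations, see Configuration in the Prometheus documentation and the article Monitoring kubernetes with prometheus from outside of k8s cluster. Compared with the configuration for collection inside the cluster, only the kubernetes_sd_configs settings are modified: the API Server address, CA certificate, and service account token are specified explicitly. The configuration is as follows: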
scrape_configs:
  - job_name: out-tke-apiserver
    kubernetes_sd_configs:
      - role: endpoints
        kubeconfig_file: ""
        follow_redirects: true
        enable_http2: true
        namespaces:
          names:
            - kube-system
        api_server: 'https://<KUBERNETES URL>'
        tls_config:
          ca_file: /etc/prometheus/kubernetes-ca.crt
        bearer_token: '<SERVICE ACCOUNT BEARER TOKEN>'
    scrape_interval: 30s
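
Before adding the job, it can be useful to confirm from the Prometheus host that the API Server address, CA file, and token actually work. The following Python sketch sends one authenticated request for the Endpoints objects in kube-system, which is among the resources that endpoints-role discovery reads. It assumes the same three values as the configuration above; the function names are hypothetical, and the service account still needs RBAC permission to list endpoints.

```python
import ssl
import urllib.request


def build_endpoints_request(api_server, bearer_token, namespace="kube-system"):
    """Build a GET request for the Endpoints objects of one namespace."""
    url = f"{api_server.rstrip('/')}/api/v1/namespaces/{namespace}/endpoints"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {bearer_token}"}
    )


def check_api_server(api_server, bearer_token, ca_file):
    """Return the HTTP status of the request; raises on TLS or auth failures."""
    # Trust only the cluster CA, as tls_config.ca_file does in the scrape job.
    context = ssl.create_default_context(cafile=ca_file)
    request = build_endpoints_request(api_server, bearer_token)
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return response.status
```

A call such as check_api_server("https://<KUBERNETES URL>", "<SERVICE ACCOUNT BEARER TOKEN>", "/etc/prometheus/kubernetes-ca.crt") returns 200 when the connection, certificate, and token are all accepted. An HTTPError with status 401 or 403 points to the token or its RBAC permissions rather than to the network path.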

Verification

1. Log in to the console of the self-built Prometheus and switch to the Graph page.
2. Enter the query up and check whether targets for all control plane components are displayed.

3. Enter the following query to verify that the API Server is reporting metrics normally:
apiserver_request_total{job="apiserver"}

4. Enter the following query to verify that the Scheduler is reporting metrics normally:
rest_client_requests_total{job="scheduler"}

5. Enter the following query to verify that the Controller Manager is reporting metrics normally:
rest_client_requests_total{job="controller-manager"}

If a query returns no data, check that the job value in the query matches the job label produced by your collection configuration, such as the job_name set in scrape_configs.
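
The same checks can be scripted against the Prometheus HTTP API, which serves instant queries at /api/v1/query. The sketch below builds the query URL and reports which expected jobs have no healthy target in the result of the up query. The expected job names are an assumption taken from the job_name values in the in-cluster configuration; adjust them to the job labels your setup produces. The function names and the base URL in the test are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed job labels; they must match what your scrape configuration produces.
EXPECTED_JOBS = ("kube-apiserver", "kube-scheduler", "kube-controller-manager")


def query_url(base_url, promql):
    """Build the instant-query URL of the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def jobs_without_healthy_target(response, expected_jobs=EXPECTED_JOBS):
    """Return expected jobs that have no target with up == 1 in the response."""
    healthy = {
        sample["metric"].get("job")
        for sample in response["data"]["result"]
        if sample["value"][1] == "1"  # sample values are returned as strings
    }
    return [job for job in expected_jobs if job not in healthy]


def run_query(base_url, promql):
    """Execute an instant query and return the decoded JSON response."""
    with urllib.request.urlopen(query_url(base_url, promql), timeout=10) as resp:
        return json.load(resp)
```

For example, jobs_without_healthy_target(run_query(base_url, "up")) returns an empty list when every expected job has at least one target that is up.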



