Using Custom Metrics for Auto Scaling in TKE
Last updated: 2025-05-29 18:28:37

Scenario

Based on the custom metrics API, TKE supports elastic scaling on a wide range of metrics covering CPU, memory, disk, network, and GPU, which satisfies most HPA scenarios. For the full list, see HPA Metrics. For more complex scenarios, such as scaling based on the QPS per replica, you can install prometheus-adapter to implement auto scaling. Kubernetes provides the custom metrics API and the external metrics API so that HPA can scale on user-defined metrics, and prometheus-adapter implements both APIs. In practice, the custom metrics API covers most scenarios, so this document describes how to implement auto scaling through the custom metrics API.

Prerequisites

You have created a TKE cluster of v1.14 or later. For details, see Creating a Cluster.
You have purchased a Prometheus instance and associated it with the TKE cluster. For details, see Prometheus Getting Started and Associating with Cluster.
You have installed Helm.

Directions

Exposing the monitoring metric

This document takes a Golang service application as an example. It exposes the httpserver_requests_total metric, which counts the HTTP requests received and can be used to calculate the QPS of the service, as shown below:
package main

import (
    "net/http"
    "strconv"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
    // HTTPRequests counts HTTP requests, labeled by response status code.
    HTTPRequests = prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Name: "httpserver_requests_total",
            Help: "Number of the http requests received since the server started",
        },
        []string{"status"},
    )
)

func init() {
    prometheus.MustRegister(HTTPRequests)
}

func main() {
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        path := r.URL.Path
        code := 200
        switch path {
        case "/test":
            w.WriteHeader(200)
            w.Write([]byte("OK"))
        case "/metrics":
            promhttp.Handler().ServeHTTP(w, r)
        default:
            code = 404 // record the actual status code instead of the default 200
            w.WriteHeader(404)
            w.Write([]byte("Not Found"))
        }
        HTTPRequests.WithLabelValues(strconv.Itoa(code)).Inc()
    })
    http.ListenAndServe(":80", nil)
}
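To sanity-check the metric before building the image, you can run the server locally and query its /metrics endpoint (a quick verification sketch, assuming port 80 is free on your machine):
$ curl -s http://localhost/test
$ curl -s http://localhost/metrics | grep httpserver_requests_total
# Expected output similar to:
# httpserver_requests_total{status="200"} 1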

Deploying the service application

Package the application into a container image and deploy it to the cluster, taking a Deployment as an example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpserver
  namespace: httpserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpserver
  template:
    metadata:
      labels:
        app: httpserver
    spec:
      containers:
      - name: httpserver
        image: ccr.ccs.tencentyun.com/tkedemo/custom-metrics-demo:v1.0.0
        imagePullPolicy: Always

---

apiVersion: v1
kind: Service
metadata:
  name: httpserver
  namespace: httpserver
  labels:
    app: httpserver
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/metrics"
    prometheus.io/port: "http"
spec:
  type: ClusterIP
  ports:
  - port: 80
    protocol: TCP
    name: http
  selector:
    app: httpserver
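Assuming the manifest above is saved as httpserver.yaml, you can create the namespace and deploy the application as follows:
$ kubectl create namespace httpserver
$ kubectl apply -f httpserver.yaml
$ kubectl get pods -n httpserver  # confirm the Pod reaches Running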

Collecting service monitoring metrics through the Prometheus instance

1. After you have associated the Prometheus monitoring instance with the TKE cluster, log in to the TKE console and select TMP from the left sidebar.
2. Select a monitoring instance. On the Data Collection > Integrate with TKE page, find the target cluster, click Data Collection Configuration on the right, and enter the data collection configuration list page.
3. Click Customize Monitoring Configuration to configure new collection rules for the deployed service, then click OK.
4. Click Basic Info in the upper left corner, and retrieve the Prometheus API access address (HTTP URL) and the authentication account information (Basic auth user and Basic auth password). You will use these later when installing prometheus-adapter.
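Optionally, you can confirm that TMP is already scraping the service by querying the standard Prometheus HTTP query API with the address and credentials obtained above (the angle-bracket placeholders stand for your own values):
$ curl -s -u '<Basic auth user>:<Basic auth password>' '<HTTP URL>/api/v1/query?query=httpserver_requests_total'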


Installing prometheus-adapter

1. Use Helm to install prometheus-adapter, and confirm and configure the custom metrics before installation. Following the example in Exposing the monitoring metric above, the service uses the httpserver_requests_total metric to record HTTP requests, so you can calculate the QPS of each service Pod with the following PromQL, as shown below:
sum(rate(httpserver_requests_total[2m])) by (pod)
2. Convert it to the prometheus-adapter configuration. Create values.yaml with the following content:
rules:
  default: false
  custom:
  - seriesQuery: 'httpserver_requests_total'
    resources:
      template: <<.Resource>>
    name:
      matches: "httpserver_requests_total"
      as: "httpserver_requests_qps" # QPS metric calculated by the PromQL below
    metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
prometheus:
  url: http://127.0.0.1 # Replace with the Prometheus API address (HTTP URL) obtained in the previous step (without the port)
  port: 9090
extraArguments:
  - --prometheus-header=Authorization=Basic {token} # {token} is the base64 encoding of the "Basic auth user:Basic auth password" string obtained from the console.
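The {token} placeholder can be generated from the console credentials, for example:
$ echo -n '<Basic auth user>:<Basic auth password>' | base64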
3. Run the following Helm commands to install prometheus-adapter:
Note:
Before installation, you need to delete the Custom Metrics API registered by TKE using the following command:
kubectl delete apiservice v1beta1.custom.metrics.k8s.io
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Helm 3
helm install prometheus-adapter prometheus-community/prometheus-adapter -f values.yaml
# Helm 2
# helm install --name prometheus-adapter prometheus-community/prometheus-adapter -f values.yaml
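After the installation completes, you can check that the adapter has re-registered the Custom Metrics API; the APIService should report True under AVAILABLE:
$ kubectl get apiservice v1beta1.custom.metrics.k8s.io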
4. Add Prometheus authentication and authorization parameters.
The chart currently provided by the community does not expose authentication-related parameters, which can cause authentication failures and prevent the adapter from connecting to the TMP service. To solve this problem, refer to the community documentation: manually modify the prometheus-adapter Deployment and add --prometheus-header=Authorization=Basic {token} to the adapter's startup arguments, where {token} is the base64-encoded "Basic auth user:Basic auth password" obtained from the console.
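A minimal sketch of that modification, assuming the Helm release above created a Deployment named prometheus-adapter in the current namespace (adjust the name and namespace to match your release, and keep {token} as your own value):
$ kubectl patch deployment prometheus-adapter --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--prometheus-header=Authorization=Basic {token}"}]'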

Verifying installation result

If the installation is correct, you can run the following command to view the configured QPS-related metrics returned by the Custom Metrics API, as shown below:
$ kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "jobs.batch/httpserver_requests_qps",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": ["get"]
    },
    {
      "name": "pods/httpserver_requests_qps",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": ["get"]
    },
    {
      "name": "namespaces/httpserver_requests_qps",
      "singularName": "",
      "namespaced": false,
      "kind": "MetricValueList",
      "verbs": ["get"]
    }
  ]
}
Run the following command to view the QPS value of the Pod, as shown below:
Note:
In the following example, the value is 500m; the m suffix is the Kubernetes milli quantity unit, so this corresponds to a QPS of 0.5 requests/second.
$ kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1/namespaces/httpserver/pods/*/httpserver_requests_qps
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/httpserver/pods/%2A/httpserver_requests_qps"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "httpserver",
        "name": "httpserver-6f94475d45-7rln9",
        "apiVersion": "/v1"
      },
      "metricName": "httpserver_requests_qps",
      "timestamp": "2020-11-17T09:14:36Z",
      "value": "500m",
      "selector": null
    }
  ]
}

Testing HPA

Suppose scale-out should be triggered when the average QPS of each service Pod reaches 50 requests/second, with the minimum and maximum numbers of replicas set to 1 and 1000 respectively. The configuration example is as follows:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: httpserver
  namespace: httpserver
spec:
  minReplicas: 1
  maxReplicas: 1000
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: httpserver
  metrics:
  - type: Pods
    pods:
      metric:
        name: httpserver_requests_qps
      target:
        averageValue: 50
        type: AverageValue
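Assuming the manifest above is saved as hpa.yaml, apply it to the cluster:
$ kubectl apply -f hpa.yaml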
Run the following commands to stress test the service and observe whether scale-out is triggered, as shown below:
# Use wrk or another HTTP stress test tool to perform stress tests on the httpserver service.
$ kubectl proxy --port=8080 > /dev/null 2>&1 &
$ wrk -t12 -c3000 -d60s http://localhost:8080/api/v1/namespaces/httpserver/services/httpserver:http/proxy/test
$ kubectl get hpa -n httpserver
NAME         REFERENCE               TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
httpserver   Deployment/httpserver   35266m/50   1         1000      2          50m

# Observe the scaling of hpa
$ kubectl get pods -n httpserver
NAME                          READY   STATUS              RESTARTS   AGE
httpserver-7f8dffd449-pgsb7   0/1     ContainerCreating   0          4s
httpserver-7f8dffd449-wsl95   1/1     Running             0          93s
httpserver-7f8dffd449-pgsb7   1/1     Running             0          4s
If scale-out is triggered as expected, HPA has successfully implemented auto scaling based on custom service metrics.
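To see why and when the HPA scaled, you can also inspect its events:
$ kubectl describe hpa httpserver -n httpserver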