
HPA Basic Operations

Last updated: 2025-09-25 18:28:43

Overview

Horizontal Pod Autoscaler (HPA) can automatically scale the number of Pods for a service based on the average CPU utilization of the target Pods and other metrics. This document describes how to implement Pod auto scaling via the Tencent Cloud TKE console.

How it Works

The HPA backend components pull monitoring metrics for containers and Pods from Tencent Cloud's Cloud Monitor every 15 seconds and calculate the desired number of replicas based on the current monitoring data, the current number of replicas, and the target value of each metric. When there is a gap between the desired and actual number of replicas, HPA triggers the Deployment to adjust the Pod replica count, thereby achieving auto scaling. Take CPU utilization as an example: suppose there are two Pods with an average CPU utilization of 90%, and the target CPU utilization is set to 60%. The number of Pods is then automatically adjusted to 90% × 2 / 60% = 3 Pods.
Note:
If you set multiple auto scaling metrics, HPA separately calculates a target number of replicas for each metric and then uses the maximum as the scaling target.
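The calculation described above can be sketched as follows. This is a simplified illustration, not the actual controller code; the 10% tolerance mirrors the rule in the Notes section below.

```python
import math

def desired_replicas(current_replicas, metrics, tolerance=0.1):
    """Sketch of the HPA replica calculation.

    metrics: list of (current_value, target_value) pairs. When several
    metrics are configured, the maximum of the per-metric results is used.
    If the ratio of current to target is within the tolerance (10% by
    default), that metric does not trigger a change.
    """
    candidates = []
    for current, target in metrics:
        ratio = current / target
        if abs(ratio - 1.0) <= tolerance:
            candidates.append(current_replicas)  # within tolerance: no change
        else:
            candidates.append(math.ceil(current_replicas * ratio))
    return max(candidates)

# The example from the text: 2 Pods at 90% average CPU, target 60%
print(desired_replicas(2, [(90, 60)]))  # → 3
```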

Notes

If you choose CPU utilization (by request) as the metric type, a CPU request must be set for the container.
Set reasonable targets for the policy metrics. For example, set a CPU utilization target of 70% so that 30% headroom remains to absorb load spikes while scaling is in progress.
Keep Pods and nodes healthy; avoid frequently recreating Pods.
Ensure that the load balancer works stably.
If the gap between the actual and desired number of replicas is within 10%, HPA does not adjust the number of replicas.
If the value of Deployment.spec.replicas corresponding to the service is 0, HPA will not work.
If multiple HPAs are bound to a single Deployment, they all take effect simultaneously, causing the workload's replicas to be scaled repeatedly. Bind only one HPA to each workload.

Prerequisites

You have registered a Tencent Cloud account.
You have logged in to the TKE console.
You have created a cluster. For more information, see Creating a Cluster.

Directions

Enabling Auto Scaling

You can enable auto scaling in one of the following ways.

Setting auto-adjustment of the number of Pods

1. On the Cluster Management page, click the ID of the target cluster.
2. Select Workload > Deployment. On the Deployment page, click Create.
3. On the Create Deployment page, select Auto adjustment for Number of Pods.


Trigger Policy: the policy metrics that trigger the auto-scaling. For details, see Metric Type.
Number of Pods: enter the minimum and maximum numbers according to your needs. The number of Pods will be auto-adjusted within the range.

Creating auto-scaling group

1. On the Cluster Management page, click the ID of the cluster for which the HPA is to be created.
2. Select Auto Scaling > HorizontalPodAutoscaler. On the HorizontalPodAutoscaler page, click Create.
3. On the Create HPA page, configure the HPA as needed.


Name: enter the name of the HPA to be created.
Namespace: select based on your needs.
Workload Type: select based on your needs.
Associated Workload: select based on your needs. The value cannot be empty.
Trigger Policy: the policy metrics that trigger the auto-scaling. For details, see Metric Type.
Number of Pods: enter the minimum and maximum numbers according to your needs. The number of Pods will be auto-adjusted within the range.
4. Click Create HPA.

Creating using YAML

1. On the Cluster Management page, click the ID of the cluster for which the HPA is to be created.
2. On the cluster basic information page, click Create via YAML in the top right corner.


3. Edit the content according to your needs and click Complete to create the HPA.
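As a reference for step 3, a minimal HPA manifest using the autoscaling/v2 API might look like the following. The Deployment name nginx-deployment, the HPA name, and the namespace are placeholders to replace with your own values.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa          # placeholder HPA name
  namespace: default
spec:
  scaleTargetRef:          # the workload this HPA scales
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment # placeholder Deployment name
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60   # target average CPU utilization (%)
```

Submitting this manifest creates an HPA that keeps the Deployment's average CPU utilization near 60%, scaling between 1 and 10 replicas.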

Updating Auto Scaling Rules

You can update auto scaling rules in one of the following ways.

Updating the Pod Quantity

1. On the Cluster Management page, click the target cluster ID.
2. Select Workload > Deployment to enter the Deployment page, and click Update Pod Number next to the target Deployment.


3. On the Update Pod Number page, select Auto adjustment and set parameters as needed.


4. Click Update Number of Instances.

Modifying HPA Configuration

1. On the Cluster Management page, click the target cluster ID.
2. Select Auto Scaling > HorizontalPodAutoscaler. On the HorizontalPodAutoscaler page, click Modify configuration in the Operation column of the HPA whose configuration is to be updated.


3. On the Update Configuration page, change the settings according to your needs and click Update HPA.

Editing YAML

1. On the Cluster Management page, click the target cluster ID.
2. Select Auto Scaling > HorizontalPodAutoscaler. On the HorizontalPodAutoscaler page, click Edit YAML in the Operation column of the HPA whose configuration is to be updated.


3. On the Edit YAML page, edit parameters as needed and click Complete.


Metric Type

For more information on metrics and types, see HPA Metrics.
