
Implementing elasticity based on traffic prediction with EHPA

Last updated: 2024-05-24 15:26:52

EHPA introduction

Effective Horizontal Pod Autoscaler (EHPA) is an autoscaling product provided by the Crane open source project. It uses the community HPA for the underlying elastic control and supports a richer set of elastic trigger policies (prediction, observation, and periodic) to make scaling more efficient and to protect service quality.
Key features of EHPA include:
Preemptive scaling to ensure service quality: By predicting future traffic peaks, EHPA scales up in advance, which avoids cascading failures and service instability caused by scaling too late.
Fewer invalid scale-downs: By predicting future demand, EHPA avoids unnecessary scale-downs, stabilizes the resource utilization of workloads, and prevents misjudgments caused by short spikes.
Cron support: EHPA supports Cron-based elastic configuration to cope with abnormal traffic peaks, such as those during major promotions.
Community compatibility: EHPA uses the community HPA as the execution layer for elastic control and is fully compatible with it.

Installation

To install EHPA, refer to the Crane open source project documentation. For details, see Intelligent Autoscaling Practices Based on Effective HPA for Custom Metrics.

Product features

Elasticity based on prediction

The load of most online applications is periodic, so future load can be predicted from daily or weekly trends. EHPA uses a digital signal processing (DSP) algorithm to predict the future time series data of an application.
The following is an example of an EHPA template with prediction enabled:
apiVersion: autoscaling.crane.io/v1alpha1
kind: EffectiveHorizontalPodAutoscaler
spec:
  prediction:
    predictionWindowSeconds: 3600
    predictionAlgorithm:
      algorithmType: dsp
      dsp:
        sampleInterval: "60s"
        historyLength: "3d"
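
To make the idea of predicting from a periodic history more concrete, here is a minimal Python sketch. It is not Crane's DSP implementation: it swaps in a much simpler seasonal-average forecast and assumes the period is already known. The function name `seasonal_forecast` and the sample data are hypothetical.

```python
from statistics import mean

def seasonal_forecast(history, period, horizon):
    """Forecast `horizon` future samples by averaging the values observed
    at the same phase of every past cycle of length `period`.

    Simplified stand-in for illustration only, not Crane's DSP algorithm.
    """
    if period <= 0 or len(history) < period:
        raise ValueError("need at least one full period of history")
    forecast = []
    for step in range(horizon):
        # Phase of the future sample within the cycle.
        phase = (len(history) + step) % period
        # All past samples that share this phase.
        same_phase = history[phase::period]
        forecast.append(mean(same_phase))
    return forecast

# Hypothetical CPU usage in millicores: 4 samples per cycle, 3 cycles of history.
history = [100, 400, 200, 100,
           120, 440, 220, 100,
            80, 360, 180, 100]

print(seasonal_forecast(history, period=4, horizon=4))
```

In this sketch the history length plays the role of `historyLength`, the spacing between samples corresponds to `sampleInterval`, and `horizon` corresponds to how far ahead the prediction window reaches.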

Fallback for monitoring data

When you use a predictive algorithm, you may be concerned about the accuracy of the predicted data. Therefore, when calculating the number of replicas, EHPA uses not only the predicted data but also the actual monitoring data as a fallback, which makes scaling safer. It works as follows: when you define spec.metrics in EHPA and enable prediction, EffectiveHPAController automatically generates multiple Metric Specs and uses them to create the underlying HPA that it manages.
For example, suppose a user defines the following Metric Spec in an EHPA YAML file:
apiVersion: autoscaling.crane.io/v1alpha1
kind: EffectiveHorizontalPodAutoscaler
spec:
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
The system automatically converts it into two HPA threshold configurations:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
spec:
  metrics:
  - pods:
      metric:
        name: crane_pod_cpu_usage
        selector:
          matchLabels:
            autoscaling.crane.io/effective-hpa-uid: f9b92249-eab9-4671-afe0-17925e5987b8
      target:
        type: AverageValue
        averageValue: 100m
    type: Pods
  - resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
    type: Resource
In the example above, the metric threshold that the user defines in EHPA is automatically converted into two metric thresholds on the underlying HPA: a predictive metric threshold and an actual monitored metric threshold.
The predictive metric threshold is a custom metric. Its value is provided by the MetricAdapter of Crane.
The actual monitored metric threshold is a resource metric, identical to the one the user defined in EHPA. It allows HPA to calculate the number of replicas based on the metrics actually monitored for the application.
When multiple elastic metric thresholds are configured, HPA calculates the number of replicas for each metric separately and selects the highest number as the final recommendation.
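
The "highest number wins" rule can be sketched in a few lines of Python. The sketch below uses the ratio rule documented for the community HPA, ceil(currentReplicas * currentValue / targetValue), and deliberately leaves out HPA details such as the tolerance band and stabilization windows. The function names and metric readings are hypothetical.

```python
import math

def desired_replicas(current_replicas, current_value, target_value):
    """Community HPA ratio rule: ceil(current * currentValue / targetValue)."""
    return math.ceil(current_replicas * current_value / target_value)

def recommend(current_replicas, metrics):
    """Evaluate every metric separately and keep the largest proposal.

    `metrics` maps a label to a (current_value, target_value) pair.
    """
    proposals = {
        name: desired_replicas(current_replicas, current, target)
        for name, (current, target) in metrics.items()
    }
    return max(proposals.values()), proposals

# Hypothetical readings for a workload that currently runs 4 replicas.
metrics = {
    "predicted": (150, 100),  # crane_pod_cpu_usage: 150m forecast vs 100m target
    "observed": (40, 50),     # cpu: 40% utilization vs 50% target
}

final, per_metric = recommend(4, metrics)
print(final, per_metric)
```

With these assumed readings the predicted metric asks for more replicas than the observed one, so the prediction drives the scale-up. If the forecast were too low, the observed metric would win instead, which is the fallback behavior described above.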

Horizontal elasticity execution process

1. EffectiveHPAController creates HorizontalPodAutoscaler and TimeSeriesPrediction objects.
2. PredictionCore retrieves historical metrics from Prometheus, calculates them using the predictive algorithm, and records the results to TimeSeriesPrediction.
3. HPAController reads metric data from KubeApiServer through the metric client.
4. KubeApiServer routes the request to MetricAdapter of Crane.
5. HPAController evaluates the values returned for all metrics and determines the final recommended number of replicas.
6. HPAController uses the scale API to scale up/down the target application.
The overall flow is shown in the following figure:
[Figure: crane-ehpa]


Use case

The actual effect of EHPA is demonstrated through a customer case from a production environment. In this case, the production data is replayed in the pre-release environment, and the scaling behavior of EHPA is compared with that of the community HPA.
The red line in the following figure represents the actual CPU utilization of the application within one day. You can see usage peaks at 8 am, 12 pm, and 8 pm. The green line represents the CPU utilization predicted by EHPA.
[Figure: crane-ehpa-metrics-chart]

The following figure shows how the number of replicas changes under automatic scaling. The red line represents the number of replicas with the community HPA, and the green line represents the number of replicas with EHPA.
[Figure: crane-ehpa-metrics-replicas-chart]

EHPA has the following strengths:
It scales up before traffic peaks arrive.
It reduces ineffective scale-downs when traffic drops briefly and then rises again.
Compared to HPA, EHPA performs fewer scaling operations and uses them more efficiently.
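
The following toy Python simulation illustrates why looking ahead can produce earlier scale-ups and fewer scaling operations. It does not use the customer data from the figures: the load series is hypothetical, the forecast is assumed to be perfect, and `predictive` is a simplified stand-in for EHPA that sizes the workload for the highest demand within a short lookahead window.

```python
def count_changes(series):
    """Number of times the replica count changes between consecutive steps."""
    return sum(1 for prev, cur in zip(series, series[1:]) if prev != cur)

def reactive(load):
    """Reactive scaling: follow the currently observed demand."""
    return list(load)

def predictive(load, forecast, lookahead):
    """Size for the highest of the current demand and the forecast demand
    over the next `lookahead` steps."""
    return [
        max(load[t], *forecast[t:t + lookahead + 1])
        for t in range(len(load))
    ]

# Hypothetical demand, expressed directly as required replicas per time step.
load = [2, 2, 2, 2, 6, 6, 2, 6, 6, 2, 2, 2]
forecast = load  # assumption: the forecast is perfect

hpa_like = reactive(load)
ehpa_like = predictive(load, forecast, lookahead=2)

print(hpa_like, count_changes(hpa_like))
print(ehpa_like, count_changes(ehpa_like))
```

In this toy series the lookahead variant reaches the peak replica count two steps before the first peak and holds it through the brief dip between the two peaks, so it changes the replica count less often than the reactive variant.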

Scale policies

EHPA offers two scale policies: Auto and Preview. You can switch between them at any time, and the change takes effect immediately.

Auto

Under the Auto policy, EHPA automatically performs scaling. By default, EHPA's policy is set to Auto. In this mode, EHPA creates a community HPA object and automatically takes over its lifecycle. We recommend that you not modify or control this underlying HPA object, as it will be deleted when EHPA is removed.

Preview

Under the Preview policy, EHPA does not automatically execute scaling. Instead, you can observe the number of replicas calculated by EHPA through the desiredReplicas field. You can switch between the two modes at any time. In Preview mode, you can adjust the number of application replicas through spec.specificReplicas. If spec.specificReplicas is empty, the application is not scaled, but the number of replicas is still calculated.
The following is an example of an EHPA template configured in Preview mode:
apiVersion: autoscaling.crane.io/v1alpha1
kind: EffectiveHorizontalPodAutoscaler
spec:
  scaleStrategy: Preview   # scaleStrategy is the policy for scaling the target; the value can be "Auto" or "Preview".
  specificReplicas: 5      # specificReplicas specifies the target number of replicas.
status:
  expectReplicas: 4        # expectReplicas is the calculated number of replicas, based on prediction metrics or spec.specificReplicas.
  currentReplicas: 4       # currentReplicas is the actual number of replicas of the target.
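
The difference between the two policies can be summarized as a small decision function. This is a sketch of the behavior described in this section, not Crane's controller code. The function name `replicas_to_apply` is hypothetical, and ignoring `specific_replicas` under the Auto policy is an assumption.

```python
from typing import Optional

def replicas_to_apply(scale_strategy: str,
                      calculated: int,
                      specific_replicas: Optional[int] = None) -> Optional[int]:
    """Return the replica count to apply to the target workload,
    or None when no scaling should be performed."""
    if scale_strategy == "Auto":
        # Auto: the calculated value is applied automatically.
        # Assumption: specific_replicas plays no role under this policy.
        return calculated
    if scale_strategy == "Preview":
        # Preview: the calculated value is only reported; the target is
        # scaled only if specific_replicas is set.
        return specific_replicas
    raise ValueError(f"unknown scale strategy: {scale_strategy!r}")

print(replicas_to_apply("Auto", calculated=4))
print(replicas_to_apply("Preview", calculated=4, specific_replicas=5))
print(replicas_to_apply("Preview", calculated=4))
```

The last call returns `None`, which corresponds to the case where spec.specificReplicas is empty: the replica count is still calculated, but nothing is applied to the application.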


