Tencent Cloud
Tencent Kubernetes Engine

Node Resource Reservation Description (New)

Last updated: 2024-12-26 17:56:57
TKE clusters reserve a portion of each node's resources to run components such as kubelet, kube-proxy, and the container runtime. As a result, a node's total resources and its allocatable resources may differ. This document describes the new node resource reservation algorithm used in TKE clusters running Kubernetes 1.30 and later, to help you set reasonable resource requests and limits for Pods when deploying an application.

Applicable Scope

This algorithm applies to both normal nodes and native nodes in clusters running Kubernetes 1.30 or later.

Definition

Resources used by the entire machine = resources occupied by the system processes that keep the node running + resources occupied by running Pods
Note:
This document focuses on the reservation algorithm for the resources occupied by the system processes that keep the node running.

Node CPU Reservation Rules

Because CPU is a compressible resource and some kernel threads run per CPU, the new algorithm keeps the reservation formula consistent across specifications while providing more usable resources on larger machines. The updated algorithm reserves the following (see the sketch after the list):
6% of the first core.
1% of the next core (up to 2 cores).
0.5% of the next 2 cores (up to 4 cores).
0.25% of any core above 4 cores.
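As an illustration only (not TKE's actual implementation), the following minimal Python sketch applies these rules and reproduces the comparison table below:

def reserved_cpu_cores(total_cores: float) -> float:
    # 6% of the first core
    reserved = 0.06 * min(total_cores, 1)
    # 1% of the next core (up to 2 cores)
    if total_cores > 1:
        reserved += 0.01 * min(total_cores - 1, 1)
    # 0.5% of the next 2 cores (up to 4 cores)
    if total_cores > 2:
        reserved += 0.005 * min(total_cores - 2, 2)
    # 0.25% of any core above 4 cores
    if total_cores > 4:
        reserved += 0.0025 * (total_cores - 4)
    return reserved

# Examples matching the table below:
# reserved_cpu_cores(1)  -> 0.06
# reserved_cpu_cores(16) -> 0.11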
Note that creating too many Pods on a small machine may leave the reserved CPU insufficient, affecting system stability. The following behaviors also consume system resources, and you can use custom parameters to increase the reserved resources appropriately:
Printing excessive container logs (which are piped to containerd and finally compressed and written to disk by kubelet);
Running exec probe commands too frequently (for example, once per second) via runc in a container, which can consume a large amount of CPU;
Deploying other services on nodes (services not managed as Pods consume the reserved node resources, reducing system stability).
The following table compares CPU reservations under the old and new algorithms:
CPU (Cores) | Old Algorithm (Cores) | New Algorithm (Cores)
1           | 0.1                   | 0.06
2           | 0.1                   | 0.07
4           | 0.1                   | 0.08
8           | 0.2                   | 0.09
16          | 0.4                   | 0.11
32          | 0.8                   | 0.15
64          | 1.6                   | 0.23
128         | 2.4                   | 0.39
256         | 3.04                  | 0.71

Node Memory Reservation Rules

Because memory is an incompressible resource, configure the reserved memory with caution. Tests show that memory consumption is closely related to the number of Pods and the machine specification. After adjustment, the new algorithm is min(old algorithm, 20 MiB * number of Pods + 256 MiB). Note that deploying other services on nodes may also consume reserved node resources, reducing node stability; if you need to deploy services not managed as Pods, it is recommended to increase the reserved resources.
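As an illustration only (not TKE's actual implementation), the following minimal Python sketch applies this formula; the old-algorithm values are taken from the comparison table below:

# Old-algorithm reservations per machine memory size, from the table below
OLD_RESERVED_GIB = {1: 0.25, 2: 0.5, 4: 1, 8: 1.8, 16: 2.6,
                    32: 3.56, 64: 5.48, 128: 9.32, 256: 11.88}

def reserved_memory_mib(memory_gib: int, num_pods: int) -> float:
    # New algorithm: min(old algorithm, 20 MiB * number of Pods + 256 MiB)
    old_mib = OLD_RESERVED_GIB[memory_gib] * 1024
    return min(old_mib, 20 * num_pods + 256)

# Example: an 8 GiB node running 32 Pods
# reserved_memory_mib(8, 32) -> min(1843.2, 896) = 896 MiB (about 0.9 GiB)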
The following table compares memory reservations under the old and new algorithms:
Memory (GiB) | Old Algorithm (GiB) | New: 16 Pods (GiB) | New: 32 Pods (GiB) | New: 64 Pods (GiB) | New: 128 Pods (GiB) | New: 256 Pods (GiB)
1            | 0.25                | 0.25               | 0.25               | 0.25               | 0.25                | 0.25
2            | 0.5                 | 0.5                | 0.5                | 0.5                | 0.5                 | 0.5
4            | 1                   | 0.58               | 0.9                | 1                  | 1                   | 1
8            | 1.8                 | 0.58               | 0.9                | 1.54               | 1.8                 | 1.8
16           | 2.6                 | 0.58               | 0.9                | 1.54               | 2.6                 | 2.6
32           | 3.56                | 0.58               | 0.9                | 1.54               | 2.82                | 3.56
64           | 5.48                | 0.58               | 0.9                | 1.54               | 2.82                | 5.38
128          | 9.32                | 0.58               | 0.9                | 1.54               | 2.82                | 5.38
256          | 11.88               | 0.58               | 0.9                | 1.54               | 2.82                | 5.38

FAQs

What Are the Differences Between the New and Old Algorithms?

Old algorithm: reserves fixed amounts of resources based on machine specification. For details, see Resource Reservation Description. The algorithm is relatively conservative and reserves a large amount of resources, leaving fewer resources available for workloads.
New algorithm: derives the reservation formula from large-scale load tests across machine specifications and Pod counts. The new algorithm leaves more resources available for workloads, especially on larger machines.

Why Is kube-reserved Configured but system-reserved Not?

The kubelet computes allocatable node resources as capacity - kubeReserved - systemReserved - evictionHard, which determines the resources available to Pods. According to the Kubernetes documentation, kubeReserved can restrict the resource usage of the kubelet and runtime components, while systemReserved can restrict the resource usage of system services. However, enforcing these limits requires enabling enforce-node-allocatable and specifying kube-reserved-cgroup and system-reserved-cgroup, which imposes additional requirements on the system's cgroup layout. For overall stability, neither the community defaults nor many cloud vendors enable this enforcement, and TKE likewise does not impose resource limits on the kubelet or system components. It therefore makes no practical difference whether the reserved resources are configured entirely in kube-reserved or split between kube-reserved and system-reserved.
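A minimal Python sketch of the allocatable computation, assuming an illustrative evictionHard threshold of 100 MiB (the upstream default for memory.available); it shows that splitting the same total between kube-reserved and system-reserved does not change the result:

def allocatable_memory_mib(capacity_mib: float,
                           kube_reserved_mib: float,
                           system_reserved_mib: float = 0.0,  # left unset on TKE
                           eviction_hard_mib: float = 100.0   # assumed threshold
                           ) -> float:
    # allocatable = capacity - kubeReserved - systemReserved - evictionHard
    return capacity_mib - kube_reserved_mib - system_reserved_mib - eviction_hard_mib

# Same total reservation, same allocatable result:
# allocatable_memory_mib(8192, 896, 0)   -> 7196.0
# allocatable_memory_mib(8192, 640, 256) -> 7196.0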

Does the New Algorithm Support Nodes of Earlier Versions?

Not yet. In the future, the new algorithm will be applied to existing nodes through node upgrades.
