Scenarios
During a microservice release, instance changes can cause request errors or traffic interruptions. Spring Cloud Tencent provides a plugin for lossless (graceful) deployment and shutdown. It works as follows:
Graceful deployment: After the service starts, it waits until it is fully ready before registering with the service registry and serving traffic (when no health check endpoint is configured, registration is delayed by a fixed interval instead). Combined with the Kubernetes lifecycle, the rolling update moves on to the next node only after that.
Graceful shutdown: Before the service stops, it first deregisters from the service registry and rejects new requests, then waits for in-flight requests to complete before going offline.
Lossless Deployment
Option 1: Service Readiness/Delayed Registration
In some scenarios, a service loads resources lazily or asynchronously after startup. For example, a service may need to fetch data or files from Cloud Object Storage (COS) and can serve requests only after they have been fetched. If the service registers as soon as the application starts, calls to it fail because it is not actually ready. Registering the service and exposing it to callers only after it is ready ensures a smooth, lossless launch.
Polaris supports the following two service readiness/delayed registration scenarios:
Scenario 1: The service exposes a health check endpoint. Service registration occurs only after the endpoint is successfully probed.
Scenario 2: If the service does not expose a health check endpoint, service registration is delayed for a period. The default delayed registration duration is 30 seconds, which can be customized via configuration.
Operation Steps
Step 2: Add the Spring Cloud Tencent lossless launch/shutdown plugin dependency.
Add dependencies in pom.xml:
<dependencies>
    <dependency>
        <groupId>com.tencent.cloud</groupId>
        <artifactId>spring-cloud-tencent-lossless-plugin</artifactId>
    </dependency>
</dependencies>
Step 3: Add lossless launch/shutdown related configuration items in the application's Polaris configuration file. The specific parameters are as follows:
| Configuration Item | Default Value | Required | Description |
|---|---|---|---|
| spring.cloud.polaris.lossless.enabled | false | No | Switch for lossless launch and graceful shutdown. |
| spring.cloud.polaris.lossless.health-check-path | None | No | Health check endpoint of the business application. |
| spring.cloud.polaris.lossless.delay-register-interval | 30000 | No | Registration delay, in ms, used when no health check endpoint is configured for the business application. |
| spring.cloud.polaris.lossless.health-check-interval | 5000 | No | Health check interval, in ms, used when a health check endpoint is configured for the business application. |
| spring.cloud.polaris.admin.host | 0.0.0.0 | No | IP address that the lossless launch/shutdown service binds to. |
| spring.cloud.polaris.admin.port | 28080 | No | Port that the lossless launch/shutdown service binds to. The port is opened only when monitoring reporting (non-pushgateway mode) or the graceful shutdown feature is enabled. |
Configuration example:
Configure in the bootstrap.yml file located in the project's main/resources directory:
spring:
  application:
    name: ${application.name}
  cloud:
    polaris:
      enabled: true
      address: grpc://${replace with the Polaris service address}:8091
      namespace: default
      # Configuration for lossless launch/shutdown
      lossless:
        enabled: true
        health-check-path: /
        delay-register-interval: 300000
        health-check-interval: 10000
      admin:
        host: 0.0.0.0
        port: 28080
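For Scenario 2 (the service exposes no health check endpoint), a variant of this configuration might omit health-check-path so that registration relies on the fixed delay alone. The fragment below is a minimal sketch derived from the parameter table; the 60000 ms value is an illustrative assumption, not a recommended setting.

```yaml
spring:
  cloud:
    polaris:
      lossless:
        enabled: true
        # health-check-path is intentionally not set, so registration is
        # delayed by a fixed interval instead of waiting for a probe.
        # 60000 ms is an assumed example value; the documented default is 30000 ms.
        delay-register-interval: 60000
      admin:
        host: 0.0.0.0
        port: 28080
```

Choose a delay that comfortably covers the time the service needs to finish loading its resources, since nothing verifies readiness in this mode.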
Option 2: Service Registration Readiness Check
Kubernetes provides a readiness check mechanism that health-checks an instance before marking it ready. A common setup treats the application as ready as soon as its port is listening. However, there is a gap between the port opening and service registration completing. During a rolling update, the new instance may not yet be registered when the old instance is taken offline and the next node begins deployment, which ultimately causes errors in consumer-side invocations.
TSE Polaris exposes an endpoint and port that report the service registration status, so that it can cooperate with Kubernetes readiness checks. When registration is complete, the endpoint returns a 200 status code and Kubernetes treats the application as ready; when registration is incomplete, it returns a 500 status code and Kubernetes treats the application as not ready. During a rolling update in particular, Kubernetes waits until the current instances are ready before updating the next node.
Operation Steps:
Step 2: Add the Spring Cloud Tencent lossless launch/shutdown plugin dependency and configure the related settings for lossless launch/shutdown. For details, see Option 1.
Step 3: Configure the readiness check in a K8s application deployment platform such as TKE, as shown in the following figure.
Path: /online
Port: 28080
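Outside a console such as TKE, the same readiness check could be declared directly in the workload manifest. The fragment below is a sketch of a pod template using the standard Kubernetes readinessProbe fields. The container name, image, and timing values are assumptions chosen for illustration; only the /online path and port 28080 come from this document.

```yaml
# Fragment of a Deployment's pod template (spec.template.spec)
spec:
  containers:
    - name: my-service                        # hypothetical container name
      image: my-registry/my-service:latest    # hypothetical image
      readinessProbe:
        httpGet:
          path: /online          # 200 once registration is complete, 500 otherwise
          port: 28080            # value of spring.cloud.polaris.admin.port
        initialDelaySeconds: 10  # assumed value
        periodSeconds: 5         # assumed value
        failureThreshold: 3      # assumed value
```

With a probe like this, the pod is added to the ready set only after the registration status endpoint reports success, so a rolling update does not advance on a merely open port.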
Lossless Shutdown
During a rolling release or shutdown, there is a time gap between a provider instance deregistering from the registry and the calling services refreshing their instance list from the registry. During this gap, requests may still be routed to the instance that is going offline, causing request failures.
TSE Polaris provides a graceful shutdown endpoint, /offline, which integrates with the Kubernetes lifecycle to achieve lossless service shutdown. The overall process is as follows:
Operation Steps:
Step 2: Add the Spring Cloud Tencent lossless launch/shutdown plugin dependency and configure the related settings for lossless launch/shutdown. For details, see Option 1.
Step 3: Configure the preStop lifecycle check in a K8s application deployment platform such as TKE.
preStop command:
curl -X PUT http://localhost:28080/offline && sleep 20
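In a raw Kubernetes manifest, the preStop command could be expressed with the standard lifecycle hook fields, roughly as sketched below. The container name, image, and the 60-second grace period are assumptions for illustration; the command itself is the one given in this document.

```yaml
# Fragment of a Deployment's pod template (spec.template.spec)
spec:
  # Assumed value. The preStop hook runs within the grace period, so it must
  # exceed the 20 s sleep plus the application's own shutdown time.
  terminationGracePeriodSeconds: 60
  containers:
    - name: my-service                        # hypothetical container name
      image: my-registry/my-service:latest    # hypothetical image
      lifecycle:
        preStop:
          exec:
            command:
              - /bin/sh
              - -c
              # Deregister first, then wait for consumers to refresh their instance lists.
              - curl -X PUT http://localhost:28080/offline && sleep 20
```

This form requires a shell and curl to be present in the container image. The Kubernetes default grace period is 30 seconds, which leaves little headroom after a 20-second sleep, so raising it as shown may be worth considering.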