Event Name | Event Description | Handling Method and Recommendation |
Automatic Disk Capacity Expansion | High disk utilization triggers automatic expansion. | Evaluate whether to upgrade the disk specifications. |
Dynamic disk message retention policy | High disk utilization triggers the dynamic retention policy. | Evaluate whether to upgrade the disk specifications. |
Kafka version upgrade | The instance is upgraded to a newer version. | Check whether the version upgrade was initiated through a cloud API call or the console. |
AZ change | The instance is changed to a different AZ. | Check whether the AZ change was initiated through a cloud API call or the console. |
Event Name | Event Description | Handling Method and Recommendation |
Automatic Partition Balancing | The instance automatically triggers partition migration to balance the load. | Pay attention to migration latency and avoid execution during peak business hours. |
Manual partition balancing | Partition leader reassignment is manually initiated. | Verify traffic balance after the operation. |
Kernel minor version upgrade | Emergency update to fix security vulnerabilities or feature defects. | Check version compatibility, and check whether the upgrade was initiated through a cloud API call. |
Upgrade | Instance specifications, such as disk, bandwidth, and partition, are upgraded. | It is recommended to follow up and confirm the cost changes. |
Downgrade | The instance specifications are downgraded to reduce costs. | It is recommended to monitor the impact of the downgrade on the business and re-evaluate the specifications if needed. |
Public Network Bandwidth Adjustment | The upper limit of public network bandwidth is modified. | Check whether the bandwidth change was initiated through a cloud API call. If an error occurs, it is recommended to troubleshoot accordingly. |
Routing Policy Change | The routing rule for Virtual Private Cloud (VPC) network access is adjusted. | If an error occurs, verify the producer/consumer connectivity. |
ACL Policy Change | An access control rule is added or deleted at the topic or IP level. | If an error occurs, it is recommended to check the client logs. |
User addition/deletion | A Simple Authentication and Security Layer (SASL) authentication account is created or deleted. | It is recommended to update the authentication credentials on the producers and consumers at the same time. |
Enable Elastic Bandwidth | Automatic scaling of bandwidth based on traffic peaks is enabled. | It is recommended to follow up on cost changes. |
Turn off elastic bandwidth capacity | Automatic scaling of bandwidth is disabled. | It is recommended to refer to the peak traffic of the last 7 days and manually set the bandwidth upper limit. |
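
The recommendation for disabling elastic bandwidth (refer to the peak traffic of the last 7 days, then set the upper limit manually) can be sketched in Python. This is a minimal illustration, not part of the product: the function name `suggest_bandwidth_limit`, the sample format, and the `headroom` safety factor are all assumptions, and the traffic samples would have to come from your own monitoring export.

```python
from datetime import datetime, timedelta


def suggest_bandwidth_limit(samples, now, window_days=7, headroom=1.2):
    """Return (peak, suggested_limit) for (timestamp, mbps) samples in the window.

    headroom is a hypothetical safety factor, not a documented product value.
    """
    cutoff = now - timedelta(days=window_days)
    # Keep only samples that fall inside the look-back window.
    recent = [mbps for ts, mbps in samples if cutoff <= ts <= now]
    if not recent:
        raise ValueError("no traffic samples in the window")
    peak = max(recent)
    return peak, peak * headroom


# Made-up sample data for illustration only.
now = datetime(2024, 1, 8)
samples = [
    (datetime(2023, 12, 30), 900.0),  # older than 7 days, ignored
    (datetime(2024, 1, 3), 120.0),
    (datetime(2024, 1, 6), 200.0),
]
peak, limit = suggest_bandwidth_limit(samples, now)
print(f"7-day peak: {peak} Mbps, suggested upper limit: {limit:.0f} Mbps")
```

Adding headroom above the observed peak is a design choice rather than a rule from this document; pick a factor that matches how bursty your traffic is.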
Event Name | Event Description | Handling Method and Recommendation |
Scheduled instance diagnosis | The system periodically performs instance health checks, covering network, disk, and broker status. | When an exception is detected, view the diagnostic report and perform targeted handling. |
Instant instance diagnosis | The user manually triggers a real-time health check. | When an exception is detected, view the diagnostic report and perform targeted handling. |
Event Name | Event Description | Handling Method and Recommendation |
Consumer group member heartbeat timeout | A heartbeat timeout occurs on a consumer. | Check whether the consumer is functioning normally. Occasional heartbeat timeouts may be caused by fluctuations in consumer group membership. If heartbeat timeouts occur frequently, check whether message processing in your business logic is blocked; if it is, fix the downstream blocking points. Alternatively, try adjusting the instance configuration. |
Consumer group member update | A consumer group member is updated. | It is recommended to check the changes in consumer group members, such as whether there are newly released or added members. |
Consumer group rebalancing | A consumer group rebalance occurs. | Occasional consumer group rebalances may be caused by normal fluctuations. If rebalances persist or occur frequently, troubleshoot further to prevent an impact on consumption: (1) check whether heartbeat timeouts occur in the consumer group and handle them; (2) check whether consumers are frequently created or deleted; (3) troubleshoot other consumer group events. |
Cluster node launch | A node comes online in the cluster. | First check in Event Center whether any configuration change events have occurred. While a configuration change is in progress, cluster nodes going online or offline is normal and requires no special attention. Occasional node online/offline events may be caused by fluctuations of the underlying servers and also require no special attention. If node online events keep occurring, production or consumption is persistently affected, and the cluster does not recover for an extended period, contact us. |
Cluster node removal | A node goes offline in the cluster. | First check in Event Center whether any configuration change events have occurred. While a configuration change is in progress, cluster nodes going online or offline is normal and requires no special attention. Occasional node online/offline events may be caused by fluctuations of the underlying servers and also require no special attention. If node offline events keep occurring, production or consumption is persistently affected, and the cluster does not recover for an extended period, contact us. |
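
For the heartbeat timeout and rebalancing events in the table above, one thing worth checking on the client side is whether the consumer's timing settings fit together. The sketch below is an assumption-laden illustration: this document does not name any client parameters, so it uses the standard Apache Kafka consumer settings `session.timeout.ms`, `heartbeat.interval.ms`, and `max.poll.interval.ms`, and the helper `check_consumer_timeouts` is hypothetical.

```python
def check_consumer_timeouts(cfg, max_processing_ms):
    """Return warnings for a dict of Kafka consumer settings (values in ms).

    max_processing_ms is your own estimate of the slowest time between two polls.
    """
    warnings = []
    session = cfg["session.timeout.ms"]
    heartbeat = cfg["heartbeat.interval.ms"]
    max_poll = cfg["max.poll.interval.ms"]
    # Kafka's guidance: heartbeat interval should be at most 1/3 of the session timeout.
    if heartbeat * 3 > session:
        warnings.append(
            "heartbeat.interval.ms should be at most 1/3 of session.timeout.ms"
        )
    # If processing between polls exceeds max.poll.interval.ms, the consumer
    # is considered failed and the group rebalances.
    if max_processing_ms >= max_poll:
        warnings.append(
            "processing time between polls exceeds max.poll.interval.ms"
        )
    return warnings


# Illustrative values only, not recommendations.
risky = {
    "session.timeout.ms": 10000,
    "heartbeat.interval.ms": 5000,
    "max.poll.interval.ms": 300000,
}
for message in check_consumer_timeouts(risky, max_processing_ms=400000):
    print("warning:", message)
```

If the second warning applies, the blocked message processing mentioned in the heartbeat timeout row is the more likely root cause, so fixing the slow downstream step usually comes before raising the timeout.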
Event Chinese Name | Event Name | Event Description | Handling Method and Recommendation |
Leader switch | Leader switch | The leader replica of a partition is transferred (may be triggered by configuration changes, replica balancing, disaster recovery failover, planned maintenance, or broker downtime). | Occasional leader switches are normal and require no special attention. If leader switch events persist, and production/consumption is continuously affected without recovery for an extended period, you can try restarting the client. |

