Service List

Last updated: 2024-10-30 10:40:57

The service list displays the services installed on the cluster, along with their health status, configuration status, version information, and more. It also provides service Ops and management tools, including general service operations and service-specific command-based operations.
Service Health Status
The service health status indicates whether a service is currently operating normally. It is aggregated from the health statuses of the service's role instances.
There are four main types of service health statuses: Good, At Risk, Unavailable, and Unknown or Undetected. Each status type is displayed with a different color.
Service Health Status	Status Description	Status Aggregation Rule
Green: Good	The service is running normally.	All role instances have a health status of Good.
Orange: At Risk	The service is available, but some role instances are unavailable or at risk and need attention.	Some instances of a specific role within this component are either unavailable or at risk. For example, HDFS has one NameNode role instance and two DataNode role instances. One DataNode instance is unavailable, while the other DataNode instance and the NameNode instance are healthy, resulting in an At Risk health status for HDFS.
Red: Unavailable	The service is unavailable. All instances of a certain role are in an unhealthy status and require immediate handling.	All instances of a particular role in this component are in an unhealthy status. For example, HDFS has one NameNode role instance and two DataNode role instances. Both DataNode instances are unavailable, while the NameNode instance is healthy, resulting in an Unavailable health status for HDFS.
Gray: Unknown or Undetected	The service health status is unknown or has not been detected. Components without processes are marked as Undetected. Components with processes are also marked as Undetected if they enter maintenance mode or have stopped. If the health status information of a role instance cannot be obtained correctly, it is marked as Unknown. If your business is not affected, no further attention is needed.	1. No role instance of this component is at risk or unavailable, and at least one role instance has an unknown health status. For example, HDFS has one NameNode role instance and two DataNode role instances. One DataNode role instance has an unknown health status, while the other DataNode role instance and the NameNode role instance have a good health status, so the health status of HDFS is Unknown. 2. All role instances of this service have an undetected health status. When all role instances of the service enter maintenance mode or have stopped, their health status is not detected. 3. The component has no processes (for example, Iceberg, Hudi, and Flink), so its health status is not detected.
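The aggregation rules above can be read as a small decision procedure. The following is a minimal sketch of that reading, not the console's actual implementation. The `Health` enum, the `aggregate_service_health` function, and the input shape (a mapping from role name to a list of instance statuses) are all hypothetical names introduced for illustration. It also assumes that "unhealthy" in the Unavailable rule means an instance status of Unavailable, and that a mix of Good and Undetected instances counts as Good, which the rules above do not spell out.

```python
from enum import Enum


class Health(Enum):
    GOOD = "good"
    AT_RISK = "at_risk"
    UNAVAILABLE = "unavailable"
    UNKNOWN = "unknown"
    UNDETECTED = "undetected"


def aggregate_service_health(roles):
    """Aggregate a service status from {role_name: [Health, ...]}.

    Hypothetical helper: the rule order below is one interpretation
    of the documented aggregation rules.
    """
    instances = [h for statuses in roles.values() for h in statuses]

    # Gray (Undetected): no processes at all, or every instance undetected.
    if not instances or all(h is Health.UNDETECTED for h in instances):
        return Health.UNDETECTED

    # Red: every instance of at least one role is unavailable.
    # Assumption: "unhealthy" is treated as Health.UNAVAILABLE only.
    for statuses in roles.values():
        if statuses and all(h is Health.UNAVAILABLE for h in statuses):
            return Health.UNAVAILABLE

    # Orange: some instance is unavailable or at risk, but no role is fully down.
    if any(h in (Health.UNAVAILABLE, Health.AT_RISK) for h in instances):
        return Health.AT_RISK

    # Gray (Unknown): nothing at risk or unavailable, at least one unknown.
    if any(h is Health.UNKNOWN for h in instances):
        return Health.UNKNOWN

    # Green. Assumption: remaining mixes of Good and Undetected count as Good.
    return Health.GOOD
```

Applied to the HDFS examples in the table, one healthy NameNode with one unavailable and one healthy DataNode yields At Risk, and the same NameNode with two unavailable DataNodes yields Unavailable.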
Service Operations
Common service operations include restarting, starting, and pausing a service, entering or exiting maintenance mode, and viewing ports. Command-based service operations include the HDFS NameNode primary/secondary switch, HDFS data balancing, the YARN ResourceManager primary/secondary switch, and YARN queue refresh. The operations are described as follows:
Service Operation	Description
HDFS NameNode primary/secondary switch	Abbreviated as NN primary/secondary switch. It switches the current Active NameNode to Standby status and the previously Standby NameNode to Active status.
HDFS data balancing	Usually executed when new DataNodes are added. This operation distributes data evenly, avoiding hotspot issues and ensuring a more balanced read/write load across the cluster.
HDFS management status switch	Only supports switching the DataNode to Maintenance Status (IN_MAINTENANCE). This feature is typically used when a DataNode needs to be temporarily taken offline without migrating data. Currently, this feature is supported in Hadoop 3.x and later versions. For detailed directions, see Best Practices for HDFS DataNode Maintenance Status Switching.
YARN ResourceManager primary/secondary switch	Abbreviated as RM primary/secondary switch. It switches the current Active ResourceManager to Standby status and the previously Standby ResourceManager to Active status. The RM primary/secondary switch is only allowed when yarn.resourcemanager.ha.automatic-failover.enabled is disabled. If the RM primary/secondary switch option is not displayed in the YARN card operations dropdown list, locate yarn.resourcemanager.ha.automatic-failover.enabled in YARN Configuration Management - yarn-site.xml and disable it.
YARN queue refresh	When content is added or updated in capacity-scheduler.xml or fair-scheduler.xml, this operation makes the changes take effect in ResourceManager. Note: Do not delete active queues defined in capacity-scheduler.xml or fair-scheduler.xml.
Using Ranger for modifying metabase	When changing the underlying database of Ranger, you need to modify the conf/install.properties file and execute the setup.sh script locally. This operation provides a one-click metabase configuration feature, preventing service issues caused by incomplete configuration changes when modifying the Ranger metabase address. This operation currently supports only MySQL databases, and the Test Connection feature tests only the connection of the administrator user. This operation synchronizes the database information to the local ranger-admin-site.xml configuration file, but it does not update the content of ranger-admin-site.xml in configuration management. If you later modify and publish ranger-admin-site.xml through the configuration management page, the database information may be overwritten, causing exceptions.
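Two of the conditions in the table above can be checked offline against exported configuration files before you use the console: whether automatic failover is disabled (required for the RM primary/secondary switch), and whether a scheduler edit would delete a queue (which the queue refresh note warns against). The following is a minimal sketch under several assumptions. The function names `load_hadoop_props`, `manual_rm_switch_allowed`, and `removed_queues` are hypothetical, the files are assumed to be in the standard Hadoop `<configuration>` XML layout, and the queue check covers only capacity-scheduler.xml (via its `yarn.scheduler.capacity.<queue-path>.queues` properties), not fair-scheduler.xml. It cannot tell whether a removed queue is currently active.

```python
import xml.etree.ElementTree as ET


def load_hadoop_props(xml_text):
    """Parse Hadoop-style <configuration> XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def manual_rm_switch_allowed(yarn_site_xml):
    """Return True only if automatic failover is explicitly set to false."""
    props = load_hadoop_props(yarn_site_xml)
    value = props.get("yarn.resourcemanager.ha.automatic-failover.enabled")
    # Assumption: a missing property is treated as "not disabled".
    return value is not None and value.lower() == "false"


def _queue_paths(props):
    """Collect full queue paths such as 'root.default' from *.queues properties."""
    prefix, suffix = "yarn.scheduler.capacity.", ".queues"
    paths = set()
    for name, value in props.items():
        if name.startswith(prefix) and name.endswith(suffix):
            parent = name[len(prefix):-len(suffix)]
            for child in value.split(","):
                if child.strip():
                    paths.add(f"{parent}.{child.strip()}")
    return paths


def removed_queues(old_capacity_xml, new_capacity_xml):
    """List queue paths present in the old file but missing from the new one."""
    old_paths = _queue_paths(load_hadoop_props(old_capacity_xml))
    new_paths = _queue_paths(load_hadoop_props(new_capacity_xml))
    return sorted(old_paths - new_paths)
```

A non-empty result from `removed_queues` is a signal to review the edit before running the queue refresh, since the change would delete a queue that may still be active.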
To perform a service operation, follow these steps:
1. Log in to the EMR console and click the corresponding Cluster ID/Name in the cluster list to enter the cluster details page.
2. On the cluster details page, select Cluster Services, and then select the component card you want to operate on.
3. For example, to perform an HDFS NN primary/secondary switch, open the operations dropdown list on the HDFS component card in Cluster Services and select NN Failover.
Service Pause Methods List
The supported pause methods for each service component are as follows:
Component	Service	Pause Method	Description	Remarks
HDFS	NameNode	Quick pause	Directly stops service	-
HDFS	DataNode	Quick pause	Directly stops service	-
HDFS	JournalNode	Quick pause	Directly stops service	-
HDFS	zkfc	Quick pause	Directly stops service	-
YARN	ResourceManager	Quick pause	Directly stops service	-
YARN	NodeManager	Quick pause	Directly stops service	-
YARN	JobHistoryServer	Quick pause	Directly stops service	-
YARN	TimeLineServer	Quick pause	Directly stops service	-
HBASE	HbaseThrift	Quick pause	Directly stops service	-
HBASE	HMaster	Quick pause	Directly stops service	-
HBASE	RegionServer	Quick pause	Directly stops service	-
HBASE	RegionServer	Safe pause	Before RegionServer is stopped, the regions on it will be migrated.	Supports setting thread concurrency.
HIVE	HiveMetaStore	Quick pause	Directly stops service	-
HIVE	HiveServer2	Quick pause	Directly stops service	-
HIVE	HiveWebHcat	Quick pause	Directly stops service	-
PRESTO	PrestoCoordinator	Quick pause	Directly stops service	-
PRESTO	PrestoWorker	Quick pause	Directly stops service	-
ZOOKEEPER	QuorumPeerMain	Quick pause	Directly stops service	-
SPARK	SparkJobHistoryServer	Quick pause	Directly stops service	-
HUE	Hue	Quick pause	Directly stops service	-
OOZIE	Oozie	Quick pause	Directly stops service	-
STORM	Nimbus	Quick pause	Directly stops service	-
STORM	Supervisor	Quick pause	Directly stops service	-
STORM	Logviewer	Quick pause	Directly stops service	-
STORM	Ui	Quick pause	Directly stops service	-
RANGER	Ranger	Quick pause	Directly stops service	-
ALLUXIO	AlluxioMaster	Quick pause	Directly stops service	-
ALLUXIO	AlluxioWorker	Quick pause	Directly stops service	-
GANGLIA	Httpd	Quick pause	Directly stops service	-
GANGLIA	Gmetad	Quick pause	Directly stops service	-
GANGLIA	Gmond	Quick pause	Directly stops service	-

