YARN Monitoring Metrics
Last updated: 2026-01-13 17:37:54

YARN - overview

| Title | Metric | Unit | Description |
| --- | --- | --- | --- |
| Nodes | NumActiveNMs | - | Number of live NodeManagers |
| | NumDecommissionedNMs | - | Number of decommissioned NodeManagers |
| | NumLostNMs | - | Number of lost NodeManagers |
| | NumUnhealthyNMs | - | Number of unhealthy NodeManagers |
| CPU cores | AllocatedVCores | - | Number of allocated VCores in the current queue |
| | ReservedVCores | - | Number of reserved VCores in the current queue |
| | AvailableVCores | - | Number of available VCores in the current queue |
| | PendingVCores | - | Number of VCores in pending resource requests in the current queue |
| Total applications | AppsSubmitted | - | Number of submitted jobs in the current queue |
| | AppsRunning | - | Number of running jobs in the current queue |
| | AppsPending | - | Number of pending jobs in the current queue |
| | AppsCompleted | - | Number of completed jobs in the current queue |
| | AppsKilled | - | Number of killed jobs in the current queue |
| | AppsFailed | - | Number of failed jobs in the current queue |
| | ActiveApplications | - | Number of active jobs in the current queue |
| | running_0 | - | Number of running jobs in the current queue that have run for less than 60 minutes |
| | running_60 | - | Number of running jobs in the current queue that have run for 60–300 minutes |
| | running_300 | - | Number of running jobs in the current queue that have run for 300–1,440 minutes |
| | running_1440 | - | Number of running jobs in the current queue that have run for more than 1,440 minutes |
| Memory size | AllocatedMB | MB | Amount of allocated memory in the current queue |
| | AvailableMB | MB | Amount of available memory in the current queue |
| | PendingMB | MB | Amount of memory in pending resource requests in the current queue |
| | ReservedMB | MB | Amount of reserved memory in the current queue |
| Containers | AllocatedContainers | - | Number of allocated containers in the current queue |
| | PendingContainers | - | Number of containers in pending resource requests in the current queue |
| | ReservedContainers | - | Number of reserved containers in the current queue |
| Total allocated/released containers | AggregateContainersAllocated | - | Total number of allocated containers in the current queue |
| | AggregateContainersReleased | - | Total number of released containers in the current queue |
| Users | ActiveUsers | - | Number of active users in the current queue |
| Memory | allocatedMB | MB | Amount of allocated memory in the cluster |
| | availableMB | MB | Amount of available memory in the cluster |
| | reservedMB | MB | Amount of reserved memory in the cluster |
| | totalMB | MB | Total amount of memory in the cluster |
| Applications | completed | - | Number of completed jobs in the cluster during the statistical period |
| | failed | - | Number of failed jobs in the cluster during the statistical period |
| | killed | - | Number of killed jobs in the cluster during the statistical period |
| | pending | - | Number of pending jobs in the cluster during the statistical period |
| | running | - | Number of running jobs in the cluster during the statistical period |
| | submitted | - | Number of submitted jobs in the cluster during the statistical period |
| Containers | containersAllocated | - | Number of allocated containers in the cluster |
| | containersPending | - | Number of pending containers in the cluster |
| | containersReserved | - | Number of reserved containers in the cluster |
| Memory utilization | usageRatio | % | Current memory utilization of the cluster |
| Memory Utilization Size | configMemRatioMax_queue | % | Maximum proportion of memory that can be allocated to the queue |
| | configMemRatio_queue | % | Proportion of memory allocated to the queue |
| The proportion of the memory to the cluster | configMemRatio_cluster | % | Memory allocated to the queue as a proportion of the cluster memory |
| | configMemMaxRatio_cluster | % | Maximum memory allocated to the queue as a proportion of the cluster memory |
| | usedMemRatio_cluster | % | Memory used by the queue as a proportion of the cluster memory |
| Cores | allocatedVirtualCores | - | Number of allocated CPU cores in the cluster |
| | availableVirtualCores | - | Number of available CPU cores in the cluster |
| | reservedVirtualCores | - | Number of reserved CPU cores in the cluster |
| | totalVirtualCores | - | Total number of CPU cores in the cluster |
| CPU utilization | usageRatio | % | Current CPU utilization of the cluster |
| CPU Utilization Size | configVCoresRatioMax_queue | % | Maximum proportion of CPU that can be allocated to the queue |
| | configVCoresRatio_queue | % | Proportion of CPU allocated to the queue |
| The proportion of the CPU to the cluster | configVCoresRatio_cluster | % | CPU allocated to the queue as a proportion of the cluster CPU |
| | configVCoresMaxRatio_cluster | % | Maximum CPU allocated to the queue as a proportion of the cluster CPU |
| | usedVCoresRatio_cluster | % | CPU used by the queue as a proportion of the cluster CPU |
| Launched AMs | AMLaunchDelayNumOps | - | Number of launched AMs |
| Average time for RM to launch AM | AMLaunchDelayAvgTime | ms | Average time for RM to launch AM |
| Total registered AMs | AMRegisterDelayNumOps | - | Total number of registered AMs |
| Average time for AM to register with RM | AMRegisterDelayAvgTime | ms | Average time for AM to register with RM |
| Queue CPU utilization | YARN.RM.QUEUE.VCORES.RATIO | - | Utilization of the CPU allocated to the current queue |
| Queue memory utilization | YARN.RM.QUEUE.MEM.RATIO | - | Utilization of the memory allocated to the current queue |
| The percentage of available memory resource | availableMemPercentage | % | Percentage of memory resources currently available in the cluster |
| The percentage of pending container | containerPendingRatio | % | Percentage of pending containers |
| The percentage of available CPU | availableCoresPercentage | % | Percentage of available CPU |
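
Several overview metrics are percentages or bucketed counts derived from the raw counters. The following is a minimal sketch of how such values could be computed. The formulas (allocated or available divided by total) and the handling of exact bucket boundaries are assumptions, since this page does not state them, and the function names are hypothetical.

```python
from bisect import bisect_right

# Lower bounds (in minutes) of the running_* buckets listed in the table.
BUCKET_BOUNDS = [0, 60, 300, 1440]


def percent(part: float, total: float) -> float:
    """Return part/total as a percentage, or 0.0 when total is zero."""
    return 0.0 if total == 0 else 100.0 * part / total


def cluster_ratios(m: dict) -> dict:
    """Derive percentage metrics from raw cluster counters (assumed formulas)."""
    return {
        "usageRatio": percent(m["allocatedMB"], m["totalMB"]),
        "availableMemPercentage": percent(m["availableMB"], m["totalMB"]),
        "availableCoresPercentage": percent(
            m["availableVirtualCores"], m["totalVirtualCores"]
        ),
    }


def runtime_buckets(elapsed_minutes: list) -> dict:
    """Count running jobs per running_* bucket by elapsed time in minutes.

    Assumption: a job exactly on a boundary (for example 60 minutes)
    is counted in the bucket that starts at that boundary.
    """
    counts = {f"running_{b}": 0 for b in BUCKET_BOUNDS}
    for minutes in elapsed_minutes:
        # Index of the last bound <= minutes, clamped so negatives fall in running_0.
        idx = max(bisect_right(BUCKET_BOUNDS, minutes) - 1, 0)
        counts[f"running_{BUCKET_BOUNDS[idx]}"] += 1
    return counts
```

For example, a cluster sample with `allocatedMB=2048` and `totalMB=8192` gives a `usageRatio` of 25.0 under these assumptions.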

YARN - ResourceManager

| Title | Metric | Unit | Description |
| --- | --- | --- | --- |
| RPC authentications/authorizations | RpcAuthenticationFailures | - | Number of failed RPC authentications |
| | RpcAuthenticationSuccesses | - | Number of successful RPC authentications |
| | RpcAuthorizationFailures | - | Number of failed RPC authorizations |
| | RpcAuthorizationSuccesses | - | Number of successful RPC authorizations |
| Data received/sent by RPC | ReceivedBytes | bytes/s | Amount of data received by RPC |
| | SentBytes | bytes/s | Amount of data sent by RPC |
| RPC connections | NumOpenConnections | - | Current number of open connections |
| RPC requests | RpcProcessingTimeNumOps | - | Number of RPC requests |
| | RpcQueueTimeNumOps | - | Number of RPC requests |
| RPC queue length | CallQueueLength | - | Length of the current RPC queue |
| Average RPC processing time | RpcProcessingTimeAvgTime | s | Average RPC request processing time |
| | RpcQueueTimeAvgTime | s | Average time an RPC request spends in the queue |
| GC count | YGC | - | Young GC count |
| | FGC | - | Full GC count |
| GC time | FGCT | s | Full GC time |
| | GCT | s | Total garbage collection time |
| | YGCT | s | Young GC time |
| Memory zone proportion | S0 | % | Percentage of used Survivor 0 memory |
| | E | % | Percentage of used Eden memory |
| | CCS | % | Percentage of used compressed class space memory |
| | S1 | % | Percentage of used Survivor 1 memory |
| | O | % | Percentage of used Old memory |
| | M | % | Percentage of used Metaspace memory |
| Heap memory utilization | MemHeapUsedRate | % | Heap memory currently used by the JVM as a percentage of the heap memory configured for the JVM |
| JVM threads | ThreadsNew | - | Number of threads in NEW status |
| | ThreadsRunnable | - | Number of threads in RUNNABLE status |
| | ThreadsBlocked | - | Number of threads in BLOCKED status |
| | ThreadsWaiting | - | Number of threads in WAITING status |
| | ThreadsTimedWaiting | - | Number of threads in TIMED_WAITING status |
| | ThreadsTerminated | - | Number of threads in TERMINATED status |
| JVM logs | LogFatal | - | Number of FATAL-level logs |
| | LogError | - | Number of ERROR-level logs |
| | LogWarn | - | Number of WARN-level logs |
| | LogInfo | - | Number of INFO-level logs |
| JVM memory | MemNonHeapUsedM | MB | Non-heap memory size used by the process |
| | MemNonHeapCommittedM | MB | Non-heap memory size committed to the process |
| | MemHeapUsedM | MB | Heap memory size used by the process |
| | MemHeapCommittedM | MB | Heap memory size committed to the process |
| | MemHeapMaxM | MB | Maximum heap memory size available to the process |
| | MemMaxM | MB | Maximum memory size available to the process |
| CPU utilization | ProcessCpuLoad | % | CPU utilization |
| Cumulative CPU usage time | ProcessCpuTime | ms | Cumulative CPU usage time |
| File descriptors | MaxFileDescriptorCount | - | Maximum number of file descriptors |
| | OpenFileDescriptorCount | - | Number of open file descriptors |
| Process execution duration | Uptime | s | Process execution duration |
| Worker threads | DaemonThreadCount | - | Number of daemon threads in the process |
| | ThreadCount | - | Number of threads in the process |
| Node status | haState | 1: Active; 0: Standby | ResourceManager active/standby status |
| Active/Standby switch | switchOccurred | - | Occurrence of a ResourceManager active/standby switch |
| Number of CPU Cores under Tag | AllocatedVCores | Count | Number of VCores allocated to the current Tag |
| Memory size under Tag | AllocatedMB | MB | Memory size allocated to the current Tag |
| Number of audit log writing failures to ES | WriteEsFailed | Count | Number of failed audit log writes to ES |
| Number of audit log writing successes to ES | WriteEsSuccess | Count | Number of successful audit log writes to ES |
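
The GC and memory-zone metric names (S0, S1, E, O, M, CCS, YGC, YGCT, FGC, FGCT, GCT) match the column headers printed by `jstat -gcutil`. Assuming the values are read in that form (this page does not say how they are collected), the sketch below parses one sample into a dictionary keyed by metric name. The sample text is illustrative rather than real data, and `heap_used_rate` assumes that the "configured" heap is the value reported as MemHeapMaxM.

```python
def parse_gcutil(output: str) -> dict:
    """Map jstat -gcutil column names to values from the header and first data row."""
    lines = [ln for ln in output.strip().splitlines() if ln.strip()]
    header, values = lines[0].split(), lines[1].split()
    if len(header) != len(values):
        raise ValueError("header/value column count mismatch")
    # Treat a "-" placeholder as a missing value rather than a number.
    return {k: (None if v == "-" else float(v)) for k, v in zip(header, values)}


def heap_used_rate(used_mb: float, max_mb: float) -> float:
    """MemHeapUsedRate sketch: used heap as a percentage of maximum heap."""
    return 0.0 if max_mb <= 0 else 100.0 * used_mb / max_mb


# Illustrative sample in the jstat -gcutil layout (not real measurements).
SAMPLE = """
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00  96.88  21.50  12.25  97.10  94.20     10    0.125      2    0.500    0.625
"""

if __name__ == "__main__":
    gc = parse_gcutil(SAMPLE)
    print(gc["O"], gc["YGC"], gc["GCT"])
    print(heap_used_rate(512, 2048))
```

In the sample, GCT equals YGCT plus FGCT, which is how `jstat` reports total GC time when only young and full collections are counted.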

YARN - JobHistoryServer

| Title | Metric | Unit | Description |
| --- | --- | --- | --- |
| JVM threads | ThreadsNew | - | Number of threads in NEW status |
| | ThreadsRunnable | - | Number of threads in RUNNABLE status |
| | ThreadsBlocked | - | Number of threads in BLOCKED status |
| | ThreadsWaiting | - | Number of threads in WAITING status |
| | ThreadsTimedWaiting | - | Number of threads in TIMED_WAITING status |
| | ThreadsTerminated | - | Number of threads in TERMINATED status |
| JVM logs | LogFatal | - | Number of FATAL-level logs |
| | LogError | - | Number of ERROR-level logs |
| | LogWarn | - | Number of WARN-level logs |
| | LogInfo | - | Number of INFO-level logs |
| JVM memory | MemNonHeapUsedM | MB | Non-heap memory size used by the process |
| | MemNonHeapCommittedM | MB | Non-heap memory size committed to the process |
| | MemHeapUsedM | MB | Heap memory size used by the process |
| | MemHeapCommittedM | MB | Heap memory size committed to the process |
| | MemHeapMaxM | MB | Maximum heap memory size available to the process |
| | MemMaxM | MB | Maximum memory size available to the process |
| Heap memory utilization | MemHeapUsedRate | % | Heap memory currently used by the JVM as a percentage of the heap memory configured for the JVM |
| GC count | YGC | - | Young GC count |
| | FGC | - | Full GC count |
| GC time | FGCT | s | Full GC time |
| | GCT | s | Total garbage collection time |
| | YGCT | s | Young GC time |
| Memory zone proportion | S0 | % | Percentage of used Survivor 0 memory |
| | E | % | Percentage of used Eden memory |
| | CCS | % | Percentage of used compressed class space memory |
| | S1 | % | Percentage of used Survivor 1 memory |
| | O | % | Percentage of used Old memory |
| | M | % | Percentage of used Metaspace memory |
| CPU utilization | ProcessCpuLoad | % | CPU utilization |
| Cumulative CPU usage time | ProcessCpuTime | ms | Cumulative CPU usage time |
| File descriptors | MaxFileDescriptorCount | - | Maximum number of file descriptors |
| | OpenFileDescriptorCount | - | Number of open file descriptors |
| Process execution duration | Uptime | s | Process execution duration |
| Worker threads | DaemonThreadCount | - | Number of daemon threads in the process |
| | ThreadCount | - | Number of threads in the process |
| Current active connections | numActiveConnections | Count | Current number of active connections |
| Total count of exceptions captured by the Shuffle service | numCaughtExceptions | Count | Total number of exceptions caught by the Shuffle service |
| Direct memory used by the Shuffle service | usedDirectMemory | Bytes | Direct memory used by the Shuffle service |
| Delay in fetching merged block metadata | fetchMergedBlocksMetaLatencyMillis_mean | ms | Average delay in fetching merged block metadata |
| Final stage delay for merging Shuffle data | finalizeShuffleMergeLatencyMillis_mean | ms | Average delay of the final stage of merging Shuffle data |
| Heap memory used by the Shuffle service | usedHeapMemory | Bytes | Heap memory used by the Shuffle service |
| Delay in opening data blocks | openBlockRequestLatencyMillis_mean | ms | Average delay in opening data blocks |
| Current number of registered client connections | numRegisteredConnections | Count | Current number of registered client connections |
| Number of Executors registered to the Shuffle service | registeredExecutorsSize | Count | Number of Executors registered to the Shuffle service |
| Executor registration request latency | registerExecutorRequestLatencyMillis_mean | ms | Average Executor registration request latency |
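
Process-level metrics such as OpenFileDescriptorCount and MemHeapUsedRate are most useful when compared against a limit. The sketch below shows one possible check over a single metrics sample. The function names and the 80% and 90% thresholds are hypothetical choices for illustration, not values recommended by this page.

```python
def fd_usage_percent(open_count: int, max_count: int) -> float:
    """Open file descriptors as a percentage of the process limit."""
    return 0.0 if max_count <= 0 else 100.0 * open_count / max_count


def check_process(sample: dict, fd_limit_pct: float = 80.0,
                  heap_limit_pct: float = 90.0) -> list:
    """Return human-readable warnings for one metrics sample.

    The sample is expected to carry the metric names from the table:
    OpenFileDescriptorCount, MaxFileDescriptorCount and MemHeapUsedRate.
    """
    warnings = []
    fd_pct = fd_usage_percent(sample["OpenFileDescriptorCount"],
                              sample["MaxFileDescriptorCount"])
    if fd_pct >= fd_limit_pct:
        warnings.append(f"OpenFileDescriptorCount at {fd_pct:.1f}% of limit")
    if sample["MemHeapUsedRate"] >= heap_limit_pct:
        warnings.append(f"MemHeapUsedRate at {sample['MemHeapUsedRate']:.1f}%")
    return warnings
```

A sample with 900 open descriptors out of a maximum of 1,000 and 50% heap usage yields a single file descriptor warning under these thresholds.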

YARN - Timeline

| Title | Metric | Unit | Description |
| --- | --- | --- | --- |
| JVM GC count | GcCount | count | JVM GC count |
| JVM GC time | GcTimeMillis | ms | JVM GC time |
| JVM memory | MemNonHeapUsedM | MB | Non-heap memory size used by the process |
| | MemNonHeapCommittedM | MB | Non-heap memory size committed to the process |
| | MemNonHeapMaxM | MB | Maximum non-heap memory size available to the process |
| | MemHeapUsedM | MB | Heap memory size used by the process |
| | MemHeapCommittedM | MB | Heap memory size committed to the process |
| | MemHeapMaxM | MB | Maximum heap memory size available to the process |
| Get domain operations | Ops | count | Number of get-domain operations |
| Batch get domains operations | Ops | count | Number of batch get-domains operations |
| Batch get domains average time | Time | ms | Average time of batch get-domains operations |
| Get domain average time | Time | ms | Average time of get-domain operations |
| Batch get entities operations | Ops | count | Number of batch get-entities operations |
| Batch get entities average time | Time | ms | Average time of batch get-entities operations |
| Get entity operations | Ops | count | Number of get-entity operations |
| Get entity average time | Time | ms | Average time of get-entity operations |
| Batch get events operations | Ops | count | Number of batch get-events operations |
| Batch get events average time | Time | ms | Average time of batch get-events operations |
| Batch update entities operations | Ops | count | Number of batch update-entities operations |
| Batch update entities average time | Time | ms | Average time of batch update-entities operations |
| Update domain operations | Ops | count | Number of update-domain operations |
| Update domain average time | Time | ms | Average time of update-domain operations |
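
The Ops metrics are counts, so a per-second rate has to be derived from two successive samples. The sketch below assumes that each Ops value is a cumulative counter that restarts from zero when the process restarts. This page does not confirm that behavior, and `ops_rate` is a hypothetical helper name.

```python
def ops_rate(prev_count: int, curr_count: int, interval_seconds: float) -> float:
    """Operations per second between two samples of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_count - prev_count
    if delta < 0:
        # The counter went backwards, which suggests a process restart;
        # count only the operations recorded since the reset.
        delta = curr_count
    return delta / interval_seconds
```

For example, samples of 100 and 160 taken 60 seconds apart correspond to 1.0 operation per second.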