YARN Monitoring Metrics

Last updated: 2023-05-30 11:24:00

    YARN - Overview

    Title | Metric | Unit | Description
    Nodes | NumActiveNMs | - | Number of live NodeManagers
    Nodes | NumDecommissionedNMs | - | Number of decommissioned NodeManagers
    Nodes | NumLostNMs | - | Number of lost NodeManagers
    Nodes | NumUnhealthyNMs | - | Number of unhealthy NodeManagers
    CPU cores | AllocatedVCores | - | Number of allocated VCores in the current queue
    CPU cores | ReservedVCores | - | Number of reserved VCores in the current queue
    CPU cores | AvailableVCores | - | Number of available VCores in the current queue
    CPU cores | PendingVCores | - | Number of pending VCores in resource requests in the current queue
    Total applications | AppsSubmitted | - | Number of submitted jobs in the current queue
    Total applications | AppsRunning | - | Number of running jobs in the current queue
    Total applications | AppsPending | - | Number of pending jobs in the current queue
    Total applications | AppsCompleted | - | Number of completed jobs in the current queue
    Total applications | AppsKilled | - | Number of killed jobs in the current queue
    Total applications | AppsFailed | - | Number of failed jobs in the current queue
    Total applications | ActiveApplications | - | Number of active jobs in the current queue
    Total applications | running_0 | - | Number of running jobs in the current queue that have run for less than 60 minutes
    Total applications | running_60 | - | Number of running jobs in the current queue that have run for 60–300 minutes
    Total applications | running_300 | - | Number of running jobs in the current queue that have run for 300–1,440 minutes
    Total applications | running_1440 | - | Number of running jobs in the current queue that have run for more than 1,440 minutes
    Memory size | AllocatedMB | MB | Amount of allocated memory in the current queue
    Memory size | AvailableMB | MB | Amount of available memory in the current queue
    Memory size | PendingMB | MB | Amount of pending memory in resource requests in the current queue
    Memory size | ReservedMB | MB | Amount of reserved memory in the current queue
    Containers | AllocatedContainers | - | Number of allocated containers in the current queue
    Containers | PendingContainers | - | Number of pending containers in resource requests in the current queue
    Containers | ReservedContainers | - | Number of reserved containers in the current queue
    Total allocated/released containers | AggregateContainersAllocated | - | Total number of allocated containers in the current queue
    Total allocated/released containers | AggregateContainersReleased | - | Total number of released containers in the current queue
    Users | ActiveUsers | - | Number of active users in the current queue
    Memory | allocatedMB | MB | Amount of allocated memory in the cluster
    Memory | availableMB | MB | Amount of available memory in the cluster
    Memory | reservedMB | MB | Amount of reserved memory in the cluster
    Memory | totalMB | MB | Total amount of memory in the cluster
    Applications | completed | - | Number of completed jobs in the cluster during the statistical period
    Applications | failed | - | Number of failed jobs in the cluster during the statistical period
    Applications | killed | - | Number of killed jobs in the cluster during the statistical period
    Applications | pending | - | Number of pending jobs in the cluster during the statistical period
    Applications | running | - | Number of running jobs in the cluster during the statistical period
    Applications | submitted | - | Number of submitted jobs in the cluster during the statistical period
    Containers | containersAllocated | - | Number of allocated containers in the cluster
    Containers | containersPending | - | Number of pending containers in the cluster
    Containers | containersReserved | - | Number of reserved containers in the cluster
    Memory utilization | usageRatio | % | Current memory utilization of the cluster
    Cores | allocatedVirtualCores | - | Number of allocated CPU cores in the cluster
    Cores | availableVirtualCores | - | Number of available CPU cores in the cluster
    Cores | reservedVirtualCores | - | Number of reserved CPU cores in the cluster
    Cores | totalVirtualCores | - | Total number of CPU cores in the cluster
    CPU utilization | usageRatio | % | Current CPU utilization of the cluster
    Launched AMs | AMLaunchDelayNumOps | - | Number of launched AMs
    Average time for RM to launch AM | AMLaunchDelayAvgTime | ms | Average time for the RM to launch an AM
    Total registered AMs | AMRegisterDelayNumOps | - | Total number of registered AMs
    Average time for AM to register with RM | AMRegisterDelayAvgTime | ms | Average time for an AM to register with the RM
    Queue CPU utilization | YARN.RM.QUEUE.VCORES.RATIO | - | Utilization of CPU allocated for the current queue
    Queue memory utilization | YARN.RM.QUEUE.MEM.RATIO | - | Utilization of memory allocated for the current queue
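
The overview metrics include running-time buckets (running_0 to running_1440) and cluster-level utilization percentages. The sketch below shows one way to read them. It rests on two assumptions that this page does not confirm: that each bucket includes its lower bound and excludes its upper bound, and that usageRatio equals allocated divided by total. The function names and sample values are hypothetical.

```python
def running_bucket(elapsed_minutes: float) -> str:
    """Map a job's running time to a bucket metric name.

    Assumption: lower bounds are inclusive (60 minutes falls in running_60).
    """
    if elapsed_minutes < 60:
        return "running_0"
    if elapsed_minutes < 300:
        return "running_60"
    if elapsed_minutes < 1440:
        return "running_300"
    return "running_1440"


def usage_ratio(allocated: float, total: float) -> float:
    """Return allocated/total as a percentage; 0.0 when total is not positive.

    Assumption: this mirrors how usageRatio is derived, which the page does not state.
    """
    if total <= 0:
        return 0.0
    return 100.0 * allocated / total


# Hypothetical sample values keyed by the cluster-level metric names in the table.
cluster = {
    "allocatedMB": 6144,
    "totalMB": 24576,
    "allocatedVirtualCores": 6,
    "totalVirtualCores": 16,
}

memory_pct = usage_ratio(cluster["allocatedMB"], cluster["totalMB"])
cpu_pct = usage_ratio(cluster["allocatedVirtualCores"], cluster["totalVirtualCores"])
print(f"memory: {memory_pct:.1f}%  cpu: {cpu_pct:.1f}%")
print(running_bucket(45), running_bucket(75), running_bucket(2000))
```

Guarding against a zero total avoids a division error on a cluster that has no registered NodeManagers yet.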

    YARN - ResourceManager

    Title | Metric | Unit | Description
    RPC authentications/authorizations | RpcAuthenticationFailures | - | Number of failed RPC authentications
    RPC authentications/authorizations | RpcAuthenticationSuccesses | - | Number of successful RPC authentications
    RPC authentications/authorizations | RpcAuthorizationFailures | - | Number of failed RPC authorizations
    RPC authentications/authorizations | RpcAuthorizationSuccesses | - | Number of successful RPC authorizations
    Data received/sent by RPC | ReceivedBytes | bytes/s | Amount of data received by RPC
    Data received/sent by RPC | SentBytes | bytes/s | Amount of data sent by RPC
    RPC connections | NumOpenConnections | - | Current number of open connections
    RPC requests | RpcProcessingTimeNumOps | - | Number of RPC requests
    RPC requests | RpcQueueTimeNumOps | - | Number of RPC requests
    RPC queue length | CallQueueLength | - | Length of the current RPC queue
    Average RPC processing time | RpcProcessingTimeAvgTime | s | Average RPC request processing time
    Average RPC processing time | RpcQueueTimeAvgTime | s | Average time an RPC request spends in the queue
    GC count | YGC | - | Young GC count
    GC count | FGC | - | Full GC count
    GC time | FGCT | s | Full GC time
    GC time | GCT | s | Garbage collection time
    GC time | YGCT | s | Young GC time
    Memory zone proportion | S0 | % | Percentage of used Survivor 0 memory
    Memory zone proportion | E | % | Percentage of used Eden memory
    Memory zone proportion | CCS | % | Percentage of used compressed class space memory
    Memory zone proportion | S1 | % | Percentage of used Survivor 1 memory
    Memory zone proportion | O | % | Percentage of used Old memory
    Memory zone proportion | M | % | Percentage of used Metaspace memory
    JVM threads | ThreadsNew | - | Number of threads in NEW status
    JVM threads | ThreadsRunnable | - | Number of threads in RUNNABLE status
    JVM threads | ThreadsBlocked | - | Number of threads in BLOCKED status
    JVM threads | ThreadsWaiting | - | Number of threads in WAITING status
    JVM threads | ThreadsTimedWaiting | - | Number of threads in TIMED_WAITING status
    JVM threads | ThreadsTerminated | - | Number of threads in TERMINATED status
    JVM logs | LogFatal | - | Number of FATAL-level logs
    JVM logs | LogError | - | Number of ERROR-level logs
    JVM logs | LogWarn | - | Number of WARN-level logs
    JVM logs | LogInfo | - | Number of INFO-level logs
    JVM memory | MemNonHeapUsedM | MB | Non-heap memory size used by process
    JVM memory | MemNonHeapCommittedM | MB | Non-heap memory size committed to process
    JVM memory | MemHeapUsedM | MB | Heap memory size used by process
    JVM memory | MemHeapCommittedM | MB | Heap memory size committed to process
    JVM memory | MemHeapMaxM | MB | Maximum heap memory size available to process
    JVM memory | MemMaxM | MB | Maximum memory size available to process
    CPU utilization | ProcessCpuLoad | % | CPU utilization
    Cumulative CPU usage time | ProcessCpuTime | ms | Cumulative CPU usage time
    File descriptors | MaxFileDescriptorCount | - | Maximum number of file descriptors
    File descriptors | OpenFileDescriptorCount | - | Number of opened file descriptors
    Process execution duration | Uptime | s | Process execution duration
    Worker threads | DaemonThreadCount | - | Number of daemon threads in the process
    Worker threads | ThreadCount | - | Number of threads in the process
    Node status | haState | 1: Active, 0: Standby | ResourceManager active/standby status
    Active/Standby switch | switchOccurred | - | ResourceManager active/standby switch
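
The GC count, GC time, and memory zone metric names above (S0, S1, E, O, M, CCS, YGC, YGCT, FGC, FGCT, GCT) match the column names printed by the JDK tool `jstat -gcutil`. Whether the monitoring service actually collects them that way is an assumption, not something this page states. As a rough sketch, such output could be parsed into a metric dictionary as follows; the sample text and the function name are hypothetical.

```python
# Hypothetical sample in the layout printed by `jstat -gcutil <pid>` on JDK 8.
SAMPLE_GCUTIL = """\
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00  45.12  63.40  21.07  97.31  94.85     18    0.412     2    0.305    0.717
"""


def parse_gcutil(text: str) -> dict:
    """Parse jstat -gcutil style text into {column name: value}.

    The first non-empty line is treated as the header and the last one as the
    most recent sample. A "-" cell (value not available) becomes None.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if len(rows) < 2:
        raise ValueError("expected a header line and at least one data line")
    header, values = rows[0], rows[-1]
    if len(header) != len(values):
        raise ValueError("header and data line have different column counts")
    return {
        name: (None if raw == "-" else float(raw))
        for name, raw in zip(header, values)
    }


gc = parse_gcutil(SAMPLE_GCUTIL)
print(gc["YGC"], gc["FGC"], gc["GCT"])
```

Reading the header instead of hard-coding column positions keeps the parser working on JDK versions that print additional columns.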

    YARN - JobHistoryServer

    Title | Metric | Unit | Description
    JVM threads | ThreadsNew | - | Number of threads in NEW status
    JVM threads | ThreadsRunnable | - | Number of threads in RUNNABLE status
    JVM threads | ThreadsBlocked | - | Number of threads in BLOCKED status
    JVM threads | ThreadsWaiting | - | Number of threads in WAITING status
    JVM threads | ThreadsTimedWaiting | - | Number of threads in TIMED_WAITING status
    JVM threads | ThreadsTerminated | - | Number of threads in TERMINATED status
    JVM logs | LogFatal | - | Number of FATAL-level logs
    JVM logs | LogError | - | Number of ERROR-level logs
    JVM logs | LogWarn | - | Number of WARN-level logs
    JVM logs | LogInfo | - | Number of INFO-level logs
    JVM memory | MemNonHeapUsedM | MB | Non-heap memory size used by process
    JVM memory | MemNonHeapCommittedM | MB | Non-heap memory size committed to process
    JVM memory | MemHeapUsedM | MB | Heap memory size used by process
    JVM memory | MemHeapCommittedM | MB | Heap memory size committed to process
    JVM memory | MemHeapMaxM | MB | Maximum heap memory size available to process
    JVM memory | MemMaxM | MB | Maximum memory size available to process
    GC count | YGC | - | Young GC count
    GC count | FGC | - | Full GC count
    GC time | FGCT | s | Full GC time
    GC time | GCT | s | Garbage collection time
    GC time | YGCT | s | Young GC time
    Memory zone proportion | S0 | % | Percentage of used Survivor 0 memory
    Memory zone proportion | E | % | Percentage of used Eden memory
    Memory zone proportion | CCS | % | Percentage of used compressed class space memory
    Memory zone proportion | S1 | % | Percentage of used Survivor 1 memory
    Memory zone proportion | O | % | Percentage of used Old memory
    Memory zone proportion | M | % | Percentage of used Metaspace memory
    CPU utilization | ProcessCpuLoad | % | CPU utilization
    Cumulative CPU usage time | ProcessCpuTime | ms | Cumulative CPU usage time
    File descriptors | MaxFileDescriptorCount | - | Maximum number of file descriptors
    File descriptors | OpenFileDescriptorCount | - | Number of opened file descriptors
    Process execution duration | Uptime | s | Process execution duration
    Worker threads | DaemonThreadCount | - | Number of daemon threads in the process
    Worker threads | ThreadCount | - | Number of threads in the process
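
Hadoop daemons commonly expose JVM and operating-system values such as MemHeapUsedM and OpenFileDescriptorCount as JSON through their `/jmx` HTTP endpoint. The sketch below derives heap and file descriptor usage percentages from such a payload. It assumes the usual `{"beans": [...]}` layout; the bean names and values in the sample are illustrative and not taken from this page, and the helper names are hypothetical.

```python
import json

# Hypothetical payload in the shape returned by a Hadoop daemon's /jmx endpoint.
SAMPLE_JMX = """
{
  "beans": [
    {
      "name": "Hadoop:service=JobHistoryServer,name=JvmMetrics",
      "MemHeapUsedM": 256.0,
      "MemHeapMaxM": 1024.0,
      "ThreadsBlocked": 0
    },
    {
      "name": "java.lang:type=OperatingSystem",
      "OpenFileDescriptorCount": 512,
      "MaxFileDescriptorCount": 4096
    }
  ]
}
"""


def find_bean(payload: dict, name_suffix: str):
    """Return the first bean whose name ends with name_suffix, or None."""
    for bean in payload.get("beans", []):
        if bean.get("name", "").endswith(name_suffix):
            return bean
    return None


def percent(used: float, limit: float) -> float:
    """Return used/limit as a percentage; 0.0 when the limit is not positive."""
    return 100.0 * used / limit if limit > 0 else 0.0


payload = json.loads(SAMPLE_JMX)
jvm = find_bean(payload, "name=JvmMetrics")
os_bean = find_bean(payload, "type=OperatingSystem")

heap_pct = percent(jvm["MemHeapUsedM"], jvm["MemHeapMaxM"])
fd_pct = percent(os_bean["OpenFileDescriptorCount"], os_bean["MaxFileDescriptorCount"])
print(f"heap used: {heap_pct:.1f}%  file descriptors used: {fd_pct:.1f}%")
```

Matching on the bean name suffix rather than the full name lets the same helper be pointed at a ResourceManager or NodeManager payload.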

    YARN - NodeManager

    Title | Metric | Unit | Description
    GC count | YGC | - | Young GC count
    GC count | FGC | - | Full GC count
    GC time | FGCT | s | Full GC time
    GC time | GCT | s | Garbage collection time
    GC time | YGCT | s | Young GC time
    Memory zone proportion | S0 | % | Percentage of used Survivor 0 memory
    Memory zone proportion | E | % | Percentage of used Eden memory
    Memory zone proportion | CCS | % | Percentage of used compressed class space memory
    Memory zone proportion | S1 | % | Percentage of used Survivor 1 memory
    Memory zone proportion | O | % | Percentage of used Old memory
    Memory zone proportion | M | % | Percentage of used Metaspace memory
    JVM threads | ThreadsNew | - | Number of threads in NEW status
    JVM threads | ThreadsRunnable | - | Number of threads in RUNNABLE status
    JVM threads | ThreadsBlocked | - | Number of threads in BLOCKED status
    JVM threads | ThreadsWaiting | - | Number of threads in WAITING status
    JVM threads | ThreadsTimedWaiting | - | Number of threads in TIMED_WAITING status
    JVM threads | ThreadsTerminated | - | Number of threads in TERMINATED status
    JVM logs | LogFatal | - | Number of FATAL-level logs
    JVM logs | LogError | - | Number of ERROR-level logs
    JVM logs | LogWarn | - | Number of WARN-level logs
    JVM logs | LogInfo | - | Number of INFO-level logs
    JVM memory | MemNonHeapUsedM | MB | Non-heap memory size used by process
    JVM memory | MemNonHeapCommittedM | MB | Non-heap memory size committed to process
    JVM memory | MemHeapUsedM | MB | Heap memory size used by process
    JVM memory | MemHeapCommittedM | MB | Heap memory size committed to process
    JVM memory | MemHeapMaxM | MB | Maximum heap memory size available to process
    JVM memory | MemMaxM | MB | Maximum memory size available to process
    Total containers | ContainersLaunched | - | Number of launched containers
    Total containers | ContainersCompleted | - | Number of completed containers
    Total containers | ContainersFailed | - | Number of failed containers
    Total containers | ContainersKilled | - | Number of killed containers
    Total containers | ContainersIniting | - | Number of containers being initialized
    Total containers | ContainersRunning | - | Number of running containers
    Total containers | AllocatedContainers | - | Number of containers allocated by NodeManager
    Average container launch time | ContainerLaunchDurationAvgTime | ms | Average container launch time
    Container launches | ContainerLaunchDurationNumOps | - | Number of container launches
    CPU cores | AvailableVCores | - | Number of VCores available to NodeManager
    CPU cores | AllocatedVCores | - | Number of VCores allocated by NodeManager
    Memory size | AllocatedGB | GB | Amount of memory allocated by NodeManager
    Memory size | AvailableGB | GB | Amount of memory available to NodeManager
    CPU utilization | ProcessCpuLoad | % | CPU utilization
    Cumulative CPU usage time | ProcessCpuTime | ms | Cumulative CPU usage time
    File descriptors | MaxFileDescriptorCount | - | Maximum number of file descriptors
    File descriptors | OpenFileDescriptorCount | - | Number of opened file descriptors
    Process execution duration | Uptime | s | Process execution duration
    Worker threads | DaemonThreadCount | - | Number of daemon threads in the process
    Worker threads | ThreadCount | - | Number of threads in the process
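
The NodeManager table has no utilization or failure-rate metric, but rough equivalents can be derived from the counters it does list. The sketch below assumes that a node's total schedulable memory is AllocatedGB plus AvailableGB, and that finished containers are those counted by ContainersCompleted, ContainersFailed, and ContainersKilled. Neither assumption is confirmed by this page, and the function names and sample values are hypothetical.

```python
def memory_utilization_pct(allocated_gb: float, available_gb: float) -> float:
    """Allocated share of (allocated + available) memory, as a percentage."""
    total = allocated_gb + available_gb
    return 100.0 * allocated_gb / total if total > 0 else 0.0


def failed_container_pct(completed: int, failed: int, killed: int) -> float:
    """Failed share of all finished containers, as a percentage."""
    finished = completed + failed + killed
    return 100.0 * failed / finished if finished > 0 else 0.0


# Hypothetical sample values keyed by the NodeManager metric names in the table.
nm = {
    "AllocatedGB": 12,
    "AvailableGB": 4,
    "ContainersCompleted": 90,
    "ContainersFailed": 6,
    "ContainersKilled": 4,
}

mem_pct = memory_utilization_pct(nm["AllocatedGB"], nm["AvailableGB"])
fail_pct = failed_container_pct(
    nm["ContainersCompleted"], nm["ContainersFailed"], nm["ContainersKilled"]
)
print(f"memory allocated: {mem_pct:.1f}%  failed containers: {fail_pct:.1f}%")
```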