Tencent Cloud

Kudu Monitoring Metrics

Last updated: 2023-12-27 14:51:37

    Kudu - overview

    | Title | Metric | Unit | Description |
    |---|---|---|---|
    | Tablets | TabletRunning | - | Total number of tablets currently running on all tablet servers |
    | Difference in the number of tablet replicas | ClusterReplicaSkew | - | Difference between the number of replicas on the tablet server hosting the most replicas and the number of replicas on the tablet server hosting the fewest replicas |
    | TServer threads | ThreadsRunning | - | Number of threads currently running on all tablet servers |
    | Master threads | ThreadsRunning | - | Number of threads currently running on all masters |
    | TServer logs | ErrorMessages | - | Number of ERROR-level log messages emitted in all processes |
    | Master logs | ErrorMessages | - | Number of ERROR-level log messages emitted in all processes |
    | Master logs | WarningMessages | - | Number of WARNING-level log messages emitted in all processes |
    | Oversized write requests | OversizedWriteRequests | - | Number of oversized write requests to the system catalog tablet rejected by the master since start |
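The ClusterReplicaSkew definition above amounts to a maximum-minus-minimum over per-server replica counts. The following is a minimal Python sketch of that arithmetic, assuming the per-server counts have already been collected into a dictionary. The `cluster_replica_skew` helper and the server names are hypothetical and are not part of any Kudu or Tencent Cloud API.

```python
def cluster_replica_skew(replica_counts):
    """Return the max minus the min replica count across tablet servers.

    replica_counts maps a tablet server name to the number of tablet
    replicas it hosts. An empty mapping yields a skew of 0.
    """
    if not replica_counts:
        return 0
    counts = replica_counts.values()
    return max(counts) - min(counts)


# Hypothetical replica counts for three tablet servers.
example_counts = {"tserver-1": 120, "tserver-2": 95, "tserver-3": 110}
print(cluster_replica_skew(example_counts))  # prints 25
```

A skew of 0 means every tablet server hosts the same number of replicas; larger values indicate a less even distribution.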

    Kudu - server

    | Title | Metric | Unit | Description |
    |---|---|---|---|
    | Block cache hit | BlockCacheHit | - | Number of block cache hits. To evaluate cache efficiency, use this metric rather than cache_hits |
    | Block cache hit | BlockCacheMiss | - | Number of block cache misses. To evaluate cache efficiency, use this metric rather than cache_misses |
    | Block cache utilization | BlockCacheUsage | bytes | Memory used by the block cache |
    | File cache hit | FileCacheHit | - | Number of file descriptor cache hits. To evaluate cache efficiency, use this metric rather than cache_hits |
    | File cache hit | FileCacheMiss | - | Number of file descriptor cache misses. To evaluate cache efficiency, use this metric rather than cache_misses |
    | File cache utilization | FileCacheUsage | - | Number of entries in the file cache |
    | Scanner | ActiveScanners | - | Number of currently active scanners |
    | Scanner | ExpiredScanners | - | Number of scanners that have expired due to inactivity since service start |
    | Block manager blocks | BlockUnderManagement | - | Number of currently managed data blocks |
    | Block manager blocks | BlockOpenReading | - | Number of data blocks currently opened for read |
    | Block manager blocks | BlockOpenWriting | - | Number of data blocks currently opened for write |
    | Block manager bytes | BytesUnderManagement | bytes | Number of bytes of currently managed data blocks |
    | Block manager containers | ContainersUnderManagement | - | Number of log block containers |
    | Block manager containers | FullContainersUnderManagement | - | Number of full log block containers |
    | Tablet leaders | NumRaftLeaders | - | Number of tablet replicas that are Raft leaders |
    | Tablet sessions | OpenClientSessions | - | Number of currently opened tablet copy client sessions on this server |
    | Tablet sessions | OpemSourceSessions | - | Number of currently opened tablet copy source sessions on this server |
    | Tablets | TabletBootstrapping | - | Number of currently bootstrapping tablets |
    | Tablets | TabletFailed | - | Number of failed tablets |
    | Tablets | TabletInitialized | - | Number of currently initialized tablets |
    | Tablets | TabletNotInitialized | - | Number of currently uninitialized tablets |
    | Tablets | TabletRunning | - | Number of currently running tablets |
    | Tablets | TabletShutdown | - | Number of currently shut down tablets |
    | Tablets | TabletStopped | - | Number of currently stopped tablets |
    | Tablets | TabletStopping | - | Number of currently stopping tablets |
    | CPU time | CpuStime | ms | Total system CPU time of the process |
    | CPU time | CpuUtime | ms | Total user CPU time of the process |
    | Data path | DataDirsFailed | - | Number of data directories whose disks are currently in failed status |
    | Data path | DataDirsFull | - | Number of data directories whose disks are currently full |
    | Thread | ThreadsRunning | - | Number of currently running threads |
    | Context | InvoluntarySwitches | - | Total involuntary context switches |
    | Context | VoluntarySwitches | - | Total voluntary context switches |
    | Spinlock | SpinlockContentionTime | μs | Amount of time consumed by contention on internal spinlocks since server start |
    | Log information | ErrorMessages | - | Number of ERROR-level log messages emitted by the application |
    | Log information | WarningMessages | - | Number of WARNING-level log messages emitted by the application |
    | Operations in queue | TotalCount | - | Total number of operations |
    | Operations in queue | Min | - | Minimum number of tasks waiting in the queue |
    | Operations in queue | Max | - | Maximum number of tasks waiting in the queue |
    | Operations in queue | Mean | - | Average number of tasks waiting in the queue |
    | Operations in queue | Percentile_99_9 | - | 99.9th percentile of the number of tasks waiting in the queue |
    | Operation execution duration | TotalCount | - | Total number of operations |
    | Operation execution duration | Min | μs | Minimum run time |
    | Operation execution duration | Max | μs | Maximum run time |
    | Operation execution duration | Mean | μs | Average run time |
    | Operation execution duration | Percentile_99_9 | μs | 99.9th percentile of the run time |
    | Queuing wait time | TotalCount | - | Total number of operations |
    | Queuing wait time | Min | μs | Minimum wait time |
    | Queuing wait time | Max | μs | Maximum wait time |
    | Queuing wait time | Mean | μs | Average wait time |
    | Queuing wait time | Percentile_99_9 | μs | 99.9th percentile of the wait time |
    | Allocated bytes | AllocatedBytes | bytes | Number of bytes used by the application. This usually does not match the memory usage reported by the operating system because it excludes TCMalloc overhead and memory fragmentation |
    | Hybrid clock error | HybridClockError | μs | Server clock maximum error; returns 2^64-1 when unable to read the base clock |
    | Hybrid clock timestamp | HybridClockTimestamp | μs | Hybrid clock timestamp; returns 2^64-1 when unable to read the base clock |
    | TCMalloc memory | HeapSize | bytes | Bytes of system memory reserved by TCMalloc |
    | TCMalloc memory | CurrentThreadCacheBytes | bytes | A measure of some of the memory TCMalloc is using (for small objects) |
    | TCMalloc memory | TotalThreadCacheBytes | bytes | A limit to how much memory TCMalloc dedicates for small objects |
    | TCMalloc page heap | FreeBytes | bytes | Number of bytes of free mapped pages in the page heap |
    | TCMalloc page heap | UnMappedBytes | bytes | Number of bytes of free unmapped pages in the page heap |
    | RPC request | ConnectionsAccepted | - | Number of incoming TCP connections made to the RPC server |
    | RPC request | QueueOverflow | - | Number of RPCs dropped because the service queue was full |
    | RPC request | TimesOutInQueue | - | Number of RPCs that timed out while waiting in the service queue and thus were not processed |
    | RPC FetchData | TotalCount | - | Total number of operations |
    | RPC FetchData | Min | μs | Minimum processing time |
    | RPC FetchData | Max | μs | Maximum processing time |
    | RPC FetchData | Mean | μs | Average processing time |
    | RPC FetchData | Percentile_99_9 | μs | 99.9th percentile of the processing time |
    | RPC AlterSchema | TotalCount | - | Total number of operations |
    | RPC AlterSchema | Min | μs | Minimum processing time |
    | RPC AlterSchema | Max | μs | Maximum processing time |
    | RPC AlterSchema | Mean | μs | Average processing time |
    | RPC AlterSchema | Percentile_99_9 | μs | 99.9th percentile of the processing time |
    | RPC CreateTablet | TotalCount | - | Total number of operations |
    | RPC CreateTablet | Min | μs | Minimum processing time |
    | RPC CreateTablet | Max | μs | Maximum processing time |
    | RPC CreateTablet | Mean | μs | Average processing time |
    | RPC CreateTablet | Percentile_99_9 | μs | 99.9th percentile of the processing time |
    | RPC DeleteTablet | TotalCount | - | Total number of operations |
    | RPC DeleteTablet | Min | μs | Minimum processing time |
    | RPC DeleteTablet | Max | μs | Maximum processing time |
    | RPC DeleteTablet | Mean | μs | Average processing time |
    | RPC DeleteTablet | Percentile_99_9 | μs | 99.9th percentile of the processing time |
    | RPC Quiesce | TotalCount | - | Total number of operations |
    | RPC Quiesce | Min | μs | Minimum processing time |
    | RPC Quiesce | Max | μs | Maximum processing time |
    | RPC Quiesce | Mean | μs | Average processing time |
    | RPC Quiesce | Percentile_99_9 | μs | 99.9th percentile of the processing time |
    | RPC Scan | TotalCount | - | Total number of operations |
    | RPC Scan | Min | μs | Minimum processing time |
    | RPC Scan | Max | μs | Maximum processing time |
    | RPC Scan | Mean | μs | Average processing time |
    | RPC Scan | Percentile_99_9 | μs | 99.9th percentile of the processing time |
    | RPC ScannerKeepAlive | TotalCount | - | Total number of operations |
    | RPC ScannerKeepAlive | Min | μs | Minimum processing time |
    | RPC ScannerKeepAlive | Max | μs | Maximum processing time |
    | RPC ScannerKeepAlive | Mean | μs | Average processing time |
    | RPC ScannerKeepAlive | Percentile_99_9 | μs | 99.9th percentile of the processing time |
    | RPC Write | TotalCount | - | Total number of operations |
    | RPC Write | Min | μs | Minimum processing time |
    | RPC Write | Max | μs | Maximum processing time |
    | RPC Write | Mean | μs | Average processing time |
    | RPC Write | Percentile_99_9 | μs | 99.9th percentile of the processing time |
    | Write requests rejected due to queue overloading | QueueOverloadRejections | count | Number of write requests rejected due to queue overloading |
    | Scan rate | ScannedFromDiskRate | bytes/s | Amount of data scanned from disk per second |
    | Scan rate | ScannerReturnedRate | bytes/s | Amount of data returned per second |
    | Scanner bytes | ScannedFromDisk | bytes | Total amount of data scanned from disk |
    | Scanner bytes | ScannerReturned | bytes | Total amount of returned data |
    | Total row operations | RowsInserted | count | Number of rows inserted into the node |
    | Total row operations | RowsDeleted | count | Number of rows deleted from the node |
    | Total row operations | RowsUpserted | count | Number of rows upserted into the node |
    | Total row operations | RowsUpdated | count | Number of rows updated on the node |
    | Row operation rate | RowsInsertedRate | count/s | Number of rows inserted into the node per second |
    | Row operation rate | RowsDeletedRate | count/s | Number of rows deleted from the node per second |
    | Row operation rate | RowsUpsertedRate | count/s | Number of rows upserted into the node per second |
    | Row operation rate | RowsUpdatedRate | count/s | Number of rows updated on the node per second |
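The hit and miss counters above are intended to be read together when judging cache efficiency. A per-second figure such as RowsInsertedRate can presumably be approximated from two samples of its cumulative counterpart. The Python sketch below shows both calculations, assuming the metric values have already been fetched as plain numbers. The function names and sample values are illustrative only and do not correspond to any Kudu or Tencent Cloud API.

```python
def hit_ratio(hits, misses):
    """Fraction of lookups served from the cache, or None if there were no lookups."""
    total = hits + misses
    if total == 0:
        return None
    return hits / total


def rate_per_second(previous, current, interval_seconds):
    """Approximate a per-second rate from two samples of a cumulative counter.

    Returns None if the interval is not positive or if the counter went
    backwards (for example, after a process restart reset it).
    """
    if interval_seconds <= 0 or current < previous:
        return None
    return (current - previous) / interval_seconds


# Hypothetical BlockCacheHit and BlockCacheMiss readings.
print(hit_ratio(hits=9_000, misses=1_000))  # prints 0.9

# Hypothetical RowsInserted readings taken 60 seconds apart.
print(rate_per_second(120_000, 126_000, 60))  # prints 100.0
```

Returning `None` instead of raising keeps a dashboard or alerting loop simple: a missing value can be skipped, whereas a division by zero or a negative rate would need special handling.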

    Kudu - master

    | Title | Metric | Unit | Description |
    |---|---|---|---|
    | Block cache hit | BlockCacheHit | - | Number of block cache hits. To evaluate cache efficiency, use this metric rather than cache_hits |
    | Block cache hit | BlockCacheMiss | - | Number of block cache misses. To evaluate cache efficiency, use this metric rather than cache_misses |
    | Block cache utilization | BlockCacheUsage | bytes | Memory used by the block cache |
    | File cache hit | FileCacheHit | - | Number of file descriptor cache hits. To evaluate cache efficiency, use this metric rather than cache_hits |
    | File cache hit | FileCacheMiss | - | Number of file descriptor cache misses. To evaluate cache efficiency, use this metric rather than cache_misses |
    | File cache utilization | FileCacheUsage | - | Number of entries in the file cache |
    | Block manager blocks | BlockUnderManagement | - | Number of currently managed data blocks |
    | Block manager blocks | BlockOpenReading | - | Number of data blocks currently opened for read |
    | Block manager blocks | BlockOpenWriting | - | Number of data blocks currently opened for write |
    | Block manager bytes | BytesUnderManagement | bytes | Number of bytes of currently managed data blocks |
    | Block manager containers | ContainersUnderManagement | - | Number of log block containers |
    | Block manager containers | FullContainersUnderManagement | - | Number of full log block containers |
    | CPU time | CpuStime | ms | Total system CPU time of the process |
    | CPU time | CpuUtime | ms | Total user CPU time of the process |
    | Thread | ThreadsRunning | - | Number of currently running threads |
    | Data path | DataDirsFailed | - | Number of data directories whose disks are currently in failed status |
    | Data path | DataDirsFull | - | Number of data directories whose disks are currently full |
    | Allocated bytes | AllocatedBytes | bytes | Number of bytes used by the application. This usually does not match the memory usage reported by the operating system because it excludes TCMalloc overhead and memory fragmentation |
    | Log information | ErrorMessages | - | Number of ERROR-level log messages emitted by the application |
    | Log information | WarningMessages | - | Number of WARNING-level log messages emitted by the application |
    | Context | InvoluntarySwitches | - | Total involuntary context switches |
    | Context | VoluntarySwitches | - | Total voluntary context switches |
    | Operations in queue | TotalCount | - | Total number of operations |
    | Operations in queue | Min | - | Minimum number of tasks waiting in the queue |
    | Operations in queue | Max | - | Maximum number of tasks waiting in the queue |
    | Operations in queue | Mean | - | Average number of tasks waiting in the queue |
    | Operations in queue | Percentile_99_9 | - | 99.9th percentile of the number of tasks waiting in the queue |
    | Queuing wait time | TotalCount | - | Total number of operations |
    | Queuing wait time | Min | μs | Minimum wait time |
    | Queuing wait time | Max | μs | Maximum wait time |
    | Queuing wait time | Mean | μs | Average wait time |
    | Queuing wait time | Percentile_99_9 | μs | 99.9th percentile of the wait time |
    | Operation execution duration | TotalCount | - | Total number of operations |
    | Operation execution duration | Min | μs | Minimum run time |
    | Operation execution duration | Max | μs | Maximum run time |
    | Operation execution duration | Mean | μs | Average run time |
    | Operation execution duration | Percentile_99_9 | μs | 99.9th percentile of the run time |
    | Spinlock | SpinlockContentionTime | μs | Amount of time consumed by contention on internal spinlocks since server start |
    | Oversized write requests | OversizedWriteRequests | - | Number of oversized write requests to the system catalog tablet rejected since start |
    | Hybrid clock error | HybridClockError | μs | Server clock maximum error; returns 2^64-1 when unable to read the base clock |
    | Hybrid clock timestamp | HybridClockTimestamp | μs | Hybrid clock timestamp; returns 2^64-1 when unable to read the base clock |
    | Difference in the number of tablet replicas | ClusterReplicaSkew | - | Difference between the number of replicas on the tablet server hosting the most replicas and the number of replicas on the tablet server hosting the fewest replicas |
    | Tablet leaders | NumRaftLeaders | - | Number of tablet replicas that are Raft leaders |
    | Tablet sessions | OpemSourceSessions | - | Number of currently opened tablet copy source sessions on this server |
    | TCMalloc memory | HeapSize | bytes | Bytes of system memory reserved by TCMalloc |
    | TCMalloc memory | CurrentThreadCacheBytes | bytes | A measure of some of the memory TCMalloc is using (for small objects) |
    | TCMalloc memory | TotalThreadCacheBytes | bytes | A limit to how much memory TCMalloc dedicates for small objects |
    | TCMalloc page heap | FreeBytes | bytes | Number of bytes of free mapped pages in the page heap |
    | TCMalloc page heap | UnMappedBytes | bytes | Number of bytes of free unmapped pages in the page heap |
    | RPC request | ConnectionsAccepted | - | Number of incoming TCP connections made to the RPC server |
    | RPC request | QueueOverflow | - | Number of RPCs dropped because the service queue was full |
    | RPC request | TimesOutInQueue | - | Number of RPCs that timed out while waiting in the service queue and thus were not processed |
    | RPC RunLeaderElection | TotalCount | - | Total number of operations |
    | RPC RunLeaderElection | Min | μs | Minimum processing time |
    | RPC RunLeaderElection | Max | μs | Maximum processing time |
    | RPC RunLeaderElection | Mean | μs | Average processing time |
    | RPC RunLeaderElection | Percentile_99_9 | μs | 99.9th percentile of the processing time |
    | RPC ConnectToMaster | TotalCount | - | Total number of operations |
    | RPC ConnectToMaster | Min | μs | Minimum processing time |
    | RPC ConnectToMaster | Max | μs | Maximum processing time |
    | RPC ConnectToMaster | Mean | μs | Average processing time |
    | RPC ConnectToMaster | Percentile_99_9 | μs | 99.9th percentile of the processing time |
    | RPC Ping | TotalCount | - | Total number of operations |
    | RPC Ping | Min | μs | Minimum processing time |
    | RPC Ping | Max | μs | Maximum processing time |
    | RPC Ping | Mean | μs | Average processing time |
    | RPC Ping | Percentile_99_9 | μs | 99.9th percentile of the processing time |
    | RPC TSHeartbeat | TotalCount | - | Total number of operations |
    | RPC TSHeartbeat | Min | μs | Minimum processing time |
    | RPC TSHeartbeat | Max | μs | Maximum processing time |
    | RPC TSHeartbeat | Mean | μs | Average processing time |
    | RPC TSHeartbeat | Percentile_99_9 | μs | 99.9th percentile of the processing time |
    | RPC FetchData | TotalCount | - | Total number of operations |
    | RPC FetchData | Min | μs | Minimum processing time |
    | RPC FetchData | Max | μs | Maximum processing time |
    | RPC FetchData | Mean | μs | Average processing time |
    | RPC FetchData | Percentile_99_9 | μs | 99.9th percentile of the processing time |
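HybridClockError and HybridClockTimestamp report 2^64-1 when the base clock cannot be read, so that value is best treated as "no reading" rather than as a real measurement. A minimal Python sketch of such a filter follows, assuming the metric arrives as an integer. The helper name and the sample values are assumptions made for illustration.

```python
from typing import Optional

# Sentinel reported when the base clock cannot be read (2^64 - 1).
CLOCK_UNREADABLE = 2**64 - 1


def clock_value_or_none(raw_value: int) -> Optional[int]:
    """Return the metric value in microseconds, or None for the sentinel."""
    if raw_value == CLOCK_UNREADABLE:
        return None
    return raw_value


print(clock_value_or_none(1500))                  # prints 1500
print(clock_value_or_none(18446744073709551615))  # prints None
```

Filtering the sentinel before aggregation matters because a single unreadable sample would otherwise dominate any maximum or average computed over the series.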
    