Title | Metric | Unit | Description |
Tablets | TabletRunning | - | Total number of tablets currently running on all tablet servers |
Difference in the number of tablet replicas | ClusterReplicaSkew | - | Difference between the number of replicas on the tablet server hosting the most replicas and the number of replicas on the tablet server hosting the fewest replicas |
TServer threads | ThreadsRunning | - | Number of threads currently running on all tablet servers |
Master threads | ThreadsRunning | - | Number of threads currently running on all masters |
TServer logs | ErrorMessages | - | Number of ERROR-level log messages emitted in all processes |
Master logs | ErrorMessages | - | Number of ERROR-level log messages emitted in all processes |
| WarningMessages | - | Number of WARNING-level log messages emitted in all processes |
Oversized write requests | OversizedWriteRequests | - | Number of oversized write requests to the system catalog tablet rejected by the master since start |
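The ClusterReplicaSkew description above amounts to a simple max-minus-min calculation. The following is a minimal sketch of that arithmetic, assuming you already have per-server replica counts available; the function name and the sample server names and counts are hypothetical and are not part of the monitoring service.

```python
from typing import Mapping


def cluster_replica_skew(replicas_per_tserver: Mapping[str, int]) -> int:
    """Return the replica count on the busiest tablet server minus the
    replica count on the least loaded one (0 when no servers are given)."""
    if not replicas_per_tserver:
        return 0
    counts = list(replicas_per_tserver.values())
    return max(counts) - min(counts)


# Hypothetical replica counts, for illustration only.
example = {"tserver-1": 120, "tserver-2": 95, "tserver-3": 110}
print(cluster_replica_skew(example))  # 120 - 95 = 25
```

A value that stays well above zero suggests that replicas are unevenly distributed across tablet servers.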
Title | Metric | Unit | Description |
Block cache hit | BlockCacheHit | - | Number of block cache hits. When confirming the cache efficiency, use the value of this metric instead of that of cache_hits |
| BlockCacheMiss | - | Number of block cache misses. When confirming the cache efficiency, use the value of this metric instead of that of cache_misses |
Block cache utilization | BlockCacheUsage | bytes | Memory used by block cache |
File cache hit | FileCacheHit | - | Number of file descriptor cache hits. When confirming the cache efficiency, use the value of this metric instead of that of cache_hits |
| FileCacheMiss | - | Number of file descriptor cache misses. When confirming the cache efficiency, use the value of this metric instead of that of cache_misses |
File cache utilization | FileCacheUsage | - | Number of entries in the file cache |
Scanner | ActiveScanners | - | Number of currently active scanners |
| ExpiredScanners | - | Number of scanners that have expired due to inactivity since service start |
Block manager blocks | BlockUnderManagement | - | Number of currently managed data blocks |
| BlockOpenReading | - | Number of data blocks currently opened for read |
| BlockOpenWriting | - | Number of data blocks currently opened for write |
Block manager bytes | BytesUnderManagement | bytes | Number of bytes of currently managed data blocks |
Block manager containers | ContainersUnderManagement | - | Number of log block containers |
| FullContainersUnderManagement | - | Number of full log block containers |
Tablet leaders | NumRaftLeaders | - | Number of tablet replicas that are Raft leaders |
Tablet sessions | OpenClientSessions | - | Number of currently opened tablet copy client sessions on this server |
| OpemSourceSessions | - | Number of currently opened tablet copy source sessions on this server |
Tablets | TabletBootstrapping | - | Number of currently bootstrapping tablets |
| TabletFailed | - | Number of failed tablets |
| TabletInitialized | - | Number of currently initialized tablets |
| TabletNotInitialized | - | Number of currently uninitialized tablets |
| TabletRunning | - | Number of currently running tablets |
| TabletShutdown | - | Number of currently shut down tablets |
| TabletStopped | - | Number of currently stopped tablets |
| TabletStopping | - | Number of currently stopping tablets |
CPU time | CpuStime | ms | Total system CPU time of process |
| CpuUtime | ms | Total user CPU time of process |
Data path | DataDirsFailed | - | Number of data directories whose disks are currently in failed status |
| DataDirsFull | - | Number of data directories whose disks are currently full |
Thread | ThreadsRunning | - | Number of currently running threads |
Context | InvoluntarySwitches | - | Total involuntary context switches |
| VoluntarySwitches | - | Total voluntary context switches |
Spinlock | SpinlockContentionTime | μs | Amount of time consumed by contention on internal spinlocks since server start |
Log information | ErrorMessages | - | Number of ERROR-level log messages emitted by the application |
| WarningMessages | - | Number of WARNING-level log messages emitted by the application |
Operations in queue | TotalCount | - | Total number of samples recorded for the queue length |
| Min | - | Minimum number of tasks waiting in the queue |
| Max | - | Maximum number of tasks waiting in the queue |
| Mean | - | Average number of tasks waiting in the queue |
| Percentile_99_9 | - | 99.9th percentile of the number of tasks waiting in the queue |
Operation execution duration | TotalCount | - | Total number of operations |
| Min | μs | Minimum run time |
| Max | μs | Maximum run time |
| Mean | μs | Average run time |
| Percentile_99_9 | μs | 99.9th percentile of the run time |
Queuing wait time | TotalCount | - | Total number of operations |
| Min | μs | Minimum wait time |
| Max | μs | Maximum wait time |
| Mean | μs | Average wait time |
| Percentile_99_9 | μs | 99.9th percentile of the wait time |
Allocated bytes | AllocatedBytes | bytes | Number of bytes used by the application. This usually does not match the memory usage reported by the operating system because it does not include TCMalloc overhead or memory fragmentation |
Hybrid clock error | HybridClockError | μs | Server clock maximum error; returns 2^64-1 when unable to read the base clock |
Hybrid clock timestamp | HybridClockTimestamp | μs | Hybrid clock timestamp; returns 2^64-1 when unable to read the base clock |
TCMalloc memory | HeapSize | bytes | Bytes of system memory reserved by TCMalloc |
| CurrentThreadCacheBytes | bytes | A measure of some of the memory TCMalloc is using (for small objects) |
| TotalThreadCacheBytes | bytes | A limit to how much memory TCMalloc dedicates for small objects |
TCMalloc PageHeap | FreeBytes | bytes | Number of bytes of free mapped pages in the page heap |
| UnMappedBytes | bytes | Number of bytes of free unmapped pages in the page heap |
RPC request | ConnectionsAccepted | - | Number of incoming TCP connections made to the RPC server |
| QueueOverflow | - | Number of RPCs dropped because the service queue was full |
| TimesOutInQueue | - | Number of RPCs that timed out while waiting in the service queue and thus were not processed |
RPC FetchData | TotalCount | - | Total number of operations |
| Min | μs | Minimum processing time |
| Max | μs | Maximum processing time |
| Mean | μs | Average processing time |
| Percentile_99_9 | μs | 99.9th percentile of the processing time |
RPC AlterSchema | TotalCount | - | Total number of operations |
| Min | μs | Minimum processing time |
| Max | μs | Maximum processing time |
| Mean | μs | Average processing time |
| Percentile_99_9 | μs | 99.9th percentile of the processing time |
RPC CreateTablet | TotalCount | - | Total number of operations |
| Min | μs | Minimum processing time |
| Max | μs | Maximum processing time |
| Mean | μs | Average processing time |
| Percentile_99_9 | μs | 99.9th percentile of the processing time |
RPC DeleteTablet | TotalCount | - | Total number of operations |
| Min | μs | Minimum processing time |
| Max | μs | Maximum processing time |
| Mean | μs | Average processing time |
| Percentile_99_9 | μs | 99.9th percentile of the processing time |
RPC Quiesce | TotalCount | - | Total number of operations |
| Min | μs | Minimum processing time |
| Max | μs | Maximum processing time |
| Mean | μs | Average processing time |
| Percentile_99_9 | μs | 99.9th percentile of the processing time |
RPC scan | TotalCount | - | Total number of operations |
| Min | μs | Minimum processing time |
| Max | μs | Maximum processing time |
| Mean | μs | Average processing time |
| Percentile_99_9 | μs | 99.9th percentile of the processing time |
RPC ScannerKeepAlive | TotalCount | - | Total number of operations |
| Min | μs | Minimum processing time |
| Max | μs | Maximum processing time |
| Mean | μs | Average processing time |
| Percentile_99_9 | μs | 99.9th percentile of the processing time |
RPC write | TotalCount | - | Total number of operations |
| Min | μs | Minimum processing time |
| Max | μs | Maximum processing time |
| Mean | μs | Average processing time |
| Percentile_99_9 | μs | 99.9th percentile of the processing time |
Write requests rejected due to queue overloading | QueueOverloadRejections | count | Number of write requests rejected due to queue overloading |
Scan rate | ScannedFromDiskRate | bytes/s | Amount of data scanned per second |
| ScannerReturnedRate | bytes/s | Amount of data returned per second |
Scanner bytes | ScannedFromDisk | bytes | Total amount of data scanned from disk |
| ScannerReturned | bytes | Total amount of returned data |
Total row operations | RowsInserted | count | Number of rows inserted into the node |
| RowsDeleted | count | Number of rows deleted from the node |
| RowsUpserted | count | Number of rows upserted into the node |
| RowsUpdated | count | Number of rows updated on the node |
Row operation rate | RowsInsertedRate | count/s | Number of rows inserted into the node per second |
| RowsDeletedRate | count/s | Number of rows deleted from the node per second |
| RowsUpsertedRate | count/s | Number of rows upserted into the node per second |
| RowsUpdatedRate | count/s | Number of rows updated on the node per second |
ExpScanner | ExpiredScanners | /sec | Average number of scanners per second that expired due to inactivity during the metric collection cycle |
Mem Rowset | Total | bytes | Total size of the MemRowSets of the tablets on the node |
Purge cache | DeltaMemStore | count | Number of DeltaMemStore flushes |
| MemRowSet | count | Number of MemRowSet flushes |
Disk Rowsets statistics | Total | count | Total number of DiskRowSets across the tablets on the node |
| Avg | count | Average number of DiskRowSets per tablet on the node |
| Max | count | Maximum number of DiskRowSets in a single tablet on the node |
Tablet data size | OnDisk | bytes | Size of the tablet data stored on the node |
Average height of Disk Rowsets | Total | count | Total of the average DiskRowSet heights across the tablets on the node |
| Avg | count | Mean of the average DiskRowSet heights across the tablets on the node |
| Max | count | Maximum of the average DiskRowSet heights across the tablets on the node |
Compactions Running statistics | RowSet | count | Number of RowSet compactions currently running on the node's tablets |
| Major Delta | count | Number of major delta compactions currently running on the node's tablets |
| Minor Delta | count | Number of minor delta compactions currently running on the node's tablets |
Tablet cache refresh | Bytes Flushed | bytes/s | Average volume of data flushed per second by the tablets on the node during the metric collection period |
RPC request rejected | leader | /sec | Average number of RPC requests denied per second due to memory pressure on the leader node during the metric collection period |
| follower | /sec | Average number of RPC requests denied per second due to memory pressure on the follower node during the metric collection period |
Access queue time | TotalCount,Percentile_99,Min,Max,Mean | μs | Sample count, 99th percentile, minimum, maximum, and mean of the time RPC requests spend in the work queue |
Scanner time | TotalCount,Percentile_99,Min,Max,Mean | μs | Sample count, 99th percentile, minimum, maximum, and mean of the scan duration |
Process memory | AllocatedMB | MB | Number of bytes used by the application, converted to MB. This usually does not match the memory usage reported by the operating system because it does not include TCMalloc overhead or memory fragmentation |
| MemLimit | MB | Memory limit configured for the Kudu server |
Heap memory usage percentage | UsedRate | % | Memory used on the node (AllocatedMB) divided by the configured memory limit (MemLimit) |
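Several rows in the table above describe derived values: cache efficiency from the hit and miss counts, UsedRate from AllocatedMB and MemLimit, and per-second rates from running totals. The sketch below shows one plausible way to compute them from sampled values. It rests on assumptions: the function names are hypothetical, the hit and miss counts are taken from the same collection window, and the handling of a counter that drops between samples (treated as a restart) is a design choice of this example, not documented behavior of the monitoring service.

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """Fraction of lookups served from the cache, e.g. from
    BlockCacheHit and BlockCacheMiss sampled over the same window."""
    total = hits + misses
    return hits / total if total else 0.0


def used_rate_percent(allocated_mb: float, mem_limit_mb: float) -> float:
    """UsedRate as described above: AllocatedMB / MemLimit, in percent."""
    if mem_limit_mb <= 0:
        raise ValueError("mem_limit_mb must be positive")
    return 100.0 * allocated_mb / mem_limit_mb


def per_second_rate(prev_total: float, curr_total: float,
                    interval_seconds: float) -> float:
    """Average rate between two samples of a running total, such as
    RowsInserted. Assumption: a drop in the total means the process
    restarted, so only the new total is counted for that interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_total - prev_total
    if delta < 0:
        delta = curr_total
    return delta / interval_seconds


print(cache_hit_ratio(900, 100))        # 0.9
print(used_rate_percent(2048, 8192))    # 25.0
print(per_second_rate(1000, 1600, 60))  # 10.0
```

Computing the hit ratio from the hit and miss counts together, rather than watching either count alone, avoids misreading a busy cache as an inefficient one.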
Title | Metric | Unit | Description |
Block cache hit | BlockCacheHit | - | Number of block cache hits. When confirming the cache efficiency, use the value of this metric instead of that of cache_hits |
| BlockCacheMiss | - | Number of block cache misses. When confirming the cache efficiency, use the value of this metric instead of that of cache_misses |
Block cache utilization | BlockCacheUsage | bytes | Memory used by block cache |
File cache hit | FileCacheHit | - | Number of file descriptor cache hits. When confirming the cache efficiency, use the value of this metric instead of that of cache_hits |
| FileCacheMiss | - | Number of file descriptor cache misses. When confirming the cache efficiency, use the value of this metric instead of that of cache_misses |
File cache utilization | FileCacheUsage | - | Number of entries in the file cache |
Block manager blocks | BlockUnderManagement | - | Number of currently managed data blocks |
| BlockOpenReading | - | Number of data blocks currently opened for read |
| BlockOpenWriting | - | Number of data blocks currently opened for write |
Block manager bytes | BytesUnderManagement | bytes | Number of bytes of currently managed data blocks |
Block manager containers | ContainersUnderManagement | - | Number of log block containers |
| FullContainersUnderManagement | - | Number of full log block containers |
CPU time | CpuStime | ms | Total system CPU time of process |
| CpuUtime | ms | Total user CPU time of process |
Thread | ThreadsRunning | - | Number of currently running threads |
Data path | DataDirsFailed | - | Number of data directories whose disks are currently in failed status |
| DataDirsFull | - | Number of data directories whose disks are currently full |
Allocated bytes | AllocatedBytes | bytes | Number of bytes used by the application. This usually does not match the memory usage reported by the operating system because it does not include TCMalloc overhead or memory fragmentation |
Log information | ErrorMessages | - | Number of ERROR-level log messages emitted by the application |
| WarningMessages | - | Number of WARNING-level log messages emitted by the application |
Context | InvoluntarySwitches | - | Total involuntary context switches |
| VoluntarySwitches | - | Total voluntary context switches |
Operations in queue | TotalCount | - | Total number of samples recorded for the queue length |
| Min | - | Minimum number of tasks waiting in the queue |
| Max | - | Maximum number of tasks waiting in the queue |
| Mean | - | Average number of tasks waiting in the queue |
| Percentile_99_9 | - | 99.9th percentile of the number of tasks waiting in the queue |
Queuing wait time | TotalCount | - | Total number of operations |
| Min | μs | Minimum wait time |
| Max | μs | Maximum wait time |
| Mean | μs | Average wait time |
| Percentile_99_9 | μs | 99.9th percentile of the wait time |
Operation execution duration | TotalCount | - | Total number of operations |
| Min | μs | Minimum run time |
| Max | μs | Maximum run time |
| Mean | μs | Average run time |
| Percentile_99_9 | μs | 99.9th percentile of the run time |
Spinlock | SpinlockContentionTime | μs | Amount of time consumed by contention on internal spinlocks since server start |
Oversized write requests | OversizedWriteRequests | - | Number of oversized write requests to the system catalog tablet rejected since start |
Hybrid clock error | HybridClockError | μs | Server clock maximum error; returns 2^64-1 when unable to read the base clock |
Hybrid clock timestamp | HybridClockTimestamp | μs | Hybrid clock timestamp; returns 2^64-1 when unable to read the base clock |
Difference in the number of tablet replicas | ClusterReplicaSkew | - | Difference between the number of replicas on the tablet server hosting the most replicas and the number of replicas on the tablet server hosting the fewest replicas |
Tablet leaders | NumRaftLeaders | - | Number of tablet replicas that are Raft leaders |
Tablet sessions | OpemSourceSessions | - | Number of currently opened tablet copy source sessions on this server |
TCMalloc memory | HeapSize | bytes | Bytes of system memory reserved by TCMalloc |
| CurrentThreadCacheBytes | bytes | A measure of some of the memory TCMalloc is using (for small objects) |
| TotalThreadCacheBytes | bytes | A limit to how much memory TCMalloc dedicates for small objects |
TCMalloc page heap | FreeBytes | bytes | Number of bytes of free mapped pages in the page heap |
| UnMappedBytes | bytes | Number of bytes of free unmapped pages in the page heap |
RPC request | ConnectionsAccepted | - | Number of incoming TCP connections made to the RPC server |
| QueueOverflow | - | Number of RPCs dropped because the service queue was full |
| TimesOutInQueue | - | Number of RPCs that timed out while waiting in the service queue and thus were not processed |
RPC RunLeaderElection | TotalCount | - | Total number of operations |
| Min | μs | Minimum processing time |
| Max | μs | Maximum processing time |
| Mean | μs | Average processing time |
| Percentile_99_9 | μs | 99.9th percentile of the processing time |
RPC ConnectToMaster | TotalCount | - | Total number of operations |
| Min | μs | Minimum processing time |
| Max | μs | Maximum processing time |
| Mean | μs | Average processing time |
| Percentile_99_9 | μs | 99.9th percentile of the processing time |
RPC Ping | TotalCount | - | Total number of operations |
| Min | μs | Minimum processing time |
| Max | μs | Maximum processing time |
| Mean | μs | Average processing time |
| Percentile_99_9 | μs | 99.9th percentile of the processing time |
RPC TSHeartbeat | TotalCount | - | Total number of operations |
| Min | μs | Minimum processing time |
| Max | μs | Maximum processing time |
| Mean | μs | Average processing time |
| Percentile_99_9 | μs | 99.9th percentile of the processing time |
RPC FetchData | TotalCount | - | Total number of operations |
| Min | μs | Minimum processing time |
| Max | μs | Maximum processing time |
| Mean | μs | Average processing time |
| Percentile_99_9 | μs | 99.9th percentile of the processing time |
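HybridClockError and HybridClockTimestamp are documented above as returning 2^64-1 when the base clock cannot be read. If you post-process these values yourself, that sentinel should be filtered out before it is averaged or alerted on as a real number of microseconds. The following is a minimal sketch under that assumption; the constant and function names are hypothetical.

```python
from typing import Optional

# Sentinel documented above: returned when the base clock cannot be read.
CLOCK_UNREADABLE = 2**64 - 1


def clock_value_us(raw_value: int) -> Optional[int]:
    """Return the metric value in microseconds, or None when the raw
    value is the 'unable to read the base clock' sentinel."""
    return None if raw_value == CLOCK_UNREADABLE else raw_value


print(clock_value_us(250))               # 250
print(clock_value_us(CLOCK_UNREADABLE))  # None
```

Returning None instead of the raw sentinel keeps an unreadable clock from showing up as an enormous clock error in charts or thresholds.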