| Category | Monitoring Content |
| --- | --- |
| Language Models | Metrics for text generation models, such as RPM (Requests Per Minute), TTFT (Time To First Token), and TPOT (Time Per Output Token). |

| Filter Options | Description |
| --- | --- |
| Filter Dimension | Filter by service or model, and switch between aggregation views. |
| Service Selection | Dropdown for selecting a specific inference service (defaults to All Services). |
| Time Range | 1 hour, Today, Last 3 days, Last 7 days, Last 30 days, or a Custom time range. |

| Metric | Full Name | Unit | Description |
| --- | --- | --- | --- |
| RPM | Requests Per Minute | reqs/min | Number of requests per minute, reflecting request throughput. |
| TTFT | Time To First Token | ms | Time from sending the request to receiving the first token. |
| TPOT | Time Per Output Token | ms | Average time to generate each output token. |
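
To make the metric definitions concrete, here is a minimal Python sketch of how RPM, TTFT, and TPOT could be derived from per-request timestamps. It is an illustration only, not how the monitoring page computes its values. The `RequestRecord` type, its field names, and the helper functions are hypothetical. The TPOT formula used here (decode time divided by the number of tokens after the first) is one common convention and is an assumption, since the table above only says "average time to generate each output token".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RequestRecord:
    """Hypothetical per-request timing record (all timestamps in seconds)."""
    sent_at: float         # when the client sent the request
    first_token_at: float  # when the first output token arrived
    finished_at: float     # when the last output token arrived
    output_tokens: int     # number of output tokens generated


def ttft_ms(r: RequestRecord) -> float:
    """Time To First Token: request sent -> first token received, in ms."""
    return (r.first_token_at - r.sent_at) * 1000.0


def tpot_ms(r: RequestRecord) -> Optional[float]:
    """Time Per Output Token in ms.

    Assumption: the first token is excluded because TTFT already covers it.
    Returns None when fewer than two tokens were generated.
    """
    if r.output_tokens <= 1:
        return None
    decode_seconds = r.finished_at - r.first_token_at
    return decode_seconds * 1000.0 / (r.output_tokens - 1)


def rpm(records: List[RequestRecord], window_seconds: float) -> float:
    """Requests Per Minute over an observation window of the given length."""
    return len(records) * 60.0 / window_seconds


# Two example requests observed within a 60-second window.
records = [
    RequestRecord(sent_at=0.0, first_token_at=0.25, finished_at=1.25, output_tokens=11),
    RequestRecord(sent_at=30.0, first_token_at=30.5, finished_at=32.5, output_tokens=21),
]
```

With these two records, `rpm(records, 60.0)` gives 2 requests per minute, and each request's TTFT and TPOT follow from its own timestamps. A dashboard would typically aggregate such per-request values (for example as averages or percentiles) over the selected time range.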
