Monitoring and Alarms

Last updated: 2024-01-03 11:43:43

    Overview

    TDMQ for RabbitMQ monitors the resources created under your account, including clusters, vhosts, exchanges, and queues. You can use the monitoring data to analyze cluster usage and handle potential risks promptly. You can also set alarm rules on monitoring metrics so that you receive alarm notifications when a metric becomes abnormal, which helps keep your system running stably.

    Monitoring Metrics

    The monitoring metrics supported by TDMQ for RabbitMQ are as follows:
    | Resource Type | Monitoring Metric | Unit | Description |
    | --- | --- | --- | --- |
    | Cluster | Vhost quantity | Count | Total number of vhosts in the cluster within the selected time range. |
    | Cluster | RabbitMQ production speed | Count/s | The speed at which all clients in the cluster produce messages to exchanges within the selected time range. |
    | Cluster | RabbitMQ production throughput | Bytes/s | Throughput of all clients in the cluster producing messages to exchanges within the selected time range. |
    | Cluster | RabbitMQ delivery speed | Count/s | The speed at which all queues in the cluster deliver messages to the client within the selected time range. |
    | Cluster | RabbitMQ delivery throughput | Bytes/s | Throughput of all queues in the cluster delivering messages to the client within the selected time range. |
    | Cluster | RabbitMQ exchange quantity | Count | Total number of exchanges in the cluster within the selected time range. |
    | Cluster | RabbitMQ queue quantity | Count | Total number of queues in the cluster within the selected time range. |
    | Vhost | RabbitMQ exchange quantity | Count | Total number of exchanges in all vhosts in the cluster within the selected time range. |
    | Vhost | RabbitMQ queue quantity | Count | Total number of queues in all vhosts in the cluster within the selected time range. |
    | Exchange | RabbitMQ average production duration | ms | Average time taken by the client to produce messages to each exchange within the selected time range. |
    | Exchange | RabbitMQ P95 production duration | ms | 95th percentile of the time taken by the client to produce messages to each exchange within the selected time range. |
    | Exchange | RabbitMQ P99 production duration | ms | 99th percentile of the time taken by the client to produce messages to each exchange within the selected time range. |
    | Exchange | RabbitMQ P999 production duration | ms | 99.9th percentile of the time taken by the client to produce messages to each exchange within the selected time range. |
    | Exchange | RabbitMQ maximum production duration | ms | Maximum time taken by the client to produce messages to each exchange within the selected time range. |
    | Exchange | RabbitMQ binding distribution speed | Count/s | The speed at which each exchange sends messages to the bound queue within the selected time range. |
    | Exchange | Binding distribution throughput | Bytes/s | Throughput of each exchange sending messages to the bound queue within the selected time range. |
    | Exchange | RabbitMQ production speed | Count/s | The speed at which the client produces messages to each exchange within the selected time range. |
    | Exchange | RabbitMQ production throughput | Bytes/s | Throughput of the client producing messages to each exchange within the selected time range. |
    | Queue | Binding distribution production speed | Count/s | The speed at which each queue receives messages from the exchange within the selected time range. |
    | Queue | Binding distribution production throughput | Bytes/s | Throughput of each queue receiving messages from the exchange within the selected time range. |
    | Queue | Pull speed | Count/s | The speed at which the client pulls messages from the queue within the selected time range (pull consumption mode, that is, the `basicGet` command). |
    | Queue | Pull throughput | Bytes/s | Throughput of the client pulling messages from the queue within the selected time range (pull consumption mode, that is, the `basicGet` command). |
    | Queue | Acknowledgment speed | Count/s | The speed at which the client acknowledges the receipt of messages within the selected time range. |
    | Queue | Negative acknowledgment speed | Count/s | The speed at which the client negatively acknowledges (rejects) messages within the selected time range. |
    | Queue | Redelivery speed | Count/s | The speed at which each queue in the cluster redelivers messages to the client when the client does not respond for a long time within the selected time range. |
    | Queue | Total messages | Count | Total number of messages received by each queue in the cluster at the current time. |
    | Queue | Total consumable messages | Count | Total number of unconsumed messages among messages received by each queue in the cluster at the current time. |
    | Queue | Delivered yet unacknowledged messages | Count | Total number of unacknowledged messages delivered by each queue in the cluster to the client at the current time. |
    | Queue | Average acknowledgment duration | ms | Average time taken by the client to acknowledge queue messages within the selected time range. |
    | Queue | P95 acknowledgment duration | ms | 95th percentile of the time taken by the client to acknowledge queue messages within the selected time range. |
    | Queue | P99 acknowledgment duration | ms | 99th percentile of the time taken by the client to acknowledge queue messages within the selected time range. |
    | Queue | P999 acknowledgment duration | ms | 99.9th percentile of the time taken by the client to acknowledge queue messages within the selected time range. |
    | Queue | Maximum acknowledgment duration | ms | Maximum time taken by the client to acknowledge queue messages within the selected time range. |
    | Queue | Consumers | Count | Number of clients connected to each queue in the cluster at the current time. |
    | Queue | Message storage usage | Bytes | Disk size used by messages in each queue in the cluster at the current time. |
    | Queue | Message heap size | Bytes | Size of unconsumed messages heaped in each queue in the cluster at the current time. |
    | Queue | Heaped messages | Count | Number of unconsumed messages heaped in each queue in the cluster at the current time. |
    | Queue | Delivery throughput | Bytes/s | Throughput of each queue in the cluster delivering messages to the client within the selected time range (push consumption mode, that is, the `basicConsume` command). |
    | Queue | Delivery speed | Count/s | The speed at which each queue in the cluster delivers messages to the client within the selected time range (push consumption mode, that is, the `basicConsume` command). |
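
    As a rough illustration of how the speed, throughput, and percentile metrics listed above relate to raw data, the following Python sketch aggregates per-message publish records collected during one time window. The `PublishSample` record, the `summarize` function, and the nearest-rank percentile method are assumptions made for this example; this document does not state how TDMQ for RabbitMQ computes these values internally.

```python
import math
from dataclasses import dataclass


@dataclass
class PublishSample:
    """One hypothetical client-side publish record (not a TDMQ data structure)."""
    size_bytes: int      # size of the published message
    duration_ms: float   # time taken to produce the message to the exchange


def nearest_rank_percentile(values, pct):
    """Return the pct-th percentile of values using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def summarize(samples, window_seconds):
    """Aggregate the samples of one time window into table-style metrics."""
    if not samples:
        raise ValueError("samples must not be empty")
    durations = [s.duration_ms for s in samples]
    total_bytes = sum(s.size_bytes for s in samples)
    return {
        "production_speed_count_per_s": len(samples) / window_seconds,
        "production_throughput_bytes_per_s": total_bytes / window_seconds,
        "avg_duration_ms": sum(durations) / len(durations),
        "p95_duration_ms": nearest_rank_percentile(durations, 95),
        "p99_duration_ms": nearest_rank_percentile(durations, 99),
        "p999_duration_ms": nearest_rank_percentile(durations, 99.9),
        "max_duration_ms": max(durations),
    }
```

    For example, 100 samples of 200 bytes each collected over a 10-second window give a production speed of 10 Count/s and a production throughput of 2,000 Bytes/s.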

    Viewing Monitoring Data

    1. Log in to the TDMQ for RabbitMQ console.
    2. Select Cluster on the left sidebar, select a region, and click the ID of the target cluster to enter the cluster details page.
    3. At the top of the cluster details page, select the Monitoring tab to enter the monitoring page.
    4. Select the target resource and set the time range to view the corresponding monitoring data.

    Configuring Alarm Rule

    Creating alarm rule

    You can configure alarm rules for monitoring metrics. When a monitoring metric reaches the configured alarm threshold, Cloud Monitor (CM) notifies you promptly through the configured notification channel.
    1. On the Monitoring page of the cluster, click the alarm icon to go to the CM console and configure an alarm policy.
    2. On the alarm configuration page, select a policy type and instance, and set the alarm rule and notification template.
    Policy Type: Select TDMQ/RabbitMQ.
    Alarm Object: Select the RabbitMQ resource for which to configure the alarm policy.
    Trigger Condition: You can select Select template or Configure manually. The latter is selected by default. For more information on manual configuration, see the description below. For more information on how to create a template, see Creating trigger condition template.
    Note:
    Metric: For example, if you set the statistical period of the "average production duration" metric to 1 minute, an alarm is triggered when the average production duration exceeds the threshold for N consecutive data points, that is, for N consecutive 1-minute periods.
    Alarm Frequency: For example, "Alarm once every 30 minutes" means that only one alarm is triggered within a 30-minute period, even if the metric exceeds the threshold in several consecutive statistical periods. Another alarm is triggered only if the metric exceeds the threshold again during the following 30 minutes.
    Notification Template: You can select an existing notification template or create one to set the alarm recipients and notification channels.
    3. Click Complete.
    Note:
    For more information on alarms, see Creating Alarm Policy.
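
    The interaction between the consecutive-data-point condition and the alarm frequency described above can be sketched as follows. This is a simplified model, not Cloud Monitor's actual implementation: the function name, its parameters, and the choice to keep the suppression window running even if the metric briefly recovers are assumptions made for illustration.

```python
def evaluate_alarms(points, threshold, consecutive_periods,
                    period_minutes=1, repeat_interval_minutes=30):
    """Return the indexes of the data points at which a notification is sent.

    points: metric values, one per statistical period, oldest first.
    A notification requires `consecutive_periods` points in a row above
    `threshold`, and at most one notification is sent per
    `repeat_interval_minutes` (assumed behavior, see the lead-in).
    """
    notifications = []
    streak = 0                # consecutive points above the threshold
    last_notified_at = None   # minute offset of the latest notification
    for index, value in enumerate(points):
        streak = streak + 1 if value > threshold else 0
        if streak < consecutive_periods:
            continue  # trigger condition not met yet
        now = index * period_minutes
        if (last_notified_at is None
                or now - last_notified_at >= repeat_interval_minutes):
            notifications.append(index)
            last_notified_at = now
    return notifications
```

    Under these assumptions, with a 1-minute statistical period, N = 3, and "Alarm once every 30 minutes", a metric that stays above the threshold from the third data point onward produces a notification at the fifth data point and another one 30 minutes later.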

    Creating trigger condition template

    1. Log in to the CM console.
    2. On the left sidebar, click Trigger Condition Template to enter the Template list page.
    3. Click Create on the Trigger Condition Template page.
    4. On the template creation page, configure the policy type.
    Policy Type: Select TDMQ/RabbitMQ.
    Use preset trigger condition: Select this option to display the system-recommended alarm policy.
    5. After confirming that everything is correct, click Save.
    6. Return to the alarm policy creation page and click Refresh. The template you just configured is displayed.