tencent cloud

Flink Integration

Last updated: 2021-12-17 10:18:35

    Overview

    When using Flink, you need to monitor the running status of its tasks to confirm that they run normally and to troubleshoot faults. TMP integrates Pushgateway so that Flink can push its metrics, and provides an out-of-the-box Grafana monitoring dashboard for them.

    Prerequisites

    1. The EMR product you purchased includes the Flink component, and a Flink task is running in your instance.
    2. You have created a TKE cluster in the region and VPC of your TMP instance.

    Directions

    Integrating product

    Getting Pushgateway access configuration

    1. Log in to the TMP console, select the corresponding instance, and select Basic Info > Instance Info to get the Pushgateway address and token.
    2. Get the APPID on the Account Info page.
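    Before changing the Flink configuration, it can help to confirm that the address, APPID, and token actually work together. The following is a minimal sketch, assuming the TMP endpoint accepts the standard Pushgateway push path (`/metrics/job/<job>`) over plain HTTP with basic authentication. The function names, the job name `connectivity_check`, and the metric name `tmp_connectivity_check` are hypothetical and exist only for this illustration.

```shell
# Build the push URL for a job, using the standard Pushgateway path layout.
# Arguments: host port job
pushgateway_url() {
  echo "http://$1:$2/metrics/job/$3"
}

# Push one throwaway gauge with basic auth and print the HTTP status code.
# Arguments: host port appid token
# Assumption: the endpoint is reachable over plain HTTP from this machine.
push_test_metric() {
  printf 'tmp_connectivity_check 1\n' |
    curl -s -o /dev/null -w '%{http_code}\n' \
      -u "$3:$4" --data-binary @- \
      "$(pushgateway_url "$1" "$2" connectivity_check)"
}
```

    For example, `push_test_metric 172.xx.xx.xx 9090 "$APPID" "$TOKEN"` run from a node in the same VPC should print a 2xx status code if the credentials are accepted. Any other code suggests the address, port, or credentials need another look. The test series can be removed afterwards by sending a `DELETE` request to the same URL.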

    Modifying Flink configuration

    1. Log in to the EMR console, select the corresponding instance, and select Cluster Service.
    2. Find the Flink configuration item and select Configuration Management in the Operation column on the right to enter the configuration management page.
    3. On the right of the page, click Add Configuration Item and add the following configuration items one by one:
      | Configuration Item | Default Value | Data Type | Description | Suggestion |
      |---|---|---|---|---|
      | metrics.reporter.promgateway.class | None | String | Name of the Java class for exporting metrics to Pushgateway | - |
      | metrics.reporter.promgateway.jobName | None | String | Push task name | Specify an easily understandable string |
      | metrics.reporter.promgateway.randomJobNameSuffix | true | Boolean | Whether to add a random string after the task name | Set it to `true`. If no random string is added, metrics of different Flink tasks will overwrite each other |
      | metrics.reporter.promgateway.groupingKey | None | String | Global label added to each metric in the format of `k1=v1;k2=v2` | Add the EMR instance ID to distinguish between the data of different instances, such as `instance_id=emr-xxx` |
      | metrics.reporter.promgateway.interval | None | Time | Time interval for pushing metrics, such as 30s | We recommend you set the value to about 1 minute |
      | metrics.reporter.promgateway.host | None | String | Pushgateway service address | It is the service address of the TMP instance in the console |
      | metrics.reporter.promgateway.port | -1 | Integer | Pushgateway service port | It is the port of the TMP instance in the console |
      | metrics.reporter.promgateway.needBasicAuth | false | Boolean | Whether the Pushgateway service requires authentication | Set it to `true`, as the Pushgateway of TMP requires authentication |
      | metrics.reporter.promgateway.user | None | String | Username for authentication | It is your `APPID` |
      | metrics.reporter.promgateway.password | None | String | Password for authentication | It is the access token of the TMP instance in the console |
      | metrics.reporter.promgateway.deleteOnShutdown | true | Boolean | Whether to delete the corresponding metrics on the Pushgateway after the Flink task is completed | Set it to `true` |
      Below is a sample configuration:
      metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
      metrics.reporter.promgateway.jobName: climatePredict
      metrics.reporter.promgateway.randomJobNameSuffix: true
      metrics.reporter.promgateway.interval: 60 SECONDS
      metrics.reporter.promgateway.groupingKey: instance_id=emr-xxxx
      metrics.reporter.promgateway.host: 172.xx.xx.xx
      metrics.reporter.promgateway.port: 9090
      metrics.reporter.promgateway.needBasicAuth: true
      metrics.reporter.promgateway.user: appid
      metrics.reporter.promgateway.password: token
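
    A small check can catch missing keys or a missing space after the colon before a task is submitted. The sketch below assumes the configuration items end up in a plain `key: value` file on the master node. The function name `check_promgateway_conf` and the choice of which keys to treat as required are assumptions made for this illustration, not part of EMR or Flink.

```shell
# Check a Flink configuration file for the Pushgateway reporter settings.
# Argument: path to the configuration file
check_promgateway_conf() {
  local conf_file="$1"
  local prefix='metrics\.reporter\.promgateway'
  local status=0
  local key

  # Keys that an authenticated push relies on (assumed to be required here).
  for key in class host port needBasicAuth user password; do
    if ! grep -Eq "^${prefix}\.${key}:" "$conf_file"; then
      echo "missing: metrics.reporter.promgateway.${key}"
      status=1
    fi
  done

  # Print reporter lines that lack the space after the colon ("key:value").
  if grep -En "^${prefix}\.[A-Za-z]+:[^ ]" "$conf_file"; then
    echo "malformed: use 'key: value' with a space after the colon"
    status=1
  fi
  return "$status"
}
```

    A call such as `check_promgateway_conf /usr/local/service/flink/conf/flink-conf.yaml` prints nothing and returns 0 when every assumed key is present and well formed. The file path is a guess based on the `lib` and `log` paths used elsewhere in this document, so adjust it to your installation.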
      

    Installing Flink plugin

    The Pushgateway plugin in the official Flink package does not currently support authentication settings, but TMP requires authentication before data can be written. We therefore recommend using the JAR package we provide. We have also submitted a pull request to the Flink team to add authentication support.

    1. If the official Flink plugin is already installed, run the following commands to delete it first and prevent class conflicts:
      cd /usr/local/service/flink/lib
      rm flink-metrics-prometheus*jar
      
    2. In the EMR console, select the corresponding instance, and select Cluster Resource > Resource Management > Master to view the master node.
    3. Click the instance ID to go to the CVM console, log in to the CVM instance, and run the following command to install the plugin:
      cd /usr/local/service/flink/lib
      wget https://rig-1258344699.cos.ap-guangzhou.myqcloud.com/flink/flink-metrics-prometheus_2.11-auth.jar -O flink-metrics-prometheus_2.11-auth.jar
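
    Because a leftover official JAR next to the downloaded one could cause the class conflict mentioned in step 1, it may be worth confirming that exactly one Prometheus reporter JAR remains in the `lib` directory. This is a minimal sketch. The function names are hypothetical, and the file name pattern simply mirrors the one used in the `rm` command above.

```shell
# Count the Prometheus metrics reporter JARs directly inside a directory.
# Argument: path to the Flink lib directory
count_prometheus_jars() {
  find "$1" -maxdepth 1 -type f -name 'flink-metrics-prometheus*.jar' |
    wc -l | tr -d ' '
}

# Report whether exactly one reporter JAR is present; return 1 otherwise.
# Argument: path to the Flink lib directory
check_single_prometheus_jar() {
  local lib_dir="$1"
  local n
  n=$(count_prometheus_jars "$lib_dir")
  if [ "$n" -eq 1 ]; then
    echo "ok: one Prometheus reporter JAR found"
  else
    echo "unexpected: $n Prometheus reporter JARs found in $lib_dir"
    return 1
  fi
}
```

    Running `check_single_prometheus_jar /usr/local/service/flink/lib` after the download should report one JAR. A count of zero points to a failed download, and a count of two or more suggests the official plugin was not removed.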
      

    Verifying

    1. Run the flink run command on the master node to submit a new task and view the task log:

      grep metrics /usr/local/service/flink/log/flink-hadoop-client-*.log
      
    2. If the output lists the `metrics.reporter.promgateway` configuration items, the configuration has been loaded successfully.

      Note:

      As tasks previously submitted in the cluster use the old configuration file, their metrics are not reported.

    Viewing monitoring information

    1. In Integration Center of the target TMP instance, find Flink monitoring and install the corresponding Grafana dashboard to enable the Flink monitoring dashboard.
    2. Enter Grafana and click to expand the Flink monitoring panel.
    3. Click Flink Job List to view the monitoring information.
    4. Click a job name or job ID in the table to view the job monitoring details.
    5. Click Flink Cluster in the top-right corner to view the Flink cluster monitoring information.
    6. Click a task name in the table to view the task monitoring details.

    Integrating with alert feature

    1. Log in to the TMP console and select the target TMP instance to enter the management page.
    2. Click Alerting Rule and add the corresponding alerting rules. For more information, please see Creating Alerting Rule.