
Flink Integration
Last updated: 2024-01-29 15:55:07

Overview

When using Flink, you need to monitor task running status to know whether tasks are running normally and to troubleshoot faults. TMP integrates with Pushgateway so that Flink can write metrics to it, and provides an out-of-the-box Grafana monitoring dashboard for Flink.

Prerequisites

1. The EMR product you purchased includes the Flink component, and a Flink task is running in your instance.
2. You have created a TKE cluster in the region and VPC of your TMP instance.

Directions

Integrating product

Getting Pushgateway access configuration

1. Log in to the EMR console, select the corresponding instance, and select Basic Info > Instance Info to get the Pushgateway address and token.


2. Get the APPID on the Account Info page.

Modifying Flink configuration

1. Log in to the EMR console, select the corresponding instance, and select Cluster Service.
2. Find the Flink configuration item and select Configuration Management in the Operation column on the right to enter the configuration management page.
3. On the right of the page, click Add Configuration Item and add the following configuration items one by one:
| Configuration Item | Default Value | Data Type | Description | Suggestion |
| --- | --- | --- | --- | --- |
| metrics.reporter.promgateway.class | None | String | Name of the Java class for exporting metrics to Pushgateway | - |
| metrics.reporter.promgateway.jobName | None | String | Push task name | Specify an easily understandable string |
| metrics.reporter.promgateway.randomJobNameSuffix | true | Boolean | Whether to add a random string after the task name | Set it to `true`. If no random string is added, metrics of different Flink tasks will overwrite each other |
| metrics.reporter.promgateway.groupingKey | None | String | Global label added to each metric in the format of `k1=v1;k2=v2` | Add the EMR instance ID to distinguish between the data of different instances, such as `instance_id=emr-xxx` |
| metrics.reporter.promgateway.interval | None | Time | Time interval for pushing metrics, such as 30s | We recommend you set the value to about 1 minute |
| metrics.reporter.promgateway.host | None | String | Pushgateway service address | It is the service address of the TMP instance in the console |
| metrics.reporter.promgateway.port | -1 | Integer | Pushgateway service port | It is the port of the TMP instance in the console |
| metrics.reporter.promgateway.needBasicAuth | false | Boolean | Whether the Pushgateway service requires authentication | Set it to `true`, as the Pushgateway of TMP requires authentication |
| metrics.reporter.promgateway.user | None | String | Username for authentication | It is your `APPID` |
| metrics.reporter.promgateway.password | None | String | Password for authentication | It is the access token of the TMP instance in the console |
| metrics.reporter.promgateway.deleteOnShutdown | true | Boolean | Whether to delete the corresponding metrics on the Pushgateway after the Flink task is completed | Set it to `true` |
Below is a sample configuration:
metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
metrics.reporter.promgateway.jobName: climatePredict
metrics.reporter.promgateway.randomJobNameSuffix: true
metrics.reporter.promgateway.interval: 60 SECONDS
metrics.reporter.promgateway.groupingKey: instance_id=emr-xxxx
metrics.reporter.promgateway.host: 172.xx.xx.xx
metrics.reporter.promgateway.port: 9090
metrics.reporter.promgateway.needBasicAuth: true
metrics.reporter.promgateway.user: appid
metrics.reporter.promgateway.password: token
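With this configuration, the task name (plus its random suffix) and the grouping key are attached as labels to every reported series; this is what keeps the metrics of different tasks and EMR instances apart. Purely as an illustration (the metric name, suffix, and label values below are assumed rather than taken from a real instance), a series queried in TMP might look like this:
flink_jobmanager_numRunningJobs{job="climatePredictab12cd34", instance_id="emr-xxxx"}  1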
The Pushgateway reporter plugin in the official Flink package currently does not support configuring authentication information, but TMP requires authentication before data can be written. Therefore, we recommend that you use the JAR package we provide. We have also submitted a pull request to the Flink community to add authentication support.
1. If you have already installed the official Flink plugin, run the following command to delete it first and prevent class conflicts:
cd /usr/local/service/flink/lib
rm flink-metrics-prometheus*jar
2. In the EMR console, select the corresponding instance, and select Cluster Resource > Resource Management > Master to view the master node.
3. Click the instance ID to go to the CVM console, log in to the CVM instance, and run the following command to install the plugin:
cd /usr/local/service/flink/lib
wget https://rig-1258344699.cos.ap-guangzhou.myqcloud.com/flink/flink-metrics-prometheus_2.11-auth.jar -O flink-metrics-prometheus_2.11-auth.jar
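After the download completes, you can optionally confirm that only the authentication-enabled JAR remains in the lib directory (the path below assumes the EMR layout used in the previous steps):
ls /usr/local/service/flink/lib | grep prometheus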

Verifying

1. Run the flink run command on the master node to submit a new task (a sample submission command is shown after the note below), and then view the task log:
grep metrics /usr/local/service/flink/log/flink-hadoop-client-*.log
2. If the log contains the following content, the configuration is successfully loaded:


Note:
Tasks submitted before the configuration change still use the old configuration file, so their metrics are not reported.
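Below is a minimal sample submission for testing, assuming the standard Flink examples directory ships with the EMR installation (the example JAR name and path may differ by Flink version):
cd /usr/local/service/flink
bin/flink run examples/streaming/WordCount.jar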

Viewing monitoring information

1. In the Integration Center of the target TMP instance, find Flink monitoring and install the corresponding Grafana dashboard to enable the Flink monitoring dashboard.
2. Enter Grafana and click the corresponding icon to expand the Flink monitoring panel.


3. Click Flink Job List to view the monitoring information.


4. Click a job name or job ID in the table to view the job monitoring details.


5. Click Flink Cluster in the top-right corner to view the Flink cluster monitoring information.


6. Click a task name in the table to view the task monitoring details.



Integrating with alert feature

1. Log in to the TMP console and select the target TMP instance to enter the management page.
2. Click Alerting Rule and add the corresponding alerting rules. For more information, please see Creating Alerting Rule.
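As a minimal sketch of such a rule, the PromQL expression below fires when a Flink job has restarted within the last 5 minutes. The metric name is an assumption for illustration: newer Flink versions expose the restart counter as flink_jobmanager_job_numRestarts, while older versions use flink_jobmanager_job_fullRestarts; adjust the expression to the metrics your tasks actually report.
increase(flink_jobmanager_job_numRestarts[5m]) > 0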