Tencent Cloud Elastic MapReduce

EMR on TKE Quick Start

Last updated: 2025-08-21 16:42:02
This document introduces the complete process of quickly creating an EMR on TKE cluster through the EMR console, submitting a job, and viewing the results.

Preparations

1. Before using an EMR cluster, register a Tencent Cloud account and complete identity verification. For detailed directions, see Enterprise Identity Verification Guide.
2. Grant the system default role EMR_QCSRole to the EMR service account. For detailed directions, see Role Authorization.
3. Authorize the EMR service account with the relevant roles. For detailed directions, see Administrative Privileges.
4. EMR on TKE offers pay-as-you-go billing. Before creating a cluster, top up your account so that the balance (excluding vouchers and other promotions) is no less than the configuration fees required for cluster creation. For detailed instructions, see Top-up.

Creating Clusters

Log in to the EMR Console, click Create Cluster on the EMR on TKE cluster list page, and complete the configuration on the purchase page. When the cluster status shows Running, the cluster has been created successfully.

Configuration Item | Description | Example
Cluster name | The name of the cluster, which is customizable. | EMR-7sx2aqmu
Region | The physical data center where the cluster is deployed. Note: the region cannot be changed after the cluster is created, so choose carefully. | Beijing, Shanghai, Guangzhou, Nanjing, and Singapore
Container type | Service roles are deployed on resources provided by the container layer; both TKE general and TKE Serverless clusters are supported. | TKE
Cluster network and subnet | The network and subnet used by the cluster. The EMR cluster network must be consistent with the container cluster network. | Guangzhou Zone 7
Security group | Security group configured at the cluster level. | Create a security group
Billing mode | Billing mode for cluster deployment. | Pay-as-you-go
Cluster type | Data lake and machine learning cluster types are supported; the default is the data lake type. | Data lake
Product version | The bundled components and their versions vary by product version. | EMR-TKE1.0.0 includes Hadoop 2.8.5 and Spark 3.2.1
Deployment service | Optional components that can be combined based on your needs. Select at least one component. | Hive-2.3.9 and Impala-3.4.1
COS bucket | Used for storing logs, JAR packages, and other data. | -
Set password | Sets the WebUI password. This password is only used to initialize the service WebUI access password. | 8-16 characters, including uppercase letters, lowercase letters, digits, and special characters (!@%^* only); the password cannot start with a special character
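The password rules above can be checked programmatically before submitting the form. The sketch below is illustrative only (the class name is not part of any EMR API) and assumes all four character classes are required:

```java
// Sketch of the WebUI password rules: 8-16 characters, at least one
// uppercase letter, lowercase letter, digit, and special character
// from !@%^*; no other characters; must not start with a special
// character. Assumes all four classes are mandatory.
public class WebUiPasswordCheck {
    private static final String SPECIALS = "!@%^*";

    public static boolean isValid(String pw) {
        if (pw == null || pw.length() < 8 || pw.length() > 16) return false;
        if (SPECIALS.indexOf(pw.charAt(0)) >= 0) return false; // cannot start with a special char
        boolean upper = false, lower = false, digit = false, special = false;
        for (char c : pw.toCharArray()) {
            if (Character.isUpperCase(c)) upper = true;
            else if (Character.isLowerCase(c)) lower = true;
            else if (Character.isDigit(c)) digit = true;
            else if (SPECIALS.indexOf(c) >= 0) special = true;
            else return false; // character outside the allowed set
        }
        return upper && lower && digit && special;
    }

    public static void main(String[] args) {
        System.out.println(isValid("Emr2024!pass"));  // valid
        System.out.println(isValid("!Emr2024pass"));  // invalid: starts with a special character
    }
}
```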

Submitting Jobs and Viewing Results

After the cluster is successfully created, you can create and submit jobs on that cluster. This document takes submitting Kyuubi Spark and Hive on Spark jobs and viewing job information as examples. The operations are as follows.

Hue Submission

1. Click the Cluster ID/Name of the target cluster in the cluster list to enter the cluster details page.
2. On the cluster details page, click Cluster Services and select Hue.
3. On the role management page, open the More drop-down in the Operation column, click Enable Network Access, select Public Network LB, and click Confirm Enable. Once the process completes, a public network LB is created for the pod where Hue is located.
4. Click View Info/View WebUI in the upper-right corner to view the Hue access address, and click Access Hue WebUI.
5. Authenticate to enter the Hue page. Typically, the user is root and the password is the one set during cluster creation.
6. Use the Hive tab to submit a Hive on Spark task (table creation and queries).
7. Use the SparkSql_Kyuubi tab to submit a SparkSQL task (Kyuubi queries).

JDBC Submission for Hive Spark

1. If you need to connect to HiveServer2 over a public IP address, go to Cluster Services > Hive > HiveServer2 > Operations > More > Enable Network Access and enable public network access for HiveServer2.
2. For a public network connection, check the security group in Cluster Information, then go to CVM > Security Groups and edit the security group to allow client IP access to port 7001. For a private network connection, skip steps 1 and 2.
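Before running the JDBC program, you may want to confirm that the HiveServer2 port is reachable from your client. The sketch below is a plain TCP probe, independent of Hive; the class name is illustrative, and $hs2host stands for your HiveServer2 address:

```java
// Minimal TCP reachability probe. Returns true if a connection to
// host:port can be established within timeoutMs milliseconds.
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {
    public static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false; // refused, timed out, or unresolvable
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "127.0.0.1"; // e.g., your $hs2host
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 7001;
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 3000));
    }
}
```

If the probe fails over the public network, re-check the security group rule for port 7001 before debugging the JDBC code itself.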

Using Maven to Write JDBC Code

First, add the following dependencies required for JDBC to the pom.xml file:
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-jdbc</artifactId>
  <version>2.3.7</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>2.8.5</version>
</dependency>
Add the following packaging and compilation plugins:
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
        <encoding>utf-8</encoding>
      </configuration>
    </plugin>
    <plugin>
      <artifactId>maven-assembly-plugin</artifactId>
      <configuration>
        <descriptorRefs>
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
      <executions>
        <execution>
          <id>make-assembly</id>
          <phase>package</phase>
          <goals>
            <goal>single</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
Create a file named HiveJdbcTest.java as follows:
package org.apache.hive;

import java.sql.*;

/**
 * Created by tencent on 2023/6/20.
 */
public class HiveJdbcTest {
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";

    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }
        Connection con = DriverManager.getConnection(
                "jdbc:hive2://$hs2host:7001/test_db", "hadoop", "");
        Statement stmt = con.createStatement();
        String tableName = "test_jdbc";
        stmt.execute("drop table if exists " + tableName);
        stmt.execute("create table " + tableName + " (key int, value string)");
        System.out.println("Create table success!");
        // show tables
        String sql = "show tables '" + tableName + "'";
        System.out.println("Running: " + sql);
        ResultSet res = stmt.executeQuery(sql);
        if (res.next()) {
            System.out.println(res.getString(1));
        }
        // describe table
        sql = "describe " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
            System.out.println(res.getString(1) + "\t" + res.getString(2));
        }
        // insert data
        sql = "insert into " + tableName + " values (42,\"hello\"),(48,\"world\")";
        stmt.execute(sql);
        // query data
        sql = "select * from " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
            System.out.println(res.getInt(1) + "\t" + res.getString(2));
        }
        // count rows
        sql = "select count(1) from " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
            System.out.println(res.getString(1));
        }
    }
}
Replace $hs2host in the code with your HiveServer2 address. This program creates a test_jdbc table in test_db, inserts two records, and queries the data. Run the following command to package the entire project:
mvn package

Uploading JAR and Running

Upload the JAR packaged by using the above command to a machine that can access the HiveServer2 service or to your local machine (if it is local, ensure it can access HiveServer2 properly), and run it using the following command:
java -classpath ${package}-jar-with-dependencies.jar org.apache.hive.HiveJdbcTest
${package} is your custom artifactId-version. The results are as follows:
Create table success!
Running: show tables 'test_jdbc'
test_jdbc
Running: describe test_jdbc
key int
value string

Running: select * from test_jdbc
42 hello
48 world
Running: select count(1) from test_jdbc
2

JDBC Submission for Kyuubi

1. If you need to connect to KyuubiServer over a public IP address, go to Cluster Services > Kyuubi > KyuubiServer > Operations > More > Enable Network Access and enable public network access for KyuubiServer.
2. For a public network connection, check the security group in Cluster Information, then go to CVM > Security Groups and edit the security group to allow client IP access to port 10009. For a private network connection, skip steps 1 and 2.

Using Maven to Write JDBC Code

The JDBC dependencies and packaging plugin configurations are the same as in JDBC Submission for Hive Spark. Create KyuubiJdbcTest.java with the following content:
package org.apache.hive;

import java.sql.*;

/**
 * Created by tencent on 2023/6/20.
 */
public class KyuubiJdbcTest {
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";

    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }
        Connection con = DriverManager.getConnection(
                "jdbc:hive2://$kyuubihost:10009/test_db", "hadoop", "");
        Statement stmt = con.createStatement();
        String tableName = "test_kyuubi";
        stmt.execute("drop table if exists " + tableName);
        stmt.execute("create table " + tableName + " (key int, value string)");
        System.out.println("Create table success!");
        // show tables
        String sql = "show tables '" + tableName + "'";
        System.out.println("Running: " + sql);
        ResultSet res = stmt.executeQuery(sql);
        if (res.next()) {
            System.out.println(res.getString(1));
        }
        // describe table
        sql = "describe " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
            System.out.println(res.getString(1) + "\t" + res.getString(2));
        }
        // insert data
        sql = "insert into " + tableName + " values (42,\"hello\"),(48,\"world\")";
        stmt.execute(sql);
        // query data
        sql = "select * from " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
            System.out.println(res.getInt(1) + "\t" + res.getString(2));
        }
        // count rows
        sql = "select count(1) from " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
            System.out.println(res.getString(1));
        }
    }
}
Replace $kyuubihost in the code with your KyuubiServer address. This program creates a test_kyuubi table in test_db, inserts two records, and queries the data. Run the following command to package the entire project:
mvn package

Uploading JAR and Running

The upload process is the same as in JDBC Submission for Hive Spark. Run KyuubiJdbcTest using the following command:
java -classpath ${package}-jar-with-dependencies.jar org.apache.hive.KyuubiJdbcTest

${package} is your custom artifactId-version. The results are as follows:
Create table success!
Running: show tables 'test_kyuubi'
test_kyuubi
Running: describe test_kyuubi
key int
value string
Running: select * from test_kyuubi
42 hello
48 world
Running: select count(1) from test_kyuubi
2

Terminating Clusters

When the cluster is no longer needed, you can terminate it and release its resources. Terminating the cluster forcibly stops all services it provides and releases the resources.
On the EMR on TKE page, select More > Terminate for the target cluster, and click Terminate Now in the pop-up dialog box.

