RSS Uniffle Operation Guide

Last updated: 2026-01-14 14:31:32

Feature Introduction

Shuffle is a core operator in big data computing and is used by most jobs, but it consumes substantial resources and is a frequent source of stability problems. These problems have become more prominent as the number of EMR customers and Spark users keeps growing and containerized applications become popular.
To address this, Tencent Cloud Elastic MapReduce (EMR) has added a new RSS cluster type that supports the Uniffle component. Spark clusters on both EMR on CVM and EMR on TKE can bind to an RSS cluster with one click, which provides a remote Shuffle service for Spark jobs. The goal is to improve the performance of tasks that shuffle large volumes of data and to make tasks more stable in node recycling scenarios.

Creating an RSS Cluster

1. Log in to the EMR console, click Create Cluster on the EMR on CVM cluster list page, and select RSS as the cluster type. Cluster creation takes 4 steps: software configuration, region and hardware configuration, basic settings, and confirmation of the configuration information.
2. In the software configuration step, select the region carefully, because Shuffle services do not work across regions.
3. In the region and hardware configuration step, note that a high-availability (HA) cluster requires at least 4 nodes: 2 master nodes and at least 2 core nodes. A non-HA cluster stores only one replica; it is suitable for testing but not recommended for production environments.
Note:
1. Size the memory, disk I/O bandwidth, and other resources of the new RSS cluster based on the number of parallel Spark SQL tasks and the Shuffle data volume of those tasks.
2. An RSS cluster can be shared by all Spark clusters in the same region.
4. After completing the 4 configuration steps above, click Purchase to make the payment. Once the payment succeeds, the cluster enters the creation process. The new cluster appears in the EMR console in about 10 minutes.
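The node-count rule for HA clusters in step 3 can be checked before purchase. The following Python sketch is only an illustration of that rule: the function name check_ha_topology is hypothetical, and it encodes nothing beyond the minimums stated above (2 master nodes and at least 2 core nodes).

```python
def check_ha_topology(master_nodes, core_nodes):
    """Return a list of problems with a planned HA RSS cluster layout.

    Hypothetical helper: it only encodes the minimums stated in step 3
    (2 master nodes and at least 2 core nodes, so at least 4 nodes in total).
    """
    problems = []
    if master_nodes < 2:
        problems.append(f"HA needs 2 master nodes, got {master_nodes}")
    if core_nodes < 2:
        problems.append(f"HA needs at least 2 core nodes, got {core_nodes}")
    return problems


if __name__ == "__main__":
    for layout in [(2, 2), (1, 3)]:
        issues = check_ha_topology(*layout)
        print(layout, "OK" if not issues else issues)
```

An empty list means the layout meets the stated minimums. It says nothing about whether the memory or disk I/O capacity is sufficient for your Shuffle data volume.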

Operation Steps

Using Uniffle in Spark Clusters

Using Uniffle in EMR on CVM Spark Clusters

1. In the EMR on CVM cluster list, find the cluster for which you want to enable the Shuffle service, and click Manage under Service.
2. Select Cluster Service, and in the Spark Service panel, click Operation and then Set Shuffle Service.
3. Enable the service and select the ID of the RSS cluster to bind.
By default, the RSS shuffle manager (spark.shuffle.manager) is not enabled. Once it is enabled, all submitted Spark tasks use the RSS Shuffle service. We recommend enabling it at the task level. The steps are as follows:
1. Modify the spark-defaults.conf configuration file for the Spark component on the Spark configuration management page.
spark.shuffle.manager=org.apache.spark.shuffle.RssShuffleManager
spark.driver.extraClassPath=/usr/local/service/spark/rss-client/* (If this parameter already exists, separate multiple values with a colon :)
spark.executor.extraClassPath=/usr/local/service/spark/rss-client/* (If this parameter already exists, separate multiple values with a colon :)
2. Spark dynamic allocation.
2.1 If you use Spark dynamic resource allocation (spark.dynamicAllocation.enabled=true) together with Uniffle, a patch that matches your Spark version must be applied to Spark. Submit a ticket for this.
2.2 Modify the spark-defaults.conf configuration file: spark.shuffle.service.enabled=false.
Note:
If the Spark version is 3.5 or later, you must also configure the parameter spark.shuffle.sort.io.plugin.class=org.apache.spark.shuffle.RssShuffleDataIo.
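For task-level use, the same properties can be passed to a single job through the --conf option of spark-submit instead of being written to spark-defaults.conf. The Python sketch below assembles those arguments under a few assumptions: the helper names merge_classpath, rss_submit_conf, and to_submit_args are hypothetical, the property values are copied from the steps above, and the version string is expected in major.minor[.patch] form. It does not apply the dynamic allocation patch from step 2.1; it only sets spark.shuffle.service.enabled=false when dynamic allocation is in use.

```python
# Path copied from the steps above.
RSS_CLIENT_PATH = "/usr/local/service/spark/rss-client/*"


def merge_classpath(existing, extra=RSS_CLIENT_PATH):
    """Append `extra` to a colon-separated classpath without duplicating it."""
    parts = [p for p in (existing or "").split(":") if p]
    if extra not in parts:
        parts.append(extra)
    return ":".join(parts)


def rss_submit_conf(spark_version, driver_cp="", executor_cp="",
                    dynamic_allocation=False):
    """Build the per-job Spark properties for using Uniffle (hypothetical helper).

    spark_version: "major.minor" or "major.minor.patch", for example "3.5.1".
    driver_cp / executor_cp: classpath values the job already uses, if any.
    """
    major, minor = (int(x) for x in spark_version.split(".")[:2])
    conf = {
        "spark.shuffle.manager": "org.apache.spark.shuffle.RssShuffleManager",
        "spark.driver.extraClassPath": merge_classpath(driver_cp),
        "spark.executor.extraClassPath": merge_classpath(executor_cp),
    }
    if dynamic_allocation:
        # Step 2.2: the Spark shuffle service setting must be turned off.
        conf["spark.shuffle.service.enabled"] = "false"
    if (major, minor) >= (3, 5):
        # Extra parameter required for Spark 3.5 or later (see the note above).
        conf["spark.shuffle.sort.io.plugin.class"] = (
            "org.apache.spark.shuffle.RssShuffleDataIo"
        )
    return conf


def to_submit_args(conf):
    """Turn a property dict into a flat list of spark-submit arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    conf = rss_submit_conf("3.5.1", driver_cp="/opt/libs/*")
    print(" ".join(["spark-submit"] + to_submit_args(conf)))
```

The printed command still needs the application JAR or script and its arguments appended. Here /opt/libs/* is only a placeholder for a classpath entry the job might already have.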

Disabling Uniffle in EMR on CVM Spark Clusters

1. In the Spark Service panel, click Operation and then Set Shuffle Service, and turn off the service.
2. Remove the Spark parameter spark.shuffle.manager=org.apache.spark.shuffle.RssShuffleManager on the configuration management page and apply the change.
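After removing the parameter, you may want to confirm that no copy of the configuration still points to the RSS shuffle manager. The Python sketch below checks the text of a spark-defaults.conf file. It is a simplified illustration: the function names are hypothetical, and the parser assumes each line has the form key=value or key value, with # starting a comment line.

```python
import re

RSS_MANAGER = "org.apache.spark.shuffle.RssShuffleManager"

# Key, then an optional "=" or whitespace separator, then the value.
_LINE = re.compile(r"([^\s=]+)(?:\s*[=\s]\s*(.*))?$")


def parse_spark_defaults(text):
    """Parse spark-defaults.conf style text into a dict (simplified parser)."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        match = _LINE.match(line)
        if match:
            conf[match.group(1)] = (match.group(2) or "").strip()
    return conf


def still_uses_rss(text):
    """Return True if spark.shuffle.manager is still set to the RSS manager."""
    return parse_spark_defaults(text).get("spark.shuffle.manager") == RSS_MANAGER


if __name__ == "__main__":
    sample = (
        "spark.executor.memory 4g\n"
        "spark.shuffle.manager=org.apache.spark.shuffle.RssShuffleManager\n"
    )
    print(still_uses_rss(sample))
```

A job can still enable Uniffle with its own --conf options, so this check only covers the cluster-wide defaults.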

Using Uniffle in EMR on TKE Spark Clusters

The operations are the same as those for using Uniffle in EMR on CVM Spark clusters, except that the console paths for binding and unbinding the RSS cluster are different. The paths are as follows:
1. In the EMR on TKE cluster list, find the cluster for which you want to enable the Shuffle service, and click the cluster ID to open the Cluster Information page.
2. Select Cluster Service. In the Spark Service panel, click Operation in the upper right corner, and then click Set Shuffle Service.
3. Enable the service and select the ID of the RSS cluster to bind.
