Configuring Capacity Scheduler

Last updated: 2023-12-27 14:32:16

Overview

Capacity Scheduler organizes resources in a hierarchical manner, allowing multiple users to share cluster resources based on multi-level resource restrictions.

Directions

Creating a resource pool

1. Log in to the EMR console and click Details of the target Hadoop cluster in the cluster list to go to the cluster details page.
2. On the cluster details page, select Cluster Service and choose Operation > Resource Scheduling in the top-right corner of the YARN component block to go to the Resource Scheduling page.


3. Toggle on Resource scheduler and configure the scheduler.
4. Create a resource pool for Capacity Scheduler: select Capacity Scheduler and click Create Resource Pool. You can also edit or clone an existing resource pool, or create a subpool for it. To set how many times scheduling can be delayed, click Default settings.



Fields and parameters:
| Field Name | Parameter | Description |
|---|---|---|
| Resource Pool Name | `yarn.scheduler.capacity.<queue-path>.queues` | Name of the resource pool (queue). |
| Label Settings | N/A | Labels that the queue can access. |
| Capacity | `yarn.scheduler.capacity.<queue-path>.capacity` | Queue capacity as a percentage of the parent pool. The capacities of all subpools of a parent pool must total 100. Available resource amount = resource amount of the parent pool × percentage set here. A queue can consume more than its capacity if other queues have idle resources. |
| Max Capacity | `yarn.scheduler.capacity.<queue-path>.maximum-capacity` | Maximum queue capacity as a percentage. Because resources are shared, a queue may use more than its capacity; this field caps the amount of resources the queue can use. |
| Default Label Expression | `yarn.scheduler.capacity.<queue-path>.default-node-label-expression` | Partition to which an application is submitted when its resource request does not specify a node label. The value is empty by default, in which case containers are allocated on nodes with no label. |
| Min User Capacity | `yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent` | Minimum percentage of queue resources guaranteed to each user. Each queue limits the percentage of resources allocated to a user at any given time. When applications from multiple users run in a queue concurrently, each user's share varies between a minimum and a maximum value. The minimum value is set by `minimum-user-limit-percent`, and the maximum value depends on the number of users who have submitted applications. |
| User Resource Factor | `yarn.scheduler.capacity.<queue-path>.user-limit-factor` | Maximum amount of resources, as a percentage, that each user can use. For example, if the value is `30`, no user can use more than 30% of the queue capacity at any given time. |
| Max Memory per Container | `yarn.scheduler.capacity.<queue-path>.maximum-allocation-mb` | Maximum memory that can be allocated to each container. The value overrides the system-level `yarn.scheduler.maximum-allocation-mb` for this queue and cannot be greater than it. |
| Max vCores per Container | `yarn.scheduler.capacity.<queue-path>.maximum-allocation-vcores` | Maximum number of CPU cores that can be allocated to each container. The value overrides the system-level `yarn.scheduler.maximum-allocation-vcores` for this queue and cannot be greater than it. |
| Resource Pool Status | `yarn.scheduler.capacity.<queue-path>.state` | Status of the queue, either `Running` or `Stopped`. If a queue is in the `Stopped` status, new applications cannot be submitted to it or to any of its subqueues. |
| Max Apps | `yarn.scheduler.capacity.<queue-path>.maximum-applications` | Maximum number of applications that can be concurrently active (running or pending) in the queue. |
| Max Resources for AM | `yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent` | Maximum percentage of cluster resources that can be used to run application masters. It controls the number of concurrently active applications. |
| Resource Pool Priority | `yarn.scheduler.capacity.root.<leaf-queue-path>.default-application-priority` | Priority of the resource queue, which is `0` by default. The larger the value, the higher the priority. |
| Submission | `yarn.scheduler.capacity.root.<queue-path>.acl_submit_applications` | List of users that can submit applications to the queue. |
| Management | `yarn.scheduler.capacity.root.<queue-path>.acl_administer_queue` | List of users that can manage the queue. |
| Delayed Scheduling | `yarn.scheduler.capacity.node-locality-delay` | Number of times scheduling can be delayed so that tasks are more likely to run locally. If the value is `-1`, delayed scheduling is disabled. |
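The Capacity and Min User Capacity fields both come down to simple arithmetic, and a small calculation can make them easier to reason about. The following Python sketch is illustrative only. The queue tree and its percentages are hypothetical, and the per-user formula is a simplified reading of the upstream Apache Hadoop documentation (an even split among users with active applications, floored at `minimum-user-limit-percent`); it ignores `user-limit-factor` and is not how the EMR console itself computes anything.

```python
from math import isclose

# Hypothetical queue tree: each queue path maps to its Capacity value,
# expressed as a percentage of its parent pool.
CAPACITY = {
    "root": 100.0,
    "root.prod": 70.0,
    "root.dev": 30.0,
    "root.dev.etl": 60.0,
    "root.dev.adhoc": 40.0,
}


def absolute_capacity(queue_path, capacity=CAPACITY):
    """Return the queue's share of the whole cluster, in percent."""
    share = 100.0
    parts = queue_path.split(".")
    # Multiply the percentages along the path from root down to the queue.
    for depth in range(1, len(parts) + 1):
        share *= capacity[".".join(parts[:depth])] / 100.0
    return share


def siblings_sum_to_100(parent, capacity=CAPACITY):
    """Check that the direct subpools of `parent` add up to 100."""
    depth = parent.count(".") + 1
    children = [
        value for path, value in capacity.items()
        if path.startswith(parent + ".") and path.count(".") == depth
    ]
    return isclose(sum(children), 100.0)


def user_limit_percent(min_user_limit_percent, active_users):
    """Simplified per-user share of a queue: even split with a floor."""
    return max(100.0 / active_users, float(min_user_limit_percent))
```

With these hypothetical values, `absolute_capacity("root.dev.etl")` works out to about 18% of the cluster (30% of root, then 60% of that). With `minimum-user-limit-percent` set to 25, `user_limit_percent(25, 2)` gives each of two users 50% of the queue, while five users are each held at the 25% floor.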

Configuring resource pool mappings

1. In the Policy Settings section, choose Resource Pool Mappings > Create Resource Pool Mapping to create a resource pool mapping.




2. Configure Overwrite Specified Queues. This feature is disabled by default. For example, suppose you have defined a mapped queue in Resource Pool Mappings and, when submitting a job, you specify a queue other than the mapped queue. If the specified queue is default and Overwrite Specified Queues is enabled, the mapped queue is used; otherwise, the specified queue is used.

Sample label-based scheduling

1. Log in to the EMR console and click Details of the target Hadoop cluster in the cluster list to go to the cluster details page.
2. On the cluster details page, select Cluster Service and choose Operation > Resource Scheduling in the top-right corner of the YARN component block to go to the Resource Scheduling page.
3. Toggle on Resource scheduler and select Capacity Scheduler.
4. Toggle on Label-based scheduling and click Label management to go to the label management page.


5. Click Create Label, enter the label name, and set the label type and node to be bound to as needed.


6. After the label is set, click Apply. Then, you can view and edit the resource queue of the label in the resource pool.


7. On the Resource Scheduling page, click Create Resource Pool and set Label, Capacity, Max Capacity, and other parameters as needed.
Note
The capacity and maximum capacity of a resource pool can be configured by label.

8. After the resource pool is set, click Apply to submit the task to the backend.
Caution
Proceed with caution when restarting the ResourceManager. If, after you click Apply, you are prompted that the ResourceManager will be restarted, check in Scheduling History that the operation succeeded and in Role Management that the ResourceManager is healthy.
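To relate the per-label Capacity and Max Capacity settings in the steps above to scheduler configuration, the following Python sketch builds the corresponding property names as they appear in upstream Apache Hadoop (`accessible-node-labels` and its per-label `capacity` and `maximum-capacity`). The queue path `root.dev` and the label `gpu` are hypothetical, and it is an assumption that the EMR console writes exactly these keys when you click Apply.

```python
def label_capacity_properties(queue_path, label, capacity, max_capacity):
    """Build per-label capacity settings for one queue.

    Property names follow upstream Apache Hadoop; the queue and label
    names passed in by the caller are hypothetical examples.
    """
    if not 0 <= capacity <= max_capacity <= 100:
        raise ValueError("expected 0 <= capacity <= max_capacity <= 100")
    prefix = f"yarn.scheduler.capacity.{queue_path}"
    return {
        # Labels the queue is allowed to access.
        f"{prefix}.accessible-node-labels": label,
        # Share of the label's partition given to this queue, in percent.
        f"{prefix}.accessible-node-labels.{label}.capacity": str(capacity),
        # Upper bound on that share when other queues are idle.
        f"{prefix}.accessible-node-labels.{label}.maximum-capacity": str(max_capacity),
    }


props = label_capacity_properties("root.dev", "gpu", 40, 80)
for name, value in props.items():
    print(f"{name} = {value}")
```

In upstream Hadoop, the per-label capacities of sibling queues are expected to total 100 for each label, in the same way as the default capacities, so a real configuration would set these values for every queue that can access the label.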
