HDFS federation management covers the deployment and management of HDFS federated clusters, including NameService management and mount table management. It is supported for Hadoop-type clusters in HA mode. Two federation types are available: ViewFs federation and router-based federation. The federation type cannot be changed once selected. Each added NameNode is deployed on a router node, and a router node cannot be terminated or have its roles started or stopped at the node level.
- The HDFS federation management feature is currently available through an allowlist. To use it, submit a ticket to apply.
- All EMR versions support ViewFs federation. As only HDFS v2.9.0 and later support router-based federation, only EMR v3.x.x and later support router-based federation.
- Suspending the NameNode role process on a federated node on the Role Management page may affect cluster scaling, so resume the process before scaling the cluster.
- After NameNodes are federated, the `fs.defaultFS` configurations on different NameNodes will differ. When delivering the configuration of the HDFS `core-site.xml` file, do not select the cluster level; otherwise, the `fs.defaultFS` values on NameNodes will be overwritten. Other configuration files will not be affected.
- For HDFS versions earlier than 3.3.0, each time you successfully add a NameService to a router-based federation after the first one, you need to restart the existing DFSRouter process on the Role Management page (preferably during off-peak hours). HDFS v3.3.0 and later support hot loading of the configuration, so this restart is not required.
- After adding a federated NameService to a cluster with Kerberos enabled, you need to restart the YARN ResourceManager first (preferably during off-peak hours) before you can use the files on the new NameService for jobs submitted to YARN.
- The NameService name cannot be modified or deleted once set, and it cannot be a system keyword such as "nsfed", "haclusterX", or "ClusterX".
- Path resolution:
  - On a NameNode, `hdfs dfs -ls /` points to the path under the namespace managed by that NameNode. To point to the global path, use `hdfs dfs -ls viewfs://ClusterX/` for ViewFs federation or `hdfs dfs -ls hdfs://nsfed/` for router-based federation.
  - On another node, such as the router node serving as a client, `hdfs dfs -ls /` points to the global path.
- The data of all business components must be placed in first-level directories rather than directly in the root directory to be accessible, as the root directory cannot be mounted.
- The default NameService has the `/emr` directory, which needs to be mounted.
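
The path and restart notes above can be summarized in a small shell sketch. It is illustrative only: the function names `global_root`, `list_global_cmd`, and `needs_router_restart` are hypothetical helpers and are not part of EMR or Hadoop. The ViewFs URI is written with the lowercase `viewfs` scheme that Hadoop registers, and the listing command is printed rather than executed so that the sketch runs without a cluster.

```shell
#!/bin/sh
# Hypothetical helpers; ClusterX and nsfed are the names used in the notes above.

# Print the global root URI for a federation type ("viewfs" or "router").
global_root() {
  case "$1" in
    viewfs) printf '%s\n' 'viewfs://ClusterX/' ;;
    router) printf '%s\n' 'hdfs://nsfed/' ;;
    *) echo "unknown federation type: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) the command that lists the global path from a NameNode.
list_global_cmd() {
  root=$(global_root "$1") || return 1
  printf 'hdfs dfs -ls %s\n' "$root"
}

# Succeed (exit status 0) when the HDFS version is earlier than 3.3.0,
# that is, when DFSRouter must be restarted after adding another NameService.
needs_router_restart() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split "3.2.2" into 3, 2, 2
  IFS=$old_ifs
  major=${1:-0}
  minor=${2:-0}
  if [ "$major" -lt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -lt 3 ]; }; then
    return 0
  fi
  return 1
}

list_global_cmd viewfs   # prints: hdfs dfs -ls viewfs://ClusterX/
list_global_cmd router   # prints: hdfs dfs -ls hdfs://nsfed/
```

For example, `needs_router_restart 3.2.2 && echo "restart DFSRouter"` prints the reminder, while the same check with `3.3.0` prints nothing.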