Hive Metadata Management

Last updated: 2024-10-30 10:29:49
When you deploy the Hive component, Hive metadata can be stored in one of two ways. The first is the cluster default, where metadata is stored in a MetaDB purchased separately for the cluster. The second is to associate an external Hive Metastore, either EMR-MetaDB or a self-built MySQL database. In that case, metadata is stored in the associated database and is not destroyed when the cluster is terminated.
By default, the cluster automatically purchases a separate MetaDB CloudDB instance as the metadata storage unit and stores Hive metadata there together with the metadata of other components. This MetaDB CloudDB instance is terminated along with the cluster. To retain the metadata, you need to manually save it in CloudDB beforehand.
Note
1. Hive metadata is stored together with the metadata of Druid, Superset, Hue, Ranger, Oozie, and Presto components.
2. The cluster requires a separate purchase of a MetaDB as a metadata storage unit.
3. The MetaDB is terminated along with the cluster, meaning that the metadata is also terminated with the cluster.
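One possible way to keep a copy of the metadata before terminating the cluster is to export the metastore database with the mysqldump client. The following is only a minimal sketch: it assumes the MetaDB is reachable from the machine where the command runs and that mysqldump is installed there. The helper name build_dump_command and the host, port, user, database, and file names are hypothetical placeholders, not values taken from EMR.

```python
import shlex


def build_dump_command(host, port, user, database, out_file):
    """Build a mysqldump argument list that exports one database to a file.

    All arguments are placeholders supplied by the caller. The bare -p makes
    mysqldump prompt for the password instead of reading it from the
    command line.
    """
    return [
        "mysqldump",
        f"--host={host}",
        f"--port={port}",
        f"--user={user}",
        "-p",
        "--single-transaction",  # consistent snapshot for InnoDB tables
        f"--result-file={out_file}",
        database,
    ]


if __name__ == "__main__":
    # Hypothetical values; replace them with those of your MetaDB instance.
    cmd = build_dump_command(
        "10.0.0.10", 3306, "root", "hivemetastore", "hive_meta.sql"
    )
    # Print the command for review before running it, for example with
    # subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Printing the command first lets you check the target instance and database name before anything is executed.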

Associating EMR-MetaDB to Share Hive Metadata

When you create a cluster, the system will pull an available MetaDB from the cloud to store the metadata for the new cluster’s Hive component. This eliminates the need for a separate MetaDB purchase, reducing costs. Moreover, Hive metadata will not be terminated along with the current cluster.
Note
1. The MetaDB instance ID you select must belong to an existing MetaDB of an EMR cluster under the same account.
2. When you select one or more of the components Hue, Ranger, Oozie, Druid, and Superset, the system will automatically purchase a MetaDB for storing the metadata of components other than Hive.
3. To terminate an associated EMR-MetaDB, you need to do so via CloudDB. Once it is terminated, the Hive Metastore cannot be recovered.
4. Ensure that the network of the associated EMR-MetaDB is in the same network environment as that of the created cluster.
1. When you create a cluster and select the Hive component, click Next and choose the associated EMR-MetaDB:



2. For clusters without the Hive component installed, when the Hive component is added, select the associated EMR-MetaDB:




Associating Self-Built MySQL to Share Hive Metadata

Associating a self-built MySQL database as Hive metadata storage also avoids the need to purchase a separate MetaDB for Hive metadata, which saves costs. Enter the database address, which must start with jdbc:mysql://, along with the database name and the database login password, and make sure the database is reachable over the network from the current cluster.
Note
1. Ensure that the self-built database and the EMR cluster are in the same network.
2. Enter the correct database username and password.
3. When you select one or more of the components Hue, Ranger, Oozie, Druid, and Superset, the system will automatically purchase a MetaDB for storing the metadata of components other than Hive.
4. Ensure that the Hive metadata version in the self-built database is greater than or equal to the Hive version in the new cluster.
5. Ensure that the self-built MySQL database contains an initialized Hive Metastore database and its tables.
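Before entering the address in the console, it can help to check its format and confirm that the database port is reachable from a machine in the same network as the cluster. The following is a minimal sketch of such a pre-check, not part of any EMR tooling. The function names parse_mysql_jdbc_url and is_reachable and the sample address are hypothetical, and the fallback to port 3306 assumes the MySQL default.

```python
import socket
from urllib.parse import urlsplit

JDBC_PREFIX = "jdbc:mysql://"


def parse_mysql_jdbc_url(url):
    """Split a jdbc:mysql:// address into (host, port, database).

    Raises ValueError if the prefix, host, or database name is missing.
    The port falls back to 3306 (the MySQL default) when it is omitted.
    """
    if not url.startswith(JDBC_PREFIX):
        raise ValueError(f"address must start with {JDBC_PREFIX}")
    # Drop the leading "jdbc:" so urlsplit sees an ordinary mysql:// URL.
    parts = urlsplit(url[len("jdbc:"):])
    database = parts.path.lstrip("/")
    if not parts.hostname or not database:
        raise ValueError("address must include a host and a database name")
    return parts.hostname, parts.port or 3306, database


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical address; replace it with your self-built MySQL address.
    host, port, database = parse_mysql_jdbc_url(
        "jdbc:mysql://10.0.0.5:3306/hivemetastore"
    )
    print(host, port, database, is_reachable(host, port))
```

A successful TCP connection only shows that the port is open. It does not verify the username, the password, or the Hive metadata version.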
1. When you create a cluster and select the Hive component, click Next and link the self-built MySQL database:



2. For clusters without the Hive component installed, when the Hive component is added, link the self-built MySQL database:




Fixing Issues When Hive Is Linked to a Self-Built Metadata Database

Selecting a self-built MySQL database that contains no Hive metadata when creating an EMR cluster can cause problems with the Hive processes.

Issue Reproduction





Solutions

If the Hive metadata database contains no data, follow the steps below:
Note
Replace ${ip}, ${port}, and ${database} with your actual values.
1. Stop HiveServer2 (hs2) and metastore for Hive in the console.
2. Modify the hive-site.xml to proto-hive-site.xml for the Hive component and issue it accordingly.
Configuration item: javax.jdo.option.ConnectionURL
jdbc:mysql://${ip}:${port}/${database}?useSSL=false&createDatabaseIfNotExist=true&characterEncoding=UTF-8
3. Delete the database in the CDB:
drop database ${database};
4. Execute the following command as the hadoop user:
/usr/local/service/hive/bin/schematool -dbType mysql -initSchema
5. Start hs2 and metastore for Hive from the console.
6. Check if Hive is functioning normally.
If character set errors occur, execute the following statements in CDB:
alter database ${database} character set latin1;
flush privileges;
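The placeholder substitution in the steps above can be sketched with Python's string.Template, which uses the same ${name} syntax. This is only an illustration for preparing the values before you enter them in the console or in CDB. The function name render_fix_plan, the dictionary keys drop_sql and init_command, and the sample values are hypothetical.

```python
from string import Template
from xml.sax.saxutils import escape

# Strings copied from the steps above. ${ip}, ${port}, and ${database} are
# the placeholders to replace with your own values.
CONNECTION_URL = Template(
    "jdbc:mysql://${ip}:${port}/${database}"
    "?useSSL=false&createDatabaseIfNotExist=true&characterEncoding=UTF-8"
)
DROP_DATABASE = Template("drop database ${database};")
INIT_SCHEMA = "/usr/local/service/hive/bin/schematool -dbType mysql -initSchema"


def render_fix_plan(ip, port, database):
    """Fill in the placeholders and return the values used in steps 2 to 4."""
    values = {"ip": ip, "port": port, "database": database}
    return {
        "javax.jdo.option.ConnectionURL": CONNECTION_URL.substitute(values),
        "drop_sql": DROP_DATABASE.substitute(values),
        "init_command": INIT_SCHEMA,
    }


if __name__ == "__main__":
    # Hypothetical values; replace them with your own.
    plan = render_fix_plan("10.0.0.5", 3306, "hivemetastore")
    for name, value in plan.items():
        print(f"{name}: {value}")
    # If the URL is ever pasted into an XML file by hand rather than entered
    # in the console, each "&" must be written as "&amp;".
    print(escape(plan["javax.jdo.option.ConnectionURL"]))
```

Template.substitute raises KeyError when a placeholder is left unset, which guards against delivering a URL that still contains a literal ${...}.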

