
Accessing Iceberg Data with Hive

Last updated: 2024-10-30 11:41:59

Development Preparation

Make sure you have signed up for Tencent Cloud and created an EMR cluster. For more details, see Creating a Cluster.
When creating the EMR cluster, select the Hive, Spark, and Iceberg components on the software configuration page.

Using Spark to Create an Iceberg Table

Log in to the Master node, switch to the hadoop user, and execute the following command to start SparkSQL:
spark-sql --master local[*] \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
  --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.local.type=hadoop \
  --conf spark.sql.catalog.local.warehouse=/usr/hive/warehouse \
  --jars /usr/local/service/iceberg/iceberg-spark-runtime-3.2_2.12-0.13.0.jar

Note:
The Iceberg-related packages are located in the /usr/local/service/iceberg/ directory. The versions of the dependency packages passed to --jars may differ between EMR versions, so check this directory and use the JAR file names that match your cluster.
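Because the JAR file name changes with the EMR version, it may be convenient to look it up rather than hard-code it. The following is a minimal sketch, not part of the EMR tooling: the function name find_iceberg_jar is hypothetical, and it assumes the JAR files sit directly in the given directory and that their names start with a prefix such as iceberg-spark-runtime.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an EMR command): print the path of a JAR in
# directory $1 whose file name starts with prefix $2.
# If several files match, the one that sorts last by name is printed.
find_iceberg_jar() {
  local dir="$1" prefix="$2" match
  # Look only at regular files directly inside the directory.
  match=$(find "$dir" -maxdepth 1 -type f -name "${prefix}*.jar" 2>/dev/null | sort | tail -n 1)
  if [ -z "$match" ]; then
    echo "no JAR matching ${prefix}*.jar in ${dir}" >&2
    return 1
  fi
  printf '%s\n' "$match"
}

# Example (on a cluster node; directory taken from the note above):
#   find_iceberg_jar /usr/local/service/iceberg iceberg-spark-runtime
```

On a Master node, the result could then be passed to spark-sql, for example with --jars "$(find_iceberg_jar /usr/local/service/iceberg iceberg-spark-runtime)". Plain name sorting is only a rough tie-breaker, so check the selected file if more than one version is installed.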
Create a table:
spark-sql> CREATE TABLE local.default.t1 (id int, name string) USING iceberg;
Time taken: 2.752 seconds
Insert data:
spark-sql> INSERT INTO local.default.t1 values(1, "tom");
Time taken: 2.71 seconds
Query data:
spark-sql> SELECT * from local.default.t1;
1 tom
Time taken: 0.558 seconds, Fetched 1 row(s)

Using Hive to View Iceberg Data

Log in to the Master node, switch to the hadoop user, and execute the following command to connect to Hive:
hive
Add the Iceberg dependency package (the version in the file name may differ on your cluster):
hive> add jar /usr/local/service/iceberg/iceberg-hive-runtime-0.13.0.jar;
Create an external table:
hive> CREATE EXTERNAL TABLE t1
STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
LOCATION '/usr/hive/warehouse/default/t1'
TBLPROPERTIES ('iceberg.catalog'='location_based_table');
Query the record count of the t1 table:
hive> select count(*) from t1;
OK
1
Time taken: 26.255 seconds, Fetched: 1 row(s)
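The LOCATION used for the Hive external table corresponds to the table created earlier through Spark: the warehouse path /usr/hive/warehouse, followed by the namespace default and the table name t1. The sketch below illustrates that relationship. It assumes a Hadoop-type catalog with a single-level namespace laid out as warehouse/namespace/table, as in this walkthrough; the function name iceberg_table_location is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the storage path of a table in a Hadoop-type
# Iceberg catalog, assuming the layout <warehouse>/<namespace>/<table>.
iceberg_table_location() {
  local warehouse="${1%/}"   # drop one trailing slash, if present
  local namespace="$2"
  local table="$3"
  printf '%s/%s/%s\n' "$warehouse" "$namespace" "$table"
}

# Values from this walkthrough: warehouse /usr/hive/warehouse, table default.t1
iceberg_table_location /usr/hive/warehouse default t1
```

With the values from this document, the printed path is the one given in the LOCATION clause. If the Spark catalog uses a different warehouse path, the LOCATION in the Hive statement has to change accordingly.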
