tencent cloud

Stream Compute Service

Hive Catalogs

Last updated: 2023-11-07 16:26:48

Overview

In the Stream Compute Service console, you can configure and use Hive catalogs and view the Hive metadata available to a SQL job. Once table metadata is stored in a Hive metastore, the job no longer needs to declare those tables with DDL statements; it can reference them directly in the three-segment format catalog_name.database_name.table_name.

Hive support

Flink Version    Hive Support
1.11             Not supported
1.13             Hive v2.2.0, v2.3.2, v2.3.5, and v3.1.1 supported
1.14             Not supported

Prerequisites

The Hive metastore service has been started. The related commands are as follows:
hive --service metastore: Start the Hive metastore service.
ps -ef | grep metastore: Check whether the service has started successfully.

Directions

Creating a Hive catalog

Switch to the _dc directory and click Create Hive catalog. Upload the configuration files hive-site.xml (with the URLs added to it), hdfs-site.xml, hivemetastore-site.xml, and hiveserver2-site.xml (download them here).

Creating a database

You can create databases in a SQL job. The database reference uses a two-segment format of catalog_name.database_name.
CREATE DATABASE IF NOT EXISTS `hiveCatalogName`.`databaseName`;

Creating a table

You can create tables in a SQL job. The table reference uses a three-segment format of catalog_name.database_name.table_name.
CREATE TABLE IF NOT EXISTS `hiveCatalogName`.`databaseName`.`tableName` (
    user_id INT,
    item_id INT,
    category_id INT,
    -- ts AS localtimestamp,
    -- WATERMARK FOR ts AS ts,
    behavior VARCHAR
) WITH (
    'connector' = 'datagen',
    'rows-per-second' = '1',                -- Number of rows generated per second
    'fields.user_id.kind' = 'sequence',     -- Sequence generator; it is bounded, so output stops automatically once the sequence ends
    'fields.user_id.start' = '1',           -- Start value of the sequence
    'fields.user_id.end' = '10000',         -- End value of the sequence
    'fields.item_id.kind' = 'random',       -- Random generator (unbounded)
    'fields.item_id.min' = '1',             -- Minimum random value
    'fields.item_id.max' = '1000',          -- Maximum random value
    'fields.category_id.kind' = 'random',   -- Random generator (unbounded)
    'fields.category_id.min' = '1',         -- Minimum random value
    'fields.category_id.max' = '1000',      -- Maximum random value
    'fields.behavior.length' = '5'          -- Length of the generated random string
);
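
The INSERT statement in the next section reads from one catalog table and writes to another. As a companion to the datagen source above, a sink table could be registered in the same catalog along the following lines. This is only a sketch: `sink_tableName` is a placeholder, and it assumes the standard Flink `print` connector is available in your cluster. Substitute the connector you actually write to.

```sql
-- Hypothetical sink table with the same columns as the datagen source above.
-- Assumption: the built-in Flink 'print' connector is available; replace it with your real sink.
CREATE TABLE IF NOT EXISTS `hiveCatalogName`.`databaseName`.`sink_tableName` (
    user_id INT,
    item_id INT,
    category_id INT,
    behavior VARCHAR
) WITH (
    'connector' = 'print'  -- Writes each row to the TaskManager standard output
);
```

Because the definition is stored in the Hive metastore, later jobs can reference this table by its three-segment name without repeating the DDL.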

Referencing a Hive catalog table in a SQL job

In the SQL job, place the cursor where the table reference should be inserted, find the target table in the left sidebar, and click Operation > Reference.
INSERT INTO
`hiveCatalogName`.`databaseName`.`sink_tableName`
SELECT
*
FROM
`hiveCatalogName`.`databaseName`.`source_tableName`;
Note
You can reference only one Hive catalog in a job.
The DROP operation is unavailable on a Hive catalog.
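
SELECT * only works when the source and sink columns line up exactly. A variation on the statement above that lists the columns explicitly and filters rows might look like the following. The column names come from the example table created earlier, and the filter condition is purely illustrative.

```sql
-- Sketch: copy selected rows between two tables in the same Hive catalog.
-- Column names follow the earlier CREATE TABLE example; the WHERE clause is illustrative only.
INSERT INTO
    `hiveCatalogName`.`databaseName`.`sink_tableName`
SELECT
    user_id,
    item_id,
    category_id,
    behavior
FROM
    `hiveCatalogName`.`databaseName`.`source_tableName`
WHERE
    category_id <= 100;  -- Keep only rows whose category_id is at most 100
```

Both tables sit in the same catalog, which keeps the job within the one-catalog limit noted above.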

Deleting a Hive catalog

In the left sidebar, click Delete next to the Hive catalog you want to delete.

Granting permissions

Jobs that use Hive catalogs in Stream Compute Service access HDFS files during execution, so the Flink user must be granted access to those files. The details are as follows:
Execute the following commands on all Hive master nodes.
useradd flink
groupadd supergroup
usermod -a -G supergroup flink
hdfs dfsadmin -refreshUserToGroupsMappings
We recommend that you manage the permissions in Hive by adding the following configuration option to hive-site.xml:
<property>
    <name>hive.metastore.authorization.storage.checks</name>
    <value>true</value>
    <description>Should the metastore do authorization checks against
    the underlying storage for operations like drop-partition (disallow
    the drop-partition if the user in question doesn't have permissions
    to delete the corresponding directory on the storage).</description>
</property>

