Tencent Cloud WeData

S3 Data Source

Last updated: 2025-07-25 17:23:07

S3 Offline Single Table Read Node Configuration


Parameter
Description
Data Source
Select an available S3 data source from the current project.
Synchronization Scope
Full synchronization: All files under the file path are synchronized.
Incremental synchronization: Files under the file path that were updated between the scheduled time of the previous cycle and the scheduled time of the current cycle are synchronized. When debugging a synchronization task with incremental synchronization selected, all files under the file path are synchronized by default.
Note:
The synchronization scope takes effect only when the file path is a directory. If the path points to a specific file, the synchronization scope does not apply and that file is synchronized directly.
Synchronization Method
S3 supports two synchronization methods:
Data synchronization: Parses structured file content and maps and synchronizes it based on field mapping relationships.
File transfer: Transfers the entire file without parsing its content; applicable to synchronizing unstructured data.
Note:
File transfer is supported only when both the source and the target are file-type data sources (COS/HDFS/SFTP/FTP/S3/Azure Blob/HTTP), and both ends must use the file transfer synchronization method.
File Path
Path in the S3 file system; the * wildcard is supported.
The S3 file path must include the bucket name, for example s3a://bucket_name.
File Type
S3 supports five file types: TXT, ORC, PARQUET, CSV, JSON.
TXT: Represents the TextFile file format.
ORC: Represents the ORCFile file format.
PARQUET: Represents the ordinary Parquet file format.
CSV: Represents the ordinary CSV file format (a logical two-dimensional table).
JSON: Represents the JSON file format.
Field Separator
For TXT/CSV file types, the field separator used when reading in data synchronization mode; defaults to a comma (,).
Note:
${} is not supported as a separator; ${a} will be interpreted as the configured parameter a.
Line Separator (Optional)
For TXT/CSV file types, the line separator used when reading in data synchronization mode. Up to 3 values can be entered, and all entered values are treated as line separators. If not specified, the default is \n on Linux and \r\n on Windows.
Note:
1. Setting multiple line separators affects read performance.
2. ${} is not supported as a separator; ${a} will be interpreted as the configured parameter a.
Encode
Encoding used when reading files. Supports UTF-8 and GBK.
Null Value Conversion
When reading, the specified string is converted to NULL. NULL represents an unknown or inapplicable value, as distinct from 0, an empty string, or any other value.
Note:
${} is not supported as the specified string; ${a} will be interpreted as the configured parameter a.
Compression Format
Currently supports: none, deflate, gzip, bzip2, lz4, snappy.
Since snappy does not have a unified stream format, Data Integration currently supports only the two most widely used formats:
hadoop-snappy (the snappy stream format used on Hadoop)
framing-snappy (the snappy stream format recommended by Google)
Quote character
No configuration: The source reads the data as its original content.
Double quotation mark ("): The source reads the value within double quotation marks (") as the data content. Note: If the data format is not standardized, reading may cause OOM.
Single quotation mark ('): The source reads the value within single quotation marks (') as the data content. Note: If the data format is not standardized, reading may cause OOM.
Skip the Header
No: When reading, do not skip the header.
Yes: When reading, skip the header.
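Taken together, the TXT/CSV read options above (field separator, quote character, header skipping, and null value conversion) describe a standard delimited-text parse. The Python sketch below illustrates how these options interact; the `read_rows` function and its defaults are hypothetical illustrations, not the product's implementation:

```python
import csv
import io

def read_rows(text, field_sep=",", quote_char='"',
              skip_header=True, null_value="\\N"):
    # Hypothetical illustration of the S3 read node's TXT/CSV options:
    # field separator, quote character, "Skip the Header", and
    # null value conversion (the configured string becomes NULL/None).
    reader = csv.reader(io.StringIO(text),
                        delimiter=field_sep, quotechar=quote_char)
    rows = list(reader)
    if skip_header and rows:
        rows = rows[1:]  # drop the header row when configured
    return [[None if cell == null_value else cell for cell in row]
            for row in rows]

sample = 'id,name,score\n1,"Alice",90\n2,\\N,85\n'
print(read_rows(sample))
# [['1', 'Alice', '90'], ['2', None, '85']]
```

Note that the actual node also applies encoding (UTF-8/GBK) and compression handling before parsing, which this sketch omits.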

S3 Offline Single Table Write Node Configuration


Parameter
Description
Data Source
Select an available S3 data source from the current project.
Synchronization Method
S3 supports two synchronization methods:
Data synchronization: Parses structured file content and maps and synchronizes it based on field mapping relationships.
File transfer: Transfers the entire file without parsing its content; applicable to synchronizing unstructured data.
Note:
File transfer is supported only when both the source and the target are file-type data sources (COS/HDFS/SFTP/FTP/S3/Azure Blob/HTTP), and both ends must use the file transfer synchronization method.
File Path
Path in the S3 file system to write to.
The S3 file path must include the bucket name, for example s3a://bucket_name.
Write Mode
overwrite: Cleans up all files prefixed with the file name before writing.
append: Writes directly without any pre-processing, ensuring file names do not conflict.
nonConflict: Reports an error if the file name already exists.
File Type
S3 supports four file types: TXT, ORC, PARQUET, CSV.
TXT: Represents the TextFile file format.
ORC: Represents the ORCFile file format.
PARQUET: Represents the ordinary Parquet file format.
CSV: Represents the ordinary CSV file format (a logical two-dimensional table).
Field Separator
For TXT/CSV file types, the field separator used when writing in data synchronization mode; defaults to a comma (,).
Note:
${} is not supported as a separator; ${a} will be interpreted as the configured parameter a.
Line Separator (Optional)
For TXT/CSV file types, the line separator used when writing in data synchronization mode. All entered values are treated as line separators. If not specified, the default is \n on Linux and \r\n on Windows. When filling in manually, one value can be entered as the target line separator for writing data.
Note: ${} is not supported as a separator; ${a} will be interpreted as the configured parameter a.
Encode
Encoding used when writing files. Supports UTF-8 and GBK.
Empty Character String Processing
No action taken: Do not process empty strings when writing.
Processed as NULL: When writing, process empty strings as NULL.
Null Value Conversion
When writing, NULL is converted to the specified string. NULL represents an unknown or inapplicable value, as distinct from 0, an empty string, or any other value.
Note:
${} is not supported as the specified string; ${a} will be interpreted as the configured parameter a.
Compression Format
Currently supports: none, deflate, gzip, bzip2, lz4, snappy.
Since snappy does not have a unified stream format, Data Integration currently supports only the two most widely used formats:
hadoop-snappy (the snappy stream format used on Hadoop)
framing-snappy (the snappy stream format recommended by Google)
Quote character
No configuration: The target does not add quotation marks when writing data; values are written as they are at the source.
Double quotation mark ("): The target automatically wraps each value in double quotation marks when writing, for example "123".
Single quotation mark ('): The target automatically wraps each value in single quotation marks when writing, for example '123'.
Header included or Not
No: When writing, exclude the header.
Yes: When writing, include the header.
Tagged completed file
A blank .ok file indicating that task transmission has completed.
Do not generate: Do not generate a blank .ok file after task completion. (Default)
Generate: Generate a blank .ok file after task completion.
Tagged completed file name
When filling in, include the complete file path, name, and suffix. By default, a .ok file is generated in the target path using the target file name and suffix.
Time scheduling parameters such as ${yyyyMMdd} are supported in both data synchronization and file transfer modes.
The built-in variable ${filetrans_di_src} is supported to represent the source file name.
For example: /user/test/${filetrans_di_src}_${yyyyMMdd}.ok.
Note:
In file transfer mode, when ${filetrans_di_src} is used in the tagged completed file name to reference the source file name, one .ok file is generated per source file, so the number of tagged completed files depends on the number of files at the source.
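The write-side options (header inclusion, quote character, null value conversion) and the tagged completed file name placeholders can be illustrated with a similar sketch. Both functions below, `write_csv` and `ok_file_name`, are hypothetical helpers showing the behavior described above, not the product's implementation:

```python
import csv
import io
from datetime import date

def write_csv(rows, header, quote_all=True,
              include_header=True, null_repr="\\N"):
    # Hypothetical illustration of the S3 write node's CSV options:
    # "Header included or Not", quote character (QUOTE_ALL mimics wrapping
    # every value in quotes), and NULL -> configured string conversion.
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        quoting=csv.QUOTE_ALL if quote_all else csv.QUOTE_MINIMAL,
        lineterminator="\n")
    if include_header:
        writer.writerow(header)
    for row in rows:
        writer.writerow([null_repr if v is None else v for v in row])
    return buf.getvalue()

def ok_file_name(template, src_file=None, day=None):
    # Hypothetical expansion of the ${yyyyMMdd} scheduling parameter and
    # the ${filetrans_di_src} built-in variable in the .ok file name.
    day = day or date.today()
    name = template.replace("${yyyyMMdd}", day.strftime("%Y%m%d"))
    if src_file is not None:
        name = name.replace("${filetrans_di_src}", src_file)
    return name

print(write_csv([[1, None]], ["id", "name"]))
# "id","name"
# "1","\N"
print(ok_file_name("/user/test/${filetrans_di_src}_${yyyyMMdd}.ok",
                   src_file="data", day=date(2025, 7, 25)))
# /user/test/data_20250725.ok
```

The placeholder expansion mirrors the note above: when the template contains ${filetrans_di_src}, one .ok name is produced per source file.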

