Dynamic Release Record (2026)

| Parameter | Description |
| --- | --- |
| Data Source | Select an available S3 data source from the current project. |
| Synchronization Scope | Full synchronization: all files under the file path are synchronized. Incremental synchronization: only files updated between the planned scheduling time of the previous cycle and the planned scheduling time of the current cycle are synchronized. When debugging a synchronization task with incremental synchronization selected, all files under the file path are synchronized by default. Note: the synchronization scope applies only when the file path is a directory. If the path points to a specific file, the synchronization scope does not take effect and that file is synchronized directly. |
| Synchronization Method | S3 supports two synchronization methods. Data synchronization: parses structured data content and maps and synchronizes it based on field relationships. File transfer: transmits the entire file without parsing its content, suitable for unstructured data. Note: file transfer requires both the source and the target to be file-type data sources (COS/HDFS/SFTP/FTP/S3/Azure Blob/HTTP), and both ends must use the file transfer method. |
| File Path | Path in the S3 file system; the * wildcard is supported. The S3 file path must include the bucket name, for example s3a://bucket_name. |
| File Type | S3 supports five file types: TXT, ORC, PARQUET, CSV, and JSON. TXT: the TextFile format. ORC: the ORCFile format. PARQUET: the standard Parquet format. CSV: the standard HDFS file format (a logical two-dimensional table). JSON: the JSON format. |
| Field Separator | For TXT/CSV file types, the field separator used when reading in data synchronization mode; the default is a comma (,). Note: ${} is not supported as a separator; ${a} is recognized as a configured parameter named a. |
| Line Separator (Optional) | For TXT/CSV file types, the line separator used when reading in data synchronization mode. Up to three values can be entered, and every entered value is treated as a line separator. If not specified, the default is \n on Linux and \r\n on Windows. Note: 1. Configuring multiple line separators affects read performance. 2. ${} is not supported as a separator; ${a} is recognized as a configured parameter named a. |
| Encode | Encoding used when reading files. UTF-8 and GBK are supported. |
| Null Value Conversion | When reading, converts the specified string to NULL. NULL represents an unknown or inapplicable value and is distinct from 0, an empty string, or any other value. Note: ${} is not supported as the specified string; ${a} is recognized as a configured parameter named a. |
| Compression Format | Currently supported: none, deflate, gzip, bzip2, lz4, and snappy. Because snappy has no unified stream format, Data Integration currently supports only the two most widely used variants: hadoop-snappy (the snappy stream format used on Hadoop) and framing-snappy (the snappy stream format recommended by Google). |
| Quote Character | No configuration: the source reads the data as its original content. Double quotation mark ("): the source reads the value enclosed in double quotes as one field. Single quotation mark ('): the source reads the value enclosed in single quotes as one field. Note: if the data format is not well-formed, reading with a quote character configured may cause an OOM error. |
| Skip the Header | No: do not skip the header when reading. Yes: skip the header when reading. |
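The effect of the reader parameters above (field separator, quote character, header skipping, and null value conversion) can be sketched with Python's standard `csv` module. This is illustrative only: the dictionary keys and sample data below are made up for the example and are not the product's actual configuration API.

```python
import csv
import io

# Illustrative config mirroring the reader parameters in the table above.
# These key names are hypothetical, not the product's API.
reader_config = {
    "field_separator": ",",   # Field Separator (default ",")
    "quote_character": '"',   # Quote Character: values inside " are one field
    "skip_header": True,      # Skip the Header: Yes
    "null_value": "\\N",      # Null Value Conversion: this string becomes NULL
}

raw = 'id,name,city\n1,"Doe, Jane",\\N\n2,Bob,Paris\n'

def read_rows(text, cfg):
    """Parse text the way the configured reader would."""
    rows = csv.reader(io.StringIO(text),
                      delimiter=cfg["field_separator"],
                      quotechar=cfg["quote_character"])
    if cfg["skip_header"]:
        next(rows)  # drop the header row
    # Convert the configured null marker to a real NULL (None in Python).
    return [[None if v == cfg["null_value"] else v for v in row]
            for row in rows]

print(read_rows(raw, reader_config))
# → [['1', 'Doe, Jane', None], ['2', 'Bob', 'Paris']]
```

Note how the quote character keeps the embedded comma in "Doe, Jane" inside a single field, and how the null marker string is converted rather than passed through as text.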

| Parameter | Description |
| --- | --- |
| Data Source | Select an available S3 data source from the current project. |
| Synchronization Method | S3 supports two synchronization methods. Data synchronization: parses structured data content and maps and synchronizes it based on field relationships. File transfer: transmits the entire file without parsing its content, suitable for unstructured data. Note: file transfer requires both the source and the target to be file-type data sources (COS/HDFS/SFTP/FTP/S3/Azure Blob/HTTP), and both ends must use the file transfer method. |
| File Path | Path in the S3 file system to write to. The S3 file path must include the bucket name, for example s3a://bucket_name. |
| Write Mode | overwrite: before writing, delete all files whose names start with the target file name prefix. append: write directly without any pre-processing, ensuring no file name conflicts. nonConflict: report an error if the file name already exists. |
| File Type | S3 supports four file types: TXT, ORC, PARQUET, and CSV. TXT: the TextFile format. ORC: the ORCFile format. PARQUET: the standard Parquet format. CSV: the standard HDFS file format (a logical two-dimensional table). |
| Field Separator | For TXT/CSV file types, the field separator used when writing in data synchronization mode; the default is a comma (,). Note: ${} is not supported as a separator; ${a} is recognized as a configured parameter named a. |
| Line Separator (Optional) | For TXT/CSV file types, the line separator used when writing in data synchronization mode. Every entered value is treated as a line separator. If not specified, the default is \n on Linux and \r\n on Windows. When entered manually, one value can be specified as the target line separator for data writing. Note: ${} is not supported as a separator; ${a} is recognized as a configured parameter named a. |
| Encode | Encoding used when writing files. UTF-8 and GBK are supported. |
| Empty Character String Processing | No action taken: empty strings are written as-is. Processed as NULL: empty strings are written as NULL. |
| Null Value Conversion | When writing, converts NULL to the specified string. NULL represents an unknown or inapplicable value and is distinct from 0, an empty string, or any other value. Note: ${} is not supported as the specified string; ${a} is recognized as a configured parameter named a. |
| Compression Format | Currently supported: none, deflate, gzip, bzip2, lz4, and snappy. Because snappy has no unified stream format, Data Integration currently supports only the two most widely used variants: hadoop-snappy (the snappy stream format used on Hadoop) and framing-snappy (the snappy stream format recommended by Google). |
| Quote Character | No configuration: the target does not add quotation marks when writing data; values match the source. Double quotation mark ("): the target automatically wraps each value in double quotes when writing, for example "123". Single quotation mark ('): the target automatically wraps each value in single quotes when writing, for example '123'. |
| Header Included or Not | No: exclude the header when writing. Yes: include the header when writing. |
| Tagged Completed File | A blank .ok file indicating that task transmission is complete. Do not generate (chosen by default): do not generate a blank .ok file after the task completes. Generate: generate a blank .ok file after the task completes. |
| Tagged Completed File Name | When filling this in, include the complete file path, name, and suffix. By default, an .ok file is generated in the target path using the target file name and suffix. Time scheduling parameters such as ${yyyyMMdd} are supported in both data synchronization and file transfer modes, and the built-in variable ${filetrans_di_src} can be used to represent the source file name, for example /user/test/${filetrans_di_src}_${yyyyMMdd}.ok. Note: in file transfer mode, when ${filetrans_di_src} is used to reference the source file name in the target file name, or the tagged completed file is generated from the target file name and suffix, the number of generated tagged completed files depends on the number of files at the source end. |
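The writer parameters above (field separator, quote character, header, and null value conversion) can likewise be sketched with Python's standard `csv` module. Again this is only an illustration of what the settings mean; the key names, sample data, and the marker-file naming at the end are hypothetical, not the product's API.

```python
import csv
import io
from datetime import date

# Illustrative config mirroring the writer parameters in the table above.
# These key names are hypothetical, not the product's API.
writer_config = {
    "field_separator": ",",
    "quote_character": '"',   # Quote Character: wrap every value in double quotes
    "include_header": True,   # Header Included or Not: Yes
    "null_value": "\\N",      # Null Value Conversion: NULL is written as this string
}

def write_rows(header, rows, cfg):
    """Serialize rows the way the configured writer would."""
    buf = io.StringIO()
    w = csv.writer(buf,
                   delimiter=cfg["field_separator"],
                   quotechar=cfg["quote_character"],
                   quoting=csv.QUOTE_ALL,   # quote every field, as the " option does
                   lineterminator="\n")
    if cfg["include_header"]:
        w.writerow(header)
    for row in rows:
        # NULL (None) values become the configured marker string.
        w.writerow([cfg["null_value"] if v is None else v for v in row])
    return buf.getvalue()

print(write_rows(["id", "name"], [[1, "Doe, Jane"], [2, None]], writer_config))
# → "id","name"
#   "1","Doe, Jane"
#   "2","\N"

# A tagged completed file name like ${filetrans_di_src}_${yyyyMMdd}.ok could
# resolve to something like this (the "source" prefix is made up):
marker_name = f"/user/test/source_{date.today():%Y%m%d}.ok"
```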