Tencent Cloud WeData

Kafka/CKafka Data Source

Last updated: 2024-11-01 17:56:18

Supported Editions

1. Supported Kafka versions: 2.4.x, 2.7.x, and 2.8.x.
2. Both Kafka and CKafka use the Kafka data source type; configure the read and write settings by following the Kafka steps.

Use Limits

When parameter.groupId and parameter.kafkaConfig.group.id are both configured, parameter.groupId takes precedence over the group.id in the kafkaConfig settings.
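The precedence rule above, together with the default group ID described in the node configuration below, can be sketched as a small resolver. This is an illustrative sketch only, not WeData's implementation; the function name and dict layout are assumptions, while the key names and the WeData_group_${taskID} default come from this document.

```python
def resolve_group_id(parameter: dict, task_id: str) -> str:
    """Resolve the effective Kafka consumer group ID (illustrative sketch).

    Precedence:
      1. parameter.groupId
      2. parameter.kafkaConfig["group.id"]
      3. the WeData default, group.id=WeData_group_${taskID}
    """
    if parameter.get("groupId"):
        return parameter["groupId"]
    kafka_config = parameter.get("kafkaConfig", {})
    if kafka_config.get("group.id"):
        return kafka_config["group.id"]
    return f"WeData_group_{task_id}"
```

For example, a task with both keys set consumes under parameter.groupId, and a task with neither key set consumes under the generated default group.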

Kafka Offline Single Table Read Node Configuration




Parameters
Description
Data Source
The Kafka data source type supports both Kafka and CKafka.
Topic
A Kafka Topic is a category that groups the message feeds Kafka processes.
Serialization Format
The data read from Kafka can be mapped to constant columns, data columns, and attribute columns:
Constant Column: columns wrapped in single quotes are constant columns, e.g., ["'abc'", "'123'"].
Data Column:
If your data is in JSON format, JSON attributes can be retrieved, e.g., ["event_id"].
If your data is in JSON format, nested JSON attributes can be retrieved, e.g., ["tag.desc"].
Attribute Column:
__key__ represents the message key.
__value__ represents the full content of the message.
__partition__ represents the partition of the current message.
__headers__ represents the headers of the current message.
__offset__ represents the offset of the current message.
__timestamp__ represents the timestamp of the current message.
Consumer group ID
Avoid reusing this consumer group ID in other consumer processes, to keep the consumption offset accurate. If this parameter is not specified, the default is group.id=WeData_group_${taskID}.
Cycle Starting Offset
When the task runs on a cycle, each read from Kafka starts, by default, from the planned scheduling time of the previous cycle. Options: Start Offset of Partition, Current Offset of Consumer Group, Specified Time, and Specified Offset.
Specified Time: when data is written into Kafka, a unixtime timestamp is automatically generated as the record time for that data. The synchronization task converts the user-configured yyyymmddhhmmss value into a unix timestamp and reads the corresponding data from Kafka. For example, "beginDateTime": "20210125000000".
Start Offset of Partition: extract data starting from the smallest undeleted offset in each partition of the Kafka topic.
Current Offset of Consumer Group: read data starting from the offset saved for the Consumer Group ID specified in the task configuration; typically, this is where the process using that Consumer Group ID last stopped. It is best to ensure that only this Data Integration task uses the Consumer Group ID, to avoid data loss. If you use this option, you must configure the Consumer Group ID; otherwise, the Data Integration task generates a random Consumer Group ID, and since a new Consumer Group ID has no saved offsets, the task may fail or may read from the start or end offset depending on the offset reset strategy. Also note that the group offset is periodically and automatically committed to the Kafka server by the client, so if the task fails and is rerun later, data may be duplicated or missed. In wizard mode, records read beyond the end offset are automatically discarded; because the discarded data's group offset has already been committed to the server, that data is inaccessible in the next cycle run.
Cycle End Offset
When the task runs on a cycle, each read from Kafka ends, by default, at the scheduled end time. When keyType or valueType is configured as STRING, the encoding specified in this configuration item is used to parse the strings.
Offset Reading Mode
The starting offset used when the synchronization task is run manually. Two reading modes are provided:
latest: read from the latest offset.
earliest: read from the earliest offset.
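The Specified Time option above converts a yyyymmddhhmmss string into a unix timestamp before seeking the matching records in Kafka. A minimal sketch of that conversion, assuming UTC for illustration (the actual task's time zone handling is not specified in this document; the function name is hypothetical):

```python
from datetime import datetime, timezone


def begin_datetime_to_unix_ms(begin: str) -> int:
    """Convert a yyyymmddhhmmss string, e.g. "20210125000000",
    to epoch milliseconds for seeking Kafka offsets by time.
    UTC is assumed here purely for illustration."""
    dt = datetime.strptime(begin, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)
```

With the document's example, "beginDateTime": "20210125000000" becomes the epoch-millisecond value for 2021-01-25 00:00:00 in the assumed time zone.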

Kafka Offline Single Table Write Node Configuration




Parameters
Description
Data Destination
The Kafka data source type supports both Kafka and CKafka.
Topic
A Kafka Topic is a category that groups the message feeds Kafka processes.
Serialization Format
The data written to Kafka can be mapped from constant columns, data columns, and attribute columns:
Constant Column: columns wrapped in single quotes are constant columns, e.g., ["'abc'", "'123'"].
Data Column:
If your data is in JSON format, JSON attributes can be retrieved, e.g., ["event_id"].
If your data is in JSON format, nested JSON attributes can be retrieved, e.g., ["tag.desc"].
Attribute Column:
__key__ represents the message key.
__value__ represents the full content of the message.
__partition__ represents the partition of the current message.
__headers__ represents the headers of the current message.
__offset__ represents the offset of the current message.
__timestamp__ represents the timestamp of the current message.
Partition Mapping
Supports round-robin partition write, specified-field hash write, and specified partition modes.
If you choose specified-field hash write mode, you need to specify the field name.
If you choose specified partition mode, you need to set the partition number.
Advanced Settings (Optional)
You can configure parameters according to business needs.
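The three partition mapping modes above can be sketched as partitioner factories. This is an illustrative sketch only: the mode names, CRC32 hash, and function names are assumptions, not WeData's actual partitioning implementation.

```python
import zlib


def make_partitioner(mode, num_partitions, hash_field=None, fixed_partition=None):
    """Return a function record -> partition index for one of the three
    write modes described above (illustrative sketch)."""
    if mode == "round_robin":
        # Cycle through partitions 0, 1, ..., num_partitions - 1.
        state = {"next": 0}

        def round_robin(record):
            partition = state["next"] % num_partitions
            state["next"] += 1
            return partition

        return round_robin
    if mode == "hash_field":
        # Deterministic hash of the configured field's value.
        def hash_by_field(record):
            value = str(record[hash_field]).encode("utf-8")
            return zlib.crc32(value) % num_partitions

        return hash_by_field
    if mode == "fixed":
        # Every record goes to the configured partition number.
        return lambda record: fixed_partition
    raise ValueError(f"unknown mode: {mode}")
```

Hash-by-field keeps all records sharing a key value in one partition (preserving per-key ordering), while round-robin spreads load evenly; the fixed mode funnels everything into one partition.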

Data Type Conversion Support

Read

Other data types are converted to the String type by default.

Format of the value read from Kafka | Source Field Type | Internal Type
JSON | Integer, Long, BigInteger | Long
JSON | Float, Double | Double
JSON | Boolean | Bool
JSON | JSONArray, JSONObject | String
CSV | Integer, Long, BigInteger | Long
CSV | Float, Double | Double
CSV | Boolean | Bool
CSV | JSONArray, JSONObject | String
AVRO | Integer, Long, BigInteger | Long
AVRO | Float, Double | Double
AVRO | Boolean | Bool
AVRO | JSONArray, JSONObject | String
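The read-side rules above are identical for all three formats (JSON, CSV, AVRO) and can be expressed as a lookup with a String fallback. A sketch for illustration; the names are hypothetical, while the mappings mirror the table above.

```python
# Source field type -> internal type, per the read table above.
READ_TYPE_MAP = {
    "Integer": "Long", "Long": "Long", "BigInteger": "Long",
    "Float": "Double", "Double": "Double",
    "Boolean": "Bool",
    "JSONArray": "String", "JSONObject": "String",
}


def to_internal_type(source_type: str) -> str:
    """Map a source field type to the internal type; unlisted types
    fall back to String, as the note above states."""
    return READ_TYPE_MAP.get(source_type, "String")
```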

Write

Other data types are converted to the String type by default.

Format of the value written to Kafka | Internal Type | Target Field Type
JSON | Long | Integer, Bigint
JSON | Double | Float, Double, Decimal
JSON | Bool | Boolean
JSON | String | String, Varchar, Array
JSON | Date | Date, Timestamp
CSV | Long | Integer, Bigint
CSV | Double | Float, Double, Decimal
CSV | Bool | Boolean
CSV | String | String, Varchar, Array
CSV | Date | Date, Timestamp
AVRO | Long | Integer, Bigint
AVRO | Double | Float, Double, Decimal
AVRO | Bool | Boolean
AVRO | String | String, Varchar, Array
AVRO | Date | Date, Timestamp
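The write-side rules above (again shared by JSON, CSV, and AVRO) can be sketched as a compatibility check, with unlisted internal types treated as String per the default-conversion note. The function and dict names are illustrative assumptions; the mappings mirror the table above.

```python
# Internal type -> compatible target field types, per the write table above.
WRITE_TYPE_MAP = {
    "Long": ["Integer", "Bigint"],
    "Double": ["Float", "Double", "Decimal"],
    "Bool": ["Boolean"],
    "String": ["String", "Varchar", "Array"],
    "Date": ["Date", "Timestamp"],
}


def writable_as(internal_type: str, target_type: str) -> bool:
    """True if a value of the internal type can be written to the
    target field type; unlisted internal types default to String."""
    targets = WRITE_TYPE_MAP.get(internal_type, WRITE_TYPE_MAP["String"])
    return target_type in targets
```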
