
Image Chat or File Chat (Real-time Document Parsing + Chat)
Last updated: 2026-02-12 11:34:21

1. Overview

During a conversation, you can upload images for image Q&A, or upload documents for file summarization and content extraction.
Image Q&A:



Note:
First, you need to set up a multimodal model in the model settings.

Then, enable image uploads in the dialogue settings.


File Q&A:



Note:
As product capabilities evolve, limitations on file types and sizes will be gradually lifted. For details, see the limitations on the Intelligent Agent Development Platform page.
Document Q&A requires parsing the document first, then filling in the request parameters and starting the dialogue over WebSocket or HTTP SSE.

2. Image Q&A Implementation Steps

1. Call the DescribeStorageCredential API to obtain a temporary key.
2. Upload the image to COS using the SDK and the temporary key.
Note:
The way COS is operated on the Tencent Cloud Agent Development Platform (Tencent Cloud ADP) may differ from the direct method that uses a fixed key. For detailed steps, refer to the Tencent Cloud Agent Development Platform COS Operation Guide.
3. Fill in the parameters and start the dialogue over WebSocket or HTTP SSE.
3.1 Build the COS URL of the image by concatenating the following fields:
"https://" +
{DescribeStorageCredential.Response.Bucket} +
"." +
{DescribeStorageCredential.Response.Type} +
"." +
{DescribeStorageCredential.Response.Region} +
".myqcloud.com" +
{DescribeStorageCredential.Response.UploadPath}
To send an image, pass a Markdown-formatted image link in the content field, such as ![](image link). Pay attention to the IsPublic field when obtaining the temporary key.
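The concatenation rule above can be sketched as a small helper. This is a minimal illustration, not platform code: the function name build_cos_url is hypothetical, the Type value "cos" is inferred from the example URLs on this page, and adding a missing leading slash to UploadPath is a defensive assumption.

```python
def build_cos_url(bucket: str, cos_type: str, region: str, upload_path: str) -> str:
    """Join DescribeStorageCredential response fields into a COS object URL."""
    # Assumption: UploadPath normally starts with "/", as in the examples on
    # this page; add the slash if a caller passes a path without it.
    if not upload_path.startswith("/"):
        upload_path = "/" + upload_path
    return f"https://{bucket}.{cos_type}.{region}.myqcloud.com{upload_path}"


# Values copied from the example URLs on this page; "cos" is inferred from them.
url = build_cos_url(
    bucket="lke-realtime-1251316161",
    cos_type="cos",
    region="ap-jakarta",
    upload_path="/public/1746827241600319488/1780784842443587584/image/xxxx.jpg",
)
print(url)
```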
Input parameter examples:
Single image:
{
  "request_id": "LHWObW2Sea-3173301679",
  "session_id": "f5652b8c-e88e-4abc-8629-7952b741a433",
  "bot_app_key": "xxxxxx",
  "visitor_biz_id": "LHWObW2Sea",
  "file_infos": [],
  "content": "Describe the following image![](https://lke-realtime-1251316161.cos.ap-jakarta.myqcloud.com/public/1746827241600319488/1780784842443587584/image/xxxx.jpg)"
}
Multiple images:
{
  "request_id": "LHWObW2Sea-317330dwe9",
  "session_id": "f5652b8c-e88e-4111-8629-7952b741a433",
  "bot_app_key": "xxxxxx",
  "visitor_biz_id": "LHWObW2Sea",
  "file_infos": [],
  "content": "Describe the following images![](https://lke-realtime-1251316161.cos.ap-jakarta.myqcloud.com/public/1746827241600319488/1780784842443587584/image/xxxx.jpg)![](https://lke-realtime-1251316161.cos.ap-jakarta.myqcloud.com/public/1753030485940633600/1887842680054218752/image/bbbbb.png)"
}
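As a rough sketch, the image request body shown above could be assembled like this. The helper name build_image_chat_request is hypothetical, and generating request_id with uuid4 is an assumption (the examples on this page use a different ID format). Only the field names and the Markdown image syntax come from the examples.

```python
import json
import uuid


def build_image_chat_request(bot_app_key, visitor_biz_id, session_id, prompt, image_urls):
    """Build a dialogue request whose content carries Markdown image links."""
    # Each image is appended to the prompt as ![](url), as in the examples.
    images = "".join(f"![]({url})" for url in image_urls)
    return {
        "request_id": str(uuid.uuid4()),  # assumption: any unique ID is accepted
        "session_id": session_id,
        "bot_app_key": bot_app_key,
        "visitor_biz_id": visitor_biz_id,
        "file_infos": [],  # left empty for image Q&A, as in the examples
        "content": prompt + images,
    }


request_body = build_image_chat_request(
    bot_app_key="xxxxxx",
    visitor_biz_id="LHWObW2Sea",
    session_id="f5652b8c-e88e-4abc-8629-7952b741a433",
    prompt="Describe the following images",
    image_urls=["https://example.com/a.jpg", "https://example.com/b.png"],
)
print(json.dumps(request_body, indent=2))
```

The example.com URLs are placeholders; in practice each URL would be the COS URL of an uploaded image.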

3. Document QA Implementation Steps

For knowledge Q&A, if you need to upload documents for real-time Q&A, you must first call the real-time document parsing API. The process is as follows:
1. Call the DescribeStorageCredential API to obtain a temporary key.
2. Upload the document to COS using the SDK and the temporary key.
Note:
The way COS is operated on the Tencent Cloud Agent Development Platform (Tencent Cloud ADP) may differ from the direct method that uses a fixed key. For detailed steps, refer to the Tencent Cloud Agent Development Platform COS Operation Guide.
3. Call the real-time document parsing docParse API to obtain the doc_id.
4. Fill in the parameters and start the dialogue over WebSocket or HTTP SSE.

3.1 Real-Time Document Parsing

Real-time document parsing API address: https://wss.lke.tencentcloud.com/v1/qbot/chat/docParse
Request Method: POST.
3.1.1 Request Parameters
Send the parameters as JSON in the HTTP body:
| Name | Type | Required | Description |
|---|---|---|---|
| session_id | string(64) | Yes | Session ID, used to identify a session. It is provided by the external system. Different user clients should pass different session IDs; otherwise messages from different users within the same application may get mixed up. Length: 2-64 characters. Validation rule: ^[a-zA-Z0-9_-]{2,64}$. You can generate this value with a UUID, for example 1b9c0b03-dc83-47ac-8394-b366e3ea67ef. Note: the session_id used for document parsing must match the session_id of the dialogue. If a multi-round session involves multiple document uploads, the session_id must remain the same; it is validated during the session. |
| bot_app_key | string(8) | Yes | Application key (provided by operations). |
| request_id | string(255) | Yes | Unique request ID. A UUID is recommended to ensure uniqueness. |
| cos_bucket | string | Yes | COS bucket, taken from the response of the API that returns the temporary key. |
| file_type | string | Yes | File type. One of: txt, doc, docx, pdf, ppt, pptx. |
| file_name | string | Yes | File name. Example: test.docx |
| cos_url | string | Yes | COS path on the platform. It must match the UploadPath parameter returned by the DescribeStorageCredential API. Example: /corp/1750375931926544384/1750376442139246592/doc/AaCIYEATBTYUQXDfXOTN-1807688648286535680.txt |
| cos_hash | string | Yes | Value of the x-cos-hash-crc64ecma response header, a CRC64 checksum used to verify that the file uploaded to the cloud matches the local file. Obtain it from the response headers after a successful upload to COS. |
| e_tag | string | Yes | ETag (Entity Tag) is an information tag that identifies the object content when the object is created. It can be used to check whether the object content has changed. Obtain it from the response headers after a successful upload to COS. Example value: "6886efe263f34c9f9401c2d910b02635". |
| size | string | Yes | File size. Note: this field is of string type. |
3.1.2 Calling the API Using Curl
curl --location 'https://wss.lke.tencentcloud.com/v1/qbot/chat/docParse' \
--header 'Content-Type: application/json' \
--data '{
  "session_id": "<your session_id>",
  "request_id": "<random uuid>",
  "cos_bucket": "lke-realtime-1251316161",
  "file_type": "docx",
  "file_name": "test.docx",
  "cos_url": "/corp/1750375931926544384/1750376442139246592/doc/AaCIYEATBTYUQXDfXOTN-1807688648286535680.txt",
  "e_tag": "\"6886efe263f34c9f9401c2d910b02635\"",
  "cos_hash": "6138891591882964610",
  "size": "355",
  "bot_app_key": "<your appkey>"
}'
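For reference, here is a hedged Python sketch of the same call that uses only the standard library. It builds the POST request and checks session_id against the documented validation rule before anything is sent. The helper name build_doc_parse_request is hypothetical; the endpoint, field names, and string-typed size come from this page.

```python
import json
import re
import urllib.request
import uuid

DOC_PARSE_URL = "https://wss.lke.tencentcloud.com/v1/qbot/chat/docParse"
SESSION_ID_RE = re.compile(r"^[a-zA-Z0-9_-]{2,64}$")  # documented validation rule


def build_doc_parse_request(session_id, bot_app_key, cos_bucket, file_type,
                            file_name, cos_url, cos_hash, e_tag, size):
    """Build (but do not send) the docParse POST request."""
    if not SESSION_ID_RE.fullmatch(session_id):
        raise ValueError("session_id must match ^[a-zA-Z0-9_-]{2,64}$")
    body = {
        "session_id": session_id,
        "request_id": str(uuid.uuid4()),  # a UUID is recommended for uniqueness
        "bot_app_key": bot_app_key,
        "cos_bucket": cos_bucket,
        "file_type": file_type,
        "file_name": file_name,
        "cos_url": cos_url,
        "cos_hash": str(cos_hash),
        "e_tag": e_tag,  # keep the surrounding quotes from the COS response header
        "size": str(size),  # size is a string field, not an integer
    }
    return urllib.request.Request(
        DOC_PARSE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_doc_parse_request(
    session_id="1b9c0b03-dc83-47ac-8394-b366e3ea67ef",
    bot_app_key="<your appkey>",
    cos_bucket="lke-realtime-1251316161",
    file_type="docx",
    file_name="test.docx",
    cos_url="/corp/1750375931926544384/1750376442139246592/doc/AaCIYEATBTYUQXDfXOTN-1807688648286535680.txt",
    cos_hash="6138891591882964610",
    e_tag='"6886efe263f34c9f9401c2d910b02635"',
    size=355,
)
```

To send it, pass req to urllib.request.urlopen and read the response line by line, since the result is streamed.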
3.1.3 Postman Call Example



3.1.4 Response Parameters
The response is returned as an SSE stream.
| Name | Type | Description |
|---|---|---|
| session_id | string(64) | Session ID, same as the session_id in the request. |
| trace_id | string | Unique ID returned. |
| is_final | bool | Whether message output has completed. |
| doc_id | string | doc_id returned by the document parsing API. |
| process | int32 | Current progress as an integer. A value of 100 means parsing ended successfully. |
| status | string | Status: PARSING, SUCCESS, or FAILED. |
| timestamp | int64 | Timestamp, in seconds. |
| error_message | string | Error information, returned when an error occurs. |
3.1.5 Response Example
{"type":"parsing","payload":{"doc_id":"0","error_message":"","is_final":false,"process":0,"session_id":"c7852s9d-aba8-4ee8-9c88-d65f28ddbc47","status":"PARSING","timestamp":1719821535,"trace_id":"1f1e5bfc9a3588d3abc62b9729fc6f62"},"message_id":"1b28b359-203e-4dbc-a103-6d92629cb1e0"}

{"type":"parsing","payload":{"doc_id":"0","error_message":"","is_final":false,"process":2,"session_id":"c7852s9d-aba8-4ee8-9c88-d65f28ddbc47","status":"PARSING","timestamp":1719821535,"trace_id":"1f1e5bfc9a3588d3abc62b9729fc6f62"},"message_id":"60c2a29a-7658-4186-90a9-d81c8c0b14b4"}

{"type":"parsing","payload":{"doc_id":"0","error_message":"","is_final":false,"process":85,"session_id":"c7852s9d-aba8-4ee8-9c88-d65f28ddbc47","status":"PARSING","timestamp":1719821536,"trace_id":"1f1e5bfc9a3588d3abc62b9729fc6f62"},"message_id":"65ca6da3-8909-42c4-9ea1-4a09be299a7b"}

{"type":"parsing","payload":{"doc_id":"1807688654434383264","error_message":"","is_final":true,"process":100,"session_id":"c7852s9d-aba8-4ee8-9c88-d65f28ddbc47","status":"SUCCESS","timestamp":1719821536,"trace_id":"1f1e5bfc9a3588d3abc62b9729fc6f62"},"message_id":"43046854-c596-45f5-9195-3df4f82a67ff"}
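A minimal sketch of consuming this stream: read the events in order and return the doc_id from the final SUCCESS event. The function name extract_doc_id is hypothetical, and tolerating a leading "data:" prefix is an assumption about SSE framing; the examples above show only the bare JSON objects.

```python
import json


def extract_doc_id(lines):
    """Return the doc_id from the final SUCCESS event, or None if none arrives."""
    for raw in lines:
        line = raw.strip()
        # Assumption: events may arrive with an SSE "data:" prefix.
        if line.startswith("data:"):
            line = line[len("data:"):].strip()
        if not line.startswith("{"):
            continue  # skip blank lines and any non-JSON framing lines
        payload = json.loads(line).get("payload", {})
        if payload.get("status") == "FAILED":
            raise RuntimeError(payload.get("error_message") or "document parsing failed")
        if payload.get("is_final") and payload.get("status") == "SUCCESS":
            return payload["doc_id"]
    return None


# Shortened versions of the events in the response example.
sample = [
    '{"type":"parsing","payload":{"doc_id":"0","error_message":"","is_final":false,"process":85,"status":"PARSING"}}',
    '',
    '{"type":"parsing","payload":{"doc_id":"1807688654434383264","error_message":"","is_final":true,"process":100,"status":"SUCCESS"}}',
]
doc_id = extract_doc_id(sample)
print(doc_id)
```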

3.2 Document Dialogue Parameter Example

Obtain the COS address of the file as follows:
Standard mode:
Call the DescribeStorageCredential API to obtain Bucket, Type, Region, and UploadPath, then concatenate them in the following format:
"https://" +
{DescribeStorageCredential.Response.Bucket} +
"." +
{DescribeStorageCredential.Response.Type} +
"." +
{DescribeStorageCredential.Response.Region} +
".myqcloud.com" +
{DescribeStorageCredential.Response.UploadPath}
Multi-Agent mode:
1. Call the DescribeStorageCredential API to obtain the temporary Token, SecretId, and SecretKey.
2. Call the pre-signed URL API to obtain a temporary COS access address.
Examples
Real-time document without content:
{
  "payload": {
    "request_id": "LkfnMf5IrS-4486169698",
    "session_id": "c60f0463-0176-4c15-a236-e4cbb6c21f97",
    "file_infos": [
      {
        "doc_id": "1833044072669409152",
        "file_name": "Riverside Scenery",
        "file_type": "docx",
        "file_size": "34859",
        "file_url": "https://lke-realtime-1251316161.cos.ap-jakarta.myqcloud.com/corp/1753030485940633600/1781229014471147520/doc/xxxxxxx.docx"
      }
    ],
    "content": ""
  }
}
Real-time document with content:
{
  "payload": {
    "request_id": "XCBLyiWwYV-9639940128",
    "session_id": "c60f0463-0176-4c15-a236-e4cbb6c21f97",
    "file_infos": [
      {
        "doc_id": "1833044598014003968",
        "file_name": "Riverside Scenery",
        "file_type": "docx",
        "file_size": "34859",
        "file_url": "https://lke-realtime-1251316161.cos.ap-jakarta.myqcloud.com/corp/1753030485940633600/1781229014471147520/doc/xxxxxx.docx"
      }
    ],
    "content": "Extract the main points from the document"
  }
}

3.3 Special Notes

After document parsing completes, assemble the result into the file_infos field of the WebSocket or HTTP SSE dialogue request, then send the dialogue request. The examples in 3.2 show simple file_infos values.
Note:
1. doc_id and file_size are of string type; passing an integer results in a parameter error.
2. Do not include the file extension in file_name.
3. The session_id used in the conversation must match the session_id passed when calling the docParse API; otherwise, the uploaded files cannot be retrieved.
4. Documents uploaded for real-time document parsing have a limit on conversation turns. The limit is consistent with the number of context turns configured on the page. The default retention period is 24 hours; if a new conversation based on the document occurs within 24 hours, the retention period is renewed for another 24 hours. Do not repeatedly use the same doc_id for Q&A.
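Notes 1 and 2 are easy to get wrong when the values come from other systems, so a small normalizing helper may be useful. The sketch below is illustrative only: make_file_info is a hypothetical name, and it strips the extension only when it matches file_type.

```python
def make_file_info(doc_id, file_name, file_type, file_size, file_url):
    """Build one file_infos entry with string IDs and an extension-free name."""
    # Note 2: file_name must not include the extension. Strip it only when it
    # matches file_type, so names that merely contain dots are left alone.
    suffix = "." + file_type
    if file_name.lower().endswith(suffix.lower()):
        file_name = file_name[: -len(suffix)]
    return {
        "doc_id": str(doc_id),        # note 1: must be a string
        "file_name": file_name,
        "file_type": file_type,
        "file_size": str(file_size),  # note 1: must be a string
        "file_url": file_url,
    }


info = make_file_info(
    doc_id=1833044072669409152,
    file_name="Riverside Scenery.docx",
    file_type="docx",
    file_size=34859,
    file_url="https://lke-realtime-1251316161.cos.ap-jakarta.myqcloud.com/corp/1753030485940633600/1781229014471147520/doc/xxxxxxx.docx",
)
print(info)
```

The resulting dictionary goes into the file_infos list of the dialogue request, together with the same session_id that was passed to docParse.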

4. Image QA and Document QA Demo

Note:
Demos are not currently available for other programming languages. You can refer to the documentation and the existing demos to implement your own.
The demo is intended for quick verification and uses the HTTP SSE dialogue method. You can choose the WebSocket dialogue method based on your business needs.


