ADP Document Parsing Protocol

Last updated: 2026-02-06 15:30:35

Note:
This document describes the document parsing protocol defined by ADP and guides users in wrapping their own parsing service so that it conforms to the protocol. For the end-to-end process of connecting a document parsing service to ADP, see the ADP Document Parsing Service Access Guide.
Version information
Version number: v1.0
Update date: 2025-12-23
Protocol: HTTPS
Request method: POST
Data format: JSON
Character encoding: UTF-8
Authentication method
Bearer Token Authentication
All API requests must carry a Bearer Token in the HTTP Header for authentication.
Header format:
Authorization: Bearer {your_access_token}
Example:
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
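As a minimal sketch, the header format above could be assembled in Python as follows. The helper name `build_auth_headers` is illustrative and not part of the protocol; the `Content-Type` value is the one listed in the request header tables of this document.

```python
def build_auth_headers(access_token: str) -> dict[str, str]:
    """Return the HTTP headers that every request in this protocol carries."""
    token = access_token.strip()
    if not token:
        raise ValueError("access_token must not be empty")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


headers = build_auth_headers("your_access_token")
print(headers["Authorization"])
```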
API List
1. Submit a Document Parsing Task
API Description
Submits a document parsing task. Supported document formats include PDF, Word, Excel, and PPT. The response returns a TaskId, which is used later to query the parsing result.
API address
POST https://api.example.com/api/v1/document/parse/submit
Request Headers (Headers)
Parameter Name	Type	Required	Description
Authorization	string	Yes	Bearer Token Authentication
Content-Type	string	Yes	application/json
Request Parameters (Request Body)
Parameter Name	Type	Required	Description	Examples
FileType	string	Yes	File type. Supported values: PDF, DOCX, DOC, XLSX, XLS, PPTX, PPT, TXT, MD, HTML.	"PDF"
FileName	string	Yes	File name, including the extension.	"example.pdf"
FileUrl	string	Yes	File download address. Must be an accessible HTTPS link; signed temporary URLs are supported (validity period should be ≥ 1 hour).	"https://example.cos.ap-jakarta.myqcloud.com/public/example/example.pdf"
Request Example
{
  "FileType": "PDF",
  "FileName": "example.pdf",
  "FileUrl": "https://example.cos.ap-jakarta.myqcloud.com/public/example/example.pdf"
}
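The constraints in the request parameter table can be checked on the client before a request is sent. The sketch below is illustrative only: `build_submit_payload` is a hypothetical helper, and it assumes that `FileType` values are matched exactly as written in the table (uppercase), which the protocol does not state explicitly.

```python
from urllib.parse import urlparse

# File types listed in the FileType row of the request parameter table.
SUPPORTED_FILE_TYPES = {
    "PDF", "DOCX", "DOC", "XLSX", "XLS", "PPTX", "PPT", "TXT", "MD", "HTML",
}


def build_submit_payload(file_type: str, file_name: str, file_url: str) -> dict[str, str]:
    """Validate the three required fields and return the JSON request body."""
    if file_type not in SUPPORTED_FILE_TYPES:
        raise ValueError(f"unsupported FileType: {file_type!r}")
    if not file_name or "." not in file_name:
        raise ValueError("FileName must include an extension")
    parsed = urlparse(file_url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("FileUrl must be an HTTPS link")
    return {"FileType": file_type, "FileName": file_name, "FileUrl": file_url}
```

Whether a signed URL stays valid for at least one hour cannot be verified from the URL alone in general, so that check is left to whoever generates the link.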
Response Parameters (Response)
Parameter Name	Type	Description
Response	object	Response object.
Response.RequestId	string	Request ID for troubleshooting.
Response.TaskId	string	Task ID for querying parsing results.
Response Example
Successful Response (200 OK):
{
  "Response": {
    "RequestId": "5e148c27-9c21-43cd-992c-799117bb4216",
    "TaskId": "236e51fd-827b-41cb-b303-56003a817ce5"
  }
}
Error Response:
{
  "Response": {
    "Error": {
      "Code": "InvalidParameter",
      "Message": "FileUrl is required"
    },
    "RequestId": "5e148c27-9c21-43cd-992c-799117bb4216"
  }
}
Error Code Description
Error Code	HTTP Status Code	Description
InvalidParameter	400	The request parameters were incorrect.
InvalidFileUrl	400	File URL invalid or inaccessible.
UnsupportedFileType	400	Unsupported file type.
Unauthorized	401	Unauthorized, token missing.
Forbidden	403	Token invalid or expired.
FileTooLarge	413	File size exceeds the limit.
TooManyRequests	429	Request rate limit exceeded.
InternalError	500	Internal service error.
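Both the successful and the error response wrap their content in a top-level `Response` object, and errors carry an `Error` object with `Code` and `Message`. A minimal sketch of separating the two cases on the client side follows; the names `ApiError` and `unwrap_response` are illustrative and not defined by the protocol.

```python
class ApiError(Exception):
    """Raised when the Response object contains an Error entry."""

    def __init__(self, code: str, message: str, request_id: str):
        super().__init__(f"{code}: {message} (RequestId={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id


def unwrap_response(body: dict) -> dict:
    """Return the inner Response object, or raise ApiError if it reports an error."""
    response = body["Response"]
    error = response.get("Error")
    if error is not None:
        raise ApiError(
            error.get("Code", ""),
            error.get("Message", ""),
            response.get("RequestId", ""),
        )
    return response
```

Keeping the `RequestId` on the exception makes it available for troubleshooting, which is the purpose the response parameter table gives for that field.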
2. Query Document Parsing Results
API Description
Queries the status and result of a document parsing task by TaskId. Once parsing completes, the response includes a download URL for the result file.
API address
POST https://api.example.com/api/v1/document/parse/query
Request Headers (Headers)
Parameter Name	Type	Required	Description
Authorization	string	Yes	Bearer Token Authentication
Content-Type	string	Yes	application/json
Request Parameters (Request Body)
Parameter Name	Type	Required	Description	Examples
TaskId	string	Yes	Task ID returned by the submission interface.	"236e51fd-827b-41cb-b303-56003a817ce5"
Request Example
{
  "TaskId": "236e51fd-827b-41cb-b303-56003a817ce5"
}
Response Parameters (Response)
Parameter Name	Type	Description
Response	object	Response object.
Response.RequestId	string	Request ID.
Response.Status	string	Task status: Pending (waiting), Processing (in progress), Success (succeeded), or Failed (failed).
Response.DocumentRecognizeResultUrl	string	Download URL of the parsed result file (ZIP format). Returned only when Status is Success.
Response.Progress	integer	Task progress (0-100). Returned only when Status is Processing.
Response.ErrorCode	string	Error code. Returned only when Status is Failed.
Response.ErrorMessage	string	Error message. Returned only when Status is Failed.
Response Example
Successful Response - Task Completed (200 OK):
{
  "Response": {
    "RequestId": "ffe23ed8-2b64-4835-aedc-ca9a5b5a7384",
    "Status": "Success",
    "DocumentRecognizeResultUrl": "https://example.cos.ap-jakarta.myqcloud.com/public/example/example.zip"
  }
}
Successful Response - Task in Progress (200 OK):
{
  "Response": {
    "RequestId": "ffe23ed8-2b64-4835-aedc-ca9a5b5a7384",
    "Status": "Processing",
    "Progress": 65
  }
}
Successful Response - Task Pending (200 OK):
{
  "Response": {
    "RequestId": "ffe23ed8-2b64-4835-aedc-ca9a5b5a7384",
    "Status": "Pending"
  }
}
Successful Response - Task Failed (200 OK):
{
  "Response": {
    "RequestId": "ffe23ed8-2b64-4835-aedc-ca9a5b5a7384",
    "Status": "Failed",
    "ErrorCode": "ParseError",
    "ErrorMessage": "Document format corrupted and cannot be parsed"
  }
}
Error Response - Task Does Not Exist (404 Not Found):
{
  "Response": {
    "Error": {
      "Code": "TaskNotFound",
      "Message": "Task not found"
    },
    "RequestId": "ffe23ed8-2b64-4835-aedc-ca9a5b5a7384"
  }
}
Task Status Descriptions
Status	Description	Recommended Actions
Pending	Task has been submitted and is pending processing.	Continue polling the task status.
Processing	Task in progress.	Continue polling the task status.
Success	Task completed successfully.	Download the result file.
Failed	Task failed.	View the error message and resubmit.
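The recommended actions above amount to a polling loop that stops on `Success` or `Failed`. The sketch below keeps the HTTP call out of the loop by accepting a callable, so it can be exercised without a live service. The name `poll_until_done`, the default of 60 attempts, and the 5-second interval are assumptions for illustration, not values required by the protocol.

```python
import time
from typing import Callable

# Statuses after which polling stops, per the task status table.
TERMINAL_STATUSES = {"Success", "Failed"}


def poll_until_done(
    fetch_status: Callable[[], dict],
    max_attempts: int = 60,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until it returns a terminal Status or attempts run out."""
    for attempt in range(max_attempts):
        response = fetch_status()
        if response["Status"] in TERMINAL_STATUSES:
            return response
        if attempt < max_attempts - 1:
            sleep(interval)  # Pending or Processing: wait, then poll again
    raise TimeoutError(f"task still not finished after {max_attempts} attempts")
```

Here `fetch_status` is expected to return the inner `Response` object of the query interface. The caller inspects `Status` on the returned object to decide between downloading the result and reporting `ErrorMessage`.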
Error Code Description
Error Code	HTTP Status Code	Description
InvalidParameter	400	The request parameters were incorrect.
TaskNotFound	404	Task does not exist.
Unauthorized	401	Unauthorized, token missing.
Forbidden	403	Token invalid or expired.
InternalError	500	Internal service error.
3. Synchronous Parsing Interface
API Description
Parses a document synchronously. Supported document formats include PDF, Word, Excel, and PPT. Unlike the submit interface, the response returns the parsing result directly, so no follow-up query is required.
API address
POST https://api.example.com/api/v1/document/parse/sync_parse
Request Headers (Headers)
Parameter Name	Type	Required	Description
Authorization	string	Yes	Bearer Token Authentication
Content-Type	string	Yes	application/json
Request Parameters (Request Body)
Parameter Name	Type	Required	Description	Examples
FileType	string	Yes	File type. Supported values: PDF, DOCX, DOC, XLSX, XLS, PPTX, PPT, TXT, MD, HTML.	"PDF"
FileName	string	Yes	File name, including the extension.	"example.pdf"
FileUrl	string	Yes	File download address. Must be an accessible HTTPS link; signed temporary URLs are supported (validity period should be ≥ 1 hour).	"https://example.cos.ap-jakarta.myqcloud.com/public/example/example.pdf"
Request Example
{
  "FileType": "PDF",
  "FileName": "example.pdf",
  "FileUrl": "https://example.cos.ap-jakarta.myqcloud.com/public/example/example.pdf"
}
Response Parameters (Response)
Parameter Name	Type	Description
Response	object	Response object.
Response.RequestId	string	Request ID.
Response.Status	string	Task status: Pending (waiting), Processing (in progress), Success (succeeded), or Failed (failed).
Response.DocumentRecognizeResultUrl	string	Download URL of the parsed result file (ZIP format). Returned only when Status is Success.
Response.Progress	integer	Task progress (0-100). Returned only when Status is Processing.
Response.ErrorCode	string	Error code. Returned only when Status is Failed.
Response.ErrorMessage	string	Error message. Returned only when Status is Failed.
Response Example
Successful Response - Task Completed (200 OK):
{
  "Response": {
    "RequestId": "ffe23ed8-2b64-4835-aedc-ca9a5b5a7384",
    "Status": "Success",
    "DocumentRecognizeResultUrl": "https://example.cos.ap-jakarta.myqcloud.com/public/example/example.zip"
  }
}
Error Response - Task Failed (200 OK):
{
  "Response": {
    "RequestId": "ffe23ed8-2b64-4835-aedc-ca9a5b5a7384",
    "Status": "Failed",
    "ErrorCode": "ParseError",
    "ErrorMessage": "Document format corrupted and cannot be parsed"
  }
}
Error Response - Parameter Error (400 Bad Request):
{
  "Response": {
    "Error": {
      "Code": "InvalidParameter",
      "Message": "FileUrl is required"
    },
    "RequestId": "ffe23ed8-2b64-4835-aedc-ca9a5b5a7384"
  }
}
Error Code Description
Error Code	HTTP Status Code	Description
InvalidParameter	400	The request parameters were incorrect.
InvalidFileUrl	400	File URL invalid or inaccessible.
UnsupportedFileType	400	Unsupported file type.
Unauthorized	401	Unauthorized, token missing.
Forbidden	403	Token invalid or expired.
FileTooLarge	413	File size exceeds the limit.
TooManyRequests	429	Request rate limit exceeded.
RequestTimeout	408	Request timeout (parsing takes too long).
InternalError	500	Internal service error.
Parsing Result File Description
Result File Format
After parsing completes, DocumentRecognizeResultUrl holds the download URL of a ZIP archive with the following layout:
76aef68b-c444-41d2-829a-d513fa35e42b.zip                     # zip file downloaded from DocumentRecognizeResultUrl
├── 76aef68b-c444-41d2-829a-d513fa35e42b/                    # Subdirectory (named with task ID)
│   ├── 76aef68b-c444-41d2-829a-d513fa35e42b_parse_page0.json  # Parsing result of page 1
│   ├── 76aef68b-c444-41d2-829a-d513fa35e42b_parse_page1.json  # Parsing result of page 2
│   ├── ...                                                  # Parsing results of more pages
│   └── images/                                              # Directory for extracted images
│       ├── 76c7b6051d432f6527bd91a02321d126-image.png     # Image file (named with UUID)
│       └── ...
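Given the layout above, the per-page JSON files can be read straight from the archive bytes. The following is a minimal sketch under the assumption that page files always end in `_parse_page{N}.json` as in the listing; `load_pages` is an illustrative name. Pages are sorted by the numeric index so that `page10` comes after `page2`.

```python
import io
import json
import re
import zipfile

# Matches the page index in names such as "<task-id>_parse_page0.json".
PAGE_FILE_PATTERN = re.compile(r"_parse_page(\d+)\.json$")


def load_pages(zip_bytes: bytes) -> list[dict]:
    """Return the page objects from a result archive, ordered by page index."""
    indexed = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            match = PAGE_FILE_PATTERN.search(name)
            if match:
                page = json.loads(archive.read(name))
                indexed.append((int(match.group(1)), page))
    indexed.sort(key=lambda pair: pair[0])
    return [page for _, page in indexed]
```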
Parsing Result File Structure Description
The parsing result of each page is saved in a separate JSON file (*_parse_page{N}.json) that contains the complete recognition information for that page. The index N in the file name starts from 0, so *_parse_page0.json holds page 1.
Parsing Page Structure (Page Object)
Field Name	Type	Description
PageNumber	integer	The page number, starting from 1.
Angle	integer	Page rotation angle (°).
RotatedAngle	integer	Current rotation angle (°).
Height	integer	Page height (px).
Width	integer	Page width (px).
OriginHeight	integer	Original page height (px).
OriginWidth	integer	Original page width (px).
Elements	array	Page element list, see below for details.
Element Structure (Element Object)
Field Name	Type	Description
Index	integer	The index of the element in the page, starting from 0.
Type	string	Element type: title, text, figure, table, or figure_text.
Text	string	Element text content.
Level	integer	Element level: 0 indicates a top-level element, 1 indicates a nested element.
Polygon	object	Element position coordinates (quadrilateral), see below for details.
InsetImageName	string	Embedded image name (if any).
Elements	array	Nested child element list (recursive structure); null when there are no children.
ImagePath	string	Image file path (relative to the ZIP root directory).
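Because `Elements` is recursive and may be `null` when an element has no children, a small generator is a convenient way to visit every element of a page. This is a sketch, with `iter_elements` as an illustrative name.

```python
from typing import Iterator, List, Optional


def iter_elements(elements: Optional[List[dict]]) -> Iterator[dict]:
    """Yield every element depth-first, each parent before its children."""
    for element in elements or []:  # treats null / missing Elements as empty
        yield element
        yield from iter_elements(element.get("Elements"))


page = {
    "Elements": [
        {"Type": "title", "Text": "# Data Scale", "Elements": None},
        {"Type": "figure", "Text": "", "Elements": [
            {"Type": "figure_text", "Text": "1 000", "Elements": None},
        ]},
    ]
}
print([element["Type"] for element in iter_elements(page["Elements"])])
```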
Coordinate Structure (Polygon Object)
Field Name	Type	Description
LeftTop	object	Top-left corner coordinates {"X": int, "Y": int}
RightTop	object	Top-right corner coordinates {"X": int, "Y": int}
LeftBottom	object	Bottom-left corner coordinates {"X": int, "Y": int}
RightBottom	object	Bottom-right corner coordinates {"X": int, "Y": int}
Coordinate system description:
The origin (0, 0) is located at the top-left corner of the page.
The X-axis increases to the right.
The Y-axis increases downward.
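Many downstream tools expect an axis-aligned box rather than four corners. As a sketch, a Polygon object can be reduced to `(x, y, width, height)` in the coordinate system described above; `polygon_to_bbox` is an illustrative helper, not part of the protocol. Taking the minimum and maximum over all four corners keeps the result correct even if the quadrilateral is slightly skewed.

```python
CORNER_KEYS = ("LeftTop", "RightTop", "LeftBottom", "RightBottom")


def polygon_to_bbox(polygon: dict) -> tuple[int, int, int, int]:
    """Return (x, y, width, height) of the smallest box enclosing the polygon."""
    xs = [polygon[key]["X"] for key in CORNER_KEYS]
    ys = [polygon[key]["Y"] for key in CORNER_KEYS]
    left, top = min(xs), min(ys)
    return left, top, max(xs) - left, max(ys) - top


figure_polygon = {
    "LeftTop": {"X": 41, "Y": 4},
    "RightTop": {"X": 733, "Y": 4},
    "LeftBottom": {"X": 41, "Y": 286},
    "RightBottom": {"X": 733, "Y": 286},
}
print(polygon_to_bbox(figure_polygon))  # → (41, 4, 692, 282)
```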
Element Type Description
Type	Description	Characteristic
title	Title	Typically a section heading in the document.
text	Plain text	Paragraphs, body text.
figure	Figure	Visual content such as images and charts; may include nested figure_text elements.
figure_text	Text in figures	Text content identified within a figure.
table	Table	Structured tabular data.
Parsing Result Example
Example 1: Page with Title and Charts
{
  "PageNumber": 1,
  "Angle": 0,
  "RotatedAngle": 0,
  "Height": 286,
  "Width": 736,
  "OriginHeight": 286,
  "OriginWidth": 736,
  "Elements": [
    {
      "Index": 0,
      "Type": "title",
      "Text": "# Data Scale",
      "Level": 0,
      "Polygon": {
        "LeftTop": {"X": 3, "Y": 98},
        "RightTop": {"X": 25, "Y": 98},
        "LeftBottom": {"X": 3, "Y": 169},
        "RightBottom": {"X": 25, "Y": 169}
      },
      "InsetImageName": "",
      "Elements": null,
      "ImagePath": ""
    },
    {
      "Index": 1,
      "Type": "figure",
      "Text": "",
      "Level": 0,
      "Polygon": {
        "LeftTop": {"X": 41, "Y": 4},
        "RightTop": {"X": 733, "Y": 4},
        "LeftBottom": {"X": 41, "Y": 286},
        "RightBottom": {"X": 733, "Y": 286}
      },
      "InsetImageName": "",
      "Elements": [
        {
          "Index": 0,
          "Type": "figure_text",
          "Text": "10 000 000\n1 000 000\n100 000\n10 000\n1 000",
          "Level": 1,
          "Polygon": {
            "LeftTop": {"X": 41, "Y": 4},
            "RightTop": {"X": 733, "Y": 4},
            "LeftBottom": {"X": 41, "Y": 286},
            "RightBottom": {"X": 733, "Y": 286}
          },
          "InsetImageName": "",
          "Elements": null,
          "ImagePath": ""
        }
      ],
      "ImagePath": "images/76c7b6051d432f6527bd91a02321d126-image.png"
    }
  ]
}
Example 2: Page Containing Text and Tables
{
  "PageNumber": 2,
  "Angle": 0,
  "RotatedAngle": 0,
  "Height": 842,
  "Width": 595,
  "OriginHeight": 842,
  "OriginWidth": 595,
  "Elements": [
    {
      "Index": 0,
      "Type": "text",
      "Text": "This is a plain text used to describe the main information of the document.",
      "Level": 0,
      "Polygon": {
        "LeftTop": {"X": 50, "Y": 100},
        "RightTop": {"X": 545, "Y": 100},
        "LeftBottom": {"X": 50, "Y": 130},
        "RightBottom": {"X": 545, "Y": 130}
      },
      "InsetImageName": "",
      "Elements": null,
      "ImagePath": ""
    },
    {
      "Index": 1,
      "Type": "table",
      "Text": "Name\tAge\tPosition\nZhang San\t28\tEngineer\nLi Si\t32\tManager",
      "Level": 0,
      "Polygon": {
        "LeftTop": {"X": 50, "Y": 200},
        "RightTop": {"X": 545, "Y": 200},
        "LeftBottom": {"X": 50, "Y": 350},
        "RightBottom": {"X": 545, "Y": 350}
      },
      "InsetImageName": "",
      "Elements": null,
      "ImagePath": ""
    }
  ]
}
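In Example 2 the `Text` of the table element appears to separate cells with tab characters and rows with newlines. Assuming that convention holds for table elements in general (the protocol does not say how merged cells or multi-line cells are encoded), the text can be split into rows with the following sketch; `table_text_to_rows` is an illustrative name.

```python
def table_text_to_rows(text: str) -> list[list[str]]:
    """Split tab/newline-delimited table text into a list of rows of cells."""
    return [line.split("\t") for line in text.split("\n") if line]


rows = table_text_to_rows("Name\tAge\tPosition\nZhang San\t28\tEngineer\nLi Si\t32\tManager")
header, *records = rows
print(header)
print(records)
```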
API Call Sample Code
Python Examples
import time

import requests

# Configuration
API_BASE_URL = "https://api.example.com"
BEARER_TOKEN = "your_access_token"
HTTP_TIMEOUT = 30  # seconds per HTTP call; illustrative value, adjust as needed

headers = {
    "Authorization": f"Bearer {BEARER_TOKEN}",
    "Content-Type": "application/json"
}

# 1. Submit a parsing task
def submit_parse_task(file_url, file_name, file_type):
    """Submit a document and return the TaskId used for later queries."""
    url = f"{API_BASE_URL}/api/v1/document/parse/submit"
    payload = {
        "FileType": file_type,
        "FileName": file_name,
        "FileUrl": file_url
    }

    response = requests.post(url, headers=headers, json=payload, timeout=HTTP_TIMEOUT)
    response.raise_for_status()

    result = response.json()
    return result["Response"]["TaskId"]

# 2. Query the parsing result
def query_parse_result(task_id, max_retries=60, interval=5):
    """Poll the query interface until the task succeeds, fails, or retries run out."""
    url = f"{API_BASE_URL}/api/v1/document/parse/query"
    payload = {"TaskId": task_id}

    for i in range(max_retries):
        response = requests.post(url, headers=headers, json=payload, timeout=HTTP_TIMEOUT)
        response.raise_for_status()

        result = response.json()["Response"]
        status = result["Status"]

        print(f"[{i+1}/{max_retries}] Task status: {status}")

        if status == "Success":
            return result["DocumentRecognizeResultUrl"]
        elif status == "Failed":
            raise RuntimeError(f"Parse failed: {result.get('ErrorMessage')}")

        # Pending or Processing: wait before the next poll
        time.sleep(interval)

    raise TimeoutError("Query timeout")

# 3. Usage Example
try:
    # Submit Task
    task_id = submit_parse_task(
        file_url="https://example.com/document.pdf",
        file_name="document.pdf",
        file_type="PDF"
    )
    print(f"Task submitted: {task_id}")

    # Query Result
    result_url = query_parse_result(task_id)
    print(f"Parse completed: {result_url}")

    # Download Results
    # download_and_extract(result_url)

except Exception as e:
    print(f"Error: {e}")
Running Environment
Operating System: Ubuntu 24.04.3 LTS / x86_64
Runtime Version: Python 3.11.1
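The example above leaves `download_and_extract` commented out and undefined. One possible implementation, using only the Python standard library, is sketched below. The `dest_dir` and `timeout` parameters are additions for illustration and do not appear in the original call, and the whole archive is read into memory, which assumes result files of moderate size.

```python
import io
import zipfile
from pathlib import Path
from urllib.request import urlopen


def download_and_extract(result_url: str, dest_dir: str = "parse_result",
                         timeout: float = 60.0) -> list[str]:
    """Download the result ZIP, extract it under dest_dir, and return member names."""
    with urlopen(result_url, timeout=timeout) as response:
        data = response.read()

    target = Path(dest_dir)
    target.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        archive.extractall(target)
        return archive.namelist()
```

After extraction, the per-page JSON files and the `images/` directory sit under the task-named subdirectory inside `dest_dir`.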
Python Synchronous Interface Example
import requests

# Configuration
API_BASE_URL = "https://api.example.com"
BEARER_TOKEN = "your_access_token"

headers = {
    "Authorization": f"Bearer {BEARER_TOKEN}",
    "Content-Type": "application/json"
}

# Synchronous Parsing (Get results in a single request)
def sync_parse_document(file_url, file_name, file_type):
    """Parse a document in one request and return the result download URL."""
    url = f"{API_BASE_URL}/api/v1/document/parse/sync_parse"
    payload = {
        "FileType": file_type,
        "FileName": file_name,
        "FileUrl": file_url
    }

    # Set a longer timeout (recommended 5 minutes)
    response = requests.post(url, headers=headers, json=payload, timeout=300)
    response.raise_for_status()

    result = response.json()["Response"]

    if result["Status"] == "Success":
        return result["DocumentRecognizeResultUrl"]
    elif result["Status"] == "Failed":
        raise RuntimeError(f"Parse failed: {result.get('ErrorMessage', 'Unknown error')}")
    else:
        raise RuntimeError(f"Unexpected status: {result['Status']}")

# Usage Example
try:
    result_url = sync_parse_document(
        file_url="https://example.com/document.pdf",
        file_name="document.pdf",
        file_type="PDF"
    )
    print(f"Parse completed: {result_url}")

    # Download Results
    # download_and_extract(result_url)

except requests.exceptions.Timeout:
    print("Error: Request timeout (file too large or complex)")
except Exception as e:
    print(f"Error: {e}")
Running Environment
Operating System: Ubuntu 24.04.3 LTS / x86_64
Runtime Version: Python 3.11.1
CURL Example
Asynchronous Interface Invocation
Submit Task:
curl -X POST https://api.example.com/api/v1/document/parse/submit \
  -H "Authorization: Bearer your_access_token" \
  -H "Content-Type: application/json" \
  -d '{
    "FileType": "PDF",
    "FileName": "example.pdf",
    "FileUrl": "https://qidian-qbot-1251316161.cos.ap-jakarta.myqcloud.com/public/example/example.pdf"
  }'
Query Result:
curl -X POST https://api.example.com/api/v1/document/parse/query \
  -H "Authorization: Bearer your_access_token" \
  -H "Content-Type: application/json" \
  -d '{
    "TaskId": "236e51fd-827b-41cb-b303-56003a817ce5"
  }'
Synchronous Interface Call
Synchronous Parsing (Get results in a single request):
curl -X POST https://api.example.com/api/v1/document/parse/sync_parse \
  -H "Authorization: Bearer your_access_token" \
  -H "Content-Type: application/json" \
  --max-time 300 \
  -d '{
    "FileType": "PDF",
    "FileName": "example.pdf",
    "FileUrl": "https://qidian-qbot-1251316161.cos.ap-jakarta.myqcloud.com/public/example/example.pdf"
  }'
Note:
1. --max-time 300 sets the request timeout to 300 seconds (5 minutes). Adjust it based on file size.
2. The response returns the parsing result directly, so no additional query is required.
Must-Knows
If your service requires authentication, authentication failures must return exactly HTTP status 401 or 403. Any other status code will prevent the addition of custom models.
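For a service that implements this protocol, the mapping from the `Authorization` header to a status code could look like the sketch below. It follows the error code tables in this document (401 when the token is missing, 403 when it is invalid or expired). The function name `authenticate` and the in-memory set of valid tokens are hypothetical; a real service would verify the token with its own mechanism.

```python
from typing import Optional, Set, Tuple

BEARER_PREFIX = "Bearer "


def authenticate(authorization: Optional[str],
                 valid_tokens: Set[str]) -> Tuple[int, Optional[str]]:
    """Map an Authorization header value to (HTTP status, error code or None)."""
    if not authorization or not authorization.startswith(BEARER_PREFIX):
        return 401, "Unauthorized"  # header absent or not a Bearer token
    token = authorization[len(BEARER_PREFIX):].strip()
    if not token:
        return 401, "Unauthorized"  # "Bearer" with nothing after it
    if token not in valid_tokens:
        return 403, "Forbidden"  # token present but invalid or expired
    return 200, None


print(authenticate(None, {"secret"}))
print(authenticate("Bearer wrong", {"secret"}))
print(authenticate("Bearer secret", {"secret"}))
```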
