
Submit a Task

Last updated: 2026-01-12 22:36:57

Feature Description

Submit an ASR task.

Authorization Description

When using a sub-account, add the ci:CreateAsrJobs permission to the action field of the authorization policy. For all API operations supported by Cloud Infinite, see CI actions.
When a sub-account calls an asynchronous processing API, it must also be granted the cam:passrole permission. Asynchronous processing APIs perform COS read and write operations through CAM roles, and the passrole permission allows the role to be passed. For details, see Access Management > Write Operations > passrole API.
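For reference, a CAM authorization policy granting the submission permission to a sub-account might look like the following minimal sketch (illustrative only; in production, scope the resource field to your own bucket rather than using *):

```json
{
  "version": "2.0",
  "statement": [
    {
      "effect": "allow",
      "action": [
        "ci:CreateAsrJobs"
      ],
      "resource": ["*"]
    }
  ]
}
```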

Service Activation

To use this feature, bind a bucket and enable the Cloud Infinite (CI) service in advance.
To use this feature, enable the Smart Audio service in advance via the console or API.
Note:
After a bucket is bound to Cloud Infinite (CI), manually unbinding the bucket makes this feature unavailable.

Use Limits

Before using this API, confirm the related use limits. For details, see Use Limits.

Fee Instructions

This API is a paid service. The fees incurred are collected by Cloud Infinite. For billing details, see Smart Audio Fees.


Request

Request sample

POST /jobs HTTP/1.1
Host: <BucketName-APPID>.ci.<Region>.myqcloud.com
Date: <GMT Date>
Authorization: <Auth String>
Content-Length: <length>
Content-Type: application/xml

<body>
Note:
Authorization: Auth String. For details, see the Request Signature document.

Request header

This API only uses common request headers. For details, see Common Request Headers documentation.

Request body

The following shows the request body required for this operation.
<Request>
<Tag>SpeechRecognition</Tag>
<Input>
<Object>input/test.mp3</Object>
</Input>
<Operation>
<TemplateId>t1460606b9752148c4ab182f55163ba7cd</TemplateId>
<Output>
<Region>ap-chongqing</Region>
<Bucket>test-123456789</Bucket>
<Object>output/asr.txt</Object>
</Output>
<UserData>This is my data.</UserData>
<JobLevel>0</JobLevel>
</Operation>
<CallBack>http://callback.demo.com</CallBack>
<CallBackFormat>JSON</CallBackFormat>
</Request>
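The request body above can also be assembled programmatically. The following is a minimal Python sketch; the build_asr_job_body helper is illustrative (not part of any SDK), and the Authorization signature still has to be generated separately, for example with a COS SDK:

```python
import xml.etree.ElementTree as ET

def build_asr_job_body(input_object, template_id, out_region, out_bucket, out_object,
                       callback=None, callback_format="JSON"):
    """Build the XML request body for submitting an ASR job."""
    req = ET.Element("Request")
    ET.SubElement(req, "Tag").text = "SpeechRecognition"
    ET.SubElement(ET.SubElement(req, "Input"), "Object").text = input_object
    op = ET.SubElement(req, "Operation")
    ET.SubElement(op, "TemplateId").text = template_id
    out = ET.SubElement(op, "Output")
    ET.SubElement(out, "Region").text = out_region
    ET.SubElement(out, "Bucket").text = out_bucket
    ET.SubElement(out, "Object").text = out_object
    if callback:
        ET.SubElement(req, "CallBack").text = callback
        ET.SubElement(req, "CallBackFormat").text = callback_format
    return ET.tostring(req, encoding="unicode")

body = build_asr_job_body("input/test.mp3", "t1460606b9752148c4ab182f55163ba7cd",
                          "ap-chongqing", "test-123456789", "output/asr.txt",
                          callback="http://callback.demo.com")
# POST this body to https://<BucketName-APPID>.ci.<Region>.myqcloud.com/jobs
# with Content-Type: application/xml and a valid Authorization signature.
```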
The data are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| Request | None | Container that holds the request | Container | Yes |
The nodes of the Request container are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| Tag | Request | Task tag: SpeechRecognition | String | Yes |
| Input | Request | Information about the object to be operated on | Container | Yes |
| Operation | Request | Operation rule | Container | Yes |
| CallBackFormat | Request | Job callback format: JSON or XML. Default: XML. Takes priority over the queue callback format | String | No |
| CallBackType | Request | Job callback type: Url or TDMQ. Default: Url. Takes priority over the queue callback type | String | No |
| CallBack | Request | Job callback address. Takes priority over the queue callback address. If set to no, the queue callback address does not generate callbacks | String | No |
| CallBackMqConfig | Request | TDMQ configuration for the task callback. Required when CallBackType is TDMQ. For details, see CallBackMqConfig | Container | No |
The nodes of the Input container are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| Object | Request.Input | File path | String | No |
The nodes of the Operation container are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| TemplateId | Request.Operation | ASR template ID. For details, see Creating ASR Templates | String | No |
| SpeechRecognition | Request.Operation | ASR parameters, same as Request.SpeechRecognition in the Create ASR Template API | Container | No |
| Output | Request.Operation | Output configuration | Container | Yes |
| UserData | Request.Operation | User information to pass through; printable ASCII characters, up to 1,024 in length | String | No |
| JobLevel | Request.Operation | Task priority. Valid values: 0, 1, 2. A larger value indicates a higher priority. Default: 0 | String | No |
Note:
The ASR parameters must be set through either TemplateId or SpeechRecognition, with TemplateId taking priority.
The nodes of the Output container are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type | Required |
| --- | --- | --- | --- | --- |
| Region | Request.Operation.Output | Bucket region | String | Yes |
| Bucket | Request.Operation.Output | Bucket that stores the results | String | Yes |
| Object | Request.Operation.Output | File name of the result | String | Yes |

Response

Response Headers

This API returns only common response headers. For details, see the Common Response Headers documentation.

Response Body

The response body is returned as application/xml. An example including the complete node data is shown below:
<Response>
<JobsDetail>
<Code>Success</Code>
<CreationTime>2021-08-05T15:43:50+0800</CreationTime>
<EndTime>-</EndTime>
<Input>
<BucketId>test-1234567890</BucketId>
<Object>input/test.mp3</Object>
<Region>ap-chongqing</Region>
</Input>
<JobId>s58ccb634149211ed84ce2b1cd7fbb14a</JobId>
<Message/>
<Operation>
<Output>
<Bucket>test-1234567890</Bucket>
<Object>output/asr.txt</Object>
<Region>ap-chongqing</Region>
</Output>
<TemplateId>t1460606b9752148c4ab182f55163ba7cd</TemplateId>
<TemplateName>speech_demo</TemplateName>
<UserData>This is my data.</UserData>
<JobLevel>0</JobLevel>
</Operation>
<QueueId>pcd463e1467964d39ad2d3f66aacd8199</QueueId>
<QueueType>Speeching</QueueType>
<StartTime>-</StartTime>
<State>Submitted</State>
<Tag>SpeechRecognition</Tag>
</JobsDetail>
</Response>
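When handling the response in code, JobId and State are the fields a client typically needs to track the job afterwards. A minimal Python parsing sketch (response_xml is an abridged stand-in for the body shown above):

```python
import xml.etree.ElementTree as ET

response_xml = """<Response>
<JobsDetail>
<Code>Success</Code>
<JobId>s58ccb634149211ed84ce2b1cd7fbb14a</JobId>
<State>Submitted</State>
<Tag>SpeechRecognition</Tag>
</JobsDetail>
</Response>"""

def parse_jobs_detail(xml_text):
    """Extract the fields needed to track each submitted job."""
    root = ET.fromstring(xml_text)
    jobs = []
    for detail in root.findall("JobsDetail"):
        jobs.append({
            "job_id": detail.findtext("JobId"),
            "state": detail.findtext("State"),
            "code": detail.findtext("Code"),
            "tag": detail.findtext("Tag"),
        })
    return jobs

jobs = parse_jobs_detail(response_xml)
```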
The data are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| Response | None | Container that holds the results | Container |
The nodes of the Response container are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| JobsDetail | Response | Task details | Container array |
The nodes of the JobsDetail container are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| Code | Response.JobsDetail | Error code; meaningful only when State is Failed | String |
| Message | Response.JobsDetail | Error description; meaningful only when State is Failed | String |
| JobId | Response.JobsDetail | ID of the newly created task | String |
| Tag | Response.JobsDetail | Tag of the newly created task: SpeechRecognition | String |
| State | Response.JobsDetail | Task status. Submitted: submitted, pending execution; Running: executing; Success: executed successfully; Failed: execution failed; Pause: paused (when the queue is paused, tasks pending execution switch to this state); Cancel: canceled | String |
| CreationTime | Response.JobsDetail | Task creation time | String |
| StartTime | Response.JobsDetail | Task start time | String |
| EndTime | Response.JobsDetail | Task end time | String |
| QueueId | Response.JobsDetail | ID of the queue the task belongs to | String |
| QueueType | Response.JobsDetail | Queue type of the task | String |
| Input | Response.JobsDetail | Input resource address of the task | Container |
| Operation | Response.JobsDetail | Operation rule of the task | Container |
The nodes of the Input container are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| Region | Response.JobsDetail.Input | Bucket region | String |
| BucketId | Response.JobsDetail.Input | Bucket where the source file resides | String |
| Object | Response.JobsDetail.Input | File name of the source file | String |
The nodes of the Operation container are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| TemplateId | Response.JobsDetail.Operation | Template ID of the task | String |
| TemplateName | Response.JobsDetail.Operation | Template name of the task; returned when TemplateId exists | String |
| SpeechRecognition | Response.JobsDetail.Operation | ASR parameters | Container |
| Output | Response.JobsDetail.Operation | Output configuration | Container |
| UserData | Response.JobsDetail.Operation | User information passed through | String |
| JobLevel | Response.JobsDetail.Operation | Task priority | String |
| SpeechRecognitionResult | Response.JobsDetail.Operation | ASR task result; not returned if there is none | Container |
The nodes of the SpeechRecognitionResult container are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| AudioTime | Response.JobsDetail.Operation.SpeechRecognitionResult | Audio duration in seconds | String |
| Result | Response.JobsDetail.Operation.SpeechRecognitionResult | ASR result | String |
| FlashResult | Response.JobsDetail.Operation.SpeechRecognitionResult | Ultra-fast ASR result | Container array |
| ResultDetail | Response.JobsDetail.Operation.SpeechRecognitionResult | Recognition result details, including word-level time offsets for each sentence; generally used for subtitle generation. This field is not null when ResTextFormat=1 in the ASR request. Note: this field may be null, indicating that no valid value can be obtained | Container array |
The nodes of the FlashResult container are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| channel_id | Response.JobsDetail.Operation.SpeechRecognitionResult.FlashResult | Audio channel flag, starting from 0; corresponds to the number of audio channels | Int |
| text | Response.JobsDetail.Operation.SpeechRecognitionResult.FlashResult | Full recognition result of the audio channel | String |
| sentence_list | Response.JobsDetail.Operation.SpeechRecognitionResult.FlashResult | List of sentence/paragraph-level recognition results | Container array |
The nodes of the sentence_list container are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| text | Response.JobsDetail.Operation.SpeechRecognitionResult.FlashResult.sentence_list | Sentence/paragraph-level text | String |
| start_time | Response.JobsDetail.Operation.SpeechRecognitionResult.FlashResult.sentence_list | Start time | Int |
| end_time | Response.JobsDetail.Operation.SpeechRecognitionResult.FlashResult.sentence_list | End time | Int |
| speaker_id | Response.JobsDetail.Operation.SpeechRecognitionResult.FlashResult.sentence_list | Speaker ID; if speaker_diarization is enabled in the request, speakers are distinguished by speaker_id | Int |
| word_list | Response.JobsDetail.Operation.SpeechRecognitionResult.FlashResult.sentence_list | List of word-level recognition results | Container array |
The nodes of the word_list container are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| word | Response.JobsDetail.Operation.SpeechRecognitionResult.FlashResult.sentence_list.word_list | Word-level text | String |
| start_time | Response.JobsDetail.Operation.SpeechRecognitionResult.FlashResult.sentence_list.word_list | Start time | Int |
| end_time | Response.JobsDetail.Operation.SpeechRecognitionResult.FlashResult.sentence_list.word_list | End time | Int |
The nodes of the ResultDetail container are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| FinalSentence | Response.JobsDetail.Operation.SpeechRecognitionResult.ResultDetail | Final recognition result of a sentence | String |
| SliceSentence | Response.JobsDetail.Operation.SpeechRecognitionResult.ResultDetail | Intermediate recognition result of a sentence, split into words separated by spaces | String |
| StartMs | Response.JobsDetail.Operation.SpeechRecognitionResult.ResultDetail | Start time of the sentence (ms) | String |
| EndMs | Response.JobsDetail.Operation.SpeechRecognitionResult.ResultDetail | End time of the sentence (ms) | String |
| WordsNum | Response.JobsDetail.Operation.SpeechRecognitionResult.ResultDetail | Number of words in the sentence | String |
| SpeechSpeed | Response.JobsDetail.Operation.SpeechRecognitionResult.ResultDetail | Speaking rate of the sentence, in words/sec | String |
| SpeakerId | Response.JobsDetail.Operation.SpeechRecognitionResult.ResultDetail | Audio channel or speaker ID; if speaker_diarization is enabled or ChannelNum is set to 2 (stereo), speakers or channels are distinguished by this ID | String |
| Words | Response.JobsDetail.Operation.SpeechRecognitionResult.ResultDetail | Word details of the sentence | Container array |
The nodes of the Words container are described as follows:

| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| Word | Response.JobsDetail.Operation.SpeechRecognitionResult.ResultDetail.Words | Word text | String |
| OffsetStartMs | Response.JobsDetail.Operation.SpeechRecognitionResult.ResultDetail.Words | Start time offset within the sentence | String |
| OffsetEndMs | Response.JobsDetail.Operation.SpeechRecognitionResult.ResultDetail.Words | End time offset within the sentence | String |
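Since ResultDetail carries per-sentence start and end times in milliseconds, it maps naturally onto subtitle formats. A hypothetical sketch that turns a list of already-parsed ResultDetail entries (dicts with FinalSentence, StartMs, EndMs as strings, matching the table above) into SRT text:

```python
def ms_to_srt(ms):
    """Format a millisecond count as an SRT timestamp HH:MM:SS,mmm."""
    h, rem = divmod(ms, 3600000)
    m, rem = divmod(rem, 60000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def result_detail_to_srt(details):
    """Render ResultDetail entries as numbered SRT subtitle blocks."""
    blocks = []
    for i, d in enumerate(details, start=1):
        start = ms_to_srt(int(d["StartMs"]))
        end = ms_to_srt(int(d["EndMs"]))
        blocks.append(f"{i}\n{start} --> {end}\n{d['FinalSentence']}\n")
    return "\n".join(blocks)

srt = result_detail_to_srt([
    {"FinalSentence": "Hello world.", "StartMs": "0", "EndMs": "1500"},
    {"FinalSentence": "Second line.", "StartMs": "1600", "EndMs": "3200"},
])
```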

Error Code

This request returns common error responses and error codes. For more information, see Error Codes.

Practical Cases

Request 1: Use an ASR Template ID

POST /jobs HTTP/1.1
Authorization: q-sign-algorithm=sha1&q-ak=************************************&q-sign-time=1497530202;1497610202&q-key-time=1497530202;1497610202&q-header-list=&q-url-param-list=&q-signature=****************************************
Host: test-1234567890.ci.ap-beijing.myqcloud.com
Content-Length: 166
Content-Type: application/xml

<Request>
<Tag>SpeechRecognition</Tag>
<Input>
<Object>input/test.mp3</Object>
</Input>
<Operation>
<TemplateId>t1460606b9752148c4ab182f55163ba7cd</TemplateId>
<Output>
<Region>ap-chongqing</Region>
<Bucket>test-123456789</Bucket>
<Object>output/asr.txt</Object>
</Output>
<UserData>This is my data.</UserData>
<JobLevel>0</JobLevel>
</Operation>
<CallBack>http://callback.demo.com</CallBack>
<CallBackFormat>JSON</CallBackFormat>
</Request>

Response

HTTP/1.1 200 OK
Content-Type: application/xml
Content-Length: 230
Connection: keep-alive
Date: Mon, 28 Jun 2022 15:23:12 GMT
Server: tencent-ci
x-ci-request-id: NTk0MjdmODlfMjQ4OGY3XzYzYzhf****

<Response>
<JobsDetail>
<Code>Success</Code>
<CreationTime>2021-08-05T15:43:50+0800</CreationTime>
<EndTime>-</EndTime>
<Input>
<BucketId>test-1234567890</BucketId>
<Object>input/test.mp3</Object>
<Region>ap-chongqing</Region>
</Input>
<JobId>s58ccb634149211ed84ce2b1cd7fbb14a</JobId>
<Message/>
<Operation>
<JobLevel>0</JobLevel>
<Output>
<Bucket>test-1234567890</Bucket>
<Object>output/asr.txt</Object>
<Region>ap-chongqing</Region>
</Output>
<TemplateId>t1460606b9752148c4ab182f55163ba7cd</TemplateId>
<TemplateName>speech_demo</TemplateName>
<UserData>This is my data.</UserData>
<JobLevel>0</JobLevel>
</Operation>
<QueueId>pcd463e1467964d39ad2d3f66aacd8199</QueueId>
<QueueType>Speeching</QueueType>
<StartTime>-</StartTime>
<State>Submitted</State>
<Tag>SpeechRecognition</Tag>
</JobsDetail>
</Response>

Request 2: Use ASR Parameters

POST /jobs HTTP/1.1
Authorization: q-sign-algorithm=sha1&q-ak=************************************&q-sign-time=1497530202;1497610202&q-key-time=1497530202;1497610202&q-header-list=&q-url-param-list=&q-signature=****************************************
Host: test-1234567890.ci.ap-beijing.myqcloud.com
Content-Length: 166
Content-Type: application/xml

<Request>
<Tag>SpeechRecognition</Tag>
<Input>
<Object>input/test.mp3</Object>
</Input>
<Operation>
<SpeechRecognition>
<EngineModelType>16k_zh_video</EngineModelType>
<ChannelNum>1</ChannelNum>
<FilterDirty>1</FilterDirty>
<FilterModal>1</FilterModal>
</SpeechRecognition>
<Output>
<Region>ap-chongqing</Region>
<Bucket>test-123456789</Bucket>
<Object>output/asr.txt</Object>
</Output>
<UserData>This is my data.</UserData>
<JobLevel>0</JobLevel>
</Operation>
<CallBack>http://callback.demo.com</CallBack>
<CallBackFormat>JSON</CallBackFormat>
</Request>

Response

HTTP/1.1 200 OK
Content-Type: application/xml
Content-Length: 230
Connection: keep-alive
Date: Mon, 28 Jun 2022 15:23:12 GMT
Server: tencent-ci
x-ci-request-id: NTk0MjdmODlfMjQ4OGY3XzYzYzhf****

<Response>
<JobsDetail>
<Code>Success</Code>
<CreationTime>2021-08-05T15:43:50+0800</CreationTime>
<EndTime>-</EndTime>
<Input>
<BucketId>test-1234567890</BucketId>
<Object>input/test.mp3</Object>
<Region>ap-chongqing</Region>
</Input>
<JobId>s58ccb634149211ed84ce2b1cd7fbb14a</JobId>
<Message/>
<Operation>
<Output>
<Bucket>test-1234567890</Bucket>
<Object>output/asr.txt</Object>
<Region>ap-chongqing</Region>
</Output>
<SpeechRecognition>
<ChannelNum>1</ChannelNum>
<ConvertNumMode>0</ConvertNumMode>
<EngineModelType>16k_zh_video</EngineModelType>
<FilterDirty>0</FilterDirty>
<FilterModal>0</FilterModal>
<FilterPunc>0</FilterPunc>
<OutputFileType>txt</OutputFileType>
<ResTextFormat>0</ResTextFormat>
<SpeakerDiarization>0</SpeakerDiarization>
<SpeakerNumber>0</SpeakerNumber>
</SpeechRecognition>
<UserData>This is my data.</UserData>
<JobLevel>0</JobLevel>
</Operation>
<QueueId>pcd463e1467964d39ad2d3f66aacd8199</QueueId>
<QueueType>Speeching</QueueType>
<StartTime>-</StartTime>
<State>Submitted</State>
<Tag>SpeechRecognition</Tag>
</JobsDetail>
</Response>

