RecognizeAudio

Last updated: 2026-03-10 11:15:00

1. API Description

Domain name for API request: mps.intl.tencentcloudapi.com.

This API returns speech recognition results synchronously.

A maximum of 5 requests can be initiated per second for this API.
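A client can stay under the 5 requests per second limit by spacing its calls. The following is a minimal sketch, not part of any Tencent Cloud SDK: the class name `MinIntervalLimiter` and its parameters are hypothetical, and it spaces calls evenly (one every 0.2 s), which is stricter than the limit requires.

```python
import time


class MinIntervalLimiter:
    """Spaces calls at least 1/max_per_second seconds apart (hypothetical helper)."""

    def __init__(self, max_per_second=5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
```

Call `limiter.wait()` immediately before each RecognizeAudio request. The limiter is per process, so several workers sharing one account would need a shared limit.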

We recommend using API Explorer. It provides online calls, signature authentication, SDK code generation, and quick API search, and lets you view the request, the response, and auto-generated examples.

2. Input Parameters

The following request parameter list only provides API request parameters and some common parameters. For the complete common parameter list, see Common Request Parameters.

| Parameter Name | Required | Type | Description |
| --- | --- | --- | --- |
| Action | Yes | String | Common Params. The value used for this API: RecognizeAudio. |
| Version | Yes | String | Common Params. The value used for this API: 2019-06-12. |
| Region | No | String | Common Params. This parameter is not required for this API. |
| AudioData | Yes | String | Base64-encoded audio data. |
| Source | No | String | Target language for recognition. If this is not specified, the language is automatically identified (auto). Note: If the automatic identification provides unsatisfactory results, you can specify the language to improve the accuracy. Supported languages: auto: automatic identification; zh: Simplified Chinese; en: English; ja: Japanese; ko: Korean; vi: Vietnamese; ms: Malay; id: Indonesian; fil: Filipino; th: Thai; pt: Portuguese; tr: Turkish; ar: Arabic; es: Spanish; hi: Hindi; fr: French; de: German; it: Italian; yue: Cantonese; ru: Russian; af: Afrikaans; sq: Albanian; am: Amharic; hy: Armenian; az: Azerbaijani; eu: Basque; bn: Bengali; bs: Bosnian; bg: Bulgarian; my: Burmese; ca: Catalan; hr: Croatian; cs: Czech; da: Danish; nl: Dutch; et: Estonian; fi: Finnish; gl: Galician; ka: Georgian; el: Greek; gu: Gujarati; iw: Hebrew; hu: Hungarian; is: Icelandic; jv: Javanese; kn: Kannada; kk: Kazakh; km: Khmer; rw: Kinyarwanda; lo: Lao; lv: Latvian; lt: Lithuanian; mk: Macedonian; ml: Malayalam; mr: Marathi; mn: Mongolian; ne: Nepali; no: Norwegian Bokmal; fa: Persian; pl: Polish; ro: Romanian; sr: Serbian; si: Sinhala; sk: Slovak; sl: Slovenian; st: Southern Sotho; su: Sundanese; sw: Swahili; sv: Swedish; ta: Tamil; te: Telugu; ts: Tsonga; uk: Ukrainian; ur: Urdu; uz: Uzbek; ve: Venda; xh: Xhosa; zu: Zulu. |
| AudioFormat | No | String | Audio data format. Default value: pcm. Supported formats: pcm (mono 16-bit PCM data with a sample rate of 16000); ogg-opus (mono Opus-encoded Ogg data with sample rates of 16000, 24000, or 48000). |
| SampleRate | No | Integer | Audio sample rate. Supported sample rates: pcm: 16000; ogg-opus: 16000 / 24000 / 48000. |
| UserExtPara | No | String | Extended parameter. This is left empty by default. Use this parameter for special requirements. |
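The parameters above can be assembled into a JSON request body as follows. This is a minimal sketch: the function name `build_recognize_audio_body` and the client-side validation are illustrative assumptions, not part of the API. The format and sample rate combinations are the ones listed in the table. Signing and the common request parameters are not covered here.

```python
import base64
import json

# Allowed sample rates per audio format, as listed in the parameter table.
SUPPORTED_RATES = {
    "pcm": {16000},
    "ogg-opus": {16000, 24000, 48000},
}


def build_recognize_audio_body(audio_bytes, source="auto", audio_format="pcm",
                               sample_rate=None):
    """Return the JSON body for a RecognizeAudio request (hypothetical helper)."""
    if audio_format not in SUPPORTED_RATES:
        raise ValueError(f"unsupported AudioFormat: {audio_format!r}")
    if sample_rate is not None and sample_rate not in SUPPORTED_RATES[audio_format]:
        raise ValueError(f"unsupported SampleRate {sample_rate} for {audio_format}")

    body = {
        # AudioData must be Base64 text, not raw bytes.
        "AudioData": base64.b64encode(audio_bytes).decode("ascii"),
        "Source": source,
        "AudioFormat": audio_format,
    }
    # SampleRate is optional, so it is only sent when the caller sets it.
    if sample_rate is not None:
        body["SampleRate"] = sample_rate
    return json.dumps(body)
```

Validating locally avoids spending a request on input that the service would reject with `InvalidParameterValue.AudioFormat` or `InvalidParameterValue.SampleRate`.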

3. Output Parameters

| Parameter Name | Type | Description |
| --- | --- | --- |
| Text | String | Recognition result of the entire audio. |
| AudioLength | Float | Audio duration, in seconds. |
| Sentence | Array of RecognizeAudioSentence | Recognition results of individual sentences. |
| RequestId | String | The unique request ID, generated by the server, will be returned for every request (if the request fails to reach the server for other reasons, the request will not obtain a RequestId). RequestId is required for locating a problem. |

4. Example

Example 1: RecognizeAudio

Input Example

POST / HTTP/1.1
Host: mps.intl.tencentcloudapi.com
Content-Type: application/json
X-TC-Action: RecognizeAudio
<Common request parameters>

{
    "Source": "zh",
    "AudioFormat": "pcm",
    "AudioData": "KwDn/zIA5v///wUA0v8D"
}

Output Example

{
    "RequestId": "f27f3866-3882-4c18-a4ac-3b3d83fd2f5a",
    "Response": {
        "AudioLength": 4.2,
        "RequestId": "f27f3866-3882-4c18-a4ac-3b3d83fd2f5a",
        "Sentence": [
            {
                "End": 3.59,
                "Start": 0.03,
                "Text": "The third and fourth meetings were held at the Great Hall of the People.",
                "WordsInfo": [
                    {
                        "End": 0.27,
                        "Start": 0.03,
                        "Word": "The"
                    },
                    {
                        "End": 0.43,
                        "Start": 0.27,
                        "Word": "third"
                    },
                    {
                        "End": 0.51,
                        "Start": 0.43,
                        "Word": "and"
                    },
                    {
                        "End": 0.71,
                        "Start": 0.51,
                        "Word": "fourth"
                    },
                    {
                        "End": 0.91,
                        "Start": 0.71,
                        "Word": "meetings"
                    },
                    {
                        "End": 1.07,
                        "Start": 0.91,
                        "Word": "were"
                    },
                    {
                        "End": 1.55,
                        "Start": 1.39,
                        "Word": "held"
                    },
                    {
                        "End": 1.71,
                        "Start": 1.55,
                        "Word": "at"
                    },
                    {
                        "End": 1.95,
                        "Start": 1.75,
                        "Word": "the"
                    },
                    {
                        "End": 2.15,
                        "Start": 1.95,
                        "Word": "Great"
                    },
                    {
                        "End": 2.39,
                        "Start": 2.15,
                        "Word": "Hall"
                    },
                    {
                        "End": 2.75,
                        "Start": 2.47,
                        "Word": "of"
                    },
                    {
                        "End": 2.91,
                        "Start": 2.75,
                        "Word": "the"
                    },
                    {
                        "End": 3.11,
                        "Start": 2.91,
                        "Word": "People."
                    }
                ]
            }
        ],
        "Text": "The third and fourth meetings were held at the Great Hall of the People."
    }
}
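A response shaped like the output example can be flattened into per-word timings. The sketch below takes its field names (`Response`, `Sentence`, `WordsInfo`, `Word`, `Start`, `End`) from that example. The function name `word_timings` is hypothetical, and treating a missing or null `Sentence` or `WordsInfo` as empty is an assumption.

```python
import json


def word_timings(response_json):
    """Return a list of (word, start_seconds, end_seconds) tuples."""
    response = json.loads(response_json)["Response"]
    timings = []
    # "or []" covers both an absent field and an explicit null (assumption).
    for sentence in response.get("Sentence") or []:
        for info in sentence.get("WordsInfo") or []:
            timings.append((info["Word"], info["Start"], info["End"]))
    return timings
```

The resulting tuples can be used directly for subtitle alignment or for detecting pauses between consecutive words.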

5. Developer Resources

SDK

TencentCloud API 3.0 integrates SDKs that support various programming languages to make it easier for you to call APIs.

Command Line Interface

6. Error Code

The following only lists the error codes related to the API business logic. For other error codes, see Common Error Codes.

| Error Code | Description |
| --- | --- |
| InternalError.RecognitionError | Recognition error. |
| InvalidParameterValue.AudioData | Invalid audio data. |
| InvalidParameterValue.AudioDataTooLong | The audio data is too long. |
| InvalidParameterValue.AudioFormat | Unsupported audio data format. |
| InvalidParameterValue.SampleRate | Invalid audio sample rate. |
| InvalidParameterValue.SourceLanguage | SourceLanguage parameter error. |
| ResourceNotFound.UserUnregister | The user is not registered. |
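A caller typically needs to tell these codes apart. The sketch below makes two assumptions that this page does not state. First, failures arrive in the common Tencent Cloud API 3.0 envelope, `{"Response": {"Error": {"Code": ..., "Message": ...}, "RequestId": ...}}`. Second, only `InternalError.RecognitionError` is worth retrying, because the other codes indicate bad input or account state. The names `RecognizeAudioError` and `raise_for_error` are hypothetical.

```python
import json

# Assumption: only the server-side recognition failure may succeed on retry.
RETRYABLE_CODES = {"InternalError.RecognitionError"}


class RecognizeAudioError(Exception):
    """Carries the error code and RequestId of a failed call (hypothetical)."""

    def __init__(self, code, message, request_id):
        super().__init__(f"{code}: {message} (RequestId: {request_id})")
        self.code = code
        self.request_id = request_id
        self.retryable = code in RETRYABLE_CODES


def raise_for_error(response_json):
    """Return the Response object, or raise if it contains an Error."""
    response = json.loads(response_json)["Response"]
    error = response.get("Error")
    if error:
        raise RecognizeAudioError(
            error.get("Code", ""),
            error.get("Message", ""),
            response.get("RequestId"),
        )
    return response
```

Keeping the `RequestId` on the exception makes it available when reporting a problem, since the ID is required for locating it.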
