Tencent Cloud

Deploying ASR

Last updated: 2023-05-05 14:55:54

Use cases
Draw and guess: The audio of a user in the room is pulled through this API, recognized in real time, and converted into text. The text is then called back to the customer’s business server, which applies the business logic.
Audio audit: The audio streams are delivered to the speech recognition API for recognition and keyword-based filtering. How the results are handled depends on the business logic.
Real-time subtitles: The audio data in a room is recognized through this API in real time and converted into text, which is displayed on the frontend.
Architecture
The following figure shows the detailed process:
[Figure: architecture diagram]
Application strengths
Real-time return: The audio data in a Tencent Real-Time Communication (TRTC) room is recognized and returned in real time, which is fast and efficient.
Simple process: TRTC and Automatic Speech Recognition (ASR) are deeply integrated, so that the data streams are fully pipelined without complex operations.
Flexible use: The data can be associated with the business logic in real time after it is returned to the business server.
Must-knows
Speech recognition generally takes a long time to process, so Async Execution is enabled during function deployment.
The recognition results are sent to the business server. WebSocket connections are not supported, so the results cannot be sent directly to clients.
The default authentication type is App Authentication. For more information, see Application Management. You can change the authentication type to Authentication-Free during testing. For more information, see [Step 3. Create an API with Mock as the Backend Type](https://www.tencentcloud.com/document/product/628/44318).
Directions
1. Activate the service
You must activate Tencent Cloud ASR. For more information, see [Activate the service](https://www.tencentcloud.com/document/product/1118/43344#3.-.E6.96.B0.E6.89.8B.E5.85.A5.E9.97.A8).
2. Deploy the function
1. Log in to the SLS console.
2. Click Create application to go to the "Create application" page.
3. Select Live Stream Real-time ASR and set parameters on the Basic Configuration page.
Application name: Specify a custom name for the application.
Region: Select the region based on the actual business.
Key information: You can view the key information of your Tencent Cloud account on the Manage API Key page.
4. Click Complete.
5. Click the function name in the Function name column in the Cloud function section to go to the details page of the function, and click "Trigger management" on the left to view the access path.
3. Configure the ASR startup API
Protocol: HTTPS
Method: POST
URL: https://service-xxx-xxxx.sh.apigw.tencentcs.com/release/asr_speech
Request parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| SdkAppId | Int | Yes | Application ID. Each TRTC application has a unique application ID. |
| RoomId | Int | No | Room ID of the integer type. Each room of a TRTC application has a unique room ID. |
| StrRoomId | String | No | Room ID of the string type. Either `RoomId` or `StrRoomId` must be configured. If both are configured, `RoomId` is used. |
| UserId | String | Yes | ID of the user who uses the recording service. Each user of a TRTC application has a unique user ID. |
| UserSig | String | Yes | Signature of the user who uses the recording service. The signature is used for login authentication of the user. |
| Callback | String | No | The address to which a webhook is sent by using the POST method when the recording ends. |
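The room ID rule in the request parameters (either `RoomId` or `StrRoomId` must be configured, and `RoomId` is used when both are present) can be checked on the caller's side before a request is sent. The following is a minimal sketch of that check; the helper name `select_room_field` is hypothetical and is not part of any Tencent Cloud SDK.

```python
from typing import Optional, Tuple, Union


def select_room_field(
    room_id: Optional[int] = None,
    str_room_id: Optional[str] = None,
) -> Tuple[str, Union[int, str]]:
    """Return the (parameter name, value) pair that identifies the room.

    Hypothetical helper that mirrors the documented rule: RoomId takes
    precedence when both RoomId and StrRoomId are configured.
    """
    if room_id is not None:
        return ("RoomId", room_id)
    if str_room_id:
        return ("StrRoomId", str_room_id)
    # Neither field was supplied, which the API description does not allow.
    raise ValueError("Either RoomId or StrRoomId must be configured")
```

Calling `select_room_field(43474, "room-a")` returns the `RoomId` pair, which matches the precedence described for the API.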
Sample request:
{
    "SdkAppId": 1400000000,
    "RoomId": 43474,
    "UserId": "user_55952145",
    "UserSig": "eJwtzNEKgkAUBNBxxxxxxx",
    "Callback": "https://xxxxxxxx.com/post/xxx"
}
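The sample request above can be sent from a business server with the Python standard library. This is a sketch under several assumptions: the URL is the placeholder from this page and must be replaced with the access path shown under "Trigger management"; the body is sent as JSON with a `Content-Type: application/json` header, which this page does not state explicitly; and the API is set to Authentication-Free, because App Authentication needs request signing that is not shown here. The helper names are hypothetical.

```python
import json
import urllib.request
from typing import Optional

# Placeholder URL copied from this page; replace it with your own access path.
ASR_START_URL = "https://service-xxx-xxxx.sh.apigw.tencentcs.com/release/asr_speech"


def build_start_request(
    sdk_app_id: int,
    room_id: int,
    user_id: str,
    user_sig: str,
    callback: Optional[str] = None,
    url: str = ASR_START_URL,
) -> urllib.request.Request:
    """Build the HTTPS POST request that starts recognition (hypothetical helper)."""
    payload = {
        "SdkAppId": sdk_app_id,
        "RoomId": room_id,
        "UserId": user_id,
        "UserSig": user_sig,
    }
    if callback is not None:
        # Callback is optional, so it is only included when provided.
        payload["Callback"] = callback
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the gateway accepts a JSON request body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_asr(request: urllib.request.Request, timeout: float = 10.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Building the request separately from sending it keeps the payload easy to inspect or log before any network call is made.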
Recognition result webhook API
Webhook parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| SdkAppId | Int | Yes | Application ID. |
| RoomId | Int | Yes | Room ID of the integer type. |
| UserId | String | Yes | ID of the recognized user. |
| StrRoomId | String | Yes | Room ID of the string type. |
| Result | Array | Yes | Results of audio recognition in the format of [{},{},{},{}]. |
| Status | String | Yes | Recognition status of the current user. Valid values: `normal` and `finished`. |
The value of Result is a JSON array that contains the following objects:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| Voice | String | Yes | Text of the current sentence in UTF-8. |
| Index | Integer | Yes | Sequence number of the current sentence in the entire audio stream. The sequence number starts from 0. |
| StartTime | Integer | Yes | Start time of the current sentence in the entire audio stream. |
| EndTime | Integer | Yes | End time of the current sentence in the entire audio stream. |
| Message | String | Yes | Execution result of the recognition task, such as recognition finished, recognition in progress, and recognition failed. |
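The fields of each `Result` object map naturally onto a small typed record. The sketch below converts a `Result` array into such records and orders them by `Index`, which is useful for use cases such as subtitles. The class and function names are hypothetical, and the sorting step is an illustrative choice rather than something the API requires.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Sentence:
    """One recognized sentence, using the field names described above."""
    voice: str
    index: int
    start_time: int
    end_time: int
    message: str


def parse_result(result: List[Dict[str, Any]]) -> List[Sentence]:
    """Convert the Result array into Sentence records ordered by Index."""
    sentences = [
        Sentence(
            voice=item["Voice"],
            index=item["Index"],
            start_time=item["StartTime"],
            end_time=item["EndTime"],
            message=item["Message"],
        )
        for item in result
    ]
    # Index is the sequence number within the audio stream, starting from 0.
    return sorted(sentences, key=lambda sentence: sentence.index)
```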
Sample result:
{
    "RequestID": "95941e2c85898384a95b81c2a5******",
    "SdkAppId": 1400000000,
    "RoomId": 43474,
    "UserId": "user_55952145",
    "Status": "recognizing/finished",
    "Result": [{
        "Voice": "Real-time voice recognition",
        "Index": 0,
        "StartTime": 0,
        "EndTime": 1024,
        "Message": "success"
    }]
}
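On the business server, the webhook can be received with any HTTP framework. The following sketch uses only the Python standard library and assumes that the payload arrives as a JSON body in a POST request and that a plain 200 response is an acceptable acknowledgment; neither detail is confirmed on this page. The names `extract_text`, `AsrWebhookHandler`, and `serve` are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Dict, List


def extract_text(payload: Dict[str, Any]) -> List[str]:
    """Return the recognized sentences from one webhook payload."""
    return [item["Voice"] for item in payload.get("Result", [])]


class AsrWebhookHandler(BaseHTTPRequestHandler):
    """Minimal receiver for the recognition result webhook."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
        except ValueError:
            payload = None
        if not isinstance(payload, dict):
            # Reject bodies that are not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        # Business logic would go here; this sketch only prints the text.
        for text in extract_text(payload):
            print(payload.get("UserId"), payload.get("Status"), text)
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the webhook receiver on the given port (blocks until stopped)."""
    HTTPServer(("", port), AsrWebhookHandler).serve_forever()
```

Calling `serve()` starts the receiver; the address passed as `Callback` in the startup request would then need to point at this server.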
﻿
