
Billing of AI Recognition

Last updated: 2026-05-11 17:40:57

These billing instructions apply to three services: Speech-to-Text, AI Real-Time Translation, and Text-to-Speech.
Speech-to-Text: Transcribes spoken audio into text using automatic speech recognition (ASR/STT). This is commonly used to generate real-time captions.
AI Real-Time Translation: Translates transcribed text into target languages to deliver real-time multilingual subtitles.
Text-to-Speech: Converts text into speech in real time using text-to-speech (TTS) technology.
Billing Instructions
Speech-to-Text, AI Real-Time Translation, and Text-to-Speech are available without a monthly subscription. Each account receives 10,000 free minutes per month, which can be used across these services. Once your free quota is exhausted, charges automatically switch to pay-as-you-go if that billing option is enabled.
Speech-to-Text
This service recognizes and transcribes audio streams from specified users or all users in a TRTC room.
To ensure consistent output quality, third-party STT is not supported in AI Real-Time Translation scenarios.
Billing mode: Postpaid. 
Billing cycle: Daily. Specific billing details and the statement issuance time are subject to Billing Statement.
AI Real-Time Translation
This service translates transcribed content into one or more specified target languages in real-time.
Billing mode: Postpaid.
Billing cycle: Daily. Specific billing details and the statement issuance time are subject to Billing Statement.
Text-to-Speech
This service converts text to natural, fluent speech in real time, enabling live voice output.
Billing mode: Postpaid.
Billing cycle: Daily. Specific billing details and the statement issuance time are subject to Billing Statement.
Pricing
The following table lists the prices and supported languages for the Speech-to-Text, AI Real-Time Translation, and Text-to-Speech services:
Service Type	Model Type	Unit Price	Supported Languages
Speech-to-Text	Standard	0.02 USD/minute	Supports 22 languages: Chinese, Chinese (Traditional), English, Vietnamese, Japanese, Korean, Indonesian, Thai, Portuguese, Turkish, Arabic, Spanish, Hindi, French, Malay, Filipino, German, Italian, Russian, Swedish, Danish, and Norwegian. If additional language support is required, please contact sales or submit a support ticket.
AI Real-Time Translation	Standard	0.016 USD/minute	Supports 15 languages: Chinese, English, Vietnamese, Japanese, Korean, Indonesian, Thai, Portuguese, Arabic, Spanish, French, Malay, German, Italian, and Russian. If additional language support is required, please contact sales or submit a support ticket.
Text-to-Speech	Flash	0.06 USD/1,000 characters	Supports Chinese, English, Japanese, Korean, and Cantonese (dialect).
Text-to-Speech	Multilingual model	0.06 USD/1,000 characters	Please contact sales or submit a support ticket for additional languages.
Metering & Usage Notes
Note:
For Speech-to-Text and AI Real-Time Translation, service duration is metered in seconds and accumulated per SDKAppID. For billing, each day's total seconds are converted to minutes, and any remaining seconds are rounded up to the next full minute.
When Speech-to-Text or AI Real-Time Translation is enabled in a TRTC room, a robot joins as a virtual participant to subscribe to the relevant audio/video streams. This subscription also incurs audio and video usage duration.
Text-to-Speech billing is based on the daily cumulative character count. The pricing unit is 1,000 characters, calculated to three decimal places.
For Text-to-Speech billing, the character count is calculated as follows: each Chinese character (including Japanese Kanji, Korean Hanja, and other CJK ideographs) counts as 2 characters. All other characters, including English letters, characters from other languages, punctuation marks, symbols, spaces, and line breaks, count as 1 character.
Speech-to-Text, AI Real-Time Translation, and Text-to-Speech (integrated in Conversational AI) have a concurrency limit of 100. For Text-to-Speech in other scenarios, the limit is 20 QPS. Please contact sales or submit a support ticket if you require higher concurrency.
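The daily seconds-to-minutes conversion described in the notes above can be sketched as follows. This is a minimal illustration, not the service's actual metering code: the function names, the `(sdk_app_id, seconds)` input shape, and the sample SDKAppID values are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def billable_minutes(total_seconds: int) -> int:
    """Convert one day's accumulated seconds to minutes, rounding any remainder up."""
    if total_seconds < 0:
        raise ValueError("total_seconds must be non-negative")
    # Integer ceiling division: 61 seconds becomes 2 minutes, 60 seconds stays 1 minute.
    return (total_seconds + 59) // 60


def daily_minutes_by_app(sessions: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Accumulate seconds per SDKAppID for one day, then round once per SDKAppID."""
    totals: Dict[int, int] = defaultdict(int)
    for sdk_app_id, seconds in sessions:
        totals[sdk_app_id] += seconds
    return {app_id: billable_minutes(secs) for app_id, secs in totals.items()}


# Hypothetical usage records for one day: (SDKAppID, metered seconds).
sessions = [(1400000001, 125), (1400000001, 30), (1400000002, 60)]
print(daily_minutes_by_app(sessions))
```

Because seconds are summed before rounding, the two sessions of the first application (155 seconds in total) are billed as 3 minutes rather than 3 + 1 minutes.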
Speech-to-Text
Only the duration of audio streams actively undergoing recognition is billed.
In multi-stream scenarios, the cumulative duration of all input streams is used for billing.
ASR is activated for Speech-to-Text only after the user turns on the microphone, so the metered duration matches the length of time the microphone remains on.
AI Real-Time Translation
Billing is based on the duration of the input audio streams that are actively translated.
If a single input stream is translated into multiple target languages, billing is calculated as Input Duration × Number of Output Languages.
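As a small sketch of the multiplication rule above, the following function computes the billable translation duration for one input stream. The function name and the language codes are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def translation_billable_seconds(input_seconds: int, target_languages: Sequence[str]) -> int:
    """Billable duration = input duration x number of output languages."""
    if input_seconds < 0:
        raise ValueError("input_seconds must be non-negative")
    return input_seconds * len(target_languages)


# One 10-minute (600-second) stream translated into English and Japanese.
seconds = translation_billable_seconds(600, ["en", "ja"])
print(seconds // 60, "billable minutes")  # prints "20 billable minutes"
```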
Text-to-Speech
Usage is measured based on the number of input text characters for real-time speech synthesis.
For each individual broadcaster stream, charges are applied according to the number of characters that need to be synthesized.
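The character-counting rule from the notes above (CJK ideographs count as 2, everything else as 1) and the per-1,000-character pricing can be sketched as below. This is an illustration under stated assumptions: the Unicode ranges used to detect ideographs are a guess at what the service treats as "CJK ideographs", the function names are hypothetical, and the 0.06 USD price is the unit price from the pricing table.

```python
from decimal import Decimal

# Assumed ideograph ranges; the exact set used by the service is not documented here.
_CJK_RANGES = (
    (0x3400, 0x4DBF),    # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),    # CJK Unified Ideographs
    (0xF900, 0xFAFF),    # CJK Compatibility Ideographs
    (0x20000, 0x2FFFF),  # Supplementary Ideographic Plane (Extension B and later)
)

TTS_PRICE_PER_1000 = Decimal("0.06")  # USD per 1,000 characters, from the pricing table


def is_cjk_ideograph(ch: str) -> bool:
    code = ord(ch)
    return any(low <= code <= high for low, high in _CJK_RANGES)


def billable_characters(text: str) -> int:
    """Ideographs count as 2; letters, kana, hangul, punctuation, spaces, line breaks count as 1."""
    return sum(2 if is_cjk_ideograph(ch) else 1 for ch in text)


def tts_cost_usd(char_count: int) -> Decimal:
    """Price the daily character total in units of 1,000 characters (three decimal places)."""
    units = (Decimal(char_count) / 1000).quantize(Decimal("0.001"))
    return units * TTS_PRICE_PER_1000


print(billable_characters("Hello, 世界!"))  # 8 non-ideograph characters + 2 ideographs x 2 = 12
```

`Decimal` is used instead of `float` so that prices such as 0.06 are represented exactly.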
Billing Examples
All billing figures in the following examples are calculated and rounded to three decimal places.
Consider the following use case:
Users A and B: Engage in a conversation in Chinese
Viewer C: Requires English captions and English speech output
Viewer D: Requires Japanese captions and Japanese speech output
The system processes the conversation through the following steps:
1. Speech-to-Text (converting Chinese speech to text)
2. AI Real-Time Translation (translating text to English and Japanese)
3. Text-to-Speech (converting translated text to speech)
Usage Details:
Conversation duration: 10 minutes
Total text characters: 26 thousand characters (The text character counts in this example are provided for illustrative purposes only)
User A's English text: 8 thousand characters
User A's Japanese text: 5 thousand characters
User B's English text: 8 thousand characters
User B's Japanese text: 5 thousand characters
The corresponding charges are calculated as follows:
Billing Type	User A	User B	Subtotal
Speech-to-Text	10 minutes	10 minutes	20 minutes
AI Real-Time Translation	10 minutes × 2	10 minutes × 2	40 minutes
Text-to-Speech	8 thousand English characters + 5 thousand Japanese characters	8 thousand English characters + 5 thousand Japanese characters	26 thousand characters
Speech-to-Text charges: 20 minutes of usage at a unit price of 0.020 USD/minute costs 0.020 × 20 = 0.400 USD.
AI Real-Time Translation charges: 40 minutes of usage at a unit price of 0.016 USD/minute costs 0.016 × 40 = 0.640 USD.
Text-to-Speech charges: 26.000 thousand characters of usage at a unit price of 0.060 USD/thousand characters costs 0.060 × 26 = 1.560 USD.
In this scenario, the total fee is 0.400 + 0.640 + 1.560 = 2.600 USD.
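The arithmetic in this example can be reproduced with a short script. It is only a sketch: the data layout and names (`SPEAKERS`, `example_charges`) are hypothetical, while the unit prices and usage figures are taken from the pricing table and the example above.

```python
from decimal import Decimal

STT_PRICE = Decimal("0.02")           # USD per minute
TRANSLATION_PRICE = Decimal("0.016")  # USD per minute
TTS_PRICE = Decimal("0.06")           # USD per 1,000 characters

# Usage from the example: each speaker talks for 10 minutes and is translated
# into English and Japanese; character counts are per target language.
SPEAKERS = {
    "A": {"minutes": 10, "tts_chars": {"en": 8000, "ja": 5000}},
    "B": {"minutes": 10, "tts_chars": {"en": 8000, "ja": 5000}},
}


def example_charges(speakers):
    stt_minutes = sum(s["minutes"] for s in speakers.values())
    # Translation is billed as input duration x number of output languages.
    translation_minutes = sum(s["minutes"] * len(s["tts_chars"]) for s in speakers.values())
    tts_thousands = Decimal(sum(sum(s["tts_chars"].values()) for s in speakers.values())) / 1000

    charges = {
        "speech_to_text": STT_PRICE * stt_minutes,
        "translation": TRANSLATION_PRICE * translation_minutes,
        "text_to_speech": TTS_PRICE * tts_thousands,
    }
    charges["total"] = sum(charges.values())
    return charges


for item, usd in example_charges(SPEAKERS).items():
    print(f"{item}: {usd:.3f} USD")
```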
Integration Guide
For integration steps, please refer to the Speech-to-Text and Translation Integration Instructions.
To configure Text-to-Speech (TTS) in Conversational AI, refer to Conversational AI TTS Configuration.

