TTS is pay-as-you-go. For billing details, see Purchase Guide.
Currently, TTS supports Chinese, English, Cantonese, and mixed Chinese-English text.
No. If you need on-premises deployment, submit a ticket for assistance.
Currently, TTS offers 53 voices to choose from. For each voice, you can adjust parameters such as volume and speech rate.
Premium voices have higher fidelity and sound more natural than standard voices. Premium voice numbers start from 101000. Both types are billed on a pay-as-you-go basis: standard voices at 0.125 USD per 10,000 characters and premium voices at 0.185 USD per 10,000 characters.
You can choose voices based on your business needs. For example, in the audiobook scenario, you can use premium voices for frequently accessed top books and use standard voices for other books. This strikes a balance between the user experience and costs.
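To make the trade-off concrete, the following sketch estimates the cost of the mixed strategy using the two rates quoted above. It assumes strictly linear billing with no rounding, minimum charge, or free tier, which may not match actual billing (see Purchase Guide). The function name `estimate_cost` and the catalog sizes are hypothetical and chosen only for illustration.

```python
from decimal import Decimal

# Rates quoted above, in USD per 10,000 characters.
RATE_PER_10K = {"standard": Decimal("0.125"), "premium": Decimal("0.185")}

def estimate_cost(characters: int, tier: str) -> Decimal:
    """Estimate the fee in USD, assuming strictly linear billing (an assumption)."""
    if characters < 0:
        raise ValueError("characters must be non-negative")
    return RATE_PER_10K[tier] * characters / Decimal(10000)

# Hypothetical catalog: 2 million characters of top books, 18 million of others.
top_chars, other_chars = 2_000_000, 18_000_000
mixed = estimate_cost(top_chars, "premium") + estimate_cost(other_chars, "standard")
all_premium = estimate_cost(top_chars + other_chars, "premium")
print(f"mixed: {mixed} USD, all premium: {all_premium} USD")
```

Under these assumptions, the mixed strategy costs 262 USD and the all-premium option costs 370 USD for the same 20 million characters.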
Yes. Custom pronunciation and number reading can be implemented through SSML.
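As a rough illustration of number reading, the sketch below builds an SSML string that wraps a number in a `say-as` element, as defined in the W3C SSML specification. The helper name `read_number_as` is hypothetical, and the `interpret-as` value `digits` is an assumption: the W3C specification does not fix the allowed values, so check the TTS SSML documentation for the elements and values the service actually accepts.

```python
import xml.etree.ElementTree as ET

def read_number_as(prefix: str, number: str, interpret_as: str) -> str:
    """Wrap a number in <say-as> so the engine is told how to read it."""
    speak = ET.Element("speak")
    speak.text = prefix
    say_as = ET.SubElement(speak, "say-as", {"interpret-as": interpret_as})
    say_as.text = number
    return ET.tostring(speak, encoding="unicode")

# "digits" is an assumed value; confirm which values the service supports.
print(read_number_as("Your code is ", "2048", "digits"))
```

Building the markup with an XML library rather than string concatenation keeps the output well-formed and escapes special characters in the text automatically.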
No. Fees are charged based on the number of characters that are ultimately processed.
TTS APIs don't retain data. Currently, only 200 generated speech files can be retained in the console. We recommend you download the generated files for local storage.
Yes. You can implement this through SSML.
No. TTS converts text to natural-sounding speech and doesn't support inserting such sound effects.
Yes. For detailed directions, see TextToVoice.
Yes, but this depends on your business. TTS doesn't limit how the generated speech is used, as long as the use is legal.
Yes. You can customize the interval by using the break tag in SSML.
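As a minimal sketch, the following code joins sentences into one SSML string with a fixed pause between them, using the `break` element and its `time` attribute as defined in the W3C SSML specification. The helper name `with_pauses` and the 500 ms pause are hypothetical; the pause range and time units that TTS accepts are assumptions to verify against the TTS SSML documentation.

```python
import xml.etree.ElementTree as ET

def with_pauses(sentences: list[str], pause_ms: int) -> str:
    """Join sentences into one SSML string with a fixed pause between them."""
    speak = ET.Element("speak")
    speak.text = sentences[0] if sentences else ""
    for sentence in sentences[1:]:
        pause = ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
        pause.tail = sentence  # text spoken after the pause
    return ET.tostring(speak, encoding="unicode")

ssml = with_pauses(["Welcome.", "Your order has shipped."], 500)
print(ssml)
```

The resulting string places one `break` element between the two sentences and can then be submitted as the input text wherever the service accepts SSML.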