Parameters | Type | Mandatory | Description |
RefPhotoUrl | string | Yes | Template image URL. Supported formats: jpg, jpeg, png, bmp, webp. 1. File size: no more than 10 MB. 2. Each side of the image must be between 192 and 4096 pixels. 3. The aspect ratio (width:height) must be between 1:2 and 2:1. 4. The image must show a real person or a realistic cartoon human face. Avoid images with no face, an incomplete or unclear face, a strongly turned head, or obstructed lips. |
DriverType | string | Yes | Driver type. 1. Text: text-driven; the InputSsml field is required. 2. OriginalVoice: driven by original voice audio; the InputAudioUrl field is required. |
InputAudioUrl | string | No | URL of the audio that drives the Digital Human. Required when DriverType is OriginalVoice. Audio requirements: 1. Duration: 2 to 60 seconds. 2. Supported formats: wav, mp3, wma, m4a, aac, ogg. 3. File size: no more than 20 MB. |
InputSsml | string | No | Broadcast text; SSML tags are supported. See the Digital Human SSML Markup Language Specification for supported tag types and the example for tag syntax. The content must not include line breaks, and symbols must be escaped. Upper limit: 300 words. Lower limit: 4 words (counted as Unicode characters). In text-driven mode the text is first converted to audio; if the resulting audio exceeds 60 seconds, task creation fails. Required when DriverType is empty or Text. |
SpeechParam | object | No | Define audio parameters. This field is required when DriverType is Text. |
SpeechParam.Speed | float | No | Speech rate. Range: [0.5, 1.5]; 1.0 is normal speed, 0.5 is the slowest, and 1.5 is the fastest. Has no effect when DriverType is OriginalVoice (audio-driven). Required when DriverType is Text. |
SpeechParam.TimbreKey | string | No | Voice type Key. This field is required when DriverType is Text. |
SpeechParam.Volume | int | No | Volume level, from 0 to 10. The default, 0, is normal volume; higher values are louder. Note: volume adjustment is not supported for the TimbreKey values male_1-20 and female_1-23 (male voices 1-20, female voices 1-23). |
SpeechParam.EmotionCategory | string | No | Controls the emotion of the synthesized audio, supported only for multi-emotion timbres. See the Personal Asset Management API Paginated Query Timbre List for available values. |
SpeechParam.EmotionIntensity | int | No | Controls the intensity of the synthesized audio emotion, with a range of [50,200]. This is only effective when EmotionCategory is not empty. |
SpeechParam.TimbreLanguage | string | No | Voice type language. See the Personal Asset Management API Paginated Query Timbre List for available languages. For multilingual voice types, the corresponding language must be selected during synthesis. |
ConcurrencyType | string | No | Resource type used for the video production task. 1. Exclusive: uses concurrency and does not deduct from the hourly package. A concurrency pack must be purchased; otherwise task submission fails. 2. Shared: deducts from the hourly package. An hourly package must be purchased; otherwise task submission fails. 3. Not specified: defaults to "Exclusive" if you have purchased concurrency (with or without an hourly package), and to "Shared" if you have purchased only an hourly package. If neither has been purchased, task submission fails. |
CallbackUrl | string | No | If a callback URL is provided, the video production result is sent to it in a fixed format via a POST request. The format is described in Appendix II: Callback Request Body Format. Note: 1. CallbackUrl must be shorter than 1000 characters. 2. Only one request is sent; if it fails for any reason, it is not resent. |
VideoParam | object | No | Parameters for the output video. Fields left blank use their default values. |
VideoParam.EmotionLevel | int | No | Emotion intensity of the output video. Allowed levels: 1, 2, 3; the default is 2. Larger values increase audio control intensity but may cause unnatural results. |
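The cross-field rules in the table above are easy to get wrong, so a client-side check before submission may save a failed call. The following is a minimal sketch, not part of the official SDK: the function name `validate_payload` and its messages are hypothetical, and treating an empty DriverType like Text is an assumption drawn from the InputSsml row.

```python
def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems found in a request Payload; empty means none found."""
    problems = []
    if not payload.get("RefPhotoUrl"):
        problems.append("RefPhotoUrl is required")

    # Assumption: an empty DriverType is handled like "Text".
    driver = payload.get("DriverType") or "Text"
    speech = payload.get("SpeechParam") or {}

    if driver == "Text":
        if not payload.get("InputSsml"):
            problems.append("InputSsml is required when DriverType is Text")
        if not speech.get("TimbreKey"):
            problems.append("SpeechParam.TimbreKey is required when DriverType is Text")
        if "Speed" not in speech:
            problems.append("SpeechParam.Speed is required when DriverType is Text")
    elif driver == "OriginalVoice":
        if not payload.get("InputAudioUrl"):
            problems.append("InputAudioUrl is required when DriverType is OriginalVoice")
    else:
        problems.append(f"unknown DriverType: {driver!r}")

    # Numeric ranges taken from the parameter table.
    speed = speech.get("Speed")
    if speed is not None and not 0.5 <= speed <= 1.5:
        problems.append("SpeechParam.Speed must be within [0.5, 1.5]")
    if not 0 <= speech.get("Volume", 0) <= 10:
        problems.append("SpeechParam.Volume must be within [0, 10]")
    intensity = speech.get("EmotionIntensity")
    if intensity is not None and not 50 <= intensity <= 200:
        problems.append("SpeechParam.EmotionIntensity must be within [50, 200]")

    if len(payload.get("CallbackUrl", "")) >= 1000:
        problems.append("CallbackUrl must be shorter than 1000 characters")
    level = (payload.get("VideoParam") or {}).get("EmotionLevel", 2)
    if level not in (1, 2, 3):
        problems.append("VideoParam.EmotionLevel must be 1, 2, or 3")
    ctype = payload.get("ConcurrencyType")
    if ctype is not None and ctype not in ("Exclusive", "Shared"):
        problems.append("ConcurrencyType must be Exclusive or Shared")
    return problems
```

The check covers only what the table states; image and audio constraints (size, resolution, duration) need the files themselves and are left out of this sketch.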
Parameters | Type | Mandatory | Description |
TaskId | string | Yes | Video production task ID. Pass the TaskId to the Audio and Video Production Progress Query API to obtain the production progress and result. |
{"Header": {},"Payload": {"RefPhotoUrl": "http://virtualhuman-cos-test-1251316161.cos.ap-nanjing.myqcloud.com/ref_photo.jpg","DriverType": "Text","InputSsml": "Hello, I am the virtual <phoneme alphabet=\"py\" ph=\"fu4\">anchor</phoneme>","SpeechParam": {"TimbreKey": "female_1","Volume": 1,"Speed": 1.0}}}
{"Header": {},"Payload": {"RefPhotoUrl": "http://virtualhuman-cos-test-1251316161.cos.ap-nanjing.myqcloud.com/ref_photo.jpg","DriverType": "OriginalVoice","InputAudioUrl": "http://virtualhuman-cos-test-1251316161.cos.ap-nanjing.myqcloud.com/audio.mp3"}}
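The InputSsml description says that symbols must be escaped and line breaks are not allowed, but it does not list which symbols. The sketch below assumes standard XML entity escaping for plain text placed inside SSML, and assumes the 4 to 300 limit is counted in Unicode characters before escaping; both are assumptions to confirm against the SSML specification. The helper name `to_ssml_text` is hypothetical.

```python
from xml.sax.saxutils import escape


def to_ssml_text(text: str) -> str:
    """Prepare plain broadcast text for use in InputSsml (assumed rules, see lead-in)."""
    if "\n" in text or "\r" in text:
        raise ValueError("InputSsml must not contain line breaks")
    # Assumption: the limits apply to the number of Unicode characters.
    if not 4 <= len(text) <= 300:
        raise ValueError("InputSsml length must be between 4 and 300 characters")
    # escape() handles &, < and >; the extra mapping also escapes double quotes.
    return escape(text, {'"': "&quot;"})
```

Serializing the final request with `json.dumps` then takes care of JSON-level escaping, such as the `\"` sequences around SSML attribute values.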
{"Header": {"Code": 0,"DialogID": "","Message": "","RequestID": "fde854eaa981c7f2f7285d1c7eca335b","SessionID": "gzb7dec22117297528294581119"},"Payload": {"TaskId": "81883d47c6154edf8e276531f09227b6"}}
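A response like the one above can be reduced to the TaskId needed for the progress query. This is a minimal sketch: the function name `extract_task_id` is hypothetical, and treating `Header.Code == 0` as success is an assumption based only on the sample response.

```python
import json


def extract_task_id(response_text: str) -> str:
    """Parse a response body and return Payload.TaskId, raising on failure."""
    body = json.loads(response_text)
    header = body.get("Header", {})
    code = header.get("Code")
    # Assumption: Code 0 indicates success, as in the sample response.
    if code != 0:
        raise RuntimeError(
            f"request failed: Code={code}, Message={header.get('Message')!r}, "
            f"RequestID={header.get('RequestID')!r}"
        )
    task_id = body.get("Payload", {}).get("TaskId")
    if not task_id:
        raise RuntimeError("response has no Payload.TaskId")
    return task_id
```

Including the RequestID in the error message keeps the identifier at hand when reporting a failed call.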