
Tencent Cloud AI Digital Human

Best Practices Demo

Last updated: 2026-04-24 20:32:21
Note:
Before using this tutorial, you need to download the Demo (Python).

1. Overview

This Demo provides a complete Tencent Cloud Intelligent Digital Human (TCADH) interaction solution, supporting driving Digital Humans via text or audio, with real-time rendering of Digital Human video streams on H5 pages.

Core Capabilities

Capability | Description
Two stream creation methods | AssetVirtualmanKey (asset ID) or VirtualmanProjectId (Digital Human project ID)
Three streaming protocols | RTMP, TRTC, WebRTC
Two drive modes | Text-driven (input text to make the Digital Human speak) or audio-driven (upload an audio file to drive lip-sync)
H5 video playback | For the TRTC and WebRTC protocols, a browser playback page opens automatically

File list

├── tencent_virtual_human_completeV1.py # Main script (Python)
├── trtc_player.html # TRTC protocol H5 playback page
├── webrtc_player.html # WebRTC protocol H5 playback page
└── TcPlayer-2.4.5.js # TCPlayerLite SDK (WebRTC playback dependency)

2. Environment Preparation

2.1 Python dependencies

Warning:
Use Python version 3.11, as some dependencies are unavailable or have been removed in higher versions. It is strongly recommended to create a Python virtual environment first and install dependencies within the virtual environment.
pip install requests websocket-client pydub

2.2 System dependencies

The audio-driven feature requires ffmpeg (used for audio format conversion):
# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt install ffmpeg

# Windows
# Download https://ffmpeg.org/download.html and add it to PATH

2.3 Tencent Cloud credentials

Obtain the following information on the Tencent Cloud Digital Human Platform:
Parameter | Description | Obtaining Method
appkey | Application identifier | Digital Human Platform → Application Management (see Figure 1)
accesstoken | Access token | Digital Human Platform → Application Management (see Figure 1)
asset_virtualman_key | Avatar asset ID | Digital Human Platform → Image Asset Management (see Figure 2)
virtualman_project_id | Digital Human project ID | Digital Human Platform → Project Management (see Figure 3)
Note:
asset_virtualman_key and virtualman_project_id represent two different stream creation methods; configure exactly one of the two.

Figure 1. How to Obtain the appkey and accesstoken


Figure 2. How to Obtain the asset_virtualman_key


Figure 3. How to Obtain the virtualman_project_id

Note:
Before proceeding, you must first create an interactive project. For instructions on how to create one, see the figure below. After the project is created, you need to configure the digital human's avatar and voice (see the documentation).



3. Quick Start

3.1 Configuration Parameters

Edit the CONFIG dictionary at the bottom of the tencent_virtual_human_completeV1.py file:
CONFIG = {
    "appkey": "your_appkey",
    "accesstoken": "your_accesstoken",
    "asset_virtualman_key": "your_asset_key",  # Image asset ID (preferred)
    "virtualman_project_id": "",               # Project ID (used when asset_key is empty)
    "protocol": "rtmp",                        # Streaming protocol: "rtmp" / "trtc" / "webrtc"
    "protocol_option": None,                   # Protocol option (see Section 5)
}
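Before running, a quick sanity check on these fields can catch the most common mistakes. The helper below is illustrative only, not part of the demo script:

```python
# Illustrative sanity check for the CONFIG dictionary (not part of the script).
def validate_config(config: dict) -> None:
    if not (config.get("asset_virtualman_key") or config.get("virtualman_project_id")):
        raise ValueError("Set asset_virtualman_key or virtualman_project_id")
    if config.get("protocol") not in ("rtmp", "trtc", "webrtc"):
        raise ValueError('protocol must be "rtmp", "trtc", or "webrtc"')

validate_config({"asset_virtualman_key": "your_asset_key", "protocol": "rtmp"})
```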

Running Environment

Operating System: Ubuntu 24.04.3 LTS / x86_64

Runtime Version: Python 3.11.1

3.2 Run script

python tencent_virtual_human_completeV1.py
The script will automatically execute the following process flow:
1️⃣ Create a session (automatically select the stream creation method)
2️⃣ Wait for session readiness (polling status, maximum 120 seconds)
3️⃣ Enable session
🎬 Automatically launch the H5 player (for TRTC/WebRTC protocols)
4️⃣ Create WebSocket connection
5️⃣ Enter interactive mode
6️⃣ Close session

3.3 Interactive mode

After the interactive mode is entered, the terminal will prompt you to select a driver mode:
📋 Driver mode selection:
1 - text-driven (input text to make Digital Humans speak)
2 - audio-driven (selecting audio files to drive the Digital Human)
q - Exit
Input 1: Enter text-driven mode. After you enter text, the Digital Human synthesizes speech via TTS with synchronized lip-sync.
Input 2: Enter audio-driven mode. Enter an audio file path (formats such as mp3 and wav are supported), and the Digital Human's lip-sync is driven by your audio.
Input q: Exit and close the session.

4. Stream creation method

4.1 AssetVirtualmanKey stream creation (default)

Use image asset ID to create a session, which applies to Digital Humans created via the image asset management page.

API interface: /v2/ivh/sessionmanager/sessionmanagerservice/createsessionbyasset
CONFIG = {
    "asset_virtualman_key": "your_asset_key",
    "virtualman_project_id": "",  # Leave blank
}


Note:
You can refer to using personal asset image for stream creation to perform related operations.

4.2 VirtualmanProjectId stream creation

Use Digital Human project ID to create a session, which applies to Digital Humans created via the project management page.

API interface: /v2/ivh/sessionmanager/sessionmanagerservice/createsession
CONFIG = {
    "asset_virtualman_key": "",  # Leave blank
    "virtualman_project_id": "your_project_id",
}


Note:
You can refer to Create New Live Stream Session to perform related operations.

4.3 Priority Logic

The script automatically selects the stream creation method through the unified entry point of create_session():
def create_session(self) -> Tuple[bool, str]:
    if self.asset_virtualman_key:
        return self.create_session_by_asset()    # Priority
    elif self.virtualman_project_id:
        return self.create_session_by_project()  # Fallback
    else:
        return False, "Both parameters are empty, unable to create stream"

5. Streaming Protocol and ProtocolOption

5.1 Comparison of Three Protocols

Protocol | Latency | Playback Method | Scenarios
RTMP | 2-5 seconds | VLC and other external players | General scenarios, highest compatibility
TRTC | 200-400 ms | H5 page opens automatically (TRTC Web SDK) | Ultra-low-latency real-time interaction
WebRTC | 500 ms-1 s | H5 page opens automatically (TCPlayerLite) | Low-latency web playback

5.2 RTMP protocol (default)

CONFIG = {
    "protocol": "rtmp",
    "protocol_option": None,
}
After successful stream creation, the RTMP playback URL is returned, such as: rtmp://liveplay.ivh.qq.com/live/m789. It can be opened using players like VLC.

5.3 TRTC protocol

# Debug mode (uses the platform-unified AppId, no additional configuration required)
CONFIG = {
    "protocol": "trtc",
    "protocol_option": None,
}
After successful stream creation, a trtc:// playback URL is returned, and the script automatically:
1. Starts an HTTP server on local port 8080;
2. Parses parameters such as appId, roomId, and userSig from the trtc:// address;
3. Opens the trtc_player.html playback page in the browser;
4. The H5 page then enters the TRTC room as an audience role and pulls the Digital Human video stream.
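The parameter extraction in step 2 can be sketched as follows. The URL below is made up, and the assumption that the trtc:// address carries its parameters as an ordinary query string is ours, not confirmed by the source:

```python
# Extract appId, roomId, and userSig from a trtc:// address, assuming the
# parameters travel as a standard query string (illustrative URL).
from urllib.parse import parse_qs, urlsplit

def parse_trtc_url(trtc_url: str) -> dict:
    query = parse_qs(urlsplit(trtc_url).query)
    return {k: v[0] for k, v in query.items()}  # unwrap single-value lists

params = parse_trtc_url(
    "trtc://host/play?appId=1400695865&roomId=402183450&userSig=eJw8..."
)
```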


5.4 WebRTC Protocol

CONFIG = {
    "protocol": "webrtc",
    "protocol_option": None,
}
After successful stream creation, a webrtc:// playback URL is returned, and the script automatically:
1. Starts an HTTP server on local port 8080;
2. Opens the webrtc_player.html playback page in the browser;
3. Uses the TCPlayerLite SDK for low-latency playback.


5.5 ProtocolOption Advanced Configuration

ProtocolOption is used for TRTC production environments or custom streaming scenarios.

Available fields

Field | Type | Description
TrtcUseExternalApp | bool | Whether to use an external TRTC AppId
TrtcAppId | str | TRTC AppId (required when an external AppId is used)
TrtcRoomId | int | TRTC numeric room ID
TrtcStrRoomId | str | TRTC string room ID (choose one of TrtcRoomId / TrtcStrRoomId)
TrtcAutoGenRoomIdType | int | Auto-generated room ID type: 0 = numeric (default), 1 = string
TrtcUserSig | str | TRTC user signature (required when an external AppId is used)
TrtcPrivateMapKey | str | TRTC permission ticket (enter "dummy" if advanced permissions are not enabled)
CssCustomPushUrl | str | Custom CSS push URL (any protocol may be used)

Example Scenario

TRTC Production Mode (using external AppId):
CONFIG = {
    "protocol": "trtc",
    "protocol_option": {
        "TrtcUseExternalApp": True,
        "TrtcAppId": "1400xxxxxx",
        "TrtcRoomId": 12345,
        "TrtcUserSig": "eJw8js0Kgk...",
        "TrtcPrivateMapKey": "dummy"
    },
}
Custom push URL (supported for both RTMP and WebRTC):
CONFIG = {
    "protocol": "rtmp",
    "protocol_option": {
        "CssCustomPushUrl": "rtmp://domain/appName/streamName?txSecret={0}&txTime={1}"
    },
}

5.6 Optional configuration

If the avatar supports a transparent background, you can also adjust the Demo parameters to see the transparent background effect, as shown below:
"ExtraInfo": {"AlphaChannelEnable":True} # Enable Alpha channel (if transparent background is required)

6. H5 Playback Page

6.1 TRTC Playback Page (trtc_player.html)

Technical Solution: TRTC Web SDK v5 (loaded via unpkg CDN)

Core Logic:
Parse appId, roomId, userId, and userSig from URL parameters;
Enter the TRTC room as an audience role (role: 'audience');
Listen for the REMOTE_VIDEO_AVAILABLE event and automatically pull the remote video stream;
Display the video with object-fit: contain so the full frame is visible.

Key code snippet:
// Enter the room (pull stream only as audience, no pushing)
await trtc.enterRoom({
  sdkAppId: config.appId,
  userId: config.userId,
  userSig: config.userSig,
  roomId: config.roomId,
  scene: 'live',
  role: 'audience'
});

// Listen for and play remote video
trtc.on(TRTC.EVENT.REMOTE_VIDEO_AVAILABLE, async (event) => {
  await trtc.startRemoteVideo({
    userId: event.userId,
    streamType: event.streamType,
    view: 'remote-video',
    option: { objectFit: 'contain' }
  });
});
Standalone Usage (without relying on Python scripts):
http://localhost:8080/trtc_player.html?appId=1400695865&roomId=402183450&userId=user_xxx&userSig=eJw8...&virtualManUserId=402183450_ivh_anchor


6.2 WebRTC Playback Page (webrtc_player.html)

Technical Solution: TCPlayerLite v2.4.5 (locally deployed TcPlayer-2.4.5.js)

Core Logic:
Read the WebRTC playback URL from the URL parameter ?url=webrtc://...
Initialize the player using TCPlayerLite, and it will automatically play in live streaming mode.

Key code snippet:
player = new TcPlayer('player-container', {
  "webrtc": webrtcUrl,
  "width": '100%',
  "height": '540',
  "autoplay": true,
  "live": true,
  "controls": "none",
  "webrtcConfig": {
    "streamType": "auto"
  },
  "listener": function (msg) {
    handlePlayerEvent(msg);
  }
});
Standalone Use:
http://localhost:8080/webrtc_player.html?url=webrtc://liveplay.ivh.qq.com/live/m11533590420520971383?min_delay_ms=100
Note:
The WebRTC playback page must be accessed via an HTTP server; opening the local file directly (via the file:// protocol) is not supported.
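What the script's built-in H5 server amounts to is a static file server for the script's directory. A minimal sketch is below; the demo uses port 8080, while port 0 here lets the OS pick a free port so the sketch runs anywhere:

```python
# Minimal static file server, equivalent in spirit to what the demo script
# starts for the H5 player pages (the script uses port 8080; 0 = any free port).
import threading
import urllib.request
from http.server import ThreadingHTTPServer, SimpleHTTPRequestHandler

server = ThreadingHTTPServer(("127.0.0.1", 0), SimpleHTTPRequestHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The player page would then be reachable at, e.g.:
player_url = f"http://localhost:{port}/webrtc_player.html"
```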


7. API Signature Mechanism

All API requests require signature authentication. Signing process:
1. Parameter sorting: sort all request parameters in lexicographical order;
2. String concatenation: join as key1=value1&key2=value2&...;
3. HMAC-SHA256: compute the signature using accesstoken as the key;
4. Base64 + URL encoding: encode the signature result.
def _generate_signature(self, parameters: Dict[str, str]) -> str:
    sorted_params = sorted(parameters.items())
    signing_content = '&'.join(f'{k}={v}' for k, v in sorted_params)
    h = hmac.new(
        self.accesstoken.encode('utf-8'),
        signing_content.encode('utf-8'),
        hashlib.sha256
    )
    hash_in_base64 = base64.b64encode(h.digest()).decode('utf-8')
    return quote(hash_in_base64)
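The same four steps can be exercised outside the class. The parameter names and the token below are placeholders, not real credentials:

```python
# Standalone version of the signing steps (placeholder values throughout).
import base64
import hashlib
import hmac
from urllib.parse import quote

def generate_signature(parameters: dict, accesstoken: str) -> str:
    # Steps 1-2: sort lexicographically, join as key=value&key=value
    signing_content = "&".join(f"{k}={v}" for k, v in sorted(parameters.items()))
    # Step 3: HMAC-SHA256 keyed with the accesstoken
    digest = hmac.new(accesstoken.encode("utf-8"),
                      signing_content.encode("utf-8"),
                      hashlib.sha256).digest()
    # Step 4: Base64, then URL-encode
    return quote(base64.b64encode(digest).decode("utf-8"))

sig = generate_signature({"appkey": "demo", "timestamp": "1700000000"}, "dummy_token")
```

Note that because the parameters are sorted before concatenation, the signature is independent of the order in which they were supplied.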


8. Session Lifecycle

┌────────────────────┐
│ Create session     │  create_session()
└─────────┬──────────┘
          ▼
┌────────────────────┐
│ Wait for readiness │  wait_for_session_ready() ← polls SessionStatus
└─────────┬──────────┘    (SessionStatus=3 → Preparing, SessionStatus=1 → Ready)
          ▼
┌────────────────────┐
│ Enable session     │  start_session()
└─────────┬──────────┘
          ▼
┌────────────────────┐
│ H5 player          │  start_h5_player() ← launched automatically for TRTC/WebRTC
└─────────┬──────────┘
          ▼
┌────────────────────┐
│ WebSocket channel  │  create_websocket_connection() (persistent connection)
└─────────┬──────────┘
          ▼
┌────────────────────┐
│ Interactive drive  │  send_text_drive() / send_audio_drive()  (loop)
└─────────┬──────────┘
          ▼
┌────────────────────┐
│ Close session      │  close_session()
└────────────────────┘
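The lifecycle can be read as a linear driver over the method names in the diagram. The sketch below is illustrative: `client` stands for any object exposing those methods (a recording stub here), and the real script's signatures differ (for example, create_session() returns a (bool, str) tuple):

```python
# Hedged sketch of the session lifecycle; the Recorder stub is illustrative.
def run_lifecycle(client, interact):
    client.create_session()
    client.wait_for_session_ready()       # polls SessionStatus until Ready (1)
    client.start_session()
    client.start_h5_player()              # TRTC/WebRTC only
    client.create_websocket_connection()
    try:
        interact(client)                  # send_text_drive()/send_audio_drive() loop
    finally:
        client.close_session()            # runs even if the interaction fails

class Recorder:
    """Records method-call order so the flow can be inspected."""
    def __init__(self):
        self.calls = []
    def __getattr__(self, name):
        return lambda *a, **kw: self.calls.append(name)

client = Recorder()
run_lifecycle(client, lambda c: c.send_text_drive())
```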


9. Driver Instructions

9.1 text-driven

By sending text via WebSocket, the Digital Human uses TTS to synthesize speech and synchronize lip-sync.
drive_cmd = {
    "Header": {},
    "Payload": {
        "ReqId": req_id,
        "SessionId": self.session_id,
        "Command": "SEND_TEXT",
        "Data": {
            "Text": "Hello, welcome to Tencent Cloud Digital Human Platform",
            "ChatCommand": "NotUseChat"
        }
    }
}
self.ws.send(json.dumps(drive_cmd, ensure_ascii=False))

9.2 audio-driven

Convert the audio file to PCM format (16kHz, mono, 16bit), packetize it, and send it via WebSocket.

Audio Conversion:
from pydub import AudioSegment

audio = AudioSegment.from_file(audio_file_path)
audio = audio.set_channels(1).set_frame_rate(16000).set_sample_width(2)
pcm_data = audio.raw_data

Packetization and sending policy:
5120 bytes per packet (160 ms of audio)
The first 6 packets are sent back-to-back (no interval)
Subsequent packets are sent at 120 ms intervals
A final end packet is sent with IsFinal: True
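The packetization policy can be sketched as below. The function names and the `send(chunk, is_final)` callback shape are ours for illustration; the constants come from the policy above (5120 bytes = 16000 Hz × 2 bytes × 0.16 s):

```python
# Sketch of the packetization policy (names and callback shape are illustrative).
import time

PACKET_SIZE = 5120        # bytes: 16000 Hz * 2 bytes/sample * 0.16 s = 160 ms
FAST_PACKETS = 6          # first packets are sent back-to-back
SEND_INTERVAL_S = 0.12    # 120 ms between subsequent packets

def iter_packets(pcm_data: bytes):
    """Split raw PCM into fixed-size packets (last one may be shorter)."""
    for offset in range(0, len(pcm_data), PACKET_SIZE):
        yield pcm_data[offset:offset + PACKET_SIZE]

def send_audio(pcm_data: bytes, send) -> None:
    """Send packets via send(chunk, is_final), pacing after the first 6."""
    for i, chunk in enumerate(iter_packets(pcm_data)):
        if i >= FAST_PACKETS:
            time.sleep(SEND_INTERVAL_S)
        send(chunk, False)
    send(b"", True)  # final end packet (IsFinal: True)
```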



10. FAQs

Q1: Audio conversion failure

Ensure that ffmpeg is installed and available in the PATH. The script will automatically locate the ffmpeg path:
_ffmpeg_path = shutil.which('ffmpeg')


Q2: TRTC playback page video is cropped

The H5 page is already configured with object-fit: contain. If the issue persists, make sure the browser is up to date.


Q3: WebRTC playback page fails to open

Ensure that the TcPlayer-2.4.5.js file is in the same directory as webrtc_player.html.
The page must be accessed via an HTTP server (the script starts one automatically); it cannot be opened directly via the file:// protocol.


Q4: Session creation timeout

After session creation, a "preparing" status (SessionStatus=3) is normal, since the model needs time to load. By default, the script polls for up to 120 seconds.
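A generic version of this readiness polling looks like the sketch below; the helper name and parameters are ours, with the 120-second default taken from the script's behavior:

```python
# Generic poll-until-ready helper (illustrative; the demo script's
# wait_for_session_ready() plays this role with a 120 s limit).
import time

def wait_until(check, timeout_s: float = 120.0, interval_s: float = 2.0) -> bool:
    """Call check() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval_s)
    return False
```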


Q5: Port 8080 is occupied

H5 Player uses port 8080 by default. If you need to change it, modify the start_h5_player() function's h5_port parameter:
h5_url = self.start_h5_player(h5_port=9090)


11. Reference Documents



