Tencent Cloud


Labeling and Categorization

Last updated: 2023-04-17 14:27:34


    The labeling and categorization feature performs structured analysis of audio and video content across several dimensions, including people, behavior, speech, text, objects, and scenes. It automatically generates high-accuracy audio/video labels, frame-specific labels, and audio/video categories.

    Feature: Description
    Audio/Video labeling: Suggests labels that can be added to an audio/video file. It currently supports over 3,000 labels, such as game, vehicle, musician, race car, pet, drum, bike, World of Warcraft, computer, and school, and covers categories such as people, event, scene, object, landscape, food, and animal.
    Frame-specific labeling: Automatically recognizes labels in video frames captured at a custom frame capturing interval and locates those labels in the video. Frame labels are divided into nine categories, such as people, landscape, artificial object, building, plant, animal, and food, covering various aspects of daily life.
    Audio/Video categorization: Suggests which category an audio/video file should belong to. There are currently over twenty categories, such as car, parenting, fashion and entertainment, game, military, technology, politics, animal, food, sports, travel, animation, dance, music, television, variety show, host, political news, international news, and social news.
    The labeling and categorization feature helps you efficiently manage media resources and can be used to give personalized audio/video recommendations.
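
    To make the role of the frame capturing interval concrete, the following sketch lists the timestamps at which frames would be sampled for a given video duration. It assumes that capturing starts at 0 seconds and repeats at a fixed interval; the function name is hypothetical, and the service's actual sampling behavior may differ.

```python
import math


def capture_timestamps(duration_seconds, interval_seconds):
    """Return the assumed capture timestamps (in seconds) for one video.

    Assumption: the first frame is taken at 0 s and one more frame is taken
    every `interval_seconds` until the end of the video is reached.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if duration_seconds <= 0:
        return []
    # Number of capture points that fall strictly before the end of the video.
    count = math.ceil(duration_seconds / interval_seconds)
    return [i * interval_seconds for i in range(count)]


print(capture_timestamps(10, 3))  # → [0, 3, 6, 9]
```

    Under this assumption, a shorter interval yields more analyzed frames and finer label positions in the video.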

    Use cases

    Scenario: Description
    Media resource management: Users can search for media resources on audio/video platforms by category and label, which improves search efficiency.
    Audio/Video creation: Audio/Video creators can quickly search for materials by category or label, which helps them create content more efficiently.
    Personalized audio/video recommendations: Businesses such as UGSV platforms, ecommerce platforms, and social media applications can push media content that closely matches users' needs. This helps increase clicks on media content and saves users time when filtering content.
    Radio and TV cataloging: The radio and TV industry can use the labeling and categorization feature of VOD to efficiently manage massive amounts of video content. Based on the recognized label and category information, videos can be quickly archived, labeled, and searched.
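
    As an illustration of label-based search, the sketch below builds an inverted index from labels to media IDs and looks up the media that carry all requested labels. The media IDs, the sample labels, and both function names are hypothetical; in practice the labels would come from the task results described later in this document.

```python
from collections import defaultdict


def build_label_index(media_labels):
    """Map each lowercased label to the set of media IDs that carry it."""
    index = defaultdict(set)
    for media_id, labels in media_labels.items():
        for label in labels:
            index[label.lower()].add(media_id)
    return index


def search(index, *labels):
    """Return the media IDs that carry every requested label."""
    if not labels:
        return set()
    # .get() avoids creating empty entries for unknown labels.
    matches = [index.get(label.lower(), set()) for label in labels]
    return set.intersection(*matches)


# Hypothetical sample data: media ID -> recognized labels.
media_labels = {
    "video-001": ["pet", "animal"],
    "video-002": ["race car", "vehicle"],
    "video-003": ["pet", "school"],
}
index = build_label_index(media_labels)
print(sorted(search(index, "pet")))            # → ['video-001', 'video-003']
print(sorted(search(index, "pet", "school")))  # → ['video-003']
```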


    Audio/video labeling, frame-specific labeling, and audio/video categorization correspond to the intelligent labeling, intelligent labeling by frame, and intelligent categorization features described in Video Content Analysis. To use them, follow these steps:

    1. Create an audio/video content analysis template and configure intelligent labeling (TagConfigure), intelligent labeling by frame (FrameTagConfigure), and intelligent categorization (ClassificationConfigure) as needed. For example, the following request parameters enable all labeling and categorization features and set the interval for frame-specific labeling to three seconds:
       "TagConfigure": {
           "Switch": "ON"
       },
       "FrameTagConfigure": {
           "Switch": "ON",
           "ScreenshotInterval": 3
       },
       "ClassificationConfigure": {
           "Switch": "ON"
       }

    Get the template ID from the response.
    Alternatively, to use all labeling and categorization features without creating your own template, use preset template ID 20 from List of Preset Parameter Templates.
    2. Initiate a labeling and categorization task with the template ID obtained in step 1 as instructed in Video Content Analysis.
    3. Get the task result as instructed in Video Content Analysis.
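
    The template parameters from step 1 can also be assembled programmatically. The following is a minimal sketch, not an official client: it only builds the three configuration objects shown above and serializes them to JSON. The function names are hypothetical, the "OFF" value is assumed to be the counterpart of "ON", and sending the request, starting the task, and reading the result are left to the APIs described in Video Content Analysis.

```python
import json

# Preset template that enables all labeling and categorization features,
# as stated in step 1.
PRESET_ALL_FEATURES_TEMPLATE_ID = 20


def build_template_params(frame_interval_seconds=3, tagging=True,
                          frame_tagging=True, classification=True):
    """Build the labeling and categorization part of a template request."""
    def switch(enabled):
        # Assumption: "OFF" is the value that disables a feature.
        return "ON" if enabled else "OFF"

    params = {
        "TagConfigure": {"Switch": switch(tagging)},
        "FrameTagConfigure": {"Switch": switch(frame_tagging)},
        "ClassificationConfigure": {"Switch": switch(classification)},
    }
    if frame_tagging:
        # The interval only matters when frame-specific labeling is enabled.
        params["FrameTagConfigure"]["ScreenshotInterval"] = frame_interval_seconds
    return params


def choose_template_id(custom_template_id=None):
    """Use a custom template ID if one was created, else the preset."""
    if custom_template_id is not None:
        return custom_template_id
    return PRESET_ALL_FEATURES_TEMPLATE_ID


params = build_template_params(frame_interval_seconds=3)
print(json.dumps(params, indent=4))
print(choose_template_id())  # → 20
```

    Keeping the parameter construction separate from the request itself makes it easy to check the body before it is sent, and to fall back to the preset template when no custom settings are needed.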
