Implementing Image Quality Enhancement with GN7vi Instances

Last updated: 2024-01-11 17:11:13

    Overview

    This document describes how to perform video encoding and decoding as well as AI image quality enhancement on GN7vi instances; the bundled toolkit is fully compatible with open-source FFmpeg.

    Directions

    Preparing the instance environment

    Create an instance:
    Instance: Select an instance type based on your GPU and CPU core requirements as instructed in Computing Instance.
    Image: Select an image from the available images in the table.
    Driver: Automatic installation of CUDA and cuDNN is optional; you can also install them manually after creating the instance:
    Log in to the created GPU instance as instructed in Logging in Using Standard Method (Recommended).
    Run the following command to install the GPU driver, CUDA, and cuDNN automatically.
    Note:
    The command below configures the environment with GPU driver 460.106.00, CUDA 11.2.2, and cuDNN 8.2.1.
    wget https://gpu-related-scripts-1251783334.cos.ap-guangzhou.myqcloud.com/gpu-auto-install/gpu_auto_install_220823.sh && \
    wget https://gpu-related-scripts-1251783334.cos.ap-guangzhou.myqcloud.com/gpu-auto-install/driver460_cuda11.2.2.txt && \
    sudo bash ./gpu_auto_install_220823.sh install --config_file=./driver460_cuda11.2.2.txt && \
    source /etc/bash.bashrc && source ${HOME}/.bashrc
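    After the script completes (and you log in again so the sourced environment takes effect), the installation can be sanity-checked with standard NVIDIA tools. The expected version numbers below come from the Note above, not from anything this snippet verifies on its own:

    ```shell
    # Post-install sanity check (expected values per the Note above:
    # driver 460.106.00, CUDA 11.2.x).
    nvidia-smi --query-gpu=driver_version --format=csv,noheader
    nvcc --version | grep -i release
    ```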

    File overview

    Run the following commands in sequence to view all files under tscsdk-center.
    cd /usr/local/qcloud/tscsdk-center
    ls -l
    The following information appears:
    
    
    The description is as follows:
    fflib_gpu: Dependency library for image quality processing.
    ffmpeg: FFmpeg program with the embedded image quality processing feature.
    tenmodel: AI models used in image quality processing.
    videos: Built-in sample videos.

    Getting started

    1. Run the following commands in sequence to set environment variables.
    cd /usr/local/qcloud/tscsdk-center
    export LD_LIBRARY_PATH=./fflib_gpu:$LD_LIBRARY_PATH
    2. Under the tscsdk-center directory, run the following commands to generate sample output videos with image quality processing applied.
    LD video processing: LD videos usually have a resolution of 720p or below. The command below uses tenfilter's standard super resolution model in balance mode, plus unsharp sharpening.
    ./ffmpeg -i ./videos/input1.mp4 -vf tenfilter=mag_filter=1:mag_sr=2:mag_sr_stre=balance,unsharp -c:v libten264 -ten264opts crf=26:vbv-maxrate=2000 -y output1.mp4
    HD video processing: HD videos usually have a resolution above 720p. The command below uses tenfilter's high-quality super resolution model.
    ./ffmpeg -i ./videos/input2.mp4 -vf tenfilter=mag_srgan=1 -c:v libten264 -ten264opts crf=26:vbv-maxrate=2000 -y output2.mp4
    Fast processing: The commands below combine tenfilter's compression artifact removal, the standard super resolution model in normal mode, and unsharp sharpening.
    ./ffmpeg -i ./videos/input1.mp4 -vf tenfilter=af=auto,tenfilter=mag_filter=1:mag_sr=2:mag_sr_stre=normal,unsharp -c:v libten264 -ten264opts crf=26:vbv-maxrate=2000 -y fast_output1.mp4
    ./ffmpeg -i ./videos/input2.mp4 -vf tenfilter=af=auto,tenfilter=mag_filter=1:mag_sr=2:mag_sr_stre=normal,unsharp -c:v libten264 -ten264opts crf=26:vbv-maxrate=2000 -y fast_output2.mp4
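    For many inputs, the fast-processing command above can be wrapped in a small loop. The sketch below is a hypothetical batch script, assuming it runs from /usr/local/qcloud/tscsdk-center with LD_LIBRARY_PATH exported as shown in step 1:

    ```shell
    #!/bin/sh
    # Hypothetical batch sketch: apply the fast-processing filter chain to every
    # .mp4 under ./videos, writing fast_<name>.mp4 into the current directory.
    FILTERS="tenfilter=af=auto,tenfilter=mag_filter=1:mag_sr=2:mag_sr_stre=normal,unsharp"
    for in_file in ./videos/*.mp4; do
      out_file="fast_$(basename "$in_file")"   # e.g. ./videos/input1.mp4 -> fast_input1.mp4
      ./ffmpeg -i "$in_file" -vf "$FILTERS" -c:v libten264 \
        -ten264opts crf=26:vbv-maxrate=2000 -y "$out_file"
    done
    ```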
    Note:
    The running speed of a model depends on the input resolution: the higher the resolution, the slower the processing. The first run of a given AI model also includes a lengthy one-off initialization; subsequent runs are significantly faster, so evaluate running speed using the later runs.
    Some parameters in the ffmpeg command are as described below:
    -i videos/input1.mp4: Specifies the input video file.
    -vf tenfilter=mag_srgan=1: Specifies the video processing filter graph. For parameter details, see List of features of the AI model for video processing.
    -c:v libten264: Specifies Tencent's proprietary Ten264 or Ten265 as the video encoder.
    -ten264opts crf=26:vbv-maxrate=2000: Sets video encoder parameters. For parameter details, see List of features of the video encoder.
    -y output.mp4: Specifies the output video file and automatically overwrites an existing file of the same name.
    3. Wait for the program to finish, then download the output video files (we recommend using Xshell or MobaXterm for the transfer). Screenshots of the four output videos:
    output1.mp4 (screenshot taken at 01:15)
    fast_output1.mp4 (screenshot taken at 01:15)
    output2.mp4 (screenshot taken at 00:10)
    fast_output2.mp4 (screenshot taken at 00:10)

    Feature List

    The tscsdk-center toolkit consists of two parts: AI models for video processing and Tencent's proprietary video encoders. The AI models are integrated through the FFmpeg filter mechanism, so filters can embed AI inference into the video decoding, processing, and encoding pipeline, improving hardware utilization and throughput. On top of the image quality enhancement, Tencent's proprietary video encoders deliver a higher compression rate.

    List of features of the AI model for video processing

    The AI models for video processing are integrated into a filter named "tenfilter", which is invoked and configured through "-vf tenfilter=name1=value1:name2=value2". Each tenfilter instance enables one AI model; multiple tenfilter instances can be chained and combined freely.
    All AI models are as described below:
    General parameters
    mdir: Path to the model configuration file. Defaults to `./tenmodel/tve-conf.json`.
    gpu: Index of the GPU that the tenfilter runs on.
    Sample: tenfilter=mdir=./tenmodel/tve-conf.json:gpu=1
    Compression artifact removal
    af: Strength of compression artifact removal. Currently only `auto` is supported.
    Sample: tenfilter=af=auto
    Face protection
    face_protect_enable: Set to `1` to enable the face protection logic.
    face_af_ratio: Denoising weakening coefficient for the face area.
    face_sp_ratio: Sharpening coefficient for the face area.
    Sample: tenfilter=face_protect_enable=1:face_af_ratio=0.5:face_sp_ratio=0.5
    Video frame interpolation
    mag_fps: Set to `1` to enable video frame interpolation.
    fps: Target frame rate.
    Sample: tenfilter=mag_fps=1:fps=60
    Color enhancement
    mag_filter: Must be set to `1`.
    cebb: Set to `1` to enable color enhancement.
    Sample: tenfilter=mag_filter=1:cebb=1
    Standard super resolution
    mag_filter: Must be set to `1`.
    mag_sr: Super resolution factor. Currently only 2x super resolution is supported.
    mag_sr_stre: Super resolution mode; either `normal` or `balance`.
    Sample: tenfilter=mag_filter=1:mag_sr=2:mag_sr_stre=normal
    High-quality super resolution
    mag_srgan: Set to `1` to enable high-quality super resolution.
    Sample: tenfilter=mag_srgan=1
    Video noise removal
    mag_filter: Must be set to `1`.
    dn: Noise removal strength. Currently only `3` is supported.
    Sample: tenfilter=mag_filter=1:dn=3
    Image quality enhancement / face enhancement / font enhancement (multiple models supported)
    mag_filter: Must be set to `1`.
    eh: Set to `1` to enable image quality enhancement.
    faceeh: Set to `1` to enable face enhancement.
    fonteh: Set to `1` to enable font enhancement.
    prior: Execution priority of the AI models, for example "faceeh-eh-fonteh"; the corresponding models must be enabled. Appending `-parally` enables parallel optimization.
    Single model sample: tenfilter=mag_filter=1:eh=1
    Multiple model sample: tenfilter=mag_filter=1:eh=1:faceeh=1:prior=faceeh-eh-parally
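    As a sketch of combining multiple tenfilter instances, the hypothetical command below chains frame interpolation to 60 fps with standard 2x super resolution in balance mode; each tenfilter in the chain enables one model:

    ```shell
    # Hypothetical combination: interpolate to 60 fps, then apply 2x standard
    # super resolution in balance mode; one model per tenfilter instance.
    CHAIN="tenfilter=mag_fps=1:fps=60,tenfilter=mag_filter=1:mag_sr=2:mag_sr_stre=balance"
    ./ffmpeg -i ./videos/input1.mp4 -vf "$CHAIN" \
      -c:v libten264 -ten264opts crf=26:vbv-maxrate=2000 -y combined_output.mp4
    ```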

    Tencent's proprietary video encoder

    tscsdk-center contains the Ten264 and Ten265 video encoders independently developed by Tencent. The encoder type and its parameters are set through command-line options during video processing. Each encoder is specified and configured as follows:
    Ten264
    Specify with: -vcodec libten264 or -c:v libten264
    Set parameters with: -ten264opts name1=value1:name2=value2
    Ten265
    Specify with: -vcodec libten265 or -c:v libten265
    Set parameters with: -ten265-params name1=value1:name2=value2
    The parameters of each encoder are as detailed below:
    Ten264 encoder
    preset: Encoder parameter preset. 0: ultrafast; 1: superfast; 2: very fast; 3: faster; 4: fast; 5: medium; 6: slow; 7: slower; 8: very slow; 9: placebo.
    bitrate: Bitrate of the output video in ABR mode.
    crf: CRF value in CRF mode.
    aq-mode: 0: disable aqmode; 1: enable aqmode; 2: variance-based aqmode; 3: variance-based aqmode biased towards dark scenes. The default is 2, which produces better SSIM results.
    vbv-maxrate: Maximum VBV bitrate. Defaults to the configured bitrate.
    vbv-bufsize: VBV buffer size. Defaults to four times the configured bitrate.
    rc-lookahead: Length of the lookahead.
    scenecut: Whether to enable scene cut detection. Enabled by default; we generally recommend keeping it enabled.
    keyint: Maximum keyframe interval. Defaults to 256 and can be configured as needed; generally, set it to the number of frames corresponding to 2-5 s of video.
    threads: Number of threads in the thread pool.
    lookahead-threads: Number of threads used for lookahead.
    profile: One of "baseline", "main", "high", "high422", or "high444".
    
    
    Ten265 encoder
    preset: Encoder parameter preset. -1: ripping; 0: placebo; 1: very slow; 2: slower; 3: slow; 4: universal; 5: medium; 6: fast; 7: faster; 8: very fast; 9: superfast.
    rc: Rate control method. 0: CQP; 1: ABR_VBV; 2: ABR; 3: CRF_VBV; 4: CRF.
    bitrate: Bitrate of the output video in ABR mode.
    crf: CRF value in CRF mode. Value range: [1,51].
    aq-mode: 0: disable aqmode; 1: enable aqmode; 2: variance-based aqmode; 3: variance-based aqmode biased towards dark scenes. The default is 2, which produces better SSIM results.
    vbv-maxrate: Maximum VBV bitrate. Defaults to the configured bitrate.
    vbv-bufsize: VBV buffer size. Defaults to four times the configured bitrate.
    rc-lookahead: Length of the lookahead.
    scenecut: Scene cut threshold. Value range: [0,100]; 0 disables scene cut detection. Enabled by default; we generally recommend keeping it enabled.
    open-gop: Whether to enable open GOP. 0: disabled; 1: enabled. Enabled by default; to support random access in live streaming scenarios, we recommend disabling it.
    keyint: Maximum keyframe interval. Defaults to 256 and can be configured as needed; it must be a multiple of 8 greater than 50.
    ltr: Whether to support long-term reference frames. 0: disabled; 1: enabled. Enabled by default; if the hardware device playing back the HEVC video is underpowered, we recommend disabling it.
    pool-threads: Number of threads in the thread pool used by WPP. Defaults to the number of CPU cores; lower it to reduce CPU usage.
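    Putting several of the Ten265 parameters together, here is a hypothetical invocation using CRF_VBV rate control, open GOP disabled for live-streaming random access, and keyint=64 (a multiple of 8 greater than 50, roughly 2 s at 30 fps):

    ```shell
    # Hypothetical Ten265 example combining the parameters described above.
    ./ffmpeg -i ./videos/input2.mp4 -vf tenfilter=mag_srgan=1 \
      -c:v libten265 \
      -ten265-params rc=3:crf=26:vbv-maxrate=2000:open-gop=0:keyint=64 \
      -y output2_265.mp4
    ```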
    

    Usage Recommendations

    tscsdk-center allows flexible control of each AI model. For special requirements or scenarios, you can toggle each model individually and combine different models to achieve better video processing results.
    tscsdk-center provides two super resolution models. The standard model suits older, low-resolution sources, while the high-quality model suits HD sources. We recommend evaluating both models against your source type.
    In tscsdk-center, the AI models run on the GPU, while the video encoders run only on the CPU. In most cases, some CPU capacity remains idle even when the GPU is fully utilized, so you can assign CPU-only video processing tasks, such as video transcoding, to make full use of the hardware.
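    A minimal sketch of this recommendation, assuming the bundled ffmpeg also includes a CPU-only encoder such as libx264 (any CPU-only transcode task can stand in for the second job):

    ```shell
    # Run a GPU-bound enhancement job and a CPU-only transcode concurrently
    # so both resources stay busy; this pairing of tasks is hypothetical.
    ./ffmpeg -i ./videos/input1.mp4 -vf tenfilter=mag_srgan=1 \
      -c:v libten264 -ten264opts crf=26:vbv-maxrate=2000 -y gpu_job.mp4 &
    ./ffmpeg -i ./videos/input2.mp4 -c:v libx264 -crf 23 -y cpu_job.mp4 &
    wait  # return once both background jobs have finished
    ```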