Submitting Speech Recognition Job

Last updated: 2022-08-18 15:05:04

    Feature Description

    This API (CreateSpeechJobs) is used to submit a speech recognition job.

    Request

    Sample request

    POST /asr_jobs HTTP/1.1
    Host: <BucketName-APPID>.ci.<Region>.myqcloud.com
    Date: <GMT Date>
    Authorization: <Auth String>
    Content-Length: <length>
    Content-Type: application/xml
    <body>
    
    Note:

    • Authorization: Auth String (for more information, see Request Signature).
    • To use this feature with a sub-account, first grant the sub-account the relevant permissions. For more information, see Authorization Granularity.
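
As a rough illustration of the sample request above, the following Python sketch assembles the request URL and the non-signature headers. The function names, the `https` scheme, and the bucket and region values are assumptions made for this example; computing the `Authorization` value is covered by Request Signature and is not shown here.

```python
def asr_jobs_url(bucket_appid: str, region: str, scheme: str = "https") -> str:
    """Build the URL for POST /asr_jobs from the Host pattern in the sample request.

    The scheme is an assumption; the sample request only specifies the Host header.
    """
    if not bucket_appid or not region:
        raise ValueError("bucket_appid and region must be non-empty")
    host = f"{bucket_appid}.ci.{region}.myqcloud.com"
    return f"{scheme}://{host}/asr_jobs"


def base_headers(body: bytes) -> dict:
    """Return the headers that depend only on the body.

    Date and Authorization are deliberately left out: they come from the
    signing step, which this sketch does not implement.
    """
    return {
        "Content-Type": "application/xml",
        "Content-Length": str(len(body)),
    }


# Placeholder values, for illustration only.
url = asr_jobs_url("examplebucket-1250000000", "ap-guangzhou")
headers = base_headers(b"<Request></Request>")
print(url)
print(headers)
```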

    Request headers

    This API only uses common request headers. For more information, see Common Request Headers.

    Request body

    This request requires the following request body:

    <Request>
    <Tag>SpeechRecognition</Tag>
    <Input>
      <Object></Object>
    </Input>
    <Operation>
      <SpeechRecognition></SpeechRecognition>
      <Output>
        <Region></Region>
        <Bucket></Bucket>
        <Object></Object>
      </Output>
    </Operation>
    <QueueId></QueueId>
    </Request>
    

    The nodes are as described below:

    Node Name (Keyword) Parent Node Description Type Required
    Request None Request container Container Yes

    Request has the following sub-nodes:

    Node Name (Keyword) Parent Node Description Type Required
    Tag Request Job type, which currently can only be `SpeechRecognition`. String Yes
    Input Request Speech file to be processed Container Yes
    Operation Request Operation rule Container Yes
    QueueId Request ID of the queue that the job belongs to String Yes

    Input has the following sub-nodes:

    Node Name (Keyword) Parent Node Description Type Required
    Object Request.Input Speech file key in COS. `Bucket` is specified by `Host`. String Yes

    Operation has the following sub-nodes:

    Node Name (Keyword) Parent Node Description Type Required
    SpeechRecognition Request.Operation Job type parameter, which takes effect only if Tag is SpeechRecognition. Container No
    Output Request.Operation Result output address Container Yes

    SpeechRecognition has the following sub-nodes:

    Node Name (Keyword) Parent Node Description Type Required
    EngineModelType Request.Operation.SpeechRecognition Engine model type. Phone call scenarios: 8k_zh (8 kHz, for Mandarin in general scenarios; available for dual-channel audio); 8k_zh_s (8 kHz, for Mandarin with speaker separation; available for mono-channel audio only). Non-phone call scenarios: 16k_zh (16 kHz, for Mandarin in general scenarios); 16k_zh_video (16 kHz, for Mandarin in audio/video scenarios); 16k_en (16 kHz, for English); 16k_ca (16 kHz, for Cantonese). String Yes
    ChannelNum Request.Operation.SpeechRecognition Number of speech sound channels. 1: mono; 2: dual (for the 8k_zh engine only). Integer Yes
    ResTextFormat Request.Operation.SpeechRecognition Format of the returned recognition result. 0: recognition result text, including the list of segment timestamps; 1: recognition result details, including the list of word timestamps (generally used to generate subtitles and for the 16k Mandarin engine only). Integer Yes
    FilterDirty Request.Operation.SpeechRecognition Whether to filter restricted words (for the Mandarin engine only). 0 (default value): does not filter; 1: filters; 2: replaces restricted words with "*". Integer No
    FilterModal Request.Operation.SpeechRecognition Whether to filter interjections (for the Mandarin engine only). 0 (default value): does not filter; 1: filters; 2: filters strictly. Integer No
    ConvertNumMode Request.Operation.SpeechRecognition Whether to intelligently convert Chinese numbers to Arabic numerals (for the Mandarin engine only). 0: directly outputs Chinese numbers; 1 (default value): intelligently converts based on the scenario. Integer No
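
The constraints in the SpeechRecognition table can be checked on the client before a job is submitted. The sketch below is one possible way to do so in Python; the function name and the shape of its return value are assumptions for illustration and are not part of the API. It only encodes the rules stated in the table (valid engine names, required nodes, value ranges, and dual-channel audio being limited to `8k_zh`); the service may enforce additional rules.

```python
# Engine names listed in the EngineModelType row.
ENGINE_MODEL_TYPES = {"8k_zh", "8k_zh_s", "16k_zh", "16k_zh_video", "16k_en", "16k_ca"}

# Optional nodes and the values the table lists for them.
OPTIONAL_RANGES = {
    "FilterDirty": (0, 1, 2),
    "FilterModal": (0, 1, 2),
    "ConvertNumMode": (0, 1),
}


def validate_speech_recognition(params: dict) -> list:
    """Return a list of problems; an empty list means no rule from the table is violated."""
    problems = []

    engine = params.get("EngineModelType")
    if engine not in ENGINE_MODEL_TYPES:
        problems.append(f"EngineModelType must be one of {sorted(ENGINE_MODEL_TYPES)}")

    channel = params.get("ChannelNum")
    if channel not in (1, 2):
        problems.append("ChannelNum must be 1 (mono) or 2 (dual)")
    elif channel == 2 and engine != "8k_zh":
        problems.append("ChannelNum 2 is available for the 8k_zh engine only")

    if params.get("ResTextFormat") not in (0, 1):
        problems.append("ResTextFormat must be 0 or 1")

    for name, allowed in OPTIONAL_RANGES.items():
        if name in params and params[name] not in allowed:
            problems.append(f"{name} must be one of {allowed}")

    return problems


print(validate_speech_recognition(
    {"EngineModelType": "16k_en", "ChannelNum": 2, "ResTextFormat": 0}
))
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.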

    Output has the following sub-nodes:

    Node Name (Keyword) Parent Node Description Type Required
    Region Request.Operation.Output Bucket region String Yes
    Bucket Request.Operation.Output Result storage bucket String Yes
    Object Request.Operation.Output Result filename String Yes
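
To make the node hierarchy above concrete, here is a minimal Python sketch that assembles the request body with the standard library's `xml.etree.ElementTree`. The function name, its parameters, and all sample values (object keys, queue ID, bucket, region) are placeholders chosen for this example, not values defined by the API.

```python
import xml.etree.ElementTree as ET


def build_request_body(input_object, queue_id, speech_params,
                       out_region, out_bucket, out_object) -> bytes:
    """Serialize a <Request> document following the node tables above."""
    root = ET.Element("Request")
    ET.SubElement(root, "Tag").text = "SpeechRecognition"

    inp = ET.SubElement(root, "Input")
    ET.SubElement(inp, "Object").text = input_object

    operation = ET.SubElement(root, "Operation")
    recognition = ET.SubElement(operation, "SpeechRecognition")
    for name, value in speech_params.items():
        # Integer parameters are written as their decimal text form.
        ET.SubElement(recognition, name).text = str(value)

    output = ET.SubElement(operation, "Output")
    ET.SubElement(output, "Region").text = out_region
    ET.SubElement(output, "Bucket").text = out_bucket
    ET.SubElement(output, "Object").text = out_object

    ET.SubElement(root, "QueueId").text = queue_id
    # With an explicit encoding, tostring() returns bytes.
    return ET.tostring(root, encoding="utf-8")


# Placeholder values, for illustration only.
body = build_request_body(
    input_object="audio/sample.mp3",
    queue_id="queue-id-placeholder",
    speech_params={"EngineModelType": "16k_zh", "ChannelNum": 1, "ResTextFormat": 0},
    out_region="ap-guangzhou",
    out_bucket="examplebucket-1250000000",
    out_object="asr/result.txt",
)
print(body.decode("utf-8"))
```

The resulting bytes can be sent as the body of the `POST /asr_jobs` request once the request has been signed.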

    Response

    Response headers

    This API only returns common response headers. For more information, see Common Response Headers.

    Response body

    The response body is returned as application/xml. The following sample contains all the nodes:

    <Response>
    <JobsDetail>
      <Code></Code>
      <Message></Message>
      <JobId></JobId>
      <State></State>
      <CreationTime></CreationTime>
      <QueueId></QueueId>
      <Tag></Tag>
      <Input>
        <Object></Object>
      </Input>
      <Operation>
        <SpeechRecognition></SpeechRecognition>
        <Output>
          <Region></Region>
          <Bucket></Bucket>
          <Object></Object>
        </Output>
        <MediaInfo>
        </MediaInfo>
      </Operation>
    </JobsDetail>
    </Response>
    

    The nodes are as described below:

    Node Name (Keyword) Parent Node Description Type
    Response None Response container Container

    Response has the following sub-nodes:

    Node Name (Keyword) Parent Node Description Type
    JobsDetail Response Job details Container

    JobsDetail has the following sub-nodes:

    Node Name (Keyword) Parent Node Description Type
    Code Response.JobsDetail Error code, which is meaningful only if State is Failed String
    Message Response.JobsDetail Error description, which is meaningful only if State is Failed String
    JobId Response.JobsDetail Job ID String
    Tag Response.JobsDetail Job type: SpeechRecognition String
    State Response.JobsDetail Job status. Valid values: Submitted, Running, Success, Failed, Pause, Cancel String
    CreationTime Response.JobsDetail Job creation time String
    QueueId Response.JobsDetail ID of the queue which the job is in String
    Input Response.JobsDetail Input resource address of the job Container
    Operation Response.JobsDetail Operation rule Container
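
The JobsDetail fields above can be read from a response as sketched below in Python. The function name is an assumption for this example, and the sample document uses made-up placeholder values rather than real service output. Following the table, `Code` and `Message` are read only when `State` is `Failed`.

```python
import xml.etree.ElementTree as ET

# Made-up response for illustration; real values will differ.
SAMPLE_RESPONSE = """
<Response>
  <JobsDetail>
    <Code>ExampleErrorCode</Code>
    <Message>example error description</Message>
    <JobId>job-id-placeholder</JobId>
    <State>Failed</State>
    <CreationTime>creation-time-placeholder</CreationTime>
    <QueueId>queue-id-placeholder</QueueId>
    <Tag>SpeechRecognition</Tag>
  </JobsDetail>
</Response>
"""


def parse_jobs_detail(xml_text: str) -> dict:
    """Extract the scalar JobsDetail fields from a response body."""
    root = ET.fromstring(xml_text.strip())
    if root.tag != "Response":
        raise ValueError(f"unexpected root node: {root.tag}")
    detail = root.find("JobsDetail")
    if detail is None:
        raise ValueError("Response has no JobsDetail node")

    fields = ("JobId", "Tag", "State", "CreationTime", "QueueId")
    job = {name: detail.findtext(name) for name in fields}

    # Code and Message are meaningful only if State is Failed.
    if job["State"] == "Failed":
        job["Code"] = detail.findtext("Code")
        job["Message"] = detail.findtext("Message")
    return job


job = parse_jobs_detail(SAMPLE_RESPONSE)
print(job["JobId"], job["State"], job.get("Code"))
```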

    Input has the following sub-nodes:
    Same as the Request.Input node in the request.

    Operation has the following sub-nodes:

    Node Name (Keyword) Parent Node Description Type
    TemplateId Response.JobsDetail.Operation Job template ID String
    Output Response.JobsDetail.Operation File output address Container
    MediaInfo Response.JobsDetail.Operation Transcoding output video information. This node will not be returned if there is no output video. Container

    Output has the following sub-nodes:
    Same as the Request.Operation.Output node in the request.

    SpeechRecognition has the following sub-nodes:
    Same as the Request.Operation.SpeechRecognition node in the request.

    Error codes

    There are no special error messages for this request. For common error messages, see Error Codes.
