How to classify misrecognition, missed recognition and misinsertion errors in speech recognition?

Misrecognition, missed recognition, and misinsertion errors in speech recognition are classified by how the recognized output (the hypothesis) deviates from what was actually spoken (the reference). Each type is defined below, with an example for clarity:

  1. Misrecognition (Substitution Error)

    • Definition: The system replaces a correct word or sound with an incorrect one.
    • Example: If the spoken phrase is "I want to book a flight" but the system outputs "I want to look a flight," the word "book" is misrecognized as "look."
  2. Missed Recognition (Omission Error, also called Deletion Error)

    • Definition: The system fails to detect and transcribe a word or sound that was actually spoken.
    • Example: If the speaker says "The meeting is at 3 PM" but the output is "The meeting is at PM," the number "3" is missed.
  3. Misinsertion (Insertion Error)

    • Definition: The system adds an extra word or sound that was not present in the original speech.
    • Example: If the spoken input is "She likes coffee" but the output is "She likes to coffee," the word "to" is misinserted.
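
In practice, the three error types are usually identified by aligning the recognized output against a reference transcript with a word-level edit distance. The following is a minimal sketch of that idea, not a production scorer. The function name `classify_errors` is hypothetical, words are split on whitespace, and comparison is case-sensitive; these are simplifying assumptions. The labels "substitution", "deletion", and "insertion" correspond to misrecognition, missed recognition, and misinsertion respectively.

```python
def classify_errors(reference: str, hypothesis: str):
    """Align two transcripts word by word and label each error.

    Returns a list of (error_type, reference_word, hypothesis_word) tuples.
    Assumption: whitespace tokenization and exact, case-sensitive matching.
    """
    ref = reference.split()
    hyp = hypothesis.split()

    # cost[i][j] = minimum number of edits to turn ref[:i] into hyp[:j]
    cost = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        cost[i][0] = i  # every reference word missed
    for j in range(1, len(hyp) + 1):
        cost[0][j] = j  # every hypothesis word inserted
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            mismatch = int(ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # match or substitution
                cost[i - 1][j] + 1,             # deletion (missed word)
                cost[i][j - 1] + 1,             # insertion (extra word)
            )

    # Walk back from the bottom-right corner to recover the edit operations.
    errors = []
    i, j = len(ref), len(hyp)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            mismatch = int(ref[i - 1] != hyp[j - 1])
            if cost[i][j] == cost[i - 1][j - 1] + mismatch:
                if mismatch:
                    errors.append(("substitution", ref[i - 1], hyp[j - 1]))
                i -= 1
                j -= 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            errors.append(("deletion", ref[i - 1], None))
            i -= 1
        else:
            errors.append(("insertion", None, hyp[j - 1]))
            j -= 1

    errors.reverse()  # report errors in spoken order
    return errors


print(classify_errors("I want to book a flight", "I want to look a flight"))
print(classify_errors("The meeting is at 3 PM", "The meeting is at PM"))
print(classify_errors("She likes coffee", "She likes to coffee"))
```

When several alignments have the same minimum cost, this sketch prefers a match or substitution over a deletion, and a deletion over an insertion. Real scoring tools may break such ties differently and typically normalize case and punctuation before aligning.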

In speech recognition systems, these errors are often evaluated using metrics like Word Error Rate (WER), which adds up the counts of all three error types and divides by the number of words in the reference transcript: WER = (S + D + I) / N, where S, D, and I are the numbers of substitutions, deletions (omissions), and insertions, and N is the reference word count. For building robust speech recognition solutions, Tencent Cloud ASR (Automatic Speech Recognition) provides high-accuracy transcription services with noise reduction and language optimization to minimize such errors. It supports real-time and batch processing for various industries, including call centers, media, and smart devices.
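
As a small illustration of the WER metric mentioned above, the sketch below computes it from error counts that have already been obtained. The function name `word_error_rate` and its parameters are hypothetical, chosen only for this example.

```python
def word_error_rate(substitutions: int, deletions: int,
                    insertions: int, reference_length: int) -> float:
    """Compute WER = (S + D + I) / N from precomputed error counts.

    reference_length is the number of words in the reference transcript (N).
    """
    if reference_length <= 0:
        raise ValueError("reference_length must be a positive word count")
    return (substitutions + deletions + insertions) / reference_length


# The three examples from this article, each containing a single error:
# "I want to book a flight" has 6 reference words and 1 substitution.
print(word_error_rate(1, 0, 0, 6))
# "The meeting is at 3 PM" has 6 reference words and 1 omission.
print(word_error_rate(0, 1, 0, 6))
# "She likes coffee" has 3 reference words and 1 insertion.
print(word_error_rate(0, 0, 1, 3))
```

Because insertions are counted in the numerator but not in the reference length, WER can exceed 1.0 when the output contains many extra words.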