Colon and hyphen symbols are classified as pronunciation instead of punctuation in Amazon Transcribe


Hi,

When using Amazon Transcribe, we found that colon and hyphen symbols are reported with type "pronunciation" instead of "punctuation" in the items of the transcription result. For example:

"items": [
    {
        "start_time": "xxxx.xx",
        "end_time": "xxxx.xx",
        "alternatives": [
            {
                "confidence": "1.0",
                "content": ":"
            }
        ],
        "type": "pronunciation"
    },
    ...
]

or

"items": [
    {
        "start_time": "xxxx.xx",
        "end_time": "xxxx.xx",
        "alternatives": [
            {
                "confidence": "1.0",
                "content": "-"
            }
        ],
        "type": "pronunciation"
    },
    ...
]

My understanding is that the colon and hyphen should be classified as punctuation, like the comma and exclamation mark. That is not the case in Amazon Transcribe results, and the issue is reproducible on our end. Incidentally, in our case the number "2" was mistranscribed as ":". The mistranscription is not our main concern, but it may be related to the root cause of the misclassification.

My question is: are the two symbols (colon and hyphen) expected to be classified as pronunciation? If not, is there a plan to fix it?

asked 2 years ago, 295 views
1 Answer

What did the speaker actually say that resulted in the transcriptions you describe? I can think of a few common scenarios where ":" and "-" should properly be considered "pronunciation" and not "punctuation".

For example, if a speaker said, "the ratio of oil to vinegar in a vinaigrette is three-to-one," a transcription such as "the ratio [...] is 3:1" is appropriate. This would also explain seeing "two" incorrectly transcribed as ":" in your example above, since "two" and "to" sound the same.

The hyphen is a similar case. The "-" symbol is a hyphen (punctuation) in some contexts but a minus sign (pronunciation) in others. If a speaker said, "the temperature is minus five degrees," then a transcription such as "the temperature is -5 degrees" could be appropriate.
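To check which of these cases applies in a given transcript, it may help to list every symbol-only item typed as "pronunciation" together with its neighbours. The following is a minimal sketch, not an official AWS utility. It assumes the transcript JSON has the results.items layout shown in the question; the function names, the number-adjacency heuristic, and the sample data are all hypothetical and for illustration only.

```python
def symbol_pronunciations(transcript):
    """Return (index, symbol, previous, next) for each item typed
    "pronunciation" whose content has no letters or digits."""
    items = transcript["results"]["items"]
    contents = [item["alternatives"][0]["content"] for item in items]
    found = []
    for i, item in enumerate(items):
        content = contents[i]
        is_symbol = bool(content) and not any(ch.isalnum() for ch in content)
        if item["type"] == "pronunciation" and is_symbol:
            prev_content = contents[i - 1] if i > 0 else None
            next_content = contents[i + 1] if i + 1 < len(items) else None
            found.append((i, content, prev_content, next_content))
    return found


def _is_number(text):
    # Hypothetical helper: True for plain digit strings such as "3" or "2.5".
    return text is not None and text.replace(".", "", 1).isdigit()


def looks_spoken(symbol, prev_content, next_content):
    """Heuristic (an assumption, not Transcribe's logic): a ":" between two
    numbers reads as a spoken ratio, a "-" before a number as a spoken minus."""
    if symbol == ":":
        return _is_number(prev_content) and _is_number(next_content)
    if symbol == "-":
        return _is_number(next_content)
    return False


# Made-up sample in the shape of a Transcribe result (timestamps omitted).
sample = {
    "results": {
        "items": [
            {"alternatives": [{"confidence": "1.0", "content": "3"}], "type": "pronunciation"},
            {"alternatives": [{"confidence": "1.0", "content": ":"}], "type": "pronunciation"},
            {"alternatives": [{"confidence": "1.0", "content": "1"}], "type": "pronunciation"},
            {"alternatives": [{"confidence": "0.0", "content": "."}], "type": "punctuation"},
        ]
    }
}

for index, symbol, prev_content, next_content in symbol_pronunciations(sample):
    print(index, symbol, looks_spoken(symbol, prev_content, next_content))
```

Items flagged this way that do not sit next to numbers are the ones worth comparing against the audio, since they are less likely to be a spoken ratio or minus sign.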

Homophones - words (or symbols) that are pronounced the same but are spelled differently - can be a challenge for speech-to-text transcription, especially if the spoken text uses domain-specific words and phrases. This is where Amazon Transcribe features like Custom Vocabularies and Custom Language Models may be worth looking into.

Kris (AWS)
answered 2 years ago
