Colon and hyphen symbols are classified as pronunciation instead of punctuation in Amazon Transcribe

Hi,

When using Amazon Transcribe, we found that colon and hyphen symbols are given the item type "pronunciation" instead of "punctuation" in the Amazon Transcribe results. For example:

"items": [
    {
        "start_time": "xxxx.xx",
        "end_time": "xxxx.xx",
        "alternatives": [
            {
                "confidence": "1.0",
                "content": ":"
            }
        ],
        "type": "pronunciation"
    },
    ...
]

or

"items": [
    {
        "start_time": "xxxx.xx",
        "end_time": "xxxx.xx",
        "alternatives": [
            {
                "confidence": "1.0",
                "content": "-"
            }
        ],
        "type": "pronunciation"
    },
    ...
]

My understanding is that colons and hyphens should be classified as punctuation, like commas and exclamation marks. That is not the case in our Amazon Transcribe results, and the issue is reproducible on our end. By the way, in our case the number "2" was mistranscribed as ":". The mistranscription is not our main concern, but it may be related to the root cause of the misclassification.

My question is: are the two symbols (colon and hyphen) expected to be classified as pronunciation? If not, is there a plan to fix it?

asked 2 years ago · 285 views
1 Answer
What did the speaker actually say that resulted in the transcriptions you describe? I can think of a few common scenarios where ":" and "-" should properly be considered "pronunciation" and not "punctuation".

For example, if a speaker said, "the ratio of oil to vinegar in a vinaigrette is three-to-one," a transcription like "the ratio [...] is 3:1" is appropriate. This would also explain why "two" was incorrectly transcribed as ":" in your example above: "two" sounds the same as the "to" in a spoken ratio.

The hyphen is similar. The "-" symbol is a hyphen (punctuation) in some contexts but a minus sign (pronunciation) in others. So if a speaker said, "the temperature is minus five degrees," then a transcription like "the temperature is -5 degrees" could be appropriate.

Homophones (words or symbols that are pronounced the same but spelled differently) can be a challenge for speech-to-text transcription, especially if the spoken text uses domain-specific words and phrases. This is where Amazon Transcribe features like Custom Vocabularies and Custom Language Models may be worth looking into.

AWS
Kris
answered 2 years ago
