Colon and hyphen symbols are classified as pronunciation instead of punctuation in Amazon Transcribe

Hi,

When using Amazon Transcribe, we found that colon and hyphen symbols are given the item type "pronunciation" instead of "punctuation" in the Amazon Transcribe results. For example:

"items": [
    {
        "start_time": "xxxx.xx",
        "end_time": "xxxx.xx",
        "alternatives": [
            {
                "confidence": "1.0",
                "content": ":"
            }
        ],
        "type": "pronunciation"
    },
    ...
]

or

"items": [
    {
        "start_time": "xxxx.xx",
        "end_time": "xxxx.xx",
        "alternatives": [
            {
                "confidence": "1.0",
                "content": "-"
            }
        ],
        "type": "pronunciation"
    },
    ...
]

My understanding is that the colon and hyphen should be classified as punctuation, like the comma and exclamation mark. That is not the case in the Amazon Transcribe results, and the issue is reproducible on our end. Incidentally, in our case the number "2" was mistranscribed as ":". The mistranscription is not our main concern, but could it be related to the root cause of the misclassification?
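For anyone who wants to check their own results, here is a minimal sketch of how such items could be located. It assumes the items have the shape shown above; the function name `find_symbol_pronunciations` and the sample data are made up for illustration and are not part of any AWS SDK.

```python
# Symbols to look for (assumption: only the two discussed in this post).
SYMBOLS = {":", "-"}


def find_symbol_pronunciations(items, symbols=SYMBOLS):
    """Return (index, content) for each "pronunciation" item whose text is a bare symbol."""
    hits = []
    for index, item in enumerate(items):
        # Guard against a missing or empty "alternatives" list.
        alternatives = item.get("alternatives") or [{}]
        content = alternatives[0].get("content", "")
        if item.get("type") == "pronunciation" and content in symbols:
            hits.append((index, content))
    return hits


# Hypothetical sample shaped like the "items" array above (times are made up).
sample_items = [
    {"start_time": "0.10", "end_time": "0.40",
     "alternatives": [{"confidence": "1.0", "content": "three"}],
     "type": "pronunciation"},
    {"start_time": "0.40", "end_time": "0.60",
     "alternatives": [{"confidence": "1.0", "content": ":"}],
     "type": "pronunciation"},
    {"alternatives": [{"confidence": "0.0", "content": ","}],
     "type": "punctuation"},
    {"start_time": "0.70", "end_time": "0.90",
     "alternatives": [{"confidence": "1.0", "content": "-"}],
     "type": "pronunciation"},
]

print(find_symbol_pronunciations(sample_items))
```

With a real transcript file, the list would typically be read from the "items" array under "results" in the output JSON, for example via `json.load(f)["results"]["items"]`.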

My question is: are these two symbols (colon and hyphen) expected to be classified as pronunciation? If not, is there a plan to fix it?

Asked 2 years ago, 295 views
1 Answer
What did the speaker actually say that resulted in the transcriptions you describe? I can think of a few common scenarios where ":" and "-" should properly be considered "pronunciation" and not "punctuation".

For example, if a speaker said, "the ratio of oil to vinegar in a vinaigrette is three to one," a transcription such as "the ratio [...] is 3:1" is appropriate. This would also explain why "two" was incorrectly transcribed as ":" in your example above, since "two" and "to" sound alike.

As for the hyphen, the "-" symbol is a hyphen (punctuation) in some contexts but a minus sign (pronunciation) in others. So if a speaker said, "the temperature is minus five degrees," then a transcription such as "the temperature is -5 degrees" could be appropriate.
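To check which of these cases applies, it can help to look at the words transcribed around each symbol. Below is a rough sketch that assumes the item shape from the question; `context_around` and `make_item` are hypothetical helpers, and the sample items are made up (real output may tokenize a ratio differently).

```python
def context_around(items, index, window=3):
    """Join the top alternative of the items within `window` positions of `index`."""
    start = max(0, index - window)
    end = min(len(items), index + window + 1)
    return " ".join(item["alternatives"][0]["content"] for item in items[start:end])


def make_item(content):
    # Hypothetical helper that builds a minimal item for this example only.
    return {"alternatives": [{"confidence": "1.0", "content": content}],
            "type": "pronunciation"}


# Made-up items for illustration; not actual Amazon Transcribe output.
sample_items = [make_item(w) for w in ["the", "ratio", "is", "3", ":", "1"]]

# Print each bare symbol together with its neighbouring words.
for index, item in enumerate(sample_items):
    if item["alternatives"][0]["content"] in {":", "-"}:
        print(index, context_around(sample_items, index, window=2))
```

If the neighbouring words are numbers, the symbol was probably spoken (a ratio or a minus sign) rather than inserted as punctuation.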

Homophones - words (or symbols) that are pronounced the same but are spelled differently - can be a challenge for speech-to-text transcription, especially if the spoken text uses domain-specific words and phrases. This is where Amazon Transcribe features like Custom Vocabularies and Custom Language Models may be worth looking into.

Kris (AWS)
Answered 2 years ago
