Colon and hyphen symbols are classified as pronunciation instead of punctuation in Amazon Transcribe


Hi,

When using Amazon Transcribe, we found that colon and hyphen symbols are reported with type "pronunciation" instead of "punctuation" in the items of the transcription result. For example:

"items": [
    {
        "start_time": "xxxx.xx",
        "end_time": "xxxx.xx",
        "alternatives": [
            {
                "confidence": "1.0",
                "content": ":"
            }
        ],
        "type": "pronunciation"
    },
    ...
]

or

"items": [
    {
        "start_time": "xxxx.xx",
        "end_time": "xxxx.xx",
        "alternatives": [
            {
                "confidence": "1.0",
                "content": "-"
            }
        ],
        "type": "pronunciation"
    },
    ...
]

My understanding is that the colon and hyphen should be classified as punctuation, like the comma and exclamation mark. That is not the case in our Amazon Transcribe results, and the issue is reproducible on our end. Incidentally, in our case the number "2" was mistranscribed as ":". The mistranscription is not our main concern, but it may be related to the root cause of the misclassification.

My question is: are the two symbols (colon and hyphen) expected to be classified as pronunciation? If not, is there a plan to fix it?

Asked 2 years ago · Viewed 295 times
1 Answer

What did the speaker actually say that resulted in the transcriptions you describe? I can think of a few common scenarios where ":" and "-" should properly be considered "pronunciation" and not "punctuation".

For example, if a speaker said, "the ratio of oil to vinegar in a vinaigrette is three-to-one," a transcription like "the ratio [...] is 3:1" is appropriate. Here the spoken "to" is rendered as ":", which would also explain seeing "two" (which sounds the same as "to") incorrectly transcribed as ":" in your example above.

The hyphen is a similar case. The "-" symbol is a hyphen (punctuation) in some contexts but a minus sign (pronunciation) in others. So if a speaker said, "the temperature is minus five degrees," then a transcription like "the temperature is -5 degrees" could be appropriate.
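To check which of these cases applies in your transcripts, it may help to list every symbol-only "pronunciation" item together with its neighbouring tokens. The following is a minimal sketch, assuming the `items` layout shown in the question; the helper name `symbol_pronunciations` and the sample data are made up for illustration (timestamps are omitted, and a real result may tokenize "3:1" differently).

```python
import string


def symbol_pronunciations(items, window=1):
    """Find "pronunciation" items whose content is only punctuation characters.

    Returns a list of (index, content, context) tuples, where context is the
    content of the item plus up to `window` neighbours on each side.
    Hypothetical helper, not part of any AWS SDK.
    """
    hits = []
    for i, item in enumerate(items):
        content = item["alternatives"][0]["content"]
        is_symbol = bool(content) and all(ch in string.punctuation for ch in content)
        if item.get("type") == "pronunciation" and is_symbol:
            lo = max(0, i - window)
            context = [it["alternatives"][0]["content"] for it in items[lo:i + window + 1]]
            hits.append((i, content, context))
    return hits


# Made-up sample in the shape of the "items" array from the question.
sample_items = [
    {"alternatives": [{"confidence": "1.0", "content": "3"}], "type": "pronunciation"},
    {"alternatives": [{"confidence": "1.0", "content": ":"}], "type": "pronunciation"},
    {"alternatives": [{"confidence": "1.0", "content": "1"}], "type": "pronunciation"},
    {"alternatives": [{"confidence": "0.0", "content": "."}], "type": "punctuation"},
]

for hit in symbol_pronunciations(sample_items):
    print(hit)
```

If the neighbours are numbers, as in this sample, the symbol was most likely spoken (a ratio or a minus sign) rather than inserted as punctuation.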

Homophones (words or symbols that are pronounced the same but spelled differently) can be a challenge for speech-to-text transcription, especially if the spoken text uses domain-specific words and phrases. This is where Amazon Transcribe features like Custom Vocabularies and Custom Language Models may be worth looking into.

Kris (AWS)
Answered 2 years ago
