Colon and hyphen symbols are classified as pronunciation instead of punctuation in Amazon Transcribe


Hi,

When using Amazon Transcribe, we found that colon and hyphen symbols are returned with the type "pronunciation" instead of "punctuation" in the items of the transcription result. For example:

"items": [
    {
        "start_time": "xxxx.xx",
        "end_time": "xxxx.xx",
        "alternatives": [
            {
                "confidence": "1.0",
                "content": ":"
            }
        ],
        "type": "pronunciation"
    },
    ...
]

or

"items": [
    {
        "start_time": "xxxx.xx",
        "end_time": "xxxx.xx",
        "alternatives": [
            {
                "confidence": "1.0",
                "content": "-"
            }
        ],
        "type": "pronunciation"
    },
    ...
]

My understanding is that the colon and hyphen should be classified as punctuation, like the comma and exclamation mark. That is not the case in Amazon Transcribe results, and the issue is reproducible on our end. By the way, in our case the number "2" was mistranscribed as ":". The mistranscription is not our main concern, but could it be related to the root cause of the misclassification?
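To make the observation easier to reproduce, here is a minimal sketch that scans an items list shaped like the excerpts above and flags every entry typed "pronunciation" whose content consists only of punctuation characters. The function name `find_symbol_pronunciations` and the sample data (including the timestamps) are illustrative assumptions, not part of Amazon Transcribe.

```python
import string

# Assumption: `items` has the same shape as the "items" list in the excerpts
# above: each entry has a "type" and an "alternatives" list with "content".
def find_symbol_pronunciations(items):
    """Return items typed "pronunciation" whose content is only punctuation characters."""
    flagged = []
    for item in items:
        if item.get("type") != "pronunciation":
            continue
        alternatives = item.get("alternatives") or []
        if not alternatives:
            continue
        content = alternatives[0].get("content", "")
        # string.punctuation covers ASCII symbols such as ":" and "-"
        if content and all(ch in string.punctuation for ch in content):
            flagged.append(item)
    return flagged


# Illustrative sample data; the timestamps are made up for this sketch.
sample_items = [
    {"start_time": "0.10", "end_time": "0.40",
     "alternatives": [{"confidence": "1.0", "content": "ratio"}],
     "type": "pronunciation"},
    {"start_time": "0.40", "end_time": "0.60",
     "alternatives": [{"confidence": "1.0", "content": ":"}],
     "type": "pronunciation"},
    {"alternatives": [{"confidence": "0.0", "content": ","}],
     "type": "punctuation"},
]

for item in find_symbol_pronunciations(sample_items):
    print(item["alternatives"][0]["content"], item.get("start_time"))
```

Running this over a full result file would show how often, and at which timestamps, symbols are returned as pronunciation items.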

My question is: are the two symbols (colon and hyphen) expected to be classified as pronunciation? If not, is there a plan to fix it?

Asked 2 years ago · 295 views
1 Answer

What did the speaker actually say that resulted in the transcriptions you describe? I can think of a few common scenarios where ":" and "-" should properly be considered "pronunciation" and not "punctuation".

For example, if a speaker said, "the ratio of oil to vinegar in a vinaigrette is three-to-one," a transcription looking like "the ratio [...] is 3:1" is appropriate. Here the spoken "to" is rendered as ":", and because "to" and "two" sound alike, this would also explain seeing "two" incorrectly transcribed as ":" in your example above.

The hyphen is a similar case. The "-" symbol is a hyphen (punctuation) in some contexts but a minus sign (pronunciation) in others. So if a speaker said, "the temperature is minus five degrees," then a transcription looking like "the temperature is -5 degrees" could be appropriate.
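As a rough way to check which of these readings applies in your output, the sketch below guesses the role of a ":" or "-" item from its neighbors. It is only a heuristic under stated assumptions: the symbol appears as its own item (as in the question's excerpts), and the adjacent numbers are transcribed as digits. The names `content_of` and `guess_symbol_role` are hypothetical helpers, not Amazon Transcribe APIs.

```python
def content_of(item):
    """Return the first alternative's content, or "" if there is none."""
    alternatives = item.get("alternatives") or []
    return alternatives[0].get("content", "") if alternatives else ""


def guess_symbol_role(items, index):
    """Heuristically label the symbol at `index` as "ratio", "minus", or "unknown"."""
    symbol = content_of(items[index])
    prev_is_number = index > 0 and content_of(items[index - 1]).isdigit()
    next_is_number = (index + 1 < len(items)
                      and content_of(items[index + 1]).isdigit())
    # A colon between two numbers looks like a spoken ratio ("three to one").
    if symbol == ":" and prev_is_number and next_is_number:
        return "ratio"
    # A "-" directly before a number, but not after one, looks like "minus".
    if symbol == "-" and next_is_number and not prev_is_number:
        return "minus"
    return "unknown"


def word(text):
    """Build a minimal pronunciation item for this illustration."""
    return {"alternatives": [{"confidence": "1.0", "content": text}],
            "type": "pronunciation"}


ratio_items = [word("is"), word("3"), word(":"), word("1")]
minus_items = [word("is"), word("-"), word("5"), word("degrees")]

print(guess_symbol_role(ratio_items, 2))
print(guess_symbol_role(minus_items, 1))
```

If most flagged symbols fall into the "unknown" bucket, that would suggest the spoken context is something other than a ratio or a negative number, which is worth including when describing what the speaker actually said.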

Homophones (words or symbols that are pronounced the same but spelled differently) can be a challenge for speech-to-text transcription, especially if the spoken text uses domain-specific words and phrases. This is where Amazon Transcribe features like Custom Vocabularies and Custom Language Models may be worth looking into.

Kris (AWS)
Answered 2 years ago
