AWS Transcribe - is there a way to transcribe numbers as words instead of digits?

I am running AWS Transcribe with the following command:

aws transcribe start-transcription-job --transcription-job-name my-transcription-job --language-code en-US --media-format wav --media MediaFileUri=s3://my-bucket/my-audio-file.wav --output-bucket-name my-output-bucket

In my output files, every spoken number is transcribed as digits. For example, "I just spent fifty dollars" is transcribed as "I just spent 50 dollars".

Is there a way to transcribe numbers in their written form and not digits?

1 Answer

Hi,

At the moment, there is no API parameter to disable Transcribe's number formatting (https://docs.aws.amazon.com/transcribe/latest/dg/how-numbers.html). However, you can apply a post-processing step to the transcript text. For example, you can use:

  • the num2words library:

from num2words import num2words

# Define a function to convert numbers in a sentence to words
def convert_numbers_to_words(sentence):
    words = []
    for word in sentence.split():
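        # Check if the word consists only of ASCII digits
        # (isnumeric() also accepts characters such as "½" that cannot be converted)
        if word.isascii() and word.isdigit():
            # Convert the number to words and append to the list
            words.append(num2words(int(word)))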
        else:
            # Append the original word to the list
            words.append(word)
    # Join the words back into a sentence
    return " ".join(words)

# Example usage
sentence = "I just spent 50 dollars"
converted_sentence = convert_numbers_to_words(sentence)
print(converted_sentence)

  • or the inflect library:

import inflect

def convert_numbers_to_words(text):
    p = inflect.engine()
    words = text.split()
    new_words = []

    for word in words:
        if word.isdigit():
            word = p.number_to_words(word)
        new_words.append(word)

    return ' '.join(new_words)

transcribed_text = "I just spent 50 dollars"
converted_text = convert_numbers_to_words(transcribed_text)
print(converted_text)
  • or just a simple dictionary of your own, {number: word}, combined with re-based replacement
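
The dictionary option could look roughly like the following. This is only a minimal sketch: the NUMBER_WORDS table and the replace_numbers function are names made up for illustration, and only the numbers listed in the table are converted. Because it uses a regular expression instead of split(), it also handles numbers that are followed by punctuation, such as "50.".

```python
import re

# Hypothetical lookup table; extend it with the numbers you expect to hear.
NUMBER_WORDS = {
    "10": "ten",
    "20": "twenty",
    "50": "fifty",
    "100": "one hundred",
}

# Match standalone runs of ASCII digits, so "50" is not matched inside "150".
DIGITS = re.compile(r"\b[0-9]+\b")

def replace_numbers(text, table=NUMBER_WORDS):
    # Numbers that are missing from the table are left unchanged.
    return DIGITS.sub(lambda m: table.get(m.group(), m.group()), text)

print(replace_numbers("I just spent 50 dollars."))
```

A fixed table like this only makes sense when the set of numbers is small and known in advance. Note that a decimal such as "3.50" would be treated as two separate numbers, so anything beyond plain integers probably needs one of the libraries above.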

Hope that helps.

AWS
answered a year ago
