AWS transcribe - is there a way to transcribe numbers not as digits?


I am using AWS transcribe in the following format:

aws transcribe start-transcription-job --language-code en-US --media-format wav --media MediaFileUri=s3://my-bucket/my-audio-file.wav --output-bucket-name my-output-bucket

In my output files, every spoken number is transcribed as digits. For example, "I just spent fifty dollars" is transcribed as "I just spent 50 dollars".

Is there a way to transcribe numbers in their written form and not digits?

1 Answer

Hi,

At the moment, there is no API parameter to disable Transcribe's number formatting (https://docs.aws.amazon.com/transcribe/latest/dg/how-numbers.html), but you can apply a post-processing step. For example, with the num2words library:

from num2words import num2words

# Define a function to convert numbers in a sentence to words
def convert_numbers_to_words(sentence):
    words = []
    for word in sentence.split():
        # Check if the word consists only of decimal digits
        if word.isdecimal():
            # Convert the number to words and append to the list
            words.append(num2words(int(word)))
        else:
            # Append the original word to the list
            words.append(word)
    # Join the words back into a sentence
    return " ".join(words)

# Example usage
sentence = "I just spent 50 dollars"
converted_sentence = convert_numbers_to_words(sentence)
print(converted_sentence)

Alternatively, the same idea with the inflect library:

import inflect

def convert_numbers_to_words(text):
    p = inflect.engine()
    words = text.split()
    new_words = []

    for word in words:
        if word.isdigit():
            word = p.number_to_words(word)
        new_words.append(word)

    return ' '.join(new_words)

transcribed_text = "I just spent 50 dollars"
converted_text = convert_numbers_to_words(transcribed_text)
print(converted_text)
  • Or simply build your own {number: word} dictionary and do the replacement with re.
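
A minimal sketch of the dictionary approach, assuming only a handful of known numbers need converting. The NUMBER_WORDS table and the replace_numbers helper are hypothetical names chosen for illustration, not part of any AWS API. Unlike splitting on whitespace, the regex also catches numbers that touch punctuation, such as "50." at the end of a sentence.

```python
import re

# Hypothetical lookup table; extend it to cover the numbers you expect.
NUMBER_WORDS = {
    "1": "one",
    "2": "two",
    "10": "ten",
    "50": "fifty",
    "100": "one hundred",
}

# \b\d+\b matches whole runs of digits, even next to punctuation ("50.").
DIGITS = re.compile(r"\b\d+\b")

def replace_numbers(text, table=NUMBER_WORDS):
    # Numbers missing from the table are left unchanged.
    return DIGITS.sub(lambda m: table.get(m.group(), m.group()), text)

print(replace_numbers("I just spent 50 dollars."))
```

Note that this treats each run of digits separately, so a decimal such as "3.50" would be looked up as "3" and "50"; amounts like that would need their own handling.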

Hope that helps.

AWS
answered a year ago
