Hi, I am working on a web app that basically does what AWS's Real-time transcription does. However, my web app works a lot slower than AWS's Real-time transcription.

Since I tested from the same machine and the same location, hitting the same region (us-west-2), I suspect that my JavaScript code is not doing the audio encoding fast enough. AWS Streaming Transcribe only has example code for streaming a file. Can someone share a snippet that illustrates how AWS's Real-time transcription does its audio encoding?

P.S. My app uses code very similar to what is described in

I haven't worked in this area directly, but I think you could borrow some aspects of your solution from this code sample, namely the Meeting Audio Processing backend based on Kinesis/Fargate. Please check whether this helps.
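On the encoding question itself, here is a minimal sketch of what browser-side encoding for streaming transcription typically looks like. It is not AWS's actual code. It assumes the audio arrives as Float32 samples from the Web Audio API and that the stream is configured for 16-bit signed little-endian PCM; the function names `downsample` and `floatTo16BitPCM` and the 16000 Hz target rate are my own choices for illustration.

```javascript
// Reduce the sample rate by averaging each group of input samples.
// Assumption: simple averaging is good enough for speech; no proper low-pass filter is applied.
function downsample(samples, inputRate, outputRate) {
  if (outputRate >= inputRate) return samples;
  const ratio = inputRate / outputRate;
  const outLength = Math.floor(samples.length / ratio);
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const start = Math.floor(i * ratio);
    const end = Math.min(Math.floor((i + 1) * ratio), samples.length);
    let sum = 0;
    for (let j = start; j < end; j++) sum += samples[j];
    out[i] = sum / (end - start);
  }
  return out;
}

// Convert Float32 samples in [-1, 1] to 16-bit signed little-endian PCM.
function floatTo16BitPCM(samples) {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}

// Hypothetical usage: 48 kHz microphone audio encoded for a 16 kHz stream.
function encodeChunk(float32Samples, inputRate) {
  return floatTo16BitPCM(downsample(float32Samples, inputRate, 16000));
}
```

Both loops are linear in the chunk size, so if the encoding looks like this, it is unlikely to be the source of a delay you can notice.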

Thanks, Rama

  • Thanks for your reply, Rama. It turns out that my web app transcribes slower than AWS's Real-time transcription because my web app uses Medical Transcribe. It appears that Medical Transcribe needs a lot more time to 'end'/'finalize' the transcription.
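If the delay is mostly in finalization, one option might be to display partial results as they arrive instead of waiting for the final ones. The sketch below assumes each result in a transcript event carries an `IsPartial` flag and an `Alternatives` list with a `Transcript` string; `applyResults` and the `state` object are hypothetical names, not part of any AWS SDK.

```javascript
// Merge a batch of streaming results into the displayed text.
// state.finalText holds finalized segments; state.pending holds the latest partial segment.
function applyResults(state, results) {
  for (const result of results) {
    const text = result.Alternatives?.[0]?.Transcript ?? "";
    if (result.IsPartial) {
      // A partial result replaces the previous partial for the same segment.
      state.pending = text;
    } else {
      // A final result is appended and the partial buffer is cleared.
      state.finalText += text + " ";
      state.pending = "";
    }
  }
  return state.finalText + state.pending;
}
```

Rendering the returned string on every event would show words as soon as they are recognized, even when the service takes longer to mark the segment as final.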

