Optimize Amazon Comprehend Analysis Job use

I have an audio recording, and I want to extract custom entities from it. I searched and found that the Amazon Comprehend service could help with this.

To do this, I have to go through the following steps:

  1. Upload the file and transcribe it with Amazon Transcribe, which takes about 15 to 20 seconds.
  2. Save the transcribed text to an S3 bucket, because creating a Comprehend analysis job requires an input file.
  3. Create an analysis job with that input file.
  4. Wait until the job finishes, which is very slow (about 2 to 7 minutes).
  5. Once the job finishes, collect the output, which is a compressed file (output.tar.gz) rather than plain text.
  6. Download the archive to my local server, extract it, and read the content.
  7. Parse the file content as JSON.
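
Steps 5 to 7 can probably be collapsed into one in-memory step, with no temporary file on disk. A minimal sketch, assuming the archive bytes have already been downloaded from S3 and that each file inside the archive holds one JSON object per line; `parse_comprehend_archive` is a hypothetical helper name, not part of any AWS SDK:

```python
import io
import json
import tarfile


def parse_comprehend_archive(archive_bytes):
    """Return the JSON records found in a gzipped tar archive held in memory.

    Assumption: every regular file in the archive is JSON Lines
    (one JSON object per non-empty line).
    """
    records = []
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            content = tar.extractfile(member).read().decode("utf-8")
            for line in content.splitlines():
                if line.strip():
                    records.append(json.loads(line))
    return records
```

With boto3, the bytes could come from `s3.get_object(Bucket=..., Key=...)["Body"].read()`, where the bucket and key are whatever the job reports as its output location. This only shortens steps 5 to 7; it does not change the waiting time in step 4.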

The whole process takes about 5 to 15 minutes, and step 4 in particular takes a long time. I want to optimize it as much as possible. Can you please help me?

  • Some questions to answer:

    1. Which Comprehend API are you planning on using?
    2. How large do you expect the input file to be? Depending on your answer to (1), you may be able to use the synchronous version of the API, which would reduce the total time spent waiting on Comprehend.
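
For illustration, a synchronous call for custom entities might look like the sketch below. It assumes a real-time endpoint has already been created for the custom entity recognizer and that the transcript fits within the size limit of the synchronous operation; `detect_custom_entities` is a hypothetical helper, and the endpoint ARN is a placeholder to be replaced with your own:

```python
def detect_custom_entities(client, text, endpoint_arn):
    """Call the synchronous DetectEntities operation against a custom-model endpoint.

    `client` is expected to be a boto3 Comprehend client.
    Returns a list of (type, text, score) tuples.
    """
    response = client.detect_entities(Text=text, EndpointArn=endpoint_arn)
    return [(e["Type"], e["Text"], e["Score"]) for e in response["Entities"]]


# Example usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   client = boto3.client("comprehend")
#   entities = detect_custom_entities(client, transcript_text, "<your-endpoint-arn>")
```

Because the result comes back directly in the response, this path would skip the S3 upload, the job wait, and the archive handling, at the cost of keeping an endpoint running.
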
Thomas
asked 6 months ago, 98 views
No Answers