Amazon Textract extraction speed


Good morning. I have an approved quota of 60 transactions per second for "DetectDocumentText throttle limit in transactions per second". The connection works and the extracted text is correct, but processing is very slow: we are using only 10 to 17 requests per minute, when quota utilization should be between 3,000 and 3,600 per minute. What else can we review to increase throughput? We need to process 5,000,000 pages per day, and at this rate we will never finish. The instance that calls the Textract API is a c4.8xlarge, and the documents are stored in S3. I look forward to your comments. Thank you very much.

CDT
Asked 1 year ago, 626 views
2 Answers
  1. Check whether your images are larger, and therefore higher quality, than you really need, since that could be slowing down processing. I would try down-sampling the images and check whether processing time improves without accuracy dropping.
  2. I'm not sure I fully understood your architecture, but I advise you to look at this sample, which shows how to handle concurrent Textract requests and queuing: https://github.com/aws-samples/amazon-textract-serverless-large-scale-document-processing
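To make the concurrency point concrete, here is a minimal sketch, not a tested production setup. A rate of 10 to 17 calls per minute would be consistent with pages being sent one at a time, so the sketch keeps many calls in flight while pacing them under the 60-per-second quota. `required_workers`, `RateLimiter`, `make_textract_detector` and `process_pages` are hypothetical helper names, the 2-second call latency used below is an assumption you should replace with a measured value, and the only AWS calls used are boto3's `detect_document_text` and botocore's `Config(max_pool_connections=...)`.

```python
import math
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def required_workers(pages_per_day, seconds_per_call):
    """Little's law: calls in flight = throughput (calls/s) x latency (s)."""
    pages_per_second = pages_per_day / 86_400
    return math.ceil(pages_per_second * seconds_per_call)


class RateLimiter:
    """Spaces call starts at least 1/max_per_second apart across all threads."""

    def __init__(self, max_per_second):
        self._interval = 1.0 / max_per_second
        self._lock = threading.Lock()
        self._next_slot = time.monotonic()

    def wait(self):
        # Reserve the next free start slot under the lock, sleep outside it.
        with self._lock:
            now = time.monotonic()
            slot = max(self._next_slot, now)
            self._next_slot = slot + self._interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)


def make_textract_detector(bucket, max_connections):
    """Returns a function that calls DetectDocumentText on one S3 object."""
    # Imported lazily so the rest of the sketch runs without boto3 installed.
    import boto3
    from botocore.config import Config

    # The default HTTP connection pool is small; size it to the thread count.
    client = boto3.client(
        "textract", config=Config(max_pool_connections=max_connections)
    )

    def detect(key):
        return client.detect_document_text(
            Document={"S3Object": {"Bucket": bucket, "Name": key}}
        )

    return detect


def process_pages(keys, detect, max_workers, max_per_second):
    """Runs detect(key) for every key concurrently; results keep input order."""
    limiter = RateLimiter(max_per_second)

    def task(key):
        limiter.wait()
        return detect(key)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, keys))
```

Under the assumed 2-second latency, `required_workers(5_000_000, 2.0)` gives 116, so roughly that many calls would need to be in flight at once to reach 5,000,000 pages per day. A run could then look like `process_pages(keys, make_textract_detector("my-bucket", 116), max_workers=116, max_per_second=55)`, where `"my-bucket"` is a placeholder and 55 leaves some headroom under the quota. Throttling errors still need retry handling, which this sketch leaves out.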
AWS
Answered 1 year ago
  • If my response helped, please consider accepting it so it can help others. Thanks!


Please reach out to your local AWS account team and/or Solutions Architect. Given the scale that you're operating at there are likely a few different conversations that need to be had.

AWS
Expert
Answered 1 year ago
  • OK, thank you very much!
