Amazon Textract extraction speed

Good morning. I have an approved quota of 60 per second for the "DetectDocumentText throttle limit in transactions per second" limit. The connection works and the extracted text is correct, but processing is very slow: we are using only 10 to 17 requests per minute of the quota, when utilization should be between 3,000 and 3,600 per minute. What else can we review to increase throughput? We need to process 5,000,000 pages per day, and at this rate we will never finish. The instance that calls the Textract API is a c4.8xlarge, and the storage is S3. I look forward to your comments. Thank you very much.

CDT
Asked a year ago, 625 views
2 Answers
  1. Check whether your images are larger, and therefore higher quality, than you really need, since that could be slowing the process. I would try down-sampling the images and see whether processing time improves without a loss of accuracy.
  2. I'm not sure I fully understood your architecture, but I advise you to look at this sample, which shows how to handle concurrent Textract requests and queuing: https://github.com/aws-samples/amazon-textract-serverless-large-scale-document-processing
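
The concurrency idea in point 2 can be sketched on a single machine as follows. This is a minimal illustration and not the linked sample's implementation. It assumes boto3 is installed and that each S3 object is a single-page image; the names `RateLimiter`, `process_all`, `make_textract_detector`, and the defaults of 50 TPS and 256 workers are hypothetical choices. The point is that calling DetectDocumentText one request at a time cannot get near a 60 TPS quota, so many calls need to be in flight at once, paced to stay under the limit.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimiter:
    """Paces callers to at most `rate` acquisitions per second, evenly spaced."""

    def __init__(self, rate):
        self.interval = 1.0 / rate
        self.lock = threading.Lock()
        self.next_slot = time.monotonic()

    def acquire(self):
        # Reserve the next free time slot, then sleep until it arrives.
        with self.lock:
            now = time.monotonic()
            slot = max(self.next_slot, now)
            self.next_slot = slot + self.interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)


def process_all(keys, detect, max_tps=50, workers=256):
    """Run detect(key) for every key concurrently, paced to max_tps."""
    limiter = RateLimiter(max_tps)

    def task(key):
        limiter.acquire()
        return detect(key)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as `keys`.
        return list(pool.map(task, keys))


def make_textract_detector(bucket, pool_size=256):
    """Build a detect(key) function backed by boto3 (assumed to be installed)."""
    import boto3
    from botocore.config import Config

    # Raise the HTTP connection pool above botocore's default of 10 and let
    # botocore back off and retry when Textract throttles a request.
    config = Config(
        max_pool_connections=pool_size,
        retries={"max_attempts": 10, "mode": "adaptive"},
    )
    client = boto3.client("textract", config=config)

    def detect_page(key):
        response = client.detect_document_text(
            Document={"S3Object": {"Bucket": bucket, "Name": key}}
        )
        # Keep only the recognized lines of text.
        return [b["Text"] for b in response["Blocks"] if b["BlockType"] == "LINE"]

    return detect_page
```

A call would look like `process_all(keys, make_textract_detector("my-bucket"))`, where `my-bucket` is a placeholder. Pacing at 50 rather than the full 60 TPS is only a guess at reasonable headroom, and at millions of pages per day a queue-based design like the linked sample is likely a better fit than one process holding every result in memory.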
AWS
Answered a year ago
  • If my response helped, please consider accepting it so it can help others. Thanks!

Please reach out to your local AWS account team and/or Solutions Architect. Given the scale that you're operating at there are likely a few different conversations that need to be had.
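
To put that scale in numbers, here is a rough back-of-the-envelope sketch using only the figures from the question. The 4-second latency is an assumption: it is what 10 to 17 requests per minute would imply if the client sends one request at a time, and the real value should be measured. The helper names are hypothetical.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def required_tps(pages_per_day):
    """Average pages per second needed to finish within one day."""
    return pages_per_day / SECONDS_PER_DAY


def required_concurrency(tps, latency_seconds):
    """Little's law: requests in flight = throughput x time per request."""
    return math.ceil(tps * latency_seconds)


target_tps = required_tps(5_000_000)      # about 57.9 pages per second
quota_per_day = 60 * SECONDS_PER_DAY      # 5,184,000 calls per day at 60 TPS

# Assumed latency: roughly 15 sequential calls per minute is 4 s per call.
assumed_latency = 4.0
in_flight = required_concurrency(target_tps, assumed_latency)  # 232

print(f"target: {target_tps:.1f} TPS, quota allows {quota_per_day:,} calls/day")
print(f"about {in_flight} concurrent requests needed at {assumed_latency} s each")
```

Under these assumptions the 60 TPS quota covers 5,000,000 pages per day with under 4% to spare, and only if a couple of hundred requests are kept in flight around the clock, which is a reason to discuss quota and architecture with the account team.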

AWS
Expert
Answered a year ago
  • OK, thank you very much!
