Textract performance degradation


Hello, I need to OCR huge PDF documents as fast as possible because of an SLA. My first attempt was to use SYNC mode, but I could not get more than a couple of pages per second, and AWS Support declined all requests to increase the quota. So I switched to ASYNC mode. It works reasonably well most of the time: I can OCR 200 pages in less than 2 minutes. But around 15:00 UTC almost every day I see a huge performance degradation, and sometimes it takes 120 seconds to OCR a single page. The Textract job just gets stuck in the IN_PROGRESS state. Any ideas or suggestions?

asked 2 years ago · 1524 views
2 Answers

Hi, thank you for using Textract, and I'm sorry to hear you're facing performance issues. If you're comfortable doing so, can you share sample job IDs and the Region where you're seeing these issues? You can also reach out via AWS Support to share these details.
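One way to collect the job IDs and timings requested here is to record them while polling. The following is only a minimal sketch: `wait_for_job`, its parameters, and the `TIMED_OUT_LOCALLY` label are hypothetical names, and `client` is assumed to be a boto3 Textract client (or any object exposing `get_document_text_detection`).

```python
import time


def wait_for_job(client, job_id, poll_s=2.0, timeout_s=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll an async text-detection job and report how long it stayed IN_PROGRESS.

    `client` is assumed to behave like boto3.client("textract").
    `sleep` and `clock` are injectable so the loop can be tested offline.
    """
    start = clock()
    while True:
        status = client.get_document_text_detection(JobId=job_id)["JobStatus"]
        elapsed = clock() - start
        if status != "IN_PROGRESS":
            # SUCCEEDED, FAILED, or PARTIAL_SUCCESS
            return {"job_id": job_id, "status": status, "elapsed_s": elapsed}
        if elapsed >= timeout_s:
            # Local give-up marker (hypothetical); the job itself keeps running.
            return {"job_id": job_id, "status": "TIMED_OUT_LOCALLY", "elapsed_s": elapsed}
        sleep(poll_s)
```

Logging the returned dictionary together with the Region and a UTC timestamp should give a list of slow job IDs that can be shared with AWS Support.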

AWS
answered 2 years ago
  • Also, I found this: "underestimate because of the timed-out jobs. If you want to build a real-time, customer-facing product with PDF inputs, AWS Textract is not the tool for you. Accuracy and speed results. Double asterisks indicate the best result for each measure" The above was the answer I got from AWS Support when I reported the performance problem. Is it true?


I expect the regular pattern you're seeing in the latency probably corresponds to changing overall demand on the service in that Region.

I'd therefore suggest trying to route documents to a different AWS Region during these problem periods, if possible. Some testing would probably be needed to find the ideal schedules and Regions, but as a first guess I'd explore Regions in significantly different time zones and those with high default quotas.
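As a rough illustration of time-based routing, here is a minimal sketch. The Region names, the 14:00 to 17:00 UTC window, and the helper names `pick_region` and `start_ocr` are all assumptions for illustration and would need tuning from your own latency measurements. I believe the input S3 bucket needs to be in the same Region as the Textract endpoint being called, so the sketch assumes one bucket per Region; please verify that for your setup.

```python
from datetime import datetime, time, timezone

# Hypothetical values: replace with Regions and hours found by testing.
PRIMARY_REGION = "us-east-1"
FALLBACK_REGION = "ap-southeast-2"
SLOW_WINDOW_UTC = (time(14, 0), time(17, 0))  # brackets the ~15:00 UTC slowdown


def pick_region(now=None):
    """Return the fallback Region inside the slow window, else the primary."""
    now = now or datetime.now(timezone.utc)
    start, end = SLOW_WINDOW_UTC
    in_window = start <= now.astimezone(timezone.utc).time() < end
    return FALLBACK_REGION if in_window else PRIMARY_REGION


def start_ocr(bucket_by_region, key, now=None):
    """Start an async text-detection job in the Region chosen for `now`.

    `bucket_by_region` maps Region name to a bucket holding a copy of the
    document (assumption: the document is replicated ahead of time).
    """
    import boto3  # imported lazily so pick_region can be tried without AWS access

    region = pick_region(now)
    client = boto3.client("textract", region_name=region)
    response = client.start_document_text_detection(
        DocumentLocation={"S3Object": {"Bucket": bucket_by_region[region], "Name": key}}
    )
    return region, response["JobId"]
```

A fixed clock window is the simplest possible policy. Switching Regions based on measured job latency would react to slowdowns outside the expected window, at the cost of more bookkeeping.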

It's worth mentioning that for the async APIs, the performance of very short documents should be dominated by queuing and other fixed overheads anyway, so it's probably not that useful to compare the per-page processing time of a 200-page doc with that of a 1-page doc.
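To make the overhead point concrete, here is a toy model in which a job takes a fixed overhead plus a per-page cost. The 30-second overhead and 0.45 seconds per page are invented numbers, chosen only so that a 200-page job lands near the 2 minutes mentioned in the question. They are not measured Textract figures.

```python
def apparent_seconds_per_page(pages, overhead_s, per_page_s):
    """Average wall-clock seconds per page when each job pays a fixed overhead."""
    return (overhead_s + pages * per_page_s) / pages


# Assumed, illustrative parameters (not measurements).
OVERHEAD_S = 30.0
PER_PAGE_S = 0.45

for pages in (1, 10, 200):
    avg = apparent_seconds_per_page(pages, OVERHEAD_S, PER_PAGE_S)
    print(f"{pages:>3} pages: about {avg:.2f} s per page")
```

Under these assumptions the 200-page job averages about 0.6 seconds per page while the 1-page job takes about 30 seconds, even though the underlying per-page cost is identical. A slow single-page job therefore says more about queue time than about OCR speed.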

AWS
EXPERT
Alex_T
answered 2 years ago
