How do I get the total number of Textract jobs currently running?

I am processing thousands of documents using Textract. Since Textract has a limit of 100 jobs running in parallel, I want to make sure that no more than 100 documents are sent to Textract at any given time. However, looking at the API, I cannot find anything that returns the total number of Textract jobs currently running. Please let me know if there is a way around this. Thanks.

munna
Asked 10 months ago · 285 views
1 Answer

Once a quota is set for the number of parallel jobs, the next job that goes over the quota is throttled, and your submission process needs to handle this (for example, by resubmitting after a back-off period). There is also a CloudWatch metric that shows the number of throttled requests.
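
A minimal sketch of resubmission with a back-off period, in Python. The names `ThrottledError` and `submit_with_backoff` are hypothetical and not part of any AWS SDK; `submit` stands for whatever function starts one Textract job, and it is assumed to raise `ThrottledError` when the request is throttled.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for whatever throttling error the submit call raises."""


def submit_with_backoff(submit, max_attempts=5, base_delay=1.0,
                        max_delay=60.0, sleep=time.sleep):
    """Call submit() and retry with exponential backoff and jitter when throttled.

    submit is any zero-argument callable that starts one job and returns its id.
    sleep is injectable so the waiting can be replaced in tests.
    """
    for attempt in range(max_attempts):
        try:
            return submit()
        except ThrottledError:
            # Give up after the last allowed attempt.
            if attempt == max_attempts - 1:
                raise
            # Double the delay ceiling on every retry, capped at max_delay,
            # and wait a random fraction of it so retries do not line up.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
```

With boto3, `submit` could wrap a call such as `start_document_text_detection`, translating the throttling-related `ClientError` into `ThrottledError`. Which error codes to treat as throttling is an assumption to check against the Textract documentation.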

But if you want a real-time view of the jobs, you need to track them in a repository such as DynamoDB so that you know precisely how many are running in parallel at any given time. This takes a bit more coding, and potentially AWS Step Functions to orchestrate job submission.
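
The bookkeeping itself can be sketched as follows. This is an in-memory stand-in, not a DynamoDB implementation: the class `InFlightTracker` and its methods are hypothetical names, and a set guarded by a lock plays the role of the repository.

```python
import threading


class InFlightTracker:
    """Tracks which jobs are currently running, up to a fixed limit.

    In-memory illustration only. A shared store such as DynamoDB would be
    needed when several processes submit jobs.
    """

    def __init__(self, limit=100):
        self._limit = limit
        self._jobs = set()
        self._lock = threading.Lock()

    def try_register(self, job_id):
        """Record job_id as running. Returns False if the limit is reached."""
        with self._lock:
            if len(self._jobs) >= self._limit:
                return False
            self._jobs.add(job_id)
            return True

    def complete(self, job_id):
        """Mark job_id as finished, freeing one slot."""
        with self._lock:
            self._jobs.discard(job_id)

    def running_count(self):
        """Number of jobs currently recorded as running."""
        with self._lock:
            return len(self._jobs)
```

A submitter would call `try_register` before starting a job and hold the document back when it returns `False`; `complete` would be called when the job's completion notification arrives. With DynamoDB, a conditional write could serve the same purpose as the lock here, though that design is an assumption rather than something prescribed above.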

AWS
Answered 10 months ago
  • Thank you for the response, Behrang. That helps. However, the proposed workaround would involve a fair amount of work just to get the number of jobs currently running. It is surprising that Textract does not provide a list_jobs() function like other AWS services.
