Glue Throttling Exception when starting > 15 Glue jobs in Parallel via Step Function
We are using Step Functions for our ETL pipeline. The first step kicks off 21 jobs, each of which takes about 1-3 minutes and consumes 5 DPUs. The Step Function fails with the error below when it tries to run more than 15 Glue jobs in parallel. We use the arn:aws:states:::glue:startJobRun.sync task to invoke the jobs synchronously. Is there a quota I need to request an increase for? Kicking off 21 jobs in parallel seems pretty reasonable.
{ "resourceType": "glue", "resource": "startJobRun.sync", "error": "Glue.AWSGlueException", "cause": "Rate exceeded (Service: AWSGlue; Status Code: 400; Error Code: ThrottlingException; Proxy: null)" }
Hi! Good question.
For General Glue Service Quotas (Limits), please see here: https://docs.aws.amazon.com/general/latest/gr/glue.html
Default Glue Quotas include things like:
- Max concurrent job runs per account (50)
- Max jobs per trigger (50)
To increase those and other limits, you can open a Service Quota Increase Request.
For Throttling Exceptions (https://docs.aws.amazon.com/glue/latest/webapi/CommonErrors.html), I'm not sure of the exact rate at which API calls start getting throttled. If that's what is happening here, you may need to retry with exponential backoff (I've seen this with other API calls): https://docs.aws.amazon.com/general/latest/gr/api-retries.html
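To illustrate the backoff idea from that last link, here is a minimal sketch in plain Python. It is not AWS SDK code: `ThrottledError`, `call_with_backoff`, and the delay values are made-up names and numbers for illustration, and the AWS SDKs already ship with their own retry logic.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error such as ThrottlingException."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn(), retrying on ThrottledError with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # The delay ceiling doubles each attempt: base, 2*base, 4*base, ... up to max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: a random wait below the ceiling spreads out parallel callers.
            sleep(random.uniform(0, cap))
```

The random jitter matters when many callers start at the same moment (like 21 parallel branches), since otherwise they would all retry in lockstep and hit the rate limit again together.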
Thanks. I ended up using the retry/backoff options on each of those jobs so that they retry after 5 seconds.
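For anyone looking for a starting point, a retry of this kind might look roughly like the sketch below, which builds one Task state as a Python dict and prints it as state machine JSON. This is not the poster's actual configuration: the job name `my-etl-job`, the state name `Run Glue Job`, and the `MaxAttempts` and `BackoffRate` values are assumptions. Only the 5-second interval, the resource ARN, and the error name come from this thread.

```python
import json

# Task state for one Glue job, with a Retry block for the throttling error.
glue_task = {
    "Type": "Task",
    "Resource": "arn:aws:states:::glue:startJobRun.sync",
    "Parameters": {"JobName": "my-etl-job"},  # hypothetical job name
    "Retry": [
        {
            "ErrorEquals": ["Glue.AWSGlueException"],  # error name from the failure above
            "IntervalSeconds": 5,  # first retry after 5 seconds, as described in the thread
            "MaxAttempts": 5,      # assumed value
            "BackoffRate": 2.0,    # assumed: the wait doubles on each further retry
        }
    ],
    "End": True,
}

# Print the state as JSON, ready to paste into a branch of a Parallel state.
print(json.dumps({"Run Glue Job": glue_task}, indent=2))
```

Note that `Glue.AWSGlueException` may also cover Glue errors other than throttling, so it is worth checking that retrying those is acceptable for your jobs.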
Hello tjtoll, would you mind posting the code snippet that solved the issue in your case?
It would be highly appreciated, thank you very much!
Kind regards, Armin