Glue Throttling Exception when starting > 15 Glue jobs in Parallel via Step Function


We are using Step Functions for our ETL pipeline. The first step kicks off 21 Glue jobs, each of which takes about 1-3 minutes and consumes 5 DPUs. The Step Function fails with the error below when it tries to run more than 15 Glue jobs in parallel. We are using the arn:aws:states:::glue:startJobRun.sync task to invoke the jobs synchronously. Is there a quota I need to request an increase for? Kicking off 21 jobs in parallel seems pretty reasonable.

{ "resourceType": "glue", "resource": "startJobRun.sync", "error": "Glue.AWSGlueException", "cause": "Rate exceeded (Service: AWSGlue; Status Code: 400; Error Code: ThrottlingException; Proxy: null)" }

  • Hello tjtoll, would you mind posting the code snippet that solved the issue in your case?

    It would be highly appreciated, thank you very much!

    Kind regards, Armin

tjtoll
asked 2 years ago · 4927 views
1 Answer
Accepted Answer

Hi! Good question.

For General Glue Service Quotas (Limits), please see here: https://docs.aws.amazon.com/general/latest/gr/glue.html

Default Glue Quotas include things like:

  • Max concurrent job runs per account (50)
  • Max jobs per trigger (50)

To increase those and other limits, you can open a Service Quota Increase Request.

For Throttling Exceptions (https://docs.aws.amazon.com/glue/latest/webapi/CommonErrors.html), I'm not sure of the exact rate at which API calls start getting throttled. If that is what you are hitting, you may need to retry with exponential backoff (I've seen this with other API calls): https://docs.aws.amazon.com/general/latest/gr/api-retries.html
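
To illustrate the exponential backoff idea, here is a minimal Python sketch. It is not Glue-specific: `ThrottlingError` and `call_with_backoff` are hypothetical names made up for this example, and the attempt count and delays are assumed values, not documented Glue limits. In real code you would catch whatever throttling error your client actually raises.

```python
import random
import time


class ThrottlingError(Exception):
    """Hypothetical stand-in for a throttling response such as 'Rate exceeded'."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying on ThrottlingError with exponential backoff and jitter.

    The wait before retry k is a random value between 0 and
    min(max_delay, base_delay * 2**k), so parallel callers spread out
    instead of retrying in lockstep. All defaults are assumed values.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottlingError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

The `sleep` parameter exists only so the waiting can be swapped out when experimenting; by default the helper really sleeps between attempts.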

jsonc
answered 2 years ago
AWS
EXPERT
reviewed 2 years ago
  • Thanks. I ended up using the retry/backoff options on each of those jobs so that they retry after 5 seconds.
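
The exact configuration used was not posted, so the following is only a sketch of what a retry setup along those lines might look like. It builds a Step Functions Task state in Python and prints it as Amazon States Language JSON. The 5-second interval, the error name, and the resource ARN come from the thread; the job name `example-etl-job`, `MaxAttempts`, and `BackoffRate` are assumptions, and `glue_task_state` is a hypothetical helper.

```python
import json


def glue_task_state(job_name, next_state=None):
    """Build a Task state that starts a Glue job and retries on Glue errors."""
    state = {
        "Type": "Task",
        "Resource": "arn:aws:states:::glue:startJobRun.sync",
        "Parameters": {"JobName": job_name},
        "Retry": [
            {
                # Error name as reported in the failed execution above.
                # Note: this matches every Glue.AWSGlueException, not only throttling.
                "ErrorEquals": ["Glue.AWSGlueException"],
                "IntervalSeconds": 5,   # first retry after 5 seconds (from the thread)
                "MaxAttempts": 5,       # assumed value
                "BackoffRate": 2.0,     # assumed value: wait doubles on each retry
            }
        ],
    }
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


if __name__ == "__main__":
    # "example-etl-job" is a placeholder job name.
    print(json.dumps(glue_task_state("example-etl-job"), indent=2))
```

Each of the 21 parallel branches would carry its own copy of the Retry block, so a branch that gets throttled waits and tries again instead of failing the whole execution.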
