Optimize Batch startup time


I'm considering using Batch as a more flexible alternative to Lambda, so that I can run longer or faster processes. A big difference from Lambda is the startup time. From what I found while searching, Batch startup time can vary from minutes to hours!

My jobs are initiated by the user and the result is expected within minutes, so how can the startup time be optimized? Last night, jobs using the default settings (optimal + Spot) with no GPU requirements were starting promptly. This morning I can't get any job to start, regardless of the machine. Surely there must be some guidance on the matter. Thanks!

asked 2 years ago, 1644 views
1 Answer

AWS Batch needs compute environments to run jobs. RUNNABLE jobs are started as soon as sufficient resources are available in one of the compute environments mapped to the job's queue. A compute environment can be short of capacity for several reasons, for example when the selected Spot instance types are not available at that moment, or when you choose a single AZ where On-Demand capacity is very thin (think specialized instances).
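To make the capacity point a bit more concrete, here is a minimal sketch that builds the request dictionary one might pass to boto3's `create_compute_environment`. It is only an illustration under assumptions: the function name, the environment name, subnet IDs, security group, instance role and vCPU limits are all placeholders, and choosing On-Demand capacity with a non-zero `minvCpus` (which keeps some instances running, at a cost) is one possible trade-off rather than a recommendation from the answer above.

```python
def build_compute_environment_request(name, subnet_ids, security_group_ids,
                                      instance_role, min_vcpus=4, max_vcpus=64):
    """Build kwargs for a managed EC2 compute environment (hypothetical helper).

    Nothing here talks to AWS; it only assembles and sanity-checks a dict.
    """
    # Rough guard: more than one subnet is a proxy for "more than one AZ".
    # The IDs alone do not prove the subnets are in different AZs.
    if len(subnet_ids) < 2:
        raise ValueError("provide subnets in at least two AZs")
    if min_vcpus > max_vcpus:
        raise ValueError("min_vcpus must not exceed max_vcpus")

    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",
        "state": "ENABLED",
        "computeResources": {
            # On-Demand instead of Spot, so Spot availability is not a factor.
            "type": "EC2",
            "allocationStrategy": "BEST_FIT_PROGRESSIVE",
            # A non-zero minimum keeps instances warm between jobs (assumption:
            # the extra cost is acceptable for user-facing latency).
            "minvCpus": min_vcpus,
            "maxvCpus": max_vcpus,
            "instanceTypes": ["optimal"],
            "subnets": list(subnet_ids),
            "securityGroupIds": list(security_group_ids),
            "instanceRole": instance_role,
        },
    }
```

With boto3 installed and credentials configured, the result could be sent with something like `boto3.client("batch").create_compute_environment(**request)`; please check the current API reference before relying on any particular field.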

Additionally, the AWS Batch Scheduler periodically evaluates jobs in the queue and moves them forward as appropriate. As you submit more jobs, you will see that the AWS Batch Scheduler evaluates and operates upon many jobs at once with each scheduling interval.

For example, if you submit a hundred jobs to AWS Batch, the Scheduler will transition all of these from SUBMITTED to RUNNABLE or PENDING in about a minute. RUNNABLE jobs should transition to STARTING and RUNNING fairly quickly assuming you have sufficient resources in your compute environment.
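If you want to see how long your jobs actually wait before they start, one option is to compare the timestamps that Batch reports for each job. The sketch below assumes job records shaped like the entries returned by boto3's `describe_jobs`, where `createdAt` and `startedAt` are Unix timestamps in milliseconds; the helper names are made up for this example, and it works on plain dictionaries so it can be tried without calling AWS.

```python
import statistics


def startup_delay_seconds(job):
    """Seconds between job creation and the start of RUNNING, or None."""
    created = job.get("createdAt")
    started = job.get("startedAt")
    # Jobs still in SUBMITTED, PENDING, RUNNABLE or STARTING have no startedAt.
    if created is None or started is None:
        return None
    return (started - created) / 1000.0


def summarize_startup_delays(jobs):
    """Summarize delays for the jobs that have started; None if there are none."""
    delays = sorted(
        d for d in (startup_delay_seconds(j) for j in jobs) if d is not None
    )
    if not delays:
        return None
    return {
        "count": len(delays),
        "min": delays[0],
        "median": statistics.median(delays),
        "max": delays[-1],
    }
```

Feeding this the `jobs` list from a `describe_jobs` response over a day or two should show whether slow starts are the norm or tied to particular times or instance choices.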

Here is a resource for debugging: Why is my AWS Batch job stuck in RUNNABLE status?

A related blog post: AWS Batch Dos and Don’ts: Best Practices in a Nutshell

AWS
answered 2 years ago
