Serverless scheduled GPU processing solution



I have an SQS queue to which I send pending video-processing jobs, and I'm looking for a way to bring up GPU compute resources to process these pending jobs every X hours. I have a container image that can pull messages from the queue, process the video, and persist the results. I would like the solution to be scalable, meaning able to increase the compute resources based on the number of pending videos in the queue.

I checked AWS Batch, but it seems that each pending video would have to be a separate task, meaning that a new instance of the image would be brought up for every video, which is not optimal. I think I found a way to do it with ECS, but it's pretty complex and not really serverless (it needs some provisioning, a VPC, etc.).

Any suggestions? Has anyone already faced this issue?



2 Answers

GPU support for Fargate (serverless compute) is not yet available; however, it is on the roadmap - refer here

As you rightly mentioned, ECS and AWS Batch have good GPU support within a VPC. Please refer to the blogs/docs below, in case they help with the challenges you are facing.

answered 16 days ago

One option might be a Lambda function that runs every minute and checks the number of visible messages in the queue. If the number increases, it starts new ECS tasks with GPU. The tasks themselves run a loop that reads messages from the queue. If there are messages, the task processes them. If there are no more messages, the task exits.
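The loop inside each task could look roughly like the sketch below. It is only an illustration: `drain_queue` and the `process` callback are hypothetical names, and `sqs` is assumed to be a boto3 SQS client (or anything exposing the same `receive_message` / `delete_message` calls).

```python
from typing import Any, Callable


def drain_queue(sqs: Any, queue_url: str, process: Callable[[str], None],
                wait_seconds: int = 20) -> int:
    """Process messages until a receive call comes back empty.

    Returns the number of messages handled, after which the task can exit.
    """
    handled = 0
    while True:
        # Long polling: wait up to `wait_seconds` for a message to arrive.
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=1,
            WaitTimeSeconds=wait_seconds,
        )
        messages = response.get("Messages", [])
        if not messages:
            # Nothing left to do, so let the container stop.
            return handled
        for message in messages:
            process(message["Body"])
            # Delete only after successful processing, so a crash mid-video
            # lets the message become visible again for another task.
            sqs.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            handled += 1
```

In the container entry point this would be called as something like `drain_queue(boto3.client("sqs"), queue_url, process_video)`, where `process_video` stands for your existing processing code. The queue's visibility timeout should be longer than the time it takes to process one video; otherwise a message may be picked up a second time while it is still being worked on.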

You will need to do some tweaking as to when exactly to launch new tasks, i.e., how many messages should there be in the queue to increase the number of tasks.
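As a starting point for that tweaking, here is a minimal sketch of one possible rule: aim for one task per fixed number of visible messages, capped at a maximum. The function name, `messages_per_task`, and `max_tasks` are assumptions for illustration, not values recommended by AWS.

```python
import math


def tasks_to_launch(visible_messages: int, running_tasks: int,
                    messages_per_task: int = 10, max_tasks: int = 4) -> int:
    """Return how many additional GPU tasks to start for the current backlog."""
    if visible_messages <= 0:
        return 0
    # One task per `messages_per_task` pending videos, never above the cap.
    desired = min(max_tasks, math.ceil(visible_messages / messages_per_task))
    # Only start the difference; never return a negative number.
    return max(0, desired - running_tasks)
```

For example, with the defaults above, 25 visible messages and 1 running task give a desired count of 3, so 2 more tasks would be started. Lowering `messages_per_task` drains the queue faster at a higher GPU cost.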

answered 16 days ago
  • Hi Uri, thanks for your answer.

    Is the only difference between AWS Batch and ECS the fact that Batch also manages priority queues? I'm not sure I understand the differences between the services and which one matches my needs...

    As far as I understand, the solution you suggested seems to be the best. Just to be sure, to implement this solution, I need to:

    • Create an ECS cluster (with a minimum capacity of 0?)
    • Create a task definition for my job requiring GPU
    • Implement the Lambda that will check the SQS queue and create the tasks depending on the number of messages in the queue
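A rough sketch of the Lambda step in the list above, using boto3. The environment variable names (`QUEUE_URL`, `CLUSTER`, `TASK_DEFINITION`), the helper names, and the sizing constants are all placeholders chosen for illustration. The task definition is assumed to already request a GPU, and launch type or capacity provider settings are left to the cluster defaults.

```python
import os
from typing import Any

MESSAGES_PER_TASK = 10  # assumed sizing, tune for your workload
MAX_TASKS = 4           # assumed cap to keep GPU cost bounded


def visible_messages(sqs: Any, queue_url: str) -> int:
    """Read the approximate number of visible messages from the queue."""
    attrs = sqs.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["ApproximateNumberOfMessages"],
    )
    return int(attrs["Attributes"]["ApproximateNumberOfMessages"])


def start_gpu_tasks(ecs: Any, cluster: str, task_definition: str, count: int) -> int:
    """Start `count` tasks; RunTask accepts at most 10 tasks per call."""
    started = 0
    while started < count:
        batch = min(10, count - started)
        response = ecs.run_task(
            cluster=cluster, taskDefinition=task_definition, count=batch
        )
        launched = len(response.get("tasks", []))
        started += launched
        if launched < batch:
            # Placement failed (for example, no GPU capacity); stop retrying here.
            break
    return started


def lambda_handler(event, context):
    import boto3  # imported here so the helpers above stay testable without AWS

    sqs, ecs = boto3.client("sqs"), boto3.client("ecs")
    cluster = os.environ["CLUSTER"]
    pending = visible_messages(sqs, os.environ["QUEUE_URL"])
    # Tasks whose desired status is RUNNING (first page of results only).
    running = len(ecs.list_tasks(cluster=cluster, desiredStatus="RUNNING")["taskArns"])
    wanted = min(MAX_TASKS, -(-pending // MESSAGES_PER_TASK)) - running
    started = start_gpu_tasks(ecs, cluster, os.environ["TASK_DEFINITION"], max(0, wanted))
    return {"pending": pending, "running": running, "started": started}
```

The function would be triggered on a schedule, and its role needs permission to read the queue attributes and to list and run tasks on the cluster.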

    How does the scaling of the cluster work? Since GPU resources are pretty expensive, I want to be sure the capacity fits my needs exactly. Does ECS know how to scale down once the tasks are done?

