What is the best job scheduler in AWS?

1) I have 150 ETL jobs that need to be moved from on-premises to the AWS cloud. What is the best way to schedule them in AWS: AWS Batch, SWF, Step Functions, or some other service? Which tool or service has all the capabilities of a full-fledged job scheduler?

2) The AWS Batch documentation says that it supports all jobs that run as Docker containers. Does that mean AWS Batch is only meant for Docker container jobs? In my case, all jobs are either bash/ksh scripts or ordinary ETL jobs that read files from EC2 and load data into an RDS database.

2 Answers

Hi there,

For the first question, I would suggest considering AWS Glue for your 150 ETL jobs. It is the serverless ETL offering among the AWS data tools, and you pay only for the compute used to run your scripts (billed per second, similar to Lambda). In terms of scalability and maintenance, most of the heavy lifting is handled for you.

Glue supports a few languages. To run a bash script from a Python job in Glue, you can call it with the subprocess module from a lean Python wrapper script:

import subprocess

# Run the shell script and capture its exit code (0 means success).
exit_code = subprocess.call('./practice.sh')
print(exit_code)

For your second question, the short answer is yes: AWS Batch is designed to run batch workloads in containers. However, you can use a single, simple "fetch and run" container image to handle all your scripts. You start by building a simple Docker image containing a helper application that can download your script, or even a zip file, from Amazon S3. AWS Batch then launches a container from that image, which retrieves your script and runs your job.

Here is a technical guide on how to do that.
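
To give a rough idea of what such a helper does, here is a minimal sketch in Python. It is not the helper from the guide, just an illustration of the same idea. The function name `fetch_and_run`, the `interpreter` parameter, and the injectable `download` argument are my own assumptions; by default it uses boto3's S3 `download_file` call, so the image would need boto3 installed and an IAM role that can read the bucket.

```python
import os
import subprocess
import tempfile


def fetch_and_run(bucket, key, args=(), interpreter="bash", download=None):
    """Download a script from S3 into a temp directory, run it, return its exit code.

    `download` is any callable taking (bucket, key, local_path). It defaults to
    boto3's S3 download_file and is injectable so the logic can be tried offline.
    """
    if download is None:
        import boto3  # imported lazily; only needed for real S3 access
        download = boto3.client("s3").download_file

    with tempfile.TemporaryDirectory() as workdir:
        local_path = os.path.join(workdir, os.path.basename(key))
        download(bucket, key, local_path)
        # Run through the interpreter so the file does not need an executable bit.
        completed = subprocess.run([interpreter, local_path, *args])
        return completed.returncode
```

Inside the container you would call something like `fetch_and_run("my-bucket", "jobs/load_orders.sh")` (both names are placeholders) and exit with the returned code, so AWS Batch can mark the job as succeeded or failed.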

Hope the above helps. Cheers.

answered 8 months ago
  • I am migrating SAS from an on-premises data center to AWS, so I need to schedule the existing jobs in AWS. I am not using any AWS components to build my ETL jobs, so I believe I can't use Glue in this case. Instead, I am looking for the best option to schedule the existing SAS jobs (which will run on a SAS DI EC2 instance in AWS). Please suggest whether AWS Glue, MWAA, SWF, or some other scheduler would be best for my case.

  • Hi, thanks for providing the details. However, I am looking for a full-fledged scheduler that provides a user interface, has all the usual scheduler features, and makes it easy to schedule jobs (adding dependencies, a graphical view of the schedule, etc.). Does AWS Batch have these features? How about MWAA? Can these tools replace a scheduler such as AutoSys or Control-M? Please note that I am not using AWS components for my jobs. Instead, I have installed my application software on EC2 instances and need to run the jobs using scripts. Our job schedule has very complex requirements and dependencies, so I am looking for the best scheduler in AWS.

Accepted Answer

What about MWAA (Amazon Managed Workflows for Apache Airflow)? It's a little more complex, but better suited as a full-fledged ETL scheduler.
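
The main reason is that Airflow-style schedulers let you declare dependencies between jobs as a graph and then work out a valid run order for you. As a rough illustration of that idea only (this is plain standard-library Python, not Airflow code, and the job names are made up):

```python
from graphlib import TopologicalSorter

# Hypothetical jobs; each key maps to the set of jobs it must wait for.
dependencies = {
    "extract_files": set(),
    "load_staging": {"extract_files"},
    "archive_files": {"extract_files"},
    "load_rds": {"load_staging"},
    "report": {"load_rds", "archive_files"},
}


def run_order(deps):
    """Return one ordering of the jobs in which every job follows its prerequisites."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

In Airflow itself each job would be a task in a DAG, and the web UI shows the same graph along with run history, which is close to what you describe from AutoSys or Control-M.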

answered 8 months ago
