Glue pricing for visual ETL jobs


I am writing this question after going through a bunch of Glue pricing documents. Essentially, what I want to know is how Glue divides the components of a visual ETL job for pricing.

Pipeline design: a visual ETL job reads from DynamoDB, applies data transformation logic, and loads the data into Redshift. With this pipeline, we create multiple relational tables in Redshift for a given DDB table.

Glue visual ETL job test environment config:
Version - 4.0
Worker Type - G.1X
Requested workers - 2
Job timeout - 2880 minutes

Let's say the test job runs for 10 minutes.

Now, does AWS charge separately for the visual diagram (nodes such as data source, custom transform, and data target) and for the actual Glue job run? If so, can you help me with the price calculation for the visual ETL diagram and the job run? Assume we will run this job once a day and that it takes 10 minutes to run, for example's sake. Additionally, does AWS also charge for data preview? Thanks

asked 25 days ago, 98 views
1 Answer

Hi,

While building the job visually, you are only charged if you use the data preview session. The Glue job run itself is charged for how long it takes to run, not node by node. If a transform makes your job run considerably longer, that is what drives the cost up, since more complex logic takes more time to complete.

Here is an example of what this may cost for a typical session:

AWS Glue Studio Job Notebooks and Interactive Sessions: Suppose you use a notebook in AWS Glue Studio to interactively develop your ETL code. An Interactive Session has 5 DPU by default. If you keep the session running for 24 minutes or 2/5th of an hour, you will be billed for 5 DPUs * 2/5 hour at $0.44 per DPU-Hour or $0.88.
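The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration: the function name is made up for this answer, and the $0.44 per DPU-Hour rate is the figure quoted above, which can differ by region.

```python
def interactive_session_cost(dpus, minutes, rate_per_dpu_hour=0.44):
    """Estimate the cost of an interactive (preview/notebook) session.

    rate_per_dpu_hour is an assumption taken from the example above;
    check the Glue pricing page for your region.
    """
    hours = minutes / 60
    return dpus * hours * rate_per_dpu_hour


# 5 DPUs kept running for 24 minutes (2/5 of an hour)
cost = interactive_session_cost(dpus=5, minutes=24)
print(f"{cost:.2f} USD")  # 0.88 USD
```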

Based on the job configuration of 2 G.1X workers running for ten minutes, you can use the following logic:

A Glue job is billed for a 1-minute minimum when it starts and per second after that, so we will assume it runs for 10 minutes with two workers the whole time. In the calculation, you will see 2 DPUs, which map back to the two G.1X workers (1 DPU each).

https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-jobs-job.html

Unit conversions
Duration for which Apache Spark ETL job runs: 10 minutes = 0.17 hours

Pricing calculations
Max (0.17 hours, 0.0166 hours (minimum billable duration)) = 0.17 hours (billable duration)
2 DPUs x 0.17 hours x 0.44 USD per DPU-Hour = 0.15 USD (Apache Spark ETL job cost)
ETL job cost (per run): 0.15 USD
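If you want to redo this calculation for other run times or worker counts, here is a rough Python sketch of the same logic. The function name and parameters are hypothetical, the $0.44 per DPU-Hour rate and the 1-minute minimum are the values used above, and 1 DPU per worker is assumed because the job uses G.1X workers. It works from exact seconds rather than the rounded 0.17 hours, so the unrounded figure differs slightly, but it rounds to the same 0.15 USD.

```python
def glue_job_run_cost(workers, run_seconds, dpu_per_worker=1.0,
                      rate_per_dpu_hour=0.44, minimum_seconds=60):
    """Estimate the cost of one Glue Spark ETL job run.

    Assumptions (check them against your region and worker type):
    - dpu_per_worker=1.0 corresponds to G.1X workers
    - rate_per_dpu_hour is the rate used in the calculation above
    - billing is per second with a 1-minute minimum
    """
    billable_seconds = max(run_seconds, minimum_seconds)
    dpu_hours = workers * dpu_per_worker * billable_seconds / 3600
    return dpu_hours * rate_per_dpu_hour


# 2 G.1X workers running for 10 minutes
per_run = glue_job_run_cost(workers=2, run_seconds=10 * 60)
print(f"per run: {per_run:.2f} USD")             # per run: 0.15 USD
print(f"30 daily runs: {30 * per_run:.2f} USD")  # 30 daily runs: 4.40 USD
```

The last line shows what one run per day would come to over 30 days, as long as each run really takes about 10 minutes.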

https://calculator.aws/#/createCalculator/Glue

You can use the pricing calculator to get estimates for job runs, but keep in mind the worker types you're using and how they map to DPUs, per the link mentioned above.

AWS
hamltm
answered 23 days ago
EXPERT
reviewed 23 days ago
