How do I monitor and optimize cost on AWS Glue for Apache Spark?


I want to monitor my Spark usage with AWS Glue, and optimize costs.

Resolution

Monitor usage

To get a summary of the cost of Spark usage in your AWS Glue jobs, use AWS Cost Explorer.

Complete the following steps:

  1. Open the AWS Billing and Cost Management console.
  2. In the navigation pane, choose Cost Explorer.
  3. On the Cost dashboard, view the monthly costs for AWS Glue.

View usage by job detail

To monitor AWS Glue job details, such as run status, run duration, or data processing unit (DPU) usage, complete the following steps:

  1. Open the AWS Glue console.
  2. Under ETL Jobs, choose Job run monitoring.

View cost by type of job

To get the costs for a specific type of AWS Glue job, complete the following steps:

  1. Open the AWS Billing and Cost Management console.
  2. Under Cost and usage analysis, choose Cost Explorer.
  3. Under Report parameters, in the Filters section, for Service, choose Glue.
  4. Under Usage type, select the filter for your job and include your AWS Region:
    For a standard job, use the ETL-DPU-Hour filter. For example, for the US West (Oregon) Region, apply USW2-ETL-DPU-Hour.
    For a flex job, use the ETL-Flex-DPU-Hour filter. For example, apply USW2-ETL-Flex-DPU-Hour.
    For an interactive session, use the GlueInteractiveSession-DPU-Hour filter. For example, apply USW2-GlueInteractiveSession-DPU-Hour.
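
The same filter can also be applied programmatically. The following minimal sketch builds the usage type string from a Region prefix and a job type, then assembles the request parameters for the Cost Explorer GetCostAndUsage operation. The helper names usage_type and cost_query are illustrative, and the default service value "AWS Glue" is an assumption that you should confirm against the values shown in your own account.

```python
from datetime import date

# Usage type suffixes named in the steps above.
USAGE_TYPE_SUFFIX = {
    "standard": "ETL-DPU-Hour",
    "flex": "ETL-Flex-DPU-Hour",
    "interactive": "GlueInteractiveSession-DPU-Hour",
}


def usage_type(region_prefix: str, job_type: str) -> str:
    """Join a billing Region prefix (for example "USW2") with the job type suffix."""
    return f"{region_prefix}-{USAGE_TYPE_SUFFIX[job_type]}"


def cost_query(region_prefix: str, job_type: str, start: date, end: date,
               service: str = "AWS Glue") -> dict:
    """Build keyword arguments for a Cost Explorer GetCostAndUsage request.

    The default service value is an assumption; check it in your account.
    The end date is exclusive, as in the Cost Explorer API.
    """
    return {
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost", "UsageQuantity"],
        "Filter": {
            "And": [
                {"Dimensions": {"Key": "SERVICE", "Values": [service]}},
                {"Dimensions": {"Key": "USAGE_TYPE",
                                "Values": [usage_type(region_prefix, job_type)]}},
            ]
        },
    }


# Example: flex jobs in US West (Oregon) for a hypothetical one-month window.
params = cost_query("USW2", "flex", date(2024, 1, 1), date(2024, 2, 1))
print(params["Filter"]["And"][1]["Dimensions"]["Values"])
```

If boto3 is installed and credentials are configured, you can pass the result to boto3.client("ce").get_cost_and_usage(**params). The sketch keeps the request construction separate from the API call so that you can inspect the filter before you send it.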

Get the usage and cost for a specific job

To get the cost for a specific AWS Glue job, complete the following steps:

  1. Open the AWS Glue console.
  2. Under ETL Jobs, choose Job run monitoring.
  3. Find the DPU hours that you used for the job.
  4. On the AWS Glue pricing page, on the ETL jobs and interactive sessions tab, select your Region.
  5. Note the price for each DPU hour for your job type.
  6. To calculate the cost, multiply your DPU hours by the price for each DPU hour.
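
The calculation in the final step can be sketched as follows. The function name job_cost is illustrative, and both the DPU hours and the price in the example are placeholder values, not actual figures: read your DPU hours from Job run monitoring and the price from the AWS Glue pricing page for your Region and job type.

```python
from decimal import Decimal, ROUND_HALF_UP


def job_cost(dpu_hours: Decimal, price_per_dpu_hour: Decimal) -> Decimal:
    """Multiply DPU hours by the price for each DPU hour, rounded to cents."""
    total = dpu_hours * price_per_dpu_hour
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Placeholder inputs: 5 DPU hours at a hypothetical price of 0.44 for each DPU hour.
print(job_cost(Decimal("5"), Decimal("0.44")))
```

Decimal is used instead of float so that the currency arithmetic does not accumulate binary rounding error.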

To get AWS Glue job metrics for memory usage, CPU usage, or data traffic, set up an Amazon CloudWatch alarm.
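
As a rough sketch of such an alarm, the following function assembles parameters for the CloudWatch PutMetricAlarm operation. It assumes that job metrics are turned on for the job and that they are published in the Glue namespace with JobName, JobRunId, and Type dimensions. The metric name glue.driver.jvm.heap.usage, the threshold, the period, and the helper name glue_alarm_params are illustrative assumptions, so verify them against the metrics that appear in your account.

```python
def glue_alarm_params(job_name: str, threshold: float = 0.8,
                      metric_name: str = "glue.driver.jvm.heap.usage") -> dict:
    """Build keyword arguments for a CloudWatch PutMetricAlarm request.

    Namespace, dimensions, and metric name are assumptions about how
    AWS Glue job metrics are published; confirm them in CloudWatch.
    """
    return {
        "AlarmName": f"{job_name}-{metric_name}-high",
        "Namespace": "Glue",
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "JobName", "Value": job_name},
            {"Name": "JobRunId", "Value": "ALL"},
            {"Name": "Type", "Value": "gauge"},
        ],
        "Statistic": "Average",
        "Period": 300,               # seconds in each evaluation window
        "EvaluationPeriods": 1,
        "Threshold": threshold,      # alarm when the average exceeds this value
        "ComparisonOperator": "GreaterThanThreshold",
    }


# "nightly-etl" is a hypothetical job name.
params = glue_alarm_params("nightly-etl")
print(params["AlarmName"])
```

With boto3 installed and credentials configured, you can create the alarm with boto3.client("cloudwatch").put_metric_alarm(**params). To be notified when the alarm fires, add an AlarmActions entry that contains the ARN of an Amazon SNS topic.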

To get notifications about your AWS Glue job, see How do I receive Amazon SNS notifications when my AWS Glue job changes states?

Optimize cost

To optimize costs for Spark usage in AWS Glue jobs, see Monitor and optimize cost on AWS Glue for Apache Spark under Related information.

Related information

Monitor and optimize cost on AWS Glue for Apache Spark

AWS OFFICIAL
Updated a month ago