Need a metric in AWS Glue for Glue failed jobs

What is the CloudWatch metric for failed Glue jobs/runs?

Simply put, I need a metric for failed Glue jobs/runs.

The GlueExceptionAnalysisListener seems to be the only thing capturing failed Glue jobs/runs, and job/run failures are still not simple to find within CloudWatch, let alone exposed by Glue as a metric.

I am looking for something like this:

glue.X.executor.failedjobs, glue.X.executor.failedruns, glue.X.executor.completedjobs, and glue.X.executor.completedruns

Ultimately, I am looking to pipe this into a third-party platform observability tool.

I need something that is in line with the existing metrics listed under Glue Metrics.

2 Answers
Accepted Answer
Phil11
answered 6 months ago
EXPERT
reviewed a month ago

Normally you don't expect jobs to fail regularly; instead, you raise an alarm when a job fails.
If you want to create that metric, you could have an EventBridge rule trigger a Lambda function when a job run ends, and have that function update a custom metric depending on the outcome.
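
A minimal sketch of such a Lambda function, assuming it is the target of an EventBridge rule matching "Glue Job State Change" events. The namespace `Custom/Glue`, the metric names `FailedRuns` and `CompletedRuns`, and the choice to count `TIMEOUT` as a failure are all assumptions made for illustration; Glue does not define them.

```python
# Hypothetical names chosen for this sketch; none of them are predefined by Glue.
NAMESPACE = "Custom/Glue"
FAILURE_STATES = {"FAILED", "TIMEOUT"}  # counting TIMEOUT as a failure is an assumption


def build_metric_data(event):
    """Turn a 'Glue Job State Change' event into CloudWatch MetricData entries."""
    detail = event.get("detail", {})
    state = detail.get("state")
    job_name = detail.get("jobName", "unknown")

    if state in FAILURE_STATES:
        metric_name = "FailedRuns"
    elif state == "SUCCEEDED":
        metric_name = "CompletedRuns"
    else:
        return []  # other states (e.g. STOPPED) are not counted in this sketch

    return [{
        "MetricName": metric_name,
        "Dimensions": [{"Name": "JobName", "Value": job_name}],
        "Value": 1,
        "Unit": "Count",
    }]


def lambda_handler(event, context):
    """Lambda entry point: publish one data point per finished job run."""
    metric_data = build_metric_data(event)
    if metric_data:
        # Imported lazily so build_metric_data can be exercised without AWS access.
        import boto3
        boto3.client("cloudwatch").put_metric_data(
            Namespace=NAMESPACE, MetricData=metric_data
        )
    return {"published": len(metric_data)}
```

The function's execution role would need permission for `cloudwatch:PutMetricData`. Once the data points exist in CloudWatch, a third-party observability tool can usually ingest them the same way it ingests the built-in Glue metrics.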

AWS
EXPERT
answered 6 months ago
  • Interesting. Are there any other methods of accomplishing this?

    Again, we are trying to get this into our third-party platform observability tool as a metric, and it would be used as an emergency-type metric to wake up the troops to look into the issue. I understand it is extremely rare that jobs/runs fail, but that is even more reason we would like the metric.

  • An EventBridge rule is more timely and actionable than any metric, but if you want to do something more complex (for example, alert if it fails x times over period y), you could use that metric. For that, you would have to build it yourself from the rule action (e.g. calling a Lambda function).
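
The two pieces mentioned in this comment can be sketched roughly as follows: an EventBridge event pattern that matches Glue job runs reaching a terminal state, and a CloudWatch alarm definition for "fails x times over period y". This assumes a custom metric named `FailedRuns` in a namespace `Custom/Glue` with a `JobName` dimension is already being published by the rule's target; those names, the rule name, and the alarm name are hypothetical.

```python
import json


def glue_state_change_pattern(states=("SUCCEEDED", "FAILED", "TIMEOUT", "STOPPED")):
    """Event pattern (as a JSON string) for Glue job runs reaching the given states."""
    return json.dumps({
        "source": ["aws.glue"],
        "detail-type": ["Glue Job State Change"],
        "detail": {"state": list(states)},
    })


def failure_alarm_kwargs(job_name, threshold_x, period_seconds_y):
    """Arguments for put_metric_alarm: alarm when failures >= x within one period y."""
    return {
        "AlarmName": f"glue-failed-runs-{job_name}",  # hypothetical alarm name
        "Namespace": "Custom/Glue",                   # hypothetical custom namespace
        "MetricName": "FailedRuns",                   # hypothetical custom metric
        # Dimensions must match the published data points exactly.
        "Dimensions": [{"Name": "JobName", "Value": job_name}],
        "Statistic": "Sum",
        "Period": period_seconds_y,                   # seconds, e.g. 3600
        "EvaluationPeriods": 1,
        "Threshold": threshold_x,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # No data points means no failures, so missing data should not alarm.
        "TreatMissingData": "notBreaching",
    }


def deploy(job_name, threshold_x, period_seconds_y):
    """Create the rule and the alarm. Requires AWS credentials; not run on import."""
    import boto3
    boto3.client("events").put_rule(
        Name="glue-job-state-change",                 # hypothetical rule name
        EventPattern=glue_state_change_pattern(),
        State="ENABLED",
    )
    # The rule still needs a target (put_targets) pointing at the Lambda function,
    # and the function needs a permission allowing EventBridge to invoke it.
    boto3.client("cloudwatch").put_metric_alarm(
        **failure_alarm_kwargs(job_name, threshold_x, period_seconds_y)
    )
```

For example, `failure_alarm_kwargs("nightly-etl", 3, 3600)` describes an alarm that fires when the job fails three or more times within an hour. For the "wake up the troops" case, a threshold of 1 would page on the first failure.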
