Need a metric in AWS Glue for Glue failed jobs

What is the CloudWatch metric for failed Glue jobs/runs?

Simply put, I need a metric for failed Glue jobs/runs.

The GlueExceptionAnalysisListener seems to be the only thing capturing failed Glue jobs/runs, and job/run failures are still not simple to find in CloudWatch, let alone available from Glue as a metric.

I am looking for something like this:

glue.X.executor.failedjobs, glue.X.executor.failedruns, glue.X.executor.completedjobs, and glue.X.executor.completedruns

Ultimately, I am looking to pipe this into a third-party platform observability tool.

I need something that is in line with the following metrics: Glue Metrics

Phil11
asked 7 months ago · 510 views
2 Answers
Accepted answer
Phil11
answered 6 months ago
EXPERT
reviewed a month ago
Normally you don't expect jobs to fail regularly; instead, you alarm when a job fails.
If you want to create that metric, you could use an EventBridge rule to trigger a Lambda function when a job ends and have it update a metric depending on the outcome.
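
As a rough illustration of that approach, here is a minimal Python sketch of such a Lambda function. It assumes an EventBridge rule that matches events with source `aws.glue` and detail-type `Glue Job State Change`. The namespace `Custom/GlueJobs` and the metric names `FailedRuns` and `CompletedRuns` are hypothetical names chosen for this example, not metrics that Glue provides, and counting `TIMEOUT` as a failure is also an assumption.

```python
# Hypothetical custom namespace for this example (not defined by AWS).
NAMESPACE = "Custom/GlueJobs"

# Glue job run states mapped to hypothetical custom metric names.
# Treating TIMEOUT as a failure is an assumption; adjust as needed.
STATE_TO_METRIC = {
    "SUCCEEDED": "CompletedRuns",
    "FAILED": "FailedRuns",
    "TIMEOUT": "FailedRuns",
}


def build_metric_data(event):
    """Turn a 'Glue Job State Change' event into CloudWatch MetricData entries."""
    detail = event.get("detail", {})
    metric_name = STATE_TO_METRIC.get(detail.get("state"))
    if metric_name is None:
        # States we do not track (for example STOPPED) produce no data point.
        return []
    return [{
        "MetricName": metric_name,
        "Dimensions": [{"Name": "JobName", "Value": detail.get("jobName", "unknown")}],
        "Value": 1,
        "Unit": "Count",
    }]


def lambda_handler(event, context):
    """Lambda entry point: publish one data point per finished job run."""
    metric_data = build_metric_data(event)
    if metric_data:
        # Imported here so build_metric_data can be exercised without AWS access.
        import boto3
        boto3.client("cloudwatch").put_metric_data(
            Namespace=NAMESPACE, MetricData=metric_data
        )
    return {"published": len(metric_data)}
```

Once the data points are in CloudWatch, a third-party observability tool that already reads CloudWatch metrics should be able to pick them up like any other metric.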

AWS
EXPERT
answered 7 months ago
  • Interesting. Are there any other methods of accomplishing this?

    Again, we are trying to get this into our third-party platform observability tool as a metric, and it would be used as an emergency type of metric to wake up the troops to look into the issue. I understand it is extremely rare that jobs/runs fail, but that is even more reason we would like the metric.

  • An EventBridge rule is more timely and actionable than any metric, but if you want to do something more complex (for example, alert if a job fails x times over period y), you could use that metric. For that, you would have to build it yourself from the rule action (e.g. by calling a Lambda function).
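
The "fails x times over period y" case could be sketched as a CloudWatch alarm on a custom failure metric. The Python helper below only builds the arguments for the boto3 `put_metric_alarm` call. It assumes a custom metric is already being published by the rule action; the namespace `Custom/GlueJobs`, the metric name `FailedRuns`, the `JobName` dimension, and the alarm naming scheme are all hypothetical, and restricting the period to multiples of 60 seconds is a simplification for this sketch.

```python
def build_failure_alarm(job_name, max_failures, period_seconds, sns_topic_arn):
    """Build put_metric_alarm arguments: alarm when a job fails at least
    max_failures times within one period of period_seconds."""
    if period_seconds <= 0 or period_seconds % 60 != 0:
        raise ValueError("period_seconds must be a positive multiple of 60")
    return {
        "AlarmName": f"glue-{job_name}-failed-runs",  # hypothetical naming scheme
        "Namespace": "Custom/GlueJobs",               # hypothetical custom namespace
        "MetricName": "FailedRuns",                   # hypothetical custom metric
        "Dimensions": [{"Name": "JobName", "Value": job_name}],
        "Statistic": "Sum",             # total failures within the period
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": max_failures,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # No data points means no failures were recorded, so do not alarm.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


# Usage (requires boto3 and AWS credentials; the topic ARN is yours to supply):
# import boto3
# params = build_failure_alarm("my-job", 3, 900, sns_topic_arn)
# boto3.client("cloudwatch").put_metric_alarm(**params)
```

Using `Sum` over a single evaluation period keeps the alarm quiet for an isolated failure and fires only when the count reaches the threshold within the window.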
