Glue glue.driver.aggregate.numFailedTasks not reporting correctly

We are using something similar to the following Lambda function and collecting custom Glue metrics as described in this article: https://medium.com/@ettefette/metrics-for-aws-glue-jobs-as-you-know-them-from-lambda-functions-e5e1873c615c

However, the number of failed tasks shown in the Glue console differs from what CloudWatch metrics report when we look at Count('Glue glue.driver.aggregate.numFailedTasks').
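To make the comparison concrete, here is a minimal sketch of the kind of CloudWatch query involved. It assumes the metric lives in the `Glue` namespace with the `JobName`, `JobRunId` and `Type` dimensions; the helper names `build_failed_tasks_query` and `fetch_failed_tasks`, the 24-hour window and the one-hour period are hypothetical choices for illustration. CloudWatch's `SampleCount` statistic counts data points rather than adding up their values, so the sketch requests `Sum` alongside it to show both numbers side by side.

```python
from datetime import datetime, timedelta, timezone


def build_failed_tasks_query(job_name, job_run_id="ALL", hours=24, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics.

    Hypothetical helper: the window length and period are illustrative.
    """
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "Glue",
        "MetricName": "glue.driver.aggregate.numFailedTasks",
        "Dimensions": [
            {"Name": "JobName", "Value": job_name},
            {"Name": "JobRunId", "Value": job_run_id},
            {"Name": "Type", "Value": "count"},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 3600,  # one data point per hour
        # Request both so the two numbers can be compared directly.
        "Statistics": ["Sum", "SampleCount"],
    }


def fetch_failed_tasks(job_name, region="eu-central-1"):
    """Run the query; needs boto3 and AWS credentials."""
    import boto3  # imported here so the builder works without AWS access

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    response = cloudwatch.get_metric_statistics(
        **build_failed_tasks_query(job_name)
    )
    # Data points come back unordered, so sort them by time.
    return sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
```

Calling `fetch_failed_tasks("my-job")` (with a placeholder job name) returns data points that each carry a `Sum` and a `SampleCount` value, which can then be checked against the console.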

import boto3


def handler(event, context):
    job_name = event["detail"]["jobName"]
    job_run_id = event["detail"]["jobRunId"]

    cloudwatch = boto3.client("cloudwatch", region_name="eu-central-1")

    if event["detail-type"] == "Glue Job State Change":
        job_status = event["detail"]["state"]

        # Only terminal job states are reported.
        if job_status not in ["SUCCEEDED", "FAILED", "TIMEOUT", "STOPPED"]:
            raise AttributeError("Job state is not supported.")

        # 1.0 marks a successful run, 0.0 any other terminal state.
        if job_status == "SUCCEEDED":
            metric_value = 1.0
        else:
            metric_value = 0.0

        # Publish one custom data point per job run.
        cloudwatch.put_metric_data(
            MetricData=[
                {
                    "MetricName": "JobStatus",
                    "Dimensions": [
                        {"Name": "JobName", "Value": job_name},
                        {"Name": "JobRunId", "Value": job_run_id},
                        {"Name": "JobStatus", "Value": job_status},
                    ],
                    "Unit": "None",
                    "Value": metric_value,
                }
            ],
            Namespace="Glue",
        )
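For anyone who wants to check the mapping locally without calling AWS, here is a minimal sketch that builds the same `JobStatus` data point from a sample event. The helper name `build_job_status_metric` is hypothetical, it raises `ValueError` where the handler raises `AttributeError`, and the event is trimmed to the fields the handler reads, with made-up job name and run ID values.

```python
TERMINAL_STATES = ("SUCCEEDED", "FAILED", "TIMEOUT", "STOPPED")


def build_job_status_metric(event):
    """Return the MetricData entry the handler would publish (hypothetical helper)."""
    detail = event["detail"]
    state = detail["state"]
    if state not in TERMINAL_STATES:
        raise ValueError(f"Job state {state!r} is not supported.")
    return {
        "MetricName": "JobStatus",
        "Dimensions": [
            {"Name": "JobName", "Value": detail["jobName"]},
            {"Name": "JobRunId", "Value": detail["jobRunId"]},
            {"Name": "JobStatus", "Value": state},
        ],
        "Unit": "None",
        # 1.0 for a successful run, 0.0 for any other terminal state.
        "Value": 1.0 if state == "SUCCEEDED" else 0.0,
    }


# Trimmed sample event; the job name and run ID are made up.
sample_event = {
    "detail-type": "Glue Job State Change",
    "source": "aws.glue",
    "detail": {
        "jobName": "example-job",
        "jobRunId": "jr_0123456789abcdef",
        "state": "FAILED",
    },
}

metric = build_job_status_metric(sample_event)
print(metric["Value"])  # prints 0.0
```

Note that this data point only records whether the run succeeded; it carries no per-task failure count.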

=======================

Any ideas?

Asked 2 years ago, 46 views
No answers
