Glue glue.driver.aggregate.numFailedTasks not reporting correctly


We are using something similar to the following Lambda function to collect custom Glue metrics, as described in this article: https://medium.com/@ettefette/metrics-for-aws-glue-jobs-as-you-know-them-from-lambda-functions-e5e1873c615c

However, the number of failed tasks shown in the Glue console differs from what CloudWatch reports when we look at Count('Glue glue.driver.aggregate.numFailedTasks').

import boto3


def handler(event, context):
    # Job name and run ID come from the EventBridge event detail.
    job_name = event["detail"]["jobName"]
    job_run_id = event["detail"]["jobRunId"]

    cloudwatch = boto3.client("cloudwatch", region_name="eu-central-1")

    if event["detail-type"] == "Glue Job State Change":
        job_status = event["detail"]["state"]

        if job_status not in ["SUCCEEDED", "FAILED", "TIMEOUT", "STOPPED"]:
            raise AttributeError("Job state is not supported.")

        # Report 1.0 for a successful run and 0.0 for any other final state.
        if job_status == "SUCCEEDED":
            metric_value = 1.0
        else:
            metric_value = 0.0

        # Publish one custom data point per job run.
        cloudwatch.put_metric_data(
            MetricData=[
                {
                    "MetricName": "JobStatus",
                    "Dimensions": [
                        {"Name": "JobName", "Value": job_name},
                        {"Name": "JobRunId", "Value": job_run_id},
                        {"Name": "JobStatus", "Value": job_status},
                    ],
                    "Unit": "None",
                    "Value": metric_value,
                }
            ],
            Namespace="Glue",
        )

=======================

Any ideas?
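
One thing that may be worth checking is which statistic is being compared. In CloudWatch, a count-style statistic (SampleCount) gives the number of data points published, not the number of failed tasks, and to my understanding the Glue documentation lists Sum as the statistic to use for glue.driver.aggregate.numFailedTasks because each data point is a delta from the last reported value. The sketch below requests both statistics so they can be compared side by side. It is only an illustration under assumptions: the helper names (build_failed_tasks_query, total_failed_tasks, fetch_failed_tasks), the 60-second period, and the 24-hour window are hypothetical, and the dimensions (JobName, JobRunId, Type) are the ones I believe Glue publishes for this metric.

```python
from datetime import datetime, timedelta, timezone


def build_failed_tasks_query(job_name, start, end, job_run_id="ALL", period=60):
    """Build keyword arguments for CloudWatch get_metric_statistics.

    Hypothetical helper. The dimension set is an assumption based on how
    Glue job metrics are usually published (JobName, JobRunId, Type).
    """
    return {
        "Namespace": "Glue",
        "MetricName": "glue.driver.aggregate.numFailedTasks",
        "Dimensions": [
            {"Name": "JobName", "Value": job_name},
            {"Name": "JobRunId", "Value": job_run_id},
            {"Name": "Type", "Value": "count"},
        ],
        "StartTime": start,
        "EndTime": end,
        "Period": period,
        # Sum adds up the reported deltas; SampleCount only counts data points.
        "Statistics": ["Sum", "SampleCount"],
    }


def total_failed_tasks(datapoints):
    """Add up the Sum statistic across all returned data points."""
    return sum(dp["Sum"] for dp in datapoints)


def fetch_failed_tasks(job_name, hours=24, region_name="eu-central-1"):
    """Query CloudWatch for the last `hours` hours (requires AWS credentials)."""
    import boto3  # imported here so the helpers above work without boto3

    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    cloudwatch = boto3.client("cloudwatch", region_name=region_name)
    response = cloudwatch.get_metric_statistics(
        **build_failed_tasks_query(job_name, start, end)
    )
    return total_failed_tasks(response["Datapoints"])
```

Calling fetch_failed_tasks("my-job") with a real job name in place of the placeholder should return the summed value, which can then be compared with the number shown in the Glue console for the same time window.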

Asked 2 years ago, viewed 46 times
No answers
