Why doesn't my AWS Glue job write logs to Amazon CloudWatch?

My AWS Glue extract, transform, and load (ETL) job doesn't write logs to Amazon CloudWatch.

Short description

If your AWS Glue jobs don't write logs to CloudWatch, then confirm the following:

  • Your AWS Glue job has all the required AWS Identity and Access Management (IAM) permissions.
  • The AWS Key Management Service (AWS KMS) key allows CloudWatch Logs to use the key.
  • You're checking the correct CloudWatch log group for the job's logs.
  • The logs:AssociateKmsKey IAM permission is attached to the AWS Glue role.
  • If you don't use continuous logging for your AWS Glue Spark ETL job, then check if the job failed before log aggregation.

Resolution

The AWS Glue job role doesn't have permissions to create and write to the CloudWatch log group

If you don't use the AWSGlueServiceRole managed policy, then confirm that the IAM role that's attached to the ETL job has the correct permissions. The following permissions are required to use CloudWatch. If the job uses a custom log group, then the IAM policy must provide access to the custom log group:

{
    "Effect": "Allow",
    "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
    ],
    "Resource": [
        "arn:aws:logs:*:*:*:/aws-glue/*",
        "arn:aws:logs:*:*:*:/customlogs/*"
    ]
}

Note: Replace arn:aws:logs:*:*:*:/customlogs/* with the ARN of the custom log group.
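To illustrate how the statement above fits into a complete policy, the following minimal sketch wraps it in a policy document and prints the JSON. The helper names build_glue_logs_statement and build_policy_document are hypothetical, and the custom log group ARN is the placeholder from the statement above, not a real resource.

```python
import json


def build_glue_logs_statement(custom_log_group_arns=()):
    """Return an IAM statement that allows creating and writing CloudWatch logs."""
    # Default AWS Glue log groups, as shown in the statement above.
    resources = ["arn:aws:logs:*:*:*:/aws-glue/*"]
    # Caller-supplied ARNs for custom log groups (placeholders in this sketch).
    resources.extend(custom_log_group_arns)
    return {
        "Effect": "Allow",
        "Action": [
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:PutLogEvents",
        ],
        "Resource": resources,
    }


def build_policy_document(statement):
    """Wrap a single statement in a standard IAM policy document."""
    return {"Version": "2012-10-17", "Statement": [statement]}


statement = build_glue_logs_statement(["arn:aws:logs:*:*:*:/customlogs/*"])
policy = build_policy_document(statement)
print(json.dumps(policy, indent=4))
```

You can paste the printed JSON into an inline policy on the job role after you replace the placeholder ARN.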

The AWS KMS key doesn't grant permission to CloudWatch Logs

If you use security configurations with your AWS Glue job, then check your AWS KMS key. The key policy of the AWS KMS key that's attached to the security configuration must allow CloudWatch Logs to use the key. Add the following statement to the AWS KMS key policy:

{
    "Effect": "Allow",
    "Principal": {
        "Service": "logs.region.amazonaws.com"
    },
    "Action": [
        "kms:Encrypt*",
        "kms:Decrypt*",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:Describe*"
    ],
    "Resource": "*",
    "Condition": {
        "ArnEquals": {
            "kms:EncryptionContext:aws:logs:arn": "arn:aws:logs:us-west-2:1111222233334444:log-group:log-group-name"
        }
    }
}

Note: Replace us-west-2 with your AWS Region, 1111222233334444 with your AWS account ID, and log-group-name with the name of your log group. Also replace region in logs.region.amazonaws.com with your AWS Region.
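Because the Region appears in both the service principal and the log group ARN, it's easy to update one and miss the other. The following sketch fills in both from the same inputs. The function name build_logs_kms_statement is hypothetical, and the example values are the placeholders from the statement above.

```python
import json


def build_logs_kms_statement(region, account_id, log_group_name):
    """Return a KMS key policy statement that lets CloudWatch Logs use the key."""
    # Log group ARN used in the encryption context condition.
    log_group_arn = (
        f"arn:aws:logs:{region}:{account_id}:log-group:{log_group_name}"
    )
    return {
        "Effect": "Allow",
        # The CloudWatch Logs service principal is Region-specific.
        "Principal": {"Service": f"logs.{region}.amazonaws.com"},
        "Action": [
            "kms:Encrypt*",
            "kms:Decrypt*",
            "kms:ReEncrypt*",
            "kms:GenerateDataKey*",
            "kms:Describe*",
        ],
        "Resource": "*",
        "Condition": {
            "ArnEquals": {"kms:EncryptionContext:aws:logs:arn": log_group_arn}
        },
    }


# Placeholder values copied from the example statement; replace with your own.
kms_statement = build_logs_kms_statement(
    "us-west-2", "1111222233334444", "log-group-name"
)
print(json.dumps(kms_statement, indent=4))
```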

For more information, see Encrypt log data in CloudWatch Logs using AWS Key Management Service.

Also, confirm that the logs:AssociateKmsKey IAM permission is attached to the AWS Glue role. For more information, see Security configuration with continuous logging.

Continuous logging isn't turned on

If you don't turn on continuous logging for your AWS Glue Spark ETL job, then log aggregation happens after the job is completed. If the job fails before log aggregation, then the logs might not get pushed to CloudWatch.

Turn on continuous logging for your AWS Glue jobs so that logs are streamed to CloudWatch while the job runs and remain available even if the application fails.
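One way to turn on continuous logging is through job parameters. The following sketch builds a dictionary of the special parameters --enable-continuous-cloudwatch-log, --enable-continuous-log-filter, and --continuous-log-logGroup. The helper name continuous_logging_arguments and the /customlogs/my-job log group are hypothetical. Confirm the parameter behavior for your AWS Glue version before you rely on it.

```python
def continuous_logging_arguments(custom_log_group=None, enable_filter=True):
    """Return job arguments that turn on continuous logging for a Glue job."""
    args = {
        # Streams driver and executor logs to CloudWatch while the job runs.
        "--enable-continuous-cloudwatch-log": "true",
        # The standard filter prunes non-essential Spark and Hadoop messages.
        "--enable-continuous-log-filter": "true" if enable_filter else "false",
    }
    if custom_log_group:
        # Optional: send logs to a custom log group instead of the default.
        args["--continuous-log-logGroup"] = custom_log_group
    return args


# Hypothetical custom log group name, for illustration only.
job_args = continuous_logging_arguments(custom_log_group="/customlogs/my-job")
for key, value in sorted(job_args.items()):
    print(f"{key} = {value}")
```

You could merge the resulting dictionary into the job's default arguments in the AWS Glue console or through the API.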

You aren't looking for the logs in the correct log group

If you turn on continuous logging and use the default log groups, then you can find CloudWatch logs in the following locations:

  • Custom messages, such as those from print statements, are pushed to the /aws-glue/jobs/output log group.

  • The messages from the AWS Glue logger are pushed to the driver logs under /aws-glue/jobs/logs-v2.

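    from awsglue.context import GlueContext
    from pyspark.context import SparkContext
    glueContext = GlueContext(SparkContext.getOrCreate())
    logger = glueContext.get_logger()
    logger.info("MY INFO LOGGER MESSAGE")
    logger.error("MY ERROR LOGGER MESSAGE")
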
  • The messages from a Python logger are pushed to the driver logs under /aws-glue/jobs/output.

    import logging
    MSG_FORMAT = '%(asctime)s %(levelname)s %(name)s: %(message)s'
    DATETIME_FORMAT = '%Y-%m-%d %H:%M:%S'
    logging.basicConfig(format=MSG_FORMAT, datefmt=DATETIME_FORMAT)
    logger2 = logging.getLogger("logger2")
    logger2.setLevel(logging.INFO)
    logger2.info("Test log message from python logging")

  • Jobs that use security configurations push custom messages from the AWS Glue logger to /aws-glue/jobs/logs-v2-testconfig. Replace testconfig with the name of the security configuration.

  • Jobs that use security configurations push custom messages from a Python logger to /aws-glue/jobs/testconfig-role/job-role/output. Replace testconfig with the name of the security configuration and job-role with the AWS Glue job role.

If you turn on continuous logging and use custom log groups, then you can find CloudWatch logs in the following location:

  • The custom log messages, driver logs, and executor logs are stored under the custom log group.

If you don't turn on continuous logging, then you can find CloudWatch logs in the following locations:

  • Messages, such as print statement outputs and Python logging messages, are stored under /aws-glue/jobs/output.
  • All custom messages from the AWS Glue logger are stored under /aws-glue/jobs/error.

For more information, see Logging behavior.
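The locations listed above can be summarized in a small lookup helper. This is only a sketch of the rules as this article describes them. The function name expected_log_groups is hypothetical. The precedence of a custom log group over a security configuration is an assumption, and so is ignoring the security configuration when continuous logging is off, because the article doesn't cover those combinations.

```python
def expected_log_groups(continuous_logging, security_config=None,
                        job_role=None, custom_log_group=None):
    """Return the log groups to check for Glue logger and Python logger output."""
    if not continuous_logging:
        # Assumption: a security configuration is ignored in this case.
        return {
            "python_logger": "/aws-glue/jobs/output",
            "glue_logger": "/aws-glue/jobs/error",
        }
    if custom_log_group:
        # Assumption: a custom log group takes precedence over other settings.
        return {
            "python_logger": custom_log_group,
            "glue_logger": custom_log_group,
        }
    if security_config:
        if not job_role:
            raise ValueError("job_role is required with a security configuration")
        return {
            "python_logger": f"/aws-glue/jobs/{security_config}-role/{job_role}/output",
            "glue_logger": f"/aws-glue/jobs/logs-v2-{security_config}",
        }
    return {
        "python_logger": "/aws-glue/jobs/output",
        "glue_logger": "/aws-glue/jobs/logs-v2",
    }


# Example with the placeholder names used in this article.
groups = expected_log_groups(True, security_config="testconfig", job_role="job-role")
print(groups["glue_logger"])
print(groups["python_logger"])
```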

Related information

Logging and monitoring in AWS Glue

Monitoring AWS Glue

AWS OFFICIAL
Updated 7 months ago
2 Comments

In my case, I had to use the GlueContext logger. Neither my print statements nor the built-in logging module wrote any logs to CloudWatch.

glueContext = GlueContext(sc)
logger = glueContext.get_logger()
logger.info("MY LOG MESSAGE")
replied 9 months ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied 9 months ago