
How to push EMR application logs to CloudWatch


We have a Lambda function that creates a cluster on demand. By default, the cluster logs go to the S3 bucket specified in LogUri. We now have a requirement to get application logs into CloudWatch whenever a cluster (EC2) is started in EMR. How can we achieve this?

1 Answer

To push EMR application logs to CloudWatch when a cluster is started, you need to configure your EMR job to send log information to CloudWatch Logs. Here's how you can achieve this:

  1. Update IAM Policy: First, ensure that the execution role for your EMR job has the necessary permissions to send logs to CloudWatch. The IAM policy should include permissions like logs:CreateLogStream, logs:DescribeLogGroups, logs:DescribeLogStreams, and logs:PutLogEvents.

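  2. Configure CloudWatch Logging: In your Lambda function that creates the EMR cluster, you need to add CloudWatch logging configuration. This is done by including a cloudWatchMonitoringConfiguration in the monitoringConfiguration section of your StartJobRun request. Note that these field names belong to the StartJobRun API used to submit EMR on EKS job runs, so this step applies when the job is submitted that way.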

  3. Specify Log Group and Stream: In the CloudWatch configuration, you need to specify a log group name and optionally a log stream name prefix. If the log group doesn't exist, EMR will create it automatically (given the proper permissions).

  4. Update Cluster Configuration: When creating the EMR cluster in your Lambda function, include the CloudWatch logging configuration. This will ensure that when the cluster starts, it will begin sending logs to CloudWatch.

The logs will be organized in CloudWatch as follows:

  • Driver logs: logGroup/logStreamPrefix/virtual-cluster-id/jobs/job-id/containers/spark-application-id/spark-job-id-driver/(stderr/stdout)
  • Executor logs: logGroup/logStreamPrefix/virtual-cluster-id/jobs/job-id/containers/spark-application-id/executor-pod-name/(stderr/stdout)
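To make the layout above concrete, here is a small sketch that assembles the log stream names for a given job run. It only does string formatting based on the two patterns listed; the function names and the example IDs in the usage note are hypothetical, and the log group itself is not part of the returned stream name.

```python
def driver_log_stream(
    log_stream_prefix: str,
    virtual_cluster_id: str,
    job_id: str,
    spark_application_id: str,
    spark_job_id: str,
    stream: str = "stderr",
) -> str:
    """Return the driver log stream name following the pattern listed above."""
    if stream not in ("stderr", "stdout"):
        raise ValueError("stream must be 'stderr' or 'stdout'")
    return (
        f"{log_stream_prefix}/{virtual_cluster_id}/jobs/{job_id}"
        f"/containers/{spark_application_id}/{spark_job_id}-driver/{stream}"
    )


def executor_log_stream(
    log_stream_prefix: str,
    virtual_cluster_id: str,
    job_id: str,
    spark_application_id: str,
    executor_pod_name: str,
    stream: str = "stderr",
) -> str:
    """Return the executor log stream name following the pattern listed above."""
    if stream not in ("stderr", "stdout"):
        raise ValueError("stream must be 'stderr' or 'stdout'")
    return (
        f"{log_stream_prefix}/{virtual_cluster_id}/jobs/{job_id}"
        f"/containers/{spark_application_id}/{executor_pod_name}/{stream}"
    )
```

For example, `driver_log_stream("jobs", "vc-1", "job-1", "spark-app-1", "spark-job-1", "stdout")` yields a name you can look up under your log group in the CloudWatch console, or use as a prefix when filtering log streams.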

By implementing these steps, your EMR application logs will be pushed to CloudWatch whenever a cluster is started, allowing you to monitor and troubleshoot your EMR jobs more effectively.
Sources
Configure a job run to use Amazon CloudWatch Logs - Amazon EMR
CloudWatchLoggingConfiguration - Amazon EMR Serverless
Spark Application Logs - Amazon EMR

answered a year ago
AWS
EXPERT
reviewed 5 months ago
