
How to push EMR application logs to CloudWatch


We have a Lambda function that creates an EMR cluster on demand. By default, the cluster logs go to the S3 bucket specified in LogUri. We now have a requirement to send the application logs to CloudWatch whenever a cluster (EC2) is started in EMR. How can we achieve this?

Asked 1 year ago, viewed 429 times
1 Answer

To push EMR application logs to CloudWatch when a cluster is started, you need to configure your EMR job to send log information to CloudWatch Logs. Here's how you can achieve this:

  1. Update IAM Policy: First, ensure that the execution role for your EMR job has the necessary permissions to send logs to CloudWatch. The IAM policy should include permissions like logs:CreateLogStream, logs:DescribeLogGroups, logs:DescribeLogStreams, and logs:PutLogEvents.

  2. Configure CloudWatch Logging: In your Lambda function, add a CloudWatch logging configuration to the job submission. This is done by including a cloudWatchMonitoringConfiguration in the monitoringConfiguration section of your StartJobRun request. Note that StartJobRun with this configuration is the EMR on EKS job submission API, so this step applies to jobs submitted to a virtual cluster.

  3. Specify Log Group and Stream: In the CloudWatch configuration, you need to specify a log group name and optionally a log stream name prefix. If the log group doesn't exist, EMR will create it automatically (given the proper permissions).

  4. Update Cluster Configuration: When creating the EMR cluster in your Lambda function, include the CloudWatch logging configuration. This will ensure that when the cluster starts, it will begin sending logs to CloudWatch.
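The steps above can be sketched in Python roughly as follows. This is a minimal sketch, assuming the job is submitted through the EMR on EKS StartJobRun API (boto3 client `emr-containers`). The function names, the job name, the release label default, and the wildcard resource ARN are illustrative placeholders, not values from the question.

```python
# Permissions named in step 1 of the answer.
LOG_ACTIONS = [
    "logs:CreateLogStream",
    "logs:DescribeLogGroups",
    "logs:DescribeLogStreams",
    "logs:PutLogEvents",
]


def build_logs_policy(resource="arn:aws:logs:*:*:*"):
    """Return an IAM policy document for the job execution role.

    The wildcard resource is an assumption; scope it to your log group.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": LOG_ACTIONS, "Resource": [resource]}
        ],
    }


def build_start_job_run_request(virtual_cluster_id, execution_role_arn,
                                entry_point, log_group, stream_prefix,
                                release_label="emr-6.15.0-latest"):
    """Assemble keyword arguments for emr-containers StartJobRun.

    release_label is a placeholder default; use the release you run.
    """
    return {
        "virtualClusterId": virtual_cluster_id,
        "name": "cloudwatch-logging-example",  # hypothetical job name
        "executionRoleArn": execution_role_arn,
        "releaseLabel": release_label,
        "jobDriver": {"sparkSubmitJobDriver": {"entryPoint": entry_point}},
        "configurationOverrides": {
            "monitoringConfiguration": {
                # Steps 2 and 3: log group plus optional stream name prefix.
                "cloudWatchMonitoringConfiguration": {
                    "logGroupName": log_group,
                    "logStreamNamePrefix": stream_prefix,
                }
            }
        },
    }


def submit(request):
    """Send the request; needs boto3 and AWS credentials at call time."""
    import boto3  # imported lazily so the builders work without boto3
    return boto3.client("emr-containers").start_job_run(**request)
```

Inside the Lambda handler, the request would be built once and passed to `submit(...)`. Keeping the builders free of AWS calls makes the configuration easy to inspect or unit test before anything is sent.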

The logs will be organized in CloudWatch as follows:

  • Driver logs: logGroup/logStreamPrefix/virtual-cluster-id/jobs/job-id/containers/spark-application-id/spark-job-id-driver/(stderr/stdout)
  • Executor logs: logGroup/logStreamPrefix/virtual-cluster-id/jobs/job-id/containers/spark-application-id/executor-pod-name/(stderr/stdout)
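To locate a specific stream, the layout above can be turned into small helpers. This is a sketch only: the function names are hypothetical, and it assumes the log stream name starts at logStreamPrefix, with logGroup being the log group itself.

```python
VALID_STREAMS = ("stderr", "stdout")


def job_stream_prefix(prefix, virtual_cluster_id, job_id):
    """Common prefix shared by every log stream of one job run."""
    return f"{prefix}/{virtual_cluster_id}/jobs/{job_id}/"


def driver_log_stream(prefix, virtual_cluster_id, job_id,
                      spark_app_id, spark_job_id, stream="stderr"):
    """Stream name for the Spark driver, following the driver layout."""
    if stream not in VALID_STREAMS:
        raise ValueError(f"stream must be one of {VALID_STREAMS}")
    base = job_stream_prefix(prefix, virtual_cluster_id, job_id)
    return f"{base}containers/{spark_app_id}/{spark_job_id}-driver/{stream}"


def executor_log_stream(prefix, virtual_cluster_id, job_id,
                        spark_app_id, executor_pod_name, stream="stderr"):
    """Stream name for one executor pod, following the executor layout."""
    if stream not in VALID_STREAMS:
        raise ValueError(f"stream must be one of {VALID_STREAMS}")
    base = job_stream_prefix(prefix, virtual_cluster_id, job_id)
    return f"{base}containers/{spark_app_id}/{executor_pod_name}/{stream}"
```

The value returned by `job_stream_prefix` can be used as a stream name prefix filter when listing streams in the log group, which avoids scanning streams that belong to other jobs.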

By implementing these steps, your EMR application logs will be pushed to CloudWatch whenever a cluster is started, allowing you to monitor and troubleshoot your EMR jobs more effectively.
Sources
Configure a job run to use Amazon CloudWatch Logs - Amazon EMR
CloudWatchLoggingConfiguration - Amazon EMR Serverless
Spark Application Logs - Amazon EMR

Answered 1 year ago
AWS
Expert
Reviewed 6 months ago
