How do I troubleshoot missing Amazon EMR Serverless logs on Amazon S3 or CloudWatch?
I want to troubleshoot missing Amazon EMR Serverless logs on Amazon Simple Storage Service (Amazon S3) or Amazon CloudWatch.
Resolution
Before you begin, complete the following prerequisites for CloudWatch and Amazon S3.
CloudWatch prerequisites
Complete the following steps:
- (Optional) Create a CloudWatch log group where you can save the logs. By default, Amazon EMR Serverless automatically creates the /aws/emr-serverless log group when you turn on CloudWatch logging and include the logs:CreateLogGroup permission in the runtime role.
- Create a policy that includes the required permissions for CloudWatch logging. Then, attach the policy to the Amazon EMR Serverless runtime role that the application uses. For a cross-account CloudWatch setup, see New - Amazon CloudWatch cross-account observability.
- To turn on CloudWatch logging for Amazon EMR Serverless jobs, use the following configuration:
{ "monitoringConfiguration": { "cloudWatchLoggingConfiguration": { "enabled": true } } }
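The policy step above can be sketched as a small generator for the policy document. This is a minimal sketch, not an official policy: the function name cloudwatch_logging_policy, the account ID 111122223333, and the exact action list are assumptions for illustration. Only logs:CreateLogGroup and logs:DescribeLogGroups are named in this article, so check the remaining actions against the EMR Serverless logging documentation before you attach the policy.

```python
import json


def cloudwatch_logging_policy(region: str, account_id: str, log_group: str) -> dict:
    """Build an IAM policy document for EMR Serverless CloudWatch logging.

    Hypothetical helper: the action list is an assumption, not taken
    verbatim from this article.
    """
    account_arn = f"arn:aws:logs:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # DescribeLogGroups is scoped to the account and Region,
                # not to a single log group.
                "Effect": "Allow",
                "Action": ["logs:DescribeLogGroups"],
                "Resource": [f"{account_arn}:*"],
            },
            {
                # Write access limited to the one log group used for job logs.
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:DescribeLogStreams",
                    "logs:PutLogEvents",
                ],
                "Resource": [f"{account_arn}:log-group:{log_group}:*"],
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder Region, account ID, and log group name.
    policy = cloudwatch_logging_policy("us-east-1", "111122223333", "/aws/emr-serverless")
    print(json.dumps(policy, indent=2))
```

Save the printed JSON to a file and attach it to the runtime role with the IAM console or the AWS CLI.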
Amazon S3 prerequisites
Complete the following steps:
- Create an AWS Identity and Access Management (IAM) policy that allows s3:PutObject on the Amazon S3 bucket prefix to use for logging. Then, attach the policy to the Amazon EMR Serverless runtime role. For more information, see Logging for EMR Serverless with Amazon S3 buckets. If the Amazon S3 bucket is located in another account, then configure cross-account access.
- Submit the Amazon EMR Serverless job with the Amazon S3 logging prefix set in monitoringConfiguration:
Note: Replace example-logging-bucket with your logging bucket.
{ "monitoringConfiguration": { "s3MonitoringConfiguration": { "logUri": "s3://example-logging-bucket/logs/" } } }
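To keep the IAM policy and the logUri in sync, the s3:PutObject policy can be derived from the same URI that the job uses. The following is a minimal sketch under that assumption. The helper name s3_logging_policy is hypothetical, and the policy grants only the s3:PutObject action that this article names.

```python
import json
from urllib.parse import urlparse


def s3_logging_policy(log_uri: str) -> dict:
    """Build an IAM policy that allows s3:PutObject under the logUri prefix."""
    parsed = urlparse(log_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Expected an s3://bucket/prefix URI, got: {log_uri!r}")

    # Normalize the prefix so that the resource ARN ends with "<prefix>/*".
    prefix = parsed.path.lstrip("/")
    if prefix and not prefix.endswith("/"):
        prefix += "/"

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{parsed.netloc}/{prefix}*"],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(s3_logging_policy("s3://example-logging-bucket/logs/"), indent=2))
```

A mismatch between the prefix in the policy and the prefix in logUri is one way to end up with no logs in the bucket, so generating both from one value removes that source of error.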
Troubleshoot missing job logs
Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshoot AWS CLI errors. Also, make sure that you're using the most recent AWS CLI version.
When you use CloudWatch for log capture, Amazon EMR Serverless pushes only driver logs to the log group by default. To also push worker (SPARK_EXECUTOR or TEZ_TASK) logs, specify the worker type in your logTypes configuration:
{ "monitoringConfiguration": { "cloudWatchLoggingConfiguration": { "enabled": true, "logGroupName": "/aws/emr", "logTypes": { "SPARK_DRIVER": ["stdout", "stderr"], "SPARK_EXECUTOR": ["stdout", "stderr"] } } } }
Note: The enabled field is required.
The following are the supported worker types for Hive and Spark that you can specify for the logTypes configuration:
- SPARK_DRIVER: STDERR and STDOUT
- SPARK_EXECUTOR: STDERR and STDOUT
- HIVE_DRIVER: STDERR, STDOUT, HIVE_LOG, and TEZ_AM
- TEZ_TASK: STDERR, STDOUT, and SYSTEM_LOGS
Note: You can't use the Amazon EMR Serverless console to specify the worker logTypes configuration. To specify the worker logTypes configuration, use the AWS CLI, API, or AWS CloudFormation.
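Because logTypes must be set outside the console, it can help to build and check the configuration in code before you submit the job. The following is a minimal sketch that validates worker and log type names against the list above. The helper name build_cloudwatch_monitoring is hypothetical, and the case-insensitive comparison is an assumption, made because this article shows the log type names in both lowercase and uppercase.

```python
import json

# Supported worker types and log types, as listed in this article.
SUPPORTED_LOG_TYPES = {
    "SPARK_DRIVER": {"STDERR", "STDOUT"},
    "SPARK_EXECUTOR": {"STDERR", "STDOUT"},
    "HIVE_DRIVER": {"STDERR", "STDOUT", "HIVE_LOG", "TEZ_AM"},
    "TEZ_TASK": {"STDERR", "STDOUT", "SYSTEM_LOGS"},
}


def build_cloudwatch_monitoring(log_group_name: str, log_types: dict) -> dict:
    """Return a monitoringConfiguration dict, rejecting unknown names."""
    for worker, names in log_types.items():
        allowed = SUPPORTED_LOG_TYPES.get(worker)
        if allowed is None:
            raise ValueError(f"Unsupported worker type: {worker}")
        # Assumption: names are compared case-insensitively.
        unknown = {name.upper() for name in names} - allowed
        if unknown:
            raise ValueError(f"Unsupported log types for {worker}: {sorted(unknown)}")

    return {
        "monitoringConfiguration": {
            "cloudWatchLoggingConfiguration": {
                "enabled": True,
                "logGroupName": log_group_name,
                "logTypes": {worker: list(names) for worker, names in log_types.items()},
            }
        }
    }


if __name__ == "__main__":
    config = build_cloudwatch_monitoring(
        "/aws/emr",
        {"SPARK_DRIVER": ["stdout", "stderr"], "SPARK_EXECUTOR": ["stdout", "stderr"]},
    )
    print(json.dumps(config))
```

The printed JSON can be passed as the configuration overrides when you start the job run with the AWS CLI or API.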
To troubleshoot missing job logs, complete the following steps:
- To confirm that the correct log type and resource name are referenced in the Amazon EMR Serverless job log settings, run the following get-job-run command:
Note: Replace example-application-id with the application ID and example-job-id with the job ID.
aws emr-serverless get-job-run --application-id example-application-id --job-run-id example-job-id
- On the Amazon EMR Studio console, view the Status details from the Job runs tab on an application's Details page. When a log push fails because of permission issues, the status shows detailed information related to the missing permission. Make sure that the log type is configured correctly.
- Run the get-job-run command to view the stateDetails. The following stateDetails example shows a runtime role that has insufficient permissions and gives further information:
"stateDetails": "Unable to push logs, please ensure logging destination is valid and execution role has sufficient permissions. Error: \"An error occurred (AccessDeniedException) when calling the DescribeLogGroups operation: User: arn:aws:sts::account ID:assumed-role/AmazonEMR-ExecutionRole/application ID,job ID is not authorized to perform: logs:DescribeLogGroups on resource: arn:aws:logs:us-east-1:xxxxx:log-group::log-stream: because no identity-based policy allows the logs:DescribeLogGroups action\"."
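The checks in the steps above can be partly automated by reading the get-job-run output. The following is a minimal sketch that pulls the monitoring settings and any denied IAM action out of the response. The helper name summarize_job_run is hypothetical, and the regular expression assumes that the error text follows the "is not authorized to perform" wording shown in the example above.

```python
import re

# Matches the IAM action and, if present, the resource in an access-denied message.
_DENIED = re.compile(
    r"is not authorized to perform: (?P<action>[\w-]+:[\w*]+)"
    r"(?: on resource: (?P<resource>\S+))?"
)


def summarize_job_run(response: dict) -> dict:
    """Summarize logging settings and permission errors from get-job-run output."""
    job_run = response.get("jobRun", {})
    monitoring = job_run.get("configurationOverrides", {}).get("monitoringConfiguration", {})
    match = _DENIED.search(job_run.get("stateDetails", ""))
    return {
        "state": job_run.get("state"),
        "monitoringConfiguration": monitoring,
        "deniedAction": match.group("action") if match else None,
        "deniedResource": match.group("resource") if match else None,
    }


if __name__ == "__main__":
    # Shortened sample response; real output contains many more fields.
    sample = {
        "jobRun": {
            "state": "SUCCESS",
            "stateDetails": (
                "Unable to push logs ... is not authorized to perform: "
                "logs:DescribeLogGroups on resource: "
                "arn:aws:logs:us-east-1:xxxxx:log-group::log-stream: because ..."
            ),
            "configurationOverrides": {
                "monitoringConfiguration": {
                    "cloudWatchLoggingConfiguration": {"enabled": True}
                }
            },
        }
    }
    print(summarize_job_run(sample))
```

To use it on real output, redirect the get-job-run command to a file, load the file with json.load, and pass the result to summarize_job_run. A non-empty deniedAction value names the permission to add to the runtime role.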
If the StartJobRun API request fails request validation, then job logs aren't captured in the following locations:
- Amazon S3 prefix
- CloudWatch log group
For more information, see Storing logs.