Why does my Amazon EMR application fail with an HTTP 403 "Access Denied" AmazonS3Exception?

When I submit an application to an Amazon EMR cluster, the application fails with an HTTP 403 "Access Denied" AmazonS3Exception.

Resolution

If you don't correctly configure permissions, then you might get an "Access Denied" error on Amazon EMR or Amazon Simple Storage Service (Amazon S3).

Example error message:

java.io.IOException: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 8B28722038047BAA; S3 Extended Request ID: puwS77OKgMrvjd30/EY4CWlC/AuhOOSNsxfI8xQJXMd20c7sCq4ljjVKsX4AwS7iuo92C9m+GWY=), S3 Extended Request ID: puwS77OKgMrvjd30/EY4CWlC/AuhOOSNsxfI8xQJXMd20c7sCq4ljjVKsX4AwS7iuo92C9m+GWY=

Check the credentials or IAM role that's specified in your application code

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.

Run the ls command on the Amazon EMR cluster's primary node:

aws s3 ls s3://doc-example-bucket/abc/

Note: Replace s3://doc-example-bucket/abc/ with your Amazon S3 path.

If the previous command succeeds, then the cluster's default role can access the path. In that case, the credentials or AWS Identity and Access Management (IAM) role that's specified in your application code is causing the Access Denied error.

To resolve this issue, complete the following steps:

  1. Confirm that your application uses the expected credentials or assumes the expected IAM role.
  2. To verify that the role has permissions to the Amazon S3 path, use the AWS CLI to assume the IAM role. Then, perform a sample request to the S3 path.
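Step 2 can be sketched as the following shell function. This is a minimal illustration, not an official procedure: the function name check_role_s3_access and the session name emr-s3-access-check are hypothetical, and the sketch assumes that your current credentials are allowed to assume the role. The aws sts assume-role and aws s3 ls commands are standard AWS CLI commands.

```shell
# Hypothetical helper: assume an IAM role, then list an S3 path with the
# temporary credentials. Usage: check_role_s3_access ROLE_ARN S3_PATH
check_role_s3_access() {
  role_arn="$1"
  s3_path="$2"

  # Request temporary credentials as three whitespace-separated fields.
  creds=$(aws sts assume-role \
    --role-arn "$role_arn" \
    --role-session-name emr-s3-access-check \
    --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
    --output text) || return 1

  # Split the fields into the positional parameters $1, $2, and $3.
  set -- $creds

  # Run the sample request in a subshell so that the temporary
  # credentials don't leak into the calling shell.
  (
    export AWS_ACCESS_KEY_ID="$1" AWS_SECRET_ACCESS_KEY="$2" AWS_SESSION_TOKEN="$3"
    aws s3 ls "$s3_path"
  )
}
```

For example, check_role_s3_access arn:aws:iam::111122223333:role/example-role s3://doc-example-bucket/abc/ (the role ARN is a placeholder). If the listing fails with Access Denied, then the role's policies are the likely cause rather than your application code.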

Check the policy for the Amazon EC2 instance profile role

If the Amazon Elastic Compute Cloud (Amazon EC2) instance profile doesn't have the required read and write permissions on the S3 buckets, then you might get an Access Denied error.

Note: By default, applications inherit Amazon S3 access from the IAM role for the Amazon EC2 instance profile. Verify that the IAM policies that are attached to the role allow the required S3 operations on the source and destination buckets.

To check if you have the required read permission, run the ls command:

aws s3 ls s3://doc-example-bucket/myfolder/

Example output when the role doesn't have read permission:

An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied

-or-

Run the following command:

hdfs dfs -ls s3://doc-example-bucket/myfolder

Example output when the role doesn't have read permission:

ls: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: RBT41F8SVAZ9F90B; S3 Extended Request ID: ih/UlagUkUxe/ty7iq508hYVfRVqo+pB6/xEVr5WHuvcIlfQnFf33zGTAaoP2i7cAb1ZPIWQ6Cc=; Proxy: null), S3 Extended Request ID: ih/UlagUkUxe/ty7iq508hYVfRVqo+pB6/xEVr5WHuvcIlfQnFf33zGTAaoP2i7cAb1ZPIWQ6Cc=

Be sure that the instance profile role has the required read and write permissions for the S3 buckets.

Example IAM policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListObjectsInBucket",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::doc-example-bucket"
      ]
    },
    {
      "Sid": "AllObjectActions",
      "Effect": "Allow",
      "Action": "s3:*Object*",
      "Resource": [
        "arn:aws:s3:::doc-example-bucket/*"
      ]
    }
  ]
}
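The ls commands in this section exercise only read access. To also exercise the write permissions that the example policy grants, one option is to upload a small test object and then delete it. The following is a minimal sketch: the function name check_s3_write and the object name emr-write-check.txt are hypothetical, and the sketch assumes that you run it on the primary node so that the instance profile role's credentials are used. The aws s3 cp and aws s3 rm commands are standard AWS CLI commands.

```shell
# Hypothetical helper: verify write access by uploading a small test
# object to an S3 prefix and then deleting it.
# Usage: check_s3_write s3://doc-example-bucket/myfolder/
check_s3_write() {
  prefix="${1%/}"                      # drop a trailing slash, if any
  key="$prefix/emr-write-check.txt"    # hypothetical test object name
  tmp=$(mktemp) || return 1
  echo "write check" > "$tmp"

  if aws s3 cp "$tmp" "$key" > /dev/null; then
    echo "Upload succeeded: $key"
    # Clean up. A failure here suggests that s3:DeleteObject is missing.
    aws s3 rm "$key" > /dev/null || echo "Delete failed: $key"
    status=0
  else
    echo "Upload failed: $key"
    status=1
  fi

  rm -f "$tmp"
  return "$status"
}
```

An "Upload failed" result together with an AccessDenied message from the AWS CLI suggests that the role is missing s3:PutObject on the destination path.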

Check the IAM role for the EMRFS role mapping

If you use a security configuration to specify IAM roles for Amazon EMR File System (EMRFS), then you use role mapping. Your application inherits the S3 permissions from the IAM role based on the role-mapping configuration.

The IAM policy attached to the roles must have the required S3 permissions on the source and destination buckets. To specify IAM roles for EMRFS requests to Amazon S3, see Set up a security configuration with IAM roles for EMRFS.
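To see which roles the mapping refers to, you can print the security configuration that's attached to the cluster. The following is a minimal sketch: the function name show_emrfs_role_mappings is hypothetical, and the sketch assumes that your credentials allow the two aws emr calls. The describe-cluster and describe-security-configuration commands are standard AWS CLI commands.

```shell
# Hypothetical helper: print the security configuration JSON that is
# attached to a cluster so that you can review the EMRFS role mappings.
# Usage: show_emrfs_role_mappings CLUSTER_ID
show_emrfs_role_mappings() {
  cluster_id="$1"

  # Look up the name of the cluster's security configuration.
  config_name=$(aws emr describe-cluster \
    --cluster-id "$cluster_id" \
    --query 'Cluster.SecurityConfiguration' \
    --output text) || return 1

  # With --output text, a missing value is printed as "None".
  if [ -z "$config_name" ] || [ "$config_name" = "None" ]; then
    echo "Cluster $cluster_id has no security configuration"
    return 1
  fi

  # Print the configuration body. Role mappings, if any, appear under
  # AuthorizationConfiguration.EmrFsConfiguration.RoleMappings.
  aws emr describe-security-configuration \
    --name "$config_name" \
    --query 'SecurityConfiguration' \
    --output text
}
```

Each role that's listed in the output needs the S3 permissions described above. If the function reports that the cluster has no security configuration, then role mapping isn't in use and the instance profile role applies.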

Check the Amazon S3 VPC endpoint policy

If the EMR cluster's subnet route table has a route to an Amazon S3 virtual private cloud (VPC) endpoint, then confirm that the endpoint policy allows the required Amazon S3 operations.

Use the AWS CLI

Run the describe-vpc-endpoints AWS CLI command to check the endpoint policy:

aws ec2 describe-vpc-endpoints --vpc-endpoint-ids "vpce-########"

Note: Replace vpce-######## with your VPC endpoint ID.

Run the modify-vpc-endpoint command to modify the endpoint policy:

aws ec2 modify-vpc-endpoint --vpc-endpoint-id "vpce-########" --policy-document file://policy.json

Note: Replace the --vpc-endpoint-id value and the JSON file path with your values.
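The policy document that you pass to modify-vpc-endpoint might look like the following sketch, which allows the same list and object operations as the example IAM policy on doc-example-bucket. The file name endpoint-policy.json and the Sid value are hypothetical, and the statement is an illustration only. The new document replaces the endpoint's existing policy, so include every bucket that traffic through the endpoint must reach.

```shell
# Write a hypothetical endpoint policy to a file. The statement allows
# listing doc-example-bucket and object-level actions on its contents.
cat > endpoint-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowEmrBucketAccess",
      "Effect": "Allow",
      "Principal": "*",
      "Action": [
        "s3:ListBucket",
        "s3:*Object*"
      ],
      "Resource": [
        "arn:aws:s3:::doc-example-bucket",
        "arn:aws:s3:::doc-example-bucket/*"
      ]
    }
  ]
}
EOF
```

To apply the file, pass file://endpoint-policy.json as the --policy-document value in the modify-vpc-endpoint command.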

Use the Amazon VPC console

Complete the following steps:

  1. Open the Amazon VPC console.
  2. In the navigation pane, choose Endpoints.
  3. Select the Amazon S3 endpoint that's on the EMR cluster's subnet route table.
  4. Choose the Policy tab.
  5. Choose Edit Policy.

Check the S3 source and destination bucket policies

Bucket policies specify the actions that are allowed or denied for principals. The source and destination bucket policies must allow the instance profile role or the mapped IAM role to perform the required Amazon S3 operations.

To modify the bucket policies, use the AWS CLI or the Amazon S3 console.

Use the AWS CLI

Run the get-bucket-policy command to get the bucket policy:

aws s3api get-bucket-policy --bucket doc-example-bucket

Note: Replace doc-example-bucket with the name of the source or destination bucket.

Modify the policy, and then save the policy to a JSON file.

Then, run the put-bucket-policy command to add the modified policy to the bucket:

aws s3api put-bucket-policy --bucket doc-example-bucket --policy file://policy.json

Note: Replace the bucket name and the JSON file path with your values.
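Because put-bucket-policy replaces the whole policy, it can help to save the current policy to a file and keep an untouched copy before you edit it. The following is a minimal sketch: the function name save_bucket_policy and the .bak suffix are hypothetical. The get-bucket-policy command and the --query and --output options are standard AWS CLI features.

```shell
# Hypothetical helper: save a bucket's current policy to a local file
# and keep an untouched copy to roll back to.
# Usage: save_bucket_policy doc-example-bucket policy.json
save_bucket_policy() {
  bucket="$1"
  out_file="$2"

  # get-bucket-policy returns the policy as a JSON string in the
  # "Policy" field; --output text prints that string without the wrapper.
  policy=$(aws s3api get-bucket-policy \
    --bucket "$bucket" \
    --query Policy \
    --output text) || return 1

  printf '%s\n' "$policy" > "$out_file"

  # Keep a copy of the original policy before you edit the file.
  cp "$out_file" "$out_file.bak"
}
```

After you edit policy.json, apply it with the put-bucket-policy command shown above. To roll back, apply policy.json.bak the same way.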

Use the Amazon S3 console

For instructions, see Adding a bucket policy by using the Amazon S3 console.

Important: If your application accesses an S3 bucket that belongs to another AWS account, then the bucket owner must allow your IAM role in the bucket policy.

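For example, the following bucket policy gives all IAM roles and users in emr-account full access to s3://doc-example-bucket/myfolder/. In the policy, emr-account is a placeholder for the 12-digit ID of the AWS account that runs the EMR cluster: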

{
  "Id": "MyCustomPolicy",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowRootAndHomeListingOfCompanyBucket",
      "Principal": {
        "AWS": [
          "arn:aws:iam::emr-account:root"
        ]
      },
      "Action": [
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::doc-example-bucket"
      ],
      "Condition": {
        "StringEquals": {
          "s3:prefix": [
            "",
            "myfolder/"
          ],
          "s3:delimiter": [
            "/"
          ]
        }
      }
    },
    {
      "Sid": "AllowListingOfUserFolder",
      "Principal": {
        "AWS": [
          "arn:aws:iam::emr-account:root"
        ]
      },
      "Action": [
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::doc-example-bucket"
      ],
      "Condition": {
        "StringLike": {
          "s3:prefix": [
            "myfolder/*"
          ]
        }
      }
    },
    {
      "Sid": "AllowAllS3ActionsInUserFolder",
      "Principal": {
        "AWS": [
          "arn:aws:iam::emr-account:root"
        ]
      },
      "Effect": "Allow",
      "Action": [
        "s3:*"
      ],
      "Resource": [
        "arn:aws:s3:::doc-example-bucket/myfolder/*",
        "arn:aws:s3:::doc-example-bucket/myfolder*"
      ]
    }
  ]
}

Related information

Why does my Spark or Hive job on Amazon EMR fail with an HTTP 503 "Slow Down" AmazonS3Exception?

Why does my Amazon EMR application fail with an HTTP 404 "Not Found" AmazonS3Exception?

Error responses

How do I troubleshoot 403 Access Denied errors from Amazon S3?

AWS OFFICIAL
Updated 21 days ago