I can't connect to an external S3 bucket as a data source for a Glue crawler

I am trying to connect to an external S3 bucket as a data source for a Glue crawler. The bucket appears to grant the necessary permissions, since I can connect to it via S3 Browser as an external bucket. The crawler's role has the following trust relationship:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CID",
      "Effect": "Allow",
      "Principal": {
        "Service": "glue.amazonaws.com",
        "AWS": "arn:aws:iam::<client-arn>:role/<client-made-role>"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

When I run the crawler, I get the error: User does not have access to target s3://<external-bucket-name>.

Are there any reasons why this isn't working?

Thanks

  • Can you share the permissions your IAM role has? Also, is there a bucket policy / ACL enforced on the bucket?

asked 4 months ago, 576 views
1 Answer
Hello,

Please note that assuming a role (sts:AssumeRole) may not work in the Glue crawler case. To crawl an external S3 bucket as a data source, you need to ensure the following:

  1. The crawler role in Account A must have access to the Account B S3 bucket (Get*, List*).
  2. The Account B S3 bucket must grant the required permissions (Get, List, etc.) to the Account A crawler role in its bucket policy.
  3. The Account B S3 bucket must not be encrypted with SSE-KMS using the AWS managed key (aws/s3). If it is, cross-account S3 access won't work.
  4. If the Account B S3 bucket is encrypted with SSE-KMS using a customer managed key (CMK), the KMS key policy in Account B must allow the Account A Glue crawler role.
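
As a rough illustration of points 1, 2 and 4, the sketch below builds the three policy documents as plain Python dictionaries and prints them as JSON. The account ID, role name and bucket name are hypothetical placeholders, and the action lists (s3:GetObject, s3:ListBucket, kms:Decrypt, kms:DescribeKey) are an assumed minimal read-only set that is narrower than the Get*/List* mentioned above, so they may need widening for your setup.

```python
import json

# Hypothetical placeholders: substitute your own values.
CRAWLER_ROLE_ARN = "arn:aws:iam::111111111111:role/example-glue-crawler-role"  # Account A
BUCKET_ARN = "arn:aws:s3:::example-external-bucket"                            # Account B


def s3_read_statements(principal=None):
    """Allow listing the bucket and reading its objects.

    ListBucket applies to the bucket ARN, GetObject to the object ARNs.
    A Principal is only added for resource-based (bucket) policies.
    """
    statements = [
        {"Effect": "Allow", "Action": "s3:ListBucket", "Resource": BUCKET_ARN},
        {"Effect": "Allow", "Action": "s3:GetObject", "Resource": BUCKET_ARN + "/*"},
    ]
    if principal is not None:
        for stmt in statements:
            stmt["Principal"] = principal
    return statements


def crawler_role_policy():
    """Point 1: identity policy attached to the crawler role in Account A."""
    return {"Version": "2012-10-17", "Statement": s3_read_statements()}


def bucket_policy():
    """Point 2: bucket policy in Account B naming the Account A crawler role."""
    return {
        "Version": "2012-10-17",
        "Statement": s3_read_statements(principal={"AWS": CRAWLER_ROLE_ARN}),
    }


def kms_key_policy_statement():
    """Point 4: statement to add to the customer managed key's policy in Account B."""
    return {
        "Sid": "AllowCrawlerRoleDecrypt",
        "Effect": "Allow",
        "Principal": {"AWS": CRAWLER_ROLE_ARN},
        "Action": ["kms:Decrypt", "kms:DescribeKey"],
        "Resource": "*",  # in a key policy, "*" refers to the key itself
    }


if __name__ == "__main__":
    for doc in (crawler_role_policy(), bucket_policy(), kms_key_policy_statement()):
        print(json.dumps(doc, indent=2))
```

Both sides are needed for cross-account access: the identity policy on the role and the resource policy on the bucket must each allow the request.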

Please check all of the above permissions to resolve the issue. You can also refer to this re:Post thread: https://repost.aws/questions/QU_8lhusbHSLOg9CE7U30W7w/cross-account-s3-bucket-access-in-aws-glue-crawler

AWS
SUPPORT ENGINEER
answered 4 months ago
  • Hi, thank you for your response. Should the crawler role keep the trust relationship that I copied above, or should the access be granted in the crawler's permissions (i.e. have access to Account B)? Right now, the layout of my crawler's IAM role is:

    • trust relationship (above)
    • customer inline policy { "Version": "2012-10-17", "Statement": [ { "Sid": "CID", "Effect": "Allow", "Action": "sts:AssumeRole", "Resource": "arn:aws:iam::<client_arn>:role/<client-made-role>" } ] }
    • customer managed policy { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "s3:GetObject", "s3:PutObject" ], "Resource": [ "arn:aws:s3:::<client-bucket-name>/*" ] } ] }

    Does this seem correct?
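
One way to sanity-check a managed policy like the one above against point 1 of the answer is sketched below. It is a deliberately rough approximation of IAM evaluation: it only looks at Allow statements and wildcard matching, and ignores Deny statements, conditions, NotAction and NotResource. The bucket name is a hypothetical stand-in for <client-bucket-name>, and the list of needed permissions is an assumed minimum for reading a bucket.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM allows a single string or a list for Action and Resource."""
    return value if isinstance(value, list) else [value]


def is_allowed(policy, action, resource):
    """Rough check: does any Allow statement cover this action/resource pair?"""
    for stmt in _as_list(policy["Statement"]):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        # Action names are case-insensitive; ARNs are compared as written.
        action_ok = any(fnmatchcase(action.lower(), a.lower()) for a in actions)
        resource_ok = any(fnmatchcase(resource, r) for r in resources)
        if action_ok and resource_ok:
            return True
    return False


BUCKET = "example-client-bucket"  # hypothetical stand-in for <client-bucket-name>

# Same shape as the customer managed policy in the comment above.
managed_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": [f"arn:aws:s3:::{BUCKET}/*"],
        }
    ],
}

# Assumed minimum for reading a bucket: read objects and list the bucket.
needed = [
    ("s3:GetObject", f"arn:aws:s3:::{BUCKET}/some/key"),
    ("s3:ListBucket", f"arn:aws:s3:::{BUCKET}"),
]

missing = [(a, r) for a, r in needed if not is_allowed(managed_policy, a, r)]
print(missing)
```

Under these assumptions the check reports s3:ListBucket on the bucket ARN as missing, because the policy only covers object-level actions on <bucket>/*.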
