AWS Visual ETL Data Preview & Join keys are not populating


I'm trying to learn how to use this.

Not sure what the issue is behind the scenes, but I have 3 simple CSV files that I uploaded to S3.

I'm creating a test ETL pipeline with each of those three CSV files as a data source.

Initially, I had no problems creating joins, and the data and fields from the data sources populated as expected.

Then, when I tried to save the project, I got an error and had to add the iam:PassRole permission to the user.

So after adding the permission to the user, I was able to save the project (because the user now has access to the Glue role).

However, after this, the data stopped populating in the data preview (only for the joins). It worked initially (before I added the iam:PassRole permission), but then all of a sudden the joins no longer populated any data.

It would be one thing if there were an error, but in this case it sometimes works and sometimes doesn't, without any error.

I can't see any pattern to when it fails.

Is this a common issue?

ajt
asked 7 months ago · 261 views
3 Answers

Hi ajt,

It sounds like there might be an issue with the permissions for accessing the data sources after adding the IAM pass role permission. Double-check that the IAM role assigned to the Glue job has the necessary permissions to access the S3 buckets where the CSV files are located. Also, ensure that the IAM policies attached to the role grant appropriate permissions for Glue to read from S3.

You can see the instructions Here to set up AWS Identity and Access Management (IAM) permissions for AWS Glue.
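One way to double-check the role rather than the user is the IAM policy simulator. The following is only a minimal sketch: the helper name `check_role_access`, the role ARN, and the object key are hypothetical placeholders, and it assumes a boto3 IAM client is passed in (the only call used is `simulate_principal_policy`).

```python
from typing import Any, Dict, List, Tuple


def check_role_access(
    iam_client: Any,
    role_arn: str,
    checks: List[Tuple[str, str]],
) -> Dict[Tuple[str, str], str]:
    """Simulate each (action, resource ARN) pair against the role's policies.

    Returns the decision reported by IAM for each pair:
    "allowed", "implicitDeny" or "explicitDeny".
    """
    decisions = {}
    for action, resource in checks:
        response = iam_client.simulate_principal_policy(
            PolicySourceArn=role_arn,
            ActionNames=[action],
            ResourceArns=[resource],
        )
        # One action and one resource were sent, so only the first result is read.
        decisions[(action, resource)] = response["EvaluationResults"][0]["EvalDecision"]
    return decisions


# Example usage (needs boto3 and credentials; the ARNs below are placeholders):
#
#   import boto3
#   decisions = check_role_access(
#       boto3.client("iam"),
#       "arn:aws:iam::123456789012:role/MY-GLUE-ROLE",
#       [
#           ("s3:ListBucket", "arn:aws:s3:::MY-BUCKET-NAME"),
#           ("s3:GetObject", "arn:aws:s3:::MY-BUCKET-NAME/file1.csv"),
#       ],
#   )
#   for pair, decision in decisions.items():
#       print(pair, decision)
```

Anything other than "allowed" for the role would suggest the problem is on the role side rather than on the user.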

I hope it helps.

BezuW (AWS)
answered 7 months ago
  • Thanks for that. Just a quick question though. For this particular project, others were only adding "AmazonS3FullAccess" and they did not encounter these issues.

    But along with "AmazonS3FullAccess", I added "IAMFullAccess". Of course, I already have "AWSGlueConsoleFullAccess" to begin with, as well.

    For all intents and purposes for my project, shouldn't the "AmazonS3FullAccess" permissions be fine since it's wide-open access to all S3 buckets?

    I'm a bit confused.

    Update: So even though "AmazonS3FullAccess" should suffice, I did add the inline policy (below) to the user, but no luck. I'm really not sure what's going on because it seems pretty straightforward.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [ "s3:Get*", "s3:List*", "s3:Object" ],
          "Resource": [ "arn:aws:s3:::MY-BUCKET-NAME/*" ]
        }
      ]
    }


Note that the list operation (s3:ListBucket) is performed on the bucket itself, not on the objects under it, so you need to add the resource both with and without the wildcard suffix: "Resource": ["arn:aws:s3:::MY-BUCKET-NAME", "arn:aws:s3:::MY-BUCKET-NAME/*"]
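To illustrate why both entries are needed, here is a rough sketch using only the Python standard library. It approximates IAM's wildcard matching with `fnmatchcase`, which is an assumption made for illustration and not how IAM is actually implemented; `resource_matches` and the object key are hypothetical. The "s3:Object" entry from the comment above is left out because, as far as I know, it is not a valid S3 action name.

```python
import json
from fnmatch import fnmatchcase

BUCKET = "MY-BUCKET-NAME"  # placeholder bucket name from the thread

# Policy with the bucket ARN (for list operations) and the object ARN pattern.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:Get*", "s3:List*"],
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",    # the bucket itself
                f"arn:aws:s3:::{BUCKET}/*",  # every object in the bucket
            ],
        }
    ],
}


def resource_matches(pattern: str, arn: str) -> bool:
    """Approximate IAM resource matching with shell-style wildcards."""
    return fnmatchcase(arn, pattern)


bucket_arn = f"arn:aws:s3:::{BUCKET}"
object_arn = f"arn:aws:s3:::{BUCKET}/data/file1.csv"  # hypothetical key

# The "/*" pattern covers objects but not the bucket ARN that listing targets.
print(resource_matches(f"arn:aws:s3:::{BUCKET}/*", object_arn))  # True
print(resource_matches(f"arn:aws:s3:::{BUCKET}/*", bucket_arn))  # False

print(json.dumps(policy, indent=2))
```

With only the "/*" resource, a request against the bucket ARN has no matching statement, which is why listing can fail even though reading individual objects is allowed.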

AWS EXPERT
answered 7 months ago
  • Yes, that was an example/template. I added my resource/bucket name. But yeah, regardless, the inline policy shouldn't be required because it's superseded by "AmazonS3FullAccess".

    It's a very basic project, so this is extremely perplexing.

    UPDATE: Actually, I'm not entirely sure what you were saying. I went by this documentation: https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements_resource.html

    It states the following below.

    The following example refers to all items within a specific Amazon S3 bucket. "Resource": "arn:aws:s3:::DOC-EXAMPLE-BUCKET/*"

    So I'm not sure why I would need to add "Resource": "arn:aws:s3:::DOC-EXAMPLE-BUCKET/*" AND "Resource": "arn:aws:s3:::DOC-EXAMPLE-BUCKET"? Wouldn't I have access to all the items within that one bucket with "/*"?

    But again, not sure why any of this would matter if I am using "AmazonS3FullAccess" to begin with?


I should rephrase the question. Would anyone know why the joins are not populating (no join keys and no data preview) when I have "AmazonS3FullAccess"?

The inline policy (allowing bucket access) seems to be redundant.

The policies used are below.

USER POLICIES:

  • AmazonAthenaFullAccess
  • AmazonS3FullAccess
  • AWSGlueConsoleFullAccess
  • AWSQuicksightAthenaAccess
  • AWSQuickSightDescribeRDS
  • IAMFullAccess
  • Inline policy for access to specific buckets (shouldn't be necessary w/ "AmazonS3FullAccess" above)

ROLE POLICIES:

  • AmazonS3FullAccess
ajt
answered 7 months ago
