Glue ETL AccessDeniedException for non-existent Lake Formation setup

Hello,

In a Glue ETL job made of the nodes Amazon S3, Change Schema, and AWS Glue Data Catalog (with the table "us_spending" backed by S3), I get the following error:

Error Category: PERMISSION_ERROR; Failed Line Number: 87; An error occurred while calling o101.getCatalogSink. Insufficient Lake Formation permission(s) on us_spending (Service: AWSGlue; Status Code: 400; Error Code: AccessDeniedException; Request ID: 587df4c4-8b94-4b47-925c-f87779215bb6; Proxy: null). Note: This run was executed with Flex execution.

Where could Lake Formation be involved, given that I did not specifically use it?
Where should the permission be added for the AWS Glue Data Catalog table "us_spending"?
Is this a recent change in Glue ETL? For previous tables, the default permissions were enough.

Thank you,
Mihai

asked 3 months ago, 184 views
1 Answer
Accepted Answer

If that table doesn't exist, I would say you are trying to write to an S3 location that has been marked as managed by Lake Formation.
If the table exists, it has been set to use Lake Formation permissions instead of IAM ones.
Go to Lake Formation and, under Data lake permissions, make sure the table (if it exists) grants permissions either to your specific role or to the group "IAMAllowedPrincipals", which delegates access control to IAM Glue permissions.
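
The same grant can also be made through the API. Below is a minimal sketch using boto3, not a definitive implementation. It assumes that "IAM_ALLOWED_PRINCIPALS" is the API identifier for the console group "IAMAllowedPrincipals" and that the console permission "Super" corresponds to "ALL" in the API. The database name my_database is hypothetical, and so are the helper names build_table_grant and grant_table_permissions.

```python
import json

# Assumption: API identifier of the console group "IAMAllowedPrincipals".
IAM_ALLOWED_PRINCIPALS = "IAM_ALLOWED_PRINCIPALS"


def build_table_grant(database, table, principal=IAM_ALLOWED_PRINCIPALS,
                      permissions=("ALL",)):
    """Build the keyword arguments for lakeformation.grant_permissions.

    "ALL" is assumed to be the API name of the console permission "Super".
    """
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal},
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        "Permissions": list(permissions),
    }


def grant_table_permissions(request, region_name=None):
    """Send the grant. Needs boto3 and Lake Formation admin credentials."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("lakeformation", region_name=region_name)
    return client.grant_permissions(**request)


# "my_database" is a hypothetical name; only the table name is known here.
request = build_table_grant("my_database", "us_spending")
print(json.dumps(request, indent=2))
# To apply it, call: grant_table_permissions(request)
```

To grant to a specific job role instead of the group, pass the role ARN as principal and a narrower permission list, for example ("SELECT", "INSERT", "ALTER").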

AWS
EXPERT
answered 3 months ago
  • Hello,

    Thank you for the answer.
    I checked in the Lake Formation console, under Data lake permissions, and for the existing table us_spending I added the table permission Super for the group "IAMAllowedPrincipals".

    The job now gets past the previous error, but I have another permission error: Error Category: PERMISSION_ERROR; Failed Line Number: 87; An error occurred while calling o154.pyWriteDynamicFrame. Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: SV9SNXXQXYTME26Y; S3 Extended Request ID: JsQrpfhTs41jpTukc54CIYpZCLxlSBANLpKIrtmhmggqmnYQsB03gSEW9ncylo/lNVJCrBmYtCQ=; Proxy: null). Note: This run was executed with Flex execution. Check the logs if run failed due to executor termination.

    Could you please help further and explain why this error appears?

    Thanks a lot,
    Mihai

  • Please see more details below.

    In CloudWatch Logs, just before the stack trace of the above error, there is this line: 2024-01-22 12:31:23,397 INFO [Executor task launch worker for task 0.3 in stage 0.0 (TID 15)] s3n.MultipartUploadOutputStream (MultipartUploadOutputStream.java:close(421)): close closed:false s3://temp-misc-ohio/run-1705926527849-part-r-00000 Here s3://temp-misc-ohio/ is the bucket of the AWS Glue Data Catalog table "us_spending", and run-1705926527849-part-r-00000 could be an intermediate working file for that table.

    The same bucket also holds the input file used by the Amazon S3 node, and the Data Preview session works: I can see the input file data.

    Thank you,
    Mihai

  • It sounds like the role has read but not write permissions on that bucket. How to fix that depends on whether you are using an S3 bucket policy (which sounds like your case) or the location is managed by Lake Formation.

  • Thank you for your answers,

    Yes indeed, the job role was missing write permissions on the output S3 bucket that backs the Glue Data Catalog table "us_spending".

    The job now runs to completion successfully, without any other permission issues.

    Have a good day,
    Mihai
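
For anyone hitting the same S3 403, here is a rough sketch of an IAM policy that gives a Glue job role read and write access to the bucket from this thread, built in Python with boto3. The exact set of actions is an assumption (the thread only says write permissions were missing), and the helper names build_bucket_rw_policy and attach_inline_policy, as well as any role and policy names passed to them, are hypothetical.

```python
import json


def build_bucket_rw_policy(bucket):
    """Return an IAM policy document allowing read and write on one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object actions apply to the keys inside the bucket.
                # Assumption: DeleteObject is included for temporary files.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


def attach_inline_policy(role_name, policy_name, policy):
    """Attach the policy inline to a role. Needs boto3 and IAM credentials."""
    import boto3  # imported lazily so the builder works without boto3

    iam = boto3.client("iam")
    return iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(policy),
    )


policy = build_bucket_rw_policy("temp-misc-ohio")
print(json.dumps(policy, indent=2))
# To apply it, call attach_inline_policy with your job role's name.
```

If the bucket also has a bucket policy with explicit denies, or uses a KMS key, the role may need more than this.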
