AWS Glue Notebook issue when running SQL script


I am following the steps outlined in the link below:

(1) Querying Delta Lake tables using Amazon Athena works without issue; I am able to query the data.

(2) Issue with the AWS Glue notebook:

(a) I created an IAM role named "AWSGlueServiceRoleDefault", which includes:

(i) AmazonS3FullAccess, which is AWS managed

(ii) AWSGlueServiceRole, which is AWS managed

(iii) PassRolePolicy, which is a customer inline policy

[screenshot]

The PassRolePolicy is as below:

[screenshot]

Following the instructions from the link and using the IAM role ("AWSGlueServiceRoleDefault") I created above, the first part of the Python code runs, as shown below:

[screenshot]

But the SQL portion throws a series of Py4JJavaError exceptions:

[screenshot]

Even the Python code below, which tries to retrieve information about the table, generates the same error:

[screenshot]

The above error is reproducible, and can be viewed here:

I would appreciate help identifying which part is causing the error and how to fix it. Thanks.

asked a year ago · 466 views
1 Answer
Accepted Answer

The libraries for the Delta format are not enabled by default.
The job needs a parameter: --datalake-formats=delta

From a notebook, you need to pass it as configuration before you run any other cell (once the session has started, it won't take effect):

    "--datalake-formats": "delta"
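
As a minimal sketch of what that looks like in practice: in an AWS Glue interactive-session notebook, job parameters like this are usually supplied through the `%%configure` cell magic. The cell below assumes it is the very first cell executed in a fresh session; if a session is already running, it would need to be stopped and restarted for the setting to apply.

```
%%configure
{
  "--datalake-formats": "delta"
}
```

After this cell has run, the Spark session created by the following cells should have the Delta Lake libraries available, so the SQL cells against the Delta table can be retried.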
AWS
answered a year ago
  • Splendid. Your short and concise answer saved me many hours of frustration trying to find out exactly what the issue was. Thank you very much.

  • Note that the role also needs an iam:PassRole policy, which the first link does not mention; without it, you will run into an iam:PassRole error.
