AWS Glue Notebook issue when running SQL script

I am following the steps outlined in the link below:

https://aws.amazon.com/blogs/big-data/introducing-native-delta-lake-table-support-with-aws-glue-crawlers/

(1) Querying the Delta Lake tables with Amazon Athena works without issue; I am able to query the data.

(2) The issue is with the AWS Glue notebook:

(a) I created an IAM role named "AWSGlueServiceRoleDefault", which includes:

(i) AmazonS3FullAccess (AWS managed)

(ii) AWSGlueServiceRole (AWS managed)

(iii) PassRolePolicy (customer inline)

[Screenshot: policies attached to the role]

The PassRolePolicy is as follows:

[Screenshot: PassRolePolicy]

Following the instructions from the link, and using the IAM role ("AWSGlueServiceRoleDefault") I created above, the first part of the Python code runs, as shown below:

[Screenshot: first part of the Python code running]

But the SQL portion throws a series of Py4JJavaError exceptions:

[Screenshot: Py4JJavaError output]

Even the Python code below, which tries to retrieve information about the table, generates the same error:

[Screenshot: Python code and the same error]

The error is reproducible; the full output can be viewed here: https://justpaste.it/1zxzz

I would appreciate help identifying which part is causing the error and how to fix it. Thanks.

Asked 1 year ago, viewed 411 times
1 Answer
Accepted Answer

The libraries for the Delta format are not enabled by default.
The job needs the parameter --datalake-formats=delta

From a notebook, you need to pass it as configuration before you run any other cell (once the session has started, it won't take effect):

%%configure
{
    "--datalake-formats": "delta"
}
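
Once the session has started with that configuration, a quick way to check that the Delta libraries are loaded is to read a few rows from the crawled table. The following is only a minimal sketch: the helper name `preview_delta_table` is hypothetical, and the database and table names in the usage comment are placeholders to replace with the ones your crawler created. It assumes `spark` is the SparkSession of the Glue interactive session.

```python
def preview_delta_table(spark, database, table, limit=10):
    """Run a small SELECT against a Data Catalog table via Spark SQL.

    `spark` is assumed to be the notebook's SparkSession. The names
    passed in are placeholders chosen by the caller (hypothetical here).
    """
    # Backtick-quote the identifiers and force the limit to an integer.
    query = f"SELECT * FROM `{database}`.`{table}` LIMIT {int(limit)}"
    return spark.sql(query)


# Usage in a notebook cell, after the %%configure cell has been run first
# (database and table names below are placeholders):
#
#   from awsglue.context import GlueContext
#   from pyspark.context import SparkContext
#
#   spark = GlueContext(SparkContext.getOrCreate()).spark_session
#   preview_delta_table(spark, "my_database", "my_delta_table").show()
```

If this still raises a Py4JJavaError, the session was most likely started before the configuration cell ran; stopping the session and running the configuration cell first should make the setting take effect.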
AWS
Expert
Answered 1 year ago
  • Splendid. Your short and concise answer saved me hours of frustration trying to find out exactly what the issue was. Thank you very much.

  • Note that the setup above also requires an iam:PassRole policy, which the blog post in the first link does not mention; without it, you will get an error relating to iam:PassRole.
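
For readers who need to add that permission, an inline policy granting iam:PassRole usually has the shape sketched below. This is not the policy from the question (which was posted only as a screenshot); it is an illustrative example. The account ID `123456789012` is a placeholder, the helper name `build_pass_role_policy` is hypothetical, and restricting the role to `glue.amazonaws.com` through the `iam:PassedToService` condition key is an assumption about what you want to allow.

```python
import json


def build_pass_role_policy(account_id: str, role_name: str) -> dict:
    """Build an inline IAM policy document that allows passing one role to Glue."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "iam:PassRole",
                # Only the named role may be passed (placeholder account ID).
                "Resource": f"arn:aws:iam::{account_id}:role/{role_name}",
                # Assumption: the role is only ever passed to AWS Glue.
                "Condition": {
                    "StringLike": {"iam:PassedToService": "glue.amazonaws.com"}
                },
            }
        ],
    }


policy = build_pass_role_policy("123456789012", "AWSGlueServiceRoleDefault")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console as an inline policy on the role used by the notebook, after replacing the placeholder account ID with your own.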
