AWS Glue Notebook issue when running SQL script

I am following the steps outlined in the link below:

https://aws.amazon.com/blogs/big-data/introducing-native-delta-lake-table-support-with-aws-glue-crawlers/

(1) Querying the Delta Lake tables with Amazon Athena works; I am able to query the data without any issue.

(2) The issue is with the AWS Glue notebook:

(a) I created an IAM role named "AWSGlueServiceRoleDefault", which includes:

(i) AmazonS3FullAccess, which is AWS managed

(ii) AWSGlueServiceRole, which is AWS managed

(iii) PassRolePolicy, which is a customer inline policy

[screenshot: policies attached to the role]

The PassRolePolicy is as follows:

[screenshot: PassRolePolicy]

Following the instructions from the link and using the IAM role ("AWSGlueServiceRoleDefault") I created above, the first part of the Python code runs, as shown below:

[screenshot]

However, the SQL portion throws a series of Py4JJavaError exceptions:

[screenshot]

Even the Python code below, which only tries to retrieve information about the table, produces the same error:

[screenshot]

The error is reproducible, and the full output can be viewed here: https://justpaste.it/1zxzz

I would appreciate help identifying which part is causing the error and how to fix it. Thanks.

Asked 1 year ago, 410 views
1 Answer
Accepted Answer

The libraries for the Delta format are not enabled by default.
The job needs a parameter: --datalake-formats=delta

From a notebook, you need to pass it as configuration before you run any other cell (once the session has started, it won't take effect):

%%configure
{
    "--datalake-formats": "delta"
}
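
The same parameter can also be expressed as a plain key/value mapping, which is the shape a job's arguments take when they are set programmatically. The following is a minimal sketch, not part of the original answer: the helper names `glue_delta_arguments` and `configure_cell` are hypothetical, and the code only builds and prints the configuration text without contacting AWS.

```python
import json


def glue_delta_arguments(extra=None):
    """Return job arguments that enable the Delta Lake libraries.

    `extra` is an optional dict of additional arguments (hypothetical helper,
    used here only for illustration).
    """
    args = {"--datalake-formats": "delta"}
    args.update(extra or {})
    return args


def configure_cell(args):
    """Render the arguments as the text of a %%configure notebook cell."""
    return "%%configure\n" + json.dumps(args, indent=4)


# Text to paste into the first cell of the notebook, before the session starts.
cell_text = configure_cell(glue_delta_arguments())
print(cell_text)
```

The dictionary returned by `glue_delta_arguments()` carries the same key and value as the job parameter mentioned above, so it could plausibly be reused wherever the job's default arguments are defined.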
AWS Expert
Answered 1 year ago
  • Splendid. Your short and concise answer saved me hours of frustration trying to find out exactly what the issue was. Thank you very much.

  • Note that the setup above also requires an iam:PassRole policy, which the blog post in the first link does not mention; without it, the notebook fails with an error relating to iam:PassRole.
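
For readers who cannot see the original screenshot, an inline policy granting iam:PassRole for the notebook role might look like the sketch below. This is an assumption rather than the asker's actual policy: the account ID `123456789012` is a placeholder, the helper name `pass_role_policy` is hypothetical, and the condition restricting the role to the Glue service is a commonly recommended safeguard that may not have been present in the original.

```python
import json


def pass_role_policy(account_id, role_name):
    """Build an IAM policy document allowing the role to be passed to AWS Glue.

    Hypothetical helper for illustration; it only constructs the JSON document.
    """
    role_arn = f"arn:aws:iam::{account_id}:role/{role_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "iam:PassRole",
                "Resource": role_arn,
                # Assumption: limit passing the role to the Glue service only.
                "Condition": {
                    "StringLike": {"iam:PassedToService": ["glue.amazonaws.com"]}
                },
            }
        ],
    }


# Placeholder account ID; the role name is the one used in the question.
policy = pass_role_policy("123456789012", "AWSGlueServiceRoleDefault")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console as an inline policy after replacing the placeholder account ID.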
