Access error: Spark query from AWS EMR with AWS Lake Formation

I am trying to integrate EMR with Lake Formation. The EMR cluster was created with the default roles for both EMR and the EC2 instances. In addition to the default permissions on those roles, I have granted them full access to Lake Formation and Glue. I selected the default service-linked role for all S3 buckets registered with this Lake Formation.

After creating a Jupyter notebook that uses this cluster, I tried running

spark.sql("show databases") and spark.sql("use <database>")

and got the following error:

An error was encountered: org.apache.spark.sql.catalyst.analysis.AccessControlException: Unable to verify existence of default database: com.amazonaws.services.glue.model.AccessDeniedException: Insufficient Lake Formation permission(s) on default (Service: AWSGlue; Status Code: 400; Error Code: AccessDeniedException; Request ID: .....; Proxy: null);

I think I have tried everything with permissions and still cannot identify the root cause of the error. I would appreciate any pointers that help me understand or resolve it. Many thanks.

Asked 2 years ago, 2760 views
1 answer

I was able to resolve this error. When Spark connects to Lake Formation, it checks whether a database named 'default' exists. My Lake Formation did not have a database named 'default', hence the error. Creating a database named 'default' and granting permission on that database to the default EMR_EC2_DefaultRole resolved the error.
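For anyone who prefers to script the fix, here is a minimal sketch using boto3. It is not the exact procedure I followed (I used the console), and several things are assumptions: the account ID in the role ARN is a placeholder, the helper names `build_requests` and `ensure_default_database` are made up for illustration, and `DESCRIBE` is only a guess at the minimum permission needed for the existence check. Adjust the permission list to whatever your workload requires.

```python
def build_requests(role_arn, database_name="default", permissions=("DESCRIBE",)):
    """Build the Glue and Lake Formation request payloads without calling AWS.

    `permissions` defaults to DESCRIBE, which is an assumption, not a
    confirmed minimum for the Spark existence check.
    """
    create_db = {"DatabaseInput": {"Name": database_name}}
    grant = {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {"Database": {"Name": database_name}},
        "Permissions": list(permissions),
    }
    return create_db, grant


def ensure_default_database(role_arn, region_name=None):
    """Create the 'default' database if missing, then grant the role access."""
    # Imported lazily so build_requests can be used without boto3 installed.
    import boto3

    glue = boto3.client("glue", region_name=region_name)
    lakeformation = boto3.client("lakeformation", region_name=region_name)
    create_db, grant = build_requests(role_arn)

    try:
        glue.create_database(**create_db)
    except glue.exceptions.AlreadyExistsException:
        pass  # the database is already there; only the grant is needed

    lakeformation.grant_permissions(**grant)


# Placeholder ARN: replace 123456789012 with your own account ID.
EXAMPLE_ROLE_ARN = "arn:aws:iam::123456789012:role/EMR_EC2_DefaultRole"
```

Calling `ensure_default_database(EXAMPLE_ROLE_ARN)` with credentials that are allowed to create Glue databases and grant Lake Formation permissions should reproduce the two steps described above. After that, re-run `spark.sql("show databases")` in the notebook to check whether the error is gone.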

Answered 2 years ago
