Select from shared Glue table in EMR

Hi all,

I have shared a Glue table (S3) with another account where I can already query it via Athena.

I have now added Lake Formation permissions for the database and table to the role that I use with an EMR Serverless application through an interactive workspace.

In my notebook I can see the database and table in the catalog, but as soon as I try to access the table I get the error below. (Screenshot: EMR Notebook Glue table access)

org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table customers. Unable to get table: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hive.metastore.TableType.

I see that the tableType property is None, so the query crashes: the enum constant name after org.apache.hadoop.hive.metastore.TableType. is empty, where it should probably be org.apache.hadoop.hive.metastore.TableType.EXTERNAL_TABLE.

I tried to set the table type but I couldn't find any options.

These are the Glue table details: (Screenshot: Glue table details)

Has anyone encountered this issue before? How can I change the table type or make sure that it has the correct type in the catalog?
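One way to inspect and set the table type outside the console is the Glue API. The following is only a sketch, under these assumptions: it runs in the account that owns the table (the producer account, not against the shared view in the consumer account), the helper names are hypothetical, and TABLE_INPUT_KEYS is a commonly used subset of the fields that UpdateTable accepts rather than the complete list. Whether a missing table type is the real cause of the error here is not confirmed.

```python
from copy import deepcopy

# Fields copied from a get_table response into TableInput. get_table also
# returns read-only fields (DatabaseName, CreateTime, ...) that update_table
# rejects. This set is an assumed subset, not the full TableInput schema.
TABLE_INPUT_KEYS = {
    "Name", "Description", "Owner", "Retention", "StorageDescriptor",
    "PartitionKeys", "ViewOriginalText", "ViewExpandedText",
    "TableType", "Parameters", "TargetTable",
}


def has_table_type(table: dict) -> bool:
    """Return True if the Glue table dict carries a non-empty TableType."""
    return bool(table.get("TableType"))


def build_table_input(table: dict, table_type: str = "EXTERNAL_TABLE") -> dict:
    """Build a TableInput from a get_table 'Table' dict, filling in a
    missing TableType and leaving an existing one untouched."""
    table_input = {
        key: deepcopy(value)
        for key, value in table.items()
        if key in TABLE_INPUT_KEYS
    }
    if not table_input.get("TableType"):
        table_input["TableType"] = table_type
    return table_input


def fix_table_type(database: str, name: str) -> None:
    """Hypothetical helper: read the table and write it back with a type.
    Needs boto3 plus Glue and Lake Formation permissions on the table."""
    import boto3  # imported lazily so the pure helpers work without it

    glue = boto3.client("glue")
    table = glue.get_table(DatabaseName=database, Name=name)["Table"]
    if not has_table_type(table):
        glue.update_table(DatabaseName=database,
                          TableInput=build_table_input(table))
```

Calling fix_table_type("my_database", "customers") would be the intended usage, with the database name being a placeholder. The read-then-write approach is chosen because update_table replaces the whole definition, so the existing storage descriptor and parameters have to be passed back unchanged.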

Asked a month ago, 122 views
1 answer

Hello,

Could you verify that your scenario complies with the considerations mentioned in this document? https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-lf-limitations.html

This error (Unable to get table: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hive.metastore.TableType.) can also occur when the EMR cluster lacks table filtering permissions from the source Lake Formation account. If "External data filtering" is not enabled, access to the S3 locations that are registered with Lake Formation is denied. That is consistent with the error message as I interpret it: EMR possibly has no access to the location in which the table resides. See https://docs.aws.amazon.com/lake-formation/latest/dg/getting-started-setup.html#emr-switch
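The "External data filtering" switch can also be set through the Lake Formation API instead of the console. The sketch below assumes it runs with data lake administrator credentials in the producer account; the function names are hypothetical, and the default session tag value "Amazon EMR" is the one mentioned in this thread, not a verified requirement for EMR Serverless.

```python
from copy import deepcopy


def with_external_filtering(settings: dict, consumer_account_id: str,
                            session_tag: str = "Amazon EMR") -> dict:
    """Return a copy of a DataLakeSettings dict with external data
    filtering enabled for the given consumer account and session tag."""
    updated = deepcopy(settings)
    updated["AllowExternalDataFiltering"] = True

    # Accounts whose engines may filter data on Lake Formation's behalf.
    allow_list = updated.setdefault("ExternalDataFilteringAllowList", [])
    principal = {"DataLakePrincipalIdentifier": consumer_account_id}
    if principal not in allow_list:
        allow_list.append(principal)

    # Session tag values Lake Formation accepts from those engines.
    tags = updated.setdefault("AuthorizedSessionTagValueList", [])
    if session_tag not in tags:
        tags.append(session_tag)
    return updated


def apply_external_filtering(consumer_account_id: str) -> None:
    """Hypothetical helper: read the current settings, merge, write back."""
    import boto3  # imported lazily so the pure helper works without it

    lf = boto3.client("lakeformation")
    current = lf.get_data_lake_settings()["DataLakeSettings"]
    lf.put_data_lake_settings(
        DataLakeSettings=with_external_filtering(current, consumer_account_id)
    )
```

The settings are read first and merged because put_data_lake_settings overwrites the whole settings object, so writing only the three filtering fields could drop existing values such as the list of data lake administrators.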

AWS
SUPPORT ENGINEER
answered a month ago
  • I enabled the "External data filtering" option in the Lake Formation application integration settings of the producer account. I provided the session tag value "Amazon EMR" and added the consuming AWS account ID. Unfortunately, still no success. Does EMR Serverless actually add any session tag values when assuming the job runtime role? Or does this feature only work for EMR clusters?

  • Support for Lake Formation with EMR Serverless was introduced in EMR 6.15 but is still in preview, as mentioned in the document.

    Just make sure that you are configuring your EMR Serverless application appropriately.
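As an illustration of what "configuring appropriately" might involve, here is a sketch that creates an EMR Serverless Spark application with Lake Formation enabled. It rests on assumptions: the property name spark.emr-serverless.lakeformation.enabled is taken from the EMR Serverless Lake Formation documentation as I understand it and should be checked against the current docs, the release label is only an example of a 6.15 release, and the function names are hypothetical. It does not confirm that interactive workspaces are supported in this mode.

```python
# Spark property assumed to switch on Lake Formation for EMR Serverless.
LF_PROPERTY = "spark.emr-serverless.lakeformation.enabled"


def lake_formation_runtime_config() -> list:
    """Runtime configuration that sets the Lake Formation property
    under the spark-defaults classification."""
    return [
        {
            "classification": "spark-defaults",
            "properties": {LF_PROPERTY: "true"},
        }
    ]


def create_lf_application(name: str, release_label: str = "emr-6.15.0") -> str:
    """Hypothetical helper: create the application and return its ID."""
    import boto3  # imported lazily so the config helper works without it

    emr = boto3.client("emr-serverless")
    response = emr.create_application(
        name=name,
        releaseLabel=release_label,
        type="SPARK",
        runtimeConfiguration=lake_formation_runtime_config(),
    )
    return response["applicationId"]
```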
