Unable to read Hive ACID tables in Athena using the Athena Hive data connector


Hi, we are trying to use Athena as our consumption service. We have migrated most of the Hive databases and tables from an external Hive metastore to AWS Glue, except the databases that contain Hive ACID tables, because Glue does not support Hive ACID tables. To read Hive ACID tables from Athena, we configured the Athena connector for Hive based on this article, https://docs.aws.amazon.com/athena/latest/ug/connect-to-data-source-hive.html, and used the AthenaHiveMetastoreFunctionWithLayer jar.

When I try to query a Hive ACID table (stored in ORC file format) from Athena using the newly created custom catalog for Hive, I get the error below.

"HIVE_CURSOR_ERROR: Failed to read ORC file: s3://my-datalake-bkt-dev/test/acid/ug/base_0000002/bucket_00000"

It looks like Athena is not able to read the Hive ACID file format. Can someone please help me?

RamSet
asked 2 years ago · 803 views
1 Answer
Accepted Answer

Hello,

The latest Athena engine, v2, uses Presto 0.217, which does not support Hive ACID tables. I tried to use this article and this to test it out and got the error below:

HIVE_INVALID_BUCKET_FILES: Hive table 'default.acid_tbl' is corrupt. Found sub-directory in bucket directory for partition: 

Presto appears to support reading ACID tables only from Presto 331 onward.
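Both errors point at the ACID directory layout, where bucket files sit inside `base_N` or `delta_N_M` sub-directories. As a rough sketch for spotting such tables before pointing Athena at them, the check below tests whether an object key follows that layout. The function name and the regular expressions are my own assumptions based on the usual Hive ACID naming, not part of any AWS or Hive API.

```python
import re

# Assumed Hive ACID directory names: base_N, delta_N_M, delete_delta_N_M,
# optionally followed by a statement id and/or a _vN visibility suffix.
_ACID_DIR = re.compile(r"^(base_\d+(_v\d+)?|(delete_)?delta_\d+_\d+(_\d+)?(_v\d+)?)$")
# Assumed bucket file names: bucket_N, optionally with an attempt suffix.
_BUCKET_FILE = re.compile(r"^bucket_\d+(_\d+)?$")


def looks_like_hive_acid(key: str) -> bool:
    """Return True if the key ends in <acid_dir>/<bucket_file>."""
    parts = key.rstrip("/").split("/")
    if len(parts) < 2:
        return False
    # The last component must be a bucket file and its parent an ACID directory.
    return bool(_ACID_DIR.match(parts[-2]) and _BUCKET_FILE.match(parts[-1]))
```

Running it over the keys returned by an S3 listing of a table location would flag paths such as the one in the question's error message.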

However, as per this doc, Athena does support ACID transactions via AWS Lake Formation governed tables or Iceberg. If you are looking to move your Hive ACID tables to AWS, I would suggest looking at the AWS Lake Formation governed tables feature, which uses the same Glue catalog.
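If you go the Iceberg route instead, the target table can be declared from Athena with the `table_type` table property. The helper below is only a sketch that builds such a DDL string; the function name, table name, columns, and S3 location are hypothetical placeholders, and the data itself would still have to be copied out of the ACID table by an engine that can read it (for example Hive).

```python
from typing import Dict


def iceberg_ddl(table: str, columns: Dict[str, str], location: str) -> str:
    """Build an Athena CREATE TABLE statement for an Iceberg table."""
    # Column definitions are rendered in insertion order as "name type".
    cols = ", ".join(f"{name} {col_type}" for name, col_type in columns.items())
    return (
        f"CREATE TABLE {table} ({cols}) "
        f"LOCATION '{location}' "
        "TBLPROPERTIES ('table_type' = 'ICEBERG')"
    )


# Hypothetical example values, for illustration only.
ddl = iceberg_ddl(
    "default.acid_tbl_iceberg",
    {"id": "bigint", "name": "string"},
    "s3://example-bucket/iceberg/acid_tbl/",
)
```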

Ref: AWS Lake Formation governed tables blog series

https://aws.amazon.com/blogs/big-data/part-1-effective-data-lakes-using-aws-lake-formation-part-1-getting-started-with-governed-tables/

AWS Support Engineer
answered 2 years ago
Expert reviewed 1 month ago
