Unable to read Hive ACID tables in Athena using the Athena Hive data connector

Hi, we are trying to use Athena as our consumption service. We have migrated most of our Hive databases and tables from an external Hive metastore to AWS Glue, except for the databases that contain Hive ACID tables, because Glue doesn't support Hive ACID tables. To read Hive ACID tables from Athena, we configured the Athena connector for Hive based on this article, https://docs.aws.amazon.com/athena/latest/ug/connect-to-data-source-hive.html, and used the AthenaHiveMetastoreFunctionWithLayer jar.

When I try to query a Hive ACID table (based on the ORC file format) from Athena using the newly created custom catalog for Hive, I get the error below.

"HIVE_CURSOR_ERROR: Failed to read ORC file: s3://my-datalake-bkt-dev/test/acid/ug/base_0000002/bucket_00000"

It looks like Athena is not able to read the Hive ACID file format. Can someone please help me?

RamSet
Asked 2 years ago, 803 views
1 Answer
Accepted Answer

Hello,

The latest Athena engine, v2, is based on Presto 0.217, which does not support Hive ACID tables. I tried to reproduce this using this article and this one, and got the error below:

HIVE_INVALID_BUCKET_FILES: Hive table 'default.acid_tbl' is corrupt. Found sub-directory in bucket directory for partition: 

Presto appears to support reading ACID tables only from Presto 331 onward.
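Both errors most likely trace back to the Hive ACID directory layout, where data files live in `base_N`, `delta_N_M`, or `delete_delta_N_M` subdirectories instead of directly under the table or partition path. Below is a minimal sketch for spotting that layout in an S3 URI before pointing Athena at it. The helper names `acid_dirs` and `looks_like_acid` are hypothetical, and the pattern is an assumption that covers only the common directory name forms.

```python
import re

# Assumed naming pattern for directories written by Hive ACID writers,
# e.g. base_0000002, delta_0000001_0000001, delete_delta_0000003_0000003,
# optionally followed by a visibility suffix such as _v0000012.
ACID_DIR = re.compile(r"^(base|delta|delete_delta)_\d+(_\d+)*(_v\d+)?$")


def acid_dirs(s3_uri):
    """Return the path components of an S3 URI that look like Hive ACID directories."""
    path = s3_uri.split("://", 1)[-1]  # drop the scheme if present
    return [part for part in path.split("/") if ACID_DIR.match(part)]


def looks_like_acid(s3_uri):
    """True if any component of the URI matches the assumed ACID naming pattern."""
    return bool(acid_dirs(s3_uri))


# The file path reported in the question's HIVE_CURSOR_ERROR message.
uri = "s3://my-datalake-bkt-dev/test/acid/ug/base_0000002/bucket_00000"
print(acid_dirs(uri))
print(looks_like_acid(uri))
```

A path that contains any of these components is a sign that the table was written transactionally, so an engine without ACID support will either reject the subdirectory or fail while reading the ORC file.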

However, as per this doc, Athena does support ACID transactions through AWS Lake Formation governed tables or Apache Iceberg. If you are looking to move your Hive ACID tables to AWS, I would suggest looking into the Lake Formation governed tables feature, which uses the same Glue catalog.

Ref: AWS Lake Formation governed tables blog series

https://aws.amazon.com/blogs/big-data/part-1-effective-data-lakes-using-aws-lake-formation-part-1-getting-started-with-governed-tables/
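For the Iceberg route mentioned above, an Athena table is declared as Iceberg through the `'table_type' = 'ICEBERG'` table property. The sketch below only builds such a DDL string. The function `iceberg_ddl`, the table name `acid_tbl_iceberg`, the columns, and the S3 location are all hypothetical, and loading the existing ACID data into the new table (for example, after exporting it from Hive in a format Athena can read) is a separate step that is not shown.

```python
def iceberg_ddl(database, table, columns, location):
    """Build an Athena CREATE TABLE statement that declares an Iceberg table.

    columns is a list of (name, type) pairs; location is the S3 path
    where the Iceberg table data and metadata would be stored.
    """
    cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE TABLE {database}.{table} (\n  {cols}\n)\n"
        f"LOCATION '{location}'\n"
        "TBLPROPERTIES ('table_type' = 'ICEBERG')"
    )


# Hypothetical target table and location, for illustration only.
ddl = iceberg_ddl(
    "default",
    "acid_tbl_iceberg",
    [("id", "int"), ("name", "string")],
    "s3://my-datalake-bkt-dev/test/iceberg/acid_tbl/",
)
print(ddl)
```

The resulting statement can be pasted into the Athena query editor; because the table is registered in the Glue catalog, no separate Hive metastore connector is involved.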

AWS
Support Engineer
Answered 2 years ago
Expert
Reviewed a month ago
