HIVE_CURSOR_ERROR: Failed to read Parquet file when using Athena engine version 3 in my workgroup

8

When I change my workgroup to Athena engine version 3, I get the error HIVE_CURSOR_ERROR: Failed to read Parquet file. With Athena engine version 2 the same query works normally. I'm just running a table preview; I didn't change anything else.

SELECT * FROM {table} limit 10;


asked a year ago, 1308 views
7 Answers
1

I have the same issue as well. Digging a bit deeper, I found that the query fails when I use the equals sign (=) in the filter on product_name. With the LIKE operator it works, which looks weird.

Doesn't work

select product_name, user_id from {table} 
where 1=1
AND meta_date = '2020-01-01'
and user_id = 'foo'
and product_name = 'name1'

Works

select product_name, user_id from {table} 
where 1=1
AND meta_date = '2020-01-01'
and user_id = 'foo'
and product_name like '%name1%'
answered a year ago
1

I'm also experiencing the same thing. Downgrading to v2 helps. If someone from AWS is interested, I can provide the files, config, and queries that are causing the issue.

It turned out that the cause was a missing or wrongly formatted page index (https://github.com/apache/parquet-format/blob/master/PageIndex.md). Engine v3 requires a valid page index in Parquet files to work.

tobiasz
answered 10 months ago
0

I'm having the same issue, but for me it seems to be related to a date column that has values like the ones below:

1754-08-30 22:43:41.129

2023-06-23 07:00:00.000

The query works fine in v2, but in v3 it only works for periods that contain no 1754 dates.

Colin
answered 10 months ago
  • Interesting. Our data does not contain such a column. We only have a bigint timestamp column. None of them are < 0 though

0

I'm experiencing the same thing. I have a Parquet file that does not load in Athena engine version 3. I downgraded the engine version to 2 in the Athena workgroup and it reads fine. Is this a bug in Athena?
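
For anyone who wants to script the workgroup downgrade mentioned here, this is a minimal sketch in Python. It assumes boto3 is installed and AWS credentials are configured. The helper names `engine_version_update` and `apply_update` and the workgroup name `my-workgroup` are made up for illustration; only the request layout follows the Athena UpdateWorkGroup API. Whether engine version 2 can still be selected may depend on your account and region.

```python
def engine_version_update(workgroup: str, version: int) -> dict:
    """Build keyword arguments for Athena's UpdateWorkGroup call.

    Hypothetical helper for illustration; only the dict layout
    mirrors the API request shape.
    """
    if version not in (2, 3):
        raise ValueError("this sketch only covers engine versions 2 and 3")
    return {
        "WorkGroup": workgroup,
        "ConfigurationUpdates": {
            "EngineVersion": {
                "SelectedEngineVersion": f"Athena engine version {version}"
            }
        },
    }


def apply_update(kwargs: dict) -> None:
    """Send the update to Athena (needs boto3 and valid credentials)."""
    # Imported lazily so the builder above works without boto3 installed.
    import boto3

    boto3.client("athena").update_work_group(**kwargs)


if __name__ == "__main__":
    # "my-workgroup" is a placeholder, not a name from this thread.
    print(engine_version_update("my-workgroup", 2))
```

Calling `apply_update(engine_version_update("my-workgroup", 2))` would switch that workgroup back to v2; re-running the failing query afterwards is a quick way to confirm that the engine version is what makes the difference.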

kade
answered a year ago
0

I'm also experiencing the same thing. Downgrading to v2 helps. If someone from AWS is interested, I can provide the files, config, and queries that are causing the issue.

tobiasz
answered a year ago
0

Experiencing the same issue. I uploaded a lot of Parquet files, and queries fail when they span the new and old files combined. I inspected the schemas with parquet-tools and they are exactly the same, so no issue there.

Downgrading to v2 also helped us, but why?

answered a year ago
0

For everyone finding this thread: it seems to me that Athena engine version 3 cannot handle timestamps before the Unix epoch (January 1, 1970). I opened a thread about it before stumbling upon this one: https://repost.aws/questions/QUYNLXewhyQwaOxR0-FdbQEQ/bug-athena-engine-version-3-cannot-handle-timestamps-before-epoch-time.
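
To check whether your own data falls into this case, here is a small sketch in Python (standard library only). It converts timestamp strings to epoch milliseconds and lists the ones that land before 1970, using the two sample values quoted earlier in this thread. The function names are made up for illustration, and the strings are treated as UTC, which is an assumption about how the data was written.

```python
from datetime import datetime, timedelta

# Unix epoch as a naive datetime; inputs are assumed to be UTC as well.
EPOCH = datetime(1970, 1, 1)


def epoch_millis(ts: str) -> int:
    """Convert 'YYYY-MM-DD HH:MM:SS.fff' to milliseconds since the epoch."""
    parsed = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f")
    # Dividing two timedeltas with // yields an int; negative before 1970.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


def pre_epoch(values):
    """Return the timestamps that fall before January 1, 1970."""
    return [v for v in values if epoch_millis(v) < 0]


if __name__ == "__main__":
    samples = ["1754-08-30 22:43:41.129", "2023-06-23 07:00:00.000"]
    print(pre_epoch(samples))  # prints ['1754-08-30 22:43:41.129']
```

If this returns values for a column in a table that fails on v3, that is at least consistent with the pre-epoch explanation, though it does not rule out other causes reported in this thread.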

Mods, let me know if I should move my answer from the thread I opened to here in order to keep the information in one spot.

answered 7 months ago
