
HIVE_CANNOT_OPEN_SPLIT because of Index out of bounds for length Error


Hi, I am encountering a strange error while querying data for a particular day. My data is partitioned on four keys: group, year, month and day. When I query the data for one particular day, I get the following error:

HIVE_CANNOT_OPEN_SPLIT: Error opening Hive split s3://xxxxxxxxx/xxxxxxxxxx/xxxxxxxxxxx/group=xxxxxx/year=xxxx/month=xx/day=xx/part-000xx-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx-xxxx.snappy.parquet (offset=0, length=936927): Index 7 out of bounds for length 3

I am running a very simple query, something like:

select <field_name1> from xyzzy where group=xy and year='2023' and month='06' and day='27'

If I change the day, the year or any other filter criterion, there is no error. What could be wrong with this data? I don't have enough context to continue debugging, so any help would be appreciated.

asked 3 years ago, 537 views
1 Answer

Hello, the 'HIVE_CANNOT_OPEN_SPLIT: Index out of bounds' error is a known issue that usually occurs when running queries on Athena engine version 3, and it is caused by a Parquet-specific incompatibility: Athena engine version 3 uses column indices, while Athena engine version 2 does not. We are currently working on resolving this incompatibility in version 3. In the meantime, as a workaround, you can consider running the queries on Athena engine version 2.

If you also encounter the issue on version 2, or need more specific troubleshooting of your resources, please raise a support case with the AWS Premium Support team.
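If it helps when narrowing things down or preparing a support case, the error message itself names the exact file (split) that failed, along with its partition values. Below is a minimal Python sketch that pulls those details out of the message. The function name `parse_split_error` and the regular expression are illustrative assumptions based only on the message shape quoted in the question; they are not part of any AWS SDK, and other Athena error messages may be formatted differently.

```python
import re

# Assumed message shape, taken from the error quoted in the question.
# This is not an official format; adjust the pattern if your message differs.
SPLIT_ERROR = re.compile(
    r"HIVE_CANNOT_OPEN_SPLIT: Error opening Hive split (?P<path>s3://\S+) "
    r"\(offset=(?P<offset>\d+), length=(?P<length>\d+)\): (?P<cause>.*)"
)


def parse_split_error(message):
    """Return the failing file, byte range, cause and partition values,
    or None if the message does not match the assumed shape."""
    match = SPLIT_ERROR.search(message)
    if match is None:
        return None
    path = match.group("path")
    # Hive-style partition folders look like key=value in the S3 key.
    partitions = dict(
        segment.split("=", 1) for segment in path.split("/") if "=" in segment
    )
    return {
        "path": path,
        "offset": int(match.group("offset")),
        "length": int(match.group("length")),
        "cause": match.group("cause").strip(),
        "partitions": partitions,
    }


# Usage with the (redacted) message from the question.
message = (
    "HIVE_CANNOT_OPEN_SPLIT: Error opening Hive split "
    "s3://xxxxxxxxx/xxxxxxxxxx/xxxxxxxxxxx/group=xxxxxx/year=xxxx/month=xx/day=xx/"
    "part-000xx-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx-xxxx.snappy.parquet "
    "(offset=0, length=936927): Index 7 out of bounds for length 3"
)
details = parse_split_error(message)
print(details["path"])
print(details["partitions"])
```

Knowing the single affected file makes it easier to compare it with files from partitions that query successfully, or to reference it directly in a support case.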

AWS
answered 3 years ago
