AWS Athena cannot query Parquet file

Hi.

I created a Parquet file with the Glue ETL visual tool, using an S3 source (CSV) and an S3 target (Parquet) with a partition key (date). When I create an Athena table over the Parquet location in S3, queries against it fail.

I'm getting the error below. Any help would be appreciated.

TYPE_MISMATCH: Unable to read parquet data. This is most likely caused by a mismatch between the parquet and metastore schema. This query ran against the "xxxxxx" database, unless qualified by the query.

Vijay
asked 8 months ago, 1837 views
1 Answer

Hi Vijay,

The error you are receiving is generally caused by a data type mismatch between the Parquet file and the Hive metastore schema. One example is a timestamp column whose format differs between Parquet and Hive. I would check any date or timestamp columns and modify your Athena query to use a CAST function to convert them. Take a look at "Data Types in Athena" in the Athena documentation for more information.
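To make that check concrete, here is a minimal sketch of comparing the column types found in the Parquet file with the types declared on the Athena table. It assumes you have already written both schemas down by hand as column-to-type mappings using Athena type names; the function name `find_type_mismatches` and the sample columns are hypothetical and are not taken from your data.

```python
def find_type_mismatches(file_schema, table_schema):
    """Return (column, file_type, table_type) for each column whose types differ."""
    mismatches = []
    for column, table_type in table_schema.items():
        file_type = file_schema.get(column)
        if file_type is None:
            # Column is not in the file at all, for example a partition
            # column that may exist only in the S3 path. Skipped here.
            continue
        if file_type.lower() != table_type.lower():
            mismatches.append((column, file_type, table_type))
    return mismatches


# Hypothetical schemas, for illustration only.
file_schema = {"id": "bigint", "amount": "double", "created_at": "string"}
table_schema = {
    "id": "bigint",
    "amount": "double",
    "created_at": "timestamp",
    "date": "string",
}

mismatches = find_type_mismatches(file_schema, table_schema)
for column, file_type, table_type in mismatches:
    print(f"{column}: file has {file_type}, table declares {table_type}")
```

Any column this reports is a candidate for the mismatch described above, and a good place to start looking.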

Are you able to share the query or schema, along with the rest of the error message you received, which should identify the column involved? Then we can help with a specific CAST or other function for the column that is not converting implicitly.

Thank you.

AWS
Jen_F
answered 7 months ago
