AWS Athena cannot query Parquet file

Hi.

I created a Parquet file with the ETL visual tool, using an S3 source (CSV) and an S3 target (Parquet) with a partition key (date). When I create an Athena table over the Parquet location in S3, queries against it fail.

I'm getting the error below. Any help would be appreciated.

TYPE_MISMATCH: Unable to read parquet data. This is most likely caused by a mismatch between the parquet and metastore schema This query ran against the "xxxxxx" database, unless qualified by the query

Vijay
Asked 9 months ago, viewed 1885 times
1 Answer
Hi Vijay,

The error you are receiving is generally caused by a mismatch between the data types in the Parquet file and those declared in the Hive metastore schema. One example is a timestamp column, where the timestamp format differs between Parquet and Hive. I would check any date or timestamp columns and modify your Athena query to use a CAST function to convert them. Take a look at Data Types in Athena for more information.
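As a rough illustration of the kind of check described above, here is a minimal Python sketch that compares the column types found in a Parquet file with the types declared for the table. Everything in it is an assumption made for illustration: the column names, the two schema dictionaries, the `COMPATIBLE` mapping, and the `find_mismatches` helper are hypothetical and do not come from the original question or from any AWS API.

```python
# Hypothetical schemas (column name -> type name). These columns and types
# are made up for illustration; they are not taken from the question.
parquet_schema = {"id": "int64", "amount": "double", "created_at": "string"}
metastore_schema = {"id": "bigint", "amount": "double", "created_at": "timestamp"}

# Assumed mapping from Parquet-style type names to the table type expected
# for them. Illustrative only, not an exhaustive or authoritative list.
COMPATIBLE = {
    "int32": "int",
    "int64": "bigint",
    "float": "float",
    "double": "double",
    "string": "string",
    "boolean": "boolean",
}


def find_mismatches(parquet_schema, metastore_schema):
    """Return (column, file_type, table_type) for every table column whose
    file type is missing or does not map to the declared table type."""
    mismatches = []
    for column, table_type in metastore_schema.items():
        file_type = parquet_schema.get(column)
        # A column absent from the file is reported with file_type None.
        if file_type is None or COMPATIBLE.get(file_type) != table_type:
            mismatches.append((column, file_type, table_type))
    return mismatches


for column, file_type, table_type in find_mismatches(parquet_schema, metastore_schema):
    print(f"{column}: file has {file_type}, table declares {table_type}")
```

In this made-up case only `created_at` is flagged, which is the sort of column where either changing the declared table type or converting the value with CAST in the query would be the next thing to try.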

Are you able to share the query or the table schema, along with the rest of the error message you received? Then we can help with a specific CAST or other function for the column that is failing to convert.

Thank you.

Jen_F (AWS)
Answered 7 months ago