AWS Athena cannot query Parquet file


Hi.

I created a Parquet file with the visual ETL tool, using an S3 source (CSV) and an S3 target (Parquet) with a partition key (date). When I create an Athena table over the Parquet location in S3, queries against it fail.

I'm getting the error below. Any help would be appreciated.

TYPE_MISMATCH: Unable to read parquet data. This is most likely caused by a mismatch between the parquet and metastore schema. This query ran against the "xxxxxx" database, unless qualified by the query.

Vijay
Asked 9 months ago, 1886 views
1 Answer

Hi Vijay,

The error you are receiving is generally caused by a data type mismatch between a column in the Parquet file and the same column in the Hive metastore schema. One example of where this can occur is a timestamp column, where the timestamp format differs between Parquet and Hive. I would check any date or timestamp columns and modify your Athena query to use a CAST function to convert them. Take a look at Data Types in Athena for more information.
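As a rough illustration of the kind of check described above, the following Python sketch compares the column types found in a Parquet file with the types declared in the Athena table and lists the columns that disagree. The function name `find_type_mismatches` and both example schemas are hypothetical; in practice you would fill the two dictionaries from your own file metadata and table definition.

```python
def find_type_mismatches(parquet_schema, table_schema):
    """Return {column: (file_type, table_type)} for columns whose types differ."""
    mismatches = {}
    for column, table_type in table_schema.items():
        file_type = parquet_schema.get(column)
        if file_type is None:
            # Column is not stored in the file, for example a partition key
            # that lives in the S3 path rather than in the Parquet data.
            continue
        if file_type.lower() != table_type.lower():
            mismatches[column] = (file_type, table_type)
    return mismatches


# Hypothetical schemas, for illustration only.
parquet_schema = {"id": "bigint", "amount": "double", "created_at": "string"}
table_schema = {
    "id": "bigint",
    "amount": "double",
    "created_at": "timestamp",
    "date": "string",  # partition key, assumed to be absent from the file
}

mismatches = find_type_mismatches(parquet_schema, table_schema)
for column, (file_type, table_type) in mismatches.items():
    print(f"{column}: file has {file_type}, table declares {table_type}")
```

Any column reported here is a candidate for the date or timestamp check mentioned above.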

Are you able to share the query or schema, along with the rest of the error message you received? Then we can suggest a specific CAST or other function for the column that is not converting implicitly.

Thank you.


Jen_F (AWS)
Answered 7 months ago
