AWS Athena cannot query Parquet file

Hi.

I created a Parquet file with the ETL visual tool, using an S3 source (CSV) and an S3 target (Parquet) with a partition key (date). When I created an Athena table from the S3 data using the Parquet location, queries against it fail.

I'm getting the error below. Any help would be appreciated.

TYPE_MISMATCH: Unable to read parquet data. This is most likely caused by a mismatch between the parquet and metastore schema This query ran against the "xxxxxx" database, unless qualified by the query

Vijay
asked 9 months ago · 1885 views
1 answer

Hi Vijay,

The error you are receiving is generally caused by a data type mismatch between the Parquet file and the Hive metastore schema. One example of where this can occur is a timestamp column, where the timestamp format differs between Parquet and Hive. I would check any date or timestamp columns and modify your Athena query to use a CAST function to convert them. Take a look at Data Types in Athena for more information.
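To narrow down which column is affected, it can help to put the two schemas side by side. The snippet below is only a minimal sketch: it assumes you have already written out the column types from the Parquet file and from the Athena table definition by hand as plain dictionaries. The function name `find_type_mismatches` and the column names are hypothetical and are not part of any AWS API.

```python
def find_type_mismatches(parquet_schema, table_schema):
    """Return {column: (parquet_type, table_type)} for columns whose types differ."""
    mismatches = {}
    for column, parquet_type in parquet_schema.items():
        table_type = table_schema.get(column)
        # Compare case-insensitively; a column missing from the table shows up as None.
        if table_type is None or parquet_type.lower() != table_type.lower():
            mismatches[column] = (parquet_type, table_type)
    return mismatches


# Hypothetical schemas, typed in by hand for illustration only.
parquet_schema = {"id": "bigint", "amount": "double", "created_at": "string"}
table_schema = {"id": "bigint", "amount": "double", "created_at": "timestamp"}

print(find_type_mismatches(parquet_schema, table_schema))
```

Any column this reports is a candidate for the kind of mismatch described above, and a good place to start looking.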

Are you able to share the query or schema, along with the rest of the error message you received? Then we can suggest a specific CAST or other function for the column that is not converting on its own.

Thank you.

Jen_F (AWS)
answered 7 months ago
