Athena error: HIVE_BAD_DATA: Not valid Parquet file: s3://deng-utube-raw-us-east-1-dev/youtube/raw_stats_reference_data/FR_category_id.json expected magic number: PAR1 got: ] }

I have written a Lambda function that converts a JSON file in the raw S3 bucket into a Parquet file and uploads it directly to the cleansed S3 bucket. I cannot delete the JSON files, since they are the input to the conversion. When I test the Lambda function, the Parquet output appears in the cleansed bucket, and the destination table is also defined as Parquet, but Athena still returns the error above when I query it. Why does this happen?

2 answers
Accepted answer

Based on the error, it looks like the table is pointing to a location that contains a JSON file: s3://deng-utube-raw-us-east-1-dev/youtube/raw_stats_reference_data/FR_category_id.json

Can you verify the following?

1- What does the Athena table DDL show as LOCATION?

2- If the location points to the raw bucket, then the error is expected: Athena is trying to read the JSON files there as Parquet.

3- Change the LOCATION of your table so that it points to the destination bucket/prefix, which should contain only Parquet files.
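To illustrate what the error is complaining about: a Parquet file starts and ends with the 4-byte marker `PAR1`, and here the reader found `] }` (the tail of a JSON document) instead. The following is only a minimal sketch of that check on in-memory bytes; the helper names `has_parquet_magic` and `find_non_parquet` and the sample data are made up for illustration, and fetching the real objects from S3 is left out.

```python
PARQUET_MAGIC = b"PAR1"  # marker at both the start and the end of a Parquet file


def has_parquet_magic(data: bytes) -> bool:
    """Return True if the bytes begin and end with the Parquet marker."""
    # Require at least 8 bytes so the leading and trailing markers cannot overlap.
    return (
        len(data) >= 8
        and data[:4] == PARQUET_MAGIC
        and data[-4:] == PARQUET_MAGIC
    )


def find_non_parquet(objects: dict[str, bytes]) -> list[str]:
    """Return the keys (sorted) whose content does not look like Parquet.

    `objects` maps an object key to its content; how the content is
    downloaded is outside the scope of this sketch.
    """
    return sorted(key for key, body in objects.items() if not has_parquet_magic(body))


if __name__ == "__main__":
    # Hypothetical sample: one fake Parquet-like blob and one JSON document.
    sample = {
        "FR_category_id.parquet": b"PAR1" + b"\x00" * 8 + b"PAR1",
        "FR_category_id.json": b'{ "items": [ ] }\n',
    }
    print(find_non_parquet(sample))
```

Any key this reports under the table's LOCATION is a file Athena will fail on when the table is declared as Parquet.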

AWS
EXPERT
answered a year ago
  • I have figured it out: the problem was the location of the destination bucket. Thank you for the response.

I guess the S3 prefix you've told Athena to use contains JSON files somewhere. It would be best to keep your JSON and Parquet files under separate prefixes, and configure the Athena table to use only the prefix that holds the Parquet files.
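One way to keep the two formats apart is to have the conversion step derive the output key from the input key, so Parquet output never lands under the raw prefix. This is only a sketch: `RAW_PREFIX` is taken from the path in the error message, while `CLEANSED_PREFIX` and the helper `cleansed_key` are hypothetical names, not the asker's actual configuration.

```python
from pathlib import PurePosixPath

RAW_PREFIX = "youtube/raw_stats_reference_data/"  # prefix seen in the error message
CLEANSED_PREFIX = "youtube/cleansed_stats_reference_data/"  # hypothetical output prefix


def cleansed_key(raw_key: str) -> str:
    """Map a raw JSON object key to a Parquet key under a separate prefix."""
    if not raw_key.startswith(RAW_PREFIX):
        raise ValueError(f"unexpected key outside the raw prefix: {raw_key}")
    # Keep any sub-folders below the raw prefix and swap the file extension.
    relative = PurePosixPath(raw_key[len(RAW_PREFIX):])
    return CLEANSED_PREFIX + str(relative.with_suffix(".parquet"))


if __name__ == "__main__":
    print(cleansed_key("youtube/raw_stats_reference_data/FR_category_id.json"))
```

The Athena table's LOCATION would then point at the cleansed prefix only, so the JSON inputs can stay where they are.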

EXPERT
answered a year ago
