Athena error - HIVE_BAD_DATA: Not valid Parquet file: s3://deng-utube-raw-us-east-1-dev/youtube/raw_stats_reference_data/FR_category_id.json expected magic number: PAR1 got: ] }


I have written a Lambda function that converts JSON files in the raw S3 bucket to Parquet and uploads them directly to the cleansed S3 bucket. I cannot delete the JSON files, since I want to convert them to Parquet. When I test the Lambda function, the Parquet output appears in the bucket, and the destination table is also in Parquet format, but I don't know why Athena is returning this error. Please help me with this one.

2 Answers
Accepted Answer

Based on the error, it looks like the table is pointing to a location that contains a JSON file: s3://deng-utube-raw-us-east-1-dev/youtube/raw_stats_reference_data/FR_category_id.json
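To illustrate why the message mentions a "magic number": a Parquet file starts and ends with the 4 bytes `PAR1`, and the `] }` in the error is most likely the tail of the JSON file that was read where those bytes were expected. Below is a minimal sketch of that check on raw bytes. The function names and the sample payloads are made up for illustration, and passing this check is necessary but not sufficient for a file to be valid Parquet.

```python
PARQUET_MAGIC = b"PAR1"  # 4-byte marker at the start and end of a Parquet file


def looks_like_parquet(data: bytes) -> bool:
    """Return True if the bytes start and end with the Parquet magic number."""
    return (
        len(data) >= 2 * len(PARQUET_MAGIC)
        and data.startswith(PARQUET_MAGIC)
        and data.endswith(PARQUET_MAGIC)
    )


def tail(data: bytes, n: int = 4) -> bytes:
    """Return the last n bytes, i.e. where a reader looks for the trailing magic."""
    return data[-n:]


# Hypothetical payloads: a small JSON document and a stand-in for a Parquet file.
json_bytes = b'{ "kind": "example", "items": [ ] }'
fake_parquet = PARQUET_MAGIC + b"\x00" * 16 + PARQUET_MAGIC

print(looks_like_parquet(json_bytes), tail(json_bytes))
print(looks_like_parquet(fake_parquet), tail(fake_parquet))
```

For a real object you could fetch only the last few bytes (for example with a ranged S3 GET) and pass them to `tail`, rather than downloading the whole file.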

Can you verify the following?

1- What LOCATION does the Athena table DDL show (for example, in the output of SHOW CREATE TABLE)?

2- If the LOCATION points to the raw bucket, then the error is valid: Athena is trying to read the JSON files there as Parquet.

3- Verify the LOCATION of your table and point it to the destination bucket/prefix that contains only Parquet files.
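The checks above can also be scripted. This is a rough sketch, assuming the table lives in the AWS Glue Data Catalog and that your Lambda writes objects with a `.parquet` suffix (both are assumptions on my side). `check_table`, `split_s3_uri` and `non_parquet_keys` are illustrative names, not part of any AWS API; the boto3 calls used are `get_table` on the Glue client and the `list_objects_v2` paginator on the S3 client.

```python
from typing import Iterable, List, Tuple


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/some/prefix/' into ('bucket', 'some/prefix/')."""
    if not uri.startswith("s3://"):
        raise ValueError(f"not an S3 URI: {uri}")
    bucket, _, prefix = uri[len("s3://"):].partition("/")
    return bucket, prefix


def non_parquet_keys(keys: Iterable[str]) -> List[str]:
    """Return keys that are not folder markers and do not end in .parquet."""
    return [
        k for k in keys
        if not k.endswith("/") and not k.lower().endswith(".parquet")
    ]


def check_table(database: str, table: str) -> List[str]:
    """List objects under the table's LOCATION that do not look like Parquet."""
    import boto3  # imported lazily so the helpers above work without AWS access

    glue = boto3.client("glue")
    response = glue.get_table(DatabaseName=database, Name=table)
    location = response["Table"]["StorageDescriptor"]["Location"]

    bucket, prefix = split_s3_uri(location)
    if prefix and not prefix.endswith("/"):
        prefix += "/"  # avoid matching sibling prefixes that share a name stem

    s3 = boto3.client("s3")
    keys: List[str] = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return non_parquet_keys(keys)
```

If `check_table("your_database", "your_table")` returns any `.json` keys, the table LOCATION is pointing at (or above) the raw data and should be changed to the prefix that holds only the Parquet output.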

AWS
EXPERT
answered a year ago
  • I have figured it out. The problem lay in the location of the destination bucket. Thank you for the response.


I guess the S3 prefix you've told Athena to use contains JSON files somewhere? It would be best to have separate prefixes for your JSON and Parquet files, and configure Athena to use only the Parquet files' prefix.
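As a small sketch of what "separate prefixes" could look like inside the Lambda: derive the output key from the incoming JSON key so the Parquet file always lands under its own prefix. The function name and the `youtube/cleansed_stats_reference_data/` prefix are hypothetical; substitute whatever layout your cleansed bucket actually uses.

```python
import posixpath


def cleansed_key(raw_key: str, raw_prefix: str, cleansed_prefix: str) -> str:
    """Map a raw JSON object key to a Parquet key under a separate prefix."""
    if not raw_key.startswith(raw_prefix):
        raise ValueError(f"{raw_key!r} is not under {raw_prefix!r}")
    relative = raw_key[len(raw_prefix):]          # e.g. 'FR_category_id.json'
    stem, _extension = posixpath.splitext(relative)
    return cleansed_prefix + stem + ".parquet"


# Hypothetical prefixes; only the raw key comes from the error message above.
print(cleansed_key(
    "youtube/raw_stats_reference_data/FR_category_id.json",
    "youtube/raw_stats_reference_data/",
    "youtube/cleansed_stats_reference_data/",
))
```

The Athena table LOCATION would then point at the cleansed prefix only, so the JSON originals can stay where they are without being scanned.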

EXPERT
answered a year ago
  • I have figured it out. The problem lay in the location of the destination bucket. Thank you for the response.
