Athena Error- HIVE_BAD_DATA: Not valid Parquet file: s3://deng-utube-raw-us-east-1-dev/youtube/raw_stats_reference_data/FR_category_id.json expected magic number: PAR1 got: ] }


I have written a Lambda function that converts JSON files in the raw S3 bucket into Parquet files and uploads them directly to the cleansed S3 bucket. I cannot delete the JSON files, since I still need them as the source for the Parquet conversion. When I test the Lambda function, the output in the bucket is in Parquet format, and the destination table is also defined as Parquet, but I don't know why Athena is showing this error. Please help me with this one.

2 Answers
Accepted Answer

Based on the error, it looks like the table is pointing to a location that contains a JSON file: s3://deng-utube-raw-us-east-1-dev/youtube/raw_stats_reference_data/FR_category_id.json

Can you verify the following?

1- What LOCATION does the Athena table DDL show?

2- If the LOCATION points to the raw bucket, then the error is valid.

3- Verify the LOCATION of your table and point it to the destination bucket/location, which should contain only Parquet files.
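
To see why Athena reports `expected magic number: PAR1 got: ] }`, it helps to know that a Parquet file begins and ends with the 4-byte marker `PAR1`, whereas the JSON file's last bytes are its closing `] }`. Below is a minimal sketch of that check on an in-memory stream. The function name `looks_like_parquet` and the sample byte strings are made up for illustration, and this is only a quick sanity check, not full Parquet validation.

```python
import io

PARQUET_MAGIC = b"PAR1"  # Parquet files start and end with these 4 bytes


def looks_like_parquet(stream) -> bool:
    """Return True if a seekable binary stream starts and ends with PAR1.

    Hypothetical helper for illustration; it does not parse the footer.
    """
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    if size < 2 * len(PARQUET_MAGIC):
        return False  # too small to hold both the leading and trailing marker
    stream.seek(0)
    head = stream.read(len(PARQUET_MAGIC))
    stream.seek(-len(PARQUET_MAGIC), io.SEEK_END)
    tail = stream.read(len(PARQUET_MAGIC))
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


# Made-up sample data: a JSON document and a fake Parquet-shaped payload.
json_bytes = b'{ "items": [ ] }'
parquet_like_bytes = PARQUET_MAGIC + b"\x00" * 16 + PARQUET_MAGIC

print(looks_like_parquet(io.BytesIO(json_bytes)))
print(looks_like_parquet(io.BytesIO(parquet_like_bytes)))
```

For an object in S3, the same idea should work by downloading the object (or only its last few bytes with a ranged GET) and passing the bytes through `io.BytesIO`.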

AWS
Expert
answered 1 year ago
  • I have figured it out. The problem was the location of the destination bucket. Thank you for the response.


I guess the S3 prefix you've told Athena to use contains JSON files somewhere? It would be best to have separate prefixes for your JSON and Parquet files, and configure Athena to use only the Parquet files' prefix.
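
As a rough illustration of that check, the sketch below takes a list of object keys (for example, from an S3 listing of the bucket) and reports any key under the table's prefix that does not look like a Parquet file. The function name `non_parquet_keys`, the `cleansed_stats_reference_data` prefix, and the `part-0000.parquet` key are hypothetical. It also assumes your Parquet objects carry a `.parquet` suffix, which the format itself does not require.

```python
def non_parquet_keys(keys, table_prefix):
    """Return keys under table_prefix that do not end in .parquet.

    Hypothetical helper: relies on a naming convention, not on file contents.
    """
    offenders = []
    for key in keys:
        if not key.startswith(table_prefix):
            continue  # outside the prefix the table points at
        if key.endswith("/"):
            continue  # zero-byte "folder" placeholder, not a data file
        if not key.lower().endswith(".parquet"):
            offenders.append(key)
    return offenders


# Made-up listing that mixes raw JSON and converted Parquet objects.
listing = [
    "youtube/raw_stats_reference_data/FR_category_id.json",
    "youtube/cleansed_stats_reference_data/",
    "youtube/cleansed_stats_reference_data/part-0000.parquet",
]

# A table pointed at the raw prefix would hit the JSON file.
print(non_parquet_keys(listing, "youtube/raw_stats_reference_data/"))
# A table pointed at the Parquet-only prefix has no offenders.
print(non_parquet_keys(listing, "youtube/cleansed_stats_reference_data/"))
```

If the first call returns anything for the prefix in your table's LOCATION, Athena will try to read those objects as Parquet and fail in the way shown in the question.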

Expert
answered 1 year ago
