Glue table partition format question


Hi, we export old data from our database to S3 in this format:

{table}/year/month/day/asset_id.parquet

so we have a lot of files like this:

table/2024/01/28/z.parquet
table/2024/01/28/y.parquet
table/2024/01/29/z.parquet
table/2024/01/29/y.parquet
etc.

When I edit my Glue table and click on Partitions, it says: "No available partitions." Am I missing something? Is it because we need to state in the path what each number means, like this for example?

{table}/year=2024/month=10/day=28/z.parquet
{table}/year=2024/month=10/day=28/y.parquet

Is Glue able to partition automatically with my file naming convention "accelerometer/2024/01/28"?

What would you recommend I do? Thank you!

LouisAW
asked 3 months ago, 362 views
1 Answer

Partitions have to be added to the catalog explicitly (except when you use Athena partition projection); it is not enough to have the data under the right path.
Because your paths do not follow the Hive naming convention (key=value, as in year=2024/month=10/day=28), you cannot "repair" the table, and the crawler cannot give the partition columns meaningful names (they will be named partition_0, partition_1, partition_2...). You can, however, rename the partition columns after the crawler has loaded them.
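
If you would rather register the partitions yourself than rely on the crawler, one option is to generate an ALTER TABLE ... ADD PARTITION statement from your S3 keys and run it in Athena. The following is only a rough sketch: the bucket name `my-bucket` and the helper `partition_ddl` are made up for illustration, and it assumes the table already exists with string partition columns named year, month and day.

```python
import re

# Hypothetical names for illustration only; replace with your own.
BUCKET = "my-bucket"
TABLE = "accelerometer"

# Matches keys such as "accelerometer/2024/01/28/z.parquet".
KEY_PATTERN = re.compile(
    r"^(?P<table>[^/]+)/(?P<year>\d{4})/(?P<month>\d{2})/(?P<day>\d{2})/[^/]+\.parquet$"
)


def partition_ddl(keys, bucket=BUCKET, table=TABLE):
    """Build one ALTER TABLE statement that registers every distinct
    year/month/day prefix found in `keys` as a partition.

    Assumes the partition columns are strings called year, month, day.
    Returns None when no key matches the expected layout.
    """
    partitions = set()
    for key in keys:
        match = KEY_PATTERN.match(key)
        if match and match["table"] == table:
            partitions.add((match["year"], match["month"], match["day"]))
    if not partitions:
        return None
    clauses = [
        f"PARTITION (year='{y}', month='{m}', day='{d}') "
        f"LOCATION 's3://{bucket}/{table}/{y}/{m}/{d}/'"
        for y, m, d in sorted(partitions)
    ]
    return f"ALTER TABLE {table} ADD IF NOT EXISTS\n  " + "\n  ".join(clauses)
```

Calling `partition_ddl(["accelerometer/2024/01/28/z.parquet", "accelerometer/2024/01/29/y.parquet"])` returns a single statement with one PARTITION clause per day, each pointing at the existing non-Hive prefix. The explicit LOCATION is what lets you keep the current layout without renaming any files.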

AWS
EXPERT
answered 3 months ago
