Glue table partition format question

Hi, we archive old data from our database to S3 in this format:

{table}/year/month/day/asset_id.parquet

so we have a lot of files like this:

table/2024/01/28/z.parquet
table/2024/01/28/y.parquet
table/2024/01/29/z.parquet
table/2024/01/29/y.parquet
etc.

When I edit my Glue table and click on Partition, it says: "No available partitions." Am I missing something? Is it because we need to spell out the meaning of each number in the path, like this for example?

{table}/year=2024/month=10/day=28/z.parquet
{table}/year=2024/month=10/day=28/y.parquet

Is Glue able to partition automatically with my path naming convention "accelerometer/2024/01/28"?

What would you recommend I do? Thank you!

LouisAW
asked 3 months ago, 362 views
1 Answer

Partitions have to be added to the catalog explicitly (except when using Athena partition projection); it is not enough to have the data in the right path.
Because your paths do not follow the Hive-style key=value naming convention, you cannot "repair" the table (MSCK REPAIR TABLE), and the crawler cannot give the partition columns meaningful names (they will be named partition_0, partition_1, partition_2, ...). You could, however, rename the partition columns after the crawler has loaded them.
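
To make the two options concrete, here is a minimal Python sketch, not a definitive implementation. It assumes every key has exactly the layout from the question (table/year/month/day/file), that the partition columns are called year, month and day and are declared as strings on the table, and that the bucket name "my-bucket" is a placeholder. The helper names (split_key, hive_style_key, add_partition_ddl) are made up for illustration. One function shows what a Hive-style key would look like, the other builds the Athena ALTER TABLE ... ADD PARTITION statement you would need for the existing layout.

```python
from pathlib import PurePosixPath

# Assumed partition column names; use whatever your table actually declares.
PARTITION_COLUMNS = ("year", "month", "day")


def split_key(key):
    """Split 'table/2024/01/28/z.parquet' into (table, partition dict, file name)."""
    parts = PurePosixPath(key).parts
    if len(parts) != len(PARTITION_COLUMNS) + 2:
        raise ValueError(f"unexpected key layout: {key!r}")
    table, *values, filename = parts
    return table, dict(zip(PARTITION_COLUMNS, values)), filename


def hive_style_key(key):
    """Return the same key rewritten with Hive-style key=value folders."""
    table, partition, filename = split_key(key)
    folders = "/".join(f"{col}={val}" for col, val in partition.items())
    return f"{table}/{folders}/{filename}"


def add_partition_ddl(bucket, key):
    """Build an Athena DDL statement that registers the key's folder as a partition."""
    table, partition, _ = split_key(key)
    # Values are quoted because the partition columns are assumed to be strings.
    spec = ", ".join(f"{col}='{val}'" for col, val in partition.items())
    location = f"s3://{bucket}/{table}/" + "/".join(partition.values()) + "/"
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({spec}) LOCATION '{location}'"
    )


if __name__ == "__main__":
    key = "accelerometer/2024/01/28/z.parquet"
    print(hive_style_key(key))
    print(add_partition_ddl("my-bucket", key))  # "my-bucket" is a placeholder
```

If you can change the export job, writing Hive-style keys is probably the simpler route, since the crawler and MSCK REPAIR TABLE can then pick up the partitions by name. If you cannot, you would run one generated statement per day folder in Athena, against a table that already declares those partition columns.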

AWS
EXPERT
answered 3 months ago
