Glue table partition format question


Hi, we send old data from our database to S3 in this format:

{table}/year/month/day/asset_id.parquet

so we have a lot of files like this:

table/2024/01/28/z.parquet
table/2024/01/28/y.parquet
table/2024/01/29/z.parquet
table/2024/01/29/y.parquet
etc...

When I edit my Glue table and click on Partitions, it says: "No available partitions." Am I missing something? Is it because the path needs to spell out what each number means, like this for example?

{table}/year=2024/month=10/day=28/z.parquet
{table}/year=2024/month=10/day=28/y.parquet

Is Glue able to partition automatically with my file naming convention "accelerometer/2024/01/28"?

What would you recommend I do? Thank you!

LouisAW
asked 3 months ago, 362 views
1 Answer

Partitions have to be added to the catalog explicitly (except when using Athena partition projection); it is not enough to have the data under the right path.
Because your paths do not follow the key=value naming convention, you cannot "repair" the table (MSCK REPAIR TABLE), and the crawler cannot give the partitions meaningful names (they will be named partition_0, partition_1, partition_2, ...). You can, however, rename the partition columns after the crawler has loaded them.
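
If you would rather keep the existing layout and register the partitions yourself, something along these lines might work as a starting point. It is only a sketch: it assumes the table is already defined with string partition keys named year, month and day, and the bucket name my-example-bucket and the helper names are made up for illustration. It only builds Athena ALTER TABLE ... ADD PARTITION statements from object keys; running them (Athena console, boto3, etc.) is left to you.

```python
from typing import Iterable, List, Tuple


def partitions_from_keys(keys: Iterable[str]) -> List[Tuple[str, str, str, str]]:
    """Return sorted unique (table, year, month, day) tuples found in the keys."""
    found = set()
    for key in keys:
        parts = key.strip("/").split("/")
        # Expect exactly: table / year / month / day / file
        if len(parts) != 5:
            continue
        table, year, month, day, _file = parts
        if year.isdigit() and month.isdigit() and day.isdigit():
            found.add((table, year, month, day))
    return sorted(found)


def add_partition_ddl(bucket: str, keys: Iterable[str]) -> List[str]:
    """Build one ALTER TABLE ... ADD PARTITION statement per distinct day.

    Assumption: the table has partition keys year, month, day (all strings).
    """
    statements = []
    for table, year, month, day in partitions_from_keys(keys):
        location = f"s3://{bucket}/{table}/{year}/{month}/{day}/"
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION (year='{year}', month='{month}', day='{day}') "
            f"LOCATION '{location}'"
        )
    return statements


if __name__ == "__main__":
    example_keys = [
        "accelerometer/2024/01/28/z.parquet",
        "accelerometer/2024/01/28/y.parquet",
        "accelerometer/2024/01/29/z.parquet",
    ]
    # "my-example-bucket" is a placeholder bucket name.
    for stmt in add_partition_ddl("my-example-bucket", example_keys):
        print(stmt)
```

Each statement points a named partition at the unlabelled folder, so the files do not have to be moved. New days still need a new statement each time, which is the trade-off compared with renaming the paths to year=/month=/day=.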

AWS
EXPERT
answered 3 months ago
