AWS Glue crawler - automatic column rename detection

I am using a Glue crawler to crawl Parquet files partitioned by type/yyyy/mm/dd, which has been working just great. However, we have just run into a scenario where we need to rename a column. Is it feasible to have the crawler detect that this is a rename, so that data written before the change can be selected by a query that refers to the new column name? Or can it only be treated as a brand new column? I was hoping the crawler would be able to pick up these kinds of changes automatically.

If the crawler cannot detect and act on a column rename, what would be the best way to achieve this? I was hoping to avoid going in and manually changing the Glue database. If we created the renamed column as a new column, is there a way to have the old data stored under the old column moved into the new column? And is there a way to have this detected automatically?

I want to avoid manual, non-automatic changes because of the volume of different message types and Parquet files. It doesn't really scale as a process if different people are going in and manipulating the database. Glue looked perfect: it seemed to pick up all the changes, partitions, etc. and just deal with them, but a rename seems to be something it struggles with. Suggestions very welcome.
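
To make the "new column" scenario concrete, here is a minimal sketch of one possible workaround, not something the crawler does by itself. It assumes the crawler ends up registering the renamed column as a new column next to the old one, that both columns have compatible types, and that the table is queried through Athena. The helper `build_rename_view_sql` and all database, table, view, and column names are hypothetical. It only builds the SQL text for a view that exposes old and new data under the new column name via `COALESCE`; running the statement is left out.

```python
def build_rename_view_sql(database, table, view, renames, other_columns):
    """Build a CREATE OR REPLACE VIEW statement (hypothetical helper).

    renames maps old column name -> new column name. For each pair the view
    exposes COALESCE(new, old) under the new name, so rows written before the
    rename (new column is NULL) fall back to the value in the old column.
    other_columns are passed through unchanged.
    """
    select_parts = [f'"{col}"' for col in other_columns]
    for old_name, new_name in renames.items():
        select_parts.append(
            f'COALESCE("{new_name}", "{old_name}") AS "{new_name}"'
        )
    column_list = ",\n    ".join(select_parts)
    return (
        f'CREATE OR REPLACE VIEW "{database}"."{view}" AS\n'
        f"SELECT\n    {column_list}\n"
        f'FROM "{database}"."{table}"'
    )


if __name__ == "__main__":
    # All names below are made up for illustration.
    sql = build_rename_view_sql(
        database="messages_db",
        table="orders",
        view="orders_v",
        renames={"cust_id": "customer_id"},
        other_columns=["order_id", "yyyy", "mm", "dd"],
    )
    print(sql)
```

Because the rename mapping is plain data, it could live in one shared config per message type instead of being applied by hand in the Glue database, though the mapping itself still has to be declared by someone, since the Parquet files carry no record that a column was renamed.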

pete
asked 8 months ago, 119 views
No answers
