AWS Glue crawler - automatic column rename detection

I am using a Glue crawler to crawl Parquet files partitioned by type/yyyy/mm/dd, which has been working well. However, we have just run into a scenario where we need to rename a column. Can the crawler detect that the change is a rename, so that data written before the change can be queried using the new column name, or will it always treat the renamed column as a brand new column? I was hoping the crawler would pick up this kind of change automatically.

If the crawler cannot detect and handle a column rename, what is the best way to achieve this? I was hoping to avoid manually editing the Glue database. If we add the renamed column as a new column, is there a way to move the data stored under the old column into the new one? And can that be done automatically?
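To make the behaviour I'm after concrete, here is a minimal sketch in plain Python. It is not something I know the crawler to do; the column names (`cust_id`, `customer_id`) and the `apply_renames` helper are hypothetical and only there for illustration. Rows written before the rename carry the old column name, rows written after carry the new one, and I would like both to be readable under the new name.

```python
from typing import Any, Dict, List

# Hypothetical rename map: old column name -> new column name.
RENAMES: Dict[str, str] = {"cust_id": "customer_id"}


def apply_renames(rows: List[Dict[str, Any]],
                  renames: Dict[str, str]) -> List[Dict[str, Any]]:
    """Return copies of rows with values under old names moved to new names."""
    out = []
    for row in rows:
        merged = dict(row)  # copy so the input rows are not mutated
        for old, new in renames.items():
            if old in merged:
                old_value = merged.pop(old)
                # Only fill the new column if it has no value yet.
                if merged.get(new) is None:
                    merged[new] = old_value
        out.append(merged)
    return out


# Rows as they might look before and after the rename (made-up data).
old_rows = [{"cust_id": 1, "amount": 10}]
new_rows = [{"customer_id": 2, "amount": 20}]

result = apply_renames(old_rows + new_rows, RENAMES)
print(result)
```

Ideally the equivalent of this mapping would happen without someone maintaining the rename map by hand for every message type.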

I want to avoid manual changes because of the volume of different message types and Parquet files. It doesn't scale as a process if different people are manually editing the database. Glue looked perfect: it seemed to pick up all the changes, partitions, etc. and just deal with them, but renames seem to be something it struggles with. Suggestions very welcome.

pete
asked 7 months ago, 112 views
No Answers
