AWS Glue crawler - automatic column rename detection

I am using a Glue crawler to crawl Parquet files partitioned by type/yyyy/mm/dd, which has been working just great. However, we have just run into a scenario where we need to rename a column. Is it feasible to have the crawler detect that the change is a rename, so that data written before the change can be selected by a query that refers to the new column name? Or is it only possible to have it treated as a brand new column? I was hoping the crawler would be able to pick up these kinds of changes automatically.

If it is not possible for the crawler to correctly detect and act on a column rename, what would be the best way to achieve this? I was hoping to avoid going in and manually changing the Glue database. If we created the renamed column as a new column, is there a way to have the old data stored under the old column moved into the new column? And is there a way to have this detected automatically?
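To illustrate the result I am after, here is a minimal sketch in Python that generates a view exposing the old and new column under the new name. It is only a sketch under assumptions: the query engine is assumed to support `CREATE OR REPLACE VIEW` and `COALESCE` with double-quoted identifiers, and the table, view, and column names (`events`, `events_v`, `cust_id`, `customer_id`) are hypothetical. It does not detect the rename; the old-to-new mapping still has to be supplied by hand, which is exactly the step I would like to avoid.

```python
def build_rename_view_sql(table, view, columns, renames):
    """Build a CREATE OR REPLACE VIEW statement that folds renamed columns together.

    table:   name of the crawled table (hypothetical in the example below)
    view:    name of the view to create
    columns: all column names currently in the table, old and new names included
    renames: mapping of new column name -> old column name
    """
    old_names = set(renames.values())
    select_items = []
    for col in columns:
        if col in old_names:
            # The old name is folded into the new column, so it is not selected on its own.
            continue
        if col in renames:
            # Prefer the new column; fall back to the old one for data written before the rename.
            select_items.append(f'COALESCE("{col}", "{renames[col]}") AS "{col}"')
        else:
            select_items.append(f'"{col}"')
    select_list = ",\n  ".join(select_items)
    return f'CREATE OR REPLACE VIEW "{view}" AS\nSELECT\n  {select_list}\nFROM "{table}"'


if __name__ == "__main__":
    # Hypothetical table where cust_id was renamed to customer_id.
    print(build_rename_view_sql(
        table="events",
        view="events_v",
        columns=["id", "cust_id", "customer_id", "dt"],
        renames={"customer_id": "cust_id"},
    ))
```

Queries would then go against the view rather than the crawled table, and the old and new columns would need compatible types for `COALESCE` to work.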

I want to avoid making manual, non-automatic changes because of the volume of different message types and Parquet files. It doesn't really scale as a process if different people are going in and manipulating the database. Glue looked to be perfect: it seemed to pick up all the changes, partitions, etc. and just deal with them, but a rename seems to be something it struggles with. Suggestions very welcome.

pete
Asked 8 months ago · Viewed 119 times
No answers
