AWS Glue crawler - automatic column rename detection


I am using a Glue crawler to crawl Parquet files partitioned by type/yyyy/mm/dd, which has been working just great. However, we have just run into a scenario where we need to rename a column. Is it feasible to have the crawler detect that it is a rename, so that data written before the change can be selected with a query that refers to the new column name, or will it only ever be treated as a brand new column? I was hoping the crawler would be able to pick up these kinds of changes automatically.

If the crawler cannot correctly detect and act on a column rename, what would be the best way to achieve this? I was hoping to avoid going in and manually changing the Glue database. If we created the column as a new column, is there a way to have the old data stored under the old column moved into the new column? And is there a way to have this detected automatically?

I want to avoid manual / non-automatic changes because of the volume of different message types / Parquet files. It doesn't really scale as a process if different folks are coming in and manipulating the database. Glue looked to be perfect - it seemed to pick up all the changes, partitions etc. and just deal with it, but renames seem to be something it struggles with. Suggestions very welcome.
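
To make the "new column" scenario concrete, here is a minimal sketch of one possible workaround that avoids hand-editing the catalog. It assumes that after the rename the crawler lists both the old and the new column on the table, and that the query engine (Athena, for example) resolves Parquet columns by name, so each file returns NULL for the column it does not contain. The function name `build_rename_view_sql` and all database, table, view and column names are hypothetical, chosen only for illustration.

```python
def build_rename_view_sql(database, table, view, columns, renames):
    """Build a CREATE OR REPLACE VIEW statement that folds renamed columns together.

    columns: column names currently listed on the table (old and new mixed).
    renames: dict mapping old column name -> new column name (hypothetical input,
             e.g. maintained in a small config file per message type).
    """
    new_to_old = {new: old for old, new in renames.items()}
    select_items = []
    for col in columns:
        if col in renames:
            new_name = renames[col]
            if new_name in columns:
                # The old column is folded into the COALESCE emitted for the new one.
                continue
            # The new column is not in the table yet: just expose the old one under the new name.
            select_items.append(f'"{col}" AS "{new_name}"')
        elif col in new_to_old and new_to_old[col] in columns:
            # Prefer the new column, fall back to the old one for older files.
            select_items.append(f'COALESCE("{col}", "{new_to_old[col]}") AS "{col}"')
        else:
            select_items.append(f'"{col}"')
    select_list = ",\n  ".join(select_items)
    return (
        f'CREATE OR REPLACE VIEW "{database}"."{view}" AS\n'
        f"SELECT\n  {select_list}\n"
        f'FROM "{database}"."{table}"'
    )


if __name__ == "__main__":
    print(
        build_rename_view_sql(
            database="events_db",
            table="orders",
            view="orders_v",
            columns=["id", "cust_id", "customer_id", "yyyy", "mm", "dd"],
            renames={"cust_id": "customer_id"},
        )
    )
```

The generated statement could then be submitted to Athena (for instance with boto3's `start_query_execution`) after each crawler run, so queries against the view see old and new data under the new name. This only unifies the columns at query time; physically moving the old values into the new column would mean rewriting the older Parquet files, for example with a Glue ETL job.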

pete
asked 8 months ago · 119 views
No answers
