AWS Glue crawler - automatic column rename detection


I am using a Glue crawler to crawl Parquet files partitioned by type/yyyy/mm/dd, and it has been working great. However, we have just run into a scenario where we need to rename a column. Can the crawler detect that the change is a rename, so that data written before the change can be selected with a query that uses the new column name? Or can it only treat the renamed column as a brand new column? I was hoping the crawler would pick up this kind of change automatically.

If the crawler cannot detect and handle a column rename, what would be the best way to achieve this? I was hoping to avoid going in and manually changing the Glue database. If we create the renamed column as a new column, is there a way to have the old data, stored under the old column name, moved into the new column? And is there a way to have this detected automatically?

I want to avoid manual, non-automatic changes because of the volume of different message types and Parquet files. It doesn't really scale as a process if different people are going in and manipulating the database. Glue looked perfect: it seemed to pick up all the changes, partitions and so on, and just deal with them, but a rename seems to be something it struggles with. Suggestions very welcome.
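
To make "old data selectable under the new column name" concrete, here is a minimal sketch of the kind of mapping I have in mind, assuming the rename ends up as two separate columns in the table and that the query engine accepts standard SQL views with `COALESCE`. The helper `build_rename_view_sql` and every table, view and column name in it are hypothetical, made up for illustration; this is not something the crawler produces.

```python
def build_rename_view_sql(table, view, old_col, new_col, other_cols):
    """Build SQL for a view that exposes old_col and new_col as one column.

    Assumption: old rows have a value only in old_col, new rows only in
    new_col, and both columns share the same data type.
    """
    def quote(name):
        # Double-quote identifiers, escaping any embedded double quotes.
        return '"' + name.replace('"', '""') + '"'

    select_list = [quote(col) for col in other_cols]
    # Take the new column when present, otherwise fall back to the old one.
    select_list.append(
        f"COALESCE({quote(new_col)}, {quote(old_col)}) AS {quote(new_col)}"
    )
    return (
        f"CREATE OR REPLACE VIEW {quote(view)} AS\n"
        f"SELECT {', '.join(select_list)}\n"
        f"FROM {quote(table)}"
    )


# Hypothetical names, for illustration only.
print(build_rename_view_sql(
    table="events",
    view="events_current",
    old_col="cust_id",
    new_col="customer_id",
    other_cols=["type", "event_time"],
))
```

Generating the statement from a list of renames would at least keep people from hand-editing each table, but it is still a step outside the crawler, which is what I was hoping to avoid.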

pete
asked 8 months ago, 119 views
No answers
