Boto3 AWS Glue Crawler: how to get last run table changes?


In the AWS console, we can see if a crawler run returned with changes in schema for particular tables:

[Screenshot: crawler UI]

If I click where it says "3 table changes", I see the names of the tables that changed in that run.

Using boto3 or the API, how can I do that?

get_crawler_metrics only returns the number of affected tables, not their names.

EDIT:

After considering the accepted answer, I've solved my problem by recursively calling the GetTables API. Each table has a last-updated timestamp, which I use to check which tables were updated recently. This is enough for me: I don't care what happened to the tables, only that they changed recently, because I run this check immediately after the crawler finishes a run.
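A minimal sketch of the timestamp check described above, not the exact code used. It assumes `glue` is a boto3 Glue client, uses the client's `get_tables` paginator to walk all pages, and compares each table's `UpdateTime` against a cutoff chosen by the caller. The helper `default_cutoff` and its 15-minute window are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def recently_updated_tables(glue, database, since):
    """Return names of tables in `database` whose UpdateTime is at or after `since`.

    `glue` is expected to be a boto3 Glue client, e.g. boto3.client("glue").
    `since` must be timezone-aware, because boto3 returns aware datetimes.
    """
    changed = []
    # The paginator follows NextToken, so every page of tables is visited.
    paginator = glue.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database):
        for table in page["TableList"]:
            updated = table.get("UpdateTime")
            if updated is not None and updated >= since:
                changed.append(table["Name"])
    return changed


def default_cutoff(minutes=15):
    """Hypothetical helper: treat anything updated in the last `minutes` as recent."""
    return datetime.now(timezone.utc) - timedelta(minutes=minutes)
```

It could be called as `recently_updated_tables(boto3.client("glue"), "my_database", default_cutoff())`, where `my_database` is a placeholder. A tighter cutoff might be the crawler's last start time, if that is available from `get_crawler`; that is an assumption to verify against your own responses.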

Asked 2 months ago, viewed 422 times
1 Answer
Accepted Answer

I'd suggest using the AWS Glue Data Catalog APIs, such as GetTable or GetTables, to programmatically retrieve the metadata of all tables in a database before and after a crawler run. By comparing the two sets of metadata, you can identify which tables were created, updated, or deleted. See the GetTable and GetTables API references.
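As a rough sketch of that before/after comparison, assuming `glue` is a boto3 Glue client: the function names `snapshot_tables` and `diff_snapshots` are hypothetical, and fingerprinting each table by its column names and types is only one possible choice (it ignores partition keys and other properties).

```python
def snapshot_tables(glue, database):
    """Map each table name in `database` to a fingerprint of its columns."""
    snapshot = {}
    # The paginator follows NextToken, so all tables in the database are included.
    paginator = glue.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database):
        for table in page["TableList"]:
            columns = table.get("StorageDescriptor", {}).get("Columns", [])
            # Assumption: column names and types are enough to detect a schema change.
            snapshot[table["Name"]] = tuple(
                (col["Name"], col.get("Type")) for col in columns
            )
    return snapshot


def diff_snapshots(before, after):
    """Compare two snapshots and report created, updated, and deleted table names."""
    before_names, after_names = set(before), set(after)
    return {
        "created": sorted(after_names - before_names),
        "deleted": sorted(before_names - after_names),
        "updated": sorted(
            name for name in before_names & after_names
            if before[name] != after[name]
        ),
    }
```

The idea would be to call `snapshot_tables` once before starting the crawler and once after it finishes, then pass both results to `diff_snapshots`.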

The crawler logs and the AWS Glue console are other options, but they may not suit an automated workflow. For ad-hoc exploration, the console is the most direct way to see the names of the tables affected by a specific crawler run. In code, the Data Catalog APIs provide a way to retrieve this information by comparing table metadata.

AWS
Answered 2 months ago
Expert
Reviewed 2 months ago
