Boto3 AWS Glue Crawler: how to get last run table changes?


In the AWS console, we can see if a crawler run returned with changes in schema for particular tables:

[Screenshot: crawler UI]

If I were to click on where it says "3 table changes", I would see the names of the tables that changed in that run.

Using boto3 or the API, how can I do that?

get_crawler_metrics returns only the number of affected tables, not their names.

EDIT:

After considering the accepted answer, I solved my problem by recursively calling the GetTables API. Each table has a last-updated timestamp, which I used to check which tables were updated recently. This is enough for me: I don't care what happened to the tables, only that they changed recently, because I run this check immediately after the crawler finishes a run.
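The approach in the edit above can be sketched roughly as follows. This is a minimal sketch, not the asker's actual code: the function name `tables_updated_since`, the database name, and the 15-minute window are illustrative assumptions. It uses the boto3 `get_tables` paginator instead of manual recursion over `NextToken`, and falls back to `CreateTime` when a table has no `UpdateTime`.

```python
def tables_updated_since(glue, database_name, cutoff):
    """Return names of tables updated at or after `cutoff`.

    `glue` is expected to be a boto3 Glue client; `cutoff` must be a
    timezone-aware datetime, because boto3 returns timezone-aware timestamps.
    """
    changed = []
    # The paginator follows NextToken for us, so no manual recursion is needed.
    paginator = glue.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database_name):
        for table in page["TableList"]:
            # Assumption: fall back to CreateTime if UpdateTime is absent.
            updated = table.get("UpdateTime") or table.get("CreateTime")
            if updated is not None and updated >= cutoff:
                changed.append(table["Name"])
    return changed


# Hypothetical usage, run right after the crawler finishes:
#
#   import boto3
#   from datetime import datetime, timedelta, timezone
#
#   glue = boto3.client("glue")
#   cutoff = datetime.now(timezone.utc) - timedelta(minutes=15)
#   print(tables_updated_since(glue, "my_database", cutoff))
```

Note that a timestamp check like this only says that a table changed recently, not what changed or which crawler run changed it.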

asked 2 months ago · 421 views
1 Answer
Accepted Answer

I'd suggest using the AWS Glue Data Catalog APIs, such as GetTable or GetTables, to retrieve the metadata of all tables in a database before and after a crawler run. By comparing the two sets of metadata, you can identify which tables were created, updated, or deleted. See the GetTable and GetTables API documentation.
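A before/after comparison along these lines might look like the sketch below. The helper names `snapshot_tables` and `diff_snapshots` are hypothetical, and treating a change in `UpdateTime` or `VersionId` as "updated" is an assumption; comparing the column definitions in `StorageDescriptor` would be a stricter alternative.

```python
def snapshot_tables(glue, database_name):
    """Map each table name to (UpdateTime, VersionId) for one database.

    `glue` is expected to be a boto3 Glue client.
    """
    snapshot = {}
    paginator = glue.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database_name):
        for table in page["TableList"]:
            snapshot[table["Name"]] = (
                table.get("UpdateTime"),
                table.get("VersionId"),
            )
    return snapshot


def diff_snapshots(before, after):
    """Classify table names as created, updated, or deleted between snapshots."""
    created = sorted(after.keys() - before.keys())
    deleted = sorted(before.keys() - after.keys())
    # Assumption: a table counts as updated if its fingerprint tuple differs.
    updated = sorted(
        name for name in before.keys() & after.keys()
        if before[name] != after[name]
    )
    return {"created": created, "updated": updated, "deleted": deleted}


# Hypothetical usage around a crawler run:
#
#   import boto3
#   glue = boto3.client("glue")
#   before = snapshot_tables(glue, "my_database")
#   ...start the crawler and wait for it to finish...
#   after = snapshot_tables(glue, "my_database")
#   print(diff_snapshots(before, after))
```

Taking the first snapshot just before starting the crawler keeps the diff scoped to that run, provided nothing else modifies the catalog in the meantime.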

The crawler logs and the AWS Glue console are other options, but they may not be suitable for automation. For ad hoc exploration, the console is the most direct way to see the names of the tables affected by a specific crawler run. In code, the Data Catalog APIs let you retrieve this information by comparing table metadata.

AWS
answered 2 months ago
EXPERT
reviewed 2 months ago
