1 Answer
It sounds like you're experiencing an issue with the AWS Glue crawler not deleting tables from the Glue Data Catalog when they've been dropped from your PostgreSQL database.
Your configuration looks correct: with the SchemaChangePolicy set to "DeleteBehavior": "DELETE_FROM_DATABASE", the crawler should remove tables from the Glue Data Catalog once it finds that they no longer exist in the source database.
Since the crawler isn't detecting the dropped tables, here are some potential issues to check:
1. JDBC connection configuration: Ensure your JDBC URL is correctly configured to connect to the right database schema where the tables were dropped.
2. IAM permissions: Verify that the IAM role used by the crawler has sufficient permissions to both read from the database and update/delete tables in the Glue Data Catalog.
3. Network connectivity: Check that there are no networking issues preventing proper communication between AWS Glue and your Aurora PostgreSQL database.
4. Database credentials and permissions: Ensure the database user credentials configured in your connection have the proper permissions to view schema information.
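Before working through the checks above, it can help to confirm what the crawler is actually configured to do and how its last run ended. The following is a minimal sketch, assuming you have already fetched the crawler with the Boto3 `get_crawler()` call; the helper name `summarize_crawler` and the crawler name in the usage comment are hypothetical.

```python
def summarize_crawler(crawler):
    """Extract deletion-related settings from the "Crawler" dict
    returned by the Glue get_crawler() call."""
    policy = crawler.get("SchemaChangePolicy", {})
    jdbc_targets = crawler.get("Targets", {}).get("JdbcTargets", [])
    last_crawl = crawler.get("LastCrawl", {})
    return {
        # Expected to be "DELETE_FROM_DATABASE" for catalog tables to be removed
        "delete_behavior": policy.get("DeleteBehavior"),
        # (connection name, include path) for each JDBC target
        "jdbc_paths": [(t.get("ConnectionName"), t.get("Path")) for t in jdbc_targets],
        # Outcome of the most recent run, if the crawler has run at all
        "last_crawl_status": last_crawl.get("Status"),
        "last_crawl_error": last_crawl.get("ErrorMessage"),
    }

# Usage (needs boto3 and AWS credentials; the crawler name is a placeholder):
#   import boto3
#   glue = boto3.client("glue")
#   crawler = glue.get_crawler(Name="my-postgres-crawler")["Crawler"]
#   print(summarize_crawler(crawler))
```

If the include path does not cover the schema the tables were dropped from, or the last crawl did not succeed, that is a likely place to start.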
If the crawler is successfully adding new tables but not removing deleted ones, you might need to manually delete the stale tables from your Glue Data Catalog using the AWS Glue API or SDK. In Boto3, for example, you can call `delete_table()` with the `DatabaseName` and `Name` parameters, where `DatabaseName` is the Data Catalog database (not the PostgreSQL database) and `Name` is the table name as it appears in the catalog.
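As a rough sketch of that manual cleanup with Boto3: the helpers below list the tables in a catalog database, work out which ones are no longer in the source, and delete them in batches. The function names, the `dry_run` flag, and the database and table names in the usage comment are hypothetical. It also assumes you supply the list of live tables yourself, using the names as they appear in the catalog (crawler-generated names for JDBC sources usually differ from the bare PostgreSQL table name).

```python
def find_stale_tables(glue, database_name, live_tables):
    """Return catalog table names that are not in live_tables."""
    paginator = glue.get_paginator("get_tables")
    catalog_names = []
    for page in paginator.paginate(DatabaseName=database_name):
        catalog_names.extend(t["Name"] for t in page["TableList"])
    live = {name.lower() for name in live_tables}
    return sorted(n for n in catalog_names if n.lower() not in live)


def delete_tables(glue, database_name, table_names, dry_run=True):
    """Delete tables in batches; return any errors reported by Glue.

    With dry_run=True (the default) nothing is deleted.
    """
    errors = []
    if dry_run:
        return errors
    # batch_delete_table accepts a limited number of names per call
    for i in range(0, len(table_names), 100):
        batch = table_names[i:i + 100]
        response = glue.batch_delete_table(
            DatabaseName=database_name, TablesToDelete=batch
        )
        errors.extend(response.get("Errors", []))
    return errors

# Usage (needs boto3 and AWS credentials; all names are placeholders):
#   import boto3
#   glue = boto3.client("glue")
#   stale = find_stale_tables(glue, "my_catalog_db", ["mydb_public_orders"])
#   print(stale)  # review before deleting
#   delete_tables(glue, "my_catalog_db", stale, dry_run=False)
```

Reviewing the stale list before passing `dry_run=False` is deliberate, since a mismatch in naming would otherwise delete tables that still exist in the source.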
As a workaround, you could try recreating the crawler or adjusting the include path to be more specific to the tables you want to track.
Sources
Troubleshoot AWS Glue connection to a JDBC data source | AWS re:Post
Use DeleteTable with an AWS SDK - AWS Glue
AWS Glue examples using SDK for Python (Boto3) - AWS SDK Code Examples
answered 9 months ago
