Possible Causes:
- Large Schema Complexity: The database schema is large, with around 250 tables, which may be overwhelming the Glue Crawler and leading to timeouts or internal service errors.
- JDBC Connector Issues: Using the latest JDBC connector can sometimes cause compatibility issues or expose bugs, especially if the schema is complex.
- Internal Service Errors: AWS Glue can sometimes throw vague "Internal Service Exception" errors due to underlying issues in the service, especially when dealing with large datasets or complex schemas.
Troubleshooting Steps:
- Schema Segmentation: Consider segmenting the schema into smaller parts and running the Glue Crawler on these segments individually. This can help you isolate the problematic parts of the schema and reduce the load on the crawler.
- Check JDBC Driver Compatibility: Ensure that you are using a JDBC driver that is fully compatible with both AWS Glue and your version of PostgreSQL (v14). Sometimes using an alternative or slightly older version of the JDBC driver resolves compatibility issues.
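- Increase Timeout Settings: If possible, increase any timeout settings that apply to the crawl, for example on the database or connection side. Note that worker types are a Glue job setting rather than a crawler setting, so reducing the amount of schema crawled per run is usually the more practical way to lighten the load.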
- CloudWatch Logs Analysis: Review the CloudWatch logs in more detail to see whether specific tables or operations are causing the crawler to fail. Sometimes specific patterns or table types can be identified as the root cause.
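
To make the schema segmentation step more concrete, here is a minimal sketch of how the crawl could be split by table-name prefix. It only builds the request dictionaries and does not call AWS. It assumes the Glue JDBC include path format `database/schema/table` with `%` as a wildcard; the crawler names, role ARN, connection name, database, schema and prefixes are all hypothetical placeholders, not values from the original question.

```python
from typing import Dict, List


def build_crawler_requests(
    name_prefix: str,
    role_arn: str,
    catalog_database: str,
    connection_name: str,
    database: str,
    schema: str,
    table_prefixes: List[str],
) -> List[Dict]:
    """Build one crawler request per table-name prefix.

    Each request targets only tables whose names start with the given
    prefix, using an include path of the form database/schema/prefix%.
    """
    requests = []
    for prefix in table_prefixes:
        requests.append(
            {
                "Name": f"{name_prefix}-{prefix}",
                "Role": role_arn,
                # Glue Data Catalog database that receives the table definitions
                "DatabaseName": catalog_database,
                "Targets": {
                    "JdbcTargets": [
                        {
                            "ConnectionName": connection_name,
                            "Path": f"{database}/{schema}/{prefix}%",
                        }
                    ]
                },
            }
        )
    return requests


if __name__ == "__main__":
    # All values below are made up for illustration.
    for request in build_crawler_requests(
        name_prefix="pg",
        role_arn="arn:aws:iam::123456789012:role/GlueRole",
        catalog_database="catalog_db",
        connection_name="pg-conn",
        database="appdb",
        schema="public",
        table_prefixes=["order", "user"],
    ):
        print(request["Name"], request["Targets"]["JdbcTargets"][0]["Path"])
```

Each dictionary should be usable as keyword arguments to `boto3.client("glue").create_crawler(**request)`, followed by `start_crawler(Name=request["Name"])`. Running the segments one at a time makes it easier to see which group of tables, if any, triggers the failure.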
I reverted the changes and changed the password type of the RDS user to MD5, and it worked. So I suppose it had little to nothing to do with any of those causes.