AWS Glue: Crawler fails with Internal Service Error


Hello,

We are facing a very odd issue with AWS Glue. When we run a Glue crawler against our RDS PostgreSQL v14 database, we get only very vague error logs in CloudWatch (see below). Admittedly, the schema is relatively big (ca. 250 tables) and a bit of a mess. I also used the latest JDBC connector to set up the RDS connection because of another issue. Any idea what could be causing this?

2024-08-29T10:18:14.396Z [e3695841-c51b-440b-8271-c7d05017e076] BENCHMARK : Running Start Crawl for Crawler prod-db-crawler
2024-08-29T10:34:33.950Z [e3695841-c51b-440b-8271-c7d05017e076] ERROR : Internal Service Exception
2024-08-29T10:36:19.545Z [e3695841-c51b-440b-8271-c7d05017e076] BENCHMARK : Crawler has finished running and is in state READY

pbocan
asked a month ago, 54 views
1 Answer

Possible Causes:

  1. Large Schema Complexity: The database schema is large with around 250 tables, which may be overwhelming the Glue Crawler, leading to timeouts or internal service errors.

  2. JDBC Connector Issues: Using the latest JDBC connector may sometimes cause compatibility issues or bugs, especially if the schema is complex.

  3. Internal Service Errors: AWS Glue can sometimes throw vague "Internal Service Exception" errors due to underlying issues in the service, especially when dealing with large datasets or complex schemas.

Troubleshooting Steps:

  1. Schema Segmentation: Consider segmenting the schema into smaller parts and running the Glue Crawler on these segments individually. This can help you isolate the problematic parts of the schema and reduce the load on the crawler.

  2. Check JDBC Driver Compatibility: Ensure that you are using a JDBC driver that is fully compatible with both AWS Glue and your version of PostgreSQL (v14). Sometimes using an alternative or slightly older version of the JDBC driver may resolve compatibility issues.

  3. Increase Timeout Settings: If your setup exposes any timeout settings that apply to the crawl, increase them. Note that worker types are a setting of Glue jobs; as far as I know, crawlers do not offer a worker type option, so this step may not apply directly.

  4. CloudWatch Logs Analysis: Review the CloudWatch logs in more detail to see if there are any specific tables or operations that are causing the crawler to fail. Sometimes, specific patterns or table types can be identified as the root cause.
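
To make the schema segmentation step more concrete, here is a minimal Python sketch that splits a table list into batches of JDBC targets, so each batch can be crawled on its own. The connection name `prod-db-connection`, the database `proddb`, the schema `public`, and the table names are all hypothetical placeholders, not values from this thread. It assumes the `database/schema/table` include path form that Glue uses for PostgreSQL JDBC targets.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def jdbc_targets_for(connection_name, database, schema, tables):
    """Build one JDBC target entry per table, using an explicit include path."""
    return [
        {"ConnectionName": connection_name, "Path": f"{database}/{schema}/{table}"}
        for table in tables
    ]


# Hypothetical inputs: 250 made-up table names, crawled 50 at a time.
tables = [f"table_{i:03d}" for i in range(250)]
batches = [
    jdbc_targets_for("prod-db-connection", "proddb", "public", batch)
    for batch in chunked(tables, 50)
]

print(len(batches))            # number of batches
print(batches[0][0]["Path"])   # first include path of the first batch
```

Each batch could then be passed as `Targets={"JdbcTargets": batch}` to boto3's `update_crawler` (or to a separate crawler per batch). If one batch fails while the others succeed, the problematic tables are somewhere in that batch.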

Deeksha (AWS, EXPERT)
answered a month ago
  • I reverted the changes and changed the password type of the RDS user to MD5, and it worked. So I suppose it had little to nothing to do with any of those causes.
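
For anyone landing here with the same error: PostgreSQL 14 stores new passwords as `scram-sha-256` by default, and older PostgreSQL JDBC drivers cannot authenticate against SCRAM hashes. That is a plausible explanation for why switching to MD5 helped, but the thread does not confirm it. The sketch below is one reading of "changed the password type to MD5": it only builds the SQL statements, and the role name `glue_crawler` is a hypothetical placeholder.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def md5_password_reset_sql(role: str) -> list[str]:
    """Return the statements that re-hash a role's password as MD5.

    The password stays a %s placeholder so it is bound by the driver
    instead of being pasted into the SQL string.
    """
    return [
        # Session-level setting: passwords set after this are stored as MD5.
        "SET password_encryption = 'md5'",
        # Setting the password again stores it with the method chosen above.
        f"ALTER ROLE {quote_ident(role)} WITH PASSWORD %s",
    ]


# Hypothetical role name, for illustration only.
statements = md5_password_reset_sql("glue_crawler")
for stmt in statements:
    print(stmt)
```

Both statements need to run in the same database session, with the password supplied as the parameter of the second one through a driver that interpolates parameters client-side, such as psycopg2. Keep in mind that MD5 is the weaker of the two hashing methods, so this is a workaround rather than a recommended end state.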
