Based on your description, there are several potential causes for the "Internal Service Exception" in your Glue Crawler when connecting to SQL Server via JDBC.
Since your connection tests successfully and works in a Glue job, but fails during crawling, this suggests the issue might be related to specific permissions or configurations needed for crawling operations rather than basic connectivity.
Here are some troubleshooting steps and considerations:
- For error logs and increased verbosity:
  - Unfortunately, Glue Crawlers don't provide extensive logging options beyond what you're seeing in CloudWatch.
  - You might want to check if there are any throttling-related errors in CloudTrail that could indicate resource constraints.
- Regarding SQL Server permissions for crawling:
  - The database user needs SELECT permissions on the tables you want to crawl.
  - The user should have permissions to query system tables and metadata views, including:
    - INFORMATION_SCHEMA views
    - sys.tables, sys.columns, sys.types, and other system catalog views
    - Permissions to execute sp_tables and sp_columns stored procedures
  - Consider creating a dedicated database user with appropriate read-only permissions specifically for the crawler.
- Additional troubleshooting steps:
  - Try limiting the crawler scope to a single table or schema instead of using the wildcard path.
  - Check if there are any special characters or unsupported data types in your SQL Server tables.
  - Verify network configuration, ensuring the security groups, subnets, and VPC settings allow proper connectivity.
  - Ensure the IAM role has all necessary permissions for both the connection and writing to the Glue Data Catalog.
  - Consider using a custom JDBC driver if you're encountering compatibility issues with the default driver.
  - Check if your SQL Server requires specific authentication methods or encryption settings that might be causing issues during the crawling process.
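For the dedicated read-only user mentioned above, here is a minimal sketch of how the grants might be generated. The user name `glue_crawler` and the schema `dbo` are hypothetical placeholders, and whether this exact set of grants is enough for the crawler is an assumption you would need to verify in your environment. The T-SQL itself (`GRANT SELECT` and `GRANT VIEW DEFINITION` at schema scope) is standard SQL Server syntax; in SQL Server, a user generally only sees catalog metadata for objects they hold some permission on.

```python
def crawler_grant_statements(user: str, schemas: list[str]) -> list[str]:
    """Build T-SQL GRANT statements giving `user` read and metadata
    access on each schema. Names are bracket-quoted for safety."""

    def quote(name: str) -> str:
        # Escape a closing bracket by doubling it, as T-SQL expects.
        return "[" + name.replace("]", "]]") + "]"

    statements = []
    for schema in schemas:
        # Read access to every table and view in the schema.
        statements.append(
            f"GRANT SELECT ON SCHEMA::{quote(schema)} TO {quote(user)};"
        )
        # Explicit metadata visibility for objects in the schema.
        statements.append(
            f"GRANT VIEW DEFINITION ON SCHEMA::{quote(schema)} TO {quote(user)};"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical user and schema names; replace with your own.
    print("\n".join(crawler_grant_statements("glue_crawler", ["dbo"])))
```

Run the printed statements in the target database as an administrator, then point the Glue connection at that user and re-run the crawler to see whether the exception changes.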
Since the crawler runs for about 5 minutes before failing, it's possible that it encounters a specific table or schema element that causes the exception. Try crawling smaller subsets of your database to isolate the problematic area.
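One way to narrow this down is to replace the wildcard include path with explicit per-table paths and crawl half of them at a time. The sketch below only builds the `JdbcTargets` entries. The connection, database, schema, and table names are hypothetical, and it assumes the `database/schema/table` include-path form that Glue uses for engines with schemas, such as SQL Server.

```python
def jdbc_targets(connection_name: str, database: str, schema: str,
                 tables: list[str]) -> list[dict]:
    """Build one JDBC crawler target per table, using the
    database/schema/table include-path form."""
    return [
        {"ConnectionName": connection_name,
         "Path": f"{database}/{schema}/{table}"}
        for table in tables
    ]


def split_in_half(tables: list[str]) -> tuple[list[str], list[str]]:
    """Split the table list so each half can be crawled separately."""
    mid = len(tables) // 2
    return tables[:mid], tables[mid:]


if __name__ == "__main__":
    # Hypothetical names; replace with your own.
    first, second = split_in_half(["orders", "customers", "audit_log"])
    targets = jdbc_targets("my-sqlserver-conn", "sales", "dbo", first)
    print(targets)

    # To apply, pass the targets to an existing crawler (needs AWS credentials):
    # import boto3
    # glue = boto3.client("glue")
    # glue.update_crawler(Name="my-crawler", Targets={"JdbcTargets": targets})
```

If the crawl of one half succeeds and the other fails, split the failing half again until a single table remains, then inspect that table's column types and names.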
Sources
AWS Glue Connection to RDS SQL Server Fails – "Failed status due to failed connection" | AWS re:Post
Troubleshoot AWS Glue connection to a JDBC data source | AWS re:Post
Resolve connection error in AWS Glue | AWS re:Post
