Why does my AWS Glue crawler fail with an internal service exception?


My AWS Glue crawler fails with the error "ERROR : Internal Service Exception".

Resolution

Transient issues with the AWS Glue crawler internal service can cause intermittent exceptions. Before you start to troubleshoot, run the crawler again. If you continue to get an internal service exception, then check for the following common issues.

Data issues

If your AWS Glue crawler is configured to process a large amount of data, then the crawler might encounter an internal service exception. Review the following causes of data issues and their remediations:

  • If you have a large number of small files, then the crawler might fail with an internal service exception. To avoid this issue, use the S3DistCp tool to combine smaller files. You might incur additional Amazon EMR charges when you use S3DistCp. You can also set exclude patterns so that the crawler skips files that it doesn't need to scan, or turn on sampling to avoid scanning all of the files within a prefix.
  • If your crawler is nearing the 24 hour timeout value, then split the workflow to prevent memory issues. For more information, see Why is the AWS Glue crawler running for a long time?
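To get a feel for how much consolidation helps, the following sketch plans how many combined objects an S3DistCp-style merge would produce. The function name and object sizes are illustrative assumptions, not part of AWS Glue or S3DistCp:

```python
# Hypothetical sketch: greedily group small objects into batches of roughly
# a target size, the way a consolidation tool such as S3DistCp would combine
# them. The sizes below are made-up examples.

def plan_batches(sizes_bytes, target_bytes=128 * 1024 * 1024):
    """Group object sizes into batches that each stay near target_bytes."""
    batches, current, current_size = [], [], 0
    for size in sizes_bytes:
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)
    return batches

# 1,000 one-MB objects collapse into 8 combined objects of roughly 128 MB each,
# which is far fewer objects for the crawler to enumerate.
print(len(plan_batches([1024 * 1024] * 1000)))
```

Fewer, larger objects reduce both the listing work and the per-file overhead that the crawler incurs.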

Note: The best way to resolve data scale issues is to reduce the amount of data processed.

Inconsistent Amazon S3 folder structure

Over time, your AWS Glue crawler comes to expect your data in a specific format. Inconsistencies introduced by upstream applications can then trigger an internal service exception error.

There might be inconsistency between a table partition definition on the Data Catalog and a Hive partition structure in Amazon Simple Storage Service (Amazon S3). These differences might cause issues for your crawler. For example, the crawler might expect objects to be partitioned as "s3://awsdoc-example-bucket/yyyy=xxxx/mm=xxx/dd=xx/[files]". However, some of the objects might fall under "s3://awsdoc-example-bucket/yyyy=xxxx/mm=xxx/[files]" instead. When this happens, the crawler fails and the internal service exception error is thrown.
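As an illustration of the mismatch described above, this sketch (a hypothetical helper, not part of AWS Glue) groups S3 keys by their Hive-style partition depth so that objects missing a partition level stand out:

```python
# Illustrative check: detect objects whose Hive-style partition depth differs
# from the rest of the prefix. Mixed depths are the kind of inconsistency
# that can make a crawler fail.

def partition_depths(keys):
    """Map each partition depth to the keys that have that many key=value segments."""
    depths = {}
    for key in keys:
        depth = sum(1 for part in key.split("/") if "=" in part)
        depths.setdefault(depth, []).append(key)
    return depths

keys = [
    "yyyy=2024/mm=01/dd=15/data.json",
    "yyyy=2024/mm=01/dd=16/data.json",
    "yyyy=2024/mm=02/data.json",  # missing the dd= level
]
by_depth = partition_depths(keys)
if len(by_depth) > 1:
    print("Inconsistent partition depths:", sorted(by_depth))
```

Running a check like this against a listing of your prefix before you crawl can surface the objects that need to be moved or repartitioned.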

If you modify a previously crawled data location, then an internal service exception error with an incremental crawl might occur. This happens because one of these conditions is met:

  • An empty Amazon S3 location is updated with data files
  • Files are removed from an Amazon S3 location that's populated with data files

If you change the Amazon S3 prefix structure, then this exception is triggered.

If you see changes in your S3 data store, then it's a best practice to delete the current crawler. After you delete the current crawler, create a new crawler on the same S3 target that uses the Crawl all folders option.

AWS KMS issues

If your data store is configured with AWS Key Management Service (AWS KMS) encryption, then follow these steps:

  • Confirm that your crawler's AWS Identity and Access Management (IAM) role has the necessary permissions to access the KMS key.
  • Confirm that your KMS key policy is properly delegating permissions.
  • Confirm that the KMS key still exists and that it's in the Enabled state. If the KMS key is pending deletion, then the internal service exception is triggered.

For more information, see Working with security configurations on the AWS Glue console and Setting up encryption in AWS Glue.
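When the data store uses a customer managed KMS key, the key policy must allow the crawler's IAM role to use the key. The following is a minimal sketch of the kind of policy statement involved; the account ID and role name are placeholders, and your policy might need a different set of actions:

```json
{
  "Sid": "AllowGlueCrawlerRoleToUseTheKey",
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::111122223333:role/MyGlueCrawlerRole"
  },
  "Action": [
    "kms:Decrypt",
    "kms:DescribeKey",
    "kms:GenerateDataKey"
  ],
  "Resource": "*"
}
```

In a key policy, "Resource": "*" refers to the key that the policy is attached to, not to all keys in the account.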

Data Catalog issues

If your AWS Glue Data Catalog has a large number of columns or nested structures, then the schema size might exceed the 400 KB limit. To address exceptions related to the Data Catalog, follow these steps:

  • Be sure that the column name lengths don't exceed 255 characters and don't contain special characters. For more information about column requirements, see Column.
  • Check for columns that have a length of 0. This might occur if the columns in the source data don't match the data format of the Data Catalog table.
  • In the schema definition of your table, be sure that the Type value of each of your columns doesn't exceed 131,072 bytes. If this limit is surpassed, your crawler might encounter an internal service exception. For more information, see Column structure.
  • Check for malformed data. For example, if the column name doesn't conform to the regular expression pattern "[\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]", then the crawler won't work.
  • Check whether your data contains DECIMAL columns in a precision or scale format. If it does, then confirm that the scale value is less than or equal to the precision value.
  • Your crawler might fail with an "Unable to create table in Catalog" or "Payload size of request exceeded limit" error message. When this happens, monitor the size of the table schema definition. There's no limit on the number of columns that a Data Catalog table can have, but there is a 400 KB limit on the total size of the schema. A large number of columns contributes to a total schema size that might exceed the 400 KB limit. To reduce the size of the table schema, break the schema into multiple tables and remove unnecessary columns. To decrease the size of the metadata, shorten the column names.
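The column checks above can be run before crawling. This sketch is an assumption-laden illustration, not an AWS Glue API: the function name is hypothetical, and the name pattern is simplified to the BMP ranges of the regular expression quoted above:

```python
import re

# Illustrative pre-checks based on the Data Catalog limits listed above.
# The character class is a simplified (BMP-only) form of the documented pattern.
NAME_PATTERN = re.compile(r"^[\u0020-\uD7FF\uE000-\uFFFD\t]+$")

def validate_column(name, type_str, precision=None, scale=None):
    """Return a list of problems with a proposed column definition."""
    errors = []
    if not 0 < len(name) <= 255:
        errors.append("name must be 1-255 characters")
    if not NAME_PATTERN.match(name):
        errors.append("name contains disallowed characters")
    if len(type_str.encode("utf-8")) > 131072:
        errors.append("Type value exceeds 131,072 bytes")
    if precision is not None and scale is not None and scale > precision:
        errors.append("DECIMAL scale must be <= precision")
    return errors

print(validate_column("order_total", "decimal(10,2)", precision=10, scale=2))  # []
print(validate_column("", "string"))  # zero-length name fails two checks
```

Catching a zero-length name or an oversized Type value locally is much faster than waiting for a crawler run to fail.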

Amazon S3 issues

For Amazon S3 issues, follow these steps:

  • Be sure that the Amazon S3 path doesn't contain special characters.
  • Confirm that the IAM role for the crawler has permissions to access the Amazon S3 path. For more information, see Step 2: Create an IAM role for AWS Glue.
  • Remove special ASCII characters such as ^, %, and ~ from your data. Or use custom classifiers to classify your data.
  • Confirm that the S3 objects use the STANDARD storage class. To restore objects to the STANDARD storage class, see Restoring an archived object.
  • Confirm that the include and exclude patterns in the crawler configuration match the S3 bucket paths.
  • If you crawl an encrypted S3 bucket, then confirm that the IAM role for the crawler has the appropriate permissions for the AWS KMS key.
  • If you crawl an encrypted S3 bucket, then be sure that the bucket, KMS key, and AWS Glue job are in the same AWS Region.
  • If you crawl an S3 bucket, then check the request rate. If the request rate is high, then create more prefixes to parallelize reads. For more information, see Best practices design patterns: optimizing Amazon S3 performance.
  • Be sure that the S3 resource path is fewer than 700 characters long.
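The path-related checks in this list can be sketched as a small pre-flight helper. The function name and the exact set of characters flagged are assumptions drawn from the examples above, not an official validation routine:

```python
# Illustrative pre-flight check for an S3 path, based on the constraints
# listed above: no special ASCII characters such as ^, %, and ~, and a
# total length under 700 characters.

SPECIAL_CHARS = set("^%~")

def check_s3_path(path):
    """Return a list of problems with a proposed crawler S3 path."""
    problems = []
    if len(path) >= 700:
        problems.append("path must be fewer than 700 characters")
    found = SPECIAL_CHARS & set(path)
    if found:
        problems.append(f"path contains special characters: {sorted(found)}")
    return problems

print(check_s3_path("s3://awsdoc-example-bucket/data/"))       # []
print(check_s3_path("s3://awsdoc-example-bucket/bad~prefix/"))
```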

Amazon DynamoDB issues

For Amazon DynamoDB, check the following conditions:

  • Confirm that the crawler's IAM role has the dynamodb:DescribeTable and dynamodb:Scan permissions on the table.
  • Confirm that the table name in the crawler configuration matches an existing DynamoDB table.

JDBC issues

For JDBC issues, follow these steps:

  • If you crawl a JDBC data source that's encrypted with AWS KMS, then check the subnet that you use for the connection. The subnet's route table must have a route to the AWS KMS endpoint. This route might go through an AWS KMS supported virtual private cloud (VPC) endpoint or a NAT gateway.
  • Be sure that you use the correct Include path syntax. For more information, see Defining crawlers in AWS Glue.
  • If you're crawling a JDBC data store, then confirm that the SSL connection is configured correctly. If you don't use an SSL connection, then be sure that Require SSL connection isn't selected when you configure the crawler.
  • Confirm that the database name in the AWS Glue connection matches the database name in the crawler's Include path. Also, be sure that you enter the Include path correctly. For more information, see Include and exclude patterns.
  • Be sure that the subnet that you use is in an Availability Zone that AWS Glue supports.
  • Be sure that the subnet that you use has enough available private IP addresses.
  • Confirm that the JDBC data source is supported with the built-in AWS Glue JDBC driver.
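As a rough aid for the Include path checks above, this sketch tests whether a JDBC include path has the expected shape. It is an assumption, not AWS Glue's actual parser: depending on the database engine, the path is typically database/schema/table or database/table, with % as the wildcard segment:

```python
import re

# Hypothetical shape check for a JDBC include path. Segments may contain
# letters, digits, underscores, or the % wildcard.
SEGMENT = re.compile(r"^[A-Za-z0-9_%]+$")

def looks_like_include_path(path):
    """Return True if path has 2 or 3 well-formed slash-separated segments."""
    parts = path.split("/")
    return len(parts) in (2, 3) and all(SEGMENT.match(p) for p in parts)

print(looks_like_include_path("salesdb/public/%"))  # True
print(looks_like_include_path("salesdb"))           # False: missing a table segment
```

A malformed include path is a common reason the crawler finds no tables or fails outright, so it's worth checking the shape before you run the crawler.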

Check this KMS issue when you use a VPC endpoint:

  • If you use KMS, then the AWS Glue crawler must have access to KMS. To grant access, select the Enable Private DNS Name option when you create the KMS endpoint. Then, add the KMS endpoint to the VPC subnet configuration for the AWS Glue connection. For more information, see Connecting to AWS KMS through a VPC endpoint.

Contact AWS Support

Review the sections of this article that apply to your crawler's data source. If you still have issues with the AWS Glue crawler, then contact AWS Support.

Related information

Working with crawlers on the AWS Glue console

Encrypting data written by AWS Glue

AWS OFFICIAL
Updated a month ago

It would be great if instead of a long list of potential reasons why we can get that error, the error itself was made more descriptive.

replied 6 months ago

The current state of error reporting and number of potential issues that might face it made me give up on setting the crawler up. I simply don't know what I'm doing wrong.

replied 5 months ago

Is there an easier-to-use tool AWS recommends? It is frustrating to have to go through a large list of reasons as to why a data processing step would fail. To start off, the advice to reduce the amount of data tells me this tool is outdated; the very reason someone runs a crawler is because the volume of data is high. The most basic expectation we would have of a crawler is that it can handle any volume of data.

replied 5 months ago

Thank you for your comments. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied 5 months ago