Why does my AWS Glue crawler fail with an internal service exception?

My AWS Glue crawler fails with the error "ERROR : Internal Service Exception".

Resolution

Crawler internal service exceptions can be caused by transient issues. Before you start to troubleshoot, run the crawler again. If you still get an internal service exception, then check for the following common issues.

Data issues

If your AWS Glue crawler is configured to process a large amount of data, then the crawler might fail with an internal service exception. Review the following data issues and ways to remediate them:

  • If you have a large number of small files, then the crawler might fail with an internal service exception. To avoid this issue, use the S3DistCp tool to combine smaller files. You incur additional Amazon EMR charges when you use S3DistCp. Or, you can set exclude patterns and crawl the files iteratively. Finally, consider turning on sampling to avoid scanning all of the files within a prefix (see the sketch after this list).
  • If your crawler is nearing the 24 hour timeout value, then split the workflow to prevent memory issues. For more information, see Why is the AWS Glue crawler running for a long time?

Note: The best way to resolve data scale issues is to reduce the amount of data processed.
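
You can set both exclude patterns and sampling directly on the crawler's S3 target. The following is a minimal boto3 sketch; the crawler name, path, exclude patterns, and sample size are placeholders to adjust for your environment:

    import boto3

    glue = boto3.client("glue")

    # Hypothetical crawler name and path; adjust to your environment.
    glue.update_crawler(
        Name="my-crawler",
        Targets={
            "S3Targets": [
                {
                    "Path": "s3://awsdoc-example-bucket/data/",
                    # Exclude patterns reduce the number of objects scanned.
                    "Exclusions": ["**/_temporary/**", "**.tmp"],
                    # Sampling: crawl at most 50 files per leaf folder
                    # instead of every file under the prefix.
                    "SampleSize": 50,
                }
            ]
        },
    )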

Inconsistent Amazon Simple Storage Service (Amazon S3) folder structure

Over time, your AWS Glue crawler comes to expect your data in a specific format. However, inconsistencies introduced by upstream applications can trigger an internal service exception error.

There might be an inconsistency between a table partition definition in the Data Catalog and a Hive partition structure in Amazon S3. Differences like this can cause issues for your crawler. For example, the crawler might expect objects to be partitioned as "s3://awsdoc-example-bucket/yyyy=xxxx/mm=xxx/dd=xx/[files]". But suppose that some of the objects fall under "s3://awsdoc-example-bucket/yyyy=xxxx/mm=xxx/[files]" instead. When this happens, the crawler fails and the internal service exception error is thrown.
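
To find objects that don't match the expected partition layout before you run the crawler, you can list the prefix and compare each key against the layout that you expect. The following is a minimal sketch, assuming the example bucket and the yyyy=/mm=/dd= layout described above:

    import boto3

    s3 = boto3.client("s3")
    bucket = "awsdoc-example-bucket"  # placeholder bucket name
    expected_depth = 3                # yyyy=, mm=, dd=

    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):
            # Count the key=value partition folders above each file.
            partitions = [p for p in obj["Key"].split("/")[:-1] if "=" in p]
            if len(partitions) != expected_depth:
                print(f"Unexpected partition depth: s3://{bucket}/{obj['Key']}")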

If you modify a previously crawled data location, then an internal service exception error can occur during an incremental crawl. This happens when one of these conditions is met:

  • An Amazon S3 location that's known to be empty is updated with data files
  • Files are removed from an Amazon S3 location that's known to be populated with data files

If you make changes in the Amazon S3 prefix structure, then this exception is triggered.

If you think that there have been changes in your S3 data store, then it's a best practice to delete the current crawler and create a new crawler on the same S3 target with the Crawl all folders option.
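
In the API, the Crawl all folders console option corresponds to the CRAWL_EVERYTHING recrawl behavior. The following is a minimal boto3 sketch of the delete-and-recreate step, with placeholder names for the crawler, role, database, and path:

    import boto3

    glue = boto3.client("glue")

    glue.delete_crawler(Name="my-crawler")  # placeholder crawler name
    glue.create_crawler(
        Name="my-crawler",
        Role="AWSGlueServiceRole-example",   # placeholder IAM role
        DatabaseName="example_db",           # placeholder database
        Targets={"S3Targets": [{"Path": "s3://awsdoc-example-bucket/data/"}]},
        # "Crawl all folders" in the console maps to CRAWL_EVERYTHING.
        RecrawlPolicy={"RecrawlBehavior": "CRAWL_EVERYTHING"},
    )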

AWS Key Management Service (AWS KMS) issues

If your data store is configured with AWS KMS encryption, then check the following:

  • Confirm that your crawler's AWS Identity and Access Management (IAM) role has the necessary permissions to access the AWS KMS key.
  • Confirm that your AWS KMS key policy is properly delegating permissions.
  • Confirm that the AWS KMS key still exists and is in the Available status. If the AWS KMS key is pending deletion, then the internal service exception is triggered (see the sketch after this list).
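
You can verify the key's state from the API. The following is a minimal boto3 sketch, using a placeholder key ARN; any state other than Enabled (for example, PendingDeletion) is a problem for the crawler:

    import boto3

    kms = boto3.client("kms")

    # Placeholder key ARN; use the key from your Glue security configuration.
    resp = kms.describe_key(
        KeyId="arn:aws:kms:us-east-1:111122223333:key/example-key-id"
    )
    state = resp["KeyMetadata"]["KeyState"]
    if state != "Enabled":
        print(f"Key is not usable by the crawler: KeyState is {state}")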

For more information, see Working with security configurations on the AWS Glue console and Setting up encryption in AWS Glue.

AWS Glue Data Catalog issues

If your Data Catalog has a large number of columns or nested structures, then the schema size might exceed the 400 KB limit. To address exceptions related to the Data Catalog, check the following:

  • Be sure that the column name lengths don't exceed 255 characters and don't contain special characters. For more information about column requirements, see Column.
  • Check for columns that have a length of 0. This can occur if the columns in the source data don't match the data format of the Data Catalog table.
  • In the schema definition of your table, be sure that the Type value of each of your columns doesn't exceed 131,072 bytes. If this limit is surpassed, your crawler might face an internal service exception. For more information, see Column structure.
  • Check for malformed data. For example, if the column name doesn't conform to the regular expression pattern "[\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]", then the crawler doesn't work.
  • If your data contains DECIMAL columns in (precision, scale) format, then confirm that the scale value is less than or equal to the precision value.
  • Your crawler might fail with an "Unable to create table in Catalog" or "Payload size of request exceeded limit" error message. When this happens, monitor the size of the table schema definition. There's no limit on the number of columns that a Data Catalog table can have, but there is a 400 KB limit on the total size of the schema, and a large number of columns can push the schema past that limit. Potential workarounds include breaking the schema into multiple tables, removing unnecessary columns, and shortening column names to reduce the size of the metadata. The sketch after this list shows how to audit a table against these limits.
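
To audit an existing table, you can pull its schema from the Data Catalog and check each column against the limits above. The following is a minimal sketch with placeholder database and table names; for simplicity, the regular expression covers only the non-surrogate portion of the documented column-name pattern:

    import re
    import boto3

    glue = boto3.client("glue")
    table = glue.get_table(DatabaseName="example_db", Name="example_table")
    columns = table["Table"]["StorageDescriptor"]["Columns"]

    # Non-surrogate subset of the documented column-name pattern.
    name_pattern = re.compile(r"^[\u0020-\uD7FF\uE000-\uFFFD\t]+$")

    for col in columns:
        name, col_type = col["Name"], col.get("Type", "")
        if not name or len(name) > 255 or not name_pattern.match(name):
            print(f"Problem column name: {name!r}")
        if len(col_type.encode("utf-8")) > 131072:
            print(f"Type definition too large for column: {name}")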

Amazon S3 issues

Amazon DynamoDB issues

JDBC issues

  • If you're crawling a JDBC data source that's encrypted with AWS KMS, then check the subnet that you're using for the connection. The subnet's route table must have a route to the AWS KMS endpoint. This route can go through an AWS KMS supported virtual private cloud (VPC) endpoint or a NAT gateway.
  • Be sure that you're using the correct Include path syntax. For more information, see Defining crawlers.
  • If you're crawling a JDBC data store, then confirm that the SSL connection is configured correctly. If you're not using an SSL connection, then be sure that Require SSL connection isn't selected when you configure the crawler.
  • Confirm that the database name in the AWS Glue connection matches the database name in the crawler's Include path. Also, be sure that you enter the Include path correctly. For more information, see Include and exclude patterns.
  • Be sure that the subnet that you're using is in an Availability Zone that's supported by AWS Glue.
  • Be sure that the subnet that you're using has enough available private IP addresses (see the sketch after this list).
  • Confirm that the JDBC data source is supported with the built-in AWS Glue JDBC driver.
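
You can check both the Availability Zone and the free IP count for the connection's subnet from the Amazon EC2 API. The following is a minimal sketch with a placeholder subnet ID:

    import boto3

    ec2 = boto3.client("ec2")

    # Placeholder subnet ID from the AWS Glue connection.
    resp = ec2.describe_subnets(SubnetIds=["subnet-0123456789abcdef0"])
    subnet = resp["Subnets"][0]
    print("Availability Zone:", subnet["AvailabilityZone"])
    print("Available private IPs:", subnet["AvailableIpAddressCount"])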

AWS KMS issues when using a VPC endpoint

  • If you're using AWS KMS, then the AWS Glue crawler must have access to AWS KMS. To grant access, select the Enable Private DNS Name option when you create the AWS KMS endpoint. Then, add the AWS KMS endpoint to the VPC subnet configuration for the AWS Glue connection. For more information, see Connecting to AWS KMS through a VPC endpoint.
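
The following is a minimal boto3 sketch of creating such an endpoint; the VPC, subnet, and security group IDs are placeholders, and the service name must match your Region:

    import boto3

    ec2 = boto3.client("ec2")

    ec2.create_vpc_endpoint(
        VpcEndpointType="Interface",
        VpcId="vpc-0123456789abcdef0",
        ServiceName="com.amazonaws.us-east-1.kms",  # match your Region
        SubnetIds=["subnet-0123456789abcdef0"],
        SecurityGroupIds=["sg-0123456789abcdef0"],
        # Corresponds to the "Enable Private DNS Name" console option.
        PrivateDnsEnabled=True,
    )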

Related information

Working with crawlers on the AWS Glue console

Encrypting data written by crawlers, jobs, and development endpoints

AWS OFFICIAL, updated a year ago
4 Comments

It would be great if instead of a long list of potential reasons why we can get that error, the error itself was made more descriptive.

replied 3 months ago

The current state of error reporting and the number of potential issues that might cause it made me give up on setting the crawler up. I simply don't know what I'm doing wrong.

replied 3 months ago

Is there an easier-to-use tool AWS recommends? It is frustrating to have to go through a large list of reasons as to why a data processing step would fail. To start off, the advice to reduce the amount of data tells me this tool is outdated; the very reason someone runs a crawler is because the volume of data is high. The most basic expectation we would have of a crawler is that it can handle any volume of data.

tw
replied 3 months ago

Thank you for your comments. We'll review and update the Knowledge Center article as needed.

AWS MODERATOR
replied 2 months ago