Lake Formation (Database Snapshot) blueprint creates a workflow that does not work. "Internal service error: Invalid Input Provided"


I have followed the guide in this blog post

I am using an RDS mysql instance in the eu-west-1 region.

The blueprint, and subsequently the workflows, jobs and crawlers, are created successfully.

The workflow runs the crawler jobs successfully and the tables in Glue are populated.

But then the workflow fails when reaching the ETL jobs with the error message "failed to execute with exception Internal service error: Invalid Input Provided".

I have looked at the job definition itself and the code behind the script and the input seems to match.


I found out where the error is coming from. The ETL job created by Lake Formation is using "Glue 1.0" and that simply doesn't work.

An empty Spark job runs fine on Glue 3.0, while the same job on Glue 1.0 fails with the error "Internal service error: Invalid Input Provided".
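
For anyone hitting the same thing, here is a minimal sketch of switching a blueprint-generated job to a newer Glue version with boto3. It is an assumption that this is an acceptable workaround for your setup, not an official fix. The helper names `build_job_update` and `upgrade_job`, the default worker settings, and the list of copied keys are hypothetical choices for illustration. `update_job` replaces the whole job definition, so the existing settings are copied over first. `MaxCapacity` is deliberately not copied because, as far as I know, Glue 2.0 and later expect a worker type and a number of workers instead.

```python
from typing import Any, Dict

# Keys of a get_job response that JobUpdate also accepts (not exhaustive;
# read-only fields such as Name or CreatedOn must not be sent back).
_COPY_KEYS = (
    "Description", "LogUri", "Role", "ExecutionProperty", "Command",
    "DefaultArguments", "NonOverridableArguments", "Connections",
    "MaxRetries", "Timeout", "SecurityConfiguration", "NotificationProperty",
)


def build_job_update(job: Dict[str, Any],
                     glue_version: str = "3.0",
                     worker_type: str = "G.1X",       # assumed default
                     number_of_workers: int = 2) -> Dict[str, Any]:
    """Build a JobUpdate dict from an existing job definition."""
    update = {key: job[key] for key in _COPY_KEYS if key in job}
    update["GlueVersion"] = glue_version
    # Keep the job's own worker settings if present, else use the assumed defaults.
    update["WorkerType"] = job.get("WorkerType", worker_type)
    update["NumberOfWorkers"] = job.get("NumberOfWorkers", number_of_workers)
    return update


def upgrade_job(job_name: str,
                region: str = "eu-west-1",
                glue_version: str = "3.0") -> None:
    """Fetch the job, then overwrite its definition with the new Glue version."""
    import boto3  # imported here so build_job_update can be used without boto3

    glue = boto3.client("glue", region_name=region)
    job = glue.get_job(JobName=job_name)["Job"]
    # Newer Glue versions may require Python 3; check job["Command"] first.
    glue.update_job(JobName=job_name,
                    JobUpdate=build_job_update(job, glue_version))
```

Calling `upgrade_job("my-blueprint-job")` (a placeholder job name) would apply the change. Note that Lake Formation may recreate the job with its original settings if the blueprint is rerun.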

1 Answer
  • Be sure that the column name lengths don't exceed 255 characters and don't contain special characters. For more information about column requirements, see Column.
  • Check for malformed data. For example, if the column name doesn't conform to the regular expression pattern "[\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]", then the crawler doesn't work.
  • Check for columns that have a length of 0. This happens when columns in the data don't match the data format of the table.
  • If your data contains DECIMAL columns with the "(precision, scale)" format, then be sure that the scale value is less than or equal to the precision value.
  • In the schema definition of your table, be sure that the Type of each of your columns isn't longer than 131,072 bytes. For more information, see Column structure.
  • If your crawler fails with either of the following errors, then be sure that the total schema definition of your table is not larger than 1 MB:
      • "Unable to create table in Catalog"
      • "Payload size of request exceeded limit"
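
Some of the checks above can be sketched as a small script. This is only an illustration: `check_columns` is a hypothetical helper, it assumes columns shaped like the `Name`/`Type` dictionaries found in a Glue table's `StorageDescriptor.Columns`, and it reads "length of 0" as an empty column name. It covers the name length, type size and DECIMAL checks, not the character-pattern or total schema size checks.

```python
import re
from typing import Dict, List

MAX_NAME_LEN = 255          # column name limit from the list above
MAX_TYPE_BYTES = 131072     # column Type limit from the list above

# Matches decimal(precision, scale), also when nested in a complex type.
_DECIMAL = re.compile(r"decimal\(\s*(\d+)\s*,\s*(\d+)\s*\)", re.IGNORECASE)


def check_columns(columns: List[Dict[str, str]]) -> List[str]:
    """Return a human-readable list of problems found in the columns."""
    problems = []
    for col in columns:
        name = col.get("Name", "")
        col_type = col.get("Type", "")
        label = name[:30] or "<empty>"

        if len(name) == 0:
            problems.append("column with an empty name")
        elif len(name) > MAX_NAME_LEN:
            problems.append(f"{label}: name longer than {MAX_NAME_LEN} characters")

        if len(col_type.encode("utf-8")) > MAX_TYPE_BYTES:
            problems.append(f"{label}: Type longer than {MAX_TYPE_BYTES} bytes")

        for precision, scale in _DECIMAL.findall(col_type):
            if int(scale) > int(precision):
                problems.append(
                    f"{label}: decimal scale {scale} exceeds precision {precision}"
                )
    return problems
```

The column list could come from `glue.get_table(DatabaseName=..., Name=...)["Table"]["StorageDescriptor"]["Columns"]` in boto3, with your own database and table names filled in.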
answered 3 months ago
