I have a 30 GB table, and I am running an ETL with an AWS Glue job that copies the table to an S3 bucket. I am trying to use job bookmarks with a combination of a couple of columns as the bookmark key. Some of those columns have rows with null values, and the job fails with: An error occurred while calling o97.getDynamicFrame. Incorrect DATETIME value: 'null'. Is there any way to give those columns a default value?

The other alternative is to move the entire table without a bookmark, which I don't think is efficient.

  • Where is that origin table stored?

For this use case, the bookmark keys are used on the data-consuming side and, as documented at [1], create_dynamic_frame.from_catalog just takes the column names for "jobBookmarkKeys". There is no option to give a column a default value when its value is null.

However, there is a workaround.

If your original table is stored in an RDBMS, you can add a computed column that has the same value as the original column and a default value where the original is null.
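As a rough illustration of the computed-column idea, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for the real RDBMS (generated columns need SQLite 3.31 or newer). The table name `orders`, the columns `updated_at` and `updated_at_bk`, and the epoch default are all made up for the example; the exact DDL differs between database engines, so check your engine's syntax for generated/computed columns.

```python
import sqlite3

# In-memory SQLite stands in for the source RDBMS (assumption for this sketch).
conn = sqlite3.connect(":memory:")

# updated_at_bk is a hypothetical computed column: it mirrors updated_at,
# but falls back to a fixed default when updated_at is NULL.
conn.execute(
    """
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        updated_at TEXT,
        updated_at_bk TEXT GENERATED ALWAYS AS
            (COALESCE(updated_at, '1970-01-01 00:00:00')) STORED
    )
    """
)

# One row with a real timestamp, one row with a NULL timestamp.
conn.executemany(
    "INSERT INTO orders (id, updated_at) VALUES (?, ?)",
    [(1, "2023-05-01 10:00:00"), (2, None)],
)

rows = conn.execute(
    "SELECT id, updated_at_bk FROM orders ORDER BY id"
).fetchall()
for row in rows:
    print(row)
```

The computed column never contains NULL, so it can be read by the job without hitting the null DATETIME value.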

Then, in your Glue job, you can use the computed column as part of the bookmark keys.
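A minimal sketch of how the job side might look, assuming a hypothetical computed column named `updated_at_bk` and placeholder catalog names. The helper functions are my own; only `create_dynamic_frame.from_catalog`, `transformation_ctx`, and the `jobBookmarkKeys` / `jobBookmarkKeysSortOrder` options come from Glue. The `glue_context` argument is expected to be the `GlueContext` your job already creates.

```python
def bookmark_options(keys, sort_order="asc"):
    """Build the additional_options dict that carries the bookmark keys."""
    return {
        "jobBookmarkKeys": list(keys),
        "jobBookmarkKeysSortOrder": sort_order,
    }


def read_incrementally(glue_context, database, table_name, keys):
    """Read a catalog table using job bookmarks on the given columns.

    glue_context is the job's GlueContext; database and table_name are
    placeholders for your own Data Catalog entries.
    """
    return glue_context.create_dynamic_frame.from_catalog(
        database=database,
        table_name=table_name,
        # Bookmark state is tracked per transformation_ctx, so it must be set.
        transformation_ctx=f"read_{table_name}",
        additional_options=bookmark_options(keys),
    )


# Use the computed (never-null) column instead of the nullable original.
options = bookmark_options(["id", "updated_at_bk"])
print(options)
```

Inside the job this would be called as, for example, `read_incrementally(glueContext, "my_db", "orders", ["id", "updated_at_bk"])`. Bookmarks also need to be enabled on the job, and the job should call `job.commit()` at the end so the bookmark state is saved.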

Hope it helps.


Reference: [1] -

