Lake Formation blueprint for database ingestion fails with SPARK-31404


Hello,

I'm trying to run a Lake Formation blueprint for database ingestion (Aurora PostgreSQL, working Glue connection, snapshot mode), but I get the following error:

An error occurred while calling o471.pyWriteDynamicFrame. You may get a different result due to the upgrading of Spark 3.0: writing dates before 1582-10-15 or timestamps before 1900-01-01T00:00:00Z into Parquet INT96 files can be dangerous, as the files may be read by Spark 2.x or legacy versions of Hive later, which uses a legacy hybrid calendar that is different from Spark 3.0+'s Proleptic Gregorian calendar. See more details in SPARK-31404. You can set spark.sql.legacy.parquet.int96RebaseModeInWrite to 'LEGACY' to rebase the datetime values w.r.t. the calendar difference during writing, to get maximum interoperability. Or set spark.sql.legacy.parquet.int96RebaseModeInWrite to 'CORRECTED' to write the datetime values as it is, if you are 100% sure that the written files will only be read by Spark 3.0+ or other systems that use Proleptic Gregorian calendar.

I've found that adding --conf spark.sql.legacy.parquet.int96RebaseModeInWrite=CORRECTED to the job would solve this. However, I cannot change the Glue ETL job: saving it fails with putObject: AccessDenied: Access Denied.
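For reference, here is a minimal sketch of how that setting could be expressed as a Glue job default argument. It only builds the arguments dictionary and does not call AWS. The helper name with_int96_rebase_mode is hypothetical, and chaining several Spark settings inside a single --conf value is a commonly used workaround rather than something confirmed in this thread, so treat that part as an assumption.

```python
# Spark setting named in the error message above.
REBASE_KEY = "spark.sql.legacy.parquet.int96RebaseModeInWrite"


def with_int96_rebase_mode(default_arguments, mode="CORRECTED"):
    """Return a copy of a Glue job's default arguments with the rebase mode set.

    Hypothetical helper: `default_arguments` is assumed to be the job's
    existing {"--key": "value"} mapping of default arguments.
    """
    if mode not in ("CORRECTED", "LEGACY"):
        raise ValueError("mode must be 'CORRECTED' or 'LEGACY'")

    setting = f"{REBASE_KEY}={mode}"
    args = dict(default_arguments)  # do not mutate the caller's dict
    existing = args.get("--conf")

    if not existing:
        args["--conf"] = setting
    elif REBASE_KEY not in existing:
        # Assumption: extra settings are chained inside the one --conf value.
        args["--conf"] = f"{existing} --conf {setting}"
    # If the key is already present, the existing value is left untouched.
    return args


print(with_int96_rebase_mode({"--job-language": "python"}))
```

Applying the resulting arguments still requires permission to edit the job, which is where the AccessDenied error comes in.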

Asked a year ago, 254 views
1 Answer

Hello,

I understand that you are trying to update a Glue job with the parameter --conf spark.sql.legacy.parquet.int96RebaseModeInWrite=CORRECTED, but you are getting the following error:

"putObject: AccessDenied: Access Denied"

This error generally occurs when the IAM user or role used to edit the job does not have permission to upload to the S3 bucket that contains the job script. The user or role editing the job should have the permissions described in the following document: https://docs.aws.amazon.com/glue/latest/dg/attach-policy-iam-user.html

For this error specifically, the IAM policy needs to allow the s3:PutObject permission on "arn:aws:s3:::<scriptBucketName>/*".
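As an illustration, the statement described above could look roughly like the policy document generated below. This is a sketch, not the exact policy for your account: the function name script_bucket_put_policy and the bucket name my-glue-scripts-bucket are placeholders, and the policy grants only s3:PutObject, so the other permissions from the linked document are not included.

```python
import json


def script_bucket_put_policy(script_bucket_name):
    """Build an IAM policy document allowing uploads to the script bucket.

    Hypothetical helper: `script_bucket_name` stands in for the bucket that
    holds the Glue job script (<scriptBucketName> above).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                # Object-level ARN: every key inside the script bucket.
                "Resource": [f"arn:aws:s3:::{script_bucket_name}/*"],
            }
        ],
    }


# Placeholder bucket name, used for illustration only.
policy = script_bucket_put_policy("my-glue-scripts-bucket")
print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the IAM user or role that edits the job, after replacing the placeholder bucket name with the real one.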

AWS Support Engineer, answered a year ago
