Lake Formation blueprint for database ingestion fails with SPARK-31404


Hello,

I'm trying to run a Lake Formation blueprint for database ingestion (Aurora PostgreSQL, Glue Connection working, snapshot mode), but I get the following error:

An error occurred while calling o471.pyWriteDynamicFrame. You may get a different result due to the upgrading of Spark 3.0: writing dates before 1582-10-15 or timestamps before 1900-01-01T00:00:00Z into Parquet INT96 files can be dangerous, as the files may be read by Spark 2.x or legacy versions of Hive later, which uses a legacy hybrid calendar that is different from Spark 3.0+'s Proleptic Gregorian calendar. See more details in SPARK-31404. You can set spark.sql.legacy.parquet.int96RebaseModeInWrite to 'LEGACY' to rebase the datetime values w.r.t. the calendar difference during writing, to get maximum interoperability. Or set spark.sql.legacy.parquet.int96RebaseModeInWrite to 'CORRECTED' to write the datetime values as it is, if you are 100% sure that the written files will only be read by Spark 3.0+ or other systems that use Proleptic Gregorian calendar.

I've found that adding --conf spark.sql.legacy.parquet.int96RebaseModeInWrite=CORRECTED would solve it. However, I am not able to change the Glue ETL job (I get putObject: AccessDenied: Access Denied).
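For reference, this is roughly how that setting would look as a job argument. It is only a sketch: it assumes the setting is passed as a Glue job parameter whose key is --conf, and the helper name with_rebase_mode and the sample argument are made up for illustration.

```python
SETTING_NAME = "spark.sql.legacy.parquet.int96RebaseModeInWrite"


def with_rebase_mode(default_arguments, mode="CORRECTED"):
    """Return a copy of the job arguments with the INT96 rebase mode added.

    Hypothetical helper, not part of any AWS SDK. It assumes the Spark
    setting is passed through a job parameter whose key is "--conf".
    """
    if mode not in ("CORRECTED", "LEGACY"):
        raise ValueError(f"unsupported rebase mode: {mode}")

    setting = f"{SETTING_NAME}={mode}"
    merged = dict(default_arguments)  # leave the caller's dict untouched

    existing = merged.get("--conf")
    if existing is not None and existing != setting:
        # Refuse to silently overwrite a Spark setting that is already there.
        raise ValueError(f"job already defines --conf: {existing}")

    merged["--conf"] = setting
    return merged


# Sample input; the existing argument is illustrative only.
arguments = with_rebase_mode({"--job-language": "python"})
print(arguments["--conf"])
```

Saving the job with these arguments is the step that currently fails for me with the AccessDenied error.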

Asked 1 year ago, 260 views
1 Answer

Hello,

I understand that you are trying to update a Glue job with the parameter --conf spark.sql.legacy.parquet.int96RebaseModeInWrite=CORRECTED, but you are getting the following error:

"putObject: AccessDenied: Access Denied"

This error is generally seen when the IAM user/role used to edit the job does not have permission to upload to the S3 bucket containing the script. The user or role that edits the job should have the permissions described in the following document: https://docs.aws.amazon.com/glue/latest/dg/attach-policy-iam-user.html

For the above error specifically, the IAM policy needs to allow the s3:PutObject permission on "arn:aws:s3:::<scriptBucketName>/*".
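As an illustration, the statement described above could be generated like this. It is a minimal sketch, not a complete policy for editing Glue jobs: the function name and the bucket name "my-glue-scripts-bucket" are placeholders, and the real script bucket name has to be substituted.

```python
import json


def script_bucket_put_policy(script_bucket_name):
    """Build an IAM policy document allowing s3:PutObject on the script bucket.

    Hypothetical helper for illustration; it covers only the permission
    named in this answer, not everything needed to manage Glue jobs.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:PutObject",
                # Every object key inside the bucket that holds the job script.
                "Resource": f"arn:aws:s3:::{script_bucket_name}/*",
            }
        ],
    }


# Placeholder bucket name; replace it with the bucket that stores the script.
policy = script_bucket_put_policy("my-glue-scripts-bucket")
print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the IAM user or role that edits the job, for example as an inline policy.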

AWS
Support Engineer
Answered 1 year ago
