Lake Formation blueprint for database ingestion fails with SPARK-31404


Hello,

I'm trying to run a Lake Formation blueprint for database ingestion (Aurora PostgreSQL, working Glue connection, snapshot mode), but I get the following error:

An error occurred while calling o471.pyWriteDynamicFrame. You may get a different result due to the upgrading of Spark 3.0: writing dates before 1582-10-15 or timestamps before 1900-01-01T00:00:00Z into Parquet INT96 files can be dangerous, as the files may be read by Spark 2.x or legacy versions of Hive later, which uses a legacy hybrid calendar that is different from Spark 3.0+'s Proleptic Gregorian calendar. See more details in SPARK-31404. You can set spark.sql.legacy.parquet.int96RebaseModeInWrite to 'LEGACY' to rebase the datetime values w.r.t. the calendar difference during writing, to get maximum interoperability. Or set spark.sql.legacy.parquet.int96RebaseModeInWrite to 'CORRECTED' to write the datetime values as it is, if you are 100% sure that the written files will only be read by Spark 3.0+ or other systems that use Proleptic Gregorian calendar.

I've found that adding --conf spark.sql.legacy.parquet.int96RebaseModeInWrite=CORRECTED to the job should solve it. However, I cannot change the Glue ETL job that the blueprint created: saving the job fails with "putObject: AccessDenied: Access Denied".
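For reference, a minimal sketch of how that setting could be expressed as a Glue job parameter. It assumes the commonly used approach of a job parameter whose key is --conf and whose value is the Spark setting. The helper name rebase_mode_arguments is hypothetical and only builds the mapping; it does not call AWS.

```python
# Hypothetical helper (not part of any AWS SDK): builds the job-parameter
# mapping that would be supplied as the Glue job's default arguments.
SETTING = "spark.sql.legacy.parquet.int96RebaseModeInWrite"


def rebase_mode_arguments(mode="CORRECTED"):
    # The error message names two modes that avoid the exception.
    if mode not in ("CORRECTED", "LEGACY"):
        raise ValueError(f"unsupported rebase mode: {mode}")
    # Assumption: the job accepts a parameter with key "--conf" whose value
    # is the Spark setting in key=value form.
    return {"--conf": f"{SETTING}={mode}"}


if __name__ == "__main__":
    print(rebase_mode_arguments())
```

Saving these parameters on the job is the step that currently fails with the AccessDenied error.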

Asked 1 year ago. Viewed 260 times.
1 Answer

Hello,

I understand that you are trying to update a Glue job with the parameter --conf spark.sql.legacy.parquet.int96RebaseModeInWrite=CORRECTED, but you are getting the following error:

"putObject: AccessDenied: Access Denied"

This error generally occurs when the IAM user or role used to edit the job does not have permission to upload to the S3 bucket that contains the job script. The user who edits the job should have the permissions described in the following document: https://docs.aws.amazon.com/glue/latest/dg/attach-policy-iam-user.html

For this error specifically, the IAM policy needs to allow the s3:PutObject permission on "arn:aws:s3:::<scriptBucketName>/*".
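As an illustration, a policy statement along those lines could be generated as in the sketch below. The function name script_bucket_put_policy and the bucket name my-glue-scripts-bucket are placeholders, not values from your account; substitute the bucket that actually holds the job script.

```python
import json


def script_bucket_put_policy(bucket_name):
    # Builds an IAM policy document that allows uploading objects
    # to every key in the given bucket.
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


if __name__ == "__main__":
    # "my-glue-scripts-bucket" is a placeholder bucket name.
    print(json.dumps(script_bucket_put_policy("my-glue-scripts-bucket"), indent=2))
```

The printed JSON can be attached as an inline or managed policy to the IAM user or role that edits the job.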

AWS
Support Engineer
Answered 1 year ago
