Lake Formation blueprint for database ingestion fails with SPARK-31404

Hello,

I'm trying to run a Lake Formation blueprint for database ingestion (Aurora PostgreSQL, working Glue connection, snapshot mode), but I got the following error:

An error occurred while calling o471.pyWriteDynamicFrame. You may get a different result due to the upgrading of Spark 3.0: writing dates before 1582-10-15 or timestamps before 1900-01-01T00:00:00Z into Parquet INT96 files can be dangerous, as the files may be read by Spark 2.x or legacy versions of Hive later, which uses a legacy hybrid calendar that is different from Spark 3.0+'s Proleptic Gregorian calendar. See more details in SPARK-31404. You can set spark.sql.legacy.parquet.int96RebaseModeInWrite to 'LEGACY' to rebase the datetime values w.r.t. the calendar difference during writing, to get maximum interoperability. Or set spark.sql.legacy.parquet.int96RebaseModeInWrite to 'CORRECTED' to write the datetime values as it is, if you are 100% sure that the written files will only be read by Spark 3.0+ or other systems that use Proleptic Gregorian calendar.

I've found that adding --conf spark.sql.legacy.parquet.int96RebaseModeInWrite=CORRECTED should solve this. However, I am unable to change the Glue ETL job (I get putObject: AccessDenied: Access Denied).

Asked a year ago, 260 views
1 Answer
Hello,

I understand that you are trying to update a Glue job with the parameter --conf spark.sql.legacy.parquet.int96RebaseModeInWrite=CORRECTED, but you are getting the following error:

"putObject: AccessDenied: Access Denied"

This error generally occurs when the IAM user or role used to edit the job does not have permission to upload to the S3 bucket that contains the script. The user editing the job should have the permissions described in the following document: https://docs.aws.amazon.com/glue/latest/dg/attach-policy-iam-user.html

For this error specifically, you need to allow the s3:PutObject permission on "arn:aws:s3:::<scriptBucketName>/*" in the IAM policy.
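
As a rough illustration of that permission, here is a minimal Python sketch that builds such a policy document and prints it as JSON. The helper name `build_script_bucket_policy`, the statement ID, and the bucket name `my-glue-script-bucket` are placeholders I made up; substitute the bucket that actually holds your Glue script. It only generates the JSON and does not call AWS, so you would still need to attach the result to the IAM user or role yourself.

```python
import json


def build_script_bucket_policy(script_bucket_name: str) -> dict:
    """Build an IAM policy document allowing s3:PutObject on the script bucket.

    The function name and the Sid are illustrative placeholders.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowGlueScriptUpload",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                # Object-level ARN: every key inside the script bucket.
                "Resource": [f"arn:aws:s3:::{script_bucket_name}/*"],
            }
        ],
    }


if __name__ == "__main__":
    # "my-glue-script-bucket" is a hypothetical bucket name.
    policy = build_script_bucket_policy("my-glue-script-bucket")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console as an inline policy for the user or role that edits the job.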

AWS
Support Engineer
Answered a year ago
