Facing AmazonS3Exception "Please reduce your request rate" while writing data to an S3 bucket using a Glue job


I am currently using a Glue job to read data from one Amazon S3 source, perform some transformations, and write the transformed data to another S3 bucket in Parquet format. While writing data to the destination bucket, I am partitioning on one of the fields. I am using the code below to write data to the destination:

partition_keys = ["partition_date"]
glueContext.write_dynamic_frame.from_catalog(
        frame=dynamic_frame,
        database=glue_catalog_db,
        table_name=glue_catalog_dest_table,
        transformation_ctx="write_dynamic_frame",
        additional_options={"partitionKeys": partition_keys}
    )

Right now, I am observing the following error message in the logs: WARN TaskSetManager: Lost task 342.0 in stage 0.0 (TID 343) (172.35.6.249 executor 10): org.apache.spark.SparkException: Task failed while writing rows. Caused by: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Please reduce your request rate. (Service: Amazon S3; Status Code: 503; Error Code: SlowDown;

I just wanted to know whether I can ignore these warnings for now. That is, will there be any data loss due to this issue, or are these errors/warnings automatically retried? If data loss is possible, what would be the best way to avoid this issue?

Note: The number of files to be written to the destination S3 bucket is in the billions.

Dhiraj
Asked 7 months ago · Viewed 480 times
1 Answer

Hi,

You cannot ignore such write errors.

This Knowledge Center article provides the right guidance to solve it: https://repost.aws/knowledge-center/emr-s3-503-slow-down
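As a starting point, one common mitigation for this class of error is to reduce the number of concurrent requests hitting the same S3 prefix, for example by repartitioning on the partition column before writing so that each partition value is written by fewer tasks and produces fewer, larger files. Below is a minimal sketch of that idea, reusing the names from your question (dynamic_frame, glueContext, glue_catalog_db, glue_catalog_dest_table, partition_date); treat it as an illustration rather than a drop-in fix, and tune it to your data volume.

from awsglue.dynamicframe import DynamicFrame

# Convert to a Spark DataFrame and repartition on the partition column so that
# rows for each partition_date are concentrated in fewer tasks, which reduces
# the number of concurrent writes (and output files) per S3 prefix.
df = dynamic_frame.toDF().repartition("partition_date")
repartitioned_frame = DynamicFrame.fromDF(df, glueContext, "repartitioned_frame")

# Write back through the Glue catalog, partitioned on partition_date as before.
glueContext.write_dynamic_frame.from_catalog(
    frame=repartitioned_frame,
    database=glue_catalog_db,
    table_name=glue_catalog_dest_table,
    transformation_ctx="write_dynamic_frame",
    additional_options={"partitionKeys": ["partition_date"]},
)

Given the very large number of output files you mention, you will likely also need to follow the article's other recommendations, such as spreading the data across more prefixes and tuning retry behavior.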

Best,

Didier

AWS
Expert
Answered 7 months ago
