Facing AmazonS3Exception "Please reduce your request rate" while writing data to an S3 bucket using a Glue job


I am currently using a Glue job to read data from one Amazon S3 source, perform some transformations, and write the transformed data to another S3 bucket in Parquet format. While writing to the destination bucket, I am adding partitioning on one of the fields. I am using the code below to write the data:

partition_keys = ["partition_date"]
glueContext.write_dynamic_frame.from_catalog(
        frame=dynamic_frame,
        database=glue_catalog_db,
        table_name=glue_catalog_dest_table,
        transformation_ctx="write_dynamic_frame",
        additional_options={"partitionKeys": partition_keys}
    )

Right now, I am observing the following error message in the logs:

WARN TaskSetManager: Lost task 342.0 in stage 0.0 (TID 343) (172.35.6.249 executor 10): org.apache.spark.SparkException: Task failed while writing rows. Caused by: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Please reduce your request rate. (Service: Amazon S3; Status Code: 503; Error Code: SlowDown;

I just wanted to know if we can ignore these warnings for now. In other words, will there be any data loss due to this issue, or are these errors/warnings automatically retried? If data loss is possible, what would be the best way to avoid this issue?

Note: The number of files to be written to the destination S3 bucket is in the billions.

Dhiraj
Asked 7 months ago · 479 views
1 Answer

Hi,

You cannot ignore such write errors.

This Knowledge Center article will give you the right guidance to resolve it: https://repost.aws/knowledge-center/emr-s3-503-slow-down
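In case it helps, here is a minimal sketch (not a definitive fix) of one of the mitigations that article describes: reducing the number of objects written under each S3 prefix by repartitioning before the write. It reuses the names from your question (glueContext, dynamic_frame, glue_catalog_db, glue_catalog_dest_table); the partition count of 200 is only an illustrative assumption and should be tuned to your data volume.

# Write fewer, larger files per partition_date prefix so fewer S3 PUT
# requests hit each prefix. 200 is an illustrative value, not a recommendation.
from awsglue.dynamicframe import DynamicFrame

df = dynamic_frame.toDF().repartition(200, "partition_date")
dynamic_frame = DynamicFrame.fromDF(df, glueContext, "repartitioned_frame")

glueContext.write_dynamic_frame.from_catalog(
    frame=dynamic_frame,
    database=glue_catalog_db,
    table_name=glue_catalog_dest_table,
    transformation_ctx="write_dynamic_frame",
    additional_options={"partitionKeys": ["partition_date"]},
)

The article also suggests raising the EMRFS retry settings (for example fs.s3.maxRetries) and spreading writes across more key prefixes; how you pass those Hadoop settings to a Glue job depends on your job configuration, so treat the snippet above only as a starting point.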

Best,

Didier

AWS
EXPERT
Answered 7 months ago
