Facing "AmazonS3Exception: Please reduce your request rate" while writing data to an S3 bucket using a Glue job


I am currently using a Glue job to read data from an Amazon S3 source, perform some transformations, and write the transformed data to another S3 bucket in Parquet format. While writing data to the destination bucket, I am partitioning on one of the fields. I am using the code below to write data to the destination:

partition_keys = ["partition_date"]
glueContext.write_dynamic_frame.from_catalog(
        frame=dynamic_frame,
        database=glue_catalog_db,
        table_name=glue_catalog_dest_table,
        transformation_ctx="write_dynamic_frame",
        additional_options={"partitionKeys": partition_keys}
    )

Right now, I am observing the below error message in the logs:

WARN TaskSetManager: Lost task 342.0 in stage 0.0 (TID 343) (172.35.6.249 executor 10): org.apache.spark.SparkException: Task failed while writing rows. Caused by: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Please reduce your request rate. (Service: Amazon S3; Status Code: 503; Error Code: SlowDown;

I just wanted to know whether these warnings can be ignored for now. Specifically, could this issue cause any data loss, or are these errors/warnings automatically retried? If data loss is possible, what is the best way to avoid this issue?

Note: The number of files to be written to the destination S3 bucket is in the billions.

1 Answer

Hi,

You cannot ignore such write errors.

This Knowledge Center article provides the right guidance to resolve the issue: https://repost.aws/knowledge-center/emr-s3-503-slow-down
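For reference, here is a minimal sketch of two of the mitigations that kind of throttling usually calls for, adapted to the job in your question (it reuses your variable names dynamic_frame, glue_catalog_db and glue_catalog_dest_table). The retry property and the values shown are assumptions to be tuned for your workload, not definitive settings.

from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame

sc = SparkContext.getOrCreate()
glueContext = GlueContext(sc)

# 1) Allow more retries for throttled S3 requests before a task fails.
#    fs.s3.maxRetries is an EMRFS setting; the value 30 is only an example.
sc._jsc.hadoopConfiguration().set("fs.s3.maxRetries", "30")

# 2) Repartition by the partition column so each partition_date is written
#    by fewer tasks, producing fewer, larger Parquet files and therefore
#    fewer S3 PUT requests overall.
df = dynamic_frame.toDF().repartition("partition_date")
dynamic_frame = DynamicFrame.fromDF(df, glueContext, "repartitioned")

glueContext.write_dynamic_frame.from_catalog(
    frame=dynamic_frame,
    database=glue_catalog_db,
    table_name=glue_catalog_dest_table,
    transformation_ctx="write_dynamic_frame",
    additional_options={"partitionKeys": ["partition_date"]},
)

With billions of output objects, reducing the number of files written (and so the request rate) is usually the more effective of the two; raising retries only makes the job more tolerant of occasional throttling.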

Best,

Didier
