What is the best way to delete a large amount of data from S3?

Hi, we have around 300 TB of data in S3, in about 500 million files. If we transition this data from S3-IA to Glacier Instant Retrieval using lifecycle policies, what will the total cost be? Are there any costs we need to consider apart from the per-request transition cost? Will there be data transfer or request charges? Alternatively, if we do not archive, what is the best way to delete this data directly from S3-IA?

Aditya
asked 4 months ago
366 views
2 Answers

Hi,

You may want to read this article (in addition to the S3 / Glacier pricing page) to understand the costs associated with your use case: https://www.arqbackup.com/aws-glacier-pricing.html

That should allow you to compute the cost of your transition from S3 to Glacier.

Best,

Didier

AWS
EXPERT
answered 4 months ago

What will the total cost of that, if I use Lifecycle policies?

You don't mention the region the bucket is in, so the following assumes us-east-1 and uses the figures in https://aws.amazon.com/s3/pricing/

300TB = 300,000GB.

Keeping it all in S3-IA will cost $0.0125 x 300,000 = $3,750 / month.

Keeping it all in Glacier Instant Retrieval will cost $0.004 x 300,000 = $1,200 / month.

So on the face of it you're saving $2,550 / month.

There will be a one-off cost to transition 500,000,000 objects from S3-IA to Glacier Instant Retrieval.

Lifecycle transition requests into Glacier Instant Retrieval cost $0.02 per 1,000 requests. 500,000,000 objects means 500,000 blocks of 1,000 requests, so the cost here will be $0.02 x 500,000 = $10,000.
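To make the arithmetic above easy to rerun with your own numbers, here is a minimal Python sketch. It only restates the figures quoted in this answer (us-east-1 prices as listed on the S3 pricing page when this was written); the break-even calculation is my own addition and ignores request and retrieval charges, so treat it as a rough guide rather than a quote.

```python
from decimal import Decimal

# Figures quoted in this answer (us-east-1); check the pricing page for current values.
STORAGE_GB = Decimal("300000")         # 300 TB treated as 300,000 GB
OBJECT_COUNT = Decimal("500000000")    # 500 million objects
IA_PER_GB_MONTH = Decimal("0.0125")    # S3-IA storage, $ per GB-month
GIR_PER_GB_MONTH = Decimal("0.004")    # Glacier Instant Retrieval storage, $ per GB-month
TRANSITION_PER_1000 = Decimal("0.02")  # lifecycle transition, $ per 1,000 requests

ia_monthly = STORAGE_GB * IA_PER_GB_MONTH
gir_monthly = STORAGE_GB * GIR_PER_GB_MONTH
monthly_saving = ia_monthly - gir_monthly

# One transition request per object, billed per 1,000 requests.
transition_cost = OBJECT_COUNT / 1000 * TRANSITION_PER_1000

# Months of storage savings needed to pay back the one-off transition cost.
break_even_months = transition_cost / monthly_saving

print(f"S3-IA storage:       ${ia_monthly:,.2f} / month")
print(f"Glacier IR storage:  ${gir_monthly:,.2f} / month")
print(f"Monthly saving:      ${monthly_saving:,.2f}")
print(f"One-off transition:  ${transition_cost:,.2f}")
print(f"Break-even after:    {break_even_months:.1f} months")
```

On these figures the one-off transition cost is recovered in roughly four months of storage savings.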

Don't rely on my figures when making your decision; use the AWS Pricing Calculator to come up with a figure that accurately reflects your particular use case: https://calculator.aws/#/

Also bear in mind that requests against Glacier Instant Retrieval are more expensive than against S3-IA: PUT, COPY and LIST cost double, while GET, SELECT and all other requests cost ten times as much. You will have an idea of how many of these you make each month, and that should be factored into the final decision.
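As a rough illustration of how request volume eats into the storage saving, the sketch below subtracts the extra GET cost from the $2,550 monthly saving. The GET prices follow from the figures in this answer ($0.001 per 1,000 GETs on S3-IA, ten times that on Glacier Instant Retrieval). The function name and the 10 million GETs per month workload are hypothetical, and per-GB retrieval fees and other request types are deliberately left out.

```python
from decimal import Decimal

# Figures taken from this answer (us-east-1).
MONTHLY_STORAGE_SAVING = Decimal("2550")  # S3-IA minus Glacier IR storage, $ per month
IA_GET_PER_1000 = Decimal("0.001")        # S3-IA GET price per 1,000 requests
GIR_GET_PER_1000 = IA_GET_PER_1000 * 10   # "ten times more expensive"


def net_monthly_saving(gets_per_month: int) -> Decimal:
    """Storage saving minus the extra GET request cost in Glacier Instant Retrieval."""
    extra_per_1000 = GIR_GET_PER_1000 - IA_GET_PER_1000
    extra_request_cost = Decimal(gets_per_month) / 1000 * extra_per_1000
    return MONTHLY_STORAGE_SAVING - extra_request_cost


# Hypothetical workload: 10 million GETs per month.
print(f"Net saving: ${net_monthly_saving(10_000_000):,.2f} / month")
```

Plug in your own monthly GET count; if the result goes negative, the extra request charges outweigh the cheaper storage.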

Alternatively, if we do not archive, what is the best way to delete this data directly from S3-IA?

The Requests & data retrievals section of https://aws.amazon.com/s3/pricing/ states:

DELETE and CANCEL requests are free

However, I'm not sure whether you'll need to issue any GETs to determine whether an object should be deleted. 1,000 GET requests cost $0.001, so 500 million of them would cost $500, and that assumes only a single GET per object.

Expiring them with a lifecycle rule is also free. From https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-expire-general-considerations.html#lifecycle-expire-when:

You are not charged for expiration
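For reference, an expiration rule can be set up along these lines with boto3. This is a sketch under assumptions, not a tested recipe for your bucket: the rule ID, the one-day expiry and the helper names are placeholders of mine, and it assumes an unversioned bucket (on a versioned bucket, expiration only adds delete markers and noncurrent versions need their own rule). Note that `put_bucket_lifecycle_configuration` replaces any lifecycle configuration already on the bucket.

```python
import json


def build_expiration_rule(prefix: str = "", days: int = 1) -> dict:
    """Build a lifecycle configuration that expires objects under `prefix`.

    An empty prefix applies the rule to every object in the bucket.
    """
    return {
        "Rules": [
            {
                "ID": "expire-objects",          # placeholder rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Expiration": {"Days": days},    # days after object creation
            }
        ]
    }


def apply_rule(bucket: str, config: dict) -> None:
    """Apply the configuration; this overwrites the bucket's existing lifecycle rules."""
    import boto3  # third-party; imported here so the builder works without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=config,
    )


if __name__ == "__main__":
    # Print the configuration for review before applying it to a real bucket.
    print(json.dumps(build_expiration_rule(), indent=2))
```

Lifecycle expiration runs asynchronously, so with 500 million objects expect the actual removal to take some time after the rule becomes eligible.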

EXPERT
Steve_M
answered 4 months ago
EXPERT
reviewed 4 months ago
