What is the best way to delete a large amount of data from S3?

Hi, we have around 300 TB of data in S3, in about 500 million files. If we transition this data from S3-IA to Glacier Instant Retrieval using lifecycle policies, what will the total cost be? Are there any costs we need to consider apart from the per-request transition cost? Will there be data transfer or request charges? Alternatively, if we do not archive, what is the best way to delete this data directly from S3-IA?

Aditya
Asked 2 months ago · 308 views
2 Answers

Hi,

You may want to read this article (in addition to the S3 and Glacier pricing pages) to understand the costs associated with your use case: https://www.arqbackup.com/aws-glacier-pricing.html

That should allow you to compute the cost of your transition from S3 to Glacier.

Best,

Didier

AWS
Expert
Answered 2 months ago

What will the total cost of that be, if I use Lifecycle policies?

You don't mention the region the bucket is in, so the following assumes us-east-1 and uses the figures in https://aws.amazon.com/s3/pricing/

300 TB = 300,000 GB.

Keeping it all in S3-IA will cost $0.0125 x 300,000 = $3,750 / month.

Keeping it all in Glacier Instant Retrieval will cost $0.004 x 300,000 = $1,200 / month.

So on the face of it you're saving $2,550 / month.

There will be a one-off cost to transition 500,000,000 objects from S3-IA to Glacier Instant Retrieval.

Lifecycle transition requests into Glacier Instant Retrieval cost $0.02 per 1,000 requests. 500,000,000 objects is 500,000 lots of 1,000 requests, so the cost here will be $0.02 x 500,000 = $10,000.
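The arithmetic above can be reproduced with a short script, which makes it easy to swap in your own region's prices. This is only a sketch: every price is the us-east-1 figure quoted in this answer rather than something fetched from AWS, and the break-even calculation is an extra step that ignores request and retrieval charges.

```python
# All prices are the us-east-1 figures quoted in this answer;
# they are assumptions here, not values fetched from AWS.
GB_STORED = 300_000          # 300 TB, counted as 300,000 GB
OBJECT_COUNT = 500_000_000   # about 500 million objects
IA_PER_GB_MONTH = 0.0125     # S3-IA storage, USD per GB-month
GIR_PER_GB_MONTH = 0.004     # Glacier Instant Retrieval storage, USD per GB-month
TRANSITION_PER_1000 = 0.02   # lifecycle transition into GIR, USD per 1,000 requests


def transition_summary():
    """Return monthly storage costs, the one-off transition cost and break-even time."""
    ia_monthly = GB_STORED * IA_PER_GB_MONTH
    gir_monthly = GB_STORED * GIR_PER_GB_MONTH
    monthly_saving = ia_monthly - gir_monthly
    transition_cost = OBJECT_COUNT / 1000 * TRANSITION_PER_1000
    return {
        "ia_monthly": ia_monthly,
        "gir_monthly": gir_monthly,
        "monthly_saving": monthly_saving,
        "transition_cost": transition_cost,
        # Months of storage savings needed to pay back the one-off cost.
        "break_even_months": transition_cost / monthly_saving,
    }


if __name__ == "__main__":
    for name, value in transition_summary().items():
        print(f"{name}: {value:,.2f}")
```

With these inputs the one-off transition cost is recovered after roughly four months of storage savings.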

Don't rely on my figures when making your decision; use the AWS Pricing Calculator to come up with a figure that accurately reflects your particular use case: https://calculator.aws/#/

Also bear in mind that requests are more expensive against Glacier Instant Retrieval than against S3-IA: PUT, COPY and LIST cost double, while GET, SELECT and all other requests cost ten times as much. You will have an idea of how many of these you make each month, and that should be factored into the final decision.
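To see how request volume eats into the storage saving, here is a rough sketch for GET requests only. It assumes the S3-IA GET price of $0.001 per 1,000 requests quoted further down in this answer and the "ten times" multiplier above; the monthly GET count is a made-up placeholder to replace with your own number.

```python
IA_GET_PER_1000 = 0.001   # S3-IA GET price quoted in this answer, USD (assumed us-east-1)
GIR_GET_MULTIPLIER = 10   # "ten times" as expensive in Glacier Instant Retrieval, per this answer


def monthly_get_cost(gets_per_month, price_per_1000):
    """Cost in USD of a month's GET requests at the given price per 1,000."""
    return gets_per_month / 1000 * price_per_1000


def extra_get_cost_after_transition(gets_per_month):
    """Additional monthly GET spend once the objects live in Glacier Instant Retrieval."""
    ia_cost = monthly_get_cost(gets_per_month, IA_GET_PER_1000)
    gir_cost = monthly_get_cost(gets_per_month, IA_GET_PER_1000 * GIR_GET_MULTIPLIER)
    return gir_cost - ia_cost


if __name__ == "__main__":
    hypothetical_gets = 50_000_000  # made-up workload; replace with your own figure
    extra = extra_get_cost_after_transition(hypothetical_gets)
    print(f"Extra GET cost per month: ${extra:,.2f}")
```

Subtract the result from the monthly storage saving to get a more realistic picture. PUT, COPY and LIST requests can be handled the same way with their own prices and the "double" multiplier.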

Alternatively, if we do not archive, what is the best way to delete this data directly from S3-IA?

From the Requests & data retrievals section of https://aws.amazon.com/s3/pricing/:

DELETE and CANCEL requests are free

However, I'm not sure whether you'll need to issue any GETs before you can decide whether an object should be deleted. 1,000 GET requests cost $0.001, so 500 million of them will cost $500, and that assumes only a single GET per object.

Expiring them with a lifecycle rule is free: https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-expire-general-considerations.html#lifecycle-expire-when

You are not charged for expiration
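As an illustration of the lifecycle route, the sketch below builds an expiration rule and applies it with boto3's put_bucket_lifecycle_configuration. The rule ID, prefix, day count and bucket name are placeholders of mine, not values from the question. Note that this call replaces any lifecycle configuration already on the bucket, and that on a versioned bucket an Expiration rule only adds delete markers, so noncurrent versions would need their own rule.

```python
def build_expiration_config(prefix="", days=1, rule_id="expire-objects"):
    """Build a lifecycle configuration that expires objects under `prefix`.

    `rule_id`, `prefix` and `days` are illustrative placeholders.
    An empty prefix matches every object in the bucket.
    """
    if days < 1:
        raise ValueError("Expiration days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_config(bucket, config):
    """Apply the configuration; overwrites the bucket's existing lifecycle rules."""
    import boto3  # third-party AWS SDK, imported lazily so the builder works without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=config,
    )


if __name__ == "__main__":
    config = build_expiration_config(prefix="", days=1)
    print(config)
    # apply_config("my-example-bucket", config)  # hypothetical bucket name
```

Expiration runs asynchronously, so the objects may take some time to disappear after the rule becomes effective.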

Steve_M
Expert
Answered 2 months ago
Reviewed by an expert 2 months ago
