What is the best way to delete a large amount of data from S3?


Hi, we have around 300 TB of data in S3, in about 500 million files. If we transition this data from S3-IA to Glacier Instant Retrieval using Lifecycle policies, what will the total cost be? Are there any costs we need to consider apart from the per-request transition cost? Will there be data transfer or request charges? Alternatively, if we do not archive, what is the best way to delete this data directly from S3-IA?

Aditya
asked 2 months ago, 309 views
2 Answers

Hi,

You may want to read this article (in addition to the S3 / Glacier pricing page) to understand the costs associated with your use case: https://www.arqbackup.com/aws-glacier-pricing.html

That should allow you to compute the cost of your S3 -> Glacier transition.

Best,

Didier

AWS
EXPERT
answered 2 months ago

What will the total cost of that, if I use Lifecycle policies?

You don't mention the region the bucket is in, so the following assumes us-east-1 and uses the figures in https://aws.amazon.com/s3/pricing/

300 TB = 300,000 GB.

Keeping it all in S3-IA will cost $0.0125 per GB x 300,000 GB = $3,750 / month.

Keeping it all in Glacier Instant Retrieval will cost $0.004 per GB x 300,000 GB = $1,200 / month.

So on the face of it you're saving $2,550 / month.

There will be a one-off cost to transition 500,000,000 objects from S3-IA to Glacier Instant Retrieval.

Lifecycle transition requests into Glacier Instant Retrieval cost $0.02 per 1,000 requests. 500,000,000 objects is 500,000 batches of 1,000, so the cost here will be $0.02 x 500,000 = $10,000.
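To make the arithmetic above easy to rerun with your own numbers, here is a minimal Python sketch. It only uses the us-east-1 prices quoted in this answer, hard-coded as constants. The function name `transition_estimate` and the break-even figure are my own additions for illustration, and the prices may have changed, so check the pricing page first.

```python
from decimal import Decimal

# Prices quoted in this answer for us-east-1 (assumptions: verify them on
# https://aws.amazon.com/s3/pricing/ before relying on the output).
IA_PER_GB_MONTH = Decimal("0.0125")     # S3-IA storage, per GB per month
GIR_PER_GB_MONTH = Decimal("0.004")     # Glacier Instant Retrieval storage
TRANSITION_PER_1000 = Decimal("0.02")   # lifecycle transitions into GIR


def transition_estimate(total_gb, object_count):
    """Return monthly storage costs, the one-off transition cost and break-even."""
    ia_monthly = IA_PER_GB_MONTH * total_gb
    gir_monthly = GIR_PER_GB_MONTH * total_gb
    monthly_saving = ia_monthly - gir_monthly
    # One transition request per object, billed per 1,000 requests.
    one_off = TRANSITION_PER_1000 * object_count / 1000
    return {
        "ia_monthly": ia_monthly,
        "gir_monthly": gir_monthly,
        "monthly_saving": monthly_saving,
        "transition_one_off": one_off,
        # Months of storage savings needed to pay back the transition cost.
        "break_even_months": one_off / monthly_saving,
    }


estimate = transition_estimate(total_gb=300_000, object_count=500_000_000)
for name, value in estimate.items():
    print(f"{name}: {value:.2f}")
```

With these inputs the one-off transition cost is paid back by the storage saving in roughly four months. The estimate covers storage and transition requests only, not the ongoing request charges.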

Don't rely on my figures when making your decision. Use the AWS Pricing Calculator to come up with a figure that accurately reflects your particular use case: https://calculator.aws/#/

Also bear in mind that requests against Glacier Instant Retrieval are more expensive than against S3-IA: PUT, COPY and LIST cost double, while GET, SELECT and all other requests are ten times more expensive. You will have an idea of how many of these you make each month, and that should be factored into the final decision.

Alternatively, if we do not archive, what is the best way to delete this data directly from S3-IA?

From the "Requests & data retrievals" section of https://aws.amazon.com/s3/pricing/:

DELETE and CANCEL requests are free

But I'm not absolutely sure whether you'll need to do any GETs before you can determine whether an object should be deleted. 1,000 GET requests cost $0.001, so 500 million of them will cost $500, and that's assuming only a single GET for each object.

Expiring them with a lifecycle rule is free, see https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-expire-general-considerations.html#lifecycle-expire-when:

You are not charged for expiration
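For illustration, an expiration rule like that could be set up with boto3 roughly as follows. This is a sketch under assumptions: the bucket name, prefix, rule ID and the one-day expiry are placeholders I made up, and the helper names `build_expiration_config` and `apply_expiration` are hypothetical. Note that `put_bucket_lifecycle_configuration` replaces the bucket's whole existing lifecycle configuration, so merge in any rules you already have. On a versioned bucket you would likely also need a rule for noncurrent versions.

```python
def build_expiration_config(prefix="", days=1, rule_id="expire-old-data"):
    """Build a lifecycle configuration that expires objects under `prefix`.

    An empty prefix applies the rule to every object in the bucket.
    """
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # Objects become eligible for expiration `days` after creation.
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_expiration(bucket, config):
    """Apply the configuration; this overwrites any existing lifecycle rules."""
    import boto3  # imported here so the builder above works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


config = build_expiration_config(prefix="archive/", days=1)
print(config)
# apply_expiration("my-example-bucket", config)  # placeholder bucket name
```

S3 removes expired objects asynchronously, so with 500 million objects expect the cleanup to take some time after the rule becomes effective.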

Steve_M
EXPERT
answered 2 months ago
Reviewed by an EXPERT 2 months ago
