
S3 Bucket Size increase 10X


We have an S3 bucket that held steady at about 2 TB of storage from 2018 to 2023. Sonatype Nexus Repo pushes objects to it. Versioning is enabled on the bucket, and a CloudFormation stack template deploys the Nexus Repo Manager server. In July 2023 we modified the template YAML, changing the Nexus version from 3.17.0-01 to 3.32.0-03. The update completed successfully. Ever since that update, objects have been uploaded to the bucket much more frequently. Before it, we could see gaps of days, weeks, or months between objects. Since the update we have seen multiple objects per day, consistently, every day.

Versions account for only about 64% of the total bucket size (18 TB of the 28 TB total), so even a lifecycle rule that cleans up non-current versions wouldn't account for all of the bloat. We've tried upgrading Nexus to the most current version, but the bucket continues to grow.

We have a backup bucket for this, which is populated through a replication rule, and I've been experimenting with that. I attempted to delete all non-current versions; we saw a slight reduction but nothing substantial. We then turned the replication rule off and suspended versioning, and I modified the lifecycle rule to expire any objects after 2 weeks and delete non-current versions, yet the bucket still shows up as 22 TB in CloudWatch. I've also looked into incomplete multipart uploads; they don't represent a significant portion of the storage, a few GB at most.

My two questions are: Why are objects being pushed to our Nexus bucket more frequently since the version update? And why, after clearing out the backup bucket, are we still seeing 22 TB of storage there?

Bucket Growth

1 Answer

Check the S3 Storage Lens activity metrics for the bucket.

S3 > Storage Lens > default-account-dashboard > select "Activity" from the dropdown > select the bucket name from the dropdown at the top left > set the date to today > set the metrics to month by month > apply filters.

Check the percentage increase in stored data and the total number of requests to the S3 bucket.

  1. Why, after clearing out the backup bucket, are we still seeing 22 TB of storage there?
  • A few other options for cost savings: configure S3 Intelligent-Tiering, or create a lifecycle rule that moves objects to Glacier Instant Retrieval (GLACIER_IR) after a certain age, or only for a specific prefix ("subfolder") whose data is likely to be rarely accessed.
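As a rough illustration of the lifecycle option above, here is a minimal sketch that builds a transition rule as a plain Python dict. The dict follows the shape that boto3's `put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=...)` accepts. The helper name `build_transition_rule`, the prefix `archive/`, and the 90-day age are assumptions chosen for the example, not values from this thread.

```python
import json


def build_transition_rule(prefix, days, storage_class="GLACIER_IR"):
    """Return one lifecycle rule that transitions current objects under `prefix`.

    Hypothetical helper: prefix, days and the rule ID are illustrative only.
    """
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "ID": "transition-{}-after-{}d".format(storage_class.lower(), days),
        "Status": "Enabled",
        # The filter is a plain string prefix match on object keys.
        "Filter": {"Prefix": prefix},
        "Transitions": [{"Days": days, "StorageClass": storage_class}],
    }


# Assumed example values: objects under "archive/" move after 90 days.
lifecycle_config = {"Rules": [build_transition_rule("archive/", 90)]}
print(json.dumps(lifecycle_config, indent=2))
```

Building the configuration as data first makes it easy to review before anything is applied to a bucket. Note that applying a lifecycle configuration replaces the bucket's existing one, so any rules already in place would need to be included in `Rules` as well.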

Further, your team should get insights on data access patterns for this bucket.

When did the spike in data start? Are the lifecycle policies working correctly? Sometimes the prefix configured in a lifecycle policy is wrong (for example, missing the / at the end), which can cause data not to be deleted.
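To make the prefix point concrete: a lifecycle rule's prefix filter is a plain string match against the start of each object key. The sketch below mimics that match locally so a prefix can be sanity-checked against a few real keys before relying on the rule. The key names are made up for illustration, and `keys_matched` is a hypothetical helper, not an AWS API.

```python
from typing import Iterable, List


def keys_matched(prefix: str, keys: Iterable[str]) -> List[str]:
    """Return the keys a rule with this prefix filter would cover."""
    return [key for key in keys if key.startswith(prefix)]


# Hypothetical object keys; substitute a sample from the real bucket.
sample_keys = [
    "nexus/content/a.bytes",
    "nexus-old/b.bytes",
    "other/c.bytes",
]

for candidate in ("nexus/", "nexus", "/nexus/"):
    print(repr(candidate), "->", keys_matched(candidate, sample_keys))
```

With these sample keys, `nexus/` covers only the first key, `nexus` (no trailing slash) also picks up `nexus-old/b.bytes`, and `/nexus/` (leading slash) covers nothing, since object keys do not normally begin with a slash.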

AWS
EXPERT
answered 9 months ago
EXPERT
reviewed 9 months ago
  • I don't see "Activity" as one of the dropdown options to select from; there's "Active Bucket", which isn't very insightful. The date range in Storage Lens only lets me look back 2 weeks, so unless I look through CloudWatch I can't observe the growth over time.

    Again, I just want to clarify that I'm talking about two issues here. The first is the increase in cost in bucket A, which has no lifecycle rules currently enabled. We upgraded the version of Nexus we use in the CloudFormation template and suddenly objects began to be uploaded much more frequently.

    The second issue is that we have a backup bucket (bucket B), which is populated via a replication rule in bucket A. To test the effects of applying a lifecycle rule, we implemented one in the backup bucket, suspended versioning, and disabled the replication rule. This means nothing should be uploaded to this bucket at all. The lifecycle rule expires objects after a few days and deletes non-current versions after a few days. But even after implementing this rule, and with the bucket appearing empty (except that toggling "Show versions" shows a delete marker on every deleted object), the bucket is still registering as 22 TB.
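One way to check whether non-current versions behind those delete markers still hold the bytes is to total the version listing by category. The sketch below is a minimal example, not a diagnosis: it works on pages shaped like the responses from boto3's `list_object_versions` (`Versions` entries with `IsLatest` and `Size`, plus a `DeleteMarkers` list). The sample data and the function name `summarize_versions` are assumptions for illustration.

```python
def summarize_versions(pages):
    """Total bytes for current and non-current versions and count delete markers."""
    totals = {"current_bytes": 0, "noncurrent_bytes": 0, "delete_markers": 0}
    for page in pages:
        for version in page.get("Versions", []):
            category = "current_bytes" if version["IsLatest"] else "noncurrent_bytes"
            totals[category] += version["Size"]
        # Delete markers have no size, so they are only counted.
        totals["delete_markers"] += len(page.get("DeleteMarkers", []))
    return totals


# Made-up pages. With boto3, real pages could come from (assumed bucket name):
#   boto3.client("s3").get_paginator("list_object_versions").paginate(Bucket="bucket-b")
sample_pages = [
    {
        "Versions": [
            {"Key": "a", "VersionId": "v1", "IsLatest": False, "Size": 500},
            {"Key": "b", "VersionId": "v2", "IsLatest": True, "Size": 200},
        ],
        "DeleteMarkers": [{"Key": "a", "VersionId": "v3", "IsLatest": True}],
    },
    {"Versions": [{"Key": "c", "VersionId": "v4", "IsLatest": False, "Size": 300}]},
]

print(summarize_versions(sample_pages))
```

If most of the bytes turn out to be non-current, the lifecycle rule's non-current version expiration is the part to look at. For a bucket with many millions of objects, listing every version is slow, and an S3 Inventory report may be a more practical source for the same totals.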
