Quickest way to calculate total S3 Size

0

I am currently using a Lambda function to calculate the total S3 storage size in an account. First, I list all the buckets and determine which region each bucket is in. Then I configure a CloudWatch client for that region and fetch the bucket's size metric from it. However, this takes too long and causes the Lambda to time out. What else can I do? I also tried grouping the buckets by region, configuring one client per region, and then fetching each bucket's size from the CloudWatch metric, but that timed out as well. The account has a large number of S3 buckets.

3 Answers
1

You can view the total S3 storage in an account from S3 Storage Lens. In the S3 console, choose Dashboards under Storage Lens, and then open default-account-dashboard. The overview page shows the total storage size for the account. This value is updated daily.

[Screenshot: Default account dashboard overview page]

answered a year ago
EXPERT
reviewed a year ago
0

For accounts with a large number of S3 buckets, there are several more efficient approaches than using Lambda to query CloudWatch metrics for each bucket:

  1. Use S3 Storage Lens: This is likely the most efficient solution for your use case. S3 Storage Lens is specifically designed to provide organization-wide visibility into object storage usage and activity across AWS accounts. It offers a single view of storage usage and can handle large numbers of buckets efficiently. The advanced version includes features like prefix aggregation and 15 months of historical data analysis.

  2. Use S3 Inventory with Amazon Athena: For very large bucket counts, S3 Inventory can generate reports of all objects and their metadata, which you can then query efficiently using Amazon Athena. This approach works well for arbitrarily large numbers of objects:

  • Set up a separate bucket to receive inventory reports
  • Configure S3 Inventory reports (delivered daily or weekly)
  • Use Athena to run SQL queries against the inventory data to calculate total sizes

  3. Optimize your CloudWatch approach: If you still prefer using CloudWatch metrics:

  • Use parallel processing to fetch metrics for multiple buckets simultaneously
  • Implement pagination when listing buckets to process them in smaller batches
  • Increase your Lambda timeout (the maximum is 15 minutes) and memory allocation
  • Batch metric requests with the CloudWatch GetMetricData API, which accepts up to 500 metric queries per call, instead of making one request per bucket

  4. Use AWS SDK with pagination: Write a script using an AWS SDK (such as Boto3 for Python) that paginates when listing buckets and fetching metrics. This can be more efficient than the Lambda approach, especially if run on an EC2 instance with appropriate IAM permissions, because it is not bound by the Lambda timeout.

Remember that CloudWatch records S3 storage metrics (such as BucketSizeBytes) once per day, so real-time data isn't available regardless of which method you choose.
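
As a rough illustration of the CloudWatch optimizations above, here is a minimal sketch that batches up to 500 buckets into each GetMetricData call and processes regions in parallel. The function names (`build_queries`, `region_total_bytes`, `account_total_bytes`) and the `client_factory` parameter are hypothetical and not from the original answer. The sketch also assumes that only the `StandardStorage` storage type matters; a real account total would need one query per storage type in use. Because the metric is daily, it looks back two days and keeps the newest datapoint per bucket.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta, timezone

MAX_QUERIES_PER_CALL = 500  # GetMetricData accepts at most 500 queries per request


def build_queries(buckets, storage_type="StandardStorage"):
    """Build one BucketSizeBytes query per bucket (storage_type is an assumption)."""
    return [
        {
            "Id": f"m{i}",  # Ids must start with a lowercase letter
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/S3",
                    "MetricName": "BucketSizeBytes",
                    "Dimensions": [
                        {"Name": "BucketName", "Value": bucket},
                        {"Name": "StorageType", "Value": storage_type},
                    ],
                },
                "Period": 86400,  # the metric is reported once per day
                "Stat": "Average",
            },
            "ReturnData": True,
        }
        for i, bucket in enumerate(buckets)
    ]


def region_total_bytes(cloudwatch, buckets, now=None):
    """Sum the newest BucketSizeBytes datapoint of each bucket in one region."""
    now = now or datetime.now(timezone.utc)
    total = 0.0
    for start in range(0, len(buckets), MAX_QUERIES_PER_CALL):
        batch = buckets[start:start + MAX_QUERIES_PER_CALL]
        kwargs = {
            "MetricDataQueries": build_queries(batch),
            "StartTime": now - timedelta(days=2),
            "EndTime": now,
        }
        latest = {}  # query Id -> (timestamp, value) of the newest datapoint
        while True:
            response = cloudwatch.get_metric_data(**kwargs)
            for result in response["MetricDataResults"]:
                for ts, value in zip(result["Timestamps"], result["Values"]):
                    if result["Id"] not in latest or ts > latest[result["Id"]][0]:
                        latest[result["Id"]] = (ts, value)
            token = response.get("NextToken")
            if not token:
                break
            kwargs["NextToken"] = token  # follow pagination within this batch
        total += sum(value for _, value in latest.values())
    return total


def account_total_bytes(buckets_by_region, client_factory, max_workers=8):
    """Run one region per worker thread and add up the regional totals."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(region_total_bytes, client_factory(region), names)
            for region, names in buckets_by_region.items()
        ]
        return sum(future.result() for future in futures)
```

With Boto3, `client_factory` could be something like `lambda region: boto3.client("cloudwatch", region_name=region)`, and `buckets_by_region` would come from listing the buckets and looking up each bucket's region. Passing the client in keeps the logic testable without AWS credentials.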
Sources
Calculate Amazon S3 bucket size and number of objects | AWS re:Post
Accessing CloudWatch metrics - Amazon Simple Storage Service
Folder size in S3 bucket | AWS re:Post

answered a year ago
  • I have to use Lambda, and unfortunately, to get the total size there, we have to check each region.

0

Hello.

How about using Step Functions?
By moving the processing that Lambda currently performs into a Step Functions state machine, the work can run for longer than Lambda's 15-minute limit.
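
One way this could look, as a sketch only: a first step splits the buckets into per-region batches, a Map state runs one short Lambda invocation per batch, and a final step adds up the subtotals. The helper names (`plan_batches`, `aggregate_handler`) and the payload shapes (`region`, `buckets`, `bytes`) are assumptions for illustration, not part of any AWS API.

```python
from collections import defaultdict


def plan_batches(bucket_regions, batch_size=500):
    """Group buckets by region and split each group into Map-state work items.

    bucket_regions maps bucket name -> region name (hypothetical input shape).
    """
    by_region = defaultdict(list)
    for bucket, region in bucket_regions.items():
        by_region[region].append(bucket)

    items = []
    for region in sorted(by_region):
        names = sorted(by_region[region])
        for start in range(0, len(names), batch_size):
            items.append({"region": region, "buckets": names[start:start + batch_size]})
    return items


def aggregate_handler(event, context=None):
    """Hypothetical final-step handler: sums the subtotal returned by each batch.

    Assumes the Map state output is a list of {"bytes": <number>} objects.
    """
    return {"total_bytes": sum(item["bytes"] for item in event)}
```

Each work item is small enough for a single invocation to finish well inside the Lambda timeout, while the state machine as a whole can run for much longer.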

EXPERT
answered a year ago
EXPERT
reviewed a year ago