As I understand it, S3 Storage Lens (free tier) is technically the best out-of-the-box free option, but as Manohar already noted, its 24-hour latency is the dealbreaker for near-real-time needs.
Regarding the "Glacier & Deep Archive Blind Spot"
One detail often missed is that objects moved to S3 Glacier or S3 Glacier Deep Archive are archived. This means:
- Standard Metadata Access: `LIST` and `HeadObject` requests still return the key, size, and storage class of an archived object; only reading the object data requires a restore. The problem for scripts that calculate sizes is scale: walking millions of keys means a very large number of API calls.
- The S3 Inventory Advantage: S3 Inventory is the most reliable and cost-effective way to get metadata (like size and storage class) for millions of archived objects without incurring massive API overhead or restore costs. It provides a flat file (CSV, ORC, or Parquet) containing the metadata for every object in your bucket, including those in deep archive.
"Amazon S3 inventory provides comma-separated values (CSV), Apache Optimized Row Columnar (ORC) or Apache Parquet output files that list your objects and their corresponding metadata on a daily or weekly basis... S3 inventory is one of the most efficient ways to manage your storage, as it avoids the need to perform expensive synchronous List requests."
Source: https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-inventory.html
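As a rough illustration of how an inventory report answers the "size per storage class" question, here is a minimal sketch that sums the `Size` column per `StorageClass` from inventory CSV data. It assumes the inventory was configured to include the optional `Size` and `StorageClass` fields, and that the column order is read from the `fileSchema` entry in the report's manifest (the CSV files themselves have no header row). The bucket name, keys, and the function name `bytes_by_storage_class` are made up for the example.

```python
import csv
import io
from collections import defaultdict


def bytes_by_storage_class(csv_text, file_schema):
    """Sum object sizes per storage class from S3 Inventory CSV text.

    file_schema is the comma-separated column list taken from the
    inventory manifest, e.g. "Bucket, Key, Size, StorageClass".
    """
    columns = [name.strip() for name in file_schema.split(",")]
    size_idx = columns.index("Size")
    class_idx = columns.index("StorageClass")

    totals = defaultdict(int)
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip blank lines and rows without a size (e.g. delete markers).
        if not row or row[size_idx] == "":
            continue
        totals[row[class_idx]] += int(row[size_idx])
    return dict(totals)


# Hypothetical sample data in the shape of an inventory CSV file.
sample_schema = "Bucket, Key, Size, StorageClass"
sample_csv = (
    '"my-example-bucket","logs/a.gz","1024","STANDARD"\n'
    '"my-example-bucket","logs/b.gz","2048","GLACIER"\n'
    '"my-example-bucket","archive/c.tar","4096","DEEP_ARCHIVE"\n'
    '"my-example-bucket","logs/d.gz","512","STANDARD"\n'
)

totals = bytes_by_storage_class(sample_csv, sample_schema)
for storage_class, total in sorted(totals.items()):
    print(f"{storage_class}: {total} bytes")
```

Real inventory files are gzip-compressed and split across several objects, so in practice you would decompress each file and merge the per-file totals; for very large buckets, querying the report with Athena is usually more convenient.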
The fundamental issue is that S3 is an object store, not a file system. To give you a "real-time" sum of millions of objects, AWS would have to scan metadata constantly, which costs compute power. Since you want to avoid costs, here are three deeper perspectives:
1. The "Event-Driven" Workaround (Near Real-Time & Low Cost)
If you need to track changes as they happen without waiting 24 hours, you can build a simple monitoring pipeline. This is often covered by the AWS Free Tier:
- S3 Event Notifications: Enable events for `s3:ObjectCreated:*` and `s3:ObjectRemoved:*`. https://docs.aws.amazon.com/AmazonS3/latest/userguide/EventNotifications.html
- AWS Lambda: Trigger a small function on every event.
- DynamoDB (Atomic Counter): The Lambda function updates a single table that keeps a running total of bytes per storage class (e.g., Standard: 500GB, Glacier: 200GB).
Benefit: This gives you a dashboard with a near-real-time running total for all new activity (event notifications usually arrive within seconds, but delivery can occasionally take longer). Note: You would need to run a one-time S3 Inventory report to get the "starting balance" for your existing millions of objects.
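The pipeline above could be sketched roughly as follows. This is only a minimal sketch under several assumptions: the table name `s3-usage-totals`, its key schema (`bucket` + `storage_class`), and the helper names are hypothetical. As far as I know, the notification record carries the object size but not its storage class, so the sketch takes a `lookup_storage_class` callable (for example a `HeadObject` call) instead of reading it from the event. It only handles `ObjectCreated` events; removals and overwrites would need a per-key size record so the old size can be subtracted.

```python
from urllib.parse import unquote_plus


def size_deltas(event, lookup_storage_class):
    """Return {(bucket, storage_class): bytes_added} for ObjectCreated records."""
    deltas = {}
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue  # removals are not handled in this sketch
        bucket = record["s3"]["bucket"]["name"]
        obj = record["s3"]["object"]
        key = unquote_plus(obj["key"])  # keys arrive URL-encoded
        storage_class = lookup_storage_class(bucket, key)
        slot = (bucket, storage_class)
        deltas[slot] = deltas.get(slot, 0) + int(obj.get("size", 0))
    return deltas


def apply_deltas(table, deltas):
    """Add each delta to a running total using DynamoDB's atomic ADD."""
    for (bucket, storage_class), delta in deltas.items():
        table.update_item(
            Key={"bucket": bucket, "storage_class": storage_class},
            UpdateExpression="ADD total_bytes :d",
            ExpressionAttributeValues={":d": delta},
        )


def lambda_handler(event, context):
    import boto3  # imported here so the helpers above stay testable offline

    s3 = boto3.client("s3")
    table = boto3.resource("dynamodb").Table("s3-usage-totals")  # hypothetical name

    def lookup(bucket, key):
        # HeadObject omits StorageClass for S3 Standard objects.
        return s3.head_object(Bucket=bucket, Key=key).get("StorageClass", "STANDARD")

    apply_deltas(table, size_deltas(event, lookup))
```

Using `ADD` in the update expression lets concurrent Lambda invocations increment the same item without a read-modify-write race, which is the point of the atomic counter.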
2. Why you should avoid CLI (ls --recursive)
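For buckets with millions of objects, running a manual scan via CLI is not strictly free and does not scale. You are charged for LIST requests (currently about $0.005 per 1,000 requests for S3 Standard in us-east-1), and each LIST request returns at most 1,000 keys. Scanning 10 million objects therefore takes about 10,000 requests, or roughly $0.05 per scan in API fees. The bigger problem is time: the scan pages through the bucket sequentially, is slow on large buckets, and has to be repeated in full every time you want a fresh number.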
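To sanity-check the request cost of a full scan, here is a small back-of-the-envelope sketch. It assumes every LIST call returns the maximum of 1,000 keys and uses the $0.005 per 1,000 requests price quoted above; actual pricing varies by region and storage class, so treat the constants as assumptions.

```python
import math

KEYS_PER_LIST_REQUEST = 1000          # maximum keys returned per LIST call
PRICE_PER_1000_LIST_REQUESTS = 0.005  # USD, assumed price; check your region


def list_scan_cost(object_count):
    """Return (number of LIST requests, estimated USD cost) for one full scan."""
    requests = math.ceil(object_count / KEYS_PER_LIST_REQUEST)
    cost = requests / 1000 * PRICE_PER_1000_LIST_REQUESTS
    return requests, cost


requests, cost = list_scan_cost(10_000_000)
print(f"{requests} LIST requests, about ${cost:.2f} per scan")
```

The per-scan fee is small; it only adds up if the scan is repeated often to approximate real-time numbers, which is where the scan time becomes the limiting factor.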
3. S3 Storage Lens Advanced (The "Almost Free" Alternative)
While you asked for a free solution, S3 Storage Lens Advanced Metrics provides more frequent updates and CloudWatch publishing.
- It costs $0.20 per million objects per month.
- If you have 5 million objects, for $1/month, you get significantly better visibility than the free tier, which might be a fair trade-off compared to the engineering effort of a custom Lambda solution.
If you strictly want 0 USD cost, you must accept the 24-hour delay of the standard Storage Lens. If you need "Near-Real-Time (NRT)", the "Event + Lambda + DynamoDB" approach is the most professional way to solve this within the Free Tier limits.
Based on the available AWS-native methods, there isn't a completely free solution that provides near real-time storage usage breakdown by storage class for S3 buckets. However, here are your best free options:
Amazon S3 Storage Lens (Free Metrics) is your best free option. It provides 62 metrics across various categories at the bucket level, including storage class breakdowns. The free tier includes metrics for cost optimization, data protection, and access management. However, the data is available for queries for up to 14 days, and while it's updated regularly, it's not truly near real-time.
Amazon CloudWatch offers storage metrics including BucketSizeBytes by StorageType, which shows usage per storage class. This is free for S3 metrics, but as you've noted, CloudWatch records Amazon S3 metrics once each day, so it won't provide near real-time data.
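For reference, a query for this metric could be built roughly as shown below. This is a sketch, not a definitive implementation: the bucket name and the helper `bucket_size_query` are hypothetical, while the `AWS/S3` namespace, the `BucketSizeBytes` metric, and the `BucketName`/`StorageType` dimensions are the ones CloudWatch uses for S3 storage metrics. The actual API call is left as a comment because it needs `boto3` and AWS credentials.

```python
from datetime import datetime, timedelta, timezone


def bucket_size_query(bucket, storage_type, now=None):
    """Build get_metric_statistics parameters for one bucket and storage type."""
    now = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/S3",
        "MetricName": "BucketSizeBytes",
        "Dimensions": [
            {"Name": "BucketName", "Value": bucket},
            # Examples: StandardStorage, GlacierStorage, DeepArchiveStorage
            {"Name": "StorageType", "Value": storage_type},
        ],
        # The metric is reported once a day, so look back two days
        # to be sure at least one datapoint falls inside the window.
        "StartTime": now - timedelta(days=2),
        "EndTime": now,
        "Period": 86400,
        "Statistics": ["Average"],
    }


params = bucket_size_query("my-example-bucket", "GlacierStorage")

# With boto3 installed and credentials configured:
#   import boto3
#   response = boto3.client("cloudwatch").get_metric_statistics(**params)
#   datapoints = response["Datapoints"]
```

You would issue one query per storage type you care about and take the most recent datapoint from each.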
For near real-time data, you would need to use S3 Storage Lens with Advanced Metrics, which includes activity metrics and more frequent updates. However, this comes with additional charges for per-object monitoring.
The fundamental limitation is that AWS doesn't offer a free, near real-time method for tracking storage usage by storage class. The daily update frequency in CloudWatch and S3 Storage Lens free metrics is the standard for free monitoring. For large buckets with millions of objects, manually scanning objects would be impractical and potentially costly due to API request charges.
Your best free approach would be to use S3 Storage Lens free metrics or CloudWatch metrics, accepting the daily update limitation, as these provide accurate storage class breakdowns without additional costs beyond standard S3 storage.
Sources
Amazon S3 Features – Storage Lens – AWS
Monitoring your storage activity and usage with Amazon S3 Storage Lens - Amazon Simple Storage Service
Calculate Amazon S3 bucket size and number of objects | AWS re:Post
Viewing metrics with Amazon S3 Storage Lens - Amazon Simple Storage Service

If my answer helped solve your problem, I would appreciate it if you click on “accepted answer”.