This article explores cost-saving techniques for Amazon S3 Glacier Deep Archive, a low-cost storage class for long-term data retention. It focuses on small objects, which incur disproportionately high overhead fees in Deep Archive, and presents practical strategies for organizing, managing, and optimizing their storage.
Introduction: In the landscape of cloud storage, optimizing costs and efficiently managing data become paramount for businesses leveraging services like Amazon S3.
In this post, we'll delve into a specific challenge faced by users: the overhead cost associated with small objects in Amazon S3.
The Two Dimensions of the Problem:
The first dimension concerns the overhead costs incurred by small objects when archived in the S3 Glacier Deep Archive storage class. Each archived object, such as a CloudTrail log file, carries about 40KB of additional metadata and index data: 32KB billed at the Deep Archive rate and 8KB billed at the S3 Standard rate. For simplicity, the calculations here price all 40KB at the Deep Archive rate. This overhead is financially concerning for objects of one kilobyte or less, because it dwarfs the data itself and grows with the number of objects.
For instance, storing 1TB as 1KB objects means 1,000,000,000 objects. The cost of storing the data itself in Amazon S3 Glacier Deep Archive is about $1.01 per month (1,024 GB x $0.00099 per GB). However, with an overhead of around 40KB per object, the overhead cost becomes approximately $39.60 (1,000,000,000 objects x 40KB = 40TB, which costs $39.60). Thus, the total cost to store 1TB of 1KB objects, including storage and overhead, is $40.61.
In comparison, storing the same 1TB in S3 Standard Infrequent Access would cost about $12.80 (1,024 GB x $0.0125).
Now, consider a scenario with a 128KB object size, resulting in 7,812,500 objects (1TB / 128KB per object). The cost of storing the data in S3 Glacier Deep Archive would still be about $1.01. This is calculated as follows: 1,024 GB (1TB) x $0.00099 per GB = $1.01.
The overhead for these objects would be around $0.31. This is calculated as follows: 7,812,500 objects x 40KB overhead per object = 312GB of overhead storage x $0.00099 per GB = $0.31.
Therefore, the total cost to store 1TB of 128KB objects in Amazon S3 Glacier Deep Archive, including the object storage and the overhead, would be $1.32 ($1.01 for object storage + $0.31 for overhead).
Meanwhile, storing these in S3 Standard Infrequent Access would again cost $12.80. Although the overhead for Deep Archive storage adds approximately 31% to the object storage cost, the total is still nearly 10 times lower than in Standard Infrequent Access, showing that Deep Archive is significantly cheaper despite its overhead.
The break-even point for object size is about 3.1KB. Objects smaller than this are more cost-effective outside Deep Archive because of the per-object overhead, while larger objects benefit from the lower per GB pricing in Deep Archive.
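The two worked examples above can be reproduced with a few lines of Python. This is a minimal sketch using only the figures quoted in this post ($0.00099 per GB, 40KB of overhead per object priced entirely at the Deep Archive rate, 1TB counted as 1,024 GB for data and decimal units for the overhead); the function and constant names are illustrative, and actual prices vary by region.

```python
GB_PER_TB = 1_024               # the post prices 1TB of data as 1,024 GB
KB_PER_GB = 1_000_000           # decimal units, matching the post's overhead totals
DEEP_ARCHIVE_PER_GB = 0.00099   # USD per GB-month, as quoted in the post
OVERHEAD_KB = 40                # per-object overhead, simplified to one rate


def deep_archive_monthly_cost(object_count, price_per_gb=DEEP_ARCHIVE_PER_GB):
    """Return (data_cost, overhead_cost, total) in USD per month for 1TB of data."""
    data_cost = GB_PER_TB * price_per_gb
    overhead_gb = object_count * OVERHEAD_KB / KB_PER_GB
    overhead_cost = overhead_gb * price_per_gb
    return data_cost, overhead_cost, data_cost + overhead_cost


if __name__ == "__main__":
    for label, count in [("1KB", 1_000_000_000), ("128KB", 7_812_500)]:
        data, overhead, total = deep_archive_monthly_cost(count)
        print(f"{label}: data ${data:.2f}, overhead ${overhead:.2f}, total ${total:.2f}")
```

Changing `object_count` shows how quickly the overhead term comes to dominate as objects get smaller.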
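The break-even size depends on what is counted. The sketch below assumes the comparison price is the $0.0125 per GB figure used earlier, and it ignores request fees and any minimum billable object size. Equating only the 40KB overhead charge with the cost of keeping the object at the comparison price gives about 3.2KB; also counting the object's own Deep Archive storage gives about 3.4KB. Both are close to the figure quoted above. The function name and parameters are illustrative.

```python
DEEP_ARCHIVE_PER_GB = 0.00099   # USD per GB-month, as quoted in the post
COMPARISON_PER_GB = 0.0125      # USD per GB-month, the comparison price used in the post
OVERHEAD_KB = 40                # per-object overhead, simplified to one rate


def break_even_kb(compare_price=COMPARISON_PER_GB,
                  archive_price=DEEP_ARCHIVE_PER_GB,
                  overhead_kb=OVERHEAD_KB,
                  include_object_storage=True):
    """Object size in KB at which Deep Archive and the comparison class cost the same."""
    overhead_cost = overhead_kb * archive_price
    if include_object_storage:
        # size * compare = (size + overhead) * archive
        # => size = overhead * archive / (compare - archive)
        return overhead_cost / (compare_price - archive_price)
    # Simplified form: size * compare = overhead * archive
    return overhead_cost / compare_price


if __name__ == "__main__":
    print(f"full model:       {break_even_kb():.2f} KB")
    print(f"simplified model: {break_even_kb(include_object_storage=False):.2f} KB")
```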
One solution is to package small objects before archiving, though this process may introduce its own costs.
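One way to package small objects is to bundle them into a single compressed archive before upload, so the per-object overhead is paid once per bundle instead of once per file. The following is a minimal sketch using Python's standard `tarfile` module; the helper names are hypothetical, and it keeps everything in memory, which only suits modest batch sizes.

```python
import io
import tarfile


def pack_objects(objects):
    """Bundle a {name: bytes} mapping into one gzip-compressed tar archive (bytes)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, payload in objects.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def unpack_objects(blob):
    """Reverse of pack_objects: return the {name: bytes} mapping stored in the archive."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as archive:
        return {
            member.name: archive.extractfile(member).read()
            for member in archive.getmembers()
            if member.isfile()
        }


if __name__ == "__main__":
    small_logs = {f"logs/event-{i}.json": b'{"id": %d}' % i for i in range(1000)}
    bundle = pack_objects(small_logs)
    print(f"{len(small_logs)} objects packed into one archive of {len(bundle)} bytes")
```

The trade-off is that reading a single small file later means restoring and unpacking the whole bundle, so bundles are best grouped by how the data is likely to be retrieved.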
The second dimension involves the cost of using lifecycle transitions to move small objects to Amazon S3 Glacier Deep Archive.
Transitioning objects incurs a cost of $0.06 per 1,000 objects.
Considering our earlier examples, transitioning 1TB of 1KB objects (1,000,000,000 objects) costs $60,000, while transitioning 1TB of 128KB objects (7,812,500 objects) costs about $469.
When uploading to Amazon S3, it is recommended to use the Intelligent-Tiering storage class. Note that objects smaller than 128KB are not eligible for automatic tiering and always remain in the Frequent Access tier.
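The transition figures follow directly from the object count. A minimal sketch, assuming the $0.06 per 1,000 objects rate quoted above (the function name is illustrative, and the rate varies by region):

```python
TRANSITION_PER_1000 = 0.06  # USD per 1,000 transitioned objects, as quoted in the post


def transition_cost(object_count, price_per_1000=TRANSITION_PER_1000):
    """One-time cost in USD of transitioning object_count objects."""
    return object_count / 1_000 * price_per_1000


if __name__ == "__main__":
    for label, count in [("1KB", 1_000_000_000), ("128KB", 7_812_500)]:
        print(f"1TB of {label} objects: ${transition_cost(count):,.2f}")
```

Because the fee is charged per object rather than per GB, the same terabyte costs 128 times more to transition as 1KB objects than as 128KB objects.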
Conclusion: The cost challenge arises from overhead in S3 Deep Archive and the costs of transitioning objects. Transitioning to Deep Archive is only cost-effective for objects of 1GB or more.
Best Practices:
- For Existing Objects: Specify a minimum object size in S3 lifecycle policies to transition only eligible objects to Deep Archive, keeping smaller ones in the standard class.
- For New Uploads: Use the Intelligent-Tiering storage class.
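The minimum-size practice for existing objects can be expressed as a lifecycle rule with an object-size filter. The sketch below builds such a configuration for boto3's `put_bucket_lifecycle_configuration`; the rule ID, the 128KB threshold, the 90-day delay, and the bucket name are assumptions chosen for illustration, not values from this post.

```python
MIN_SIZE_BYTES = 128 * 1024  # assumed threshold; ObjectSizeGreaterThan is in bytes


def build_lifecycle_configuration(min_size_bytes=MIN_SIZE_BYTES, days=90):
    """Lifecycle configuration that moves only objects above min_size_bytes to Deep Archive."""
    return {
        "Rules": [
            {
                "ID": "deep-archive-large-objects",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"ObjectSizeGreaterThan": min_size_bytes},
                "Transitions": [{"Days": days, "StorageClass": "DEEP_ARCHIVE"}],
            }
        ]
    }


def apply_lifecycle(bucket_name):
    """Apply the rule to a bucket. This call replaces the bucket's existing lifecycle configuration."""
    import boto3  # imported here so the builder above runs without boto3 installed

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration=build_lifecycle_configuration(),
    )


if __name__ == "__main__":
    print(build_lifecycle_configuration())
    # apply_lifecycle("example-bucket")  # hypothetical bucket; requires AWS credentials
```

Objects at or below the threshold are left in their current storage class, which avoids both the per-object overhead and the per-object transition fee for them.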
Implementing these practices helps balance cost optimization and efficient data management in S3 Deep Archive. Because lifecycle rules can filter on object size, transitions can be restricted to appropriately sized objects.
Instead of a lifecycle policy for small objects, using a simple script to transition them could be effective. Note that retrieving a 128KB object from Standard storage would cost $0.0004 per object. To break even on transition costs, these objects need to be stored in Deep Archive for at least 11.7 years.
For additional details, please refer to the Amazon S3 User Guide's section on cost optimization.
Authors:
Georges Hamieh Senior Technical Account Manager &
Mohamed Sherif Senior Technical Account Manager