For storing Zstd-compressed data in DynamoDB, use the Binary attribute type rather than String.
Binary attributes are designed to hold arbitrary binary data, including compressed text, and DynamoDB treats the value as raw bytes rather than as text. At the low-level API, your application must base64-encode binary values before sending them to DynamoDB; on receipt, DynamoDB decodes the data into an unsigned byte array. Most AWS SDKs perform this encoding for you when you pass raw bytes.
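To make the base64 step concrete, here is a minimal Python sketch of the low-level wire format for a Binary attribute. It rests on a few assumptions: `zlib` stands in for Zstd only so the example runs with the standard library, and `to_wire_binary` / `from_wire_binary` are hypothetical helper names, not part of any AWS SDK.

```python
import base64
import zlib


def to_wire_binary(raw: bytes) -> dict:
    """Build a low-level attribute value of type B (Binary).

    The low-level API carries binary values as base64 text.
    """
    return {"B": base64.b64encode(raw).decode("ascii")}


def from_wire_binary(attr: dict) -> bytes:
    """Recover the raw bytes from a low-level B attribute value."""
    return base64.b64decode(attr["B"])


text = "some highly repetitive text " * 100
# Stand-in compressor (assumption): replace with your Zstd binding.
compressed = zlib.compress(text.encode("utf-8"))

attr = to_wire_binary(compressed)
restored = zlib.decompress(from_wire_binary(attr)).decode("utf-8")
```

The base64 text exists only in transit; the value stored in the Binary attribute is the decoded byte array.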
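While you could technically store base64-encoded compressed data in a String attribute, this approach is less efficient because:
- A String attribute holds UTF-8 text, so it cannot carry the compressed bytes directly, and every reader has to know that the value must be base64-decoded before decompression
- Base64 encoding increases the size by approximately 33%, and that encoded length is what counts toward the item size, whereas a Binary attribute is sized by its raw bytes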
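The 33% figure follows from base64 emitting 4 output characters for every 3 input bytes. A short Python check, using random bytes as an assumed stand-in for a compressed payload:

```python
import base64
import os

# Random bytes stand in for compressed data (assumption for illustration).
raw = os.urandom(300_000)
encoded = base64.b64encode(raw)

# Fractional size increase from base64 encoding.
overhead = len(encoded) / len(raw) - 1
print(f"raw={len(raw)} bytes, base64={len(encoded)} bytes, overhead={overhead:.1%}")
# prints "raw=300000 bytes, base64=400000 bytes, overhead=33.3%"
```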
Regarding your table structure, using a "ref" attribute to indicate whether the data is stored inline or in S3 is good practice. The 400 KB limit applies to the whole item, including attribute names, so when the compressed data would push the item past that limit, the recommended approach is to store it in S3 and keep only the reference in DynamoDB.
For your specific schema:
- PK (String) and SK (String) are appropriate
- ref (String) with values "inline"/"ref" is a good approach
- payload should be Binary when ref="inline" (not String)
- When ref="ref", your S3 location can be stored as a String
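The schema above could be sketched in Python roughly as follows. Several things here are assumptions for illustration: the `build_item` helper, the `s3_uri` attribute name, the `INLINE_LIMIT_BYTES` threshold (chosen to leave headroom under the 400 KB item limit), and the use of `zlib` as a standard-library stand-in for Zstd.

```python
import zlib
from typing import Callable, Optional, Tuple

# Assumed threshold: leaves headroom for keys and attribute names
# under DynamoDB's 400 KB item size limit.
INLINE_LIMIT_BYTES = 350 * 1024


def build_item(
    pk: str,
    sk: str,
    text: str,
    s3_uri: str,
    compress: Callable[[bytes], bytes] = zlib.compress,  # stand-in for Zstd
) -> Tuple[dict, Optional[bytes]]:
    """Return (item, blob_to_upload).

    blob_to_upload is None when the payload fits inline; otherwise it is
    the compressed data the caller should write to S3 at s3_uri.
    """
    blob = compress(text.encode("utf-8"))
    item = {"PK": pk, "SK": sk}
    if len(blob) <= INLINE_LIMIT_BYTES:
        item["ref"] = "inline"
        item["payload"] = blob  # bytes, stored as a Binary attribute
        return item, None
    item["ref"] = "ref"
    item["s3_uri"] = s3_uri  # hypothetical String attribute for the location
    return item, blob


small_item, small_upload = build_item(
    "doc#1", "v1", "hello " * 1000, "s3://example-bucket/doc-1/v1"
)
```

With boto3's `Table.put_item`, a Python `bytes` value such as `payload` is serialized as a Binary attribute, so no manual base64 step is needed there.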
While Zstd may perform better in your benchmarks than GZIP or LZO, the choice of compression algorithm doesn't change the recommendation to use Binary attribute type for storing the compressed data in DynamoDB.
Sources
Best practices for storing large items and attributes in DynamoDB - Amazon DynamoDB
Supported data types and naming rules in Amazon DynamoDB - Amazon DynamoDB
DynamoDB low-level API - Amazon DynamoDB
Maximum Item Size in DynamoDB for Storing Base64 Images | AWS re:Post
DynamoDB overview - AWS Prescriptive Guidance

@MikeLim Thanks. One question.
Do you recommend storing the S3 path in the payload attribute itself in binary format, or should I create a separate attribute?