Why is my Redis backup (.rdb) file always smaller than the BytesUsedForCache metric in the ElastiCache cluster?
I'm using Amazon ElastiCache for Redis. Why is my Redis backup (.rdb) file always smaller than the BytesUsedForCache metric in the cluster?
The BytesUsedForCache metric counts all memory that Redis allocates for any purpose, not only your data. This includes the actual key sizes, headers, and memory fragmentation. Expired keys also consume memory until Redis removes them, either passively (when a client accesses the key) or actively (through periodic background sampling). For more information, see Expire key seconds - How Redis expires keys on the Redis.io website.
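To see how much of the in-memory footprint is overhead rather than data, you can inspect the output of the Redis INFO memory command. The following is a minimal sketch, not an official tool. It assumes that your engine version reports the used_memory and used_memory_overhead fields (Redis 4.0 and later). The function names and the sample values are hypothetical and for illustration only.

```python
def parse_info(info_text: str) -> dict[str, str]:
    """Parse 'field:value' lines from Redis INFO output, skipping '# Section' headers."""
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition(":")
        fields[name] = value
    return fields


def memory_breakdown(info_text: str) -> dict[str, float]:
    """Split used_memory into overhead and dataset bytes, and report the overhead share."""
    info = parse_info(info_text)
    used = int(info["used_memory"])
    overhead = int(info["used_memory_overhead"])
    dataset = used - overhead  # Redis reports this difference as used_memory_dataset
    return {
        "used_bytes": used,
        "overhead_bytes": overhead,
        "dataset_bytes": dataset,
        "overhead_share": overhead / used if used else 0.0,
    }


# Hypothetical sample: 15 GiB in use, of which 3 GiB is overhead (made-up numbers).
SAMPLE_INFO_MEMORY = """# Memory
used_memory:16106127360
used_memory_overhead:3221225472
"""

if __name__ == "__main__":
    for name, value in memory_breakdown(SAMPLE_INFO_MEMORY).items():
        print(f"{name}: {value}")
```

In practice, you would pass in the text returned by running INFO memory against the cluster endpoint instead of the sample string. The dataset figure is only a rough indicator, because the serialized .rdb format differs from the in-memory representation.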
ElastiCache uses both forked and forkless backup processes. Both processes write all keys to disk to create the .rdb file. The .rdb file doesn't contain pointers or expired keys, and it doesn't include memory fragmentation, metadata, or buffers. This means that the backup file is smaller than the value of the BytesUsedForCache metric.
For example, your Redis cluster might show BytesUsedForCache as 15 GiB, and the backup might be initiated during a period of heavy read/write requests on the cluster. After the backup process is complete, the backup cache size might be between 10 GiB and 12 GiB, rather than 15 GiB. This is because the backup doesn't contain expired keys, pointers, and so on.
The cache size that's reported for ElastiCache backups is derived from the Redis used_memory value at the time of snapshot creation. This is an estimate of the uncompressed cache size. If you export a backup snapshot to Amazon Simple Storage Service (Amazon S3), then the exported file has the compressed, serialized file size.
You can verify the data integrity of the backup with the INFO keyspace or DBSIZE command. Compare the number of keys in the original cluster with the number of keys in the cluster that's restored from the .rdb file. Make sure that no key insert, deletion, or eviction actions happened between snapshot creation and restore. For more information, see Why does my replica have a different number of keys than its master instance? in the Redis FAQ.
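The key count comparison can be scripted. The following is a minimal sketch that assumes you saved the INFO keyspace output from both the original cluster and the restored cluster as text. The function names and sample outputs are hypothetical. The parser relies on the standard dbN:keys=...,expires=...,avg_ttl=... line format.

```python
def parse_keyspace(info_text: str) -> dict[str, int]:
    """Return {database name: key count} from Redis INFO keyspace output."""
    counts = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line.startswith("db"):
            continue  # skip the '# Keyspace' header and blank lines
        db, _, stats = line.partition(":")
        for item in stats.split(","):
            name, _, value = item.partition("=")
            if name == "keys":
                counts[db] = int(value)
    return counts


def keyspace_differences(source_info: str, restored_info: str) -> dict[str, tuple[int, int]]:
    """Return {database: (source keys, restored keys)} for databases whose counts differ."""
    source = parse_keyspace(source_info)
    restored = parse_keyspace(restored_info)
    differences = {}
    for db in sorted(set(source) | set(restored)):
        pair = (source.get(db, 0), restored.get(db, 0))
        if pair[0] != pair[1]:
            differences[db] = pair
    return differences


# Hypothetical sample outputs (made-up numbers).
SOURCE_KEYSPACE = """# Keyspace
db0:keys=1000,expires=10,avg_ttl=3600
db1:keys=5,expires=0,avg_ttl=0
"""
RESTORED_KEYSPACE = """# Keyspace
db0:keys=1000,expires=10,avg_ttl=3600
db1:keys=4,expires=0,avg_ttl=0
"""

if __name__ == "__main__":
    print(keyspace_differences(SOURCE_KEYSPACE, RESTORED_KEYSPACE))
```

An empty result means that the key counts match for every database. A mismatch doesn't prove data loss on its own. Keys that had expired but weren't yet removed when the snapshot was taken are one possible cause of a difference.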