AWS RDS Backup Storage Usage: Reason for High Usage

I have implemented a backup policy for my AWS RDS database that includes hourly, daily, and monthly backups, with a retention period of 35 days for hourly snapshots, one year for monthly snapshots, and seven years for yearly snapshots. I have also configured multi-region backups, which copy the backups to other regions. However, I am seeing unexpectedly high storage usage for my RDS backups and would like some guidance on the cause.

Our RDS database has a total storage size of 2 TB. The total monthly backup storage usage reported in AWS Cost Explorer is around 80 TB.

The unexpected 80 TB is likely due to the hourly snapshots: the data transfer metrics to other regions show a high transfer rate during the hours when the hourly snapshots are taken, and monthly data transfer is 70 TB. During peak hours we see 350 GB of data transfer, which implies an hourly incremental snapshot size of 350 GB. That seems implausible when the daily growth rate of the database is only 9 GB.

Can anyone offer guidance on the reason for this high storage usage?

2 Answers

There is a difference between the daily growth rate and the daily change rate.

As documented, if you make periodic snapshots of a volume, the snapshots are incremental. This means that only the blocks on the device that have changed after your most recent snapshot are saved in the new snapshot.

For example, if you insert 50 rows, delete 50 rows, and update 100 rows, the growth rate is 0 but the change rate is 200.

If the same block changes at least once every hour, it will be included in each of the 24 hourly snapshots taken that day.
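To make the distinction concrete, here is a minimal Python sketch. It is an illustration only, not how RDS or EBS computes snapshot sizes: the helper names and the "hot blocks" workload are hypothetical. It counts rows for the growth/change example above, then counts how many block copies 24 hourly incremental snapshots would store when the same few blocks are rewritten every hour.

```python
def growth_and_change(inserted, deleted, updated):
    """Return (net growth, total change) in rows for one period."""
    growth = inserted - deleted
    change = inserted + deleted + updated
    return growth, change


def incremental_snapshot_sizes(changed_blocks_per_hour):
    """Size of each incremental snapshot, in blocks.

    Each snapshot stores every block touched since the previous snapshot,
    so a block rewritten every hour is stored again every hour.
    """
    return [len(blocks) for blocks in changed_blocks_per_hour]


growth, change = growth_and_change(inserted=50, deleted=50, updated=100)
print(growth, change)  # 0 200

# Hypothetical workload: the same 3 "hot" blocks are rewritten every hour.
hot_blocks = {1, 2, 3}
hourly = [set(hot_blocks) for _ in range(24)]
sizes = incremental_snapshot_sizes(hourly)

print(sum(sizes))                  # 72 block copies stored across 24 snapshots
print(len(set().union(*hourly)))   # 3 distinct blocks actually involved
```

In this toy case the volume does not grow at all, yet the snapshots collectively hold 24 copies of each hot block.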

A good example is SQL Server tempdb. A large, very active tempdb causes high change rates that can inflate snapshot size. The solution in that particular case would be an instance with instance store support.

AWS
answered a year ago
  • Thank you for your answer. Yes, it is a very active database, but I have also checked the WriteThroughput metric and did not see any anomaly that could cause that much change every hour. I have started to think it may be because of database log and tmp files.

Hello there,

I understand that you have implemented a backup policy with multi-region backups for your RDS DB instance, and that you are seeing very high backup storage usage for it.

To summarize: the instance has 2 TB of storage, AWS Cost Explorer reports around 80 TB of monthly backup storage usage, and you see up to 350 GB of hourly data transfer for backups, which seems unlikely for a database that grows by only 9 GB per day.

To begin with, I used the daily growth you reported to approximate the ideal backup storage usage of your DB instance over one month.

Since you have configured hourly incremental backups, I assumed a maximum increment of 2 GB per hour and calculated the following for one month:

Hourly backup storage usage (per month) => 2 GB (~hourly increment) x 24 (hours) x 30 (days) = 1440 GB

Monthly backup storage usage (per month) => 10 GB (~daily increment) x 30 (days) = 300 GB

Yearly backup storage usage (per month) => 3600 GB (~yearly increment) x (1/12) (for only one month) = 300 GB

Based on these approximate calculations, your RDS DB instance should ideally use only about 2 TB of backup storage in one region (1440 GB + 300 GB + 300 GB = 2040 GB). Since you have also enabled multi-region backups, it should use only about 4 TB or 6 TB in total with 2 or 3 regions, respectively.

This is because once a snapshot is copied, standard database snapshot charges also apply to storing it in the destination region.
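The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in this answer: the 2 GB, 10 GB, and 3600 GB increments are the approximations assumed above, the variable names are made up for illustration, and the comparison with the reported 80 TB assumes 1 TB = 1000 GB.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30

# Approximate increments assumed in this answer (not measured values).
from_hourly_increment_gb = 2 * HOURS_PER_DAY * DAYS_PER_MONTH  # 1440
from_daily_increment_gb = 10 * DAYS_PER_MONTH                  # 300
from_yearly_increment_gb = 3600 // 12                          # 300


def estimated_backup_gb(regions=1):
    """Estimated monthly backup storage in GB when stored in `regions` regions."""
    per_region = (
        from_hourly_increment_gb
        + from_daily_increment_gb
        + from_yearly_increment_gb
    )
    return per_region * regions


for regions in (1, 2, 3):
    print(regions, estimated_backup_gb(regions))
# 1 2040
# 2 4080
# 3 6120

# Reported usage versus the single-region estimate (assuming 1 TB = 1000 GB).
reported_gb = 80 * 1000
print(round(reported_gb / estimated_backup_gb(1), 1))  # 39.2
```

Even with three regions, the estimate stays more than an order of magnitude below the reported figure.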

However, you report 80 TB of backup storage usage, which is very unlikely given the calculations above. It would be possible only with a very high hourly or daily increment, as you mentioned, which would in turn increase both backup data transfer and storage usage.



——————————— Recommendations ———————————

Since we cannot yet pin down exactly where the backup storage usage of your DB instance comes from, I would advise you to monitor the ‘FreeStorageSpace’ metric of the RDS instance at both hourly and daily granularity, and to note any steep declines together with their timestamps.

[+] : More about FreeStorageSpace metric - https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/rds-metrics.html#rds-cw-metrics-instance

Furthermore, you can match those timestamps against other RDS instance metrics such as WriteThroughput, ReadThroughput, WriteIOPS and ReadIOPS to see whether a specific load with high writes or reads explains the large storage increments, or whether there is an anomaly.
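One possible way to automate that check is sketched below in Python. It assumes boto3 is installed and AWS credentials are configured, uses the CloudWatch `get_metric_statistics` call with the `AWS/RDS` namespace and the `DBInstanceIdentifier` dimension, and the helper names `fetch_free_storage` and `steepest_declines` are hypothetical. The sample datapoints at the end are made up purely to show the shape of the result.

```python
from datetime import datetime, timedelta, timezone


def fetch_free_storage(instance_id, hours=24, period=3600):
    """Fetch FreeStorageSpace averages (bytes) from CloudWatch.

    Assumes boto3 is installed and credentials are configured.
    """
    import boto3  # imported lazily so the helper below works without it

    end = datetime.now(timezone.utc)
    response = boto3.client("cloudwatch").get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName="FreeStorageSpace",
        Dimensions=[{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=period,
        Statistics=["Average"],
    )
    return response["Datapoints"]


def steepest_declines(datapoints, top=3):
    """Return the `top` largest drops between consecutive datapoints.

    Each result is (timestamp_of_later_point, amount_lost).
    """
    ordered = sorted(datapoints, key=lambda p: p["Timestamp"])
    drops = [
        (later["Timestamp"], earlier["Average"] - later["Average"])
        for earlier, later in zip(ordered, ordered[1:])
        if later["Average"] < earlier["Average"]
    ]
    return sorted(drops, key=lambda d: d[1], reverse=True)[:top]


# Made-up sample in the same shape as CloudWatch datapoints (unordered).
t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
sample = [
    {"Timestamp": t0 + timedelta(hours=2), "Average": 700.0},
    {"Timestamp": t0, "Average": 1000.0},
    {"Timestamp": t0 + timedelta(hours=1), "Average": 950.0},
    {"Timestamp": t0 + timedelta(hours=3), "Average": 720.0},
]
declines = steepest_declines(sample)
for timestamp, lost in declines:
    print(timestamp.isoformat(), lost)
```

With real data you would call `steepest_declines(fetch_free_storage("my-db-instance"))`, where the instance identifier is a placeholder, and then compare the returned timestamps with WriteThroughput and WriteIOPS for the same hours.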

You can also refer to the documentation below for future reference:

[+] : RDS Performance guidelines - https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/MonitoringOverview.html#MonitoringOverview.guidelines

[+] : Viewing DB Instance metrics - https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/metrics_dimensions.html

[+] : Demystifying Amazon RDS backup storage costs - https://aws.amazon.com/blogs/database/demystifying-amazon-rds-backup-storage-costs/

That said, since this issue might need deeper investigation, you can always open a support case with the AWS Support engineering team. Non-public information may be required, such as your RDS instance and metric details and information about your database usage, so a support case opened through the following link is the right place to investigate this anomaly further.

[+] : AWS Support team - https://console.aws.amazon.com/support/

AWS
answered a year ago
  • Thank you for your wonderful answer. I have already read the documentation and links you shared and also checked the metrics. All I can say is that it is a very active database: the WriteThroughput metric does not show that much data change (a couple of GB daily), but we run millions of SELECT statements daily. I believe the cause is not the database tablespace but the database log and tmp files. Snapshots are likely storing these temporary file changes unnecessarily.
