Why is my Amazon RDS for MySQL DB instance using more storage than expected?
4 minute read
My Amazon Relational Database Service (Amazon RDS) for MySQL DB instance is using more space than expected. Why is this happening, and how can I optimize disk storage?
You can use the FreeStorageSpaceAmazon CloudWatch metric to monitor the available storage space for an RDS DB instance. But FreeStorageSpace doesn't describe how the DB instance is consuming storage.
Use the following strategies to reclaim storage space:
Run OPTIMIZE TABLE
Tables consume some space that isn't actively in use, but Amazon RDS allocates that space to the tables anyway. If innodb_file_per_table is turned on (it's turned on by default), then you can reclaim that space using OPTIMIZE TABLE. OPTIMIZE TABLE works for InnoDB, MyISAM, and ARCHIVE tables. While Amazon RDS accepts the OPTIMIZE TABLE command, it actually runs the ALTER TABLE...FORCE command instead. When this occurs, you receive a warning message similar to the following: "Table does not support optimize, doing recreate + analyze instead." For more information, see the MySQL documentation for OPTIMIZE TABLE.
To check for fragmentation, run a query like this:
The data_free column highlights the amount of free space allocated to a table that isn't actively in use. You might be able to reclaim this space by using OPTIMIZE TABLE. OPTIMIZE TABLE works if the table is created in a separate tablespace, according to the Amazon RDS default innodb_file_per_table configuration setting. For more information, see the MySQL documentation for File-per-table tablespaces.
Reduce application table storage
To see how much storage is used by application tables on your DB instance, run a query like this:
SUM(data_length + index_length + data_free)/1024/1024 AS total_mb,
SUM(data_length)/1024/1024 AS data_mb,
SUM(index_length)/1024/1024 AS index_mb,
SUM(data_free)/1024/1024 AS free_mb,
COUNT(*) AS tables,
CURDATE() AS today
GROUP BY table_schema
ORDER BY 2 DESC
To locate the largest application table on your DB instance, run a query like this:
(data_length + index_length + data_free)/1024/1024 AS total_mb,
(data_length)/1024/1024 AS data_mb,
(index_length)/1024/1024 AS index_mb,
(data_free)/1024/1024 AS free_mb,
CURDATE() AS today
ORDER BY 3 DESC
Note: The total storage used by an individual database and table can't be calculated if the database includes tables with variable-length columns that are longer than 768 bytes. For example, this includes BLOB, TEXT, VARCHAR, or VARBINARY.
Reduce binary log storage
Adding a read replica causes the source instance's binary log to use additional storage. To find out how much storage the binary log on the source instance is using, check the BinLogDiskUsage CloudWatch metric. A large amount of growth might indicate that one or more Read Replicas are not synchronized.
Reduce or turn off general log and slow query log storage
Note: Turn off the general and slow query logs when you're not actively using them to troubleshoot.
Manage or reduce InnoDB system tablespace size
The system tablespace contains the InnoDB data dictionary and undo space, and it starts out at 10 MB. After space is allocated, the file will always be at least that size, though long-running transactions can consume more available storage.
By default, Amazon RDS sets innodb_file_per_table to 1. This means that the data for each tablespace is stored in its own .ibd file. To recover space that is marked as reusable for related tables, use OPTIMIZE TABLE to resize per table tablespace files, or drop a table.
If innodb_file_per_table is set to 0, all tables are also allocated to the system tablespace. Dropping tables or indexes, or deleting or truncating data from tables allocated in the system tablespace marks the space previously occupied as reusable. But, innodb_file_per_table doesn't free up any space to the file system.
Because it's not possible to shrink the system tablespace in-place, export your current database's data. Then import the data into a new instance. To reduce downtime, configure your new MySQL instance as a replica of the source Amazon RDS instance. When the replica is synchronized with the source Amazon RDS instance, switch to the new instance.
Note: Restoring from a snapshot or creating a Read Replica won't help you recover space from the system tablespace. This is because both methods use a snapshot of the source instance storage volume that contains the system tablespace.