Downtime of Switching Storage Configuration on Aurora PostgreSQL with NVMe-based Reader Instance

0

Hi re:Post community :)

I am currently using an Aurora PostgreSQL Cluster with the following setup:

  • Writer Instance: r6g.2xlarge
  • Reader Instance: r6gd.2xlarge (Aurora Optimized Reads Instance)
  • All clients connect via RDS Proxy, using separate read/write and read-only endpoints.

I am considering switching the storage configuration from Standard to I/O-Optimized due to high I/O costs.

According to the AWS documentation, there is no downtime when switching between Aurora Standard and Aurora I/O-Optimized.

However, another AWS document states that switching between I/O-Optimized and Standard clusters on an NVMe-based DB instance class causes an immediate database engine restart.

Given my current configuration, will changing the storage configuration restart my r6gd.2xlarge instance, potentially causing downtime? Or, since I'm using RDS Proxy, will the read traffic be routed to the writer instance during the reader restart, and then back to the reader once it completes?

Handling all read/write traffic temporarily with just the writer instance is not a problem for my workload, as we have previously managed all read traffic through the single writer instance.

Thank you for any insights you can provide!

asked a year ago · 2.9K views
2 Answers
2
Accepted Answer

Hello.

Because db.r6gd is an NVMe-based instance class, I believe the reader instance will reboot when you change the storage configuration, and that reboot may cause downtime.
I therefore recommend testing the change in a test environment before applying it in production.
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.DBInstanceClass.html

db.r6g – Instance classes powered by AWS Graviton2 processors. These instance classes are ideal for running memory-intensive workloads in open-source databases such as MySQL and PostgreSQL. The db.r6gd type offers local NVMe-based SSD block-level storage for applications that need high-speed, low latency local storage.
You can modify a DB instance to use one of the DB instance classes powered by AWS Graviton2 processors. To do so, complete the same steps as with any other DB instance modification.

EXPERT
answered a year ago
EXPERT
reviewed a year ago
  • With db.r6g, the change completed in about 30 seconds. It took effect immediately and no reboot was required.
    With db.r6gd, I confirmed that a reboot was performed.

  • Thank you for the answer. I plan to add another r6g.2xlarge reader instance before changing the storage configuration. After the change is complete, I will delete the r6g.2xlarge instance.
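
The plan in the comment above (add a temporary r6g reader, switch the storage configuration, then delete the temporary reader) could be scripted roughly as follows. This is a minimal sketch, not a tested procedure: it assumes a boto3 RDS client is passed in as `rds`, the function name and the identifiers in the usage note are hypothetical, and `aurora-iopt1` is the storage type value for Aurora I/O-Optimized. How RDS Proxy distributes reads while the r6gd reader restarts is not covered here and should be checked in a test environment.

```python
def switch_storage_with_temp_reader(rds, cluster_id, temp_reader_id,
                                    temp_class="db.r6g.2xlarge",
                                    storage_type="aurora-iopt1"):
    """Add a temporary reader, change the cluster storage type, remove the reader.

    `rds` is expected to behave like boto3.client("rds").
    """
    waiter = rds.get_waiter("db_instance_available")

    # 1. Add a temporary non-NVMe reader so reads have somewhere to go
    #    while the r6gd reader restarts.
    rds.create_db_instance(
        DBInstanceIdentifier=temp_reader_id,
        DBClusterIdentifier=cluster_id,
        DBInstanceClass=temp_class,
        Engine="aurora-postgresql",
    )
    waiter.wait(DBInstanceIdentifier=temp_reader_id)

    # 2. Switch the cluster storage configuration.
    rds.modify_db_cluster(
        DBClusterIdentifier=cluster_id,
        StorageType=storage_type,
        ApplyImmediately=True,
    )

    # 3. Wait until every cluster member reports "available" again.
    #    Caveat: a member may not have started its restart yet when this
    #    runs, so a real script should also confirm the restart happened.
    cluster = rds.describe_db_clusters(DBClusterIdentifier=cluster_id)["DBClusters"][0]
    for member in cluster["DBClusterMembers"]:
        waiter.wait(DBInstanceIdentifier=member["DBInstanceIdentifier"])

    # 4. Remove the temporary reader.
    rds.delete_db_instance(DBInstanceIdentifier=temp_reader_id)
```

With boto3 installed and credentials configured, a call might look like `switch_storage_with_temp_reader(boto3.client("rds"), "my-cluster", "my-cluster-temp-reader")`, where both identifiers are placeholders.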

1

Apologies... the docs could be clearer on the issue of restarting when switching the storage configuration between "Standard" and "I/O-Optimized". When using a 'd' instance with local NVMe storage (R6id or R6gd), the local NVMe storage is used differently depending on the storage configuration in use.

In "Standard" storage, 90% of the NVMe is used for larger and faster temp space locally. This works out to a local NVMe-based volume of about 6x memory size, compared to a remote EBS volume sized at 2x memory size for a non-'d' instance (R6i or R6g in this case). However, when using the "I/O-Optimized" storage configuration, that same NVMe volume serves two purposes. We allocate about 2x memory size locally on the NVMe volume for faster local temp space and about 4x memory size for a "tiered buffer cache", so when database page buffers are evicted from the RAM-based shared buffers, they "age" into the tiered buffer cache on NVMe. This effectively creates a much larger local buffer cache, which helps read workloads with large working sets of data. In both storage configurations, between 10% and 20% of the total NVMe volume size needs to be reserved to manage the normal effects of "write amplification" in SSD devices. If you are unfamiliar with "write amplification", Wikipedia has a decent explanation at https://en.wikipedia.org/wiki/Write_amplification.
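
To make the multipliers above concrete, here is a small arithmetic sketch. It only applies the approximate 6x / 2x / 4x factors quoted in this answer; the function name is hypothetical, and the 64 GiB figure is an assumption for the memory size of an r6gd.2xlarge, so substitute the value for your own instance class.

```python
def nvme_layout(memory_gib, storage_config):
    """Approximate local NVMe usage, using the multipliers quoted in the answer."""
    if storage_config == "standard":
        # About 6x memory, all of it used as local temp space.
        return {"temp_space_gib": 6 * memory_gib, "tiered_cache_gib": 0}
    if storage_config == "io-optimized":
        # About 2x memory for temp space plus about 4x for the tiered buffer cache.
        return {"temp_space_gib": 2 * memory_gib, "tiered_cache_gib": 4 * memory_gib}
    raise ValueError(f"unknown storage configuration: {storage_config!r}")


memory_gib = 64  # assumed memory size; adjust for your instance class
for config in ("standard", "io-optimized"):
    print(config, nvme_layout(memory_gib, config))
```

Under that assumption, the same roughly 384 GiB of usable NVMe is either all temp space ("Standard") or split into about 128 GiB of temp space and about 256 GiB of tiered buffer cache ("I/O-Optimized").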

So, to come back to your original question about restarting: because the NVMe volume's layout has to be restructured when the storage configuration changes, and the database engine's internal in-memory data structures have to be altered as well, a restart is required on these 'd' instances when switching the storage configuration between "Standard" and "I/O-Optimized".

AWS
answered a year ago
