What are the factors that influence downtime for Amazon Aurora DB clusters?

I want to understand why my Amazon Aurora DB cluster is in downtime.

Short description

Your Amazon Aurora DB instances might be in downtime for a number of reasons. The main factors that influence downtime include:

  • Engine version upgrades
  • DB cluster failovers
  • Maintenance tasks
  • DB cluster or instance reboots
  • Modifying specific settings on your DB cluster or instance

Resolution

Note: If you receive errors when running AWS Command Line Interface (AWS CLI) commands, make sure that you’re using the most recent AWS CLI version.

Engine version upgrades

Engine version upgrades include major and minor version upgrades. Both major and minor version upgrades cause downtime for your entire Aurora DB cluster. Before you upgrade a production DB cluster, it's important that you test the upgrade process on a test DB cluster. Verify the duration of the process, and then validate your applications before you perform the upgrade.

You can also use Amazon Relational Database Service (Amazon RDS) blue/green deployments to upgrade the major or minor version of your cluster. Downtime typically lasts less than one minute for upgrades when using blue/green deployments.

Automatic minor version upgrades

Automatic minor version upgrades cause downtime for your entire Aurora DB cluster. Aurora applies these upgrades during the cluster maintenance window. If you don't need this feature, then turn off automatic minor version upgrades on your DB instances.

For more information, see Upgrading the minor version or patch level of an Aurora MySQL DB cluster.

Note: Turning on the automatic minor version upgrade feature itself doesn't cause downtime during the modification. The downtime occurs only when Aurora applies the automatic upgrade.
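
To audit this setting, you can list the DB instances that still have automatic minor version upgrades turned on. The following Python code is a minimal sketch, not an official procedure. It assumes that the input is shaped like the DBInstances list that the boto3 describe_db_instances call returns. The helper name instances_with_auto_upgrade and the sample identifiers are hypothetical.

```python
def instances_with_auto_upgrade(db_instances, cluster_id=None):
    """Return identifiers of instances with AutoMinorVersionUpgrade turned on.

    db_instances: list of dicts shaped like the "DBInstances" entries from
    boto3's rds.describe_db_instances() (assumed input shape).
    cluster_id: optional DB cluster identifier to filter on.
    """
    matches = []
    for inst in db_instances:
        if cluster_id and inst.get("DBClusterIdentifier") != cluster_id:
            continue
        if inst.get("AutoMinorVersionUpgrade", False):
            matches.append(inst["DBInstanceIdentifier"])
    return matches


# Hypothetical sample data for illustration only.
sample_instances = [
    {"DBInstanceIdentifier": "writer-1",
     "DBClusterIdentifier": "my-cluster",
     "AutoMinorVersionUpgrade": True},
    {"DBInstanceIdentifier": "reader-1",
     "DBClusterIdentifier": "my-cluster",
     "AutoMinorVersionUpgrade": False},
    {"DBInstanceIdentifier": "other-1",
     "DBClusterIdentifier": "other-cluster",
     "AutoMinorVersionUpgrade": True},
]

print(instances_with_auto_upgrade(sample_instances, "my-cluster"))
```

With live data, the list could come from boto3.client("rds").describe_db_instances()["DBInstances"], and the setting could be turned off with modify_db_instance(DBInstanceIdentifier=..., AutoMinorVersionUpgrade=False).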

DB cluster failover

If your Aurora DB cluster has one or more Aurora replicas, then one of the replicas is promoted to the primary instance during a failover event. A brief downtime occurs, and read and write operations fail with an exception. Service is typically restored in less than 120 seconds, and often in less than 60 seconds.

To increase the availability of your DB cluster, create one or more Aurora replicas in two or more different Availability Zones (AZs). For more information, see Fault tolerance for an Aurora DB cluster.
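
Because read and write operations fail with an exception during a failover, applications commonly retry the failed operation for a bounded period. The following Python code is a minimal sketch of that pattern and not an AWS-provided API. TransientDatabaseError is a hypothetical stand-in for whatever exception your database driver raises, run_with_retry is a hypothetical helper name, and the 120-second budget only mirrors the typical recovery time stated above.

```python
import random
import time


class TransientDatabaseError(Exception):
    """Hypothetical stand-in for the driver exception raised during failover."""


def run_with_retry(operation, max_wait_seconds=120.0, base_delay=1.0,
                   sleep=time.sleep):
    """Call operation() until it succeeds or the wait budget is used up."""
    waited = 0.0
    attempt = 0
    while True:
        try:
            return operation()
        except TransientDatabaseError:
            # Exponential backoff with jitter, capped at 15 seconds per wait.
            delay = min(base_delay * (2 ** attempt), 15.0)
            delay *= random.uniform(0.5, 1.0)
            if waited + delay > max_wait_seconds:
                raise  # Budget exhausted: surface the last error.
            sleep(delay)
            waited += delay
            attempt += 1
```

In a real application, operation would open a new connection through the cluster endpoint on each attempt, so that the retry reaches the newly promoted primary instance.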

Maintenance tasks for your Aurora DB cluster

Some maintenance tasks, like updates to the operating system (OS) or database patching, cause your DB cluster to go offline for a short period of time. For more information, see Maintaining an Amazon Aurora DB cluster.

Maintenance window

Downtime doesn't inherently occur when you modify the maintenance window. However, your DB cluster might have one or more pending actions that do cause downtime. If you change the maintenance window so that it includes the current time, then the pending actions are applied immediately, and downtime occurs. For more information on modifying your maintenance window, see What do I need to know about the Amazon RDS maintenance window?
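
Before you change the maintenance window, it can help to review the pending actions on your resources. The following Python code is a minimal sketch that flattens a response shaped like the PendingMaintenanceActions list from the boto3 describe_pending_maintenance_actions call. The helper name summarize_pending_actions, the ARN, and the sample values are hypothetical.

```python
def summarize_pending_actions(pending):
    """Flatten pending maintenance actions into (resource, action, description).

    pending: list of dicts shaped like the "PendingMaintenanceActions" entries
    from boto3's rds.describe_pending_maintenance_actions() (assumed shape).
    """
    rows = []
    for resource in pending:
        arn = resource["ResourceIdentifier"]
        for detail in resource.get("PendingMaintenanceActionDetails", []):
            rows.append((arn, detail.get("Action"), detail.get("Description")))
    return rows


# Hypothetical sample data for illustration only.
sample_pending = [
    {
        "ResourceIdentifier": "arn:aws:rds:us-east-1:123456789012:cluster:my-cluster",
        "PendingMaintenanceActionDetails": [
            {"Action": "system-update", "Description": "Example OS update"},
        ],
    },
]

for row in summarize_pending_actions(sample_pending):
    print(row)
```

An empty result suggests that no pending action is waiting on the resource. Confirm against the console or the AWS CLI before you rely on that result.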

DB cluster or DB instance reboots

Rebooting a DB cluster or DB instance causes downtime. The time required to reboot each DB instance in your cluster depends on the database activity at the time of reboot. Downtime also depends on the recovery process of your specific DB engine. For more information, see Rebooting an Amazon Aurora DB cluster or Amazon Aurora DB instance.

Modifying the DB instance class

When you modify the DB instance class of your instance, downtime occurs on the specified DB instance but not the entire cluster. For more information on instance classes, see Aurora DB instance classes.

Attaching a new DB cluster or DB parameter group

When you change the DB cluster parameter group or DB parameter group that's attached to your DB cluster or DB instance, downtime doesn't occur automatically. However, to apply a newly attached DB cluster parameter group, you must reboot the primary DB instance in the cluster. To apply a newly attached DB parameter group, you must reboot the DB instance. The reboot itself causes downtime. For more information, see Associating a DB cluster parameter group with a DB cluster and Working with parameter groups.
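
To see which instances still need a reboot after a parameter group change, you can inspect the parameter group status of each cluster member. The following Python code is a minimal sketch. It assumes a cluster dict shaped like one DBClusters entry from the boto3 describe_db_clusters call, where each member carries a DBClusterParameterGroupStatus value such as in-sync or pending-reboot. The helper name members_pending_reboot and the sample identifiers are hypothetical.

```python
def members_pending_reboot(cluster):
    """Return member identifiers whose cluster parameter group awaits a reboot.

    cluster: dict shaped like one "DBClusters" entry from boto3's
    rds.describe_db_clusters() (assumed input shape).
    """
    return [
        member["DBInstanceIdentifier"]
        for member in cluster.get("DBClusterMembers", [])
        if member.get("DBClusterParameterGroupStatus") == "pending-reboot"
    ]


# Hypothetical sample data for illustration only.
sample_cluster = {
    "DBClusterIdentifier": "my-cluster",
    "DBClusterMembers": [
        {"DBInstanceIdentifier": "writer-1", "IsClusterWriter": True,
         "DBClusterParameterGroupStatus": "pending-reboot"},
        {"DBInstanceIdentifier": "reader-1", "IsClusterWriter": False,
         "DBClusterParameterGroupStatus": "in-sync"},
    ],
}

print(members_pending_reboot(sample_cluster))
```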

Modifying specific settings on your DB cluster or instance

Modifying parameter settings in a DB cluster or DB parameter group

Database parameters are either static or dynamic. When you modify a static parameter setting in a DB cluster or DB parameter group, the parameter change takes effect after you manually reboot the DB instances in each associated DB cluster. Downtime occurs during the reboot.

When you modify a dynamic parameter setting in a DB cluster or DB parameter group, the change is applied to your DB cluster immediately. The instance isn't rebooted when you modify dynamic parameters, so there is no downtime.

For more information, see Working with parameter groups.
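
To estimate whether a planned set of parameter changes needs a reboot, you can check the apply type of each parameter. The following Python code is a minimal sketch. It assumes that the input is shaped like the Parameters list from the boto3 describe_db_cluster_parameters or describe_db_parameters call, where ApplyType is either static or dynamic. The helper name changes_requiring_reboot and the parameter names in the sample are hypothetical and don't describe real engine parameters.

```python
def changes_requiring_reboot(parameters, changed_names):
    """Return the changed parameter names whose ApplyType is "static".

    parameters: list of dicts shaped like the "Parameters" entries from
    boto3's rds.describe_db_cluster_parameters() (assumed input shape).
    changed_names: iterable of parameter names that you plan to modify.
    Names that aren't present in parameters are ignored.
    """
    apply_type = {p["ParameterName"]: p.get("ApplyType") for p in parameters}
    return sorted(
        name for name in changed_names if apply_type.get(name) == "static"
    )


# Hypothetical sample data; these names are placeholders, not real parameters.
sample_parameters = [
    {"ParameterName": "example_static_param", "ApplyType": "static"},
    {"ParameterName": "example_dynamic_param", "ApplyType": "dynamic"},
]

planned = ["example_dynamic_param", "example_static_param"]
print(changes_requiring_reboot(sample_parameters, planned))
```

If the returned list is empty, then the planned changes touch only dynamic parameters in this sample and no reboot is expected.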

Modifying the DB instance identifier

Downtime occurs when you modify the DB instance identifier because the DB instance is rebooted.

Modifying the database port

Downtime occurs when you modify the database port that you want to use to access your DB cluster. This happens because all of the DB instances in the DB cluster reboot immediately.

Modifying the certificate authority

You might want to modify the certificate authority (CA) for the server certificate used by your DB instance. In this use case, downtime occurs if the DB engine doesn't support rotation without restart. Use the describe-db-engine-versions AWS CLI command to check if the DB engine supports rotation without restart.
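
The following Python code is a minimal sketch of how the output of that check might be read. It assumes that the input is shaped like the DBEngineVersions list from the boto3 describe_db_engine_versions call and that each entry carries a SupportsCertificateRotationWithoutRestart value. The helper name supports_rotation_without_restart is hypothetical, and the engine versions and values in the sample are illustrative only and don't reflect actual engine behavior.

```python
def supports_rotation_without_restart(engine_versions, engine, version):
    """Look up certificate rotation support for one engine version.

    engine_versions: list of dicts shaped like the "DBEngineVersions" entries
    from boto3's rds.describe_db_engine_versions() (assumed input shape).
    Returns True or False when the version is found, or None when the version
    or the field is missing.
    """
    for entry in engine_versions:
        if entry.get("Engine") == engine and entry.get("EngineVersion") == version:
            return entry.get("SupportsCertificateRotationWithoutRestart")
    return None


# Illustrative sample data; the values don't describe real engine versions.
sample_versions = [
    {"Engine": "aurora-mysql", "EngineVersion": "x.y.1",
     "SupportsCertificateRotationWithoutRestart": True},
    {"Engine": "aurora-mysql", "EngineVersion": "x.y.0",
     "SupportsCertificateRotationWithoutRestart": False},
]

print(supports_rotation_without_restart(sample_versions, "aurora-mysql", "x.y.1"))
```

A result of False or None suggests that you should plan for a restart, and therefore downtime, when you change the CA.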

For more information on which Aurora settings cause downtime when you modify them, see Settings for Amazon Aurora.

Related information

Performing major version upgrades for Amazon Aurora MySQL with minimum downtime

AWS OFFICIAL Updated a year ago