RDS Aurora MySQL 5.6 -> 5.7 upgrade stalled

Yesterday at 7am we kicked off an upgrade of an Aurora MySQL database from 5.6 -> 5.7. At the start of the upgrade, MySQL reported:

Upgrade in progress: Purging undo records for old row versions. Records remaining: 405001890

This message updated in the logs every 15 minutes until 3am this morning, when the updates stopped. The last one still reported 96274094 records remaining.
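For a rough sense of scale, the two log readings above can be turned into a back-of-the-envelope estimate. This is only a sketch: it assumes the purge rate stays constant, which may well not hold (the updates stopped, after all), and the function name is made up for illustration.

```python
def estimate_remaining_hours(start_records, current_records, elapsed_hours):
    """Linear estimate of the hours left, assuming a constant purge rate."""
    purged = start_records - current_records
    if purged <= 0 or elapsed_hours <= 0:
        raise ValueError("need forward progress over a positive time span")
    rate_per_hour = purged / elapsed_hours
    return current_records / rate_per_hour


# Values from the log messages quoted above:
# 405001890 records at 7am, 96274094 records at 3am the next day (20 hours later).
remaining = estimate_remaining_hours(405001890, 96274094, 20)
print(f"Roughly {remaining:.1f} more hours at the observed average rate")
```

On those numbers the purge averaged about 15.4 million records per hour, so the remaining 96 million records would need a bit over six more hours if that rate had continued.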

I can also see from our error logs in CloudWatch that the cluster rebooted at 11am. Since then our Database Connections alarm has been in the insufficient data state.

There have been no further updates, and the database is still in the "Upgrading" state. I have no visibility into the status of the upgrade or how long it will take to complete. Shortly after it became apparent that the upgrade was going to take a very long time, we switched to a clone we had made before the upgrade, but we now have a large provisioned database cluster that we can't interact with.

Any guidance would be appreciated...

3 Answers

Please open a ticket with support, as they have access to the backplane and logs. Without access to those logs, it's very hard to figure out exactly what is going on. How big is the database? A major version upgrade can take several hours to complete.

AWS
answered 2 years ago

The best way to proceed is to follow up with support, as was initially suggested, but here is what is likely happening. Aurora MySQL performs a clean shutdown of the writer DB instance during the upgrade. As part of that shutdown, Aurora purges the undo records for old row versions and rolls back any uncommitted transactions, and it records a progress event every 15 minutes while doing so. InnoDB maintains a history of all undo changes, and this "list" can grow quite long depending on a number of factors, such as long-running transactions. You can read about InnoDB's undo logging and history system at the link below:

https://blog.jcole.us/2014/04/16/the-basics-of-the-innodb-undo-logging-and-history-system/

The command "SHOW ENGINE INNODB STATUS" can give you some information about the length of this list.
Reducing the size of this list and the number of undo records before upgrading will reduce the amount of time that Aurora needs to purge these records during the upgrade. As was mentioned, our support team can provide more guidance.
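As a minimal sketch of how you might check this before starting an upgrade: the status output contains a line beginning with "History list length" in its TRANSACTIONS section, and the snippet below pulls that number out of the text. The helper name and the sample text (including the value 2048) are made up for illustration, and fetching the status text from your cluster with your MySQL client of choice is left out.

```python
import re


def history_list_length(status_text):
    """Return the history list length found in the status text, or None if absent."""
    match = re.search(r"^History list length\s+(\d+)", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


# Illustrative excerpt only; real SHOW ENGINE INNODB STATUS output is much longer.
sample_status = """\
------------
TRANSACTIONS
------------
Trx id counter 123456
History list length 2048
"""

print(history_list_length(sample_status))
```

If the number is large, look for long-running transactions, commit or end them, and give purge time to catch up before you start the upgrade.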

AWS
Peter_S
answered 2 years ago

In addition to the advice my colleagues have provided, this Knowledge Center article has more information about purge and undo records.

AWS
answered 2 years ago
