I haven't heard anything further from AWS, either here or on any of the (multiple) "reach out" cases that have been opened for this issue, but I have figured out what's triggering the bug in Aurora.
Steps to Reproduce:
- Configure an Aurora MySQL global database, with write forwarding enabled from the secondary to the primary region.
- Create a basic Entity Framework Core app that performs 2 writes against the secondary region via Dapper direct SQL statements.
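A minimal sketch of what such a repro might look like, assuming the MySqlConnector and Dapper NuGet packages. The endpoint, credentials and the `demo_writes` table are hypothetical placeholders, and the `SESSION` consistency level is an assumption (any valid value of aurora_replica_read_consistency should do):

```csharp
using Dapper;
using MySqlConnector;

// Hypothetical secondary-region endpoint, credentials and database; replace with your own.
const string connectionString =
    "Server=my-secondary-cluster.example.com;Database=app;User ID=app;Password=secret";

for (var i = 1; i <= 2; i++)
{
    // Each iteration takes a connection from the pool. The second iteration
    // should get the idle connection left by the first, which MySqlConnector
    // resets before handing it out (ConnectionReset defaults to true).
    using var connection = new MySqlConnection(connectionString);
    connection.Open();

    // Write forwarding requires a read consistency level to be set for the session.
    connection.Execute("SET aurora_replica_read_consistency = 'SESSION'");

    // demo_writes is a hypothetical table with a single text column.
    connection.Execute(
        "INSERT INTO demo_writes (note) VALUES (@note)",
        new { note = $"write {i}" });
}
```

Disposing the connection at the end of each iteration returns it to the pool rather than closing it, which is what sets up the reuse on the second write.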
Expected Behaviour:
Both writes succeed and are forwarded and committed in the primary region.
Actual Behaviour:
The second write triggers an assertion error on the secondary instance, causing an instance restart. The write and all subsequent reads fail until the instance has restarted. With error logging enabled on the secondary cluster, the following assertion error appears in the secondary instance log:
2022-12-19T03:45:38.970437Z 137 [ERROR] [MY-013183] [InnoDB] Assertion failure: fwd_cmd.cc:2266:thd->variables.aurora_using_thread_pool_connection_handler == true thread 70368986431216 (ut0dbg.cc:57)
On the application side, the write fails with a MySQL protocol error:
MySqlConnector.MySqlException (0x80004005): Failed to read the result set. ---> System.IO.EndOfStreamException: Expected to read 4 header bytes but only received 0.
Suspected Issue:
Entity Framework Core uses MySqlConnector (https://mysqlconnector.net/) under the hood, which provides a connection pool to the application. Crucially, per the documentation at https://mysqlconnector.net/connection-options/, the Pooling and ConnectionReset options both default to true. This triggers the following sequence of events:
- The first write requests a connection.
- A new connection is opened to Aurora.
- The first write sets the aurora_replica_read_consistency session variable successfully and performs its write.
- The first write completes and the connection is returned to the idle pool.
- The second write requests a connection.
- EF/MySqlConnector returns the existing idle connection from the pool, issuing a Reset instruction to Aurora.
- This triggers the Aurora bug: the secondary instance asserts, crashes and reboots.
- The second write fails as shown above.
The issue can be worked around by disabling either Pooling or ConnectionReset in the connection string. However, both are desirable options in general, and even with this bug avoided, we are still hitting the subsequent issues below.
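For anyone wanting to try the workaround, here is a rough sketch using MySqlConnector's `MySqlConnectionStringBuilder`. The server, database and credentials are hypothetical placeholders. Only `ConnectionReset = false` is the actual workaround, and setting `Pooling = false` instead is the alternative:

```csharp
using System;
using MySqlConnector;

var builder = new MySqlConnectionStringBuilder
{
    // Hypothetical connection details; replace with your own.
    Server = "my-secondary-cluster.example.com",
    Database = "app",
    UserID = "app",
    Password = "secret",

    // Keep pooling, but skip the reset when a pooled connection is reused.
    // Session state (including session variables) then carries over between uses.
    ConnectionReset = false,
};

// Pass this string to MySqlConnection or to the EF Core provider configuration.
Console.WriteLine(builder.ConnectionString);
```

Keeping Pooling on and turning off only ConnectionReset preserves connection reuse, at the cost of session state leaking from one logical use of a connection to the next.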
It would be great to get confirmation from AWS that you agree this is an Aurora issue, along with an ETA for when it will be fixed.