Memory leaks when upgrading from Aurora MySQL 2.10.0 to 3.01.0

0

We recently upgraded a cluster from MySQL 5.7 (Aurora 2.10.0) to MySQL 8.0.23 (3.01.0) using a snapshot restore, as we could really benefit from the new features of MySQL 8. The moment the new cluster became available, memory usage on our writer instance started decreasing until it started consuming swap, and then once it ran out of swap, AWS initiated an automatic failover. This cycle has been repeating since we've performed the upgrade.

We have a separate test cluster we upgraded (using the same snapshot), and it does not appear to be experiencing the same issue. That cluster does not have much load on it, and also wouldn't be used for replication to other databases.

We tried increasing our instance sizes (db.t3.large to db.r5.large), turning off our cross region read replicas, turning off our DMS tasks, tweaking some memory settings, and opened a support ticket, but so far none of these avenues have helped us out at all.

Has anyone else run into this issue? Are there other troubleshooting steps we could try?

preguntada hace 2 años896 visualizaciones
3 Respuestas
1
Respuesta aceptada
AWS
respondido hace 2 años
profile picture
EXPERTO
revisado hace un mes
  • We just upgraded to Aurora 3.02.0 and turned our binlogs back on, and it appears that the memory usage is remaining stable. Thanks!

1

Hi,

I understand troubleshooting an issue of this nature can be frustrating, but we're here to help. Considering the issue is specific to an AWS Account/Aurora Cluster, we will need to review the cluster metrics and logs to understand where memory is being utilized. I see an investigation is on-going through the support case created on this topic. I have triggered an internal engagement with the Aurora development team to aid with investigating the behavior you're seeing with memory consumption.

Please refer to the case in the Support Center for the latest updates: https://console.aws.amazon.com/support/home#/

If you have any other questions or concerns, please engage us via your case as this is being actively monitored for the quickest turn around time. Thank you for your understanding and patience in working with us concerning this.

AWS
Shiv
respondido hace 2 años
1

Same issues here. We raised a ticket with AWS who (eventually) responded confirming it's a bug with Aurora 3.0.1.0. A potential workaround is to disable binlog replication which isn't an option for us. A fix is pending "some time in Q1 2022", which is incredibly helpful when you're relying on this for production use.

respondido hace 2 años

No has iniciado sesión. Iniciar sesión para publicar una respuesta.

Una buena respuesta responde claramente a la pregunta, proporciona comentarios constructivos y fomenta el crecimiento profesional en la persona que hace la pregunta.

Pautas para responder preguntas