RDS Aurora MySQL multi-master downtime and best practices


Hi all,

---Context--- A customer is currently using a large RDS Aurora cluster (12 replicas: 6x r5.12xlarge and 6x r5.4xlarge) for their production environment. This cluster is part of a monolith that is slowly being broken down into smaller applications with independent data stores. That work will still take months or years to complete because of competing priorities on their end.

---Challenge--- Over the past few months the customer has performed a few database restarts due to either engine upgrades or different database parameter tuning. The customer would like to evaluate multi-master or any other alternative that mitigates service downtime as much as possible for future upgrades or restarts.

---Questions---

  1. Is multi-master (2 nodes) + 12 additional read replicas an option at all?
  2. If we ever implement a multi-master approach and keep the remaining replicas as readers, how does a database upgrade/restart affect the service? Are all the database nodes rebooted, as happens with a regular single-master cluster?
  3. The customer application is not built for a multi-master active-active approach as they won't be able to handle deadlocks at the application level. Is a multi-master active-passive an option for fail-over?
  4. Do we have any other recommendation or architecture for managing database upgrades/restarts that would help minimize downtime?

Thanks!

AWS
asked 4 years ago, 2781 views
1 Answer
Accepted Answer

1), 2)

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-multi-master.html#aurora-multi-master-overview

In a multi-master cluster, all DB instances can perform write operations. The notions of a single read/write primary instance and multiple read-only Aurora Replicas don't apply.

A multi-master cluster has no read replicas; it consists of two nodes, both read/write.

Because binlog replication is not available, replicating to MySQL on EC2 is not possible either. As a result, it is currently difficult to scale out read workloads on multi-master.

Yes, an active-passive workload minimizes downtime for write operations. However, if one of the nodes dies, switching over to the other node is the application's responsibility. Cluster endpoints are not used for DML in multi-master.
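Since the application has to pick the surviving node itself, the failover logic could look roughly like the following. This is only a minimal sketch, not an AWS-provided mechanism: `connect_with_failover`, the `connect` callable, and the endpoint names in the usage note are hypothetical, and a real application would pass in its own driver's connect function.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def connect_with_failover(endpoints: Sequence[str], connect: Callable[[str], T]) -> T:
    """Try each instance endpoint in order and return the first connection that opens.

    `endpoints` is ordered by preference: the active node first, the passive
    node second. `connect` is whatever function the application uses to open
    a connection to a single host (assumed to raise OSError on network failure).
    """
    last_error = None
    for endpoint in endpoints:
        try:
            return connect(endpoint)
        except OSError as exc:
            # Node unreachable: remember the error and fall through to the next node.
            last_error = exc
    raise ConnectionError(f"no endpoint reachable: {list(endpoints)}") from last_error
```

Called as `connect_with_failover(["node-a.example", "node-b.example"], my_connect)`, it sends all traffic to the first node while it is reachable and only moves to the second when opening a connection fails, which keeps writes on a single node at a time.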

Check out the other limitations as well.

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-multi-master.html#aurora-multi-master-limitations

How many seconds of downtime are acceptable? You may want to review DB connection management first. There are best practices for DNS caching, smart drivers, etc.

https://d1.awsstatic.com/whitepapers/RDS/amazon-aurora-connection-management-handbook.pdf
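On the application side, one common piece of connection management is retrying a failed operation with a short, growing delay, so that a brief restart shows up as added latency rather than an error. The sketch below is an illustration under that assumption and is not taken from the handbook; `call_with_retry` and its parameters are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on OSError with exponential backoff.

    The delay doubles after each failed attempt (base_delay, 2*base_delay, ...).
    The last failure is re-raised once `attempts` is exhausted.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except OSError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The `operation` should open a fresh connection on each call, so that a new DNS lookup can pick up an endpoint that has moved to another instance. The total wait (`base_delay` times `2**(attempts-1) - 1`) should be sized against the downtime the customer considers acceptable.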

answered 4 years ago
EXPERT
reviewed 6 months ago
