I'm experiencing issues when I use read replicas in my Amazon Aurora MySQL-Compatible Edition DB instance. I want to troubleshoot these issues.
Resolution
Promote an Aurora MySQL-Compatible read replica
If the writer instance requires a reboot or maintenance, then perform a manual failover to promote a read replica as a writer instance.
To perform a manual failover, complete the following steps:
- Open the Amazon RDS console.
- In the navigation pane, choose Databases.
- Select the writer instance for your Aurora DB cluster.
- Choose Actions, and then choose Failover.
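The console steps above can also be scripted. The following is a minimal sketch that assumes the AWS SDK for Python (boto3) is installed and credentials are configured. The cluster and instance identifiers and the default Region are placeholders, and the helper names are illustrative only.

```python
def build_failover_request(cluster_id, target_instance_id=None):
    """Build keyword arguments for the RDS FailoverDBCluster API call."""
    request = {"DBClusterIdentifier": cluster_id}
    if target_instance_id:
        # Optional: name the reader to promote. If omitted, Aurora chooses one.
        request["TargetDBInstanceIdentifier"] = target_instance_id
    return request


def fail_over(cluster_id, target_instance_id=None, region="us-east-1"):
    """Start a manual failover and return the cluster status that RDS reports."""
    import boto3  # imported here so the helper above works without boto3

    rds = boto3.client("rds", region_name=region)
    response = rds.failover_db_cluster(
        **build_failover_request(cluster_id, target_instance_id)
    )
    return response["DBCluster"]["Status"]
```

For example, fail_over("my-aurora-cluster", "my-reader-1") requests a failover to a specific reader, where both names are hypothetical.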
If the writer instance is unavailable, then Aurora MySQL-Compatible automatically fails over to a read replica instance. A writer instance can become unavailable for multiple reasons, such as resource contention or maintenance activity.
If you have multiple readers, then specify a promotion priority tier for each instance that's in your cluster. Tiers range from 0 (highest priority) to 15 (lowest priority). When the writer instance fails, Aurora MySQL-Compatible promotes the replica with the highest priority as the new writer.
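As an illustration, promotion tiers can be assigned with boto3 through the ModifyDBInstance API. This sketch assumes boto3 and configured credentials. The instance names and Region are placeholders, and the tier range of 0 through 15 is checked locally before the call is made.

```python
def build_promotion_tier_request(instance_id, tier):
    """Build keyword arguments for ModifyDBInstance that set a promotion tier."""
    if not 0 <= tier <= 15:
        raise ValueError("promotion tier must be between 0 and 15")
    return {
        "DBInstanceIdentifier": instance_id,
        "PromotionTier": tier,  # a lower number means a higher failover priority
        "ApplyImmediately": True,
    }


def set_promotion_tiers(tiers_by_instance, region="us-east-1"):
    """Apply a tier to each instance, for example {"reader-1": 0, "reader-2": 1}."""
    import boto3  # imported here so the helper above works without boto3

    rds = boto3.client("rds", region_name=region)
    for instance_id, tier in tiers_by_instance.items():
        rds.modify_db_instance(**build_promotion_tier_request(instance_id, tier))
```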
You can also promote a cross-AWS Region Aurora replica as a standalone DB cluster. After you initiate the promotion process, the cross-Region replication stops. The promoted cluster functions as an independent DB cluster and manages both read and write operations.
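A minimal sketch of the cross-Region promotion, assuming boto3 and the PromoteReadReplicaDBCluster API. The cluster identifier and Region are placeholders. The optional client argument is an illustrative convenience so that the function can be exercised with a stub.

```python
def promote_cross_region_replica(replica_cluster_id, region, client=None):
    """Promote a cross-Region replica cluster to a standalone DB cluster."""
    if client is None:
        import boto3  # only needed when no client is supplied

        # The call is made in the Region where the replica cluster runs.
        client = boto3.client("rds", region_name=region)
    response = client.promote_read_replica_db_cluster(
        DBClusterIdentifier=replica_cluster_id
    )
    return response["DBCluster"]["DBClusterIdentifier"]
```

Because replication stops when the promotion starts, it can help to confirm that the replica has caught up before you run a call like this.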
Measure replication lag
Because all Aurora DB instances in a DB cluster share a common data volume, there's minimal replication lag. However, you might experience slightly increased lag on the readers in some scenarios.
Note: Cross-Region replicas use binary log replication. The change rate on the source, the apply rate on the replica, and network delays between the selected Regions can all affect cross-Region replica lag. Cross-Region replicas that use Aurora global databases have a typical lag of less than 1 second.
To measure replication lag, use the following Amazon CloudWatch metrics:
- The AuroraReplicaLag metric measures, in milliseconds, the replica lag between the writer instance and a reader instance in the same Region.
- The AuroraBinlogReplicaLag metric measures, in seconds, the replica lag between Aurora DB clusters that use binary log replication.
For more information about the preceding metrics, see Instance-level metrics for Amazon Aurora.
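To read the metric outside the console, a sketch along the following lines might help. It assumes boto3, the AWS/RDS namespace, and the DBInstanceIdentifier dimension for the reader instance. The instance name, time window, and Region are placeholders.

```python
from datetime import datetime, timedelta, timezone


def build_replica_lag_query(instance_id, minutes=60, now=None):
    """Build keyword arguments for CloudWatch GetMetricStatistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "AuroraReplicaLag",  # reported in milliseconds
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,  # one datapoint per minute
        "Statistics": ["Average", "Maximum"],
    }


def worst_lag_ms(datapoints):
    """Return the largest Maximum value, or None when there are no datapoints."""
    return max((point["Maximum"] for point in datapoints), default=None)


def fetch_worst_lag_ms(instance_id, region="us-east-1"):
    import boto3  # imported here so the helpers above work without boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    response = cloudwatch.get_metric_statistics(**build_replica_lag_query(instance_id))
    return worst_lag_ms(response["Datapoints"])
```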
Improve replication performance
Take the following actions:
- To keep reader instances from falling behind, it's a best practice to make all instances in a cluster the same size. When a reader instance is smaller than the writer instance, the volume of changes can be too much for the reader to keep up with.
Note: If there's a heavy workload on the writer instance, then you might notice temporary read replica lag. After the reader instance catches up with the writer instance, the lag decreases.
- To avoid replication lag when long-running transactions are in progress, run your transactions in smaller batches and commit frequently.
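The batching advice above can be sketched as follows. The example assumes any Python DB-API connection, such as one from a MySQL driver. The statement, the rows, and the batch size of 1,000 are illustrative and should be tuned for your workload.

```python
def run_in_batches(conn, sql, rows, batch_size=1000):
    """Run sql once per row and commit after every batch.

    Returns the number of commits issued. Short transactions give the
    readers smaller units of change to apply than one large transaction.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    cursor = conn.cursor()
    commits = 0
    try:
        for start in range(0, len(rows), batch_size):
            batch = rows[start:start + batch_size]
            cursor.executemany(sql, batch)
            conn.commit()  # end the transaction before the next batch starts
            commits += 1
    finally:
        cursor.close()
    return commits
```

Note that the placeholder style in the statement depends on the driver. Many MySQL drivers use %s, and the sqlite3 module uses ?.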
For information about how to use native binary log-based MySQL replication to troubleshoot replica lag, see Amazon Aurora MySQL replication issues.
Troubleshoot high replication lag
Use the AuroraReplicaLag CloudWatch metric to check for high replication lag. High replication lag can cause a reader instance to restart. To resolve this issue, see Why did my Amazon Aurora read replica fall behind and restart?
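One way to catch sustained lag early is a CloudWatch alarm on the metric. The following sketch assumes boto3 and the PutMetricAlarm API. The alarm name, the 2,000 ms threshold, and the five-minute window are arbitrary example values, not recommendations from this article.

```python
def build_replica_lag_alarm(instance_id, threshold_ms=2000, minutes=5):
    """Build keyword arguments for CloudWatch PutMetricAlarm."""
    return {
        "AlarmName": f"aurora-replica-lag-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/RDS",
        "MetricName": "AuroraReplicaLag",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 60,  # evaluate one-minute averages
        "EvaluationPeriods": minutes,  # alarm after this many breaching periods
        "Threshold": float(threshold_ms),
        "ComparisonOperator": "GreaterThanThreshold",
    }


def create_replica_lag_alarm(instance_id, region="us-east-1", **overrides):
    import boto3  # imported here so the helper above works without boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    cloudwatch.put_metric_alarm(**build_replica_lag_alarm(instance_id, **overrides))
```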
Set up GTID-based replication
Aurora doesn't use native binary log replication to replicate data to read replica instances. You can't use a global transaction identifier (GTID) to replicate data between instances in the same cluster. However, you can set up GTID-based replication in some scenarios. For more information about how to use GTID-based replication in Aurora MySQL-Compatible, see Amazon Aurora for MySQL compatibility now supports global transaction identifiers (GTIDs) replication.
For Aurora MySQL-Compatible versions 3.04 and later, multithreaded binary log replication is activated and replica_parallel_workers is set to 4 by default. Because multithreaded binary log replication is activated, you must increase the resilience of your database against an unexpected halt. It's a best practice to activate GTID replication on your source and allow GTIDs on the replica.
Note: You can set up GTID-based replication between Amazon Relational Database Service (Amazon RDS) for MySQL and an Aurora cluster, and between Aurora clusters. The source must be an external primary server. Before you start the replication process, be sure to activate binary logging on the source.
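Before you configure replication, it can be useful to confirm the GTID settings on the source. This sketch assumes a Python DB-API cursor that's connected to a MySQL-compatible server, and it reads the standard MySQL system variables gtid_mode and enforce_gtid_consistency. The function names are illustrative.

```python
def read_gtid_settings(cursor):
    """Read the global GTID-related system variables from the server."""
    cursor.execute(
        "SELECT @@GLOBAL.gtid_mode, @@GLOBAL.enforce_gtid_consistency"
    )
    mode, consistency = cursor.fetchone()
    return {
        "gtid_mode": str(mode).upper(),
        "enforce_gtid_consistency": str(consistency).upper(),
    }


def gtid_fully_enabled(settings):
    """True only when GTIDs are on and GTID-unsafe statements are rejected."""
    return (
        settings["gtid_mode"] == "ON"
        and settings["enforce_gtid_consistency"] == "ON"
    )
```

A result such as ON_PERMISSIVE for gtid_mode means that the server is in a transitional mode, so the check above reports it as not fully enabled.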
For more information about GTID, see GTID format and storage on the MySQL website.
Related information
Replicating Amazon Aurora MySQL DB clusters across AWS Regions
Replication with Amazon Aurora