How does Aurora MySQL or PostgreSQL distribute workload between reader nodes?


I want to know how Amazon Aurora for MySQL or PostgreSQL distributes workload between reader nodes.

Resolution

Use the reader endpoint to connect to the reader nodes in an Amazon Aurora cluster. 

To see how Aurora tries to load balance your connections to the reader endpoint, resolve the endpoint repeatedly from a shell and check which instance each lookup returns. Run a command similar to the following example:

while sleep 5; do dig xxx-cluster.cluster-ro-xxxxxx.us-east-1.rds.amazonaws.com | grep CNAME; done

Each time that you resolve the reader endpoint, the CNAME record in the answer points to one of the reader instances, and you get that instance's IP address to connect to. Aurora chooses the instance based on the rotation distribution.
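To summarize the rotation instead of reading it line by line, you can tally how often each instance appears. The following is a minimal sketch, not an official AWS tool. The function name tally_targets and the sample count are illustrative assumptions, and the sketch assumes the standard column layout of `dig +noall +answer` output (name, TTL, class, type, data).

```shell
#!/usr/bin/env bash
# Count how often each CNAME target appears in `dig +noall +answer` output
# that is read from stdin. Columns are: name, TTL, class, type, data.
tally_targets() {
  awk '$4 == "CNAME" { count[$5]++ }
       END { for (t in count) print count[t], t }' | sort -rn
}

# Hypothetical usage: sample the reader endpoint 20 times, 5 seconds apart.
# ENDPOINT="<your-reader-endpoint>"
# for i in $(seq 1 20); do dig +noall +answer "$ENDPOINT"; sleep 5; done | tally_targets
```

A heavily skewed tally on a cluster with several healthy replicas suggests that one of the scenarios described in this article, such as DNS caching, applies to your client.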

DB connections might not be evenly distributed across the read replicas in the following scenarios:

  • A client caches DNS information. The uneven distribution happens when the client uses cached connection settings to connect to the same Aurora replica. DNS caching can occur anywhere, including your network layer, the operating system, or the application container.
  • The DB cluster is failing over. When an Aurora replica is promoted to the primary DB instance, the reader endpoint might temporarily direct connections to the primary DB instance for the DB cluster.
  • The read replica is unavailable or fails a health check.
  • The application is written in Java, and you don't turn off or adjust the TTL caching. When you don't turn off or adjust TTL caching, Java virtual machines (JVMs) might indefinitely cache DNS. For more information, see Setting the JVM TTL for DNS name lookups.
  • Connections are opened at the same time. Connections that resolve the reader endpoint at the same time can be sent to the same reader instance.
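To judge how long a client or resolver is allowed to reuse a cached answer, you can look at the TTL column that dig prints for each record. The following is a minimal sketch. The function name min_ttl is an illustrative assumption, and the value that a caching resolver reports is the remaining TTL, so it can vary between runs.

```shell
#!/usr/bin/env bash
# Print the smallest TTL (in seconds) among the records in
# `dig +noall +answer` output that is read from stdin.
# Columns are: name, TTL, class, type, data.
min_ttl() {
  awk 'NF >= 5 && $2 ~ /^[0-9]+$/ {
         if (min == "" || $2 + 0 < min) min = $2 + 0
       }
       END { if (min != "") print min }'
}

# Hypothetical usage:
# dig +noall +answer "<your-reader-endpoint>" | min_ttl
```

If your client keeps connecting to the same replica for much longer than this value, a cache above the resolver, such as the JVM or the application container, is a likely cause.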

When you're managing your workload distribution, use custom endpoints for more flexibility. For example, you can use custom endpoints if you use different DB instance sizes within the cluster. 
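As an illustration, a custom endpoint that targets only selected readers can be created with the AWS CLI command `aws rds create-db-cluster-endpoint`. The following sketch only prints the command so that you can review it before you run it. The helper name custom_endpoint_cmd and the identifiers my-cluster, large-readers, reader-1, and reader-2 are hypothetical placeholders, not values from your account.

```shell
#!/usr/bin/env bash
# Print (do not run) an AWS CLI command that creates a READER custom endpoint
# limited to the listed DB instances.
# Arguments: <cluster-id> <endpoint-id> <instance-id>...
custom_endpoint_cmd() {
  cluster="$1"
  endpoint="$2"
  shift 2
  printf 'aws rds create-db-cluster-endpoint --db-cluster-identifier %s --db-cluster-endpoint-identifier %s --endpoint-type READER --static-members %s\n' \
    "$cluster" "$endpoint" "$*"
}

# Hypothetical usage: group the two larger replicas behind one endpoint.
# custom_endpoint_cmd my-cluster large-readers reader-1 reader-2
```

Connections to the resulting custom endpoint are distributed only among the instances that you list, which lets you send heavier queries to larger instances.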

Related information

Amazon Aurora connection management

AWS OFFICIAL
Updated 8 months ago
2 Comments

If a client caches DNS information, you might see a discrepancy in the distribution of the connections. This happens when the client connects to the same Aurora replica using cached connection settings.

This DNS caching aspect can actually have benefits for cache utilization. If the same client makes several connections in quick succession and queries the same tables, it's likely to go to the same database server and get the data out of that server's buffer cache. That is faster (better cache utilization) and cheaper (less I/O) than if every single connection goes to a different reader.

johrss
replied 8 months ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied 8 months ago