
Aurora RDS Storage and Connection issues


I am running my applications on Aurora RDS MySQL 8.0 with two instances in the cluster of the type r7g.large.

I am encountering two issues whose root causes I have not been able to identify:

1- Too many connections errors: every now and then, the application starts reporting a large number of "Too many connections" errors. I checked metrics such as DB connections, which peak at about 120 during incidents, while my max_connections parameter is set to 1000, which is odd. At the same time, all other metrics such as CPU utilization, freeable memory, and free local storage are at acceptable values of 40%, 4.4 GB, and 30 GB, respectively.

:{"code":1040,"error_message":"SQLSTATE[08004] [1040] Too many connections","error_trace"

[Image: DB connections with no spike when the error happened]

2- Storage Issues: I am receiving this error in the logs:

**Due to storage space constraints, the log file mysql-slowquery.log will be deleted and will not be uploaded to CloudWatch Logs**

I receive this message every five minutes, which is very disruptive. Shouldn't Aurora storage scale dynamically? My whole cluster is only at 200 GB, so it is well below the storage limit.

[Image: free local storage nearly the same when the Aurora storage error was reported]

1 Answer

For your "too many connections" errors:

This error can occur even though your metrics show only about 120 connections against a max_connections setting of 1000. There are several possible explanations:

  1. Connection leaks in your application: Your application might not be properly closing connections, leading to a buildup of idle connections that aren't actively being used but still count against your limit.

  2. Inefficient connection pooling: If your application isn't using connection pooling effectively, it might be creating too many new connections rather than reusing existing ones.

  3. Temporary connection spikes: Brief spikes in connection attempts might occur between your monitoring intervals, causing the error without being captured in your metrics.

  4. Connection distribution: Your cluster has two instances, and max_connections applies to each instance separately. If connections are not evenly distributed, one instance can reach its limit while the cluster as a whole appears to have capacity.
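To illustrate the third point, here is a minimal sketch of how a short burst can disappear when a graph plots an averaged statistic. The sample values are invented for illustration and are not taken from the metrics in the question; the helper name `summarize` is hypothetical.

```python
from statistics import mean

def summarize(samples, period):
    """Group per-second samples into windows of `period` seconds and
    return a (average, maximum) pair for each window."""
    out = []
    for start in range(0, len(samples), period):
        window = samples[start:start + period]
        out.append((mean(window), max(window)))
    return out

# Invented data: 120 connections for 57 seconds, then a 3-second burst to 1000.
samples = [120] * 57 + [1000] * 3

avg, peak = summarize(samples, 60)[0]
print(f"average over the minute: {avg}, maximum within the minute: {peak}")
```

In this made-up case the one-minute average is 164 while the peak is 1000, so a graph of the average would show nothing unusual. Switching the graph to the Maximum statistic at the shortest available period, and viewing each instance separately, is one way to check whether something similar is happening here.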

For your storage issues with the slow query logs:

The error about the mysql-slowquery.log file is related to local storage constraints, not the Aurora cluster storage. While Aurora storage does automatically scale for your database data, each DB instance has a fixed amount of local storage for logs and temporary files. This local storage doesn't automatically scale.

The slow query logs are being stored on the local instance storage, and it appears this storage is filling up. When this happens, Aurora will delete older log files to make space, which is what the message is indicating.
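Because local storage is tracked per instance, it can help to look at the FreeLocalStorage metric for each instance at a fine granularity. The following is a rough sketch, assuming Python with boto3 installed and AWS credentials configured; the function names and the instance identifier are hypothetical placeholders.

```python
from datetime import datetime, timedelta, timezone

def free_local_storage_query(instance_id, hours=3, now=None):
    """Build parameters for CloudWatch get_metric_statistics that ask for
    the per-minute minimum of FreeLocalStorage on one DB instance."""
    now = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "FreeLocalStorage",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "StartTime": now - timedelta(hours=hours),
        "EndTime": now,
        "Period": 60,                 # seconds per datapoint
        "Statistics": ["Minimum"],    # lowest free space seen in each period
    }

def fetch_free_local_storage(instance_id):
    """Run the query. Needs boto3 (third-party) and AWS credentials."""
    import boto3
    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**free_local_storage_query(instance_id))
    return sorted(response["Datapoints"], key=lambda d: d["Timestamp"])

# Example with a placeholder identifier (not from the question):
# for point in fetch_free_local_storage("my-aurora-instance-1"):
#     print(point["Timestamp"], point["Minimum"])
```

Running this once per instance shows whether one of the two instances dips much lower than the value seen on the dashboard.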

To address these issues:

For connection issues:

  • Implement proper connection pooling in your application
  • Check for and fix any connection leaks
  • Monitor connections more frequently to catch potential spikes
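As a rough illustration of the first two points, here is a minimal pooling sketch in Python. It is not tied to any particular driver: `SimplePool` and `FakeConnection` are hypothetical names, and `FakeConnection` stands in for whatever real connection object your database library returns. The idea carries over to whichever language and driver the application uses.

```python
import queue
from contextlib import contextmanager

class SimplePool:
    """A fixed-size pool: connections are created once and then reused."""

    def __init__(self, connect, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    def idle_count(self):
        return self._idle.qsize()

    @contextmanager
    def connection(self, timeout=5.0):
        # Blocks until a connection is free; raises queue.Empty after `timeout`
        # seconds instead of opening yet another connection to the database.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the caller raised,
            # so a failed request cannot leak it.
            self._idle.put(conn)

class FakeConnection:
    """Placeholder for a real database connection (assumption for the demo)."""
    def query(self, sql):
        return f"ran: {sql}"

pool = SimplePool(FakeConnection, size=2)
with pool.connection() as conn:
    print(conn.query("SELECT 1"), "| idle while in use:", pool.idle_count())
print("idle after release:", pool.idle_count())
```

The pool size caps how many connections this process can hold, and the `finally` block is what prevents leaks. Most database libraries and frameworks ship a production-grade pool, which is generally preferable to a hand-written one like this.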

For storage issues:

answered 15 days ago
