Lost connection to MySQL server when scaling RDS Aurora Cluster

Hello, we're having some difficulty with our primary AWS Aurora clusters and the various webservers that connect to them. Whenever a scaling operation occurs, whether scaling up or scaling down, a large number of connections to the database (2,000+) are rejected. We have noted the following errors in the RDS event log and on our webservers:

The DB cluster failed to scale from 64 capacity units to 32 capacity units for this reason: A scaling point wasn’t found.

OperationalError: (2013, "Lost connection to MySQL server at 'reading initial communication packet', system error: 0")

We have attempted to debug this problem with the following steps, but nothing has helped us to resolve it so far:

  • Reviewed MySQL error logs in CloudWatch (these only provided the server-level errors, nothing from RDS specifically)
  • Turned off general MySQL logging
  • Ensured Apache / MySQL connection timeouts did not exceed the RDS Autoscaling Timeout
  • Verified all VPC, IGW, Route Table, and subnet settings associated with the instances connecting to the database.

Below is some information regarding the cluster.

Cluster Version:

  • Aurora MySQL (compatible with MySQL 5.6.1.22.3)

Capacity Settings:

  • Minimum ACUs 16 - 32 GiB RAM
  • Maximum ACUs 64 - 122 GiB RAM

Additional Scaling Configuration:

  • Autoscaling timeout and action: 00:05:00
  • Do this: Roll back the capacity change

Any advice or insight into possible avenues of troubleshooting or debugging would be highly appreciated at this point.

1 Answer

By default, Aurora Serverless v1 needs to find a scaling point before it can scale up or down to the required capacity. The Force the capacity change option isn't selected by default. Keep this option cleared to have your Aurora Serverless v1 DB cluster's capacity remain unchanged if the scaling operation times out without finding a scaling point. Selecting this option causes your Aurora Serverless v1 DB cluster to enforce the capacity change, even without a scaling point.

The error Lost connection to MySQL server at 'reading initial communication packet' might be caused by a lack of capacity on the Aurora Serverless side. Please consider enabling the Force the capacity change option. Be careful, though: we recommend reviewing the document below before turning this setting on.

For more detail, please see our documentation: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless-v1.how-it-works.html#aurora-serverless.how-it-works.timeout-action
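
If you would rather change the setting from a script than from the console, the following is a minimal sketch of how it might look with boto3. It assumes the cluster is Aurora Serverless v1 and reuses the capacity range (16 to 64 ACUs) and the 5-minute timeout from the question. The helper names and the cluster identifier "my-aurora-cluster" are hypothetical placeholders, not values from your account. Note that forcing a capacity change can drop connections that are preventing a scaling point from being found, so test this outside production first.

```python
def build_force_scaling_request(cluster_id, min_acu=16, max_acu=64, timeout_seconds=300):
    """Build keyword arguments for the RDS ModifyDBCluster call.

    The defaults mirror the settings quoted in the question (assumed, not verified):
    16-64 ACUs and a 00:05:00 autoscaling timeout.
    """
    return {
        "DBClusterIdentifier": cluster_id,
        "ScalingConfiguration": {
            "MinCapacity": min_acu,
            "MaxCapacity": max_acu,
            # How long Aurora looks for a scaling point before the timeout action runs.
            "SecondsBeforeTimeout": timeout_seconds,
            # API equivalent of the console's "Force the capacity change" option.
            # The rollback behaviour is "RollbackCapacityChange".
            "TimeoutAction": "ForceApplyCapacityChange",
        },
    }


def apply_force_scaling(rds_client, cluster_id):
    """Send the request using an RDS client, e.g. one created by boto3.client("rds")."""
    return rds_client.modify_db_cluster(**build_force_scaling_request(cluster_id))


# Usage (requires boto3 and AWS credentials; shown as a comment so nothing runs by accident):
#   import boto3
#   apply_force_scaling(boto3.client("rds"), "my-aurora-cluster")
```

Keeping the request builder separate from the call makes it easy to print and review the exact payload before anything is sent to the cluster.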

AWS
answered 2 years ago
