Elevated CPU Utilization on Aurora Serverless V2 MySQL

Hello, I noticed that, beginning on March 29, the CPU usage on our Aurora Serverless V2 (MySQL) instance increased and has not come back down. The previous average was around 16% for many weeks. Then it suddenly rose to an average of between 20% and 35%, and it almost never dips below that range. This happened on March 29 at 04:12 AM UTC-4.

Can you please tell me if something changed on the configuration of the server or if there is a problem with it that is causing this?

The instance that I am referring to is called ptchat-instance-1.

Thank you!

CPU Usage Chart

  • Update… The problem has persisted, so I contacted the AWS paid support group (went ahead and purchased a subscription). They confirmed the problem is on their end and they are working on a resolution. They also said when the solution is rolled out I should contact the billing team and ask for credits from several months of elevated charges. Let’s hope they can resolve this bug because it’s fundamental to Aurora V2.

  • Thanks for sharing this update, mojo-alan. I opened two cases with AWS a few weeks ago, but they didn't admit an issue on their side. Please update this post again when you are notified that the issue has been resolved. Cheers

  • I'm also facing this issue. Two random spikes of increased CPU and ACU usage for 15+ hours. Checked my internal logs and haven't had any user activity to trigger the spikes. This is a pretty gnarly issue and can cause a lot of extra expenses for businesses. I find it funny that we have to pay to submit a ticket and hopefully get a refund for the RDS usage. First time I faced this issue, I had to reboot the entire cluster (rebooting instance didn't fix spike), which caused an outage. After rebooting, the utilization went back to normal. I'm hesitant to do that again for this new spike.

asked a year ago, 1604 views
6 Answers

Update... Today we upgraded from Aurora MySQL 3.02.2 to Aurora MySQL 3.03.1. After this upgrade, an operating system update became available. I do not know if updating MySQL triggered the availability of the OS patch, but I am positive it was not there about one hour earlier, as I went looking for any such update or patch and found none.

After the OS patch finished installing (it took maybe 10 minutes with a few minutes of actual SQL downtime, so be prepared for that) we are seeing absolutely no issues. The ServerlessDatabaseCapacity dropped nicely back down to 0.5 and the CPU and ACU usage are way down in the 15% range. Also we had been seeing lots of aborted connections, like 5 per minute, and those are gone as well. Those aborted connections were made by the user rdsadmin, so they were definitely part of general Aurora plumbing that was going awry. So I think the team updated all that plumbing with the OS patch, and that resolved all this. That's what I think, anyway. I have not yet received confirmation from the technical folks as to this being the root cause. Of course, I will continue to monitor everything and hope that the problem does not reoccur in a few days, as it had previously. But the complete absence of the aborted connections helps me feel that this issue may be behind us. Standing by...
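For anyone who wants to watch the same metrics outside the console, here is a minimal Python sketch. It assumes boto3 is installed and AWS credentials are configured, and that the metrics are published in the `AWS/RDS` CloudWatch namespace under the `DBInstanceIdentifier` dimension. The helper names and the two-week, hourly window are illustrative choices, not something AWS support provided.

```python
from datetime import datetime, timedelta, timezone

# Metrics mentioned in this thread; names assumed to match the AWS/RDS namespace.
METRICS = ("CPUUtilization", "ACUUtilization", "ServerlessDatabaseCapacity")


def build_metric_request(instance_id, metric_name, end, days=14, period_seconds=3600):
    """Build keyword arguments for CloudWatch get_metric_statistics (hypothetical helper)."""
    return {
        "Namespace": "AWS/RDS",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "StartTime": end - timedelta(days=days),
        "EndTime": end,
        "Period": period_seconds,  # one datapoint per hour by default
        "Statistics": ["Average"],
    }


def fetch_hourly_averages(instance_id, metric_name):
    """Return (timestamp, average) pairs sorted by time. Needs boto3 and credentials."""
    import boto3  # imported here so the request builder works without boto3

    cloudwatch = boto3.client("cloudwatch")
    request = build_metric_request(instance_id, metric_name, datetime.now(timezone.utc))
    response = cloudwatch.get_metric_statistics(**request)
    points = [(p["Timestamp"], p["Average"]) for p in response["Datapoints"]]
    return sorted(points)
```

Calling `fetch_hourly_averages("ptchat-instance-1", name)` for each name in `METRICS` should give a two-week series per metric, which makes it easier to see whether capacity really settles back to its minimum after a patch.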

answered a year ago
  • Thanks for this update! I'm facing your EXACT issue, and will be upgrading our cluster to 3.03.1 during our slow hours. I will respond with a comment if that fixes my issue as well.

  • After applying the OS patch, we saw a drop in both rdsadmin user disconnections and ACU utilization. Let's hope this fix is stable and this behavior lasts forever!

  • Upgraded to 3.03.1 and haven't had any issues for a week and a half. I talked to support and they will not provide me a refund for either the RDS charges or the support fee. Seems silly. Anyway, this "fix" has been working, and I'm hoping utilization will continue to stay low. Thanks again for this "solution".

We have precisely the same problem. Starting from the end of March, without any changes to our workload, the CPU consumed by Aurora Serverless v2 has increased and remains steady at 3 times what it was before.

See the cost and ACU utilization charts below:

[Cost and ACU utilization charts]

answered a year ago
  • Ahh, so we are not alone. Thanks for posting. Here's something... If I look at the Queries count/sec, I can see there are usually about 11 queries per second, even when we are not using the system at all. Is there a way that I can see a trace or the stream of queries that are happening? I need that for general application tuning, as well as security reviews, and it would be good to have it for this kind of performance inquiry as well.

Update... Last night it seems that the CPU spiked to 94% for a few minutes, and then the whole thing simmered down to where it was before and even lower. I did nothing. Here is the same 2-week chart. You can see the recent change early this morning, followed by what looks like extremely healthy CPU usage today. I'll .. um .. take it. But I will have to keep an eye on this going forward.

CPU Usage Chart

answered a year ago

Hello - Please use this link, which has a detailed explanation of how to troubleshoot high resource consumption - https://repost.aws/knowledge-center/rds-instance-high-cpu

Regards, -Praveen

AWS
answered a year ago
  • I have already worked through much of that document, and I will continue to do so. But it is hard to know if that document pertains to Aurora Serverless V2. There is only one reference to Aurora V2.

    My question is not "how can I troubleshoot my Aurora Serverless V2 instance" but rather "Can you please tell me if something changed" which I would have no means of knowing unless I asked directly. The fact that I have provided the exact date and time of when this CPU usage began to be elevated should be helpful for the right person to review any releases or changes that took place at that time. Thoughts?

I'm going to wait until my June bill gets created, and then it will be easy to prove to AWS that this problem caused increased RDS costs, and for exactly which months. One of the support team members mentioned asking the AWS Billing team for a refund for that amount. They seemed to go dark on that comment later on in my correspondence with them, but they definitely mentioned it at first.
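One rough way to put numbers on this is to compare hourly ServerlessDatabaseCapacity averages with the pre-incident baseline and sum the excess per month. The sketch below assumes you already have hourly (timestamp, ACU) datapoints, for example exported from CloudWatch. The function names, the baseline value, and the price per ACU-hour are placeholders to fill in from your own metrics and bill, not AWS figures.

```python
from collections import defaultdict
from datetime import datetime


def excess_acu_hours_by_month(hourly_points, baseline_acu):
    """Sum ACU-hours above the baseline for each calendar month.

    hourly_points: iterable of (datetime, average ACU for that hour).
    baseline_acu: typical capacity before the problem began (your own estimate).
    """
    totals = defaultdict(float)
    for timestamp, acu in hourly_points:
        month = timestamp.strftime("%Y-%m")
        # Hours at or below the baseline contribute nothing.
        totals[month] += max(0.0, acu - baseline_acu)
    return dict(totals)


def estimated_extra_cost(excess_by_month, price_per_acu_hour):
    """Convert excess ACU-hours to currency using a price taken from your bill."""
    return {month: hours * price_per_acu_hour for month, hours in excess_by_month.items()}
```

This only estimates the compute portion of the bill; the billed amounts on the invoice remain the authoritative figures to quote to the billing team.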

answered 10 months ago
  • In my case, support is checking with the internal team and has yet to give me a clarification on this issue. By the way, may I ask whether your utilisation is still high, or did you upgrade to Aurora MySQL 3.02.2?

Yeah, I got a sudden spike in my CPU and ACU utilisation that cost me 1000% more than average. I wonder whether AWS can refund my money.

answered 10 months ago
