RDS instance running out of EBS byte balance

2

Hi Hivemind,

I am running a data-crunching workload on a db.t3.medium RDS instance (with a 600 GiB gp2 volume and the PostgreSQL engine).

I have observed that the EBS byte balance quickly reaches zero and remains at zero throughout the workload execution. Could someone let me know the effect of this on the workload? Will it degrade the rate of read/write operations and eventually slow down the DB queries?

Please note that IOPS (at 4000), EBS IO Balance (at 100), and CPU credit balance (at 600) remain healthy throughout the execution.

4 Answers
2

Description of the metric can be found here - https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/rds-metrics.html

Information about EBS-optimized instances is here - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-optimized.html#current. Using the describe-instance-types command shown on that page, I see that the maximum bandwidth for t3.medium is 2085 Mbps. If you're seeing EBSByteBalance% at zero, you're somehow consuming a lot of throughput, but not a lot of IOPS. When the byte balance is zero, you should be getting throttled on bandwidth and getting significantly less than the maximum of 2085 Mbps.
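To put the 2085 Mbps figure in the same units most storage metrics use, here is a small conversion sketch. The function names are made up for illustration, and the baseline value is left as a parameter: you would fill it in from your own describe-instance-types output, since it is not quoted in this thread.

```python
def mbps_to_mb_per_s(mbps: float) -> float:
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbps / 8.0


def seconds_to_transfer(gigabytes: float, bandwidth_mbps: float) -> float:
    """Rough time to move `gigabytes` of data at a sustained `bandwidth_mbps`.

    Uses decimal units (1 GB = 1000 MB) and ignores protocol overhead,
    so treat the result as a lower bound, not a prediction.
    """
    return (gigabytes * 1000.0) / mbps_to_mb_per_s(bandwidth_mbps)


# Maximum (burst) EBS bandwidth for t3.medium as quoted above.
MAX_EBS_BANDWIDTH_MBPS = 2085

print(mbps_to_mb_per_s(MAX_EBS_BANDWIDTH_MBPS))  # 260.625
```

Once the byte balance is at zero, the same conversion applied to the instance's baseline bandwidth (not the maximum) gives the ceiling you should expect to be held to.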

I would recommend summing the NetworkReceiveThroughput and NetworkTransmitThroughput metrics and monitoring them alongside EBSByteBalance%. This would help confirm it.
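As a rough sketch of that comparison, assuming you have already exported the three CloudWatch metrics as timestamp-to-value mappings (the export step, the function names, and the sample numbers below are all hypothetical), you could line them up like this:

```python
from typing import Dict, List, Tuple

# Each series maps a timestamp string to a metric value.
# Throughput is in bytes/second, EBSByteBalance% is a percentage (0-100).
Series = Dict[str, float]


def combine_network_throughput(receive: Series, transmit: Series) -> Series:
    """Sum receive and transmit throughput for timestamps present in both series."""
    shared = sorted(set(receive) & set(transmit))
    return {ts: receive[ts] + transmit[ts] for ts in shared}


def points_with_depleted_balance(
    total: Series, byte_balance: Series, threshold: float = 1.0
) -> List[Tuple[str, float]]:
    """Return (timestamp, total throughput) where the byte balance is at or below threshold."""
    return [
        (ts, total[ts])
        for ts in sorted(total)
        if ts in byte_balance and byte_balance[ts] <= threshold
    ]


# Made-up sample data, only to show the shape of the inputs.
receive = {"12:00": 1_000_000.0, "12:01": 90_000_000.0, "12:02": 95_000_000.0}
transmit = {"12:00": 500_000.0, "12:01": 40_000_000.0, "12:02": 45_000_000.0}
balance = {"12:00": 99.0, "12:01": 35.0, "12:02": 0.0}

total = combine_network_throughput(receive, transmit)
for ts, throughput in points_with_depleted_balance(total, balance):
    print(f"{ts}: {throughput / 1_000_000:.1f} MB/s while byte balance is depleted")
```

If the combined throughput flattens out at roughly the same value every time the balance hits zero, that plateau is a reasonable guess at the throttled rate.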

On a side note, I'm not sure how you're getting 4000 IOPS. According to the docs, you should be getting between 1800 and 3000 IOPS - https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Storage.html.
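For reference, the 1800 to 3000 range follows from the standard gp2 sizing rule as I understand it (3 IOPS per GiB, floor of 100, cap of 16,000, with bursting to 3,000 for smaller volumes). This is a sketch of that rule only; RDS may provision storage differently under the hood, so treat the constants as assumptions to check against the linked docs.

```python
GP2_IOPS_PER_GIB = 3
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 16_000
GP2_BURST_IOPS = 3_000


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a single gp2 volume of the given size."""
    return min(max(GP2_IOPS_PER_GIB * size_gib, GP2_MIN_IOPS), GP2_MAX_IOPS)


def gp2_iops_range(size_gib: int) -> tuple:
    """Return (baseline, peak) IOPS; peak is the burst level if it exceeds the baseline."""
    baseline = gp2_baseline_iops(size_gib)
    return baseline, max(baseline, GP2_BURST_IOPS)


print(gp2_iops_range(600))  # (1800, 3000)
```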

Saurabh
answered 2 years ago
1

I am in a similar boat, using a db.t3.medium and experiencing regular EBS byte balance exhaustion. I have a 1 TiB disk and it still throttles to a crawl after exhaustion. @Saurabh's note is definitely worth looking into:

I would recommend summing the NetworkReceiveThroughput and NetworkTransmitThroughput metrics and monitoring them alongside EBSByteBalance%. This would help confirm it.

I see a direct correlation: NetworkReceiveThroughput and NetworkTransmitThroughput spike as EBSByteBalance% plummets to zero.

I am curious, though, whether you found a suitable resolution, perhaps with another instance type or by adjusting how you use your RDS instance?

answered 2 years ago
0

I am also having the same problem. I upgraded the instance to a db.m6i.large to perform the task I need, and I also changed the EBS storage to a 400 GB gp3 volume, which was said not to have a burst I/O limit or something like that. I keep running into the same problem anyway, and I am paying an exorbitant amount for something that still doesn't work!

The funny thing is that they say it is the network, but even though I am moving a lot of data, the peak simultaneous network use according to the metrics is 2.2 MBps. A 10/100 network would handle that, yet a network that is said to be 2.5 Gb can't handle it without maxing out (if that's the case)?

It is sad to have this kind of problem, google for 2 days, and not find the solution anywhere. If I had known the problems this was going to cause, I would have done this processing locally and just replaced the database! I will never process large amounts of data in RDS again; it's not worth it!

answered a year ago
-2

An exhausted burst balance limits the database's read/write operations to the baseline performance of the volume; this could absolutely impair an I/O-intensive workload.

There is a CloudWatch metric called Burst Balance that you can use to monitor the burst credits. This is what you should be looking at, as I suspect it gets exhausted very quickly and does not replenish until the workload I/O falls below the baseline IOPS of the volume. If you increase the volume size to 1 TiB or greater, the baseline volume IOPS will exceed the burst IOPS and eliminate burst as a bottleneck.
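To give a feel for how quickly the burst credits can run out, here is a back-of-the-envelope sketch. It assumes the commonly documented gp2 credit model (a bucket of 5.4 million I/O credits, refilled at the baseline rate, spent at the actual I/O rate, burst capped at 3,000 IOPS); the function names are made up, and the numbers should be verified against the linked storage docs.

```python
from typing import Optional

GP2_CREDIT_BUCKET = 5_400_000  # assumed maximum I/O credit balance
GP2_BURST_IOPS = 3_000         # assumed burst ceiling


def gp2_baseline_iops(size_gib: int) -> int:
    """Assumed gp2 rule: 3 IOPS per GiB, clamped to [100, 16000]."""
    return min(max(3 * size_gib, 100), 16_000)


def seconds_until_burst_exhausted(
    size_gib: int, sustained_iops: int = GP2_BURST_IOPS
) -> Optional[float]:
    """Time for a full credit bucket to drain, or None if it never drains.

    Credits drain at (demand - baseline) per second, where demand is
    capped at the burst ceiling.
    """
    baseline = gp2_baseline_iops(size_gib)
    demand = min(sustained_iops, GP2_BURST_IOPS)
    if demand <= baseline:
        return None  # workload fits within baseline; credits are not consumed
    return GP2_CREDIT_BUCKET / (demand - baseline)


print(seconds_until_burst_exhausted(600))   # 4500.0 (75 minutes at full burst)
print(seconds_until_burst_exhausted(1024))  # None (baseline already >= burst)
```

Under these assumptions a 600 GiB volume driven at full burst would drain its credits in about 75 minutes, while a 1 TiB volume has no burst bucket to drain, which matches the sizing advice above.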

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Storage.html#CHAP_Storage.IO.Credits

AWS
answered 2 years ago
  • The question was about EBS byte balance, not burst balance. We are having the same problem, but I am racking my brain trying to understand exactly what that metric measures, and hence what to do to prevent its depletion.
