Implicit IO limits on Aurora

0

Hey folks,

We are testing an Aurora instance with benchmarks such as BenchBase’s Resourcestresser (cmu-db/benchbase: Multi-DBMS SQL Benchmarking Framework via JDBC).

After running the benchmark continuously for a long period, we suddenly noticed a rapid drop in throughput. We measured it on our side, and it is also reflected in the CloudWatch metrics.

[Screenshots: DBtune dashboard; CloudWatch]

I'm curious why this is happening. My initial hypothesis is that there is an implicit rate limit on disk IO operations, based on the sudden doubling of the IO latency.

Have you noticed anything like this, or do you know of something I am missing?

Cheers

3 Answers
1

Your instance type has both a baseline and a burst allowance for network bandwidth. The most likely explanation for the performance drop is that you ran out of burst bandwidth.

$ aws ec2 describe-instance-types --instance-types m6i.large --query "InstanceTypes[*].[NetworkInfo.NetworkCards]"
[
    [
        [
            {
                "NetworkCardIndex": 0,
                "NetworkPerformance": "Up to 12.5 Gigabit",
                "MaximumNetworkInterfaces": 3,
                "BaselineBandwidthInGbps": 0.781,
                "PeakBandwidthInGbps": 12.5
            }
        ]
    ]
]
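To compare these figures with CloudWatch, which reports the RDS network throughput metrics in bytes per second, it helps to convert the units. The sketch below is only arithmetic on the two values from the output above; the function name is illustrative, and it assumes decimal gigabits (10^9 bits).

```python
def gbps_to_bytes_per_sec(gbps: float) -> float:
    """Convert gigabits per second (decimal, 10^9 bits) to bytes per second."""
    return gbps * 1e9 / 8


# Values copied from the describe-instance-types output for m6i.large.
baseline = gbps_to_bytes_per_sec(0.781)
peak = gbps_to_bytes_per_sec(12.5)

print(f"baseline: {baseline / 1e6:.1f} MB/s")    # baseline: 97.6 MB/s
print(f"peak:     {peak / 1e6:.1f} MB/s")        # peak:     1562.5 MB/s
print(f"peak / baseline: {peak / baseline:.1f}x")  # peak / baseline: 16.0x
```

So once the burst allowance is exhausted, the available bandwidth can fall to roughly one sixteenth of the peak, which would be consistent with a sudden drop in throughput.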
AWS
MODERATOR
answered 2 years ago
  • Hello phil. In this case, which monitoring metric would confirm that this is what's throttling the Aurora performance?
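One possible way to check the burst hypothesis, offered as a rough sketch rather than an official method: take a throughput series in bytes per second (for example from the CloudWatch network throughput metrics) and look for the point where it falls from well above the baseline to a plateau near it. The function name, the 10% tolerance, and the three-sample minimum are all hypothetical choices for illustration.

```python
from typing import Optional, Sequence


def first_sustained_drop(
    samples: Sequence[float],
    baseline: float,
    tolerance: float = 0.10,  # hypothetical band around the baseline
    min_run: int = 3,         # hypothetical number of consecutive samples
) -> Optional[int]:
    """Return the index where the series first settles within `tolerance`
    of `baseline` for `min_run` consecutive samples, after having been
    above that band at least once. Return None if that never happens."""
    upper = baseline * (1 + tolerance)
    lower = baseline * (1 - tolerance)
    seen_burst = False
    run_start = None
    for i, value in enumerate(samples):
        if value > upper:
            seen_burst = True   # still bursting above the baseline band
            run_start = None
        elif seen_burst and lower <= value <= upper:
            if run_start is None:
                run_start = i
            if i - run_start + 1 >= min_run:
                return run_start
        else:
            run_start = None    # below the band, or no burst seen yet
    return None


# Made-up series: bursting at ~300 MB/s, then pinned near 97.6 MB/s.
series = [300e6, 310e6, 305e6, 98e6, 97e6, 99e6, 96e6]
print(first_sustained_drop(series, baseline=97.625e6))  # 3
```

A plateau like this only suggests throttling at the baseline; it does not prove it, and other bottlenecks can produce a similar shape.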

0

Hey folks, thanks to both of you for the answers. @Giovani I am already aware of those metrics, and none of them explained the drop, which is why I asked here; it looks like an implicit restriction somewhere.

@philaws that sounds like a solid explanation. If I understand correctly, since we tested Aurora with a benchmark running on an EC2 instance, you think the limit was on the EC2 side rather than in Aurora, correct?

answered 2 years ago
  • Yes, Aurora is running on a db.m6i.large, which is no different from an EC2 m6i.large instance. They have the same specs.

0

A sudden drop in throughput for your Aurora instance could be due to resource constraints, increased traffic, memory issues, or network issues. You can check various CloudWatch metrics to diagnose the issue, including:

CPUUtilization: The percentage of CPU utilization.

FreeableMemory: The amount of available random access memory.

SwapUsage: The swap space used on the DB instance.

ReadIOPS, WriteIOPS: The average number of disk I/O operations per second.

NetworkReceiveThroughput, NetworkTransmitThroughput: The incoming (received) and outgoing (transmitted) network traffic on the DB instance.
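The metrics listed above can be pulled programmatically for the window around the drop. The following is a minimal sketch assuming boto3 is installed and credentials are configured; the helper names, the three-hour window, and the 60-second period are illustrative choices, and `"my-aurora-instance"` is a placeholder identifier.

```python
from datetime import datetime, timedelta, timezone

# Metric names as listed in the answer above.
METRICS = [
    "CPUUtilization", "FreeableMemory", "SwapUsage",
    "ReadIOPS", "WriteIOPS",
    "NetworkReceiveThroughput", "NetworkTransmitThroughput",
]


def build_query(metric, instance_id, end, hours=3, period=60):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    return {
        "Namespace": "AWS/RDS",
        "MetricName": metric,
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,           # seconds per datapoint
        "Statistics": ["Average"],
    }


def fetch_all(instance_id, end=None):
    """Fetch every metric in METRICS, sorted by timestamp (needs AWS access)."""
    import boto3  # imported lazily so build_query works without boto3

    cloudwatch = boto3.client("cloudwatch")
    end = end or datetime.now(timezone.utc)
    results = {}
    for metric in METRICS:
        response = cloudwatch.get_metric_statistics(
            **build_query(metric, instance_id, end)
        )
        results[metric] = sorted(
            response["Datapoints"], key=lambda point: point["Timestamp"]
        )
    return results


# Usage (placeholder instance identifier, requires AWS credentials):
# data = fetch_all("my-aurora-instance")
```

Plotting these series on a shared time axis makes it easier to see which metric changes first when the throughput drops.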

EXPERT
answered 2 years ago
