
Why does my Amazon EC2 instance exceed its network limits when my average utilization is low?


My Amazon Elastic Compute Cloud (Amazon EC2) instance's average network utilization is low, but the instance still exceeds its bandwidth or Packets Per Second (PPS) quota.

Short description

The bw_in_allowance_exceeded, bw_out_allowance_exceeded, or pps_allowance_exceeded Elastic Network Adapter (ENA) network performance metrics might increase even when your average utilization is low. The most common cause of this issue is short spikes in demand for network resources, called microbursts. Microbursts typically last only seconds, milliseconds, or even microseconds, so Amazon CloudWatch metrics aren't granular enough to reflect them. For example, you can use the NetworkIn and NetworkOut instance metrics in CloudWatch to calculate the average throughput per second. However, because of microbursts, the calculated rates might be lower than the available instance bandwidth for the instance type.

An increase in bw_in_allowance_exceeded and bw_out_allowance_exceeded metrics also occurs on smaller instances that have an "up to" bandwidth, such as "up to 10 gigabits per second (Gbps)." The smaller instances use network I/O credits to burst beyond their baseline bandwidth for a limited time. When the credits are depleted, the traffic aligns to the baseline bandwidth and the metrics increase. Because instance burst occurs on a best-effort basis, the metrics might increase even when your instance has available I/O credits.

An increase in the pps_allowance_exceeded metric also occurs when non-optimal traffic patterns cause packet drops at lower PPS rates. Asymmetric routing, outdated drivers, small packets, fragments, and connection tracking affect the PPS performance for a workload.

Resolution

Average calculation

CloudWatch samples Amazon EC2 metrics every 60 seconds to capture the total bytes or packets transferred in 1 minute. Amazon EC2 aggregates the samples and publishes them to CloudWatch in 5-minute periods. Because each 5-minute datapoint aggregates five 1-minute samples, each statistic in the period shows a different value.

When you use detailed monitoring, CloudWatch publishes the NetworkIn and NetworkOut metrics without aggregation in 1-minute periods. The values for Sum, Minimum, Average, and Maximum are the same, and the value for SampleCount is 1. CloudWatch always aggregates and publishes the NetworkPacketsIn and NetworkPacketsOut metrics in 5-minute periods.

Use the following methods to calculate the average throughput in bytes per second (Bps) or PPS in a period:

  • For a simple average in your specified time period, divide Sum by Period or by the timestamp difference between values (DIFF_TIME).
  • For an average in the minute with the highest activity, divide Maximum by 60 seconds.

To convert Bps into Gbps, divide the calculation results by 1,000,000,000 bytes, and then multiply them by 8 bits.
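As an illustration, you can retrieve both statistics with the AWS CLI. The instance ID and timestamps below are placeholders; replace them with your own values:

```shell
# Retrieve the Sum and Maximum statistics for NetworkOut over one 5-minute
# period. The instance ID and time window are placeholder values.
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name NetworkOut \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --start-time 2024-01-01T00:00:00Z \
  --end-time 2024-01-01T00:05:00Z \
  --period 300 \
  --statistics Sum Maximum
```

Divide the returned Sum by 300 seconds, or the Maximum by 60 seconds, to get the average throughput in Bps.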

Microbursts in CloudWatch metrics

The following example shows how a microburst appears in CloudWatch. The instance has a network bandwidth allowance of 10 Gbps and uses basic monitoring.

In one 60-second sample, an outbound data transfer of approximately 24 GB uses all available bandwidth. The transfer increases the bw_out_allowance_exceeded value and completes in approximately 20 seconds at an average speed of 9.6 Gbps. The instance doesn't send any other data and remains idle for the remaining 4 samples (240 seconds).

The average throughput in Gbps in a 5-minute period is much lower than the one during the microburst:

Formula: AVERAGE_Gbps = SUM(NetworkOut) / PERIOD(NetworkOut) / 1,000,000,000 bytes * 8 bits

SUM(NetworkOut) = (~24 GB * 1 sample) + (~0 GB * 4 samples) = ~24 GB

PERIOD(NetworkOut) = 300 seconds (5 minutes)

AVERAGE_Gbps = ~24 / 300 / 1,000,000,000 * 8 = ~0.64 Gbps

Even when you calculate the average throughput based on the highest sample, the amount still doesn't reflect the throughput during the microburst:

Formula: AVERAGE_Gbps = MAXIMUM(NetworkOut) / 60 seconds / 1,000,000,000 bytes * 8 bits

MAXIMUM(NetworkOut) = ~24 GB

AVERAGE_Gbps = ~24 GB / 60 / 1,000,000,000 * 8 = ~3.2 Gbps
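You can reproduce the arithmetic in this example with a quick awk calculation. The ~24 GB figure and the window lengths come from the scenario above:

```shell
# Convert the same ~24 GB transfer into Gbps over three observation windows.
awk 'BEGIN {
  sum_bytes = 24 * 1000000000                   # ~24 GB, in bytes
  printf "%.2f\n", sum_bytes / 300 * 8 / 1e9    # 5-minute average: 0.64 Gbps
  printf "%.1f\n", sum_bytes / 60 * 8 / 1e9     # busiest-minute average: 3.2 Gbps
  printf "%.1f\n", sum_bytes / 20 * 8 / 1e9     # actual 20-second burst: 9.6 Gbps
}'
```

The shorter the observation window, the closer the calculated rate gets to the bandwidth that the microburst actually consumed.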

When high-resolution data is available, you can get more accurate averages. When you collect operating system (OS) network usage metrics at 1-second intervals, the average throughput briefly reaches approximately 9.6 Gbps.

Monitor microbursts

You can use the CloudWatch agent on Linux and Windows to publish OS-level network metrics to CloudWatch at up to 1-second intervals. The agent can also publish ENA network performance metrics.

Note: High-resolution metrics have higher pricing.

You can also use OS tools to monitor network statistics at up to 1-second intervals. For Windows instances, use Performance Monitor. For Linux, use sar, nload, iftop, iptraf-ng, or netqtop.
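As a minimal sketch of 1-second monitoring, the following loop samples the same interface counters that sar and nload read. The interface name defaults to lo so that the sketch runs anywhere; set IFACE to your instance's ENA interface, for example eth0 or ens5:

```shell
#!/bin/sh
# Print per-second RX/TX byte rates from /proc/net/dev for one interface.
IFACE="${IFACE:-lo}"

read_bytes() {
  # After stripping the colon, field 1 is the interface name, field 2 is
  # cumulative RX bytes, and field 10 is cumulative TX bytes.
  awk -v ifc="$IFACE" '{ sub(":", " ") } $1 == ifc { print $2, $10 }' /proc/net/dev
}

prev=$(read_bytes)
i=0
while [ "$i" -lt 3 ]; do
  sleep 1
  cur=$(read_bytes)
  set -- $prev $cur
  echo "rx_Bps=$(( $3 - $1 )) tx_Bps=$(( $4 - $2 ))"
  prev=$cur
  i=$(( i + 1 ))
done
```

Per-second rates that approach the instance allowance while the CloudWatch average stays low indicate microbursts.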

To clearly identify microbursts, perform a packet capture of the OS, and then use Wireshark to plot an I/O graph at 1-millisecond intervals. For more information, see Download Wireshark and 8.8. The "I/O Graphs" window on the Wireshark website.

This method has the following limitations:

  • Network allowances are approximately proportionate at a microsecond level. For example, an instance type with a 10 Gbps bandwidth performance can send and receive about 10 megabits (Mb) in 1 millisecond.
  • Packet captures cause additional system load and might reduce the overall throughput and PPS rates.
  • Packet captures might not include all packets because of packet drops that a full buffer caused.
  • Timestamps don't accurately reflect when a network sent packets or when the ENA received them.
  • The I/O graphs might show lower activity for inbound traffic because Amazon EC2 shapes traffic that exceeds its quota before it reaches the instance.

Packet queues and drops

When the network queues a packet, the resulting latency is measured in milliseconds. TCP connections can scale their throughput and exceed the quotas of an EC2 instance type. As a result, some packet queuing is expected even when you use Bottleneck Bandwidth and Round-trip propagation time (BBR) or other congestion control algorithms that use latency as a signal. When the network drops a packet, TCP automatically retransmits the lost segments. Both packet queues and drops can result in higher latency and lower throughput, but these recovery actions aren't directly visible. Typically, the only errors that you can view occur when your application uses low timeouts, or when the network drops enough packets that the connection is forcibly closed.

The ENA network performance metrics don't differentiate between queued packets or dropped packets. To measure connection-level TCP latency on Linux, use the ss or tcprtt commands. To measure TCP retransmissions, use the ss or tcpretrans commands for connection-level statistics, and nstat for systemwide statistics. To download the tcprtt and tcpretrans tools that are part of the BPF Compiler Collection (BCC), see bcc on the GitHub website. You can also use packet captures to measure latency and retransmissions.

Note: Packets that the network dropped because of exceeded instance quotas don't appear in the drop counters for ip or ifconfig.
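For a quick systemwide view without extra tools, you can read the same counters that nstat reports directly from /proc/net/snmp. This is a sketch: a RetransSegs value that rises quickly relative to OutSegs points to drops rather than queuing.

```shell
# Print total and retransmitted TCP segments since boot. The first Tcp: line
# in /proc/net/snmp is the header; the second line holds the matching values.
awk '/^Tcp:/ {
  if (!seen) { for (i = 1; i <= NF; i++) col[$i] = i; seen = 1 }
  else printf "OutSegs=%s RetransSegs=%s\n", $col["OutSegs"], $col["RetransSegs"]
}' /proc/net/snmp
```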

Prevent microbursts

First, check ENA network performance metrics against your application's key performance indicators (KPIs) to determine the effect of packet queues or drops.

If the KPIs are below a required threshold, or you receive application log errors, then take the following actions to reduce packet queues and drops:

  • Scale up: Increase the instance size to an instance that has a higher network allowance. Instance types with an "n", such as C7gn, have higher network allowances.
  • Scale out: Spread traffic across multiple instances to reduce traffic and contention at individual instances.

For Linux-based operations, you can also implement the following strategies to avoid microbursts. It's a best practice to test the strategies in a test environment to verify that they reduce traffic shaping without negative effects on the workload.

Note: The following strategies are only for outbound traffic.

SO_MAX_PACING_RATE

Use the SO_MAX_PACING_RATE socket option to specify a maximum pacing rate in Bps for a connection. The Linux kernel then introduces delays between packets from the socket so that the throughput doesn't exceed the quota that you specify.

To use this method, you must implement the following changes:

  • Application code changes.
  • Support from the kernel. For more information, see net: introduce SO_MAX_PACING_RATE on the GitHub website.
  • Fair queue (FQ) queuing discipline or the kernel's support for pacing at the TCP layer (for TCP only).

For more information, see getsockopt(2) - Linux manual page and tc-fq(8) - Linux manual page on the man7 website. Also, see tcp: internal implementation for pacing on the GitHub website.

qdiscs

Linux uses the default configuration of a pfifo_fast queuing discipline (qdisc) for each ENA queue to schedule packets. Use the fq qdisc to reduce traffic bursts from individual flows and regulate their throughput. Or, use fq_codel or cake to provide active queue management (AQM) capabilities that reduce network congestion and improve latency. For more information, see the tc(8) - Linux manual page on the man7 website.
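For example, a sketch of switching to fq. The interface name ens5 is a placeholder for your ENA interface, and the commands require root:

```shell
# Replace the root qdisc on the interface with fq to pace individual flows.
tc qdisc replace dev ens5 root fq
# Show the active qdisc and its statistics to confirm the change.
tc -s qdisc show dev ens5
```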

For TCP, activate Explicit Congestion Notification (ECN) on clients and servers. Then, combine ECN with a qdisc that can perform ECN Congestion Experienced (CE) marking. CE marks cause the OS to lower the throughput of a connection to reduce the latency and packet losses caused by an exceeded instance quota. To use this solution, you must configure the qdisc with a low CE threshold based on the average round-trip time (RTT) of your connections. It's a best practice to use this solution only when the average RTT doesn't vary much between connections, for example, when your instance handles traffic only within one Availability Zone.
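A hedged sketch of this setup, assuming an interface named ens5 and an example CE threshold of 500 microseconds that you tune against your measured average RTT (requires root):

```shell
# Request and accept ECN on TCP connections (0 = off, 1 = request and accept,
# 2 = accept only, which is the Linux default).
sysctl -w net.ipv4.tcp_ecn=1
# Apply fq_codel and mark packets with CE when queuing delay exceeds 500us.
tc qdisc replace dev ens5 root fq_codel ce_threshold 500us
```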

Because of performance issues, it's not a best practice to set up aggregated bandwidth shaping at the instance level.

Shallow Transmission (Tx) queues

Use shallow Tx queues to reduce PPS shaping. Byte queue limits (BQL) dynamically limit the number of in-flight bytes on Tx queues. To activate BQL, add ena.enable_bql=1 to your kernel command line in GRUB.

Note: You must have ENA driver version 2.6.1g or higher to use this solution. BQL is already activated on ENA drivers with Linux kernel versions that end with K.

For more information, see bql: Byte Queue Limits on the LWN.net website.
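For example, on Amazon Linux and other distributions that include grubby, you can append the parameter as follows (requires root and a reboot to take effect):

```shell
# Add ena.enable_bql=1 to the kernel command line for all installed kernels.
grubby --update-kernel=ALL --args="ena.enable_bql=1"
```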

When you use ENA Express, you must deactivate BQL to maximize the bandwidth.

You can also use ethtool to reduce the Tx queue length from its default of 1,024 packets. For more information, see ethtool(8) - Linux manual page on the man7 website.
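As a sketch, with ens5 as a placeholder interface name (changing ring sizes requires root and briefly resets the interface):

```shell
# Show the current and maximum ring sizes, then reduce the Tx ring from 1,024.
ethtool -g ens5
ethtool -G ens5 tx 512
```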

Related information

Amazon EC2 instance network bandwidth

AWS OFFICIAL | Updated 9 months ago
7 Comments

will the packets that will get dropped or queued due to the allowance pps_allowance_exceeded exceeding also be counted into the metric "NetworkPacketsIn"? Therefore I could see how many packets per second (PPS) were coming in for the allowance to exceed. This information from the production environment could be used to simulate load using iperf.

replied 2 years ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS EXPERT replied 2 years ago

will the packets that will get dropped or queued due to the allowance pps_allowance_exceeded exceeding also be counted into the metric "NetworkPacketsIn"?

Dropped packets do not count towards CloudWatch's Network* metrics. Queued packets do count, though, because they eventually go through.

AWS replied 2 years ago

Is it possible to differentiate whether packets are queued or dropped? Dropped seems like a more serious condition.

EXPERT replied a year ago

Is it possible to differentiate whether packets are queued or dropped?

For TCP connections, queuing results in higher Round Trip Time (RTT), and drops result in retransmissions. To monitor these, you must do it at OS level. The nstat tool displays many useful system-level SNMP counters for TCP. And with ss, you can obtain similar statistics at a socket level. On the fancier side, there are tools out there which leverage perf or eBPF for similar purposes. These days, I prefer tcpretrans and tcprtt from the BPF Compiler Collection (BCC).

Dropped seems like a more serious condition.

In a sense, yes. However, packets may be dropped at many layers within and beyond EC2 instances for many reasons, meaning that packet drops are normal and expected to an extent. Therefore, metrics like these shouldn't be used by themselves when measuring business impact. You must always correlate them with one or more Key Performance Indicators (KPIs) from your application to determine their true impact.

AWS replied a year ago

is there any value in watching the interface metrics for dropped packets at the "ifconfig" level like below:

    RX packets 395480  bytes 149047230 (142.1 MiB)
    RX errors 0  dropped 0  overruns 0  frame 0
    TX packets 274803  bytes 47960383 (45.7 MiB)
    TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Would that be a good indicator if packets are being dropped due to bandwidth allowances ( or other factors )

replied a year ago

is there any value in watching the interface metrics for dropped packets at the "ifconfig" level like below: Would that be a good indicator if packets are being dropped due to bandwidth allowances ( or other factors )

There may be value in monitoring those, but not because of bandwidth allowances. These counters reflect packet drops happening at the OS level for various reasons. On the Receive (Rx) side, packets dropped due to exceeding EC2 instance network allowances don't make it to the instance, so they aren't reflected as drops in ethtool or ip (note that ifconfig is obsolete). Likewise, on the Transmit (Tx) side, packets dropped by the OS don't make it to EC2. Information about these counters is available in Interface statistics and other sources. For Rx drops in particular, Red Hat has 3 articles covering this topic in detail (subscription required).

AWS replied a year ago