ElastiCache Redis network allowance exceeded

We are seeing many network allowance exceeded events when reading from just one or two shards. The cluster runs in cluster mode with 18 shards. We suspect that a few large objects are triggering them. Is there a metric, or some other way, to find the size of the values stored on an ElastiCache for Redis node?

Asked 1 year ago, viewed 277 times
2 Answers

Hello,

Thank you for your query!

According to the AWS documentation, the metric 'NetworkBandwidthOutAllowanceExceeded' counts the packets queued or dropped because the outbound aggregate bandwidth exceeded the maximum for the instance.

[+] https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/CacheMetrics.HostLevel.html

A spike in this metric usually appears when heavy network traffic keeps the cluster or node above its baseline network limit, which eventually leads to network throttling. If the network bytes metrics show no visible spike, microbursting (short bursts too brief to show up in the per-minute metrics) may have taken place instead.

Regarding your question, unfortunately no metric directly reports the size of keys or values in Redis. However, you can look for big keys in the cluster, because any operation on them (read, write, evict, sync) uses more system resources. redis-cli has a --bigkeys option that samples the keyspace and reports the largest key it finds for each data type: strings by size in bytes, collections by number of elements.

$ redis-cli --bigkeys

[+] https://redis.io/docs/ui/cli/

In cluster mode enabled clusters, I suggest pointing redis-cli at the individual endpoints of the primary (master) nodes rather than at the cluster configuration endpoint, as shown below:

src/redis-cli -c -h <node_endpoint> -p 6379 --bigkeys
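Because --bigkeys ranks collections by element count rather than by bytes, a byte-oriented view can help when the concern is network volume. The following is a minimal sketch, not an official AWS or Redis tool: it assumes the redis-py client, a node that permits the SCAN and MEMORY USAGE commands, and a direct connection to one node endpoint at a time. The names largest_keys and connect are hypothetical. MEMORY USAGE reports the in-memory footprint, which only approximates the number of bytes sent over the network.

```python
import heapq
from typing import List, Optional, Tuple


def largest_keys(client, top_n: int = 20, scan_count: int = 1000,
                 samples: Optional[int] = None) -> List[Tuple[int, str]]:
    """Return up to top_n (size_in_bytes, key) pairs for one node, largest first.

    `client` is assumed to be a redis-py style client connected to a single
    node; only its scan_iter() and memory_usage() methods are used.
    samples=None keeps the server default for sampling nested elements.
    """
    heap: List[Tuple[int, str]] = []  # min-heap holding the current top_n
    for key in client.scan_iter(count=scan_count):
        size = client.memory_usage(key, samples=samples)
        if size is None:
            # Key expired or was deleted between SCAN and MEMORY USAGE.
            continue
        name = key.decode("utf-8", "replace") if isinstance(key, bytes) else str(key)
        if len(heap) < top_n:
            heapq.heappush(heap, (size, name))
        else:
            heapq.heappushpop(heap, (size, name))
    return sorted(heap, reverse=True)


def connect(host: str, port: int = 6379):
    """Hypothetical helper: plain (non-TLS) connection to one node endpoint."""
    import redis  # redis-py, imported lazily so largest_keys has no hard dependency
    return redis.Redis(host=host, port=port)
```

To use it, call largest_keys(connect("<node_endpoint>")) once per affected shard. SCAN walks the keyspace incrementally, but on a busy node it still adds load, so running it against a replica or during a quiet period is the safer choice.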

Thank you for your interest in re:Post community.

Have a great day!

AWS
Support Engineer
Answered 1 year ago

The best way to troubleshoot/understand these Network*AllowanceExceeded metrics is to determine what specific impact this has had on the application. Are you seeing timeout errors or visible slowness on your application matching the timestamp of these spikes?

Since TCP is a reliable transport protocol, dropped packets are retransmitted. This is intended behavior and happens independently for inbound and outbound traffic. Occasional spikes in these metrics are common. Sustained high values (10k/min or more) are meaningful only when NetworkBytesIn/NetworkBytesOut is also approaching the host's baseline network bandwidth.
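To judge whether a node is approaching its baseline, it can help to convert a per-minute byte count into an average rate. This is only a sketch of the arithmetic: it assumes the datapoint is a one-minute byte total, and baseline_gbps is a placeholder you would look up for your node type. Both function names are hypothetical.

```python
def average_gbps(bytes_per_minute: float) -> float:
    """Convert a one-minute byte total to an average rate in Gbps."""
    bits_per_second = bytes_per_minute * 8 / 60
    return bits_per_second / 1e9


def baseline_utilization(bytes_per_minute: float, baseline_gbps: float) -> float:
    """Fraction of the (assumed) baseline bandwidth used on average over the minute."""
    return average_gbps(bytes_per_minute) / baseline_gbps


# Example with made-up numbers: 4.5 GB sent in one minute against an
# assumed 0.75 Gbps baseline averages 0.6 Gbps, roughly 80% utilization.
utilization = baseline_utilization(4.5e9, baseline_gbps=0.75)
```

A sustained utilization near 1.0 alongside high allowance-exceeded counts points at a genuine bandwidth ceiling; a low utilization suggests short bursts instead.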

If no latency issues are observed or if the numbers are fairly low, then no further action is required.

Note: The NetworkBytesIn and NetworkBytesOut metrics are measured at per-minute granularity. Network traffic shaping, which produces non-zero NetworkBandwidthInAllowanceExceeded and NetworkBandwidthOutAllowanceExceeded values, happens at a much finer granularity (milliseconds). Small bursts of traffic will cause some traffic shaping even when the average bandwidth is well within limits. This can happen even during a single SET or GET operation on a larger item.
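The gap between the two granularities can be illustrated with a toy model. All numbers below are invented, and the per-millisecond allowance is a deliberate simplification rather than a description of how AWS actually shapes traffic: one minute of light background traffic plus a single large reply written out over 10 ms.

```python
def shaped_intervals(bytes_per_ms, allowance_bytes_per_ms):
    """Count the millisecond buckets whose traffic exceeds the per-ms allowance."""
    return sum(1 for b in bytes_per_ms if b > allowance_bytes_per_ms)


MS_PER_MINUTE = 60_000

# 1 KB per millisecond of steady background traffic for a whole minute.
traffic = [1_000] * MS_PER_MINUTE

# One 5 MB value returned over 10 ms, halfway through the minute.
for ms in range(30_000, 30_010):
    traffic[ms] += 500_000

# Hypothetical limit: 1 Gbps expressed as bytes per millisecond.
allowance = 125_000

average_per_ms = sum(traffic) / MS_PER_MINUTE
burst_buckets = shaped_intervals(traffic, allowance)
```

In this model the minute's average stays below 1% of the allowance, yet ten millisecond buckets exceed it, which is the pattern of non-zero allowance-exceeded counters with no visible spike in the byte metrics.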

AWS
Answered 1 month ago
