How to Know Per Node Network Bandwidth Usage for AWS OpenSearch Instances

Question

Does, or can, AWS OpenSearch make the EC2 NetworkIn/NetworkOut metrics available to clients so we know how close we are to hitting network bandwidth limits? (ref: EC2 CloudWatch metrics)

I don't see any way to measure how close we are to hitting EC2 instance network bandwidth limits from the current OpenSearch managed domain metrics, other than "lagging" indicators such as CPU load, utilization, or memory.

Context

My team uses managed OpenSearch instance types and would like to know how much of our network bandwidth is being used (e.g. i4g.4xlarge, up to 25 Gbps), ideally at the node level.

Does or can AWS OpenSearch make the NetworkIn/NetworkOut EC2 metric available to clients to know how much network throughput is being used?

I see I/O and EBS throughput metrics being forwarded (e.g. ThroughputThrottle, ReadThroughput, WriteThroughput). Is there an existing way to get the per-instance/node NetworkIn/NetworkOut metrics from the underlying EC2 instances?

1 Answer
Based on the available information, Amazon OpenSearch Service does not directly expose the EC2-level NetworkIn/NetworkOut metrics for individual nodes in your OpenSearch cluster.

The OpenSearch Service metrics published to CloudWatch focus primarily on cluster health, search performance, shard status, CPU utilization, storage space, and various OpenSearch-specific metrics like ThreadCount, OpenContexts, and ConcurrentSearchRate. While OpenSearch Service does provide metrics related to I/O and EBS throughput (such as ThroughputThrottle, ReadThroughput, and WriteThroughput), there are no direct equivalents to the EC2 NetworkIn/NetworkOut metrics that would allow you to monitor network bandwidth usage at the node level.
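One way to check this for your own domain is to list what CloudWatch actually holds in the `AWS/ES` namespace and look for anything network-related. The following is a minimal sketch, not an official procedure: the function name `network_metric_names` and the `"my-domain"` placeholder are hypothetical, and the CloudWatch client is passed in so the helper can be exercised without AWS credentials.

```python
from typing import Any, List


def network_metric_names(cloudwatch: Any, domain_name: str) -> List[str]:
    """Return names of AWS/ES metrics for a domain that contain 'network'.

    `cloudwatch` is expected to behave like boto3.client("cloudwatch").
    """
    paginator = cloudwatch.get_paginator("list_metrics")
    pages = paginator.paginate(
        Namespace="AWS/ES",  # namespace used by OpenSearch Service domains
        Dimensions=[{"Name": "DomainName", "Value": domain_name}],
    )
    names = set()
    for page in pages:
        for metric in page.get("Metrics", []):
            name = metric["MetricName"]
            if "network" in name.lower():
                names.add(name)
    return sorted(names)


# Usage (requires boto3 and AWS credentials; "my-domain" is a placeholder):
#   import boto3
#   print(network_metric_names(boto3.client("cloudwatch"), "my-domain"))
```

An empty result for your domain would be consistent with the statement above that no NetworkIn/NetworkOut equivalents are published.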

Without these network metrics, it is hard to determine how close your OpenSearch nodes are to their network bandwidth limits (for example, up to 25 Gbps on i4g.4xlarge instances). As you mentioned, you would have to rely on indirect, "lagging" indicators such as CPU utilization or memory pressure to infer network-related issues.

Some other AWS services, such as ElastiCache, publish network allowance metrics like NetworkBandwidthInAllowanceExceeded, NetworkBandwidthOutAllowanceExceeded, and NetworkPacketsPerSecondAllowanceExceeded, but these don't appear to be exposed for OpenSearch Service nodes.

If monitoring network bandwidth usage at the node level is critical for your use case, you might want to contact AWS Support to inquire if there are any plans to expose these metrics or if there are alternative approaches to monitor network utilization for OpenSearch Service domains.
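One possible partial workaround, offered as an assumption rather than a confirmed AWS recommendation, is to sample the OpenSearch node stats API (`GET _nodes/stats/transport`) twice and turn the cumulative `rx_size_in_bytes` / `tx_size_in_bytes` counters into a per-node rate. These counters cover only inter-node transport traffic, not HTTP client traffic or everything crossing the instance's network interface, so the result is at best a lower bound. The sketch below assumes your domain returns the `transport` section in that response; `transport_gbps` is a hypothetical helper name.

```python
from typing import Any, Dict


def transport_gbps(
    before: Dict[str, Any], after: Dict[str, Any], interval_s: float
) -> Dict[str, Dict[str, float]]:
    """Per-node transport rx/tx rate in Gbps between two _nodes/stats snapshots.

    `before` and `after` are parsed JSON bodies of GET _nodes/stats/transport,
    taken `interval_s` seconds apart.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    rates: Dict[str, Dict[str, float]] = {}
    for node_id, node_after in after.get("nodes", {}).items():
        node_before = before.get("nodes", {}).get(node_id)
        if node_before is None:
            continue  # node joined between samples; no baseline
        t0, t1 = node_before["transport"], node_after["transport"]
        rx = t1["rx_size_in_bytes"] - t0["rx_size_in_bytes"]
        tx = t1["tx_size_in_bytes"] - t0["tx_size_in_bytes"]
        if rx < 0 or tx < 0:
            continue  # counters reset, e.g. the node restarted
        # bytes -> bits, per second, then to gigabits
        rates[node_id] = {
            "rx_gbps": rx * 8 / interval_s / 1e9,
            "tx_gbps": tx * 8 / interval_s / 1e9,
        }
    return rates
```

Comparing these rates against the instance's advertised bandwidth (for example, up to 25 Gbps) gives only a rough picture, since client-facing traffic is not counted.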
Sources
Monitoring OpenSearch cluster metrics with Amazon CloudWatch - Amazon OpenSearch Service
Troubleshoot Network allowance exceeded metrics in ElastiCache | AWS re:Post

answered 14 days ago
EXPERT
reviewed 14 days ago