Possible Causes
- Resource Constraints: If the single node does not have sufficient CPU, memory, or disk resources, it might not be able to handle the load during peak times, leading to missing metrics.
- Java Garbage Collection (GC) Pauses: Long GC pauses can cause the node to become unresponsive temporarily. This can result in missing data metrics during the pause period.
- Network Issues: Network interruptions or latency spikes can cause the node to temporarily lose connection, leading to gaps in metrics.
- High Query Load: A high number of simultaneous queries or write operations can overwhelm the node, leading to dropped or delayed metric collection.
- OpenSearch Node Restarts: Automatic or manual restarts of the OpenSearch node can cause temporary unavailability, leading to missing metrics.
- Cluster Configuration: Misconfigurations in the cluster settings might lead to issues with node performance or stability.
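To narrow down which of these causes applies, it can help to pin the exact windows where datapoints are missing and then compare them against restart events, GC activity, or load spikes. The following is a minimal sketch, not a definitive implementation: `find_gaps` and `fetch_timestamps` are hypothetical helper names, the 60-second period is an assumption, and the fetch assumes the domain publishes metrics to the `AWS/ES` CloudWatch namespace with `DomainName` and `ClientId` dimensions and that `boto3` is installed with credentials configured.

```python
from datetime import datetime, timedelta, timezone


def find_gaps(timestamps, period_seconds=60):
    """Return (gap_start, gap_end) pairs for missing datapoints.

    gap_start is the first expected datapoint that is missing and
    gap_end is the timestamp at which data resumes.
    """
    ordered = sorted(timestamps)
    period = timedelta(seconds=period_seconds)
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        # More than one period between neighbours means datapoints were skipped.
        if current - previous > period:
            gaps.append((previous + period, current))
    return gaps


def fetch_timestamps(domain_name, account_id, metric_name, start, end,
                     period_seconds=60):
    """Fetch datapoint timestamps for one domain metric (hypothetical helper).

    Assumption: metrics live in the AWS/ES namespace. A single call returns
    a limited number of datapoints, so keep the time window short.
    """
    import boto3  # imported lazily so find_gaps works without boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/ES",
        MetricName=metric_name,
        Dimensions=[
            {"Name": "DomainName", "Value": domain_name},
            {"Name": "ClientId", "Value": account_id},
        ],
        StartTime=start,
        EndTime=end,
        Period=period_seconds,
        Statistics=["Average"],
    )
    return [point["Timestamp"] for point in response["Datapoints"]]
```

If the gaps line up across several metrics (for example CPU and JVM memory pressure at the same time), a restart or connectivity loss is more plausible than a problem with a single metric; if only one metric has gaps, the other causes deserve a closer look.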
I suspect an OpenSearch node restart or network interruptions, but I am not able to confirm either from the events, metrics, and logs. How can I find the actual root cause? Could you please guide me?
You can check the OpenSearch logs, if they are enabled for your cluster: https://docs.aws.amazon.com/opensearch-service/latest/developerguide/createdomain-configure-slow-logs.html
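To see whether log publishing is actually enabled for the domain before searching for log entries, something like the sketch below may help. It is an illustration under assumptions: `enabled_log_types` and `describe_enabled_logs` are hypothetical helper names, and it assumes `boto3` is installed with permission to call `describe_domain` on the `opensearch` client.

```python
def enabled_log_types(domain_status):
    """Return the sorted log types marked as enabled in a DomainStatus dict."""
    options = domain_status.get("LogPublishingOptions") or {}
    return sorted(
        name for name, config in options.items() if config.get("Enabled")
    )


def describe_enabled_logs(domain_name):
    """Look up a domain and list its enabled log types (hypothetical helper)."""
    import boto3  # imported lazily so enabled_log_types works without boto3

    client = boto3.client("opensearch")
    status = client.describe_domain(DomainName=domain_name)["DomainStatus"]
    return enabled_log_types(status)
```

If the result does not include `ES_APPLICATION_LOGS` (the error logs), node-level errors are probably not being published to CloudWatch Logs, and enabling that log type first would be a reasonable next step.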
Please accept the answer if it was useful.