OpenSearch cluster is unresponsive


Since yesterday our OpenSearch cluster has had 0 active data nodes and is totally unresponsive (no curl, no working dashboard), yet it is stuck showing its normal cluster health status. We tried to increase the number of data nodes from 1 to 2 to help with any load, but the change has been stuck at 20% for more than 14 hours.

  • Same here, with two OpenSearch clusters in two projects. In region eu-central-1 there were some breakdowns and issues today; in region eu-west-1 there has been a huge problem for several hours: the cluster's health is "red", 1 node is active, and it is stuck applying changes. Dashboards are unavailable (502 Gateway Error).

Asked 1 month ago, viewed 39 times
2 Answers

Based on your description, your OpenSearch cluster is experiencing a serious issue with 0 active data nodes, unresponsiveness, and a configuration change stuck at 20% for over 14 hours.

This is a common problem that can occur with Amazon OpenSearch Service. When configuration changes get stuck in the "Processing" or "Modifying" state (showing as 20% complete in your case), it typically indicates an underlying issue with the cluster.
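To see which stage a stuck change is sitting in, the domain's change progress can be queried. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. `summarize_change_progress` and `fetch_change_progress` are hypothetical helper names, and the assumption that a finished stage reports a `Status` of `COMPLETED` should be checked against what your domain actually returns.

```python
def summarize_change_progress(response):
    """Condense a DescribeDomainChangeProgress-style response into a small dict.

    Hypothetical helper: missing fields fall back to defaults instead of raising.
    """
    progress = response.get("ChangeProgressStatus", {})
    stages = progress.get("ChangeProgressStages", [])
    # Assumption: a finished stage reports Status == "COMPLETED", so the first
    # stage with any other status is where the change is currently sitting.
    current = next((s for s in stages if s.get("Status") != "COMPLETED"), None)
    return {
        "status": progress.get("Status", "UNKNOWN"),
        "total_stages": progress.get("TotalNumberOfStages", len(stages)),
        "current_stage": current.get("Name") if current else None,
        "pending": progress.get("PendingProperties", []),
    }


def fetch_change_progress(domain_name, region):
    """Call the OpenSearch Service API; needs boto3 and valid credentials."""
    import boto3  # imported lazily so the summary helper works without boto3

    client = boto3.client("opensearch", region_name=region)
    return client.describe_domain_change_progress(DomainName=domain_name)
```

For example, `summarize_change_progress(fetch_change_progress("my-domain", "eu-west-1"))` (with `my-domain` as a placeholder) would show the overall status and the stage that has not finished, which is useful detail to include in a support ticket.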

Here are some steps to address this:

  1. Check CloudWatch metrics for your domain to identify potential causes like storage issues or high CPU usage. These metrics might reveal if your nodes were experiencing resource constraints before they became unresponsive.

  2. Review the AWS knowledge center article about OpenSearch domains stuck in the processing state, which provides specific troubleshooting steps for this scenario.

  3. If the above doesn't help, you'll need to create a ticket with AWS Technical Support. This is especially important if you need to recover data from the unresponsive cluster.

  4. As a last resort, some users have resolved similar issues by creating a new OpenSearch instance and deleting the stuck one. However, this approach would result in data loss if you haven't backed up your indices.
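The CloudWatch check in step 1 can be sketched as follows. This is an illustration under assumptions, not a definitive script: it assumes boto3 and configured credentials, the domain name and account ID are placeholders you supply, and the choice of metrics and statistics is my own selection from the `AWS/ES` namespace. `build_metric_queries` and `fetch_metrics` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone

# Assumed selection of AWS/ES metrics that often explain an unresponsive domain,
# mapped to the statistic that surfaces the worst case for each.
METRICS = {
    "ClusterStatus.red": "Maximum",
    "Nodes": "Minimum",
    "FreeStorageSpace": "Minimum",
    "CPUUtilization": "Maximum",
    "JVMMemoryPressure": "Maximum",
}


def build_metric_queries(domain_name, account_id, hours=24, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics calls."""
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    return [
        {
            "Namespace": "AWS/ES",
            "MetricName": name,
            # OpenSearch Service metrics are keyed by domain name and account ID.
            "Dimensions": [
                {"Name": "DomainName", "Value": domain_name},
                {"Name": "ClientId", "Value": account_id},
            ],
            "StartTime": start,
            "EndTime": end,
            "Period": 300,  # 5-minute buckets
            "Statistics": [stat],
        }
        for name, stat in METRICS.items()
    ]


def fetch_metrics(domain_name, account_id, region, hours=24):
    """Run the queries; needs boto3 and valid credentials."""
    import boto3  # imported lazily so the query builder works without boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    results = {}
    for query in build_metric_queries(domain_name, account_id, hours):
        datapoints = cloudwatch.get_metric_statistics(**query)["Datapoints"]
        results[query["MetricName"]] = sorted(datapoints, key=lambda d: d["Timestamp"])
    return results
```

Looking at the hours before the outage, a `FreeStorageSpace` minimum near zero or sustained high `JVMMemoryPressure` would point to the kind of resource constraint mentioned in step 1.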

The fact that you tried to increase data nodes from 1 to 2 during an already problematic state might have complicated the situation, as making configuration changes to an unhealthy cluster can sometimes exacerbate issues.
Sources
OpenSearch Configuration Change Stuck at 20% and 503 Error Responses | AWS re:Post
Amazon OpenSearch Service Stuck at Modifying | AWS re:Post

Answered 1 month ago

In practice, you should not reduce your data nodes to zero. Unlike other AWS server clusters, where you can set the node count to zero and the cluster keeps working, this does not work in AWS OSS. I have experienced this personally. There is no specific documentation available, but this is experience from an actual scenario.

Answered 1 month ago
