
OpenSearch cluster is unresponsive


Since yesterday our OpenSearch cluster has had 0 active data nodes and has been totally unresponsive (no response to curl, no working dashboard), yet it is stuck showing its normal cluster health status. We tried to increase the number of data nodes from 1 to 2 to help with any load, but the change has been stuck at 20% for more than 14 hours.

  • Same here. Two OpenSearch clusters in two projects. In region eu-central-1 there were some breakdowns and issues today; in region eu-west-1 there has been a huge problem for several hours: the cluster's health is "red", 1 node is active, and it is stuck applying changes. Dashboards are unavailable (502 Gateway Error).

Asked a month ago, 41 views
2 Answers

Based on your description, your OpenSearch cluster is experiencing a serious issue with 0 active data nodes, unresponsiveness, and a configuration change stuck at 20% for over 14 hours.

This is a common problem that can occur with Amazon OpenSearch Service. When configuration changes get stuck in the "Processing" or "Modifying" state (showing as 20% complete in your case), it typically indicates an underlying issue with the cluster.
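To see which stage a stuck change is actually sitting in, the change progress can be inspected programmatically. The following is a minimal sketch, assuming boto3 is installed and credentials are configured; the helper names `summarize_change_progress` and `fetch_change_progress` are hypothetical, and only the first one runs without AWS access.

```python
from typing import Any


def summarize_change_progress(response: dict[str, Any]) -> list[str]:
    """Turn a DescribeDomainChangeProgress response into readable lines."""
    status = response.get("ChangeProgressStatus", {})
    lines = [f"overall: {status.get('Status', 'UNKNOWN')}"]
    # Each stage entry carries a name and its own status.
    for stage in status.get("ChangeProgressStages", []):
        lines.append(f"{stage.get('Name', '?')}: {stage.get('Status', '?')}")
    return lines


def fetch_change_progress(domain_name: str) -> dict[str, Any]:
    """Query the service. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the summary helper works without boto3

    client = boto3.client("opensearch")
    return client.describe_domain_change_progress(DomainName=domain_name)
```

Typical use would be `print("\n".join(summarize_change_progress(fetch_change_progress("my-domain"))))`, where `my-domain` is a placeholder for your domain name.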

Here are some steps to address this:

  1. Check CloudWatch metrics for your domain to identify potential causes like storage issues or high CPU usage. These metrics might reveal if your nodes were experiencing resource constraints before they became unresponsive.

  2. Review the AWS knowledge center article about OpenSearch domains stuck in the processing state, which provides specific troubleshooting steps for this scenario.

  3. If the above doesn't help, you'll need to create a ticket with AWS Technical Support. This is especially important if you need to recover data from the unresponsive cluster.

  4. As a last resort, some users have resolved similar issues by creating a new OpenSearch instance and deleting the stuck one. However, this approach would result in data loss if you haven't backed up your indices.
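Step 1 can be sketched in code. This is an illustration only, assuming boto3 and valid credentials; the function names and the metric selection are my assumptions, not part of the answer above. OpenSearch Service publishes its domain metrics under the `AWS/ES` namespace with `DomainName` and `ClientId` (the AWS account ID) as dimensions.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional

# Metrics that tend to be relevant for storage, CPU, and node-loss problems.
METRICS = ["ClusterStatus.red", "Nodes", "FreeStorageSpace",
           "CPUUtilization", "JVMMemoryPressure"]


def build_metric_query(domain_name: str, account_id: str, metric_name: str,
                       hours: int = 24,
                       now: Optional[datetime] = None) -> dict[str, Any]:
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/ES",
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "DomainName", "Value": domain_name},
            {"Name": "ClientId", "Value": account_id},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,  # 5-minute buckets
        "Statistics": ["Minimum", "Maximum"],
    }


def fetch_metric(query: dict[str, Any]) -> list[dict[str, Any]]:
    """Run the query. Requires boto3 and AWS credentials."""
    import boto3  # lazy import keeps build_metric_query usable offline

    cloudwatch = boto3.client("cloudwatch")
    datapoints = cloudwatch.get_metric_statistics(**query)["Datapoints"]
    return sorted(datapoints, key=lambda p: p["Timestamp"])
```

Looping over `METRICS` and calling `fetch_metric(build_metric_query(...))` for each one gives a timeline of the hours before the nodes dropped out.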

The fact that you tried to increase data nodes from 1 to 2 during an already problematic state might have complicated the situation, as making configuration changes to an unhealthy cluster can sometimes exacerbate issues.
Sources
OpenSearch Configuration Change Stuck at 20% and 503 Error Responses | AWS re:Post
Amazon OpenSearch Service Stuck at Modifying | AWS re:Post

Answered a month ago

In practice, you should never reduce your data nodes to zero. With other AWS server clusters you can scale the nodes to zero and still keep the cluster working, but that does not work in AWS OSS. I have personally experienced this. There is no specific documentation available, but this comes from experience with an actual scenario.
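One way to act on this is to guard any scaling script so it refuses a data node count below one. The sketch below is only an illustration of that idea, assuming boto3 and credentials for the actual call; `validate_instance_count` and `scale_data_nodes` are hypothetical helper names, and the minimum of 1 reflects the experience described above rather than documented behavior.

```python
from typing import Any


def validate_instance_count(count: int) -> int:
    """Reject data node counts below 1 before they reach the service."""
    if count < 1:
        raise ValueError(f"data node count must be at least 1, got {count}")
    return count


def scale_data_nodes(domain_name: str, count: int) -> dict[str, Any]:
    """Request a new data node count. Requires boto3 and AWS credentials."""
    import boto3  # lazy import so the validator can be used without boto3

    client = boto3.client("opensearch")
    return client.update_domain_config(
        DomainName=domain_name,
        ClusterConfig={"InstanceCount": validate_instance_count(count)},
    )
```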

Answered a month ago
