What can I do when an Amazon OpenSearch Service domain has been stuck in an update for more than a week?

Service Name: Amazon OpenSearch Service
Version Name: Elasticsearch 6.3
Service software version: updating to R20211203-P2 (latest)
Start time: 16/12/2021, 17:11:14 (UTC), checked from the Notifications tab
Cluster Status: green on 16-17/12, then red from 18/12 to 29/12. I then restored it from a healthy automated snapshot, and it has been green ever since.
Instance Status: processing
Data node: 2 (desired to be 1)

What can I do in this situation? The status is stuck in processing.
Is there any way to resolve it? I would like to upgrade from Elasticsearch 6.3 to 6.8 and then to OpenSearch.

Remarks:
I have another production instance that finished the service software update within 1 hour.

  • Follow-up on my question: I finally solved it by creating a new instance with the same Elasticsearch version, restoring the snapshot from the old instance onto the new one, and then removing the old one (which was stuck in processing). Removing an Elasticsearch instance that is still in the processing state turned out to be fine. The only change is the name of the instance (to reuse the old name, you might need to recreate the instance once more under that name).
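
The restore step in the follow-up above can be sketched roughly as follows. This only builds the request for the standard snapshot restore API (`POST /_snapshot/<repository>/<snapshot>/_restore`); it does not send anything. The repository name `my-manual-repo` and the snapshot name `snapshot-2021-12-17` are hypothetical placeholders, and the sketch assumes the snapshot lives in a repository that the new instance can read. Depending on the access policy, the actual request may also need to be signed.

```python
import json


def restore_request(repository, snapshot, indices):
    """Build (method, path, body) for restoring selected indices from a snapshot.

    repository and snapshot are placeholders for names that exist on your
    side; nothing here is sent over the network.
    """
    path = f"/_snapshot/{repository}/{snapshot}/_restore"
    body = json.dumps({
        # Restore only the listed indices (comma-separated).
        "indices": ",".join(indices),
        # Leave the cluster-wide state of the target instance untouched.
        "include_global_state": False,
    })
    return "POST", path, body


# Hypothetical names, for illustration only; "book" is the index from this thread.
method, path, body = restore_request("my-manual-repo", "snapshot-2021-12-17", ["book"])
print(method, path)
print(body)
```

Listing the indices explicitly, instead of restoring everything, avoids name clashes with indices that already exist on the new instance, such as `.kibana`.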

Asked 2 years ago · 1,733 views
1 Answer

Good question. Sorry to hear about your issues!

AWS recommends resolving the red cluster status before reconfiguring your OpenSearch Service domain. A red cluster status means that at least one primary shard and its replicas are not allocated to a node. OpenSearch Service keeps trying to take automated snapshots, but they fail for as long as the red cluster status persists.

The two most common causes are failed cluster nodes and the OpenSearch process crashing under a sustained heavy load. The following requests can help identify the issue:

  • GET /_cluster/allocation/explain
  • GET /_cat/indices?v
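
As a rough sketch of how the second request could be checked programmatically: the cat APIs also accept `format=json`, so the output of `GET /_cat/indices?format=json` can be filtered for indices that are not green. The sample data below is made up for illustration (the `orders-example` index is hypothetical), and fetching the response from the domain is left out.

```python
import json


def non_green_indices(cat_indices_json):
    """Return (index, health) pairs for every index whose health is not green.

    Expects the JSON text returned by GET /_cat/indices?format=json.
    """
    rows = json.loads(cat_indices_json)
    return [
        (row["index"], row.get("health", "unknown"))
        for row in rows
        if row.get("health") != "green"
    ]


# Illustrative sample only; "orders-example" is a hypothetical red index.
sample = """[
  {"health": "green", "status": "open", "index": ".kibana", "pri": "1", "rep": "1"},
  {"health": "green", "status": "open", "index": "book", "pri": "5", "rep": "0"},
  {"health": "red", "status": "open", "index": "orders-example", "pri": "5", "rep": "1"}
]"""

print(non_green_indices(sample))
```

Any index reported here is a candidate for `GET /_cluster/allocation/explain`, which describes why one of its shards is unassigned.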

From that info, you can try the following:

  • Delete the problematic index
  • Restore a snapshot
  • Delete documents from the index
  • Change the index settings
  • Reduce the number of replicas
  • Delete other indices to free up disk space.
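
One of the options above, reducing the number of replicas, goes through the index settings API (`PUT /<index>/_settings`). A minimal sketch that only builds the request follows; the index name `book` is taken from this thread, the target of 0 replicas is an assumption for illustration, and sending the request (including any signing the domain's access policy requires) is left out.

```python
import json


def replica_settings_request(index, replicas):
    """Build (method, path, body) for changing an index's replica count."""
    if replicas < 0:
        raise ValueError("replicas must be zero or greater")
    path = f"/{index}/_settings"
    body = json.dumps({"index": {"number_of_replicas": replicas}})
    return "PUT", path, body


# Example: drop the replicas of "book" to 0 (illustrative value).
method, path, body = replica_settings_request("book", 0)
print(method, path)
print(body)
```

Fewer replicas frees disk space and reduces the number of shards that must be allocated, at the cost of redundancy, so it is usually worth raising the count again once the cluster is healthy.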

For failed cluster nodes or more complex issues that aren't solvable by the above, you can reach out to AWS Support for more help as detailed here: https://docs.aws.amazon.com/opensearch-service/latest/developerguide/handling-errors.html

jsonc
answered 2 years ago
  • curl -XGET 'https://xx/_cat/indices?v'
    health status index   uuid pri rep docs.count docs.deleted store.size pri.store.size
    green  open   .kibana xx     1   1          2            0     18.7kb          9.3kb
    green  open   book    xx     5   0      52598          150     24.1mb         24.1mb

    curl -XGET 'https://xx/_cluster/allocation/explain?pretty'
    "reason" : "unable to find any unassigned shards to explain [ClusterAllocationExplainRequest[useAnyUnassignedShard=true,includeYesDecisions?=false]"

    Both suggested checks look normal at the moment, so I don't know why the instance is still stuck in processing.