What can I do when an Amazon OpenSearch instance has been stuck in an update for more than a week?

Service Name: Amazon OpenSearch Service
Version Name: Elasticsearch 6.3
Service software version: updating to R20211203-P2 (latest)
Start time: 16/12/2021, 17:11:14 (UTC), checked from the Notifications tab
Cluster Status: green on 16-17/12, then red from 18-29/12; after I restored it from a healthy automated snapshot, it has been green ever since
Instance Status: processing
Data nodes: 2 (desired: 1)

What can I do in this situation? The status is stuck in processing.
Is there any way to resolve it? I would like to upgrade from Elasticsearch 6.3 to 6.8 and then to OpenSearch.

Remarks:
I have another production instance that finished the service software update within 1 hour.

  • Follow-up on my question: I finally solved it by creating a new instance with the same Elasticsearch version. I then restored the snapshot from the old instance to the new one and removed the old one (which was stuck in processing). It turned out to be fine to delete an Elasticsearch instance while it is in processing. The only change is the name of the instance (you might need to recreate it once more if you want to reuse the old one's name).

asked 2 years ago, 1716 views
1 Answer

Good question. Sorry to hear about your issues!

AWS recommends resolving the red cluster status before reconfiguring your OpenSearch Service domain. A red cluster status means that at least one primary shard and its replicas are not allocated to a node. OpenSearch Service keeps trying to take automated snapshots, but they fail while the red cluster status persists.

The two most common causes are failed cluster nodes and the OpenSearch process crashing under a sustained heavy load. The following requests can help identify the issue:

  • GET /_cluster/allocation/explain
  • GET /_cat/indices?v
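
If the domain has many indices, the second request is easier to scan as JSON. Here is a minimal Python sketch, assuming the response of GET /_cat/indices?format=json has already been fetched and is available as text. The function name non_green_indices and the sample rows are made up for illustration; they are not output from the domain in the question.

```python
import json
from typing import List, Tuple


def non_green_indices(cat_indices_json: str) -> List[Tuple[str, str]]:
    """Return (index, health) pairs for every index whose health is not green.

    Expects the JSON text returned by GET /_cat/indices?format=json,
    which is a list of objects with keys such as "index" and "health".
    """
    rows = json.loads(cat_indices_json)
    problems = []
    for row in rows:
        # Closed indices may report no health value, so fall back to "unknown".
        health = row.get("health") or "unknown"
        if health != "green":
            problems.append((row["index"], health))
    return problems


# Hypothetical sample response, for illustration only.
sample = """
[
  {"health": "green", "status": "open", "index": ".kibana", "pri": "1", "rep": "1"},
  {"health": "red", "status": "open", "index": "logs-example", "pri": "5", "rep": "1"}
]
"""

for name, health in non_green_indices(sample):
    print(f"{name}: {health}")
```

Any index this reports as red or yellow is a candidate to check with the allocation explain request.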

Depending on what those requests show, you can try one of the following:

  • Delete the problematic index
  • Restore a snapshot
  • Delete documents from the index
  • Change the index settings
  • Reduce the number of replicas
  • Delete other indices to free up disk space.
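
As one example of the options above, reducing replicas is done through the index settings API (PUT /<index>/_settings with index.number_of_replicas). Below is a minimal Python sketch that only builds the request and does not send it. The endpoint, the index name logs-example and the helper name build_replica_update are hypothetical, and the sketch assumes the domain's access policy accepts unsigned requests (domains that use IAM-based access need SigV4-signed requests instead).

```python
import json
import urllib.parse
import urllib.request


def build_replica_update(endpoint: str, index: str, replicas: int) -> urllib.request.Request:
    """Build (but do not send) a PUT /<index>/_settings request
    that sets index.number_of_replicas to the given value."""
    if replicas < 0:
        raise ValueError("replicas must be zero or greater")
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    # Quote the index name so special characters cannot alter the URL path.
    url = f"{endpoint.rstrip('/')}/{urllib.parse.quote(index, safe='')}/_settings"
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical endpoint and index name, for illustration only.
request = build_replica_update("https://search-example.invalid", "logs-example", 0)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the returned object to urllib.request.urlopen would send it. Keep in mind that with zero replicas, losing a node means losing the only copy of any shard on that node.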

For failed cluster nodes or more complex issues that aren't solvable by the above, you can reach out to AWS Support for more help as detailed here: https://docs.aws.amazon.com/opensearch-service/latest/developerguide/handling-errors.html

jsonc
answered 2 years ago
  • curl -XGET 'https://xx/_cat/indices?v'

    health status index   uuid pri rep docs.count docs.deleted store.size pri.store.size
    green  open   .kibana xx     1   1          2            0     18.7kb          9.3kb
    green  open   book    xx     5   0      52598          150     24.1mb         24.1mb

    curl -XGET 'https://xx/_cluster/allocation/explain?pretty'

    "reason" : "unable to find any unassigned shards to explain [ClusterAllocationExplainRequest[useAnyUnassignedShard=true,includeYesDecisions?=false]"

    Both suggested requests return normal results at the moment, so I don't know why the instance is still stuck in processing.
