What can I do when an Amazon OpenSearch instance has been stuck in an update for more than a week?

Service Name: Amazon OpenSearch Service
Version Name: Elasticsearch 6.3
Service software version: updating to R20211203-P2 (latest)
Start time: 16/12/2021, 17:11:14 (UTC), checked from the Notifications tab
Cluster Status: green on 16-17/12, then red from 18/12 to 29/12. I then restored it from a healthy automated snapshot, and it has been green since.
Instance Status: processing
Data nodes: 2 (desired: 1)

What can I do in this situation? The status is stuck in processing.
Is there any way to resolve it? I would like to upgrade from Elasticsearch 6.3 to 6.8 and then to OpenSearch.

Remarks:
I have another production instance that finished the service software update within 1 hour.

  • Follow-up on my question: I finally solved it by creating a new instance with the same Elasticsearch version, restoring the snapshot from the old instance to the new one, and then deleting the old one (which was still stuck in processing). Deleting an Elasticsearch instance that is stuck in processing turned out to be fine. The only change is the instance name (you might need to recreate the instance again if you want the old name back).
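    For readers following the same route, here is a minimal sketch of the restore call against the new instance. It assumes a snapshot repository visible to the new instance has already been registered; the repository name, snapshot name, and endpoint below are hypothetical placeholders, not values from this thread. The sketch only builds the `_restore` request; sending it (and any request signing or authentication your domain requires) is left out.

    ```python
    import json
    import urllib.request
    from typing import List


    def restore_request(endpoint: str, repository: str, snapshot: str,
                        indices: List[str]) -> urllib.request.Request:
        """Build (but do not send) a POST /_snapshot/<repo>/<snapshot>/_restore request."""
        body = {
            # Restore only the named indices, e.g. skip .kibana if it already exists.
            "indices": ",".join(indices),
            # Do not overwrite the new cluster's global state.
            "include_global_state": False,
        }
        return urllib.request.Request(
            url=f"{endpoint.rstrip('/')}/_snapshot/{repository}/{snapshot}/_restore",
            data=json.dumps(body).encode("utf-8"),
            method="POST",
            headers={"Content-Type": "application/json"},
        )


    if __name__ == "__main__":
        # Hypothetical names, for illustration only.
        req = restore_request("https://new-domain.example", "my-repo", "snap-1", ["book"])
        print(req.get_method(), req.full_url)
    ```

    Listing the indices explicitly avoids a clash with an index such as `.kibana` that the new instance may already have created.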

Asked 2 years ago · 1733 views
1 Answer

Good question. Sorry to hear about your issues!

AWS recommends resolving the red cluster status before reconfiguring your OpenSearch Service domain. A red cluster status means that at least one primary shard and its replicas are not allocated to a node. OpenSearch Service keeps trying to take automated snapshots, but they fail while the red cluster status persists.

The two most common causes of a red cluster status are failed cluster nodes and the OpenSearch process crashing under a continuous heavy processing load. The following requests can help identify the issue:

  • GET /_cluster/allocation/explain
  • GET /_cat/indices?v
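
As a rough illustration of how the `_cat/indices?v` output can be checked, the sketch below picks out every index whose health is not green. It assumes the default whitespace-separated table with a header row and that all listed indices are open; the function name and the sample text are illustrative only.

```python
from typing import List, Tuple


def non_green_indices(cat_output: str) -> List[Tuple[str, str]]:
    """Return (index, health) pairs for indices whose health is not green.

    Assumes the table produced by GET /_cat/indices?v: a header row followed
    by one whitespace-separated row per open index.
    """
    lines = [ln for ln in cat_output.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    health_col = header.index("health")
    index_col = header.index("index")
    problems = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= max(health_col, index_col):
            continue  # skip rows too short to interpret
        if fields[health_col] != "green":
            problems.append((fields[index_col], fields[health_col]))
    return problems


if __name__ == "__main__":
    sample = """health status index   uuid pri rep
green  open   .kibana xx   1   1
red    open   book    xx   5   0
"""
    print(non_green_indices(sample))
```

Any index reported here is a natural candidate for `GET /_cluster/allocation/explain`, which describes why a shard is unassigned.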

Based on that information, you can try one of the following:

  • Delete the problematic index
  • Restore a snapshot
  • Delete documents from the index
  • Change the index settings
  • Reduce the number of replicas
  • Delete other indices to free up disk space.
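
To make one of these options concrete, here is a minimal sketch of reducing the replica count through the index settings API (`PUT /<index>/_settings`). The endpoint and index name are placeholders, and the sketch only constructs the request; sending it with whatever authentication or request signing your domain requires is not shown.

```python
import json
import urllib.request


def replica_settings_request(endpoint: str, index: str,
                             replicas: int) -> urllib.request.Request:
    """Build (but do not send) a PUT /<index>/_settings request that sets number_of_replicas."""
    body = {"index": {"number_of_replicas": replicas}}
    return urllib.request.Request(
        url=f"{endpoint.rstrip('/')}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Placeholder endpoint and index name, for illustration only.
    req = replica_settings_request("https://my-domain.example", "book", 0)
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Fewer replicas means fewer shards to allocate and less disk used, at the cost of redundancy, so it is worth raising the value again once the cluster is healthy.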

For failed cluster nodes or more complex issues that aren't solvable by the above, you can reach out to AWS Support for more help as detailed here: https://docs.aws.amazon.com/opensearch-service/latest/developerguide/handling-errors.html

jsonc
Answered 2 years ago
  • curl -XGET 'https://xx/_cat/indices?v'
    health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
    green open .kibana xx 1 1 2 0 18.7kb 9.3kb
    green open book xx 5 0 52598 150 24.1mb 24.1mb

    curl -XGET 'https://xx/_cluster/allocation/explain?pretty'
    "reason" : "unable to find any unassigned shards to explain [ClusterAllocationExplainRequest[useAnyUnassignedShard=true,includeYesDecisions?=false]"

    Both suggested checks look normal at the moment, so I don't know why the instance is still stuck in processing.
