
OpenSearch Service Software Update Stuck in Initializing State for Over 24 Hours


We are facing an issue with our Amazon OpenSearch Service domain.

Details:

  • Domain Name: ghn-anon-elasticsearch
  • Region: Ireland (eu-west-1)
  • Update Trigger Time: Yesterday, 5th May, around 11:30 AM Indian Time

Issue: The service software update has been stuck for over 24 hours with:

  • Domain processing status: Updating service software
  • Configuration change status: Initializing

We are currently unable to perform any configuration changes or operations on the domain as it appears to be locked in this state.

The cluster appears healthy based on our checks, but the update is not progressing.

We have also tried every available action in the AWS Console and AWS CLI to cancel or roll back the service software update, but none of them worked.

Could you please investigate and help unblock or reset the update process?

Please let us know if any additional details are required.

Thanks, Shubham

Asked a month ago · 52 views
1 Answer

I understand your OpenSearch Service domain has been stuck in the "Initializing" state during a service software update for over 24 hours, which is preventing you from making any configuration changes.

Unfortunately, once an upgrade or update has been initiated in OpenSearch Service, it cannot be cancelled. OpenSearch Service uses a blue/green deployment process for domain upgrades and updates, and if the process gets stuck, you must wait for AWS to resolve the underlying issue before it can complete.

During the blue/green deployment process, several factors can cause the domain to get stuck in a processing state, including:

  • Issues with shard relocation
  • Large shard sizes
  • High index or search load on the domain
  • Insufficient disk space
  • Reaching shard count quotas
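
If the domain endpoint is still reachable, the shard-related causes above can be roughly checked from a `GET _cluster/health` response. The following is a minimal Python sketch, not an official diagnostic: the helper name `shard_movement_summary` and the sample values are hypothetical, and it assumes you have already fetched the JSON response yourself.

```python
def shard_movement_summary(health: dict) -> dict:
    """Summarize shard-movement fields from a `GET _cluster/health` response."""
    relocating = health.get("relocating_shards", 0)
    initializing = health.get("initializing_shards", 0)
    unassigned = health.get("unassigned_shards", 0)
    return {
        "status": health.get("status", "unknown"),
        "relocating": relocating,
        "initializing": initializing,
        "unassigned": unassigned,
        # True while shards are still being moved or brought up on new nodes.
        "shards_in_motion": (relocating + initializing) > 0,
    }


# Hypothetical sample; real values come from your own domain.
sample = {
    "status": "green",
    "relocating_shards": 2,
    "initializing_shards": 0,
    "unassigned_shards": 0,
}
print(shard_movement_summary(sample))
```

A count of relocating shards that stays the same across repeated checks may suggest that relocation is stalled rather than merely slow, which is worth mentioning to AWS Support.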

To help monitor the situation while waiting for resolution, you can:

  • Use Amazon CloudWatch to check OpenSearch Service metrics such as "FreeStorageSpace", the "ClusterStatus" metrics (green, yellow, and red), and "ClusterIndexWritesBlocked"
  • Monitor the shard migration process using the "_cat/recovery" API, if the domain endpoint is still reachable
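
The two monitoring steps above could be scripted roughly as follows. This is a sketch under stated assumptions: the helper names `metric_queries` and `active_recoveries` are hypothetical, the choice of statistics and the 5-minute period are illustrative, and the account ID in the usage note is a placeholder. It relies on OpenSearch Service publishing metrics in the `AWS/ES` namespace with `DomainName` and `ClientId` dimensions, and on cluster status being published as one metric per color (for example `ClusterStatus.red`).

```python
from datetime import datetime, timedelta, timezone

# Metric name -> statistic to request (illustrative choices).
METRICS = {
    "FreeStorageSpace": "Minimum",
    "ClusterStatus.red": "Maximum",
    "ClusterStatus.yellow": "Maximum",
    "ClusterIndexWritesBlocked": "Maximum",
}


def metric_queries(domain_name: str, account_id: str, hours: int = 24) -> list:
    """Build keyword arguments for CloudWatch `get_metric_statistics` calls."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    return [
        {
            "Namespace": "AWS/ES",
            "MetricName": name,
            "Dimensions": [
                {"Name": "DomainName", "Value": domain_name},
                {"Name": "ClientId", "Value": account_id},
            ],
            "StartTime": start,
            "EndTime": end,
            "Period": 300,  # 5-minute datapoints
            "Statistics": [stat],
        }
        for name, stat in METRICS.items()
    ]


def active_recoveries(rows: list) -> list:
    """Keep unfinished rows from a `GET _cat/recovery?format=json` response."""
    return [
        (r["index"], r["shard"], r["stage"], r.get("bytes_percent"))
        for r in rows
        if r.get("stage") != "done"
    ]
```

Each dictionary from `metric_queries("ghn-anon-elasticsearch", "123456789012")` can be passed to `boto3.client("cloudwatch", region_name="eu-west-1").get_metric_statistics(**query)`. An empty result from `active_recoveries` on repeated checks would suggest that no shards are currently being migrated.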

Since you've been stuck for over 24 hours and cannot cancel the operation, I recommend opening a support case with AWS Support immediately. They will need to investigate the underlying issue causing the update to remain stuck and can take the necessary actions to resolve it. When opening the case, provide all the details you've mentioned including the domain name, region, and the exact time the update was initiated.
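
To gather those details for the support case, something like the following Python sketch may help. It assumes boto3 is installed and that your credentials allow `describe_domain` and `describe_domain_change_progress`; the helper names are hypothetical, and the exact fields returned can vary with the boto3 version and the domain, so every lookup falls back to `None`.

```python
def support_case_summary(domain_status: dict, change_progress: dict) -> str:
    """Format key fields from the two API responses as plain text."""
    sw = domain_status.get("ServiceSoftwareOptions", {})
    lines = [
        f"Domain: {domain_status.get('DomainName')}",
        f"ARN: {domain_status.get('ARN')}",
        f"Processing status: {domain_status.get('DomainProcessingStatus')}",
        f"Software: {sw.get('CurrentVersion')} -> {sw.get('NewVersion')} "
        f"({sw.get('UpdateStatus')})",
        f"Change ID: {change_progress.get('ChangeId')}",
        f"Change started: {change_progress.get('StartTime')}",
        f"Config change status: {change_progress.get('ConfigChangeStatus')}",
    ]
    return "\n".join(lines)


def fetch_and_summarize(domain_name: str, region: str) -> str:
    """Call the OpenSearch Service API and summarize the current change."""
    import boto3  # imported lazily so the formatter works without AWS access

    client = boto3.client("opensearch", region_name=region)
    status = client.describe_domain(DomainName=domain_name)["DomainStatus"]
    progress = client.describe_domain_change_progress(
        DomainName=domain_name
    )["ChangeProgressStatus"]
    return support_case_summary(status, progress)
```

For the domain in this question, the call would be `fetch_and_summarize("ghn-anon-elasticsearch", "eu-west-1")`. Pasting the resulting text into the case gives the support engineer the change ID and start time without a round trip.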

In the future, to help prevent similar issues, consider initiating updates during periods of low cluster usage and ensuring your domain has sufficient resources to handle the increased load during an update.
Sources
Troubleshoot stuck or failed upgrades in OpenSearch Service | AWS re:Post

Answered a month ago
Expert
Reviewed a month ago
