How do I troubleshoot issues with software updates for OpenSearch?
I want to understand downtime and how to troubleshoot issues during a software update for Amazon OpenSearch Service.
Amazon OpenSearch Service regularly releases service software updates that add features or improve your domains.
The following are the most common issues for OpenSearch software updates:
- Domain eligibility
- Data loss
- Severity of software update
- Estimated time required for service software update
- Manually updating software
- Canceling an update
Note: Service software updates are different from OpenSearch version upgrades. For more information, see Upgrading Amazon OpenSearch Service domains.
OpenSearch Service software updates use blue/green deployment to minimize downtime and maintain the original environment in the event that the deployment is unsuccessful.
Updates typically complete within minutes but can take several hours to days if your system is experiencing heavy load.
Note: OpenSearch Dashboards might be unavailable during some or all of the update.
To reduce the downtime of a service software update, follow these best practices:
- Perform your configuration changes in a single change request. This runs the blue/green deployment one time.
- Keep traffic on the domain as low as possible.
- Update your domain during the configured off-peak window to avoid long update periods.
- Make sure that the cluster is in a healthy and active state when running the configuration change.
- Make sure that resource utilization, such as CPU and JVM memory pressure, is within recommended thresholds before you start.
- If the cluster has dedicated primary nodes, then upgrades complete without downtime. If the cluster doesn't have dedicated primary nodes, then the cluster might be unresponsive for several seconds after an upgrade as it elects a primary node.
OpenSearch Service sends a notification when a service software update is available, required, started, completed, or failed. Also, two weeks before the scheduled date, OpenSearch Service sends notification emails to the registered email address on the AWS account. If you don't act on a required update, then OpenSearch Service automatically updates your domain's service software after a certain time frame, typically two weeks. For more information, see Notifications in Amazon OpenSearch Service.
Note: If you manually start an update, then OpenSearch Service doesn't send a notification when the update starts. OpenSearch Service sends a notification only when the update is complete.
Domain eligibility
To perform a service software update, your domain must be in an eligible state. For a list of states that are ineligible for an update, see When domains are ineligible for an update.
To programmatically check the eligibility of the domain, run the following AWS Command Line Interface (AWS CLI) command:
aws es --region region_name upgrade-elasticsearch-domain --domain-name domain_name --target-version OpenSearch_1.1 --perform-check-only
Note: If you receive errors when running AWS CLI commands, make sure that you're using the most recent version of the AWS CLI.
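The domain's service software state can also be read from the ServiceSoftwareOptions block that describe-domain returns. Its UpdateStatus field reports values such as ELIGIBLE, NOT_ELIGIBLE, or PENDING_UPDATE. The following is a minimal sketch, assuming that the AWS CLI is installed and configured. The function name show_software_options and the domain name my-domain are hypothetical placeholders.

```shell
# Print the service software details for a domain.
# Assumption: the AWS CLI is installed and has credentials for the account.
show_software_options() {
  domain_name="$1"   # hypothetical placeholder supplied by the caller
  aws opensearch describe-domain \
    --domain-name "$domain_name" \
    --query 'DomainStatus.ServiceSoftwareOptions' \
    --output json
}

# Example usage (not run automatically):
#   show_software_options my-domain
```

The output includes fields such as CurrentVersion, NewVersion, UpdateAvailable, UpdateStatus, and AutomatedUpdateDate.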
Data loss
OpenSearch Service takes automated snapshots to back up your data in the event of data loss. You can use the snapshots to restore your domain in the event of a red cluster status or data loss. For more information, see Restoring snapshots.
To proactively back up your data, you can take manual snapshots of your domain. For more information, see Creating index snapshots in Amazon OpenSearch Service.
After a service update is successfully applied, you can't perform a rollback. If your service update is stuck, then contact AWS Support.
Severity of software update
To see if an update is available or to check the status of an update, open the OpenSearch Service console. Then, in the navigation pane, choose Notifications. For more information on monitoring cluster upgrades, see Why is my Amazon OpenSearch Service domain upgrade taking so long?
Each notification includes details about the service software update, including the severity of the service software update. The service software updates are categorized as optional or required.
If the notification severity is Informational, Low, or Medium, then the update is optional. You must manually run optional updates.
If the notification severity is High or Critical, then the update is required. OpenSearch Service automatically runs required updates. Within the domain's off-peak window, OpenSearch Service can initiate the update at any time beyond the specified deadline, typically 14 days from availability.
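The severity rules above can be captured in a small shell helper, for example when you process notification data in a script. This is an illustrative sketch. The function name update_type_for_severity is hypothetical and not part of any AWS tool.

```shell
# Map a notification severity to the update category described above.
# Prints "optional" or "required"; returns non-zero for unrecognized input.
update_type_for_severity() {
  case "$1" in
    Informational|Low|Medium) echo "optional" ;;
    High|Critical)            echo "required" ;;
    *)                        echo "unknown"; return 1 ;;
  esac
}

# Example usage (not run automatically):
#   update_type_for_severity High
```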
Estimated time required for service software update
The duration of service software updates can vary depending on these factors:
- Domain configuration
- Number of nodes
- Amount of shard data
- Ongoing load or requests that the cluster is serving at the time of the update
As a best practice, install updates when there's less load on the clusters because updates can temporarily strain a cluster's dedicated primary nodes. You can schedule software updates during off-peak windows to minimize strain on a cluster's dedicated primary nodes. You can also configure a custom off-peak window to change the start time for software updates.
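A custom off-peak window can be set with the update-domain-config command and its --off-peak-window-options parameter. The following sketch builds the JSON for that parameter and passes it to the AWS CLI. The function names, the domain name, and the start time are hypothetical placeholders, and the sketch assumes that the start time is given in UTC without leading zeros.

```shell
# Build the JSON document for --off-peak-window-options.
# Arguments: hours (0-23) and minutes (0-59), without leading zeros.
off_peak_window_json() {
  hours="$1"
  minutes="$2"
  for value in "$hours" "$minutes"; do
    case "$value" in
      ''|*[!0-9]*|0?*)
        echo "expected a non-negative integer without leading zeros" >&2
        return 1 ;;
    esac
  done
  if [ "$hours" -gt 23 ] || [ "$minutes" -gt 59 ]; then
    echo "start time out of range" >&2
    return 1
  fi
  printf '{"Enabled":true,"OffPeakWindow":{"WindowStartTime":{"Hours":%s,"Minutes":%s}}}\n' \
    "$hours" "$minutes"
}

# Apply the window to a domain. Assumes a configured AWS CLI.
set_off_peak_window() {
  window_json="$(off_peak_window_json "$2" "$3")" || return 1
  aws opensearch update-domain-config \
    --domain-name "$1" \
    --off-peak-window-options "$window_json"
}

# Example usage (not run automatically):
#   set_off_peak_window my-domain 2 30
```

Validating the time before calling the AWS CLI keeps malformed input from reaching the service.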
Another way to schedule updates is to initiate a configuration change that runs a blue/green deployment. When the blue/green deployment runs, OpenSearch Service applies the service software update together with the configuration change. For example, adding any amount of storage immediately runs a blue/green deployment.
Manually updating software
You can manually update your domain to take advantage of new features more quickly. For instructions on manually updating your domain, see Starting a service software update.
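From the AWS CLI, a manual update can be requested with the start-service-software-update command. The following is a minimal sketch, assuming a configured AWS CLI. The wrapper name start_software_update and the domain name my-domain are hypothetical, and the wrapper accepts only the NOW and OFF_PEAK_WINDOW values of --schedule-at.

```shell
# Request a service software update for a domain.
# Second argument: NOW (default) or OFF_PEAK_WINDOW.
start_software_update() {
  domain_name="$1"
  schedule_at="${2:-NOW}"
  case "$schedule_at" in
    NOW|OFF_PEAK_WINDOW) ;;
    *)
      echo "schedule must be NOW or OFF_PEAK_WINDOW" >&2
      return 1 ;;
  esac
  aws opensearch start-service-software-update \
    --domain-name "$domain_name" \
    --schedule-at "$schedule_at"
}

# Example usage (not run automatically):
#   start_software_update my-domain OFF_PEAK_WINDOW
```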
Canceling an update
To cancel a software update, run the AWS CLI cancel-service-software-update command.
The cancel-service-software-update command cancels a scheduled service software update for an OpenSearch Service domain. You can perform this operation only when the UpdateStatus is in the PENDING_UPDATE state and before the AutomatedUpdateDate.
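Because the cancellation works only in the PENDING_UPDATE state, a script can check UpdateStatus first. The following sketch assumes a configured AWS CLI. The function name cancel_if_pending and the domain name my-domain are hypothetical placeholders. The sketch doesn't compare the current time with AutomatedUpdateDate.

```shell
# Cancel a scheduled service software update only if it is still pending.
cancel_if_pending() {
  domain_name="$1"
  # Read the current update status from the domain description.
  status="$(aws opensearch describe-domain \
    --domain-name "$domain_name" \
    --query 'DomainStatus.ServiceSoftwareOptions.UpdateStatus' \
    --output text)" || return 1

  if [ "$status" = "PENDING_UPDATE" ]; then
    aws opensearch cancel-service-software-update --domain-name "$domain_name"
  else
    echo "UpdateStatus is $status; nothing to cancel." >&2
    return 1
  fi
}

# Example usage (not run automatically):
#   cancel_if_pending my-domain
```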