I want to initiate an automatic recovery action when my Amazon Elastic Compute Cloud (Amazon EC2) instance fails a status check.
Short description
Automatic recovery can recover an EC2 instance when it fails a system status check. A failed system status check typically means that there's an issue with the underlying AWS hardware that hosts the instance. However, automatic recovery can't recover an instance that fails an instance status check. For more information about these checks, see Types of status checks.
Note: Only certain instance types support automatic recovery actions.
Resolution
To automatically recover your instance, use one of the following methods:
- Simplified automatic recovery based on instance configuration
- Amazon CloudWatch action based recovery
Simplified automatic recovery based on instance configuration
By default, simplified automatic recovery is turned on for all instances that support it.
Make sure that you adhere to the simplified automatic recovery prerequisites. You can change the automatic recovery setting during instance launch, or after you launch your instance.
To turn off simplified automatic recovery during instance launch, complete the following steps:
- Open the Amazon EC2 console.
- Choose Launch instance.
- Under Advanced details, turn off Instance auto-recovery.
- Configure your settings, and then launch the instance.
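The same launch-time setting can also be sketched programmatically. The following minimal Python example assumes the AWS SDK for Python (Boto3) and uses the MaintenanceOptions parameter of the RunInstances API. The function names, AMI ID, and instance type are hypothetical placeholders, not values from this article.

```python
def build_launch_params(image_id, instance_type, auto_recovery="disabled"):
    """Build RunInstances parameters with an explicit auto-recovery setting.

    auto_recovery must be "default" (recovery on) or "disabled" (recovery off).
    """
    if auto_recovery not in ("default", "disabled"):
        raise ValueError("auto_recovery must be 'default' or 'disabled'")
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        # Corresponds to the "Instance auto-recovery" console setting.
        "MaintenanceOptions": {"AutoRecovery": auto_recovery},
    }


def launch_instance(ec2_client, params):
    """Launch one instance with the given parameters (calls the EC2 API)."""
    return ec2_client.run_instances(**params)


# Example usage (assumes Boto3 is installed and credentials are configured):
#   import boto3
#   ec2 = boto3.client("ec2")
#   params = build_launch_params("ami-EXAMPLE", "t3.micro")  # placeholder values
#   launch_instance(ec2, params)
```

Keeping the parameter construction separate from the API call makes it possible to review the request before you launch anything.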
To set the automatic recovery behavior to default for instances in the Running or Stopped states, complete the following steps:
- Open the Amazon EC2 console.
- In the navigation pane, choose Instances.
- Select the instance, and then choose Actions.
- Choose Instance settings, and then for Change auto-recovery behavior, choose Default (On).
Note: To turn off automatic recovery, turn off Change auto-recovery behavior.
- Choose Save.
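For an existing instance, the console steps above correspond to the ModifyInstanceMaintenanceOptions API. The following is a minimal sketch that assumes Boto3. The function name set_auto_recovery and the instance ID in the usage comment are hypothetical.

```python
def set_auto_recovery(ec2_client, instance_id, enabled=True):
    """Set simplified automatic recovery for a running or stopped instance.

    enabled=True maps to the "default" behavior (recovery on);
    enabled=False maps to "disabled" (recovery off).
    """
    value = "default" if enabled else "disabled"
    return ec2_client.modify_instance_maintenance_options(
        InstanceId=instance_id,
        AutoRecovery=value,
    )


# Example usage (assumes Boto3 is installed and credentials are configured):
#   import boto3
#   ec2 = boto3.client("ec2")
#   set_auto_recovery(ec2, "i-0123456789abcdef0", enabled=True)  # placeholder ID
```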
To review the results of the simplified automatic recovery, check for the related event in the AWS Health Dashboard. Example event type codes:
- Failed events: AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_FAILURE
- Successful events: AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_SUCCESS
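If you prefer to check these events with a script, one possible approach is the AWS Health API, sketched below. This assumes Boto3 and an AWS Support plan that includes access to the AWS Health API. The helper names are hypothetical, and the event dictionaries are assumed to follow the shape that DescribeEvents returns (with eventTypeCode and arn keys).

```python
SUCCESS_CODE = "AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_SUCCESS"
FAILURE_CODE = "AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_FAILURE"


def build_event_filter():
    """Build a DescribeEvents filter that matches only recovery events."""
    return {"eventTypeCodes": [SUCCESS_CODE, FAILURE_CODE]}


def summarize_recovery_events(events):
    """Group event ARNs by recovery outcome; other event types are ignored."""
    summary = {"succeeded": [], "failed": []}
    for event in events:
        code = event.get("eventTypeCode")
        if code == SUCCESS_CODE:
            summary["succeeded"].append(event.get("arn"))
        elif code == FAILURE_CODE:
            summary["failed"].append(event.get("arn"))
    return summary


# Example usage (assumes Boto3, credentials, and AWS Health API access):
#   import boto3
#   health = boto3.client("health", region_name="us-east-1")
#   events = health.describe_events(filter=build_event_filter())["events"]
#   print(summarize_recovery_events(events))
```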
CloudWatch action based recovery
Use CloudWatch action based recovery to choose when you want to recover your instance. When the StatusCheckFailed_System alarm enters the ALARM state, CloudWatch initiates the recover action. Then, the Amazon Simple Notification Service (Amazon SNS) topic that you chose when you created the alarm sends the notification.
Important: As part of instance recovery, Amazon EC2 migrates the instance during an instance reboot, and any in-memory data is lost.
After the instance recovery process is complete, CloudWatch publishes information to the SNS topic. Subscribers to the SNS topic receive an email notification that includes the status of the recovery attempt and further instructions. Successful recovery appears as an instance reboot on the recovered instance.
Verify that your configuration adheres to the requirements for CloudWatch action based recovery. To configure automatic recovery on your instance, configure a CloudWatch alarm for recover actions.
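As an illustration of such an alarm, the following minimal sketch builds PutMetricAlarm parameters for the StatusCheckFailed_System metric with a recover action. It assumes Boto3. The function name, alarm name, statistic, period, evaluation periods, and threshold are illustrative assumptions, so adjust them to your own requirements. The SNS topic ARN is optional and hypothetical.

```python
def build_recovery_alarm(instance_id, region, sns_topic_arn=None,
                         period_seconds=60, evaluation_periods=2):
    """Build PutMetricAlarm parameters that recover an instance on
    consecutive failed system status checks."""
    # Built-in CloudWatch action that recovers the EC2 instance.
    actions = [f"arn:aws:automate:{region}:ec2:recover"]
    if sns_topic_arn:
        # Optional: also notify subscribers of the SNS topic.
        actions.append(sns_topic_arn)
    return {
        "AlarmName": f"recover-{instance_id}",  # illustrative naming scheme
        "AlarmDescription": "Recover the instance when the system status check fails",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": period_seconds,
        "EvaluationPeriods": evaluation_periods,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": actions,
    }


def create_recovery_alarm(cloudwatch_client, params):
    """Create or update the alarm (calls the CloudWatch API)."""
    return cloudwatch_client.put_metric_alarm(**params)


# Example usage (assumes Boto3 is installed and credentials are configured):
#   import boto3
#   cw = boto3.client("cloudwatch", region_name="us-east-1")
#   params = build_recovery_alarm("i-0123456789abcdef0", "us-east-1")
#   create_recovery_alarm(cw, params)
```

The alarm and the instance must be in the same AWS Region, which is why the sketch takes the Region as an explicit argument for the recover action ARN.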