Hello matthalion,
I am sorry to hear about the issue with your instance i-017d39167c95d214c.
I have checked the instance and can see that the underlying physical host on which your instance was running experienced hardware-related issues during the times mentioned above. This caused your instance to become unresponsive and to fail its status checks.
Please note that in the future you can check whether an instance was affected by a hardware-related event by checking its 'System Status Checks' [1]. The history of these checks can also be viewed in Amazon CloudWatch by looking at the StatusCheckFailed_System metric [2,3].
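If you would rather check that history from a script than from the console, a minimal sketch with boto3 (the AWS SDK for Python) could look like the one below. It assumes boto3 is installed and credentials are configured. The function names, the 24-hour window, and the 5-minute period are illustrative choices of mine, not something prescribed by the documentation.

```python
from datetime import datetime, timedelta, timezone


def status_check_query(instance_id, hours=24, now=None):
    """Build the arguments for CloudWatch get_metric_statistics.

    The window length and the 300-second period are assumptions
    chosen for illustration; adjust them to the incident you are checking.
    """
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,
        "Statistics": ["Maximum"],
    }


def failed_periods(instance_id, hours=24):
    """Return the timestamps of periods in which the system check failed."""
    import boto3  # imported here so the query builder works without boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **status_check_query(instance_id, hours)
    )
    # A Maximum of 1 means the system status check failed in that period.
    return sorted(
        point["Timestamp"]
        for point in response["Datapoints"]
        if point["Maximum"] >= 1
    )


# Example (needs AWS credentials and a default region):
# print(failed_periods("i-017d39167c95d214c"))
```

An empty result means no system status check failure was recorded in the chosen window.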
Please accept our apologies for the above issue and for any inconvenience caused by it.
Please note that your instance is still hosted on the same physical host. Although the host is healthy right now, you may want to consider stopping and then starting your instance. As you may already be aware, a stop / start typically moves the instance to a different, healthy physical host [4] that was not affected by the hardware issues mentioned above (note: a simple 'Reboot' does not do this).
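For reference, the stop / start sequence can also be scripted. The sketch below assumes an EBS-backed instance and a boto3 EC2 client that you create yourself; the function name is hypothetical. Please keep in mind that stopping an instance discards any instance store data, and that an automatically assigned public IPv4 address changes unless you use an Elastic IP.

```python
def stop_then_start(ec2, instance_id):
    """Stop an instance, wait until it is stopped, then start it again.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    A reboot is deliberately not used here, because a reboot keeps the
    instance on the same physical host.
    """
    ec2.stop_instances(InstanceIds=[instance_id])
    # Block until the instance has fully stopped before starting it.
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    ec2.start_instances(InstanceIds=[instance_id])
    # Block until the instance reports the 'running' state again.
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


# Example (needs boto3, AWS credentials and a default region):
# import boto3
# stop_then_start(boto3.client("ec2"), "i-017d39167c95d214c")
```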
I would also suggest that you take a look at the Auto Recovery feature for Amazon EC2. You can create an Amazon CloudWatch alarm that monitors an Amazon EC2 instance and automatically recovers the instance if it becomes impaired due to an underlying hardware failure or a problem that requires AWS involvement to repair. In short, you set up a CloudWatch alarm that triggers when the System Status check fails, and the alarm in turn triggers an EC2 action such as "Recover this instance" [5,6].
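As a rough illustration of such an alarm, the sketch below builds the arguments for the CloudWatch put_metric_alarm call in boto3. The alarm name, the 60-second period, and the two evaluation periods are assumptions on my part, and the region in the usage comment is a placeholder; please check [5] for whether your instance type supports the recover action.

```python
def recovery_alarm_params(instance_id, region):
    """Build put_metric_alarm arguments for an EC2 auto recovery alarm.

    The alarm name and the evaluation settings are illustrative
    assumptions, not required values.
    """
    return {
        "AlarmName": f"auto-recover-{instance_id}",  # hypothetical name
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 2,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # This ARN asks EC2 to recover the instance when the alarm fires.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


# Example (needs boto3 and AWS credentials; replace the region with yours):
# import boto3
# region = "us-east-1"
# boto3.client("cloudwatch", region_name=region).put_metric_alarm(
#     **recovery_alarm_params("i-017d39167c95d214c", region)
# )
```

With these settings the alarm fires after the system status check has failed for two consecutive one-minute periods.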
Please let us know if you need any further help.
Links:
[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-system-instance-status-check.html#types-of-instance-status-checks
[2] https://aws.amazon.com/blogs/aws/ec2-instance-status-metrics/
[3] https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/ec2-metricscollected.html#ec2-metrics
[4] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Stop_Start.html#instance_stop
[5] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/UsingAlarmActions.html#AddingRecoverActions
[6] https://aws.amazon.com/blogs/aws/new-auto-recovery-for-amazon-ec2/
Regards,
awstomas
