Instance reachability check failed - two days in a row

For the last two nights, at exactly 1:00 Central European Time, we have lost connectivity to an EC2 instance (i-fee7a9b0) that has run without problems for years.

A few days ago that same instance was stopped and started because of a system event that moved it to a new underlying host (Ref: AWS_EC2_INSTANCE_REBOOT_FLEXIBLE_MAINTENANCE_SCHEDULED_b0a31e5f-d9bc-4954-b36c-122e4638f85f).

We have tried what is described here: https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/screenshot-service.html. However, we cannot get a screenshot of the instance while it is in this state. Furthermore, nothing is available in the system log.

CPU usage (according to CloudWatch) is not high at the point where the reachability check suddenly fails.

If we stop the instance and start it again, it comes back. It does not respond to a reboot either.

What should we do from here? We are running a production system, and this can't be a recurring event.

Edited by: jta on Oct 16, 2019 10:50 PM

Edited by: jta on Oct 16, 2019 10:51 PM

jta
asked 5 years ago, 248 views
2 Answers

An update: we have found that the failing reachability check correlates with almost full network utilization on the instance.
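
For anyone who wants to check for the same correlation, here is a rough sketch of one way to do it. It assumes boto3 is installed and credentials are configured. The function names `fetch_datapoints` and `network_before_failures` and the 15-minute lookback window are made up for illustration; only the CloudWatch `get_metric_statistics` call and the `AWS/EC2` metric names (`StatusCheckFailed`, `NetworkIn`, `NetworkOut`) are real. This is not necessarily how the check above was done.

```python
from datetime import datetime, timedelta


def fetch_datapoints(instance_id: str, metric_name: str, statistic: str,
                     start: datetime, end: datetime, period: int = 300):
    """Fetch one EC2 metric from CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper below works without boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName=metric_name,
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=start,
        EndTime=end,
        Period=period,
        Statistics=[statistic],
    )
    return response["Datapoints"]


def network_before_failures(status_points, network_points,
                            lookback=timedelta(minutes=15)):
    """Pair each failed status check with the peak network bytes seen shortly before it.

    status_points:  datapoints with "Timestamp" and "Maximum" (1.0 means a failed check)
    network_points: datapoints with "Timestamp" and "Sum" (bytes per period)
    The lookback window is an arbitrary assumption; adjust to taste.
    """
    results = []
    for status in sorted(status_points, key=lambda p: p["Timestamp"]):
        if status["Maximum"] < 1:
            continue  # check passed in this period
        window = [
            n["Sum"] for n in network_points
            if status["Timestamp"] - lookback <= n["Timestamp"] <= status["Timestamp"]
        ]
        results.append((status["Timestamp"], max(window) if window else None))
    return results


# Hypothetical usage (time range chosen by you):
#   status = fetch_datapoints("i-fee7a9b0", "StatusCheckFailed", "Maximum", start, end)
#   net_out = fetch_datapoints("i-fee7a9b0", "NetworkOut", "Sum", start, end)
#   for when, peak_bytes in network_before_failures(status, net_out):
#       print(when, peak_bytes)
```

If the peak byte counts just before each failure sit close to the instance type's network limit, that points in the same direction as the observation above. Running the same comparison with `NetworkIn` covers inbound traffic.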

jta
answered 5 years ago

It turned out that after our instance migrated to the new underlying host, the network driver in use was not compatible. We upgraded to the newest AWS network driver and our problems were resolved.

jta
answered 4 years ago
