Instance reachability check failed - two days in a row

For the last two nights, at exactly 1:00 Central European Time, we have lost connectivity to an EC2 instance (i-fee7a9b0) that has run without problems for years.

A few days ago, that same instance was stopped and started because of a system event that moved it to a new underlying host (Ref: AWS_EC2_INSTANCE_REBOOT_FLEXIBLE_MAINTENANCE_SCHEDULED_b0a31e5f-d9bc-4954-b36c-122e4638f85f).

We have tried what is described here: https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/screenshot-service.html. However, we cannot get a screenshot of the instance while it is in this state. Furthermore, nothing is available in the system log.

The CPU usage (according to CloudWatch) is not high at the point where the reachability check suddenly fails.

If we stop the instance and start it again, it comes back, but it does not respond to a reboot request.
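For reference, the stop/start cycle that brings the instance back can be scripted. This is only a minimal sketch, assuming a boto3 EC2 client is passed in as `ec2`; the helper name `stop_start` is made up for illustration and is not something we actually ran.

```python
def stop_start(ec2, instance_id):
    """Stop the instance, wait until it is stopped, then start it again.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    ids = [instance_id]
    ec2.stop_instances(InstanceIds=ids)
    # Block until EC2 reports the instance as stopped.
    ec2.get_waiter("instance_stopped").wait(InstanceIds=ids)
    ec2.start_instances(InstanceIds=ids)
    # Block until EC2 reports the instance as running again.
    ec2.get_waiter("instance_running").wait(InstanceIds=ids)


# Example call (requires boto3 and AWS credentials):
#   import boto3
#   stop_start(boto3.client("ec2"), "i-fee7a9b0")
```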

What should we do from here? We are running a production system, and this cannot become a recurring event.

Edited by: jta on Oct 16, 2019 10:50 PM

Edited by: jta on Oct 16, 2019 10:51 PM

jta
asked 5 years ago, 257 views
2 answers

An update: we have found that the failing reachability check correlates with almost full network utilization on the instance.
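For anyone who wants to check the same correlation, here is a rough sketch of how it could be done from CloudWatch data. It assumes a boto3 CloudWatch client and the standard `AWS/EC2` metrics `NetworkIn`, `NetworkOut` (statistic `Sum`, in bytes) and `StatusCheckFailed` (statistic `Maximum`). The helper names are hypothetical, and the 300-second period assumes basic monitoring.

```python
def metric_request(instance_id, metric_name, statistic, start, end, period=300):
    """Keyword arguments for CloudWatch get_metric_statistics (AWS/EC2 namespace)."""
    return {
        "Namespace": "AWS/EC2",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": start,
        "EndTime": end,
        "Period": period,
        "Statistics": [statistic],
    }


def fetch_datapoints(cloudwatch, request):
    """Return the datapoints for one request; `cloudwatch` is a boto3 client."""
    return cloudwatch.get_metric_statistics(**request)["Datapoints"]


def network_mbps(datapoints_in, datapoints_out, period=300):
    """Combined in+out throughput in Mbit/s per timestamp, from Sum (bytes)."""
    total_bytes = {}
    for dp in list(datapoints_in) + list(datapoints_out):
        ts = dp["Timestamp"]
        total_bytes[ts] = total_bytes.get(ts, 0.0) + dp["Sum"]
    return {ts: b * 8 / period / 1e6 for ts, b in total_bytes.items()}


def failures_with_load(status_datapoints, mbps_by_time):
    """(timestamp, Mbit/s) for every period in which a status check failed.

    The throughput is None when no network datapoint exists for that period.
    """
    failed = [dp["Timestamp"] for dp in status_datapoints if dp["Maximum"] >= 1]
    return [(ts, mbps_by_time.get(ts)) for ts in sorted(failed)]


# Example wiring (requires boto3 and AWS credentials):
#   import boto3
#   cw = boto3.client("cloudwatch")
#   net_in = fetch_datapoints(
#       cw, metric_request("i-fee7a9b0", "NetworkIn", "Sum", start, end))
```

Comparing the reported throughput with the bandwidth of the instance type shows whether the failures line up with a saturated network interface.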

jta
answered 5 years ago

It turns out that after our instance migrated to the new underlying host, the network driver in use was no longer compatible. We upgraded to the newest AWS network driver, and our problems were resolved.

jta
answered 5 years ago