EC2 Instance connection issues

0

Recently, some of our instances have been losing connectivity even though no instance status checks fail. An instance regains connectivity if we stop/start or restart it. I have been doing a stop/start to rule out any issue with the host the instance is running on. I can verify that the instance was still running when connectivity was lost. The Windows event log contains entries showing that the instance had trouble connecting to our domain controller and to other monitoring tools we use. I had set up the serial console to try to see what was happening to the instance, but it did not even connect the last time we had an issue.

There have been no changes to the security groups.

We are using the latest 2.6.0 version of the ENA driver and have opened multiple tickets with AWS Support, which have yielded nothing. I even sent them copies of the EC2Rescue logs from multiple instances that experienced this issue.

This has happened on Windows Server 2016/2019 instances (m6a, r5a, t3a). It happens more often on certain instances running our WMS software, but it has happened on others as well.

This just happened this morning around 6 AM UTC. The CPU was not maxed out, but the instance metrics show network traffic dropping out until the instance was restarted just before 11 AM UTC.
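The pattern described above (CPU still active while network traffic goes flat) can be scanned for in exported metric data. The following is a minimal sketch, assuming you have already exported per-period CPUUtilization and NetworkIn datapoints from CloudWatch into a list of tuples. The function name `find_network_dropouts`, the thresholds, and every sample value below are hypothetical and do not come from the graphs in this post.

```python
from datetime import datetime, timedelta


def find_network_dropouts(samples, net_floor=0.0, cpu_floor=0.0):
    """Return (start, end) timestamp pairs for consecutive samples where
    network traffic is at or below net_floor while CPU is above cpu_floor."""
    windows = []
    start = None
    last = None
    for ts, cpu, net in samples:
        silent = net <= net_floor and cpu > cpu_floor
        if silent:
            if start is None:
                start = ts  # first silent sample of a new window
            last = ts
        elif start is not None:
            windows.append((start, last))  # window closed by a normal sample
            start = None
    if start is not None:
        windows.append((start, last))  # data ended while still silent
    return windows


# Hypothetical hourly samples: (timestamp, CPU %, NetworkIn bytes).
base = datetime(2024, 1, 1, 4, 0)
values = [
    (12.0, 5_000_000),  # 04:00 normal traffic
    (15.0, 4_800_000),  # 05:00 normal traffic
    (14.0, 0),          # 06:00 network goes silent, CPU still active
    (13.0, 0),          # 07:00
    (16.0, 0),          # 08:00
    (12.0, 0),          # 09:00
    (11.0, 0),          # 10:00
    (18.0, 5_200_000),  # 11:00 traffic resumes after the restart
]
samples = [(base + timedelta(hours=i), cpu, net) for i, (cpu, net) in enumerate(values)]

dropouts = find_network_dropouts(samples)
for start, end in dropouts:
    print(f"network silent from {start:%H:%M} to {end:%H:%M} while CPU stayed active")
```

A check like this could feed an alert, so that a silent instance is noticed even though its status checks still pass.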


dsekely
asked a month ago (321 views)
3 Answers
0

There were no status check errors. The gap just before 11 was from when I stopped/started the instance.

dsekely
answered a month ago
  • OK, you have a Nitro-based instance. When you tried connecting to the EC2 Serial Console for Windows, it may have failed because serial console access is not enabled for your account. You can check your account's access status with the AWS CLI by running: aws ec2 get-serial-console-access-status --region your-region. If the output is { "SerialConsoleAccessEnabled": true }, your account is allowed to access the serial console.

  • Once the instance comes back up after a stop/start, I am able to access the serial console for Windows without any issues.

  • I had another instance hit the same issue tonight. I had read somewhere about trying to add another network interface. I tried that but was still unable to connect to the instance. The serial console did come up, but starting a cmd session timed out. Again, there were no failed status checks, and the CPU was below 20% at the time it happened.
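The account-level check suggested in the first comment can be scripted. This is a minimal sketch, assuming the JSON printed by `aws ec2 get-serial-console-access-status` has been captured as a string. The helper name `serial_console_access_enabled` is hypothetical, and the sample output is the one quoted in the comment, not output from the poster's account.

```python
import json


def serial_console_access_enabled(cli_output: str) -> bool:
    """Parse the JSON printed by `aws ec2 get-serial-console-access-status`
    and report whether account-level serial console access is enabled."""
    data = json.loads(cli_output)
    # Treat a missing key as "not enabled" rather than raising.
    return bool(data.get("SerialConsoleAccessEnabled", False))


# Sample output as quoted in the comment above (not captured from a real account).
sample_output = '{ "SerialConsoleAccessEnabled": true }'
enabled = serial_console_access_enabled(sample_output)
print("serial console access enabled:", enabled)
```

Since the serial console works after a stop/start, this check most likely only confirms that the account setting is not the cause.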

0

First, I hope this helps. We need to determine whether this is a hardware, software, security, or network issue.

answered a month ago
  • To me it looks like a restriction inside Windows, maybe the firewall or antivirus in the OS. Did you enable the SAC in order to troubleshoot your Windows instance via the serial console?

  • No changes were made to the Windows firewall or antivirus prior to the connection loss. The issue that prompted the post happened early in the morning while our team was asleep. The SAC was enabled but was not working when the issue occurred. Once the system is restarted, the SAC and the system work as expected.

0

The Windows OS works correctly, so where is the issue located?

answered a month ago
