Root cause for AWS EC2 disconnection.


Hi,

One of our production EC2 instances (Ubuntu 18.04) became unreachable, and we were unable to connect to it via SSH. After we rebooted the instance, we were able to connect again. Can someone point me to where we should look for the root cause of this problem? I could not find any useful information in the instance's system logs either.

asked 2 years ago · 249 views
3 answers

Actually, CPU utilization was almost 0%. This is the first time we have faced this issue. The instance type is c5.2xlarge. We also checked the disk usage, which was at 55%. Is it possible that an internal OS error on Ubuntu crashed the instance and made it unreachable? If so, where can I find the logs?

answered 2 years ago
  • Server logs are saved in the /var/log directory. You might want to check kern.log or syslog to rule out an OS crash.
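A minimal sketch of that check, assuming the default Ubuntu log paths /var/log/kern.log and /var/log/syslog. The pattern list is an illustrative assumption (common kernel message fragments), not an exhaustive set of crash signatures, and the helper names are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Assumed crash indicators: fragments that commonly appear in kernel
# messages around a panic, an out-of-memory kill, or a hung task.
CRASH_PATTERNS = re.compile(
    r"kernel panic|out of memory|oom-killer|call trace|blocked for more than",
    re.IGNORECASE,
)


def find_crash_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line matching a crash indicator."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if CRASH_PATTERNS.search(line)
    ]


def scan_log(path: str) -> List[Tuple[int, str]]:
    """Scan one log file, replacing any bytes that cannot be decoded."""
    with open(path, errors="replace") as handle:
        return find_crash_lines(handle)


if __name__ == "__main__":
    for log_path in ("/var/log/kern.log", "/var/log/syslog"):
        try:
            for number, text in scan_log(log_path):
                print(f"{log_path}:{number}: {text}")
        except OSError as error:
            # Reading these files usually needs root or membership in the adm group.
            print(f"could not read {log_path}: {error}")
```

No matches does not prove the OS was healthy: a hard hang can stop logging before anything is written, so the rotated files (for example kern.log.1) are worth a look as well.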


You should have a look at the CloudWatch metrics for the instance, especially the CPU utilization and status check failed metrics. See whether CPU utilization was high around the time you had to reboot the instance. Depending on the instance type, also look at CPU credit usage and balance.

Is it a recurring issue or a one-off incident? What is the instance type?
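The metrics mentioned above can also be pulled programmatically. The following is a rough sketch, assuming boto3 is installed and AWS credentials are configured; the instance ID, region, and time window are placeholders, and the helper names are hypothetical. It uses the CloudWatch `get_metric_statistics` call with the standard `AWS/EC2` metric names.

```python
from datetime import datetime, timedelta, timezone

# Standard AWS/EC2 metric names. CPUCreditUsage and CPUCreditBalance could be
# added for burstable (T-family) instances; they do not apply to a c5 instance.
METRICS = [
    "CPUUtilization",
    "StatusCheckFailed_System",
    "StatusCheckFailed_Instance",
]


def build_queries(instance_id, incident_time, hours_before=3, hours_after=1):
    """Build one get_metric_statistics request per metric around the incident."""
    return [
        {
            "Namespace": "AWS/EC2",
            "MetricName": name,
            "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
            "StartTime": incident_time - timedelta(hours=hours_before),
            "EndTime": incident_time + timedelta(hours=hours_after),
            "Period": 300,  # 5-minute buckets, matching basic monitoring
            "Statistics": ["Maximum"],
        }
        for name in METRICS
    ]


def fetch_metrics(instance_id, incident_time, region):
    """Return {metric_name: [(timestamp, maximum), ...]} sorted by time."""
    import boto3  # third-party; imported here so the builder works without it

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    results = {}
    for query in build_queries(instance_id, incident_time):
        response = cloudwatch.get_metric_statistics(**query)
        points = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
        results[query["MetricName"]] = [(p["Timestamp"], p["Maximum"]) for p in points]
    return results
```

A call such as `fetch_metrics("i-0123456789abcdef0", datetime(2024, 3, 3, 14, 0, tzinfo=timezone.utc), "us-east-1")` (all values made up) would show whether a status check failed before the reboot. A nonzero `StatusCheckFailed_System` points toward the underlying host, while `StatusCheckFailed_Instance` points toward the OS or its configuration.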

Syd
answered 2 years ago

Hello,

There is a possibility that this could be an operating system issue, such as a crash or kernel panic. Was there an error message when trying to connect?

I would suggest checking out this Ubuntu documentation on how to view your logs.

https://ubuntu.com/tutorials/viewing-and-monitoring-log-files#2-log-files-locations

Some specific logs you may find useful could include:

Authorization log = /var/log/auth.log - which keeps track of authorization systems, such as password prompts, the sudo command and remote logins.

Login failures log = /var/log/faillog - which contains info about login failures. You can view it with the faillog command.

Daemon log = /var/log/daemon.log - daemons are programs that run in the background, usually without user interaction. Examples include the display server, SSH sessions, printing services, and Bluetooth. This log could indicate whether there were issues with SSH.

System log = /var/log/syslog - Contains more information about your system. If you can’t find anything in the other logs, it’s probably here.
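One way to use the system log when nothing obvious stands out is to look for a long silence: if the OS hung, logging typically stops and only resumes with the boot messages after the reboot. The sketch below finds the longest pause between consecutive timestamped lines. It assumes the classic `Mon DD HH:MM:SS` syslog prefix, which carries no year, so the year is passed in and the log is assumed not to cross a year boundary. The function names are hypothetical.

```python
from datetime import datetime
from typing import Iterable, Optional, Tuple


def parse_timestamp(line: str, year: int) -> Optional[datetime]:
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a classic syslog line."""
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None  # continuation line or a different log format


def largest_gap(lines: Iterable[str], year: int) -> Optional[Tuple[datetime, datetime]]:
    """Return (last_line_before, first_line_after) for the longest pause."""
    best = None
    previous = None
    for line in lines:
        current = parse_timestamp(line, year)
        if current is None:
            continue
        if previous is not None:
            if best is None or current - previous > best[1] - best[0]:
                best = (previous, current)
        previous = current
    return best


if __name__ == "__main__":
    try:
        with open("/var/log/syslog", errors="replace") as handle:
            gap = largest_gap(handle, datetime.now().year)
    except OSError as error:
        print(f"could not read /var/log/syslog: {error}")
    else:
        if gap:
            print(f"longest silence: {gap[0]} -> {gap[1]} ({gap[1] - gap[0]})")
```

The lines just before the start of the gap are the last thing the system managed to record, so they are a reasonable place to begin reading.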

Also, here are some articles on some general EC2 troubleshooting tips:

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-troubleshoot.html

https://docs.aws.amazon.com/en_us/AWSEC2/latest/UserGuide/TroubleshootingInstancesConnecting.html#TroubleshootingInstancesConnectionTimeout

answered 8 months ago
AWS
SUPPORT ENGINEER
reviewed 8 months ago
