Root cause for AWS EC2 disconnection.


Hi,

One of our production EC2 instances (Ubuntu 18.04) became unreachable, and we were unable to connect to it via SSH. After we rebooted the instance, we were able to connect again. Can someone point me to where we should look for the root cause of this problem? I could not find anything useful in the instance's system log either.

asked 2 years ago, 254 views
3 Answers

Actually, CPU utilization was close to 0% at the time. This is the first time we have faced this issue. The instance type is c5.2xlarge. We also checked disk usage, which was at 55%. Is it possible that an internal OS error on Ubuntu crashed the instance and made it unreachable? If so, where can I find the logs?

answered 2 years ago
  • Server logs are saved in the /var/log directory. Check kern.log or syslog there to confirm or rule out an OS crash.
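
As a rough sketch of that check, the Python snippet below scans log lines for a few strings that often accompany a kernel crash, hang, or out-of-memory kill. The pattern list and the names `find_crash_hints` and `scan_default_logs` are illustrative assumptions, not an official or exhaustive set; a real crash may log something different or nothing at all.

```python
import re
from pathlib import Path

# Illustrative patterns only; adjust them to what your kernel actually logs.
CRASH_PATTERNS = re.compile(
    r"kernel panic|out of memory|oom-killer|BUG:|Call Trace|hung_task|soft lockup",
    re.IGNORECASE,
)


def find_crash_hints(lines):
    """Return (line_number, text) pairs for lines matching a crash pattern."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if CRASH_PATTERNS.search(line)
    ]


def scan_default_logs(log_dir="/var/log", names=("kern.log", "syslog")):
    """Print matching lines from the given log files, skipping missing ones."""
    for name in names:
        path = Path(log_dir) / name
        if not path.exists():
            continue
        with path.open(errors="replace") as handle:
            for number, text in find_crash_hints(handle):
                print(f"{path}:{number}: {text}")
```

Calling `scan_default_logs()` usually needs root or membership in the `adm` group to read these files. If the incident is older, the relevant entries may have moved to rotated files such as kern.log.1, which can be passed through the `names` parameter.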


You should look at the CloudWatch metrics for the instance, especially CPU utilization and the status check failed metrics. See whether CPU utilization was high around the time you had to reboot the instance. Depending on the instance type (CPU credits apply to burstable instances such as the T family), also check CPU credit usage and balance.

Is it a recurring issue or a one-off incident? What is the instance type?
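
The metrics mentioned above can also be pulled programmatically. Here is a minimal sketch, assuming boto3 is installed and AWS credentials and a region are configured; the helper names, the 5-minute period, and the 3-hour window are illustrative choices, and the instance ID and incident time are up to the caller.

```python
from datetime import datetime, timedelta, timezone


def fetch_metric_maxima(cloudwatch, instance_id, metric_name, incident_time, hours=3):
    """Return (timestamp, maximum) pairs for one EC2 metric around incident_time.

    `cloudwatch` is expected to be a boto3 CloudWatch client, and
    `incident_time` a timezone-aware datetime.
    """
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName=metric_name,
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=incident_time - timedelta(hours=hours),
        EndTime=incident_time + timedelta(hours=hours),
        Period=300,  # 5-minute buckets
        Statistics=["Maximum"],
    )
    # CloudWatch does not guarantee datapoint order, so sort by time.
    points = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    return [(p["Timestamp"], p["Maximum"]) for p in points]


def print_incident_metrics(instance_id, incident_time):
    """Print CPU and status check metrics around the incident (calls AWS)."""
    import boto3  # third-party dependency, imported here so the helper above stays testable

    client = boto3.client("cloudwatch")
    for metric in ("CPUUtilization", "StatusCheckFailed_System", "StatusCheckFailed_Instance"):
        for timestamp, value in fetch_metric_maxima(client, instance_id, metric, incident_time):
            print(metric, timestamp.isoformat(), value)
```

For example, `print_incident_metrics("<your-instance-id>", datetime(2022, 1, 1, 12, 0, tzinfo=timezone.utc))` with your own instance ID and incident time. A nonzero StatusCheckFailed_System value points toward the underlying AWS host or network, while StatusCheckFailed_Instance points toward the operating system or instance configuration.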

Syd
answered 2 years ago

Hello,

This could be an operating system issue, such as a crash or kernel panic. Was there an error message when you tried to connect?

I would suggest checking out this Ubuntu documentation on how to view your logs.

https://ubuntu.com/tutorials/viewing-and-monitoring-log-files#2-log-files-locations

Some specific logs you may find useful:

Authorization log: /var/log/auth.log. Tracks authorization systems, such as password prompts, the sudo command, and remote logins.

Login failures log: /var/log/faillog. Contains information about login failures. You can view it with the faillog command.

Daemon log: /var/log/daemon.log. Daemons are programs that run in the background, usually without user interaction, such as the display server, SSH sessions, printing services, and Bluetooth. This log could indicate whether there were issues with SSH. On a default Ubuntu install this file may not exist, in which case daemon messages typically end up in /var/log/syslog instead.

System log: /var/log/syslog. Contains the most general information about your system. If you can't find anything in the other logs, it's probably here.
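
Since these files can be large, it may help to look only at entries close to the time the instance became unreachable. The sketch below filters lines by timestamp, assuming the traditional syslog prefix ("Mmm dd HH:MM:SS") that rsyslog writes by default on Ubuntu 18.04; the function name `lines_near` and the 15-minute window are illustrative assumptions.

```python
from datetime import datetime, timedelta


def lines_near(lines, incident_time, minutes=15):
    """Yield log lines whose timestamp is within `minutes` of incident_time.

    `incident_time` must be a naive datetime in the instance's local time
    zone. The traditional syslog prefix carries no year, so the year is
    borrowed from incident_time.
    """
    window = timedelta(minutes=minutes)
    for line in lines:
        try:
            stamp = datetime.strptime(
                f"{incident_time.year} {line[:15]}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # continuation line or an unexpected timestamp format
        if abs(stamp - incident_time) <= window:
            yield line.rstrip("\n")
```

For example, `lines_near(open("/var/log/syslog", errors="replace"), datetime(2022, 3, 3, 14, 0))` with your own incident time; the same function should work for /var/log/auth.log. A sudden stop in entries followed by boot messages is itself a hint that the system hung or crashed at that point.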

Here are also two AWS articles with general EC2 troubleshooting tips:

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-troubleshoot.html

https://docs.aws.amazon.com/en_us/AWSEC2/latest/UserGuide/TroubleshootingInstancesConnecting.html#TroubleshootingInstancesConnectionTimeout

answered 8 months ago
AWS
SUPPORT ENGINEER
reviewed 8 months ago
