statuscheckfailed_instance still unreachable after reboot and shutdown restart

Hi,

After running AWS-RunPatchBaseline on the server and issuing a reboot, the server is no longer accessible via SSH or the AWS Console connect feature. The server is Ubuntu Server 22.04 LTS. I tried restarting and then stopping and starting the instance, with no luck. The console screenshot shows the login screen. I was able to download the logs and do not see any issues. Instance ID i-0f13cae8eb38eb0c0. This is the error I get when I try to connect via the AWS console:

SSM Agent is not online Ping status: Connection lost

The SSM Agent was unable to connect to a Systems Manager endpoint to register itself with the service.

At this point I'm not sure where to go from here.

asked 9 months ago, 169 views
2 Answers
Accepted Answer

The disk had errors and was set to read-only. I restored the server from backup.

answered 9 months ago
EXPERT
reviewed 9 months ago

Based on the information provided, it appears that your EC2 instance is experiencing a reachability issue after running AWS-RunPatchBaseline and rebooting. This situation can occur due to various reasons, and there are several steps you can take to troubleshoot and potentially resolve the issue:

  1. Check instance status checks: Ensure that both system and instance status checks are passing. If they're failing, this could indicate underlying hardware or software issues.

  2. Review console output: Examine the instance console output for any error messages or unusual behavior during the boot process. This can provide valuable insights into what might be preventing the instance from becoming fully operational.

  3. Analyze CloudWatch metrics: Check the CPU utilization and other performance metrics in Amazon CloudWatch. High CPU usage or memory exhaustion could cause the instance to become unresponsive.

  4. Verify network configuration: Ensure that the security groups and network ACLs associated with the instance are correctly configured and not blocking necessary traffic.

  5. Check EBS volumes: Verify that all attached EBS volumes are properly mounted and accessible to the instance.

  6. Investigate SSM Agent: The error message indicates that the SSM Agent is not online and unable to connect to the Systems Manager endpoint. This could be due to network connectivity issues or problems with the SSM Agent itself.

  7. Consider OS-level issues: Since you can see the login screen in the console screenshot, the operating system seems to be booting. The problem might be related to network services or SSH configuration.
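
As a rough sketch of steps 1 and 2 above, the following Python checks the status checks and scans the console output for signs of disk trouble. It assumes the boto3 SDK and an EC2 client created with `boto3.client("ec2")`; the helper names (`failed_status_checks`, `find_disk_errors`, `diagnose`) and the pattern list are illustrative assumptions, not an official or exhaustive set.

```python
import re

# Illustrative kernel/boot messages that often accompany disk trouble.
# This list is an assumption; adjust it for your filesystem and distro.
DISK_ERROR_PATTERNS = [
    r"Remounting filesystem read-only",
    r"EXT4-fs error",
    r"I/O error",
    r"Read-only file system",
]


def failed_status_checks(status_response):
    """Return the names of impaired checks in a describe_instance_status response."""
    failed = []
    for entry in status_response.get("InstanceStatuses", []):
        for key in ("SystemStatus", "InstanceStatus"):
            if entry.get(key, {}).get("Status") == "impaired":
                failed.append(key)
    return failed


def find_disk_errors(console_text):
    """Return console lines that match any of the illustrative patterns."""
    combined = re.compile("|".join(DISK_ERROR_PATTERNS), re.IGNORECASE)
    return [line for line in console_text.splitlines() if combined.search(line)]


def diagnose(ec2, instance_id):
    """Collect failed status checks and suspicious console lines for one instance.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    status = ec2.describe_instance_status(
        InstanceIds=[instance_id], IncludeAllInstances=True
    )
    console = ec2.get_console_output(InstanceId=instance_id)
    return {
        "failed_checks": failed_status_checks(status),
        "disk_errors": find_disk_errors(console.get("Output", "")),
    }
```

If `disk_errors` comes back non-empty, the instance may be reaching the login prompt with a read-only root filesystem, which would also stop services such as the SSM Agent and sshd from starting cleanly. Treat a match as a hint to inspect the volume, not as a confirmed diagnosis.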

If these steps don't resolve the issue, you may need to consider more advanced recovery options:

  1. Create a snapshot of the instance's root volume.
  2. Attach the root volume to another instance as a secondary volume to investigate further or recover data.
  3. If necessary, launch a new instance from a recent backup AMI and restore your data.
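
The first two recovery steps above could be scripted roughly as follows. This is a minimal sketch assuming boto3; `rescue_root_volume` and its parameters are hypothetical names, and the rescue instance is assumed to already exist in the same Availability Zone as the volume.

```python
def rescue_root_volume(ec2, volume_id, broken_instance_id, rescue_instance_id,
                       device="/dev/sdf"):
    """Snapshot a root volume, then move it to a rescue instance for inspection.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    The default device name is an assumption; pick one that is free on the
    rescue instance.
    """
    # Snapshot first so there is a restore point before anything is changed.
    snapshot = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"Pre-rescue snapshot of {volume_id}",
    )
    snapshot_id = snapshot["SnapshotId"]
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    # A root volume can only be detached while its instance is stopped.
    ec2.stop_instances(InstanceIds=[broken_instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[broken_instance_id])

    ec2.detach_volume(VolumeId=volume_id)
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])

    # Attach as a secondary volume on the rescue instance.
    ec2.attach_volume(
        VolumeId=volume_id, InstanceId=rescue_instance_id, Device=device
    )
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[volume_id])
    return snapshot_id
```

Once the volume is attached, you can check and repair the filesystem from the rescue instance (for example with `fsck`) or copy data off it. The device name seen inside the rescue instance may differ from the one passed to `attach_volume`, so confirm it with `lsblk` before running anything against it.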

If you're unable to resolve the issue through these methods, you may need to contact AWS support for further assistance, especially if you suspect there might be an underlying hardware problem with the EC2 host.

Maintain regular backups, and consider using an Elastic IP address so that the instance keeps the same public IP address when you stop and start it during troubleshooting.

Sources
Troubleshoot status check failures on an unreachable EC2 Linux instance | AWS re:Post
Instance keeps on failing with status check - reachability issue | AWS re:Post
Troubleshoot an unreachable Amazon EC2 instance - Amazon Elastic Compute Cloud

answered 9 months ago
EXPERT
reviewed 9 months ago
