Unable to ssh to Lightsail server when in burst zone with ample remaining burst capacity


Hi, I have noticed that I can't access my Lightsail server when it is in the burst zone, no matter how much burst capacity remains. I tried creating a new server and hit the same issue. Here's how to replicate it:

  1. Create a Lightsail server (in my case it's a plan with 2vCPUs and 30% sustainable capacity)
  2. Wait for remaining burst capacity to reach 100% (~432m)
  3. Run any script that will consume burst CPU (I was running it at 50% instead of the sustainable 30%; my guess is that 1 vCPU was maxed out in my case)
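For anyone reproducing step 3, here is a minimal sketch of a load generator. It is a hypothetical stand-in, not the script used above: the function name `burn` and its parameters are assumptions made for illustration. It keeps one core busy for a chosen fraction of each short period and stops on its own after a fixed duration, so the load ends even if SSH becomes unavailable.

```python
import time


def burn(duty_cycle, duration_seconds, period_seconds=0.1):
    """Keep one core busy for `duty_cycle` of each period, then sleep.

    Hypothetical helper for reproducing the load; returns the number of
    completed periods.
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    deadline = time.monotonic() + duration_seconds
    periods = 0
    while time.monotonic() < deadline:
        busy_until = time.monotonic() + duty_cycle * period_seconds
        while time.monotonic() < busy_until:
            pass  # spin to consume CPU time
        # Idle for the rest of the period so the average load matches duty_cycle.
        time.sleep((1.0 - duty_cycle) * period_seconds)
        periods += 1
    return periods
```

Calling `burn(1.0, 1800)` would load a single core fully for 30 minutes, which on a 2-vCPU plan should show up as roughly 50% averaged utilization.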

After a few minutes (give it 30-40 minutes to be sure) you will not be able to ssh in (even the "Connect using SSH" button in the AWS console won't work). This should not happen: from my understanding of how burst capacity works, I should be able to burst for much longer than that. The main problem is that because I can't ssh, I can't stop the script either. My main use case for these bursts is temporarily running scripts for updates and upgrades, but this behavior makes Lightsail all but useless as a server.

asked 2 years ago · 497 views
1 Answer

As I understand it, whenever you run a script that consumes CPU or memory on your Lightsail server, you are unable to SSH into the server, even when the burst capacity is not entirely depleted.


First, please allow me to explain Burst performance baselines:

Performance baselines are per vCPU. The CPU utilization metric graph in the Lightsail console averages the CPU utilization and baseline for instances with more than one vCPU. For example, a Linux or Unix-based $40 USD/month instance has two vCPUs and an averaged CPU utilization baseline of 30%.

Therefore, if:

  • One vCPU operates at 50% and the other at 0%, a 25% averaged CPU utilization is displayed on the graph. This puts the instance's CPU utilization below its 30% baseline, and in the sustainable zone.

  • One vCPU operates at 30%, and the other at 20%, a 25% averaged CPU utilization is displayed on the graph. This puts the instance's CPU utilization below its 30% baseline, and in the sustainable zone.

  • One vCPU operates at 35% and the other at 25%, a 30% averaged CPU utilization is displayed on the graph. This puts the instance's CPU utilization at the 30% baseline.

  • One vCPU operates at 100% and the other at 90%, a 95% averaged CPU utilization is displayed on the graph. This puts the instance's CPU utilization above its 30% baseline, and in the burstable zone.
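The four cases above can be checked with a short sketch. The function names are illustrative assumptions, not part of any Lightsail API; the 30% figure is the averaged baseline quoted above for the 2-vCPU plan.

```python
from statistics import mean

BASELINE_PERCENT = 30.0  # averaged baseline quoted above for the 2-vCPU plan


def averaged_utilization(per_vcpu_percent):
    """Return the mean utilization across vCPUs, as the console graph averages it."""
    return mean(per_vcpu_percent)


def zone(per_vcpu_percent, baseline=BASELINE_PERCENT):
    """Classify the averaged utilization relative to the baseline."""
    avg = averaged_utilization(per_vcpu_percent)
    if avg > baseline:
        return "burstable"
    if avg < baseline:
        return "sustainable"
    return "at baseline"


# The four examples from the list above.
for sample in ([50, 0], [30, 20], [35, 25], [100, 90]):
    print(sample, averaged_utilization(sample), zone(sample))
```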


To understand per-vCPU performance, we would need to investigate what is causing your instance to burst, using tools like top on Linux/Unix instances and Task Manager on Windows Server instances. In top, we can also check per-vCPU consumption and compare it against the burst baseline.
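If an interactive top session is not practical, per-vCPU utilization can also be sampled from a script. The following is a rough sketch for Linux instances only, assuming the standard `/proc/stat` field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names are made up for this example.

```python
import time


def parse_proc_stat(text):
    """Map each per-vCPU 'cpuN' line to (idle_ticks, total_ticks)."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-vCPU lines look like 'cpu0 ...'; the aggregate line is plain 'cpu'.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        ticks = [int(v) for v in fields[1:9]]  # user through steal
        idle = ticks[3] + ticks[4]  # idle + iowait
        counters[fields[0]] = (idle, sum(ticks))
    return counters


def per_vcpu_utilization(before, after):
    """Return busy percentage per vCPU between two parsed samples."""
    usage = {}
    for name, (idle_after, total_after) in after.items():
        idle_before, total_before = before[name]
        total_delta = total_after - total_before
        busy_delta = total_delta - (idle_after - idle_before)
        usage[name] = 100.0 * busy_delta / total_delta if total_delta else 0.0
    return usage


def sample(interval_seconds=1.0, path="/proc/stat"):
    """Read /proc/stat twice and return per-vCPU utilization over the interval."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval_seconds)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    return per_vcpu_utilization(before, after)
```

Calling `sample()` on the instance returns a dictionary such as one entry for `cpu0` and one for `cpu1`, which can then be compared with the baseline discussed above.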


Next, there could be multiple reasons for the server becoming unreachable, and we would need to dive deeper into them. You would need to share more details if you want help with this issue: relevant log snippets with errors, what happens when the instance crashes, any troubleshooting performed so far, and the exact configuration of the instance (Lightsail offers two plans with 2 vCPUs, and burst time varies with the configuration). When you run the script, we would also need to check the memory metrics from within the instance using a utility like 'top', to confirm that memory is not being depleted.
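As a rough sketch of that memory check on a Linux instance, the available memory can be read from `/proc/meminfo` while the script runs. This assumes a kernel that reports the `MemAvailable` field (3.14 or later); the function names are illustrative only.

```python
def parse_meminfo(text):
    """Return /proc/meminfo values keyed by field name (most are in kB)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def available_memory_percent(meminfo):
    """Share of total memory that the kernel estimates is still available."""
    return 100.0 * meminfo["MemAvailable"] / meminfo["MemTotal"]


def read_available_memory_percent(path="/proc/meminfo"):
    """Read the live value on the instance."""
    with open(path) as f:
        return available_memory_percent(parse_meminfo(f.read()))
```

Logging `read_available_memory_percent()` to a file every few seconds would leave a record to inspect after a reboot, even if the SSH session drops.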


Lastly, I would recommend opening a case with us at AWS Premium Support and choosing the Chat/Call option, so that an AWS engineer can check the instance metrics live on a screen share while you run the script.

AWS
SUPPORT ENGINEER
answered 2 years ago
  • Hi, thanks for the response and details on how burst performance works.

    Unfortunately, I don't have the time to help resolve the issue, as it is easier for us to use another service for now. But this is an easy-to-replicate issue: just create a new Lightsail instance (any type) and run a process in burst. You won't be able to ssh after a little while despite plenty of burst capacity remaining. It appears to be a known Lightsail issue on the internet. This is in US East (N. Virginia), if that matters.
