Huge, random network downtimes on EC2


Hi,

I've been using Lightsail for about a year without any issues or downtime. A few days ago I switched to EC2 because my application will need more compute power in the near future. I created a c5a.2xlarge instance in eu-west-3. It's an Ubuntu 22.04 x86 instance.

My application is a financial application that connects to a websocket, saves some data to SQLite and sends some POST requests. It uses about 15% of the CPU and 2% of the memory of my instance for the time being.

I followed all the guidelines on instance configuration, and it seemed to work perfectly when I started using it. My problem is that, at random times, about 1-2 times per day, the server loses its internet connection completely and every reconnection attempt by my application fails. If I try to connect via SSH or SFTP during these periods, that fails too. Eventually, after about 2 hours, the instance status check fails and triggers my alarm, which automatically reboots the instance, and it starts working again.

For my application, 99.99% uptime is extremely important, and I cannot figure out what is causing this, since the exact same software ran with 100% uptime for a year on a small Lightsail instance. All the metrics look normal, apart from incoming network packets dropping to 0.

R
asked 7 months ago, 178 views
2 Answers

Hello.

What type of EBS volume are you using?
If it is gp2 and the workload writes heavily, the volume may have used up its burst credits, in which case performance drops to the baseline level.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/general-purpose.html#EBSVolumeTypes_gp2
https://repost.aws/knowledge-center/ebs-volume-burst-balance-low
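As a rough illustration of the burst-credit behavior described in those links, here is a minimal sketch that estimates how long a gp2 volume can sustain a given IOPS rate starting from a full credit bucket. The constants follow the gp2 documentation (3 IOPS per GiB baseline with a floor of 100, bursts up to 3,000 IOPS, a bucket of 5.4 million I/O credits). The function names and the volume sizes in the usage note are hypothetical, since the question does not state the volume size or I/O rate.

```python
# Rough gp2 burst-credit estimate. Constants are taken from the gp2
# documentation; function names and inputs are illustrative assumptions.
GP2_IOPS_PER_GIB = 3
GP2_MIN_BASELINE_IOPS = 100
GP2_MAX_BASELINE_IOPS = 16_000
GP2_MAX_BURST_IOPS = 3_000
GP2_CREDIT_BUCKET = 5_400_000  # I/O credits in a full bucket


def gp2_baseline_iops(size_gib):
    """Baseline IOPS for a gp2 volume of the given size."""
    scaled = GP2_IOPS_PER_GIB * size_gib
    return min(max(scaled, GP2_MIN_BASELINE_IOPS), GP2_MAX_BASELINE_IOPS)


def seconds_until_burst_exhausted(size_gib, sustained_iops):
    """Seconds a full credit bucket lasts at the given sustained IOPS.

    Returns None when the load never drains the bucket, either because
    it is at or below the baseline or because the volume is large enough
    that its baseline already meets the burst ceiling.
    """
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= GP2_MAX_BURST_IOPS:
        return None
    # A bursting volume cannot exceed the burst ceiling.
    effective = min(sustained_iops, GP2_MAX_BURST_IOPS)
    if effective <= baseline:
        return None
    return GP2_CREDIT_BUCKET / (effective - baseline)
```

For example, `seconds_until_burst_exhausted(100, 3000)` gives 2000.0 seconds for a hypothetical 100 GiB volume driven at the full burst rate. Comparing the result against the BurstBalance metric in CloudWatch for the actual volume would show whether this is a plausible cause.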

EXPERT
answered 7 months ago

I’ve seen similar behavior on Linux machines when the OS runs out of memory.

Check all your system logs too, as the cause could be exhausted file descriptors, memory pressure, and so on.

If you haven't already, install the CloudWatch agent and collect the extra metrics and logs for the investigation.
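Alongside the agent, a small script that writes memory and file-descriptor readings to local disk can help, since a local log can still be read after the reboot. The following is a minimal sketch, assuming a Linux host that exposes `/proc/meminfo` and `/proc/sys/fs/file-nr`; the function names, the log path, and the 60-second interval are hypothetical choices, not anything prescribed by AWS.

```python
import time
from pathlib import Path


def parse_meminfo(text):
    """Return {field: value in kB} parsed from /proc/meminfo content."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def parse_file_nr(text):
    """Return (allocated, maximum) file handles from /proc/sys/fs/file-nr."""
    allocated, _unused, maximum = (int(x) for x in text.split()[:3])
    return allocated, maximum


def snapshot_line(meminfo_text, file_nr_text, now=None):
    """Format one timestamped log line from the two /proc readings."""
    mem = parse_meminfo(meminfo_text)
    allocated, maximum = parse_file_nr(file_nr_text)
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now))
    return (
        f"{stamp} mem_available_kb={mem.get('MemAvailable')} "
        f"swap_free_kb={mem.get('SwapFree')} fds={allocated}/{maximum}"
    )


def log_snapshots(log_path="/var/tmp/resource_snapshots.log",
                  interval_s=60, iterations=None):
    """Append a snapshot every interval_s seconds (forever if iterations is None)."""
    count = 0
    while iterations is None or count < iterations:
        line = snapshot_line(
            Path("/proc/meminfo").read_text(),
            Path("/proc/sys/fs/file-nr").read_text(),
        )
        # Open and close the file on every write so each line reaches disk.
        with open(log_path, "a") as handle:
            handle.write(line + "\n")
        count += 1
        if iterations is None or count < iterations:
            time.sleep(interval_s)
```

Calling `log_snapshots()` from a background service and reading the last lines of the log after the next outage would show whether available memory or the open file handle count was trending toward a limit beforehand.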

EXPERT
answered 7 months ago
