Instance Reachability fails every few days


Hi, I am running a basic web scraper on a t3.micro Linux instance. The programme always runs fine for a few days, but then the Instance Reachability check fails at what looks like a random point. I do notice that CPU utilization rises from around 20% to 45% just before the check first fails. I then have to reboot the instance manually and rerun the web scraper. My questions are as follows:

  1. Why is reachability becoming an issue every few days?
  2. Is there anything that I can put in my code to stop this happening?
  3. How do I make the instance reboot automatically and run my programme again when this check fails?
asked 7 months ago, 671 views
1 Answer
Accepted Answer

My hunch here is that it's memory usage. The t3.micro has only 1 GiB of memory. If your application tries to use more memory than is available, the memory manager comes into play and tries to swap processes out of main memory and onto disk (it's more complicated than that, but that's the general gist of it). This is itself CPU-intensive. As free memory shrinks, the memory manager spends more and more of its time (and more and more CPU) trying to free up pages of main memory, leaving fewer and fewer CPU cycles for anything else, including responding to instance reachability checks and SSH login requests. To all intents and purposes, the instance will appear to have crashed.

Many years ago I ran a website on a free tier EC2 and found that a memory leak in Apache was making my EC2 become unavailable after about a month. The fix was to restart Apache once a week.

Unfortunately, the EC2 section of the AWS Console doesn't display metrics for memory use. You'll need to set up the CloudWatch agent to collect these and see whether this really is what's happening here.
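As a quick check before setting up the CloudWatch agent, memory can also be sampled on the instance itself. The following is a minimal sketch, assuming a Linux kernel that exposes `/proc/meminfo`; the function names are hypothetical and not part of any AWS tooling. It parses the file and reports the fraction of memory still available, which could be logged periodically alongside the scraper.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Memory fields are reported in kB; a few fields (e.g. HugePages_Total)
    are plain counts and are kept as-is.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def available_fraction(info):
    """Return available memory as a fraction of total (0.0 if unknown)."""
    total = info.get("MemTotal", 0)
    # MemAvailable is missing on very old kernels; fall back to MemFree.
    available = info.get("MemAvailable", info.get("MemFree", 0))
    return available / total if total else 0.0


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the kernel's memory summary."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    fraction = available_fraction(read_meminfo())
    print(f"available memory: {fraction:.1%}")
```

If the logged fraction drifts steadily towards zero in the hours before the reachability check fails, that would support the memory explanation.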

If this is what's happening, then you could try to tune your application some more, or move to an instance type with more memory (which will take you out of the free tier), or schedule a restart of your application every couple of days, or add a swap partition.
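The scheduled-restart option can be sketched as a small supervisor script. This is only an illustration under stated assumptions: the two-day interval and the 600 MB cap are made-up values, the function names are hypothetical, and `RLIMIT_AS` limits virtual address space rather than resident memory, so a scraper that drives a headless browser may need a higher cap or none at all. The idea is that the scraper gets restarted on a timer, and if it does leak it fails on its own instead of starving the whole instance.

```python
import resource
import subprocess
import sys
import time

MAX_RUNTIME_SECONDS = 2 * 24 * 60 * 60  # assumption: restart every two days
MEMORY_LIMIT_BYTES = 600 * 1024 * 1024  # assumption: cap well below 1 GiB


def limit_memory():
    """Runs in the child process just before exec: cap its address space."""
    resource.setrlimit(
        resource.RLIMIT_AS, (MEMORY_LIMIT_BYTES, MEMORY_LIMIT_BYTES)
    )


def run_once(command, max_runtime, preexec_fn=limit_memory):
    """Run command until it exits or max_runtime elapses.

    Returns the exit code, or None if the process was stopped on timeout.
    """
    proc = subprocess.Popen(command, preexec_fn=preexec_fn)
    try:
        return proc.wait(timeout=max_runtime)
    except subprocess.TimeoutExpired:
        proc.terminate()  # ask politely first (SIGTERM)
        try:
            proc.wait(timeout=30)
        except subprocess.TimeoutExpired:
            proc.kill()  # force the issue if it ignores SIGTERM
            proc.wait()
        return None


def supervise(command, max_runtime=MAX_RUNTIME_SECONDS, pause=10):
    """Restart command forever, whether it exits, crashes, or times out."""
    while True:
        code = run_once(command, max_runtime)
        reason = "scheduled restart" if code is None else f"exit code {code}"
        print(f"scraper stopped ({reason}); restarting in {pause}s", flush=True)
        time.sleep(pause)


if __name__ == "__main__" and len(sys.argv) > 1:
    supervise(sys.argv[1:])
```

Assuming the script is saved as `supervisor.py` and the scraper as `scraper.py` (both names hypothetical), it would be started with `python3 supervisor.py python3 scraper.py`.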

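To recover automatically from a situation where your instance goes down, consider making it part of an Auto Scaling group. The group replaces an instance that fails its EC2 status checks with a fresh one, rather than rebooting it in place.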

This will need a bit of knowledge of Launch Templates and User Data scripts on your part, but once you get to grips with the basics of these it should be fairly straightforward to implement on what you have here.

answered 7 months ago
reviewed 7 months ago
