Instance Reachability fails every few days


Hi, I am running a basic web scraper on a t3.micro Linux instance. The programme always runs fine for a few days, but then the instance reachability check fails at seemingly random times. I notice that CPU utilization rises from around 20% to 45% just before the check first fails. I then have to reboot the instance manually and rerun the web scraper. My questions are as follows:

  1. Why is reachability becoming an issue every few days?
  2. Is there anything that I can put in my code to stop this happening?
  3. How do I make it so that when this check fails, the instance automatically reboots and runs my programme again?
Goten
asked 5 months ago · 604 views
1 Answer
Accepted Answer

My hunch here is that it's memory usage. The t3.micro has only 1 GiB of memory. If your application tries to use more memory than is available, the kernel's memory manager comes into play and will try to swap pages out of main memory and onto disk (it's more complicated, but that's the general gist of it). This in itself is CPU-intensive, and as free memory shrinks the memory manager spends more and more of its time (and more and more CPU) trying to free up pages of main memory, leaving fewer and fewer CPU cycles for anything else, including responding to instance reachability checks and SSH login requests. To all intents and purposes the instance will appear to have crashed.

Many years ago I ran a website on a free-tier EC2 instance and found that a memory leak in Apache was making the instance unavailable after about a month. The fix was to restart Apache once a week.

Unfortunately, the EC2 section of the AWS Console doesn't display metrics for memory use. You'll need to set up the CloudWatch agent to collect these and see whether this really is what's happening here: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html
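As a quick check from inside the instance itself, the available memory can also be read from `/proc/meminfo` and logged periodically. The following is only a minimal sketch, assuming a Linux kernel recent enough to report `MemAvailable`; the function names are made up for illustration and it is not a replacement for the CloudWatch agent.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most entries are reported in kB; the unit suffix is dropped.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def available_fraction(meminfo):
    """Fraction of total memory the kernel estimates is still available."""
    return meminfo["MemAvailable"] / meminfo["MemTotal"]


def read_available_fraction(path="/proc/meminfo"):
    """Read the live value on a Linux host."""
    with open(path) as f:
        return available_fraction(parse_meminfo(f.read()))
```

Calling `read_available_fraction()` once per scraping cycle and writing the result to a log would show whether available memory drifts towards zero in the days before the instance becomes unreachable.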

If this is what's happening, then you could try to tune your application some more, or move to an instance type with more memory (which will take you out of free tier), or schedule a restart of your application every couple of days, or add a swap partition.
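The scheduled-restart option could be sketched as a small supervisor that runs the scraper as a child process and restarts it after a fixed maximum runtime, so that any leaked memory is released. This is an illustration only: `scraper.py`, `SCRAPER_CMD` and the function name are hypothetical, and a cron job or a systemd timer would do the same job.

```python
import subprocess
import sys
import time

# Hypothetical command line for the scraper; adjust to the real script.
SCRAPER_CMD = [sys.executable, "scraper.py"]


def run_with_periodic_restart(cmd, max_runtime_s, max_runs=None, pause_s=1.0):
    """Run cmd repeatedly, restarting it when it exits or exceeds max_runtime_s.

    Returns the number of runs started. max_runs=None means loop forever.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        proc = subprocess.Popen(cmd)
        runs += 1
        try:
            proc.wait(timeout=max_runtime_s)
        except subprocess.TimeoutExpired:
            proc.terminate()  # ask politely first (SIGTERM on Linux)
            try:
                proc.wait(timeout=10)
            except subprocess.TimeoutExpired:
                proc.kill()  # force it if the child ignores SIGTERM
                proc.wait()
        time.sleep(pause_s)  # short pause before starting the next run
    return runs
```

For a restart every two days, the call would be `run_with_periodic_restart(SCRAPER_CMD, max_runtime_s=2 * 24 * 3600)`. Restarting hides a leak rather than fixing it, so it is worth pairing with memory metrics to confirm the diagnosis.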

To automatically recover from a situation where your instance goes down, consider making it part of an Auto Scaling group, which can replace an instance that fails its health checks: https://docs.aws.amazon.com/autoscaling/ec2/userguide/auto-scaling-groups.html

This will need a bit of knowledge of Launch Templates and User Data scripts on your part, but once you get to grips with the basics of these it should be fairly straightforward to implement on what you have here.
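To give an idea of what a User Data script for this could look like, here is a rough sketch that builds a first-boot shell script to start the scraper and base64-encodes it, which is the form a Launch Template expects when it is created through the API. The script path, the `ec2-user` account, the log file name and both function names are assumptions for illustration.

```python
import base64
import shlex


def build_user_data(start_command, run_as="ec2-user"):
    """Return a shell script that starts the scraper when the instance first boots.

    User data scripts run as root, so the command is handed to `su`
    to run under an unprivileged account instead.
    """
    background = f"nohup {start_command} >> scraper.log 2>&1 &"
    return "\n".join([
        "#!/bin/bash",
        f"su - {run_as} -c {shlex.quote(background)}",
        "",
    ])


def encode_user_data(script):
    """Base64-encode the script text as an ASCII string."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")
```

For example, `encode_user_data(build_user_data("python3 /home/ec2-user/scraper.py"))` gives a string that could be supplied as the Launch Template's user data. Because the script runs on an instance's first boot, each replacement instance launched by the group would start the scraper without manual steps, provided the scraper code is already present on the AMI.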

Steve_M (Expert)
answered 5 months ago