Inexplicable EC2 Linux instance CPU spikes

We have an EC2 Linux (Ubuntu 22.04) t2.medium instance running a Node-based backend service with one port open (behind our LB, with WAF and Shield protection). Recently we noticed intermittent CPU usage spikes that occur at random and typically last 1-2 hours (the duration varies). They are not visible on the host itself (checking top) or reflected in the AWS Systems Manager Fleet Manager performance counters. There are no apparent network spikes at the same time, and no scheduled services access the instance (backups, snapshots, etc.). From the metrics, it looks to me like a possible brute-force or DoS attack, but as mentioned there is no visible network traffic to back that up.

Any ideas as to what this could be, or other places to check (such as AWS logs) so we can find out? (We are looking into this because of a random service crash, but so far the times don't correlate.)

[Image: 2 week view]

[Image: 6 hour view]

1 Answer

Your EC2 t2.medium instance running Ubuntu 22.04 shows intermittent CPU spikes lasting 1-2 hours in the metrics, yet they are visible neither on the host itself nor in the AWS Systems Manager Fleet Manager performance counters. That discrepancy is unusual and worth investigating.

Given that you're not seeing corresponding network traffic spikes and the instance is behind a load balancer with WAF and Shield protection, it's less likely to be a direct DDoS or brute force attack. However, there are several areas you can investigate further:

  1. CloudWatch Metrics: Look at the CPUUtilization metric at the shortest period available, and check the Maximum statistic alongside Average, because averaging over longer periods can mask short-term spikes. You can also use CloudWatch Metrics Insights to query for the instances with the highest CPU utilization and compare the results with your observations.

  2. Detailed Monitoring: If it is not already enabled, consider enabling detailed monitoring for the instance. It provides data points at 1-minute intervals instead of the default 5-minute intervals.

  3. CloudWatch Logs: If you have CloudWatch Logs set up for your instance, review them for any unusual activity or error messages coinciding with the CPU spikes.

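  4. EC2 Instance Connect: If you use this service to connect to the instance, check whether sessions coincide with the spikes. A connection on its own is unlikely to consume much CPU, but commands run during a session could.

  5. T2 Unlimited: T2 instances are burstable and run on CPU credits. Review the CPUCreditUsage and CPUCreditBalance metrics in CloudWatch (and CPUSurplusCreditBalance if the instance runs in unlimited mode) to see whether the spikes line up with credit consumption.

  6. Hidden Processes: Some malicious processes can hide from standard monitoring tools. Try alternative tools such as 'atop' (which can log process activity over time, so it is useful for spikes you are not watching live) or 'htop' for a second view. Keep in mind that these tools read the same kernel data as top, so a deliberately hidden process may not appear in them either.

  7. Kernel-level Issues: The discrepancy between CloudWatch metrics and on-instance monitoring could indicate kernel-level problems. Check the kernel logs (for example with 'dmesg' or 'journalctl -k') for anomalies around the time of the spikes.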

  8. AWS Support: If you have an appropriate support plan, consider opening a case with AWS Support for more in-depth investigation.

  9. Instance Upgrade: As a test, you could temporarily upgrade to a larger instance type to see if the issue persists, which might help isolate whether it's related to the t2.medium's specific constraints.
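
For the on-host checks in items 6 and 7, one option that does not depend on top is to read the kernel's own CPU counters. The following is a minimal sketch, assuming a Linux host where /proc/stat is readable. The function names and the 5-second interval are illustrative and not part of the original answer. It samples the aggregate cpu line twice and reports how the elapsed CPU time was split, including steal, which is time the virtual CPU spent waiting on the hypervisor.

```python
import time

# Field order of the aggregate "cpu" line in /proc/stat (see proc(5)).
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu ...' line into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Older kernels expose fewer columns; treat missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def cpu_percentages(before, after):
    """Return each field's share (percent) of the time between two samples."""
    # guest and guest_nice are already included in user and nice.
    counted = [f for f in FIELDS if f not in ("guest", "guest_nice")]
    deltas = {f: after[f] - before[f] for f in counted}
    total = sum(deltas.values())
    if total <= 0:
        return {f: 0.0 for f in counted}
    return {f: 100.0 * d / total for f, d in deltas.items()}


def read_cpu_sample(path="/proc/stat"):
    """Read the current aggregate CPU counters (Linux only)."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())


def sample_and_report(interval_seconds=5.0):
    """Take two samples `interval_seconds` apart and print the breakdown."""
    first = read_cpu_sample()
    time.sleep(interval_seconds)  # hypothetical sampling interval
    second = read_cpu_sample()
    for field, pct in cpu_percentages(first, second).items():
        print(f"{field:>8}: {pct:5.1f}%")
```

Calling sample_and_report() during a spike and comparing the result with the CloudWatch graph may show whether the two views really disagree. A large steal share would point toward the hypervisor or CPU credit throttling rather than a process on the instance, though that interpretation is an assumption to verify.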

Remember, without more detailed logs or direct access to the instance, it's challenging to pinpoint the exact cause. Continuous monitoring and correlation of various metrics over time will likely be necessary to identify the root cause of these mysterious CPU spikes.
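
To make the correlation step concrete, here is a small sketch, assuming you have exported CPU datapoints as (timestamp, percent) pairs and have a list of crash timestamps. The function names, the threshold, and the margin are hypothetical choices for illustration. It groups high-CPU datapoints into spike windows and returns the events that fall in or near them.

```python
from datetime import datetime, timedelta


def find_spike_windows(datapoints, threshold, max_gap):
    """Group datapoints at or above `threshold` into (start, end) windows.

    `datapoints` is a list of (timestamp, cpu_percent) tuples.
    High datapoints separated by at most `max_gap` share a window, so with
    `max_gap` equal to the sampling period a single low datapoint ends it.
    """
    windows = []
    for ts, value in sorted(datapoints):
        if value < threshold:
            continue
        if windows and ts - windows[-1][1] <= max_gap:
            windows[-1] = (windows[-1][0], ts)  # extend the current window
        else:
            windows.append((ts, ts))  # start a new window
    return windows


def events_near_windows(events, windows, margin):
    """Return the events that fall inside any window widened by `margin`."""
    return [
        event for event in events
        if any(start - margin <= event <= end + margin
               for start, end in windows)
    ]
```

For example, with 5-minute datapoints you might call find_spike_windows(points, threshold=70, max_gap=timedelta(minutes=5)) and then pass the crash times to events_near_windows with a margin of a few minutes. An empty result would support the observation that the crashes and the spikes are unrelated.
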
Sources
AWS EC2 T2 Medium Performance Issue | AWS re:Post
Identify the EC2 instances with the most CPU utilization | AWS re:Post
View the monitoring graphs for your instances - Amazon Elastic Compute Cloud

answered a year ago
AWS
EXPERT
reviewed 4 months ago