Why did my Amazon EC2 instance automatically or unexpectedly stop?
My Amazon Elastic Compute Cloud (Amazon EC2) instance automatically or unexpectedly stopped.
Resolution
Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.
Identify why the EC2 instance stopped
Check the instance's StateReason code
To quickly identify why your instance stopped, run the following describe-instances AWS CLI command:
aws ec2 describe-instances --instance-ids i-1234567890abcdef0 --query "Reservations[].Instances[].{StateReason:StateReason}" --output json
Note: Replace i-1234567890abcdef0 with your instance ID.
In the output, check the StateReason value to identify why the instance stopped.
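As an illustration, the following Python sketch reads the JSON that the preceding command prints and maps each StateReason code to the matching section of this article. The function name next_step and the SECTIONS mapping are hypothetical helpers, and the mapping covers only the three codes that this article discusses.

```python
import json

# Hypothetical mapping from the StateReason codes named in this article
# to the section that covers each one.
SECTIONS = {
    "Client.UserInitiatedShutdown": "Troubleshoot the Client.UserInitiatedShutdown StateReason",
    "Client.InstanceInitiatedShutdown": "Troubleshoot the Client.InstanceInitiatedShutdown StateReason",
    "Server.SpotInstanceTermination": "Troubleshoot the Server.SpotInstanceTermination StateReason",
}


def next_step(cli_output: str) -> list[tuple[str, str]]:
    """Return (code, section) pairs, one per instance in the CLI output."""
    results = []
    for item in json.loads(cli_output):
        # StateReason is null for an instance that never stopped.
        reason = item.get("StateReason") or {}
        code = reason.get("Code", "unknown")
        results.append((code, SECTIONS.get(code, "Check the other sections of this article")))
    return results


# Illustrative output of the describe-instances command above.
sample = (
    '[{"StateReason": {"Code": "Client.UserInitiatedShutdown", '
    '"Message": "Client.UserInitiatedShutdown: User initiated shutdown"}}]'
)
for code, section in next_step(sample):
    print(f"{code} -> {section}")
```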
Check the instance's Event history in CloudTrail
Check the AWS CloudTrail Event history for the StopInstances event to get more information about why the instance stopped.
On the Event history page, check the following values:
- Check eventTime to find the exact time when the stop action occurred.
- Check userIdentity for the AWS Identity and Access Management (IAM) user, role, or service that initiated the stop.
- Check userAgent to find the tool or service that the user used to make the API call, such as the AWS CLI or AWS Lambda.
- Check the requestParameters to find the instances that stopped.
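If you export the event record as JSON, then a short script can pull out the same four values. The following Python sketch assumes the usual shape of a StopInstances record, where the instance IDs are under requestParameters.instancesSet.items. The function name summarize_stop_event and all sample values, including the account ID, user name, and user agent string, are illustrative.

```python
import json


def summarize_stop_event(record: dict) -> dict:
    """Extract the fields that the list above says to check."""
    identity = record.get("userIdentity", {})
    params = record.get("requestParameters") or {}
    items = params.get("instancesSet", {}).get("items", [])
    return {
        "eventTime": record.get("eventTime"),
        # Prefer the caller ARN; fall back to the invoking AWS service.
        "caller": identity.get("arn") or identity.get("invokedBy"),
        "userAgent": record.get("userAgent"),
        "instanceIds": [item.get("instanceId") for item in items],
    }


# Illustrative record; every value here is made up.
sample_record = json.loads("""
{
  "eventTime": "2024-01-15T03:00:12Z",
  "eventName": "StopInstances",
  "userIdentity": {
    "type": "IAMUser",
    "arn": "arn:aws:iam::111122223333:user/example-user"
  },
  "userAgent": "aws-cli/2.15.0",
  "requestParameters": {
    "instancesSet": {"items": [{"instanceId": "i-1234567890abcdef0"}]}
  }
}
""")

summary = summarize_stop_event(sample_record)
print(summary)
```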
Troubleshoot the Client.UserInitiatedShutdown StateReason
If the StateReason is Client.UserInitiatedShutdown, then use the CloudTrail console to identify the user that initiated the stop action.
Or, run the following lookup-events command:
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=ResourceName,AttributeValue=i-1234567890abcdef0 \
  --start-time "starttime" \
  --end-time "endtime" \
  --query "Events[?EventName=='StopInstances']"
Note: Replace i-1234567890abcdef0 with your instance ID, starttime with the start of the time range that you want to search, and endtime with the end of the time range.
Check userIdentity for the IAM user, role, or service that initiated the stop. If the userIdentity value is an IAM user from your AWS account, then a user in your account manually stopped the instance. If the userIdentity value is an IAM role, then an automated process that uses the role stopped the instance.
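The following Python sketch applies that interpretation to the output of the lookup-events command. It assumes that each returned event carries the full record as a JSON string in the CloudTrailEvent field and that userIdentity.type is IAMUser, AssumedRole, or AWSService. The function names are hypothetical, and the labels follow this article's reading of each type. A person who assumes a role also appears as AssumedRole, so treat the result as a starting point.

```python
import json


def describe_initiator(user_identity: dict) -> str:
    """Label the caller according to the userIdentity type."""
    id_type = user_identity.get("type")
    if id_type == "IAMUser":
        return "IAM user (a user in the account stopped the instance)"
    if id_type == "AssumedRole":
        return "IAM role (likely an automated process that uses the role)"
    if id_type == "AWSService" or user_identity.get("invokedBy"):
        return "AWS service"
    return f"other ({id_type})"


def initiators_from_lookup(lookup_output: str) -> list[str]:
    """lookup_output is the JSON array that the lookup-events command prints."""
    labels = []
    for event in json.loads(lookup_output):
        # The complete CloudTrail record is embedded as a JSON string.
        record = json.loads(event["CloudTrailEvent"])
        labels.append(describe_initiator(record.get("userIdentity", {})))
    return labels


# Illustrative lookup-events output with two made-up events.
sample_output = json.dumps([
    {"EventName": "StopInstances",
     "CloudTrailEvent": json.dumps({"userIdentity": {"type": "IAMUser"}})},
    {"EventName": "StopInstances",
     "CloudTrailEvent": json.dumps({"userIdentity": {"type": "AssumedRole"}})},
])
for label in initiators_from_lookup(sample_output):
    print(label)
```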
To prevent future unexpected instance stops, remove the ec2:StopInstances permissions from IAM users and roles that aren't authorized to stop instances.
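To find policies that still grant the permission, you can scan each policy document for Allow statements whose Action matches ec2:StopInstances, including wildcards such as ec2:* or *. The following Python sketch is a simplified check under stated assumptions: it ignores NotAction, Condition, Resource scoping, and explicit Deny statements, so it can flag a policy but can't prove that a principal is unable to stop instances. The function name allows_stop is hypothetical.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM allows a single value or a list for Statement and Action."""
    return value if isinstance(value, list) else [value]


def allows_stop(policy: dict) -> bool:
    """Return True if any Allow statement's Action matches ec2:StopInstances."""
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        for action in _as_list(statement.get("Action", [])):
            # Action names are case-insensitive and can contain * and ? wildcards.
            if fnmatchcase("ec2:stopinstances", action.lower()):
                return True
    return False


# Illustrative policy documents.
broad_policy = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "ec2:*", "Resource": "*"}],
}
read_only_policy = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": ["ec2:Describe*"], "Resource": "*"}],
}
print(allows_stop(broad_policy), allows_stop(read_only_policy))
```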
Troubleshoot the Client.InstanceInitiatedShutdown StateReason
If the StateReason is Client.InstanceInitiatedShutdown, then the instance's operating system (OS) issued a shutdown or halt command. OS-initiated shutdowns bypass AWS APIs, so they don't generate StopInstances events in CloudTrail.
Start your instance, and then retrieve the instance console output. In the output, check for kernel panics, Out of Memory (OOM) messages, or shutdown sequences.
To identify what caused the instance to stop, run the following commands to check the logs based on your OS.
Linux:
# Check for shutdown/halt/reboot commands in auth logs
grep -i "shutdown\|halt\|poweroff\|reboot" /var/log/auth.log /var/log/secure 2>/dev/null

# Check system journal for previous boot's final messages
journalctl --list-boots
journalctl -b -1 -r | head -100

# Check for OOM killer events
journalctl -b -1 | grep -i "out of memory\|oom-killer"
dmesg | grep -i "oom\|killed process"

# Check for kernel panic
journalctl -b -1 | grep -i "kernel panic\|BUG:"

# Check who/what initiated shutdown
last -x shutdown reboot | head -10

# Check if a scheduled shutdown was set
cat /run/systemd/shutdown/scheduled 2>/dev/null
Windows:
# Check Windows Event Log for shutdown events
# Event 1074 = user/process initiated shutdown
# Event 6006 = clean shutdown
# Event 6008 = unexpected shutdown (crash/BSOD)
Get-WinEvent -FilterHashtable @{LogName='System'; ID=1074,6006,6008} | Select-Object -First 10 | Format-List

# Check for BugCheck (BSOD) events
Get-WinEvent -FilterHashtable @{LogName='System'; ProviderName='Microsoft-Windows-WER-SystemErrorReporting'} -ErrorAction SilentlyContinue
To check for automated OS-level shutdown triggers on Linux instances, run the following commands:
# Check for shutdown scheduled via cron
crontab -l | grep -i "shutdown\|halt\|poweroff\|reboot"
sudo crontab -l | grep -i "shutdown\|halt\|poweroff\|reboot"

# Check systemd timers
systemctl list-timers --all | grep -i "shutdown\|reboot"

# Check if unattended-upgrades triggered a reboot (Debian/Ubuntu)
cat /var/log/unattended-upgrades/unattended-upgrades-shutdown.log 2>/dev/null

# Check watchdog configuration
systemctl status watchdog 2>/dev/null
To troubleshoot kernel panic or OOM issues, proceed to Troubleshoot instances with high resource usage.
Troubleshoot the Server.SpotInstanceTermination StateReason
If your instance is a Spot Instance and the StateReason is Server.SpotInstanceTermination, then Amazon EC2 reclaimed the capacity. For more information, see Why did Amazon EC2 interrupt my Spot Instance?
To work around Spot Instance interruptions, take the following actions:
- Use a diversified fleet strategy across multiple instance types and Availability Zones.
- Use Spot Instance interruption notices to gracefully manage shutdowns.
- Use Capacity Rebalancing to proactively replace at-risk Spot Instances.
- For workloads that can't be interrupted, use On-Demand or Reserved Instances instead.
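To illustrate how an interruption notice can drive a graceful shutdown, the following Python sketch parses a notice and calculates how much time remains. It assumes the notice is the small JSON document with action and time fields that the instance metadata path spot/instance-action returns when an interruption is pending. Retrieving the document from IMDS is omitted, and the function name seconds_until_interruption is hypothetical.

```python
import json
from datetime import datetime, timezone


def seconds_until_interruption(notice_body: str, now: datetime) -> tuple[str, float]:
    """Return the pending action and the seconds left before it takes effect."""
    notice = json.loads(notice_body)
    # The time field is assumed to be a UTC timestamp such as 2017-09-18T08:22:00Z.
    when = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    when = when.replace(tzinfo=timezone.utc)
    return notice["action"], (when - now).total_seconds()


# Illustrative notice and a fixed "current" time two minutes earlier.
sample_notice = '{"action": "stop", "time": "2017-09-18T08:22:00Z"}'
current_time = datetime(2017, 9, 18, 8, 20, 0, tzinfo=timezone.utc)

action, remaining = seconds_until_interruption(sample_notice, current_time)
print(f"Pending action: {action}; seconds remaining: {remaining}")
```

A workload could use the remaining time to checkpoint state or drain connections before the instance stops.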
Troubleshoot automated processes that stopped your instance
The userAgent field in CloudTrail shows whether a Lambda function, scheduled script, or other automated process stopped your instance. To stop or modify an automation that stopped your instance, take the following actions based on the source.
Update Lambda functions
If the userAgent field shows a Lambda function such as ssmApplicationInstancesToggle, then update the Lambda function and check the following configurations:
- Check the function code for stop logic.
- Check the function's triggers, such as Amazon EventBridge rules, schedules, or other event sources.
- Modify the function logic to exclude instances that you don't want to stop.
- Adjust the schedule.
Or, delete the function if you no longer need it.
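One way to exclude instances is to filter on a tag before the function calls the stop API. The following Python sketch shows only the filter step. It assumes that each instance is a dictionary in the shape that describe-instances returns, with InstanceId and an optional Tags list. The tag key DoNotStop and the function name instances_safe_to_stop are hypothetical.

```python
# Hypothetical tag key that marks instances the automation must skip.
PROTECT_TAG_KEY = "DoNotStop"


def instances_safe_to_stop(instances: list[dict]) -> list[str]:
    """Return the IDs of instances that don't carry the protection tag."""
    safe = []
    for instance in instances:
        tag_keys = {tag["Key"] for tag in instance.get("Tags", [])}
        if PROTECT_TAG_KEY not in tag_keys:
            safe.append(instance["InstanceId"])
    return safe


# Illustrative instances; the IDs and tags are made up.
candidates = [
    {"InstanceId": "i-0aaaaaaaaaaaaaaaa", "Tags": [{"Key": "DoNotStop", "Value": "true"}]},
    {"InstanceId": "i-0bbbbbbbbbbbbbbbb", "Tags": [{"Key": "Name", "Value": "batch-worker"}]},
    {"InstanceId": "i-0cccccccccccccccc"},
]
print(instances_safe_to_stop(candidates))
```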
Update Amazon EC2 Auto Scaling scale-in events
If your instance is in an Amazon EC2 Auto Scaling group, then the Auto Scaling group might terminate the instance during a scale-in event.
To check whether the instance belongs to an Auto Scaling group, run the following describe-auto-scaling-instances command:
aws autoscaling describe-auto-scaling-instances \
  --instance-ids i-1234567890abcdef0
Note: Replace i-1234567890abcdef0 with your instance ID.
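As a sketch of how to read the result, the following Python function summarizes the command's JSON output. It assumes the output has an AutoScalingInstances list whose entries include InstanceId, AutoScalingGroupName, LifecycleState, and ProtectedFromScaleIn. An empty list means that the instance isn't in an Auto Scaling group. The function name asg_membership and the group name in the sample are hypothetical.

```python
import json


def asg_membership(cli_output: str) -> list[dict]:
    """Summarize Auto Scaling group membership from the CLI output."""
    data = json.loads(cli_output)
    return [
        {
            "instance": entry["InstanceId"],
            "group": entry["AutoScalingGroupName"],
            "lifecycle": entry.get("LifecycleState"),
            "protected": entry.get("ProtectedFromScaleIn", False),
        }
        for entry in data.get("AutoScalingInstances", [])
    ]


# Illustrative output; the group name is made up.
sample_output = json.dumps({
    "AutoScalingInstances": [{
        "InstanceId": "i-1234567890abcdef0",
        "AutoScalingGroupName": "example-asg",
        "LifecycleState": "InService",
        "ProtectedFromScaleIn": False,
    }]
})
for member in asg_membership(sample_output):
    print(member)
```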
To identify what caused the instance to stop, check the instance's scaling activities.
To make sure that a scale-in activity doesn't cause the instance to stop, activate instance scale-in protection.
Update scheduled scripts or cron jobs
If the userAgent value shows that the call came from the AWS CLI on an instance that used temporary credentials from the Instance Metadata Service (IMDS), then a script on that instance stopped the instance.
To identify the script, connect to the instance that's listed in CloudTrail.
For Windows instances, run the following command to check Task Scheduler:
Get-ScheduledTask | Where-Object {$_.State -ne "Disabled"} | Select-Object TaskName, TaskPath, State
For Linux instances, run the following commands to check cron jobs for all users:
# Check current user's cron jobs
crontab -l

# Check root user's cron jobs
sudo crontab -l

# List all users' cron jobs
for user in $(cut -f1 -d: /etc/passwd); do
  echo "Cron jobs for $user:"
  sudo crontab -u "$user" -l 2>/dev/null
done
Then, run the following command to review the cron logs to identify the user and script that ran during the time of the stop event:
sudo grep "time-of-stop-event" /var/log/cron
Note: Replace time-of-stop-event with the time when the instance stopped.
After you identify the cron job or scheduled task that stopped the instance, take one of the following actions:
- Modify the script logic so that it doesn't stop instances.
- Adjust the schedule.
- Remove the script.
If you can't identify the automation that stopped the instance, then update the instance that's listed in CloudTrail so that it can't stop instances. Modify the instance's IAM role permissions policy to remove the ec2:StopInstances and ec2:StartInstances permissions.
Check whether there was a scheduled maintenance event
AWS periodically performs maintenance on the underlying hardware that hosts instances.
To check whether a scheduled maintenance event occurred, run the following describe-instance-status command:
aws ec2 describe-instance-status \
  --instance-ids i-1234567890abcdef0 \
  --include-all-instances \
  --query "InstanceStatuses[*].Events"
Note: Replace i-1234567890abcdef0 with your instance ID.
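The query returns one list of events per instance. The following Python sketch flattens that output and keeps the fields that are most useful for this check. It assumes each event has Code, Description, and NotBefore fields. The function name scheduled_events and the sample event values are illustrative.

```python
import json


def scheduled_events(cli_output: str) -> list[dict]:
    """Flatten the per-instance event lists into one list of summaries."""
    events = []
    for per_instance in json.loads(cli_output) or []:
        # An instance without scheduled events can yield null or an empty list.
        for event in per_instance or []:
            events.append({
                "code": event.get("Code"),
                "description": event.get("Description"),
                "not_before": event.get("NotBefore"),
            })
    return events


# Illustrative output for one instance with one scheduled event.
sample_output = json.dumps([[{
    "InstanceEventId": "instance-event-0123456789abcdef0",
    "Code": "instance-stop",
    "Description": "The instance is running on degraded hardware",
    "NotBefore": "2024-05-23T00:00:00+00:00",
}]])
for event in scheduled_events(sample_output):
    print(event["code"], event["not_before"], event["description"])
```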
For more information, see How do I manage and reschedule Amazon EC2 instance maintenance events?
When AWS schedules a maintenance event, you receive an alert before the event's date and time. If you didn't receive email notifications about scheduled maintenance, then take the following actions:
- Make sure that the primary email address that's associated with your account is correct.
- Check spam or junk folders for AWS notifications.
- Set up proactive notifications through the AWS Health Dashboard to receive alerts through multiple channels.
Identify automatic recovery actions
System status checks detect issues with the underlying host hardware or AWS infrastructure. If your instance failed a system status check, then Amazon EC2 runs EC2 Auto Recovery. By default, EC2 Auto Recovery is activated on instance types that support automatic recovery. To check whether an automatic recovery action stopped your instance, see Verify if automatic instance recovery occurred.
In CloudTrail, the invokedBy value is monitoring.amazonaws.com for events that occurred because of automatic recovery.
Example event:
{
  "eventSource": "ec2.amazonaws.com",
  "eventName": "StopInstances",
  "userIdentity": {
    "invokedBy": "monitoring.amazonaws.com"
  }
}
Important: Instance status check failures don't result in automatic recovery. Only system status checks do.
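If you review exported CloudTrail records in bulk, then a small filter can flag stops that automatic recovery initiated. The following Python sketch checks the eventName and invokedBy values described in this section. The function name is_automatic_recovery is hypothetical, and the second sample record is made up for contrast.

```python
def is_automatic_recovery(record: dict) -> bool:
    """Return True for a StopInstances event invoked by monitoring.amazonaws.com."""
    invoked_by = record.get("userIdentity", {}).get("invokedBy")
    return (
        record.get("eventName") == "StopInstances"
        and invoked_by == "monitoring.amazonaws.com"
    )


# The first record mirrors the example event in this section.
recovery_event = {
    "eventSource": "ec2.amazonaws.com",
    "eventName": "StopInstances",
    "userIdentity": {"invokedBy": "monitoring.amazonaws.com"},
}
# Illustrative record for a stop that a user initiated.
user_event = {
    "eventSource": "ec2.amazonaws.com",
    "eventName": "StopInstances",
    "userIdentity": {"type": "IAMUser"},
}
print(is_automatic_recovery(recovery_event), is_automatic_recovery(user_event))
```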
For more information about status checks, see How do I troubleshoot status check failures on my Amazon EC2 instance?
Troubleshoot instances with high resource usage
If your instance's resource usage is high, then the instance might become unresponsive. If you configured an automation to stop unresponsive instances, then the automation stops the instance. If out-of-memory issues result in kernel panic errors, then the OS might also initiate a shutdown.
If your instance stopped during a period of high resource usage, then check your Amazon CloudWatch alarms, EventBridge rules, and AWS Systems Manager automation. Identify whether one of the automations caused the instance to stop.
To troubleshoot high resource usage issues, see the following AWS Knowledge Center troubleshooting articles:
- How do I troubleshoot an EC2 Linux instance that fails a status check because of resource over-usage?
- How do I troubleshoot high CPU utilization on an Amazon EC2 Linux instance?
- How do I troubleshoot high memory usage issues on my EC2 Linux instance?
- How do I troubleshoot high CPU usage on an Amazon EC2 Windows instance?
- How can I tell if my Amazon EC2 instance reaches its Amazon EBS quotas?
Update your configuration to get notifications about instance stops
Note: It's a best practice to activate stop protection for instances that must remain running. Stop protection makes sure that APIs can't stop the instance.
To manage and monitor your instance state, configure the following settings:
- Set up Amazon Simple Notification Service (Amazon SNS) topics to receive emails when your instances change states.
- Set up EventBridge rules to get real-time notifications when instances change state.
- Remove the ec2:StopInstances permissions from IAM users and roles that aren't authorized to stop instances.
- Add a tag to critical instances that must never be automatically stopped. Then, modify your automation scripts to exclude instances with the tag. For more information, see EC2: start or stop instances based on tag.
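For the EventBridge rule, the following Python sketch builds an event pattern that matches instances that enter the stopped state and prints it as JSON that you can paste into a rule. It assumes the standard EC2 Instance State-change Notification event from the aws.ec2 source. The matches function is a simplified local approximation of exact-value matching for testing the pattern. It isn't EventBridge's full matching logic.

```python
import json

# Event pattern for EC2 state-change events where the new state is "stopped".
STOPPED_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped"]},
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified check: every pattern key must exist and hold an allowed value."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# Illustrative event in the shape that EventBridge delivers.
sample_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-1234567890abcdef0", "state": "stopped"},
}
print(json.dumps(STOPPED_PATTERN, indent=2))
print(matches(STOPPED_PATTERN, sample_event))
```

To also receive notifications for instances that are shutting down, you could add other state values, such as stopping, to the state list.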
Related information
Why did Amazon EC2 unexpectedly terminate my instance?
Why did my Amazon EC2 Linux instance reboot or restart itself?