
Why does my EC2 stop 10 minutes after being automatically started via Lambda?

0

Hi everyone, I need help with something I can't figure out how to troubleshoot. For context, I have a single EC2 instance that I want to start automatically Monday through Friday at 8 AM EST. Following this documentation from AWS, I created the Lambda function and a schedule that invokes the Lambda function with a cron expression, Monday through Friday at 8 AM EST.

Come 8 AM, the EC2 instance starts automatically, and I see the "StartInstances" event in CloudTrail from the Lambda function. However, consistently ~10 minutes after starting, that same instance enters the stopped state. When checking CloudTrail, I do not see any "StopInstances" events, which makes this very difficult to troubleshoot. After researching for a bit, I am unable to find anyone else with this issue. Any ideas?

Thanks

5 Answers
6

Check for other automation: Make sure there are no other systems or scripts that are scheduled to stop the EC2 instance after it has been started. Check any other Lambda functions, CloudWatch Events rules, or third-party automation tools that may be managing your EC2 instances.

Review CloudWatch Logs: Check the logs for your Lambda function in CloudWatch Logs. Look for any errors or unexpected behavior that may indicate why the EC2 instance is being stopped.

Verify Lambda Function: Double-check the configuration of your Lambda function. Ensure that it is correctly configured to start the EC2 instance and that there are no errors in the code that may be causing the unexpected behavior.

Check IAM Permissions: Ensure that the IAM role associated with your Lambda function has the permissions the function needs. To start instances it needs the ec2:StartInstances action, plus ec2:DescribeInstances if the function lists instances first, as the code below does.

Monitor CloudTrail Events: Keep an eye on CloudTrail events for any unexpected API calls related to EC2 instances. While you didn't see any "StopInstances" events, there may be other actions or API calls that are affecting the instance.

Test Manually: Manually trigger the Lambda function to start the EC2 instance and monitor its behavior. This can help you verify if the issue is specific to the scheduled invocation or if it occurs consistently regardless of how the instance is started.

import boto3


def lambda_handler(event, context):
    # Initialize the EC2 client for the Mumbai region
    ec2_client = boto3.client('ec2', region_name='ap-south-1')

    # Retrieve the IDs of all stopped instances in the Mumbai region
    # (terminated instances cannot be started, so they are filtered out)
    try:
        response = ec2_client.describe_instances(
            Filters=[{'Name': 'instance-state-name', 'Values': ['stopped']}]
        )
        instance_ids = [
            instance['InstanceId']
            for reservation in response['Reservations']
            for instance in reservation['Instances']
        ]
    except Exception as e:
        print(f"Error describing instances: {e}")
        raise

    # start_instances fails on an empty list, so return early if nothing is stopped
    if not instance_ids:
        return {'statusCode': 200, 'body': 'No stopped instances found in Mumbai region'}

    # Start the stopped instances in the Mumbai region
    try:
        ec2_client.start_instances(InstanceIds=instance_ids)
        print(f"Started instances: {instance_ids}")
    except Exception as e:
        print(f"Error starting instances: {e}")
        raise

    return {
        'statusCode': 200,
        'body': 'All stopped instances in Mumbai region started successfully'
    }

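Use the code above with an EventBridge schedule expression. The expression needs six fields (minutes, hours, day-of-month, month, day-of-week, year), so Monday through Friday at 8:00 is cron(0 8 ? * MON-FRI *). EventBridge rules evaluate cron expressions in UTC, so 8 AM EST (UTC-5) is cron(0 13 ? * MON-FRI *).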
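As an alternative to converting the time to UTC by hand, EventBridge Scheduler accepts a time zone alongside the cron expression. The sketch below only builds the arguments for its create_schedule call. The function name build_weekday_start_schedule and the ARNs in the usage comment are placeholders for illustration, not values from the question.

```python
def build_weekday_start_schedule(name, lambda_arn, role_arn,
                                 hour=8, timezone="America/New_York"):
    """Return keyword arguments for EventBridge Scheduler's create_schedule.

    name, lambda_arn and role_arn are supplied by the caller; the defaults
    assume a Monday-Friday 8 AM US Eastern start, as in the question.
    """
    return {
        "Name": name,
        # Fields: minutes hours day-of-month month day-of-week year
        "ScheduleExpression": f"cron(0 {hour} ? * MON-FRI *)",
        # The scheduler evaluates the expression in this time zone
        "ScheduleExpressionTimezone": timezone,
        "FlexibleTimeWindow": {"Mode": "OFF"},
        "Target": {"Arn": lambda_arn, "RoleArn": role_arn},
    }


# Usage (requires boto3 and AWS credentials; the ARNs are placeholders):
#   import boto3
#   kwargs = build_weekday_start_schedule(
#       "start-ec2-weekdays",
#       "arn:aws:lambda:us-east-1:123456789012:function:StartEC2",
#       "arn:aws:iam::123456789012:role/scheduler-invoke-role",
#   )
#   boto3.client("scheduler").create_schedule(**kwargs)
```

Setting the time zone on the schedule means the start time follows daylight saving changes, which a fixed UTC hour would not.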

EXPERT
answered 2 years ago
2

Hi

  • Review EC2 instance settings for automatic termination or suspicious startup scripts.
  • Double-check your Lambda function code and IAM permissions.
  • Filter CloudTrail events for the instance and Lambda function around the stop time, considering different event types.
  • Enable CloudWatch logs for the instance and Lambda function.
  • Use Session Manager to connect to the running instance and investigate further.
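
Filtering CloudTrail around the stop time can be scripted. The following is a minimal sketch, assuming a boto3 CloudTrail client is passed in. The helper name events_around and the 15-minute window are illustrative choices, not part of any AWS tooling.

```python
from datetime import timedelta


def events_around(cloudtrail_client, instance_id, stop_time, window_minutes=15):
    """List CloudTrail events that reference instance_id near stop_time.

    cloudtrail_client is expected to be boto3.client("cloudtrail");
    stop_time is a timezone-aware datetime for the observed stop.
    Returns (event time, event name, username) tuples sorted by time.
    """
    paginator = cloudtrail_client.get_paginator("lookup_events")
    pages = paginator.paginate(
        # lookup_events accepts a single lookup attribute per request
        LookupAttributes=[
            {"AttributeKey": "ResourceName", "AttributeValue": instance_id}
        ],
        StartTime=stop_time - timedelta(minutes=window_minutes),
        EndTime=stop_time + timedelta(minutes=window_minutes),
    )
    found = []
    for page in pages:
        for event in page.get("Events", []):
            found.append(
                (event["EventTime"], event["EventName"], event.get("Username", ""))
            )
    return sorted(found)
```

If the result contains StartInstances but nothing that stops the instance, the stop most likely did not come through the EC2 API.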
EXPERT
answered 2 years ago
1
Accepted Answer

You all won't believe the solution to this. It boils down to me being the issue. Before I learned how to schedule shutdowns automatically (I'm new to AWS), I had set up two Windows scheduled tasks on this machine: one to shut down at 9 PM EST, and another to shut down at 2 AM EST in case the machine was powered on again after the first shutdown. Both tasks had the option selected to "run ASAP the next available time if the first run was skipped". On days when the 9 PM task ran, the 2 AM task was skipped because the machine was already off. So, on boot in the morning, Windows would run the missed 2 AM shutdown task within the first 10 minutes of being powered on. I disabled the setting to run a task if it was missed, and the EC2 instance now stays on in the morning.

Thanks anyway everyone.

answered 2 years ago
EXPERT
reviewed a year ago
1

This is unusual behavior, and it's hard to pinpoint the exact cause without more information. However, here are a few troubleshooting steps you can try:

Auto Scaling Group or AWS Systems Manager Automation: Check if your EC2 instance is part of an Auto Scaling group or if there are any AWS Systems Manager Automation documents running that could be stopping the instance automatically. Review your AWS resources and configurations to ensure there are no conflicting automation or scaling policies in place.

AWS CloudTrail Log Delay: CloudTrail does not log events instantly. Events typically appear within several minutes of the API call, though this is not guaranteed, so wait a while and check CloudTrail again for any "StopInstances" events that may have been logged with a delay.

EC2 Instance Monitoring: Enable detailed monitoring on your EC2 instance and review the CloudWatch metrics for any unusual patterns or events that could be causing the instance to stop.
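
The Auto Scaling group check above can be done programmatically. This is a small sketch, assuming a boto3 Auto Scaling client is passed in. The helper name auto_scaling_group_of is made up for illustration.

```python
def auto_scaling_group_of(autoscaling_client, instance_id):
    """Return the Auto Scaling group name for instance_id, or None.

    autoscaling_client is expected to be boto3.client("autoscaling").
    """
    response = autoscaling_client.describe_auto_scaling_instances(
        InstanceIds=[instance_id]
    )
    instances = response.get("AutoScalingInstances", [])
    # An empty list means the instance is not managed by any group
    return instances[0]["AutoScalingGroupName"] if instances else None
```

A result of None rules out Auto Scaling as the source of the stop for that instance.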

answered 2 years ago
1

When the instance has entered the Stopped state, what does the State transition message field say in the console? Does it say "Client.UserInitiatedShutdown: User initiated shutdown", "Client.InstanceInitiatedShutdown: Instance initiated shutdown", or some other code? The first means that the shutdown was initiated via the StopInstances API call and the second that the power-off command was issued by the operating system on the instance, similarly to how your laptop's operating system can power itself off.

Other codes are also possible (the full list is here: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_StateReason.html), such as a spot instance being stopped by AWS, or a hardware failure forcing AWS to power the instance off. However, since this happens repeatedly shortly after startup and CloudTrail shows no StopInstances API calls, my guess is that the power-off was simply requested by the operating system on the virtual machine. The operating system's logs would probably reveal the reason, such as a scheduled task or cron job being set to do that.
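
The same state reason shown in the console can be read from the DescribeInstances response. Below is a minimal sketch, assuming a boto3 EC2 client is passed in. The helper name stop_reason and the wording of the explanations are illustrative; only the two codes quoted above are mapped.

```python
EXPLANATIONS = {
    "Client.UserInitiatedShutdown": "stopped through the StopInstances API call",
    "Client.InstanceInitiatedShutdown": "powered off by the operating system on the instance",
}


def stop_reason(ec2_client, instance_id):
    """Summarize why an instance stopped, based on its StateReason.

    ec2_client is expected to be boto3.client("ec2").
    """
    response = ec2_client.describe_instances(InstanceIds=[instance_id])
    instance = response["Reservations"][0]["Instances"][0]
    reason = instance.get("StateReason", {})
    code = reason.get("Code", "")
    return {
        "state": instance["State"]["Name"],
        "code": code,
        "message": reason.get("Message", ""),
        # Free-text field that usually includes the time of the transition
        "transition": instance.get("StateTransitionReason", ""),
        # Codes not mapped above need to be looked up in the StateReason reference
        "explanation": EXPLANATIONS.get(code, "see the StateReason documentation"),
    }
```

A code of Client.InstanceInitiatedShutdown points at the operating system, which is where a scheduled task or cron job would show up.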

EXPERT
answered 2 years ago
