Check for other automation: Make sure there are no other systems or scripts that are scheduled to stop the EC2 instance after it has been started. Check any other Lambda functions, CloudWatch Events rules, or third-party automation tools that may be managing your EC2 instances.
Review CloudWatch Logs: Check the logs for your Lambda function in CloudWatch Logs. Look for any errors or unexpected behavior that may indicate why the EC2 instance is being stopped.
Verify Lambda Function: Double-check the configuration of your Lambda function. Ensure that it is correctly configured to start the EC2 instance and that there are no errors in the code that may be causing the unexpected behavior.
Check IAM Permissions: Ensure that the IAM role associated with your Lambda function has the permissions it needs. To start instances it needs the ec2:StartInstances action, plus ec2:DescribeInstances if the function looks up instances first, and ec2:StopInstances if the same role is also used to stop them.
Monitor CloudTrail Events: Keep an eye on CloudTrail events for any unexpected API calls related to EC2 instances. While you didn't see any "StopInstances" events, there may be other actions or API calls that are affecting the instance.
Test Manually: Manually trigger the Lambda function to start the EC2 instance and monitor its behavior. This can help you verify if the issue is specific to the scheduled invocation or if it occurs consistently regardless of how the instance is started.
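To make the "Review CloudWatch Logs" step concrete, here is a minimal sketch that searches a Lambda function's log group for matching lines. It assumes the function writes to the default log group `/aws/lambda/<function name>`; the helper name `find_log_lines` and the function name used below are hypothetical. Filter patterns are case sensitive, so `"Error"` and `"ERROR"` match different lines.

```python
def find_log_lines(logs_client, function_name, pattern="Error", since_ms=0):
    """Return log messages from a Lambda function's log group that match `pattern`.

    `logs_client` is a boto3 CloudWatch Logs client, e.g. boto3.client("logs").
    Assumes the default log group name /aws/lambda/<function_name>.
    `since_ms` is a start time in milliseconds since the Unix epoch.
    """
    paginator = logs_client.get_paginator("filter_log_events")
    pages = paginator.paginate(
        logGroupName=f"/aws/lambda/{function_name}",
        filterPattern=pattern,
        startTime=since_ms,
    )
    # Flatten all pages into a single list of message strings
    return [
        event["message"].rstrip()
        for page in pages
        for event in page.get("events", [])
    ]
```

With real credentials it could be called as `find_log_lines(boto3.client("logs", region_name="ap-south-1"), "start-ec2")`, where `start-ec2` stands in for your own function name.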
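import boto3

def lambda_handler(event, context):
    # Initialize the EC2 client for the Mumbai region
    ec2_client = boto3.client('ec2', region_name='ap-south-1')

    # Retrieve the IDs of all stopped instances in the Mumbai region.
    # start_instances rejects terminated instances and an empty ID list,
    # so only stopped instances are collected here.
    try:
        paginator = ec2_client.get_paginator('describe_instances')
        pages = paginator.paginate(
            Filters=[{'Name': 'instance-state-name', 'Values': ['stopped']}])
        instance_ids = [instance['InstanceId']
                        for page in pages
                        for reservation in page['Reservations']
                        for instance in reservation['Instances']]
    except Exception as e:
        print(f"Error describing instances: {e}")
        raise

    # Nothing to do if no instance is currently stopped
    if not instance_ids:
        return {
            'statusCode': 200,
            'body': 'No stopped instances found in Mumbai region'
        }

    # Start all stopped instances in the Mumbai region
    try:
        ec2_client.start_instances(InstanceIds=instance_ids)
        print(f"Start requested for instances: {instance_ids}")
    except Exception as e:
        print(f"Error starting instances: {e}")
        raise

    return {
        'statusCode': 200,
        'body': f'Start requested for {len(instance_ids)} instance(s) in Mumbai region'
    }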
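Please use the code above with the schedule: 0 8 ? 2-6 *. Note that an EventBridge cron expression needs six fields (minutes, hours, day-of-month, month, day-of-week, year), so a schedule for 08:00 UTC on weekdays would be written as cron(0 8 ? * 2-6 *).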
Hi
- Review EC2 instance settings for automatic termination or suspicious startup scripts.
- Double-check your Lambda function code and IAM permissions.
- Filter CloudTrail events for the instance and Lambda function around the stop time, considering different event types.
- Enable CloudWatch logs for the instance and Lambda function.
- Use Session Manager to connect to the running instance and investigate further.
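As a rough sketch of the CloudTrail filtering suggested above, the following lists the management events recorded for one instance over a recent time window. The helper name `events_for_instance` and the instance ID in the usage note are hypothetical; the sketch uses the CloudTrail `lookup_events` call, which only covers event history for roughly the last 90 days.

```python
from datetime import datetime, timedelta, timezone


def events_for_instance(cloudtrail_client, instance_id, hours=24):
    """Return (time, event name, username) tuples for one instance, oldest first.

    `cloudtrail_client` is a boto3 CloudTrail client,
    e.g. boto3.client("cloudtrail", region_name="ap-south-1").
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    paginator = cloudtrail_client.get_paginator("lookup_events")
    pages = paginator.paginate(
        LookupAttributes=[
            {"AttributeKey": "ResourceName", "AttributeValue": instance_id}
        ],
        StartTime=start,
        EndTime=end,
    )
    events = []
    for page in pages:
        for event in page.get("Events", []):
            events.append(
                (event["EventTime"], event["EventName"], event.get("Username", ""))
            )
    # Sort by event time only, so the sequence around the stop is easy to read
    return sorted(events, key=lambda item: item[0])
```

Printing the result for an ID such as `i-0123456789abcdef0` shows every recorded API call that touched the instance, not only StopInstances, which helps when the stop was triggered by a different call or by nothing in the API at all.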
You all won't believe the solution to this. It boils down to me being the issue. Before I learned how to schedule shutdowns automatically (I'm new to AWS), I had set up two Windows scheduled tasks on this machine: one to shut down at 9pm EST, and another to shut down at 2am EST in case it had been powered on again after the first shutdown. Both tasks had the option "run ASAP the next available time if the first run was skipped" selected. On days when the 9pm task ran, the 2am task was skipped, so after boot in the morning Windows would run the missed 2am shutdown task within the first 10 minutes of being powered on. I disabled the setting that runs a task if it was missed, and the EC2 instance now stays on in the morning.
Thanks anyway everyone.
This is unusual behavior, and it's hard to pinpoint the exact cause without more information. However, here are a few troubleshooting steps you can try:
Auto Scaling Group or AWS Systems Manager Automation: Check if your EC2 instance is part of an Auto Scaling group or if there are any AWS Systems Manager Automation documents running that could be stopping the instance automatically. Review your AWS resources and configurations to ensure there are no conflicting automation or scaling policies in place.
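The Auto Scaling part of this check can be scripted. This is a small sketch, assuming a boto3 Auto Scaling client is passed in; the helper name `auto_scaling_group_of` is hypothetical.

```python
def auto_scaling_group_of(autoscaling_client, instance_id):
    """Return the name of the Auto Scaling group managing the instance, or None.

    `autoscaling_client` is a boto3 client, e.g. boto3.client("autoscaling").
    """
    response = autoscaling_client.describe_auto_scaling_instances(
        InstanceIds=[instance_id]
    )
    instances = response.get("AutoScalingInstances", [])
    # An empty list means the instance is not managed by any Auto Scaling group
    return instances[0]["AutoScalingGroupName"] if instances else None
```

A return value of `None` rules out Auto Scaling as the cause; a group name tells you which group's scaling policies and scheduled actions to review.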
AWS CloudTrail Log Delay: CloudTrail events are not recorded instantly. They typically show up within several minutes of the API call, but that timing is not guaranteed. Wait a while and check CloudTrail again for any "StopInstances" events that may have been logged with a delay.
EC2 Instance Monitoring: Enable detailed monitoring on your EC2 instance and review the CloudWatch metrics for any unusual patterns or events that could be causing the instance to stop.
When the instance has entered the Stopped state, what does the State transition message field say in the console? Does it say "Client.UserInitiatedShutdown: User initiated shutdown", "Client.InstanceInitiatedShutdown: Instance initiated shutdown", or some other code? The first means that the shutdown was initiated via the StopInstances API call and the second that the power-off command was issued by the operating system on the instance, similarly to how your laptop's operating system can power itself off.
Other codes are also possible (the full list is here: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_StateReason.html), such as a Spot Instance being stopped by AWS, or a hardware failure forcing AWS to power the instance off. However, since this happens repeatedly shortly after startup and CloudTrail shows no StopInstances API calls, my guess is that the power-off was simply requested by the operating system on the virtual machine. The operating system's logs would probably reveal the reason, such as a scheduled task or cron job set to do that.
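The same state reason shown in the console is also returned by the DescribeInstances API, so it can be read from a script. Below is a minimal sketch; the helper names `classify_stop_reason` and `fetch_instance` are hypothetical, and only the two codes discussed above are interpreted, with anything else reported as-is.

```python
def classify_stop_reason(instance):
    """Explain why an instance stopped, based on its StateReason code.

    `instance` is one entry from
    describe_instances()["Reservations"][i]["Instances"].
    """
    reason = instance.get("StateReason", {})
    code = reason.get("Code", "")
    if code == "Client.UserInitiatedShutdown":
        return "stopped through the EC2 API (StopInstances)"
    if code == "Client.InstanceInitiatedShutdown":
        return "powered off by the operating system inside the instance"
    # Any other code is passed through together with its message
    return f"other reason: {code or 'unknown'} ({reason.get('Message', 'no message')})"


def fetch_instance(instance_id, region_name):
    """Fetch a single instance description from EC2 (requires AWS credentials)."""
    import boto3  # imported lazily so classify_stop_reason works without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    response = ec2.describe_instances(InstanceIds=[instance_id])
    return response["Reservations"][0]["Instances"][0]
```

For example, `classify_stop_reason(fetch_instance("i-0123456789abcdef0", "ap-south-1"))` with your own instance ID would report whether the stop came from the API or from inside the operating system.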
