The issue you're experiencing with your Azure DevOps agents after an AWS instance refresh in your Auto Scaling Group (ASG) could be related to a few factors. Let's explore some potential causes and solutions:
- Instance Refresh Process: When you start an instance refresh, Amazon EC2 Auto Scaling gradually replaces the instances in your Auto Scaling group. Your refresh specifies a MinHealthyPercentage of 90%, which means that at least 90% of the group's capacity must remain healthy during the refresh. This setting controls how many instances can be replaced at the same time, not which instances are replaced, so with only three instances the refresh most likely proceeds one instance at a time and should still reach all three.
- InstanceWarmup Setting: You've set the InstanceWarmup to 3000 seconds (50 minutes). This is the time Auto Scaling waits after a new instance is launched before considering it ready. Ensure this time is sufficient for your Azure DevOps agent to start and become operational.
- Auto Scaling Group Configuration: Check the minimum, desired, and maximum capacity of your ASG. The group works to maintain its desired capacity, so confirm that the desired capacity is three rather than relying on a range (e.g., minimum 2, maximum 3).
- Instance Configuration: Verify that the new AWS Service Catalog product version correctly includes the setup for the Azure DevOps agent. The issue might be in the product configuration rather than the refresh process.
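The effect of MinHealthyPercentage on a small group can be estimated with a little arithmetic. The sketch below is illustrative only: the function name `max_replaceable_at_once` is hypothetical, and the rounding rule (round the healthy count up, and always allow at least one replacement so the refresh can progress) is an assumption, not the documented Auto Scaling algorithm.

```python
def max_replaceable_at_once(desired_capacity: int, min_healthy_percentage: int) -> int:
    """Estimate how many instances a rolling refresh may take out of service at once.

    Assumption: the group keeps ceil(desired * pct / 100) instances healthy and
    always replaces at least one instance so the refresh can make progress.
    """
    if desired_capacity < 1:
        raise ValueError("desired_capacity must be at least 1")
    if not 0 <= min_healthy_percentage <= 100:
        raise ValueError("min_healthy_percentage must be between 0 and 100")
    # Integer ceiling division avoids floating-point rounding surprises.
    must_stay_healthy = -(-desired_capacity * min_healthy_percentage // 100)
    return max(1, desired_capacity - must_stay_healthy)


# Compare a few settings for a three-instance group.
for pct in (90, 66, 0):
    print(pct, max_replaceable_at_once(3, pct))
```

Under this assumption, a three-instance group has one instance out of service at a time at both 90% and 66%; only a much lower percentage would let several instances be replaced together.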
To address this issue:
- Consider using an instance refresh with checkpoints. This allows you to pause the refresh at certain points to verify that your applications are working correctly.
- If the refresh is progressing too slowly, consider lowering the MinHealthyPercentage (e.g., to 66%). This changes how many instances can be replaced at once, not which instances are replaced.
- After the refresh, manually check the status of your instances and the Azure DevOps agent service. You might need to manually start the service on the instance where it's not running.
- If the issue persists, you may want to implement a custom health check in your Auto Scaling group that verifies the Azure DevOps agent is running before considering an instance healthy.
- Consider using lifecycle hooks in your Auto Scaling group. These can help ensure that your Azure DevOps agent is fully set up and running before the instance is put into service.
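As a minimal sketch of the checkpoint suggestion, the code below assembles the `Preferences` structure accepted by the boto3 `start_instance_refresh` call and passes it to a client. The helper names `build_refresh_preferences` and `start_refresh`, the group name, and the checkpoint values are hypothetical; treat this as a starting point under those assumptions rather than a drop-in pipeline step.

```python
from typing import Any, Dict, List, Optional


def build_refresh_preferences(
    min_healthy_percentage: int,
    instance_warmup_seconds: int,
    checkpoint_percentages: Optional[List[int]] = None,
    checkpoint_delay_seconds: int = 3600,  # illustrative pause length, in seconds
) -> Dict[str, Any]:
    """Build the Preferences dict for an instance refresh."""
    preferences: Dict[str, Any] = {
        "MinHealthyPercentage": min_healthy_percentage,
        "InstanceWarmup": instance_warmup_seconds,
    }
    if checkpoint_percentages:
        ordered = sorted(checkpoint_percentages)
        # The final checkpoint must be 100 for every instance to be replaced.
        if ordered[-1] != 100:
            ordered.append(100)
        preferences["CheckpointPercentages"] = ordered
        preferences["CheckpointDelay"] = checkpoint_delay_seconds
    return preferences


def start_refresh(autoscaling_client: Any, group_name: str, preferences: Dict[str, Any]) -> str:
    """Start a rolling instance refresh and return its ID."""
    response = autoscaling_client.start_instance_refresh(
        AutoScalingGroupName=group_name,
        Strategy="Rolling",
        Preferences=preferences,
    )
    return response["InstanceRefreshId"]


# Example usage (requires boto3 and AWS credentials; "my-asg" is a placeholder):
#   import boto3
#   client = boto3.client("autoscaling")
#   prefs = build_refresh_preferences(90, 3000, checkpoint_percentages=[33])
#   print(start_refresh(client, "my-asg", prefs))
```

Pausing after the first replacement gives you a window to confirm that the new instance registered in the Azure DevOps agent pool before the remaining instances are replaced.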
Remember, when making changes to your Auto Scaling group or starting an instance refresh, it's important to monitor the process closely, especially in production environments. You may want to test these changes in a non-production environment first to ensure they work as expected with your specific setup.
Check the activity history of the ASG to see if it actually replaced all 3 instances. It's possible the default SkipMatching parameter meant the ASG didn't replace all instances if they're on different versions.
If all the instances were replaced, then the Instance Refresh itself worked correctly, and you'll want to look at logs inside the instance to see why the Agent didn't start up correctly. Might be some sort of concurrency issue with the old instances still running as the new ones are starting?
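If you'd rather script the activity check than click through the console, something like the rough sketch below could work. It uses the boto3 `describe_scaling_activities` call; the helper name `count_activity_types` is made up, and classifying activities by the start of their `Description` text ("Launching", "Terminating") is an assumption about how those descriptions are worded, so verify it against your own activity history.

```python
from collections import Counter
from typing import Any, Dict


def count_activity_types(autoscaling_client: Any, group_name: str, max_records: int = 50) -> Dict[str, int]:
    """Count recent launches, terminations, and unsuccessful activities for an ASG."""
    response = autoscaling_client.describe_scaling_activities(
        AutoScalingGroupName=group_name,
        MaxRecords=max_records,
    )
    counts: Counter = Counter()
    for activity in response["Activities"]:
        description = activity.get("Description", "")
        if activity.get("StatusCode") != "Successful":
            counts["not_successful"] += 1
        elif description.startswith("Launching"):  # assumed wording
            counts["launched"] += 1
        elif description.startswith("Terminating"):  # assumed wording
            counts["terminated"] += 1
        else:
            counts["other"] += 1
    return dict(counts)


# Example usage (requires boto3 and AWS credentials; "my-asg" is a placeholder):
#   import boto3
#   print(count_activity_types(boto3.client("autoscaling"), "my-asg"))
```

For a full refresh of a three-instance group you'd expect to see three launches and three terminations in the recent history.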
Hi Shalad. I checked the ASG activity and everything went fine: it replaced all the instances. What I don't understand is that this works perfectly in other environments (ACC) but not in TST. Sometimes, if I stop the instance on which the Azure DevOps agent is not installed, a new one comes up (because of the ASG), and THEN this new instance is visible in the Azure DevOps agent pool! I ran the pipeline again today because there was a new version, and the 1st instance didn't have the Agent installed, but the others did:
- The 1st instance was replaced at 2024/10/08 19:10 (Agent service not installed)
- The 2nd instance was replaced at 2024/10/08 19:30 (Agent service installed)
- The 3rd instance was replaced at 2024/10/08 19:50 (Agent service installed)
The automatic Agent answer is a bit iffy on some of the bullets for this one, FYI