Questions tagged with Amazon EC2 Auto Scaling

How frequently does an ASG attempt to remove instances when current size is greater than desired?

I have an EC2 ASG whose size is driven by triggers based on CPU utilization. Usually it follows a predictable pattern: it scales up during periods of usage and removes instances as load decreases. My instances sometimes mark themselves as protected from scale-in when they are working on something longer-running than their normal tasks. If all instances are protected, I get the message "Could not scale to desired capacity because all remaining instances are protected from scale-in" in CloudWatch. After that message, the next scale-in attempt does not appear to occur for quite a while: it came 10 hours later when this happened yesterday. Since my instances only protect themselves for a short amount of time, the scale-in would have succeeded during most of those 10 hours.

My question: is there a way to configure the ASG so that it retries the scale-in sooner than 10 hours later? Or is there a way I could respond to the failed attempt, so that an instance could perhaps take itself offline?

(I do understand that ideally the instances wouldn't protect themselves in the first place, and that's part of a larger update to the architecture. But a short-term fix to the existing solution would be great.)

To respond to the questions: The alarm triggered on low utilization and immediately reduced the desired count. At that point the alarm was no longer set. I'm looking at the ASG Activity History pane, where there is nothing between message 1, which indicates that the desired size was reduced and that no instance could be removed, and message 2, which says that a particular instance was removed due to a difference between current and desired.
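For readers unfamiliar with the self-protection pattern described above, here is a minimal Python sketch of an instance protecting itself only for the duration of a long-running task. It is an illustration under assumptions, not the poster's code: the helper name `scale_in_protection` is hypothetical, and `autoscaling` is assumed to be a boto3 Auto Scaling client (for example `boto3.client("autoscaling")`) passed in by the caller. It relies only on the `set_instance_protection` call. It does not change how often the ASG retries a scale-in; it only keeps the protected window as short as possible.

```python
from contextlib import contextmanager


@contextmanager
def scale_in_protection(autoscaling, group_name, instance_id):
    """Protect one instance from scale-in while the with-block runs.

    `autoscaling` is assumed to be a boto3 Auto Scaling client supplied
    by the caller; this helper itself is hypothetical.
    """

    def set_protected(flag):
        autoscaling.set_instance_protection(
            InstanceIds=[instance_id],
            AutoScalingGroupName=group_name,
            ProtectedFromScaleIn=flag,
        )

    set_protected(True)
    try:
        yield
    finally:
        # Always release protection, even if the task raised, so the
        # instance does not stay protected longer than necessary.
        set_protected(False)
```

Usage would look like `with scale_in_protection(client, "my-asg", instance_id): run_long_task()`, where `"my-asg"` and `run_long_task` are placeholders.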
1 answer, 0 votes, 80 views, asked 10 months ago

aws_ssm_document addomainjoin error

I am struggling to get EC2 instances deployed via an ASG joined to the domain. I get the following error each time:

*New-SSMAssociation : Document schema version, 2.2, is not supported by association that is created with instance id*

I have tried the various schema versions detailed [here](https://docs.aws.amazon.com/systems-manager/latest/userguide/document-schemas-features.html); however, all fail with the same error.

**SSMdoc.tf**

```
resource "aws_ssm_document" "ad-join-domain" {
  name          = "ad-join-domain"
  document_type = "Command"
  content = jsonencode(
    {
      "schemaVersion" = "2.2"
      "description"   = "aws:domainJoin"
      "parameters" : {
        "directoryId" : {
          "description" : "(Required) The ID of the directory.",
          "type" : "String"
        },
        "directoryName" : {
          "description" : "(Required) The name of the domain.",
          "type" : "String"
        },
        "dnsIpAddresses" : {
          "description" : "(Required) The IP addresses of the DNS servers for your directory.",
          "type" : "StringList"
        },
      },
      "mainSteps" = [
        {
          "action" = "aws:domainJoin",
          "name"   = "domainJoin",
          "inputs" = {
            "directoryId" : data.aws_directory_service_directory.adgems.id,
            "directoryName" : data.aws_directory_service_directory.adgems.name,
            "dnsIpAddresses" : [data.aws_directory_service_directory.adgems.dns_ip_addresses]
          }
        }
      ]
    }
  )
}
```

**template.tf**

```
data "template_file" "ad-join-template" {
  template = <<EOF
<powershell>
    Set-DefaultAWSRegion -Region eu-west-2
    Set-Variable -name instance_id -value (Invoke-Restmethod -uri http://169.254.169.254/latest/meta-data/instance-id)
    New-SSMAssociation -InstanceId $instance_id -Name "${aws_ssm_document.ad-join-domain.name}"
</powershell>
EOF
}
```

The template is then referenced in the ASG Launch Template user_data section. After logging on to the instance I can see the script and logs, and I have confirmed that the variables are set (the instance ID, for example).
The full error message from the PowerShell run is below:

```
New-SSMAssociation : Document schema version, 2.2, is not supported by association that is created with instance id
At C:\Windows\system32\config\systemprofile\AppData\Local\Temp\EC2Launch228430162\UserScript.ps1:3 char:5
+     New-SSMAssociation -InstanceId $instance_id -Name "ad-join-domain ...
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (Amazon.PowerShe...sociationCmdlet:NewSSMAssociationCmdlet) [New-SSMAssociation], InvalidOperationException
    + FullyQualifiedErrorId : Amazon.SimpleSystemsManagement.Model.InvalidDocumentException,Amazon.PowerShell.Cmdlets.SSM.NewSSMAssociationCmdlet
```
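The error text points at the association being created "with instance id". The Systems Manager `CreateAssociation` API also accepts a `Targets` list, and its reference states that requests using `InstanceId` with documents of schema version 2.0 or later fail. The Python sketch below shows the `Targets` form with boto3. It is an untested illustration, not a confirmed fix for this setup: the helper name `associate_document` is hypothetical, and `ssm` is assumed to be a boto3 SSM client passed in by the caller. The PowerShell cmdlet appears to expose a corresponding `-Target` parameter.

```python
def associate_document(ssm, document_name, instance_id):
    """Create an association that targets one instance via Targets.

    `ssm` is assumed to be a boto3 SSM client, e.g. boto3.client("ssm").
    The instance is addressed through the "InstanceIds" target key
    rather than the legacy InstanceId parameter.
    """
    return ssm.create_association(
        Name=document_name,
        Targets=[{"Key": "InstanceIds", "Values": [instance_id]}],
    )
```

A call would look like `associate_document(client, "ad-join-domain", instance_id)`, with `instance_id` read from instance metadata as in the user data script above.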
1 answer, 0 votes, 117 views, asked 10 months ago

Should ECS/EC2 ASGProvider Capacity Provider be able to scale up from zero, 0->1?

Following on from an earlier thread (https://repost.aws/questions/QU6QlY_u2VQGW658S8wVb0Cw/should-ecs-service-task-start-be-triggered-by-asg-capacity-0-1), I've now attached a proper Capacity Provider, an Auto Scaling group provider, to my ECS cluster.

Question TL;DR: should scaling an ECS Service from 0 to 1 desired tasks be able to wake up a previously scaled-to-zero ASG and have it scale 0->1 desired/running?

I started with an ECS Service with a single task definition and Desired=1, backed by the ASG with Capacity Provider scaling, which also started with 1 Desired/InService ASG instance. I can then set the ECS Service desired tasks to 0, which stops the single running task. `CapacityProviderReservation` then goes from 100 to 0, and 15 minutes/samples later the alarm is triggered and the ASG shuts down its only instance, 1->0 Desired/running.

If I later change the ECS Service Desired back to 1, nothing happens, other than ECS noting that it has no capacity to place the task. Should this work?

I have previously seen something similar working: `CapacityProviderReservation` jumps to 200 and an instance gets created. But this is not working for me now: that metric is stuck at 100, no scale-up from zero (to one) occurs in the ASG, and the task cannot be started. Should this be expected to work?

The reference blog https://aws.amazon.com/blogs/containers/deep-dive-on-amazon-ecs-cluster-auto-scaling/ suggests that `CapacityProviderReservation` should move to 200 if `M > 0 and N = 0`, but this seems to rely on a task in the "Provisioning" state. Will that even happen here, or is the ECS Service/Cluster giving up and not getting that far, due to zero capacity?
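To make the metric values in this question easier to follow, here is a small Python sketch of the `CapacityProviderReservation` arithmetic as I read it from the referenced blog: `M / N * 100`, where `N` is the number of instances currently running and `M` the number needed, with two special cases when `N = 0`. The function name is hypothetical, and the `M = 0, N = 0` case (100) is my reading of the blog rather than something stated in the question.

```python
def capacity_provider_reservation(needed, running):
    """Return the metric value for M needed and N running instances."""
    if needed < 0 or running < 0:
        raise ValueError("instance counts must be non-negative")
    if running == 0:
        # Special cases when no instances are running:
        #   nothing needed        -> 100 (no scaling required)
        #   at least one needed   -> 200 (scale out from zero)
        return 100.0 if needed == 0 else 200.0
    return needed / running * 100.0


# The sequence described in the question, as (M, N) pairs.
for label, m, n in [
    ("1 task, 1 instance", 1, 1),           # 100.0
    ("0 tasks, 1 instance", 0, 1),          # 0.0
    ("0 tasks, 0 instances", 0, 0),         # 100.0
    ("1 provisioning task, 0 instances", 1, 0),  # 200.0
]:
    print(label, capacity_provider_reservation(m, n))
```

Under this reading, a metric that stays at 100 after the service is set back to Desired=1 corresponds to `M = 0`, which would be consistent with no task being counted as provisioning.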
2 answers, 0 votes, 400 views, asked a year ago