Questions tagged with Amazon EC2 Auto Scaling

Design questions on asg, backup restore, ebs and efs

Hi experts,

We are planning to deploy a BI application on AWS. By default we repave the EC2 instances every 14 days, which means the whole cluster is rebuilt with its services and brought back to the last known good state. We want a solution with no or minimal downtime.

The application runs different services on different EC2 instances. The first server acts as the main node, and the rest are additional nodes with different services running on them. We install all additional nodes the same way and configure their services later in the code deploy.

1. Can we use an ASG? If yes, how can we distribute the topology? That is, if one of the 5 instances is repaved, the replacement should come up with the same services as the previous one. Is there a way to label instances in an ASG so that a given server is configured for a certain service?
2. Each server should have its own EBS volume and store some data on it. What is the fastest way to copy or attach that EBS volume to the newly repaved server without downtime?
3. For shared data, we want to use EFS.
4. For metadata from the embedded Postgres, we need to take a backup periodically and restore it after a repave (create a new instance with the same install and the same service). How can we achieve this without downtime?

We do not want to use a customized AMI, because our AMI creation process is heavy and we would need to change the AMI every time we add install or config steps to it.

Sorry if this is a lot to answer. Any guidance is helpful.
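One way to picture the "label" idea in the first question is for a freshly launched instance to work out at boot which role is currently missing from the cluster. The sketch below is only an illustration under stated assumptions: the role names in `TOPOLOGY` are hypothetical, and the list of claimed roles is assumed to come from somewhere the instances can all read (for example, a tag each instance writes on itself), which is not something the question specifies.

```python
from typing import Iterable, Optional

# Hypothetical topology: one main node plus four service nodes.
TOPOLOGY = ["main", "etl", "reporting", "scheduler", "cache"]


def pick_unclaimed_role(topology: Iterable[str], claimed: Iterable[str]) -> Optional[str]:
    """Return the first role in `topology` that no live instance has claimed.

    Returns None when every role is already taken.
    """
    taken = set(claimed)
    for role in topology:
        if role not in taken:
            return role
    return None


# Example: the instance that ran "reporting" was repaved, so the
# replacement sees the other four roles claimed and takes the gap.
live_roles = ["main", "etl", "scheduler", "cache"]
role_for_new_instance = pick_unclaimed_role(TOPOLOGY, live_roles)
```

If two replacements boot at the same moment they could both pick the same role, so a real setup would need some form of locking or a fixed one-instance-per-role layout; the sketch does not address that.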
1
answers
0
votes
5
views
AWS-User-4880625
asked a month ago

LoadBalancer health check fails but instance is not terminating

Hello, I have a load balancer that runs health checks against the web app/website. I have deployed nothing on my instance (no app or site), so anyone who visits the load balancer URL sees a 502 Bad Gateway error, which is fine. The target group also shows that the instance has failed the health check, but the Auto Scaling group is not terminating the failed instance and replacing it. Below is the CloudFormation code:

```
AutoScailingGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    VPCZoneIdentifier:
      - Fn::ImportValue: !Sub ${EnvironmentName}-PR1
      - Fn::ImportValue: !Sub ${EnvironmentName}-PR2
    LaunchConfigurationName: !Ref AppLaunchConfiguration
    MinSize: 1
    MaxSize: 4
    TargetGroupARNs:
      - Ref: WebAppTargetGroup
AppLoadBalancer:
  Type: AWS::ElasticLoadBalancingV2::LoadBalancer
  Properties:
    SecurityGroups:
      - Ref: ApplicationLoadBalancerSecurityGroup
    Subnets:
      - Fn::ImportValue: !Sub ${EnvironmentName}-PU1
      - Fn::ImportValue: !Sub ${EnvironmentName}-PU2
    Tags:
      - Key: Name
        Value: !Ref EnvironmentName
Listener:
  Type: AWS::ElasticLoadBalancingV2::Listener
  Properties:
    DefaultActions:
      - Type: forward
        TargetGroupArn: !Ref WebAppTargetGroup
    LoadBalancerArn: !Ref AppLoadBalancer
    Port: "80"
    Protocol: HTTP
LoadBalancerListenerRule:
  Type: AWS::ElasticLoadBalancingV2::ListenerRule
  Properties:
    Actions:
      - Type: forward
        TargetGroupArn: !Ref WebAppTargetGroup
    Conditions:
      - Field: path-pattern
        Values: [/]
    ListenerArn: !Ref Listener
    Priority: 1
WebAppTargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    HealthCheckIntervalSeconds: 10
    HealthCheckPath: /
    HealthCheckProtocol: HTTP
    HealthCheckTimeoutSeconds: 8
    HealthyThresholdCount: 2
    Port: 80
    Protocol: HTTP
    UnhealthyThresholdCount: 5
    VpcId:
      Fn::ImportValue:
        Fn::Sub: "${EnvironmentName}-VPCID"
```
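A possible explanation, offered as a hedged sketch and not a confirmed diagnosis: an Auto Scaling group uses the `EC2` health check type unless told otherwise, so a target that fails only the load balancer's health check is not replaced. The template in the question sets no health check type on the group. The Python below shows how the group could be switched to `ELB` health checks with boto3's `update_auto_scaling_group`. The group name and the 300-second grace period are illustrative assumptions.

```python
from typing import Any, Dict


def elb_health_check_params(group_name: str, grace_seconds: int = 300) -> Dict[str, Any]:
    """Build the arguments that switch an ASG to load balancer health checks."""
    return {
        "AutoScalingGroupName": group_name,
        "HealthCheckType": "ELB",
        # Time allowed after launch before health check failures count.
        "HealthCheckGracePeriod": grace_seconds,
    }


def apply_elb_health_check(group_name: str, grace_seconds: int = 300) -> None:
    """Apply the change to a live group (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client("autoscaling")
    client.update_auto_scaling_group(**elb_health_check_params(group_name, grace_seconds))


# Hypothetical group name, for illustration only.
params = elb_health_check_params("my-web-asg")
```

In CloudFormation, the corresponding properties on the `AWS::AutoScaling::AutoScalingGroup` resource are `HealthCheckType: ELB` and `HealthCheckGracePeriod`.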
1
answers
0
votes
5
views
Ashish
asked a month ago

How frequently does an ASG attempt to remove instances when current size is greater than desired?

I have an EC2 ASG whose size is driven by triggers based on CPU utilization. Usually it follows a predictable pattern, scaling out during periods of usage and removing instances as load decreases. My instances sometimes mark themselves as protected from scale-in when they are working on something longer-running than their normal tasks. If all instances are protected, I get the message "Could not scale to desired capacity because all remaining instances are protected from scale-in" in CloudWatch. After that message, the next scale-in attempt does not seem to occur for quite a while: it came 10 hours later when this happened yesterday. Since my instances only protect themselves for a short time, the scale-in would have succeeded during most of those 10 hours.

My question: is there a way to configure the ASG so that it retries the scale-in sooner than 10 hours later? Or is there a way I could respond to the failed attempt, so that an instance could take itself offline?

(I do understand that ideally the instances wouldn't protect themselves in the first place, and that's part of a larger update to the architecture. But a short-term fix to the existing solution would be great.)

To respond to the questions: the alarm triggered on low utilization and immediately reduced the desired count. At that point the alarm was no longer set. I'm looking at the ASG Activity History pane, where there is nothing between message 1, which indicates that the desired size was reduced and that no instance could be removed, and message 2, which says that a particular instance was removed due to a difference between current and desired.
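For readers unfamiliar with the self-protection pattern described in this question, here is a minimal Python sketch of how an instance might wrap a long-running task. It uses the Auto Scaling `set_instance_protection` call and releases protection as soon as the task ends, even on failure. It does not change how often the ASG retries a scale-in. The client is passed in, and the group name and instance ID in the usage comment are placeholders.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def scale_in_protected(client: Any, group_name: str, instance_id: str) -> Iterator[None]:
    """Protect one instance from scale-in for the duration of a `with` block."""

    def set_protection(protected: bool) -> None:
        client.set_instance_protection(
            InstanceIds=[instance_id],
            AutoScalingGroupName=group_name,
            ProtectedFromScaleIn=protected,
        )

    set_protection(True)
    try:
        yield
    finally:
        # Always release protection, even if the task raised.
        set_protection(False)


# Usage (requires boto3 and credentials; names are placeholders):
#   client = boto3.client("autoscaling")
#   with scale_in_protected(client, "my-asg", "i-0123456789abcdef0"):
#       run_long_task()
```

Keeping the protected window as short as possible reduces the chance that every instance is protected at the moment the group tries to scale in.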
1
answers
0
votes
10
views
bmckeever
asked 3 months ago

aws_ssm_document addomainjoin error

I am struggling to get EC2 instances deployed via an ASG joined to the domain. I get the following error each time:

*New-SSMAssociation : Document schema version, 2.2, is not supported by association that is created with instance id*

I have tried various schema versions detailed [Here](https://docs.aws.amazon.com/systems-manager/latest/userguide/document-schemas-features.html), however all fail with the same error.

**SSMdoc.tf**

```
resource "aws_ssm_document" "ad-join-domain" {
  name          = "ad-join-domain"
  document_type = "Command"
  content = jsonencode(
    {
      "schemaVersion" = "2.2"
      "description"   = "aws:domainJoin"
      "parameters" : {
        "directoryId" : {
          "description" : "(Required) The ID of the directory.",
          "type" : "String"
        },
        "directoryName" : {
          "description" : "(Required) The name of the domain.",
          "type" : "String"
        },
        "dnsIpAddresses" : {
          "description" : "(Required) The IP addresses of the DNS servers for your directory.",
          "type" : "StringList"
        },
      },
      "mainSteps" = [
        {
          "action" = "aws:domainJoin",
          "name"   = "domainJoin",
          "inputs" = {
            "directoryId" : data.aws_directory_service_directory.adgems.id,
            "directoryName" : data.aws_directory_service_directory.adgems.name,
            "dnsIpAddresses" : [data.aws_directory_service_directory.adgems.dns_ip_addresses]
          }
        }
      ]
    }
  )
}
```

**template.tf**

```
data "template_file" "ad-join-template" {
  template = <<EOF
<powershell>
Set-DefaultAWSRegion -Region eu-west-2
Set-Variable -name instance_id -value (Invoke-Restmethod -uri http://169.254.169.254/latest/meta-data/instance-id)
New-SSMAssociation -InstanceId $instance_id -Name "${aws_ssm_document.ad-join-domain.name}"
</powershell>
EOF
}
```

The template is then referenced in the ASG Launch Template user_data section. Getting onto the instance, I can see the script/logs and have confirmed that the variables are set (the instance ID, for example).

Full error message from the PowerShell run is below:

```
New-SSMAssociation : Document schema version, 2.2, is not supported by association that is created with instance id
At C:\Windows\system32\config\systemprofile\AppData\Local\Temp\EC2Launch228430162\UserScript.ps1:3 char:5
+     New-SSMAssociation -InstanceId $instance_id -Name "ad-join-domain ...
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (Amazon.PowerShe...sociationCmdlet:NewSSMAssociationCmdlet) [New-SSMAssociation], InvalidOperationException
    + FullyQualifiedErrorId : Amazon.SimpleSystemsManagement.Model.InvalidDocumentException,Amazon.PowerShell.Cmdlets.SSM.NewSSMAssociationCmdlet
```
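The error text suggests the problem lies in how the association is addressed, not in the document itself. In the Systems Manager API, the legacy instance ID parameter only works with older document schemas, and documents using schema 2.0 or later are associated through `Targets` instead. The Python sketch below shows that shape with boto3's `create_association`. It illustrates the API form only and has not been tested against the setup in the question. The instance ID in the example is a placeholder.

```python
from typing import Any, Dict


def association_params(document_name: str, instance_id: str) -> Dict[str, Any]:
    """Build create_association arguments that address one instance via Targets."""
    return {
        "Name": document_name,
        "Targets": [{"Key": "InstanceIds", "Values": [instance_id]}],
    }


def create_association(document_name: str, instance_id: str) -> Dict[str, Any]:
    """Create the association (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client("ssm")
    return client.create_association(**association_params(document_name, instance_id))


# Placeholder instance ID, for illustration only.
params = association_params("ad-join-domain", "i-0123456789abcdef0")
```

The `New-SSMAssociation` cmdlet appears to expose an equivalent `-Target` parameter that takes a key of `InstanceIds`. Confirm this against the cmdlet reference before relying on it.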
1
answers
0
votes
2
views
AWS-User-4488665
asked 3 months ago

Should ECS/EC2 ASGProvider Capacity Provider be able to scale-up from zero, 0->1

Following on from the earlier thread https://repost.aws/questions/QU6QlY_u2VQGW658S8wVb0Cw/should-ecs-service-task-start-be-triggered-by-asg-capacity-0-1 , I've now attached a proper Capacity Provider, an Auto Scaling group provider, to my ECS Cluster.

Question TL;DR: should scaling an ECS Service 0->1 desired tasks be able to wake up a previously scaled-to-zero ASG and have it scale 0->1 desired/running?

I started with an ECS Service with a single task definition and Desired=1, backed by the ASG with Capacity Provider scaling, also starting with 1 Desired/InService ASG instance. I can then set the ECS Service Desired tasks to 0, and it stops the single running task. `CapacityProviderReservation` then goes from 100 to 0, and 15 minutes/samples later the alarm is triggered and the ASG shuts down its only instance, 1->0 Desired/running.

If I later change the ECS Service Desired back to 1, nothing happens, other than ECS noting that it has no capacity to place the task. Should this work? I have previously seen something similar working: `CapacityProviderReservation` jumps to 200 and an instance gets created. But this is not working for me now. The metric is stuck at 100, no scale-up from zero (to one) occurs in the ASG, and the task cannot be started. Should this be expected to work?

The reference blog https://aws.amazon.com/blogs/containers/deep-dive-on-amazon-ecs-cluster-auto-scaling/ suggests that `CapacityProviderReservation` should move to 200 if `M > 0 and N = 0`, but this seems to rely on a task in the "Provisioning" state. Will that even happen here, or is the ECS Service/Cluster giving up and not getting that far, due to zero capacity?
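To make the metric behaviour in this question easier to follow, here is a small Python sketch of the `CapacityProviderReservation` calculation as the referenced blog describes it: `M / N * 100`, with two special cases when `N = 0`. `M` is the number of instances ECS estimates it needs and `N` is the number currently in the ASG. The function name is illustrative. The sketch restates the published rule and is not ECS's actual implementation.

```python
def capacity_provider_reservation(m: int, n: int) -> float:
    """Return the metric value for m needed instances and n current instances."""
    if m < 0 or n < 0:
        raise ValueError("m and n must be non-negative")
    if n == 0:
        # No instances: 200 signals "scale out from zero", 100 means "nothing to do".
        return 200.0 if m > 0 else 100.0
    return m / n * 100.0


# The states described in the question:
one_task_one_instance = capacity_provider_reservation(1, 1)  # service Desired=1
no_tasks_one_instance = capacity_provider_reservation(0, 1)  # service set to 0
scaled_to_zero_idle = capacity_provider_reservation(0, 0)    # ASG at 0, no tasks
wake_from_zero = capacity_provider_reservation(1, 0)         # what scale-out needs
```

Under this rule, a metric stuck at 100 with zero instances corresponds to `M = 0`. This is consistent with the suspicion in the question that no task is being counted as needing capacity.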
2
answers
0
votes
7
views
javabrett
asked 5 months ago