Questions tagged with AWS Auto Scaling

Content language: English

Sort by most recent

Browse through the questions and answers listed below or filter and sort to narrow down your results.

I've configured a model for async-inference, and its working correctly - I can submit a file via `invoke_endpoint_async` and download the output from s3. I'm now trying to configure auto-scaling. I'm trying experimentation with different options, but basically I want to configure 0-1 instances, have an instance created when`invoke_endpoint_async` is called, and have the instance shutdown shortly afterwards (along the lines of batch inference) I'm struggling to get it to work - I'm experiencing similar issues to https://github.com/boto/boto3/issues/2839 First I think there's an issue with the `console` - if I `aws register-scalable-target ...` it works but the console doesn't like the zero for `min-capacity` ![Enter image description here](/media/postImages/original/IMWZdtU68ZSXSSvr46-_1nhw) I think this is just a UI nit though, I don't understand how the policy works - I have ```json { "TargetValue": 1.0, "CustomizedMetricSpecification": { "MetricName": "ApproximateBacklogSizePerInstance", "Namespace": "AWS/SageMaker", "Dimensions": [{"Name": "EndpointName", "Value": "***-test-endpoint-2023-03-24-04-28-06-341"}], "Statistic": "Average" }, "ScaleInCooldown": 60, "ScaleOutCooldown": 60 } ``` The first point of confusion was the console shows a built-in and custom policy. I was initially using the name of the built-in policy (SageMakerEndpointInvocationScalingPolicy) but `put-scaling-policy` doesn't appear to edit it - it creates a new policy with the same name. When I monitor the scaling activity () ```console aws application-autoscaling describe-scaling-activities \ --service-namespace sagemaker ``` I can initially see "Successfully set desired instance count to 0. Change successfully fulfilled by sagemaker." But when I involve the endpoint with ```python response = sm_runtime.invoke_endpoint_async( EndpointName=endpoint_name, InputLocation="***/input/data.json", ContentType='application/jsonlines', Accept='application/jsonlines') output_location = response['OutputLocation'] ``` I would expect to see the instance count increase to 1, then back to zero within a space of a few minutes. I have occasionally got it to do something but not reliably. I think the main issue is I don't understand the metric and how it interacts with the target. I've seen charts but I cannot figure out how to plot the "ApproximateBacklogSizePerInstance"? And how does it interact with "TargetValue"? What is the actual trigger for a scale in/out?
0
answers
0
votes
14
views
Dave
asked 2 days ago
Hi, I'm a newbie taking the AWS Cloud Architect course on Coursera and currently on Course 1, Module 4, Exercise 7. I believe I followed all the instructions to a T and have tried it twice now and continue to get stuck on the following Task within the assignment: Task 5: Testing the application In this task, you will stress-test the application and confirm that it scales. Return to the Amazon EC2 console. In the navigation pane, under Load Balancing, choose Target Groups. Make sure that app-target-group is selected and choose the Targets tab. You should see two additional instances launching. Wait until the Status for both instances is healthy. My Status never goes to "healthy" state and keeps failing, "Unhealthy", "Draining" (Target deregistration is in progress) Can someone tell me why this would happen and where i should check to correct this? Thank you in advance.
3
answers
0
votes
32
views
asked 8 days ago
if my instance is running! can i change instance type or is there is any way for auto scaling and if there is a way how to do that?
3
answers
0
votes
13
views
asked 10 days ago
Hello, I am a user who is working on the practice of migrating a shopping mall company to aws. Could you recommend a cost-effective EC2 instance type to build a server that 40 million customers can access at the same time? I am thinking about the infrastructure of a shopping mall web server that uses four available areas in Seoul Region by utilizing Application Load Balancer and Auto Scaling. Additionally, I am considering whether to use Application load balancer or Network load balancer when configuring the web server, so please recommend which load balancer is good. Thank you.
2
answers
0
votes
24
views
asked 13 days ago
Hi there, I used to configure my ECS services to scale on a "Step Scaling" mode. Currently, after the new UI has been set to default, I'm seeing that the Step Scaling option is always disabled.![See screenshot here](/media/postImages/original/IMd2aii2K4Q5anu2cKI33TjQ) The only possibility I found is Target Tracking which does not give you too much flexibility. I will appreciate any help on how to either get back to the previous UI, or been able to use Step Scaling on the new one. Thanks a lot
3
answers
0
votes
68
views
asked 14 days ago
Hi, I am trying to enable communication/data exchange between microservices in Fargate clusters. Q1: Can I place an internal load balancer in the middle of all MS and route requests based on path? Q2: My problem is, I am not sure DNS based service discovery is good option. What is my other alternatives for service discovery? I can use Netflix Eureka but not sure how to inform this registry data internal LB and autoscale MS (may be cloud-watch alarms). My other option is not to use AWS internal LB, but use Spring Cloud Gateway but need to also do autoscaling. Any ideas suggestions? Thanks
1
answers
0
votes
36
views
asked 22 days ago
I have a requirement. I have an Auto Scaling Group which has 2 or 3 EC2 instances. This setup is provisioned using Terraform. When one or more instances become unhealthy & get terminated, new instances are provisioned in their place. However, these instances get new EBS Volumes. But I want to reuse / attach the EBS volumes attached earlier to the terminated or unhealthy EC2 instances. I tried & searched a lot, but didn't get any good results. Can anyone please help me achieve this? I thought of using UserData, but I believe it runs with EC2 in Started state and hence cannot attach. From Terraform, I am not sure if this is possible. I even checked for Termination LifeCycle Hooks, but no luck. Please help. I saw a similar question, but the posted answer didn't seem very helpful to me. https://repost.aws/questions/QUfWhvtTJBRmuPdiK2W-FEWQ/how-are-ebs-volumes-in-an-unhealthy-instance-handled-when-a-new-instance-is-created-by-auto-scaling
4
answers
0
votes
42
views
asked a month ago
Hello there, We have enabled Auto Scaling for one of our RDS Aurora MySQL Cluster. Writer Size: db.r6g.2xlarge Reader Size: db.r6g.large - Whenever scale out happens, it creates new instance with the same size as Writer node(in this case size is db.r6g.2xlarge). We expected it to create new instances with same size as Reader node(db.r6g.large). - Also the new instance doesn't inherit the tags from the cluster. If it's not bug, please take this as feature request
1
answers
0
votes
27
views
asked a month ago
We are facing the lambda cold start issues. We used java and kotlin to build our backend applications following the serverless and micro service approach to deploy the data services inside lambda and api composition for front end apps in another lambda. The front end request goes to API GW and then to API Composition Lambda and goes to other Data Lambdas. In order to reduce the cold start, we are trying to use provisioned concurrency on our lambda. But When I tried to change the config for provisioned concurrency, I got the below error ``` “The maximum allowed provisioned concurrency is 0, based on the unreserved concurrency available (10) minus the minimum unreserved account concurrency (10).“ ``` Pasted the screenshot for reference. ![Enter image description here](/media/postImages/original/IMr2RcFy1ySpqB5RoPKkx55g) Then I checked the configuration, it says the number of unreserved account concurrency is 10. Pasted the screenshot below for reference. ![Enter image description here](/media/postImages/original/IMBbg-IGeDRH6Gfpxg9yMVNA) Now I am stuck with not able to increase the unreserved account concurrency and not able to provision the lambda. I didn't understand what is going wrong. Can some expert here explain, what I am missing here and help understand fix the issue ?
2
answers
0
votes
32
views
asked a month ago
Hi, I am considering a solution to auto scale my database Postgres with Elastic bean deployment of Flask application. I recently went through guides and tutorials that RDS provides stand by database for multi AZ for failover. However, In below guide, it is advised to have 2 databases to avoid single point failure. "Finally, configure your environment's Auto Scaling group with a higher minimum instance count. Run at least two instances at all times to prevent the web servers in your environment from being a single point of failure, and to allow you to deploy changes without taking your site out of service." My question is if we are enabling multi AZ with stand by instance, why to run 2 instances all the time. Can anyone help me what is the correct understanding.
0
answers
0
votes
16
views
asked a month ago
I'm using AWS Auto scaling group with AWS ALB and the following settings: Desired capacity: 1 Minimum capacity: 1 Maximum capacity: 3 When I now start a "Instance refresh" (with Minimum healthy percentage=100%) for the autoscaling group, the one and only healthy instance is terminated before the new refreshed instance is ready/healthy which results in downtime of my service. When I set desired capacity to "2" and start instance refresh, the service keeps available. How can I achieve that instance refresh first starts a second instance, waits until it is ready and after that terminates the previous old instance so that desired capacity can be 1?
2
answers
0
votes
56
views
Max
asked a month ago
Hello aws re:Post I want to run my pods (network wise) in a different subnet and for that I make use of the custom CNI config for the AWS-CNI plugin which already works like a charm. Now I want to automate the whole process. I already archived to create the CRD eniconfigs and deploy them automatically. But now I stuck at the automation of the node annotation. As I could not find any useful content while searching re:Post or the internet, I assume the solution is rather simple. I assume that the solution is somewhere here in the Launch Template, User Data or via `KUBELET_EXTRA_ARGS` but I'm just guessing. **The Question** How can I provide annotations like mine (below) to the nodes on launch or after they joined the cluster automatically? ``` kubectl annotate node ip-111-222-111-222.eu-central-1.compute.internal k8s.amazonaws.com/eniConfig=eu-central-1c ```
4
answers
0
votes
47
views
asked a month ago