Questions tagged with AWS Auto Scaling
Handling Java RMI in AWS ASG
We have two services: a frontend API service and a backend service. To achieve high TPS we use async calls:

1. An end-user HTTP call lands on one of the Tomcat servers in the frontend.
2. The frontend calls the backend asynchronously, passing its server IP in the context, and puts the request thread to sleep.
3. Once the backend finishes the job, it makes a callback to the frontend over RMI, using the server IP it got in the context.
4. The callback wakes the original HTTP thread.
5. The woken HTTP request thread consumes the prepared data from the cache and completes the response.

This worked fine in our physical DC because we never scaled in or out. With an AWS ASG, the server IP may no longer exist by the time the backend tries to make the callback, so the request has to be retried at the user end. We want to move away from RMI here while remaining async. Any suggested solutions would be appreciated.
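One common way out of IP-bound callbacks while staying async is to publish completions to a shared queue keyed by a correlation ID, so that the callback no longer depends on a specific instance surviving. A rough sketch of the pattern using SQS via the AWS CLI — the queue name, correlation ID, and message body are all made up for illustration:

```shell
# Hypothetical names throughout; this only illustrates the pattern.
# Create a shared callback queue once, instead of per-instance RMI endpoints.
QUEUE_URL=$(aws sqs create-queue --queue-name frontend-callbacks \
  --query QueueUrl --output text)

# Backend side: on job completion, publish the result keyed by a correlation
# ID that the frontend generated when it parked the HTTP request.
aws sqs send-message --queue-url "$QUEUE_URL" \
  --message-body '{"cacheKey": "prepared-data-key"}' \
  --message-attributes \
  '{"CorrelationId": {"DataType": "String", "StringValue": "req-42"}}'

# Frontend side: long-poll the queue and match CorrelationId against the
# locally parked requests; unmatched messages can be re-queued or expired.
aws sqs receive-message --queue-url "$QUEUE_URL" \
  --message-attribute-names CorrelationId --wait-time-seconds 20
```

Because the message outlives any single instance, a scaled-in frontend no longer means a lost callback; the trade-off is that a consumer may receive a message for a request parked on another instance and must hand it off or return it to the queue.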
AWS Auto Scaling routing
I set up Auto Scaling with a step scaling policy for scale-out and scale-in. When max CPU goes above 40%, scaling happens and another server becomes available, but sometimes requests still go to the first server, which has high CPU utilization, and that server crashes as a result. The load balancer is using round robin, but it is still not routing properly. How can I solve this?
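If this is an Application Load Balancer, one knob worth trying is switching the target group's routing algorithm from round robin to least outstanding requests, which biases new requests away from the busy instance. A sketch via the CLI — the target group ARN below is a placeholder:

```shell
# Hypothetical ARN. least_outstanding_requests sends each new request to the
# target with the fewest in-flight requests instead of strict rotation.
aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-tg/abc123 \
  --attributes Key=load_balancing.algorithm.type,Value=least_outstanding_requests
```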
Slow down ASG target tracking scale out
From what I can tell, an ASG with target tracking does not let you define a cooldown period. As a result, if instances are slow to initialize, the ASG can scale out again before the new instances start taking on work. Is there a way to make it wait longer before scaling out again?
How best to learn to build a cloud 'sites' architecture in AWS similar to Atlassian Jira
Which strategy documents should I look at to learn how to create an AWS architecture with scalable, dedicated 'sites' for each of my customers? For example, I would like to have customer-site-1.my-root-domain.com, customer-site-2.my-root-domain.com, etc., where each site has a dedicated environment, web app, and database. I would like these 'sites' to be able to scale/migrate/deploy to additional AWS resources when needed. These per-customer dedicated web apps will share other scalable resources such as authentication and messaging services. There are so many AWS services, I do not know where to begin. Thanks in advance for pointing me in the right direction. -Eric
"This action has been administratively disabled." when trying to rebuild Elastic Beanstalk environment
Hello, I'm trying to rebuild my Elastic Beanstalk environment, but I keep getting this error: "Creating Auto Scaling group failed Reason: API: autoscaling:CreateAutoScalingGroup This action has been administratively disabled." This is my own account; there is no one else on it, and I did not disable anything. What are my options here? Thanks!
Auto scaling is not working with a Neptune cluster unless the primary writer instance type is db.r5d.xlarge
Issue: scale-up actions work fine with any instance size, but the scale-in action, although triggered by CloudWatch, is only able to remove readers when the writer is r5d.xlarge.

I am trying to auto-scale an Amazon Neptune database to meet workload demands. While the Neptune writer is r5d.xlarge everything works, but when I change the writer instance size, scale-in stops working. I did not set neptune_autoscaling_config in the cluster parameter group. I applied the same configuration as in the reference blog post below. The one difference is that when I first created the auto scaling setup, the writer instance was r5d.xlarge; after that, I changed the writer instance size to t3.medium, deleted the old auto scaling application and scaling policy, deregistered the scalable targets, and recreated everything. Since then, scale-up works fine, but scale-in does not, except with r5d.xlarge.

I am not getting any error from CloudWatch: the CloudWatch action triggers successfully but does not remove the Neptune reader that was created during scale-up, and I see no scaling activities explaining why the policy action cannot delete the reader. The same setup works fine in our Prod and Stage accounts; the issue only occurs in the Dev account.

Note: scale-in (removing the Neptune reader) works fine through a scheduled action: https://docs.aws.amazon.com/autoscaling/application/userguide/examples-scheduled-actions.html

Can anyone please help me with this? Thanks in advance! This is the blog post I am using for reference: https://aws.amazon.com/blogs/database/auto-scale-your-amazon-neptune-database-to-meet-workload-demands/
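When a scale-in fires but nothing happens, the Application Auto Scaling activity history is usually the place that says why. The commands below (the cluster name is a placeholder) show the registered scalable target and recent activities, including failed ones:

```shell
# Neptune readers are scaled through Application Auto Scaling; first confirm
# the scalable target is still registered against the current cluster...
aws application-autoscaling describe-scalable-targets \
  --service-namespace neptune \
  --resource-ids cluster:my-neptune-cluster

# ...then inspect recent scaling activities for details on the scale-in
# attempts (status, cause, and any error message).
aws application-autoscaling describe-scaling-activities \
  --service-namespace neptune \
  --resource-id cluster:my-neptune-cluster
```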
RDS instance not triggering a storage auto scaling event
I have an RDS instance running SQL Server SE 14.00.3381.3.v1 Multi-AZ. The instance has storage autoscaling enabled with the upper threshold set at 4500 GB. For the last two days I have been getting database event notifications stating that free space is less than 10%:

```
The free storage capacity for DB Instance: databaseName is low at 10% of the provisioned storage [Provisioned Storage: 3848.87 GB, Free Storage: 385.30 GB]. You may want to increase the provisioned storage to address this issue.
```

My question is: when does the scaling event trigger? I thought this might happen during the maintenance window, but the maintenance window was this morning and storage autoscaling was not triggered. I also raised the autoscaling threshold to 5000 GB this morning to see if that would solve the issue, but there has been no storage scaling so far. Storage autoscaling has worked previously, so I am not sure why the storage is not being scaled up this time. Is there anything in the RDS logs I should look for?

Thanks
Sanoob
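One thing worth verifying (a sketch; the instance identifier is taken from the event text above) is that the raised maximum actually took effect, and whether RDS logged any storage-related events — storage autoscaling can also be blocked for several hours after any prior storage modification:

```shell
# Confirm the autoscaling ceiling (MaxAllocatedStorage) is applied and the
# instance is not mid-modification.
aws rds describe-db-instances \
  --db-instance-identifier databaseName \
  --query 'DBInstances[0].[AllocatedStorage,MaxAllocatedStorage,DBInstanceStatus]'

# Look for storage/autoscaling events over the last 2 days (2880 minutes).
aws rds describe-events \
  --source-identifier databaseName \
  --source-type db-instance \
  --duration 2880
```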
How do GameLift target-based policies work?
Hi, I'm looking into scaling options for my fleet and wanted to try target-based scaling (as the simplest option) first. I set the buffer size to 50% (just for the sake of testing) but that doesn't seem to do anything. My fleet has a minimum of 2 instances and a maximum of 12. I would expect this setting to keep 6 instances spun up at all times. Again, this is not my production target, I'm just trying to get a feel for the feature so that I know whether it satisfies our requirements. Am I misunderstanding how this setting works? Thanks in advance
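For what it's worth, as I understand the metric, a target-based policy targets PercentAvailableGameSessions — the share of total game-session capacity kept idle — not a fraction of the instance maximum; with a 50% buffer and near-zero player traffic, the fleet stays at its minimum because 100% of capacity is already available. A sketch of setting the policy via the CLI (the fleet ID is a placeholder):

```shell
# TargetValue is the percent of game-session capacity to keep available; the
# fleet only grows once real sessions eat into that buffer.
aws gamelift put-scaling-policy \
  --fleet-id fleet-12345678-aaaa-bbbb-cccc-000000000000 \
  --name keep-50pct-buffer \
  --policy-type TargetBased \
  --metric-name PercentAvailableGameSessions \
  --target-configuration TargetValue=50
```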
Auto scaling EC2 Windows instances with an Elastic Graphics card
Hi, I have a Windows EC2 instance that runs a graphics-heavy application. I created an AMI from that instance, launched a new Windows EC2 instance from the AMI, and attached an Elastic Graphics card (GPU). I am trying to figure out how to auto-scale this. I created a launch template/configuration, but there is no field for the Elastic Graphics card. As I understand it, you can only add Elastic Graphics at launch; you then have to log in, download the Elastic Graphics software, and restart the instance for the GPU to become available. My questions are: How can I automate this process? How can I achieve auto scaling for my application? Any answer, blog, or documentation on this would be helpful. Thanks in advance!
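The launch template API does expose Elastic Graphics even where the console form lacks a field: the accelerator can be declared in the template's ElasticGpuSpecifications, and the software install can be scripted in user data so new instances self-configure at first boot. A sketch — the template name, AMI ID, and instance type are assumptions:

```shell
# Hypothetical names; eg1.medium is one of the Elastic Graphics accelerator
# types. An ASG launched from this template attaches the accelerator itself.
aws ec2 create-launch-template \
  --launch-template-name graphics-workers \
  --launch-template-data '{
    "ImageId": "ami-0123456789abcdef0",
    "InstanceType": "m5.large",
    "ElasticGpuSpecifications": [{"Type": "eg1.medium"}]
  }'
```

The remaining manual step (downloading the Elastic Graphics software and rebooting) could then go into the template's UserData as a first-boot script, so no login is needed.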
ECS services not scaling in (scale in protection is disabled)
Hello. I have an ECS cluster (EC2-based) attached to a capacity provider. The service scales out fine, but it is not scaling in, and I have already checked scale-in protection: it is disabled (Disable Scale In: false).

Description of the environment:
- 1 cluster (EC2-based), 2 services
- Services are attached to an ALB (registering and deregistering fine)
- Services have auto scaling enabled on memory (above 90%), no scale-in protection, 1 task minimum, 3 tasks maximum
- Services use a capacity provider, which apparently works as intended: it creates new EC2 instances when new tasks are provisioned and drops them when they have 0 tasks running, registering and deregistering as expected
- The CloudWatch alarms are working fine, alarming when expected (on both low and high usage)

Description of the test and what's not working:
- Started with 1 task for each service and 1 instance for both services.
- I entered one of the containers and ran a memory test, increasing its usage to over 90%.
- The service detected it and requested a new task.
- No instance could allocate the new task, so ECS asked the capacity provider / Auto Scaling group for a new EC2 instance.
- The new instance was provisioned, registered in the cluster, and ran the new task.
- The service's average memory usage decreased from ~93% to ~73% (average across both tasks).
- All was fine; the memory stress ran for 20 minutes.
- After the memory stress ended, memory usage dropped to ~62%.
- The CloudWatch alarm was triggered (maybe even earlier, at ~73% usage; I didn't check).
- The service is still running 2 tasks right now (after 3 hours or more) and is not decreasing the desired count from 2 to 1.

Is there anything I'm missing here? I have already run a couple of tests, changing the service auto scaling thresholds and other configurations, but nothing changes this behaviour. Any help would be appreciated. Thanks in advance.
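Service scale-in is driven by Application Auto Scaling, so its policy and activity history are the first things to check. The commands below (cluster and service names are placeholders) show whether the scale-in side of the policy ever fired and with what outcome:

```shell
# Inspect the policy attached to the service, in particular the target
# tracking settings and the DisableScaleIn flag as actually stored...
aws application-autoscaling describe-scaling-policies \
  --service-namespace ecs \
  --resource-id service/my-cluster/my-service

# ...and whether any scale-in activity was attempted, and why it succeeded
# or failed.
aws application-autoscaling describe-scaling-activities \
  --service-namespace ecs \
  --resource-id service/my-cluster/my-service
```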
EMR autoscaling: 'org.apache.hadoop.util.DiskChecker$DiskErrorException(No space available in any of the local directories.)'
I get the following error when running a Tez query. This is on an EMR cluster with auto scaling enabled.

Root device EBS volume size: 100 GiB
Additional EBS volume: 200 GiB

```
bash-4.2$ ls -lh /tmp
lrwxrwxrwx 1 root root 8 Jun 2 13:20 /tmp -> /mnt/tmp
```

/mnt has enough space:

```
/dev/dev1 195G 3.7G 192G 2% /mnt
```

```
INFO : Cleaning up the staging area file:/tmp/hadoop/mapred/staging/hdfs1254373830/.staging/job_local1254373830_0002
ERROR : Job Submission failed with exception 'org.apache.hadoop.util.DiskChecker$DiskErrorException(No space available in any of the local directories.)'
org.apache.hadoop.util.DiskChecker$DiskErrorException: No space available in any of the local directories.
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:416)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:130)
	at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:123)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:172)
	at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:794)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:251)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:423)
	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2664)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2335)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2011)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1709)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1703)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:224)
	at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:316)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:330)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. No space available in any of the local directories.
```
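Since the failing submission is a LocalJobRunner job staging under file:/tmp, it can help to confirm which local directories Hadoop is actually allocating into and that they are writable — LocalDirAllocator raises "No space available" not only when the dirs are full but also when every configured dir fails its disk check. A few checks to run on the node (a sketch, not a verified fix):

```shell
# Show the local scratch dirs Hadoop resolves from the cluster config.
hdfs getconf -confKey hadoop.tmp.dir
hdfs getconf -confKey mapreduce.cluster.local.dir

# Verify the backing volume has space and the symlinked /tmp is writable.
df -h /mnt /tmp
ls -ld /mnt/tmp
```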