
Questions tagged with AWS Auto Scaling



Auto Scaling is not working with a Neptune cluster unless the primary writer instance type is db.r5d.xlarge

Issue: Scale-out actions work fine with any instance size, but the scale-in action, although triggered by CloudWatch, is not able to remove the readers unless the writer is db.r5d.xlarge.

I am trying to auto scale an Amazon Neptune database to meet workload demands. While the Neptune writer is db.r5d.xlarge everything works, but after I changed the writer instance size, scale-in stopped working. I did not set neptune_autoscaling_config in the cluster parameter group, and I applied the same configuration as in the reference blog post below.

One difference from the blog post: when I first created the auto scaling setup, the writer instance was db.r5d.xlarge. I then changed the writer to db.t3.medium, deleted the old scaling policy, deregistered the scalable targets, and recreated everything. After that, scale-out works fine, but scale-in still only works when the writer is db.r5d.xlarge.

I am not getting any error from CloudWatch. The alarm action triggers successfully, but it does not remove the Neptune reader that was created by the scale-out action, and there are no scaling activities showing why the policy is unable to delete the reader. The same setup works fine in our Prod and Stage accounts; the issue only occurs in the Dev account.

Note: scale-in (removing the Neptune reader) works fine through a scheduled action: https://docs.aws.amazon.com/autoscaling/application/userguide/examples-scheduled-actions.html

Can anyone please help me with this? Thanks in advance! This is the blog post I am using for reference: https://aws.amazon.com/blogs/database/auto-scale-your-amazon-neptune-database-to-meet-workload-demands/
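For reference, this is roughly the Application Auto Scaling registration that the linked blog post walks through, written as a minimal AWS CLI sketch. The cluster name `my-neptune-cluster`, the capacity limits, and the 45% CPU target are placeholder values, not taken from the question.

```
# Minimal sketch (placeholder values): register the cluster's reader count as a scalable target.
aws application-autoscaling register-scalable-target \
  --service-namespace neptune \
  --resource-id cluster:my-neptune-cluster \
  --scalable-dimension neptune:cluster:ReadReplicaCount \
  --min-capacity 1 \
  --max-capacity 3

# Attach a target-tracking policy; scale-in is only performed while DisableScaleIn is false.
aws application-autoscaling put-scaling-policy \
  --service-namespace neptune \
  --resource-id cluster:my-neptune-cluster \
  --scalable-dimension neptune:cluster:ReadReplicaCount \
  --policy-name neptune-reader-scaling \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "PredefinedMetricSpecification": {"PredefinedMetricType": "NeptuneReaderAverageCPUUtilization"},
    "TargetValue": 45.0,
    "DisableScaleIn": false
  }'

# Scale-in attempts (and their failure reasons) for the target are recorded here,
# even when nothing surfaces as an error in CloudWatch.
aws application-autoscaling describe-scaling-activities \
  --service-namespace neptune \
  --resource-id cluster:my-neptune-cluster
```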
0 answers · 0 votes · 22 views · asked a month ago

ECS services not scaling in (scale-in protection is disabled)

Hello. I have an ECS cluster (EC2 based) attached to a capacity service provider (CSP). Service scale-out is OK, but it isn't scaling in, and I've already checked scale-in protection: it is disabled (Disable Scale In: false).

Description of the environment:
- 1 cluster (EC2 based), 2 services
- Services are attached to an ALB (registering and deregistering fine)
- Services have auto scaling enabled on memory (above 90%), NO scale-in protection, 1 task minimum, 3 tasks maximum
- Services use a capacity service provider, apparently working as intended: it creates new EC2 instances when new tasks are provisioned and drops them when they have 0 tasks running, registering and deregistering as expected
- The CloudWatch alarms are working fine, alarming when expected (on both low and high usage)

Description of the test and what's not working:
- Started with 1 task for each service and 1 instance for both services.
- I entered one of the containers and ran a memory test, increasing its usage to over 90%.
- The service detected it and requested a new task.
- There were no instances that could place the new task, so ECS asked the CSP/Auto Scaling group for a new EC2 instance.
- The new instance was provisioned, registered in the cluster, and ran the new task.
- The service's average memory usage decreased from ~93% to ~73% (average across both tasks).
- All fine; the memory stress ran for 20 minutes.
- After the memory stress was over, memory usage dropped to ~62%.
- The CloudWatch low alarm was triggered (maybe even before, at ~73% usage; I didn't check).
- The service is still running 2 tasks right now (after 3 hours or more) and is not decreasing the desired count from 2 to 1.

Is there anything that I'm missing here? I've already done a couple of tests, trying to change the service auto scaling thresholds and other configurations, but nothing changes this behaviour. Any help would be appreciated. Thanks in advance.
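A minimal CLI sketch of how the scaling setup described above could be inspected; `my-cluster` and `my-service` are placeholder names, not taken from the question.

```
# Placeholder names: my-cluster / my-service.
# Confirm the registered scalable target for the service's desired count.
aws application-autoscaling describe-scalable-targets \
  --service-namespace ecs \
  --resource-ids service/my-cluster/my-service

# List recent scaling activities; a scale-in attempt that was blocked (or never
# triggered at all) should show up, or be conspicuously absent, here.
aws application-autoscaling describe-scaling-activities \
  --service-namespace ecs \
  --resource-id service/my-cluster/my-service

# Check the policy's DisableScaleIn flag and the CloudWatch alarms attached to it.
aws application-autoscaling describe-scaling-policies \
  --service-namespace ecs \
  --resource-id service/my-cluster/my-service
```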
1 answer · 0 votes · 36 views · asked 2 months ago

EMR autoscaling: 'org.apache.hadoop.util.DiskChecker$DiskErrorException(No space available in any of the local directories.)'

I get the following error when running a Tez query. This is in an EMR cluster with auto scaling enabled.

Root device EBS volume size: 100 GiB
Additional EBS volume: 200 GiB

```
bash-4.2$ ls -lh /tmp
lrwxrwxrwx 1 root root 8 Jun 2 13:20 /tmp -> /mnt/tmp
```

/mnt has enough space:

```
/dev/dev1 195G 3.7G 192G 2% /mnt
```

```
INFO  : Cleaning up the staging area file:/tmp/hadoop/mapred/staging/hdfs1254373830/.staging/job_local1254373830_0002
ERROR : Job Submission failed with exception 'org.apache.hadoop.util.DiskChecker$DiskErrorException(No space available in any of the local directories.)'
org.apache.hadoop.util.DiskChecker$DiskErrorException: No space available in any of the local directories.
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:416)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:130)
	at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:123)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:172)
	at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:794)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:251)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:423)
	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2664)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2335)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2011)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1709)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1703)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:224)
	at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:316)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:330)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. No space available in any of the local directories.
```
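The DiskChecker error refers to the local directories Hadoop is configured with, which are not necessarily /tmp or /mnt as a whole. A small sketch of how those settings and their free space could be checked on the node, assuming the standard Hadoop property names and the EMR default config location /etc/hadoop/conf:

```
# Show the configured local directories (EMR default config path assumed).
grep -A1 -E 'hadoop.tmp.dir|mapreduce.cluster.local.dir|yarn.nodemanager.local-dirs' \
  /etc/hadoop/conf/*-site.xml

# Check free space and free inodes on the configured paths, not just /mnt overall;
# "no space available" can also mean the directories are not writable or inodes are exhausted.
df -h /mnt /mnt/tmp
df -i /mnt
```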
1 answer · 0 votes · 30 views · asked 2 months ago