A target tracking scaling policy creates two CloudWatch alarms (one for high usage and one for low usage), which you can see in the CloudWatch alarms console. The high alarm needs 3 consecutive breaching 60-second datapoints to trigger a scale-out, and the low alarm needs 15 consecutive breaching 60-second datapoints to trigger a scale-in.
You may instead want to use step scaling policies, where you create and control the alarms yourself as well as the policy settings: https://docs.aws.amazon.com/autoscaling/application/userguide/application-auto-scaling-step-scaling-policies.html
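As a rough sketch of what that could look like with boto3, the helpers below only build the keyword arguments for `put_scaling_policy` (Application Auto Scaling) and `put_metric_alarm` (CloudWatch). The endpoint name, variant name, policy and alarm names, the `AWS/SageMaker` namespace, the `EndpointName` dimension, and all numeric values are assumptions for illustration, not settings taken from this thread.

```python
def step_policy_kwargs(endpoint_name, variant_name="AllTraffic"):
    """Build arguments for application-autoscaling put_scaling_policy (step scaling)."""
    return {
        "PolicyName": "backlog-step-scale-out",  # hypothetical name
        "ServiceNamespace": "sagemaker",          # assumed: a SageMaker endpoint variant
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "StepScaling",
        "StepScalingPolicyConfiguration": {
            "AdjustmentType": "ChangeInCapacity",
            "MetricAggregationType": "Maximum",
            "Cooldown": 300,  # seconds, illustrative
            # Add one instance whenever the alarm threshold is breached.
            "StepAdjustments": [
                {"MetricIntervalLowerBound": 0, "ScalingAdjustment": 1}
            ],
        },
    }


def alarm_kwargs(endpoint_name, policy_arn, evaluation_periods=1):
    """Build arguments for cloudwatch put_metric_alarm that invokes the policy."""
    return {
        "AlarmName": f"{endpoint_name}-backlog-without-capacity",  # hypothetical name
        "Namespace": "AWS/SageMaker",  # assumed namespace for the metric
        "MetricName": "HasBacklogWithoutCapacity",
        "Dimensions": [{"Name": "EndpointName", "Value": endpoint_name}],
        "Statistic": "Maximum",
        "Period": 60,
        # You choose how many datapoints must breach, unlike target tracking.
        "EvaluationPeriods": evaluation_periods,
        "DatapointsToAlarm": evaluation_periods,
        "Threshold": 0.5,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [policy_arn],
    }


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# aas = boto3.client("application-autoscaling")
# cw = boto3.client("cloudwatch")
# arn = aas.put_scaling_policy(**step_policy_kwargs("my-endpoint"))["PolicyARN"]
# cw.put_metric_alarm(**alarm_kwargs("my-endpoint", arn))
```

Because you own the alarm here, you can lower `EvaluationPeriods` to react faster, at the cost of more sensitivity to short spikes.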
Thanks, once I learned that the policy is managed by CloudWatch alarms I was able to observe my policy working. In particular, I changed the metric to HasBacklogWithoutCapacity, set the target to 0.5, and used Maximum instead of Average, and it behaves as I require. I did notice that there's also some delay (about 2-3 minutes) between when I submit an inference job to the queue and when the metric increases from 0 to 1, and then it waits for the 3 consecutive values, so overall it takes about 5 minutes rather than 3 to start adding capacity. I'll try step scaling to reduce that, but at least I have a working proof-of-concept to compare with batch inference now.
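The delay arithmetic described here can be sketched as follows. The function names are made up for illustration, and the model simply assumes the alarm's breaching datapoints only start accumulating once the metric has flipped to 1.

```python
PERIOD_SECONDS = 60  # alarm period used by the target tracking alarms


def alarm_delay_minutes(datapoints, period_seconds=PERIOD_SECONDS):
    """Minimum time for an alarm to collect the required breaching datapoints."""
    return datapoints * period_seconds / 60


def total_delay_range(metric_lag_minutes, datapoints):
    """Add the observed metric lag (low, high) to the alarm evaluation time."""
    low, high = metric_lag_minutes
    alarm = alarm_delay_minutes(datapoints)
    return (low + alarm, high + alarm)


scale_out = total_delay_range((2, 3), datapoints=3)
print(scale_out)                # (5.0, 6.0) minutes from job submission to scale-out
print(alarm_delay_minutes(15))  # 15.0 minutes of low usage before scale-in
```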
Glad you were able to get it at least mostly working! Just as an FYI, the Target Tracking scaling policy expects the alarm to be configured exactly as AutoScaling created it, so modifying the alarm can lead to unexpected behavior (most often, the alarm triggers but scaling may not happen). So any time you need the alarm customized, step scaling is the way to go, unless it's something that can be done natively in the CustomizedMetricSpecification section of PutScalingPolicy: https://docs.aws.amazon.com/cli/latest/reference/application-autoscaling/put-scaling-policy.html
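For the case that can be handled natively, a minimal sketch of a target tracking configuration using `CustomizedMetricSpecification` might look like this. It mirrors the settings mentioned in this thread (HasBacklogWithoutCapacity, target 0.5, Maximum), while the policy name, endpoint and variant names, the `AWS/SageMaker` namespace, and the `EndpointName` dimension are assumptions.

```python
def target_tracking_kwargs(endpoint_name, variant_name="AllTraffic"):
    """Build arguments for application-autoscaling put_scaling_policy (target tracking)."""
    return {
        "PolicyName": "backlog-target-tracking",  # hypothetical name
        "ServiceNamespace": "sagemaker",           # assumed: a SageMaker endpoint variant
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": 0.5,
            # The metric and statistic are set here, not by editing the generated alarms.
            "CustomizedMetricSpecification": {
                "MetricName": "HasBacklogWithoutCapacity",
                "Namespace": "AWS/SageMaker",  # assumed namespace
                "Dimensions": [{"Name": "EndpointName", "Value": endpoint_name}],
                "Statistic": "Maximum",
            },
        },
    }


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# boto3.client("application-autoscaling").put_scaling_policy(
#     **target_tracking_kwargs("my-endpoint")
# )
```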