SageMaker step scaling policy


I'm trying to define a step scaling policy for my SageMaker real-time endpoint, based on this example notebook. I understand that a step scaling policy defines thresholds that provision different numbers of instances, but I'm confused because it doesn't seem to specify which metric to track.

In the example notebook, the code is the following:

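import time

import boto3

# Application Auto Scaling client; the excerpt assumes this already exists in the notebook.
aas_client = boto3.client("application-autoscaling")


def set_step_scaling(endpoint_name, variant_name):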
    policy_name = "step-scaling-{}".format(str(round(time.time())))
    resource_id = "endpoint/{}/variant/{}".format(endpoint_name, variant_name)

    response = aas_client.put_scaling_policy(
        PolicyName=policy_name,
        ServiceNamespace="sagemaker",
        ResourceId=resource_id,
        ScalableDimension="sagemaker:variant:DesiredInstanceCount",
        PolicyType="StepScaling",
        StepScalingPolicyConfiguration={
            "AdjustmentType": "ChangeInCapacity",
            "StepAdjustments": [
                {
                    "MetricIntervalLowerBound": 0.0,
                    "MetricIntervalUpperBound": 5.0,
                    "ScalingAdjustment": 1,
                },
                {
                    "MetricIntervalLowerBound": 5.0,
                    "MetricIntervalUpperBound": 80.0,
                    "ScalingAdjustment": 3,
                },
                {
                    "MetricIntervalLowerBound": 80.0, 
                    "ScalingAdjustment": 4
                },
            ],
            "MetricAggregationType": "Average",
        },
    )

    return policy_name, response

I want it to track the metric SageMakerVariantInvocationsPerInstance. Can I specify it? If so, shouldn't the metric aggregation type be Sum instead of Average? I'm quite confused and would really appreciate your help. Thanks a lot!

Asked 1 year ago, 234 views
2 Answers
Accepted Answer

With step scaling, the alarm isn't owned or controlled by AutoScaling (unlike target tracking, where AutoScaling manages the alarms for the policy).

So for each step scaling policy (you need one for scale-out and another for scale-in), you'll need to create an alarm yourself and add an alarm action pointing to your step scaling policy. This means step scaling is considerably more flexible and configurable than target tracking, but the trade-off is that it's much more work to configure (especially via code/IaC).
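As a rough sketch of the wiring described above (not the notebook's own code), the scale-out alarm could be created along these lines. The helper names, the alarm name format, and the period, evaluation-period, and threshold values are all illustrative assumptions. The metric choice assumes the CloudWatch metric InvocationsPerInstance in the AWS/SageMaker namespace with the Sum statistic. As far as I know, this is where Sum belongs: the policy's MetricAggregationType accepts only Minimum, Maximum, or Average, and the step bounds are offsets from the alarm threshold rather than absolute metric values.

```python
def build_scale_out_alarm(endpoint_name, variant_name, policy_arn, threshold):
    """Build keyword arguments for CloudWatch put_metric_alarm (hypothetical helper)."""
    return {
        # Alarm name format is an arbitrary choice for this sketch.
        "AlarmName": f"step-scaling-out-{endpoint_name}-{variant_name}",
        "Namespace": "AWS/SageMaker",
        "MetricName": "InvocationsPerInstance",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "Statistic": "Sum",       # the metric statistic is set on the alarm
        "Period": 60,             # assumed: evaluate one-minute data points
        "EvaluationPeriods": 3,   # assumed: three breaching periods trigger the alarm
        "Threshold": threshold,   # step bounds in the policy are relative to this value
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # The alarm action is the ARN of the step scaling policy.
        "AlarmActions": [policy_arn],
    }


def attach_alarm_to_policy(cloudwatch_client, put_policy_response,
                           endpoint_name, variant_name, threshold):
    """Create the alarm that drives an existing step scaling policy."""
    # put_scaling_policy returns the policy ARN under the "PolicyARN" key.
    policy_arn = put_policy_response["PolicyARN"]
    alarm = build_scale_out_alarm(endpoint_name, variant_name, policy_arn, threshold)
    cloudwatch_client.put_metric_alarm(**alarm)
    return alarm["AlarmName"]


# Example usage (requires AWS credentials; endpoint and threshold are placeholders):
# import boto3
# policy_name, response = set_step_scaling("my-endpoint", "AllTraffic")
# attach_alarm_to_policy(boto3.client("cloudwatch"), response,
#                        "my-endpoint", "AllTraffic", threshold=1000.0)
```

A scale-in policy would need its own alarm with a lower threshold, a less-than comparison operator, and negative scaling adjustments.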

AWS
Answered 1 year ago
  • Wonderful, thank you so much for your help! Appreciate a lot :)


In this example notebook, two scaling policies are being used, as shown in the code excerpt from the notebook.

invocation_scaling = set_target_scaling_on_invocation(
    endpoint_name=endpoint_name,
    variant_name=variant_name,
    target_value=invocations_per_instance * 1.3,
)

cpu_scaling = set_target_scaling_on_cpu_utilization(
    endpoint_name=endpoint_name, variant_name=variant_name, target_value=100
)

As you can see above, the policies are applied to the endpoint name and variant name and include a target threshold value. You will also notice that we are using predefined metrics, through the helper functions brought in by the following import statements.

from endpoint_scaling import set_target_scaling_on_invocation
from endpoint_scaling import set_target_scaling_on_cpu_utilization

You can use SageMakerVariantInvocationsPerInstance, which is a predefined metric that specifies the average number of times per minute that each instance for a variant is invoked. You can implement it in the put_scaling_policy function as shown in the PredefinedMetricSpecification below:

import boto3

client = boto3.client('application-autoscaling')

# resource_id has the form "endpoint/<endpoint_name>/variant/<variant_name>", as in the question.
# scale_in_cooldown and scale_out_cooldown must be set to a number of seconds before running.
response = client.put_scaling_policy(
    PolicyName='Invocations-ScalingPolicy',
    ServiceNamespace='sagemaker',  # The namespace of the AWS service that provides the resource.
    ResourceId=resource_id,  # Endpoint and variant identifier
    ScalableDimension='sagemaker:variant:DesiredInstanceCount',  # SageMaker supports only Instance Count
    PolicyType='TargetTrackingScaling',  # 'StepScaling'|'TargetTrackingScaling'
    TargetTrackingScalingPolicyConfiguration={
        'TargetValue': 4000,  # The target value for the metric: SageMakerVariantInvocationsPerInstance.
        'PredefinedMetricSpecification': {
            'PredefinedMetricType': 'SageMakerVariantInvocationsPerInstance',
        },
        'ScaleInCooldown': scale_in_cooldown,  # Replace with number of seconds.
        'ScaleOutCooldown': scale_out_cooldown,  # Replace with number of seconds.
        'DisableScaleIn': False,  # Indicates whether scale in by the target tracking policy is disabled.
    }
)
AWS
Answered 1 year ago
  • Thanks a lot for your answer! Appreciate the time in answering my question!

    The target tracking policy is clear, but I was asking about the step scaling policy. How can I apply it so that it tracks the InvocationsPerInstance metric? After further research, I think I understood that I have to create a step scaling policy and then manually associate it with a CloudWatch alarm I create myself. Is this correct, and is it best practice?

    Thanks a lot once again! Really appreciate it!
