SageMaker step scaling policy

I'm trying to define a step scaling policy for my SageMaker real-time endpoint, based on this example notebook. I understand that the step scaling policy defines thresholds at which a different number of instances is provisioned, but I am confused because it doesn't seem to specify which metric to track.

In the example notebook, the code is the following:

import time

import boto3

# Application Auto Scaling client used by the helper below
aas_client = boto3.client("application-autoscaling")


def set_step_scaling(endpoint_name, variant_name):
    # Timestamp suffix keeps the policy name unique
    policy_name = "step-scaling-{}".format(str(round(time.time())))
    resource_id = "endpoint/{}/variant/{}".format(endpoint_name, variant_name)

    response = aas_client.put_scaling_policy(
        PolicyName=policy_name,
        ServiceNamespace="sagemaker",
        ResourceId=resource_id,
        ScalableDimension="sagemaker:variant:DesiredInstanceCount",
        PolicyType="StepScaling",
        StepScalingPolicyConfiguration={
            "AdjustmentType": "ChangeInCapacity",
            # Interval bounds are offsets from the threshold of the alarm that triggers the policy
            "StepAdjustments": [
                {
                    "MetricIntervalLowerBound": 0.0,
                    "MetricIntervalUpperBound": 5.0,
                    "ScalingAdjustment": 1,
                },
                {
                    "MetricIntervalLowerBound": 5.0,
                    "MetricIntervalUpperBound": 80.0,
                    "ScalingAdjustment": 3,
                },
                {
                    "MetricIntervalLowerBound": 80.0,
                    "ScalingAdjustment": 4,
                },
            ],
            "MetricAggregationType": "Average",
        },
    )

    return policy_name, response

I want it to track the metric SageMakerVariantInvocationsPerInstance. Can I specify it? If so, shouldn't the metric aggregation type be Sum instead of Average? I'm quite confused and would really appreciate your help. Thanks a lot!

Asked a year ago, 234 views
2 Answers

Accepted Answer

With step scaling, the alarm isn't owned or controlled by Application Auto Scaling (unlike target tracking, where Application Auto Scaling manages the alarms for the policy).

So for each step scaling policy (you need one for scale-out and another for scale-in), you have to create a CloudWatch alarm yourself and add an alarm action that points to your step scaling policy. The metric to track is defined on that alarm, not in the policy. This makes step scaling considerably more flexible and configurable than target tracking, but the trade-off is that it's much more work to configure (especially via code/IaC).
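As a rough sketch of the wiring described above, the following builds the arguments for a CloudWatch alarm on InvocationsPerInstance whose alarm action is the step scaling policy. The function names, alarm name, period, evaluation periods, threshold, and the choice of Sum as the alarm statistic are illustrative assumptions, not values from the notebook; the policy ARN is expected to come from the PolicyARN field of the put_scaling_policy response.

```python
def build_scale_out_alarm(endpoint_name, variant_name, policy_arn, threshold):
    """Build keyword arguments for CloudWatch put_metric_alarm (hypothetical helper)."""
    return {
        "AlarmName": f"step-scaling-out-{endpoint_name}-{variant_name}",  # assumed naming scheme
        "Namespace": "AWS/SageMaker",
        "MetricName": "InvocationsPerInstance",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "Statistic": "Sum",       # assumption: the statistic is chosen on the alarm
        "Period": 60,             # assumption: evaluate one-minute datapoints
        "EvaluationPeriods": 3,   # assumption: three breaching periods before acting
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # The alarm action is what connects the alarm to the step scaling policy
        "AlarmActions": [policy_arn],
    }


def create_alarm(alarm_kwargs):
    """Create the alarm; needs AWS credentials, so boto3 is imported only here."""
    import boto3

    cloudwatch = boto3.client("cloudwatch")
    return cloudwatch.put_metric_alarm(**alarm_kwargs)
```

With the helper from the question, usage might look like `policy_name, response = set_step_scaling(endpoint_name, variant_name)` followed by `create_alarm(build_scale_out_alarm(endpoint_name, variant_name, response["PolicyARN"], threshold=100.0))`. A scale-in policy would need its own alarm with a lower threshold and a "less than" comparison.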

AWS
Answered a year ago
  • Wonderful, thank you so much for your help! I appreciate it a lot :)

In this example notebook, two scaling policies are being used, as shown in the code excerpt from the notebook.

invocation_scaling = set_target_scaling_on_invocation(
    endpoint_name=endpoint_name,
    variant_name=variant_name,
    target_value=invocations_per_instance * 1.3,
)

cpu_scaling = set_target_scaling_on_cpu_utilization(
    endpoint_name=endpoint_name, variant_name=variant_name, target_value=100
)

As shown above, each policy is applied to a specific endpoint name and variant name and includes a target value. Both policies use predefined metrics; the helper functions come from the following import statements.

from endpoint_scaling import set_target_scaling_on_invocation
from endpoint_scaling import set_target_scaling_on_cpu_utilization

You can use SageMakerVariantInvocationsPerInstance, a predefined metric that tracks the average number of times per minute that each instance for a variant is invoked. You can set it in the put_scaling_policy call through PredefinedMetricSpecification, as shown below:

import boto3

client = boto3.client('application-autoscaling')

# resource_id must be defined beforehand, in the form 'endpoint/<endpoint-name>/variant/<variant-name>'
response = client.put_scaling_policy(
    PolicyName='Invocations-ScalingPolicy',
    ServiceNamespace='sagemaker', # The namespace of the AWS service that provides the resource.
    ResourceId=resource_id, # Endpoint and variant to scale
    ScalableDimension='sagemaker:variant:DesiredInstanceCount', # SageMaker supports only Instance Count
    PolicyType='TargetTrackingScaling', # 'StepScaling'|'TargetTrackingScaling'
    TargetTrackingScalingPolicyConfiguration={
        'TargetValue': 4000, # The target value for the metric: InvocationsPerInstance.
        'PredefinedMetricSpecification': {
            'PredefinedMetricType': 'SageMakerVariantInvocationsPerInstance',
        },
        'ScaleInCooldown': 300, # Assumed example value; replace with your number of seconds
        'ScaleOutCooldown': 300, # Assumed example value; replace with your number of seconds
        'DisableScaleIn': False, # Indicates whether scale in by the target tracking policy is disabled.
    }
)
AWS
Answered a year ago
  • Thanks a lot for your answer! I appreciate the time you took to answer my question!

    The target tracking policy is clear, but I was asking about the step scaling policy. How can I apply it so that it tracks the InvocationsPerInstance metric? After further research, I think I understand that I have to create a step scaling policy and then manually associate it with a CloudWatch alarm that I create myself. Is this correct, and is it best practice?

    Thanks a lot once again! Really appreciate it!
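To illustrate how the step adjustments from the question relate to such an alarm, here is a small sketch of the lookup for the scale-out case. It assumes the interval bounds are offsets from the alarm threshold, with an inclusive lower bound and an exclusive upper bound; the function name and the threshold of 100 are hypothetical, and the real evaluation happens inside Application Auto Scaling, not in user code.

```python
def resolve_step_adjustment(metric_value, alarm_threshold, step_adjustments):
    """Return the ScalingAdjustment whose interval contains the breach size."""
    breach = metric_value - alarm_threshold
    for step in step_adjustments:
        # A missing bound means the interval is open-ended on that side
        lower = step.get("MetricIntervalLowerBound", float("-inf"))
        upper = step.get("MetricIntervalUpperBound", float("inf"))
        # Scale-out case: lower bound inclusive, upper bound exclusive
        if lower <= breach < upper:
            return step["ScalingAdjustment"]
    return None  # no interval matches, so no adjustment


# Step adjustments copied from the question's policy
steps = [
    {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 5.0, "ScalingAdjustment": 1},
    {"MetricIntervalLowerBound": 5.0, "MetricIntervalUpperBound": 80.0, "ScalingAdjustment": 3},
    {"MetricIntervalLowerBound": 80.0, "ScalingAdjustment": 4},
]

# Hypothetical alarm threshold of 100 invocations per instance
print(resolve_step_adjustment(103, 100, steps))  # breach of 3   -> 1
print(resolve_step_adjustment(150, 100, steps))  # breach of 50  -> 3
print(resolve_step_adjustment(200, 100, steps))  # breach of 100 -> 4
```

Under these assumptions, a metric value of 150 against a threshold of 100 falls in the second interval, so the policy would add three instances.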
