I want to deploy my model as a serverless inference endpoint


Hey, I trained a scikit-learn model using the SageMaker Python SDK and now I want to deploy it as a serverless inference endpoint. I am new to AWS and can't seem to make sense of the documentation. The model is fit in an estimator as follows:

from sagemaker.sklearn.estimator import SKLearn

enable_local_mode_training = False  # not used below

# S3 paths to the training and test data
inputs = {"train": trainpath, "test": testpath}

estimator_parameters = {
    "entry_point": "script_rf.py",
    "framework_version": "1.0-1",
    "py_version": "py3",
    "instance_type": "ml.c5.xlarge",
    "instance_count": 1,
    "role": role,
    "base_job_name": "randomforestclassifier-model",
}

estimator = SKLearn(**estimator_parameters)
estimator.fit(inputs)

This works fine, but when I try to deploy the model, it doesn't work. I tried the code from https://aws.amazon.com/blogs/machine-learning/deploying-ml-models-using-sagemaker-serverless-inference-preview/ but I keep getting errors because some things are not defined, like the image_uri, which I am not using.

I used this:

import boto3

m_boto3 = boto3.client('sagemaker')

# Wait for the training job to finish, then look up the S3 location
# of the trained model artifact.
estimator.latest_training_job.wait(logs='None')
artifact = m_boto3.describe_training_job(
    TrainingJobName=estimator.latest_training_job.name
)['ModelArtifacts']['S3ModelArtifacts']

print('Model artifact persisted at ' + artifact)

but this only gives me the model artifact, and the endpoint is not serverless. Please help.

2 Answers

To deploy a serverless endpoint on Amazon SageMaker, you need three steps:

(1) Model creation

Where you create the model with client.create_model
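
For example, a minimal boto3 sketch of this step (all names are placeholders; artifact is the S3 model location you already printed above, and role is your execution role):

import boto3
import sagemaker

client = boto3.client("sagemaker")

# Assumption: retrieve the SageMaker scikit-learn serving image matching
# the training framework version; adjust the region to yours.
image_uri = sagemaker.image_uris.retrieve(
    framework="sklearn", region="us-east-1", version="1.0-1"
)

client.create_model(
    ModelName="randomforestclassifier-serverless",  # placeholder name
    ExecutionRoleArn=role,  # same IAM role used for training
    PrimaryContainer={
        "Image": image_uri,
        "ModelDataUrl": artifact,  # model.tar.gz from the training job
        # For script-mode models the container also needs the entry point;
        # depending on how your code was packaged you may additionally
        # need SAGEMAKER_SUBMIT_DIRECTORY.
        "Environment": {"SAGEMAKER_PROGRAM": "script_rf.py"},
    },
)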

(2) Endpoint configuration creation

Where you create an endpoint configuration with client.create_endpoint_config. Make sure it includes a serverless configuration with two parameters, MemorySizeInMB and MaxConcurrency, which should look like:

"ServerlessConfig": {
        "MemorySizeInMB": 4096,
        "MaxConcurrency": 1,
},
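
For example (a sketch with placeholder names; client is the same boto3 SageMaker client as above):

# Serverless endpoints need no instance type or count; the
# ServerlessConfig replaces the usual instance settings.
client.create_endpoint_config(
    EndpointConfigName="randomforestclassifier-serverless-config",  # placeholder
    ProductionVariants=[
        {
            "VariantName": "AllTraffic",
            "ModelName": "randomforestclassifier-serverless",  # from step (1)
            "ServerlessConfig": {
                "MemorySizeInMB": 4096,
                "MaxConcurrency": 1,
            },
        }
    ],
)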

(3) Endpoint creation and invocation

Once you have created the model and an endpoint configuration with the serverless details, you can run client.create_endpoint to create the serverless endpoint.
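
A sketch of this final step, again with placeholder names, including a test invocation (assuming your entry script accepts CSV input):

# Create the endpoint and wait until it is in service.
client.create_endpoint(
    EndpointName="randomforestclassifier-serverless-ep",  # placeholder
    EndpointConfigName="randomforestclassifier-serverless-config",
)
waiter = client.get_waiter("endpoint_in_service")
waiter.wait(EndpointName="randomforestclassifier-serverless-ep")

# Invoke the serverless endpoint through the runtime client.
runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName="randomforestclassifier-serverless-ep",
    ContentType="text/csv",     # assumption: the script handles CSV
    Body="1.0,2.0,3.0,4.0",     # example payload with four features
)
print(response["Body"].read())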

Reference: Deploying ML models using SageMaker Serverless Inference

AWS
Mia C
Answered 2 years ago

Actually, creating a serverless endpoint has become much easier, because you can now use the SageMaker Python SDK to do so. That means you don't have to use boto3, you don't have to create models or endpoint configurations yourself, and you also don't have to specify the image_uri anymore.

Instead, once you have your trained estimator, you can just use these lines of code:

from sagemaker.serverless import ServerlessInferenceConfig

# Memory size (MB) for the endpoint and the maximum number of
# concurrent invocations it should handle.
serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=4096,
    max_concurrency=10,
)

serverless_predictor = estimator.deploy(serverless_inference_config=serverless_config)
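
Once the endpoint is in service, you can test it through the returned predictor (a sketch; X_sample is a placeholder, and the SKLearn predictor's default serializer accepts NumPy arrays):

import numpy as np

# Placeholder input: one row with the same number of features as the
# training data.
X_sample = np.array([[1.0, 2.0, 3.0, 4.0]])
print(serverless_predictor.predict(X_sample))

# Delete the endpoint when you are done testing.
serverless_predictor.delete_endpoint()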

See also the documentation: https://sagemaker.readthedocs.io/en/stable/overview.html#sagemaker-serverless-inference

AWS
Heiko
Answered 2 years ago
