I want to deploy my model as a serverless inference endpoint


Hey, I trained a scikit-learn model using the Python SDK and now I want to deploy it as a serverless inference endpoint. I am new to AWS and can't seem to make sense of the documentation. The model is fit in an estimator as follows:

from sagemaker.sklearn.estimator import SKLearn

enable_local_mode_training = False

# trainpath, testpath, and role are defined earlier in the notebook
inputs = {"train": trainpath, "test": testpath}

estimator_parameters = {
    "entry_point": "script_rf.py",
    "framework_version": "1.0-1",
    "py_version": "py3",
    "instance_type": "ml.c5.xlarge",
    "instance_count": 1,
    "role": role,
    "base_job_name": "randomforestclassifier-model",
}

estimator = SKLearn(**estimator_parameters)
estimator.fit(inputs)

This works fine, but now when I try to deploy the model it doesn't work. I tried the code from this post: https://aws.amazon.com/blogs/machine-learning/deploying-ml-models-using-sagemaker-serverless-inference-preview/ but I keep getting errors because some things are not defined, like the image_uri, which I am not using.

I used this:

import boto3

m_boto3 = boto3.client('sagemaker')

# wait for the training job to finish, then look up the S3 location
# of the model artifact it produced
estimator.latest_training_job.wait(logs='None')
artifact = m_boto3.describe_training_job(
    TrainingJobName=estimator.latest_training_job.name)['ModelArtifacts']['S3ModelArtifacts']

print('Model artifact persisted at ' + artifact)

but then the endpoint is not serverless. Please help!

2 answers

To deploy a serverless endpoint on Amazon SageMaker you need three steps:

(1) Model creation

Here you create the model with client.create_model, pointing it at the S3 model artifact from your training job and a serving image.
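
For example, a minimal sketch using boto3, assuming the role from your training code and the artifact path from your describe_training_job call above (the model name and region are hypothetical choices). The image_uri you were missing can be looked up with sagemaker.image_uris.retrieve instead of being hard-coded:

import boto3
from sagemaker import image_uris

client = boto3.client('sagemaker')

# look up the managed scikit-learn serving container for your region
# (this replaces the hard-coded image_uri from the blog post)
image_uri = image_uris.retrieve(
    framework='sklearn',
    region='us-east-1',        # assumption: use your own region
    version='1.0-1',           # same framework_version as your estimator
    image_scope='inference',
)

client.create_model(
    ModelName='randomforestclassifier-serverless',  # hypothetical name
    ExecutionRoleArn=role,                          # same role as for training
    PrimaryContainer={
        'Image': image_uri,
        'ModelDataUrl': artifact,   # S3 path from describe_training_job
    },
)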

(2) Endpoint configuration creation

Here you create an endpoint configuration with client.create_endpoint_config. Make sure it contains a serverless configuration with two parameters, MemorySizeInMB and MaxConcurrency, which should look like:

"ServerlessConfig": {
        "MemorySizeInMB": 4096,
        "MaxConcurrency": 1,
},
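
Put together, the call could look like this (a sketch; the config name is hypothetical, and MemorySizeInMB must be one of 1024, 2048, 3072, 4096, 5120 or 6144):

client.create_endpoint_config(
    EndpointConfigName='randomforest-serverless-config',  # hypothetical name
    ProductionVariants=[
        {
            'ModelName': 'randomforestclassifier-serverless',
            'VariantName': 'AllTraffic',
            'ServerlessConfig': {
                'MemorySizeInMB': 4096,
                'MaxConcurrency': 1,
            },
        }
    ],
)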

(3) Endpoint creation and invocation

Once you have created the model and an endpoint configuration with the serverless details, you can run client.create_endpoint to create the serverless endpoint.
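
A sketch of this last step, including an invocation (the endpoint name is hypothetical, and the content type and payload depend on how your script_rf.py handles input):

client.create_endpoint(
    EndpointName='randomforest-serverless-ep',   # hypothetical name
    EndpointConfigName='randomforest-serverless-config',
)

# block until the endpoint is InService
waiter = client.get_waiter('endpoint_in_service')
waiter.wait(EndpointName='randomforest-serverless-ep')

# invocation goes through the runtime client, not the sagemaker client
runtime = boto3.client('sagemaker-runtime')
response = runtime.invoke_endpoint(
    EndpointName='randomforest-serverless-ep',
    ContentType='text/csv',       # assumption: depends on your inference script
    Body='1.0,2.0,3.0,4.0',       # hypothetical payload
)
print(response['Body'].read())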

Reference: Deploying ML models using SageMaker Serverless Inference

AWS
Mia C
answered 2 years ago

Actually, creating a serverless endpoint has become much easier because you can now use the SageMaker Python SDK to do so. That means you don't have to use boto3 anymore, you don't have to create models or endpoint configurations yourself, and you also don't have to specify the image_uri anymore.

Instead, once you have trained your estimator, you can just use these lines of code:

from sagemaker.serverless import ServerlessInferenceConfig

# ServerlessInferenceConfig() with no arguments uses the defaults;
# here the memory size and maximum concurrency are set explicitly
serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=4096,
    max_concurrency=10,
)

serverless_predictor = estimator.deploy(serverless_inference_config=serverless_config)
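
Once the endpoint is up you can use the returned predictor like any real-time predictor (a sketch; the payload shape depends on what your script_rf.py expects):

# hypothetical payload; format depends on your inference script
result = serverless_predictor.predict([[1.0, 2.0, 3.0, 4.0]])
print(result)

# delete the endpoint when you no longer need it
serverless_predictor.delete_endpoint()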

See also the documentation: https://sagemaker.readthedocs.io/en/stable/overview.html#sagemaker-serverless-inference

AWS
Heiko
answered 2 years ago
