I want to deploy my model as a serverless inference endpoint


Hey, I trained a scikit-learn model using the Python SDK and now I want to deploy it as a serverless inference endpoint. I am new to AWS and can't seem to make sense of the documentation. The model is fit with an estimator as follows:

from sagemaker.sklearn.estimator import SKLearn
enable_local_mode_training = False


inputs = {"train": trainpath, "test": testpath}

estimator_parameters = {
    "entry_point": "script_rf.py",
    "framework_version": "1.0-1",
    "py_version": "py3",
    "instance_type": 'ml.c5.xlarge',
    "instance_count": 1,
    "role": role,
    "base_job_name": "randomforestclassifier-model"
}

estimator = SKLearn(**estimator_parameters)
estimator.fit(inputs)

This works fine, but when I try to deploy it, it doesn't work. I tried the code from https://aws.amazon.com/blogs/machine-learning/deploying-ml-models-using-sagemaker-serverless-inference-preview/ but I keep getting errors because some things are not defined, like the image_uri, which I am not using.

I used this to get the model artifact:

import boto3

m_boto3 = boto3.client('sagemaker')

# Wait for the training job to finish, then look up where the artifact was saved
estimator.latest_training_job.wait(logs='None')
artifact = m_boto3.describe_training_job(
    TrainingJobName=estimator.latest_training_job.name)['ModelArtifacts']['S3ModelArtifacts']

print('Model artifact persisted at ' + artifact)

but the endpoint it gives me is not serverless. Please help.

2 Answers

To deploy a serverless endpoint on Amazon SageMaker you need three steps:

(1) Model creation

Where you create the model with client.create_model
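A minimal sketch of that call, assuming the role and artifact variables from your question; the model name is made up, and sagemaker.image_uris.retrieve looks up the managed sklearn container, which is what the blog post refers to as image_uri:

import boto3
import sagemaker

client = boto3.client('sagemaker')
region = boto3.Session().region_name

# Look up the managed sklearn serving image instead of hard-coding image_uri
image_uri = sagemaker.image_uris.retrieve(
    framework='sklearn',
    region=region,
    version='1.0-1',
    instance_type='ml.m5.xlarge',  # only used to resolve the CPU image
)

client.create_model(
    ModelName='randomforestclassifier-serverless',  # hypothetical name
    ExecutionRoleArn=role,  # same role used for training
    PrimaryContainer={
        'Image': image_uri,
        'ModelDataUrl': artifact,  # S3 URI from describe_training_job
        # Depending on how the artifact was packaged, the container may also
        # need environment variables (e.g. SAGEMAKER_PROGRAM) to locate your
        # inference script; that part is setup-specific.
    },
)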

(2) Endpoint configuration creation

Where you create an endpoint configuration with client.create_endpoint_config. Make sure it includes a serverless configuration with two parameters, MemorySizeInMB and MaxConcurrency, which should look like:

"ServerlessConfig": {
        "MemorySizeInMB": 4096,
        "MaxConcurrency": 1,
},
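In boto3 that block goes inside a production variant. A sketch, reusing the hypothetical model name from step (1); note there is no InstanceType on a serverless variant, the ServerlessConfig takes its place:

client.create_endpoint_config(
    EndpointConfigName='randomforestclassifier-serverless-config',  # hypothetical
    ProductionVariants=[
        {
            'VariantName': 'AllTraffic',
            'ModelName': 'randomforestclassifier-serverless',
            'ServerlessConfig': {
                'MemorySizeInMB': 4096,
                'MaxConcurrency': 1,
            },
        },
    ],
)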

(3) Endpoint creation and invocation

Once you have created the model and the endpoint configuration with the serverless details, you can run client.create_endpoint to create a serverless endpoint.
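Continuing the sketch with the hypothetical names from above (the CSV payload is a placeholder and has to match whatever script_rf.py expects):

client.create_endpoint(
    EndpointName='randomforestclassifier-serverless-ep',  # hypothetical
    EndpointConfigName='randomforestclassifier-serverless-config',
)

# Block until the endpoint is InService
waiter = client.get_waiter('endpoint_in_service')
waiter.wait(EndpointName='randomforestclassifier-serverless-ep')

# Invocation goes through the runtime client, not the sagemaker client
runtime = boto3.client('sagemaker-runtime')
response = runtime.invoke_endpoint(
    EndpointName='randomforestclassifier-serverless-ep',
    ContentType='text/csv',
    Body='1.0,2.0,3.0,4.0',  # placeholder payload
)
print(response['Body'].read())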

Reference: Deploying ML models using SageMaker Serverless Inference

AWS
Mia C
answered 2 years ago

Actually, creating a serverless endpoint has become much easier because you can now use the SageMaker Python SDK to do so: you don't have to use boto3 anymore, you don't have to create models or endpoint configurations yourself, and you don't have to specify the image_uri anymore.

Instead, once you have your fitted estimator, you can just use these lines of code:

from sagemaker.serverless import ServerlessInferenceConfig

# Memory size (in MB) and maximum number of concurrent invocations
serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=4096,
    max_concurrency=10,
)

serverless_predictor = estimator.deploy(serverless_inference_config=serverless_config)
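Once that returns, you can call the endpoint straight from the returned predictor. A short sketch, where sample_features stands in for whatever input format your script_rf.py expects:

# Predict against the serverless endpoint
result = serverless_predictor.predict(sample_features)
print(result)

# Delete the endpoint when you no longer need it
serverless_predictor.delete_endpoint()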

See also the documentation: https://sagemaker.readthedocs.io/en/stable/overview.html#sagemaker-serverless-inference

AWS
Heiko
answered 2 years ago
