To deploy a serverless endpoint on Amazon SageMaker you need three steps:
(1) Model creation
Create the model with client.create_model.
(2) Endpoint configuration creation
Create the endpoint configuration with client.create_endpoint_config, and make sure it includes a serverless configuration with two parameters: MemorySizeInMB and MaxConcurrency. Within the config, it should look like:
"ServerlessConfig": {
"MemorySizeInMB": 4096,
"MaxConcurrency": 1,
},
(3) Endpoint creation and invocation
Once you have created the model and the endpoint configuration with the serverless details, you can run client.create_endpoint to create the serverless endpoint.
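As a minimal sketch of the three steps above, the request payloads can be built as plain dictionaries and passed to the matching boto3 calls. All names below (the model name, role ARN, image URI, S3 path, endpoint names) are placeholders you would replace with your own resources:

```python
# Step (1): payload for client.create_model
model_request = {
    "ModelName": "my-serverless-model",  # placeholder
    "ExecutionRoleArn": "arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    "PrimaryContainer": {
        "Image": "<inference-image-uri>",  # placeholder container image
        "ModelDataUrl": "s3://my-bucket/model.tar.gz",  # placeholder artifact
    },
}

# Step (2): payload for client.create_endpoint_config, with the
# ServerlessConfig block carrying the two required parameters
endpoint_config_request = {
    "EndpointConfigName": "my-serverless-config",
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": model_request["ModelName"],
            "ServerlessConfig": {
                "MemorySizeInMB": 4096,
                "MaxConcurrency": 1,
            },
        }
    ],
}

# Step (3): payload for client.create_endpoint
endpoint_request = {
    "EndpointName": "my-serverless-endpoint",
    "EndpointConfigName": endpoint_config_request["EndpointConfigName"],
}

# With a boto3 SageMaker client these would be submitted as:
#   client = boto3.client("sagemaker")
#   client.create_model(**model_request)                      # step 1
#   client.create_endpoint_config(**endpoint_config_request)  # step 2
#   client.create_endpoint(**endpoint_request)                # step 3
# and the running endpoint is then invoked through the
# "sagemaker-runtime" client's invoke_endpoint call.
```

Building the payloads separately makes the ServerlessConfig block easy to inspect before anything is created in your account.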
Reference: Deploying ML models using SageMaker Serverless Inference
Actually, creating a serverless endpoint has become much easier because you can now use the SageMaker Python SDK to do so: you no longer have to use boto3, you don't have to create models or endpoint configurations yourself, and you don't have to specify the image_uri anymore.
Instead, once you have your estimator, you can just use these lines of code:
from sagemaker.serverless import ServerlessInferenceConfig

# ServerlessInferenceConfig() with no arguments uses the defaults;
# here the memory size and concurrency are set explicitly.
serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=4096,
    max_concurrency=10,
)
serverless_predictor = estimator.deploy(serverless_inference_config=serverless_config)
See also the documentation: https://sagemaker.readthedocs.io/en/stable/overview.html#sagemaker-serverless-inference