To deploy a serverless endpoint on Amazon SageMaker, you need three steps:
(1) Model creation
Create the model with client.create_model.
(2) Endpoint configuration creation
Create the endpoint configuration with client.create_endpoint_config, and make sure it includes a serverless configuration with two parameters, MemorySizeInMB and MaxConcurrency, which should look like:
"ServerlessConfig": {
"MemorySizeInMB": 4096,
"MaxConcurrency": 1,
},
(3) Endpoint creation and invocation
Once you have created the model and an endpoint configuration with the serverless details, you can run client.create_endpoint to create the serverless endpoint.
Reference: Deploying ML models using SageMaker Serverless Inference
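The three calls above can be sketched as request payloads. This is a minimal sketch, and every name in it (model name, role ARN, image URI, S3 path, endpoint name) is a placeholder, not something from the original answer:

```python
# Sketch of the request payloads for the three boto3 calls described above.
# All names, ARNs, and URIs below are illustrative placeholders.

model_request = {
    "ModelName": "my-serverless-model",
    "ExecutionRoleArn": "arn:aws:iam::123456789012:role/SageMakerRole",
    "PrimaryContainer": {
        "Image": "<account>.dkr.ecr.<region>.amazonaws.com/my-image:latest",
        "ModelDataUrl": "s3://my-bucket/model.tar.gz",
    },
}

endpoint_config_request = {
    "EndpointConfigName": "my-serverless-config",
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": "my-serverless-model",
            # The serverless configuration with its two required parameters:
            "ServerlessConfig": {
                "MemorySizeInMB": 4096,
                "MaxConcurrency": 1,
            },
        }
    ],
}

endpoint_request = {
    "EndpointName": "my-serverless-endpoint",
    "EndpointConfigName": "my-serverless-config",
}

# With a boto3 SageMaker client, these payloads would be sent as:
#   client = boto3.client("sagemaker")
#   client.create_model(**model_request)
#   client.create_endpoint_config(**endpoint_config_request)
#   client.create_endpoint(**endpoint_request)
```

Note that ServerlessConfig replaces the instance type and instance count fields you would otherwise set on the production variant for an instance-based endpoint.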
Actually, creating a serverless endpoint has become much easier, because you can now use the SageMaker Python SDK to do so. You no longer have to use boto3, you don't have to create models or endpoint configurations yourself, and you also don't have to specify the image_uri anymore.
Instead, once you have your estimator, you can just use these lines of code:
from sagemaker.serverless import ServerlessInferenceConfig

serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=4096,
    max_concurrency=10,
)
serverless_predictor = estimator.deploy(serverless_inference_config=serverless_config)
See also the documentation: https://sagemaker.readthedocs.io/en/stable/overview.html#sagemaker-serverless-inference
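Once deployed, the returned predictor can be used for inference directly. A minimal sketch of invoking and cleaning up; the payload shape here is hypothetical, since the expected format depends on your model and the serializer configured on the predictor:

```python
import json

# Hypothetical request payload; the actual format depends on the model
# and on the serializer/deserializer attached to the predictor.
payload = json.dumps({"instances": [[0.5, 1.5, 2.5]]})

# With the predictor returned by estimator.deploy(...) above:
#   result = serverless_predictor.predict(payload)
# Delete the endpoint when finished so it does not linger:
#   serverless_predictor.delete_endpoint()
```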