1 Answer
Hi, ProductionVariant is the key parameter that defines which model is associated with the endpoint, which is why it appears in the doc that you point to. From that page:
you define a ProductionVariant, for each model that you want to deploy. Each
ProductionVariant parameter also describes the resources that you want SageMaker
to provision. This includes the number and type of ML compute instances to deploy.
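To make the quoted description concrete, here is a minimal sketch in Python of an endpoint configuration with one ProductionVariant per model. It assumes boto3's SageMaker client and its `create_endpoint_config` call. The helper names, the config, variant, and model names, and the default `ml.m5.large` instance type are all hypothetical placeholders, not values from your setup.

```python
from typing import Any


def build_endpoint_config_request(
    config_name: str, variants: list[dict[str, Any]]
) -> dict[str, Any]:
    """Build keyword arguments for SageMaker's CreateEndpointConfig call.

    Each entry in `variants` describes one model to deploy and the
    instances SageMaker should provision for it.
    """
    production_variants = []
    for v in variants:
        production_variants.append(
            {
                "VariantName": v["variant_name"],
                # Must name a model that already exists in SageMaker.
                "ModelName": v["model_name"],
                # Assumed defaults, chosen only for illustration.
                "InstanceType": v.get("instance_type", "ml.m5.large"),
                "InitialInstanceCount": v.get("instance_count", 1),
                "InitialVariantWeight": v.get("weight", 1.0),
            }
        )
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": production_variants,
    }


def create_endpoint_config(request: dict[str, Any]) -> dict[str, Any]:
    """Send the request to SageMaker (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without it

    client = boto3.client("sagemaker")
    return client.create_endpoint_config(**request)


# Two hypothetical models behind one endpoint, splitting traffic 90/10.
request = build_endpoint_config_request(
    "demo-endpoint-config",
    [
        {"variant_name": "variant-a", "model_name": "demo-model-a", "weight": 0.9},
        {"variant_name": "variant-b", "model_name": "demo-model-b", "weight": 0.1},
    ],
)
```

Calling `create_endpoint_config(request)` would submit the configuration. Note that `ModelName` refers to a SageMaker model, not a model package.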
You should probably read this blog post on setting up a multi-endpoint / multi-model inference service on SageMaker: https://aws.amazon.com/blogs/machine-learning/part-3-model-hosting-patterns-in-amazon-sagemaker-run-and-optimize-multi-model-inference-with-amazon-sagemaker-multi-model-endpoints/
In fact, I'd recommend the full series: go to the bottom of the post to access all the parts.
Best,
Didier

Thanks Didier_Durand. I had confused models with model packages. I think I'm good for now.