I have some questions regarding asynchronous inference with SageMaker model packages on the AWS Marketplace.
- When packaging a model as a SageMaker model package to be offered on the Marketplace, can it support asynchronous inference? The documentation only mentions real-time and batch inference.
- The docs say that before publishing to the Marketplace, one should first test the model package with real-time and batch inference. What if my model cannot support real-time inference, e.g. because processing a single request takes longer than the 60-second invocation timeout? Is one required to offer real-time and/or batch inference?
- Can one restrict which deployment types a customer may use the model package for, e.g. only asynchronous inference, or only real-time inference?
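For context, this is roughly the deployment I have in mind: an async endpoint backed by a model created from the Marketplace model package. Below is a sketch of the request I would send to boto3's SageMaker `create_endpoint_config` API. All names, the instance type, and the S3 path are placeholders; whether this works with a marketplace model package at all is exactly what I am asking.

```python
# Sketch of the async endpoint configuration I'd like to use with a
# marketplace model package (all names and the S3 path are placeholders).
# The request shape follows boto3's SageMaker create_endpoint_config API.

def build_async_endpoint_config(model_name: str, s3_output_path: str) -> dict:
    """Build a CreateEndpointConfig request that enables async inference."""
    return {
        "EndpointConfigName": f"{model_name}-async-config",
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                # ModelName would refer to a model created from the
                # marketplace model package ARN.
                "ModelName": model_name,
                "InstanceType": "ml.m5.xlarge",
                "InitialInstanceCount": 1,
            }
        ],
        "AsyncInferenceConfig": {
            "OutputConfig": {
                # Async results are written to S3 instead of being returned
                # inline, so the 60-second real-time response limit
                # wouldn't apply.
                "S3OutputPath": s3_output_path,
            }
        },
    }

config = build_async_endpoint_config(
    "my-marketplace-model", "s3://my-bucket/async-out/"
)
# In a real deployment this would be passed to boto3:
#   boto3.client("sagemaker").create_endpoint_config(**config)
```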