Here is a blog post about adding an API Gateway in front of a SageMaker endpoint: https://aws.amazon.com/blogs/machine-learning/creating-a-machine-learning-powered-rest-api-with-amazon-api-gateway-mapping-templates-and-amazon-sagemaker/ How long does your model take for inference right now? If your model is slower than you expected, you might want to choose a larger instance type, or a GPU instance if your model can use a GPU. Take a look at https://github.com/aws-samples/aws-marketplace-machine-learning/blob/master/right_size_your_sagemaker_endpoints/Right-sizing%20your%20Amazon%20SageMaker%20Endpoints.ipynb
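As a first step, it helps to measure the endpoint's current latency before resizing anything. A minimal sketch (the endpoint name and payload below are placeholders, not from your stack; assumes boto3 and AWS credentials are available when the measurement function is actually called):

```python
import math
import time


def percentile(latencies_ms, p):
    """Nearest-rank p-th percentile of a list of latency samples."""
    ordered = sorted(latencies_ms)
    k = max(0, math.ceil(p / 100 * len(ordered)) - 1)
    return ordered[k]


def measure_endpoint_latency(endpoint_name, payload, n=20):
    """Time n synchronous invocations of a SageMaker endpoint, in milliseconds."""
    import boto3  # deferred import: the percentile helper above works without AWS access

    runtime = boto3.client("sagemaker-runtime")
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        runtime.invoke_endpoint(
            EndpointName=endpoint_name,      # placeholder: your endpoint's name
            ContentType="application/json",
            Body=payload,                     # placeholder: a representative request body
        )
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies
```

Looking at p90/p99 rather than the average is usually more informative, since API Gateway's timeout is a hard cutoff on the worst case, not the typical case.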
You can also consider using async inference. When API Gateway receives a request, trigger an async inference job and return immediately. Then let the endpoint write the result to an S3 bucket and notify your user, either via SNS -> email or through a polling API, etc.
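The trigger-and-return pattern above can be sketched with `invoke_endpoint_async`. This assumes an async SageMaker endpoint is already configured with an S3 output path; the endpoint name and S3 URIs are placeholders, and the input payload must already be uploaded to S3 (async inference reads it from there, not from the request body):

```python
import uuid


def build_async_request(endpoint_name, input_s3_uri, inference_id=None):
    """Build the parameter dict for sagemaker-runtime invoke_endpoint_async."""
    return {
        "EndpointName": endpoint_name,          # placeholder: your async endpoint's name
        "InputLocation": input_s3_uri,          # S3 URI of the already-uploaded input payload
        "ContentType": "application/json",
        "InferenceId": inference_id or str(uuid.uuid4()),
    }


def submit_async_inference(endpoint_name, input_s3_uri):
    """Fire the async job and return the S3 URI where the result will be written."""
    import boto3  # deferred import: the request builder above works without AWS access

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint_async(
        **build_async_request(endpoint_name, input_s3_uri)
    )
    # The endpoint writes the result to this key when the job completes;
    # poll it, or configure the endpoint's SNS success/error topics instead.
    return response["OutputLocation"]
```

The caller behind API Gateway gets the `InferenceId` (or output location) back immediately and is never held open against the 29-second timeout.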
If your stack runs API Gateway in front of a SageMaker endpoint, whether or not you have a Lambda or other middleware in the middle, you will be governed by the 29-second integration timeout. What exactly are you trying to do with API Gateway in front of your endpoint, so we can better work toward a solution for you?
@Ted_Z - API Gateway is there to simplify testing for a client, so that they can call the service with just an API key, without having to share user credentials or set up a new user.