What are some ways to expose SageMaker endpoints as HTTP/REST endpoints?


I am testing serverless SageMaker endpoints and was planning to integrate them with API Gateway directly, but I realized that API Gateway has a 29-second timeout, which will not work if an endpoint takes longer than that for inference. Is there any workaround for this other than adding a Lambda function in between? I am trying to avoid Lambda because it might add latency.

  • If your stack runs API Gateway in front of a SageMaker endpoint, the 29-second timeout applies whether or not a Lambda function or other middleware sits in between. What exactly are you trying to do with API Gateway in front of your endpoint? Knowing that would help us work toward a solution for you.

  • @Ted_Z - API Gateway is there to simplify testing for a client, so that they can call the service with just an API key, without us having to share user credentials or set up a new user.

1 Answer

Here is a blog post about adding an API Gateway in front of a SageMaker endpoint: https://aws.amazon.com/blogs/machine-learning/creating-a-machine-learning-powered-rest-api-with-amazon-api-gateway-mapping-templates-and-amazon-sagemaker/

How long does your model take for inference right now? If it is slower than you expected, you might want to choose a larger instance type, or a GPU instance if the model can use a GPU. Take a look at https://github.com/aws-samples/aws-marketplace-machine-learning/blob/master/right_size_your_sagemaker_endpoints/Right-sizing%20your%20Amazon%20SageMaker%20Endpoints.ipynb

You can also consider using async inference. When API Gateway receives a request, trigger an async inference job and return immediately. The endpoint then writes the result to an S3 bucket, and you notify your user either through SNS -> Email or by having them poll a separate API.
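
As a rough sketch of that flow (not a definitive implementation), the code below queues a request and later checks for the result. It assumes the endpoint was created with an AsyncInferenceConfig, that `s3` and `runtime` are boto3 clients for "s3" and "sagemaker-runtime", and that the model returns JSON. The helper names, the `async-input/` key prefix, and the bucket and endpoint names are hypothetical.

```python
import json
import uuid
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def submit_async(s3, runtime, endpoint_name, input_bucket, payload):
    """Upload the payload to S3, queue an async invocation, and return at once."""
    key = f"async-input/{uuid.uuid4()}.json"  # hypothetical key layout
    s3.put_object(
        Bucket=input_bucket,
        Key=key,
        Body=json.dumps(payload).encode("utf-8"),
    )
    # Async inference reads its input from S3 rather than the request body.
    response = runtime.invoke_endpoint_async(
        EndpointName=endpoint_name,
        InputLocation=f"s3://{input_bucket}/{key}",
        ContentType="application/json",
    )
    return {
        "inference_id": response["InferenceId"],
        "output_location": response["OutputLocation"],
    }


def fetch_result(s3, output_location):
    """Return the decoded result, or None if it has not been written yet."""
    bucket, key = split_s3_uri(output_location)
    try:
        obj = s3.get_object(Bucket=bucket, Key=key)
    except s3.exceptions.NoSuchKey:
        return None  # still running (assumes the caller may list the bucket)
    return json.loads(obj["Body"].read())
```

With real clients (`s3 = boto3.client("s3")`, `runtime = boto3.client("sagemaker-runtime")`), the function behind the API Gateway route would call `submit_async` and hand the returned `output_location` or `inference_id` back to the caller, while a second polling route would call `fetch_result` until it returns something other than `None`.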

S Lyu
answered a year ago
