Hello,
Thanks for trying SageMaker, and our apologies for the late response.
- Does AWS SageMaker not process requests concurrently? Meaning, I would expect the server to be able to handle two requests at the same time?
Answer: SageMaker does process requests concurrently. We forward requests to the model container as we receive them and do not enqueue them. However, throttling is in place and can kick in if more requests arrive than the endpoint can handle; in that case you get an error response immediately. Here it is possible that your model is processing the requests sequentially. I suggest testing your model container locally with concurrent requests.
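One rough way to run that local test is sketched below. It assumes the container is already running on your machine and serving the usual `/invocations` route on port 8080; the payload, the request count, and the helper names (`invoke`, `probe_concurrency`, `looks_sequential`) are hypothetical and only for illustration. The idea is to compare the latency of a single request with the wall-clock time of several requests sent at once.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def invoke(url, payload, content_type="application/json", timeout=70.0):
    """POST one payload and return the observed latency in seconds."""
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": content_type},
        method="POST",
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start


def probe_concurrency(url, payload, n_requests=4):
    """Send n_requests simultaneously; return (latencies, wall-clock seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_requests) as pool:
        latencies = list(pool.map(lambda _: invoke(url, payload), range(n_requests)))
    return latencies, time.perf_counter() - start


def looks_sequential(single_latency, wall_clock, n_requests, tolerance=0.7):
    """Heuristic: wall-clock time close to n * single latency suggests serial handling."""
    return wall_clock >= tolerance * n_requests * single_latency


if __name__ == "__main__":
    # Assumed local container address and a made-up payload; adjust to your model.
    url = "http://localhost:8080/invocations"
    payload = b'{"instances": [[1.0, 2.0]]}'

    baseline = invoke(url, payload)
    latencies, wall = probe_concurrency(url, payload, n_requests=4)
    print(f"single request: {baseline:.2f}s")
    print(f"4 concurrent requests: {wall:.2f}s total, per request {latencies}")
    print("likely sequential" if looks_sequential(baseline, wall, 4) else "likely concurrent")
```

If the four concurrent requests take roughly four times as long as one, the serving code inside the container (for example a single-worker web server) is probably handling them one at a time. The 0.7 tolerance is an arbitrary threshold, not a SageMaker value.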
- From the client's side, how does it handle the case when the server is busy? I notice that it internally retries about 3 times, once every 60 seconds, if the request is not handled.
- Within each 60-second window, how does the client code call the endpoint? Does it keep calling every 1, 2, 4, 6, 8 seconds?
Answer: For these, refer to the AWS SDK client configuration:
https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/ClientConfiguration.html
To answer your question: the API call waits for the response. If it gets an exception or a timeout, it retries according to the retry policy you set in the SDK client configuration.
Thanks
Edited by: harishataws on May 21, 2020 12:01 PM