Is it possible to achieve multi-threading in a SageMaker endpoint?


Context: I have a real-time SageMaker endpoint that essentially performs two KNN searches on two separate datasets. I'd like to parallelize the two searches by creating a thread pool of some sort. Is that possible in SageMaker, and if so, is it recommended?

Another option is to have a Lambda function split the request into two invoke_endpoint() calls, so that the endpoint is triggered twice, once per dataset. However, that requires a multi-worker or multi-host endpoint, which can drive up costs, so I'd like to explore multi-threading in the model itself first.
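
To make the thread-pool idea concrete, here is a minimal sketch of what I have in mind. It is not SageMaker-specific: `DATASET_A`, `DATASET_B`, `knn_search` and `predict` are hypothetical names, and the brute-force search is only a stand-in for the real KNN library. Note that pure-Python work like this stand-in is limited by the GIL, so threads would only give a real speed-up if the actual search library releases the GIL during its computation, which I am assuming but have not verified.

```python
import heapq
import math
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the two datasets loaded by the model.
DATASET_A = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, 5.0)]
DATASET_B = [(10.0, 10.0), (0.5, 0.5), (3.0, 3.0)]

# One pool for the life of the process, created once when the module loads.
_POOL = ThreadPoolExecutor(max_workers=2)


def knn_search(dataset, query, k):
    """Brute-force k-nearest-neighbour search; returns indices, nearest first."""
    return heapq.nsmallest(
        k, range(len(dataset)), key=lambda i: math.dist(dataset[i], query)
    )


def predict(query, k=2):
    """Submit both searches to the pool, then wait for both results."""
    future_a = _POOL.submit(knn_search, DATASET_A, query, k)
    future_b = _POOL.submit(knn_search, DATASET_B, query, k)
    return {"a": future_a.result(), "b": future_b.result()}
```

Calling `predict((0.9, 0.9))` would run both searches on the shared pool and return the neighbour indices for each dataset. The pool is created once rather than per request so that thread start-up cost is not paid on every invocation.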

Thanks!!!

zachliu
asked 8 months ago, 388 views
1 Answer

Hi,

One thing you could explore is provisioned concurrency for Amazon SageMaker Serverless Inference; see https://aws.amazon.com/blogs/machine-learning/announcing-provisioned-concurrency-for-amazon-sagemaker-serverless-inference/

You can monitor it closely through these CloudWatch metrics:

ServerlessProvisionedConcurrencyExecutions – The number of concurrent runs handled by the endpoint
ServerlessProvisionedConcurrencyUtilization – The number of concurrent runs divided by the allocated provisioned concurrency
ServerlessProvisionedConcurrencyInvocations – The number of InvokeEndpoint requests handled by the provisioned concurrency
ServerlessProvisionedConcurrencySpilloverInvocations – The number of InvokeEndpoint requests not handled by provisioned concurrency, which are handled by on-demand Serverless Inference

Best,

Didier

AWS
EXPERT
answered 8 months ago
