Is it possible to achieve multi-threading in a SageMaker endpoint?


Context: I have a real-time SageMaker endpoint that essentially performs two KNN searches on two separate datasets. I'd like to parallelize the two searches by creating a thread pool of some sort. Is that possible in SageMaker, and if so, is it recommended?
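For reference, a minimal sketch of what in-model threading could look like. It assumes a Python inference handler; `knn_search` and `predict` are hypothetical names, and the brute-force search is only a stand-in for the real index lookup. Threads give a real speedup only if the underlying KNN library releases the GIL during the search (many native-code libraries do); with pure-Python search code like the stand-in below, the two searches would still run one after another.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq
import math


def knn_search(dataset, query, k):
    """Brute-force k-nearest-neighbour search (stand-in for the real index).

    Returns the indices of the k points in `dataset` closest to `query`,
    ordered from nearest to farthest.
    """
    return heapq.nsmallest(
        k, range(len(dataset)), key=lambda i: math.dist(dataset[i], query)
    )


# Create the pool once (e.g. at model-load time) and reuse it per request,
# rather than paying thread start-up cost on every invocation.
_POOL = ThreadPoolExecutor(max_workers=2)


def predict(query, dataset_a, dataset_b, k=2):
    """Submit both searches to the pool and wait for both results."""
    future_a = _POOL.submit(knn_search, dataset_a, query, k)
    future_b = _POOL.submit(knn_search, dataset_b, query, k)
    return {"a": future_a.result(), "b": future_b.result()}
```

Calling `predict((0, 0), dataset_a, dataset_b)` returns the neighbour indices for each dataset in a single response, so the endpoint is still invoked only once per request.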

Another option is to have a Lambda function split the request into two invoke_endpoint() calls, so that the endpoint is triggered twice, once per dataset. However, that requires a multi-worker or multi-host endpoint, which can drive up costs, so I'd like to explore multi-threading in the model itself first.
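A rough sketch of that fan-out option, for comparison. The payload shape (`query` and `dataset` keys), the dataset names, and the helper names `invoke_for_dataset` and `fan_out` are assumptions for illustration only. In a Lambda function, `client` would be created with `boto3.client("sagemaker-runtime")`; it is passed in as a parameter here so the sketch does not depend on AWS credentials.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def invoke_for_dataset(client, endpoint_name, query, dataset):
    """Send one request asking the endpoint to search a single dataset."""
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        # Hypothetical payload shape; the model's handler defines the real one.
        Body=json.dumps({"query": query, "dataset": dataset}),
    )
    return json.loads(response["Body"].read())


def fan_out(client, endpoint_name, query, datasets=("a", "b")):
    """Issue one invoke_endpoint() call per dataset concurrently."""
    with ThreadPoolExecutor(max_workers=len(datasets)) as pool:
        futures = {
            name: pool.submit(invoke_for_dataset, client, endpoint_name, query, name)
            for name in datasets
        }
        # Collect results keyed by dataset name once every call has returned.
        return {name: future.result() for name, future in futures.items()}
```

The two calls only overlap on the server side if the endpoint can process more than one request at a time, which is the cost concern raised above.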

Thanks!!!

zachliu
asked 8 months ago, 388 views
1 Answer

Hi,

What you may explore is provisioned concurrency for Amazon SageMaker Serverless Inference: see https://aws.amazon.com/blogs/machine-learning/announcing-provisioned-concurrency-for-amazon-sagemaker-serverless-inference/

You can fine-tune it by monitoring these metrics:

ServerlessProvisionedConcurrencyExecutions – The number of concurrent runs handled by the endpoint
ServerlessProvisionedConcurrencyUtilization – The number of concurrent runs divided by the allocated provisioned concurrency
ServerlessProvisionedConcurrencyInvocations – The number of InvokeEndpoint requests handled by the provisioned concurrency
ServerlessProvisionedConcurrencySpilloverInvocations – The number of InvokeEndpoint requests not handled by provisioned concurrency, which are handled by on-demand Serverless Inference

Best,

Didier

AWS Expert
answered 8 months ago
