Is it possible to achieve multi-threading in a SageMaker endpoint?

Context: I have a real-time SageMaker endpoint that performs two KNN searches on two separate datasets. I'd like to parallelize the two searches with a thread pool of some sort. Is that possible in SageMaker, and if so, is it recommended?
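To make the idea concrete, here is a minimal sketch of a thread pool inside the inference code. Everything in it is an assumption for illustration: `knn_search` is a brute-force stand-in for the real index lookup, and `predict` is a hypothetical handler name, not a SageMaker API. Note that Python threads only speed up CPU-bound work when the underlying library releases the GIL (as NumPy, FAISS and scikit-learn typically do for heavy computations); the pure-Python stand-in below shows the structure, not a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq
import math


def knn_search(dataset, query, k):
    """Brute-force k-nearest-neighbour search (stand-in for the real index).

    Returns the indices of the k points in `dataset` closest to `query`,
    ordered from nearest to farthest.
    """
    return heapq.nsmallest(
        k, range(len(dataset)), key=lambda i: math.dist(dataset[i], query)
    )


# Created once at module load so the threads are reused across requests.
_POOL = ThreadPoolExecutor(max_workers=2)


def predict(query, dataset_a, dataset_b, k=3):
    """Hypothetical handler: run both searches concurrently and merge results."""
    future_a = _POOL.submit(knn_search, dataset_a, query, k)
    future_b = _POOL.submit(knn_search, dataset_b, query, k)
    # .result() blocks until the search finishes and re-raises its exceptions.
    return {"dataset_a": future_a.result(), "dataset_b": future_b.result()}
```

The instance needs at least two vCPUs for the two searches to actually overlap.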

Another option is to have a Lambda function split the request into two invoke_endpoint() calls, so that the endpoint is invoked twice, once per dataset. However, that requires a multi-worker or multi-host endpoint, which can drive up costs, so I'd like to explore multi-threading in the model itself first.
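For comparison, the Lambda fan-out could look roughly like the sketch below. The payload shape (`query` and `dataset` keys) and the function names are assumptions; only `invoke_endpoint` with `EndpointName`, `ContentType` and `Body` is the actual boto3 `sagemaker-runtime` call. The client is passed in as a parameter so the logic can be exercised without AWS credentials.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def search_one(runtime_client, endpoint_name, query, dataset):
    """Send one request that asks the endpoint to search a single dataset."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        # Assumed payload shape; the model code must understand these keys.
        Body=json.dumps({"query": query, "dataset": dataset}),
    )
    return json.loads(response["Body"].read())


def fan_out(runtime_client, endpoint_name, query, datasets=("a", "b")):
    """Issue one invoke_endpoint call per dataset, in parallel."""
    with ThreadPoolExecutor(max_workers=len(datasets)) as pool:
        futures = {
            name: pool.submit(search_one, runtime_client, endpoint_name, query, name)
            for name in datasets
        }
        return {name: future.result() for name, future in futures.items()}
```

In a Lambda handler, `runtime_client` would be `boto3.client("sagemaker-runtime")`. The two calls only run in parallel on the endpoint side if it has enough workers or instances to serve them at the same time.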

Thanks!!!

zachliu
Asked 8 months ago · 386 views
1 Answer

Hi,

You could explore provisioned concurrency for Amazon SageMaker Serverless Inference; see https://aws.amazon.com/blogs/machine-learning/announcing-provisioned-concurrency-for-amazon-sagemaker-serverless-inference/

You can monitor and fine-tune it using these metrics:

ServerlessProvisionedConcurrencyExecutions – The number of concurrent runs handled by the endpoint
ServerlessProvisionedConcurrencyUtilization – The number of concurrent runs divided by the allocated provisioned concurrency
ServerlessProvisionedConcurrencyInvocations – The number of InvokeEndpoint requests handled by the provisioned concurrency
ServerlessProvisionedConcurrencySpilloverInvocations – The number of InvokeEndpoint requests not handled by provisioned concurrency, which are handled by on-demand Serverless Inference
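As a rough sketch of how provisioned concurrency is set, the endpoint configuration below uses the `ServerlessConfig` fields of the `create_endpoint_config` API (`MemorySizeInMB`, `MaxConcurrency`, `ProvisionedConcurrency`). The helper names, the variant name and the default values are illustrative assumptions, not recommendations; the SageMaker client is passed in so the request can be inspected without calling AWS.

```python
def build_serverless_variant(model_name, memory_mb=2048,
                             max_concurrency=2, provisioned_concurrency=2):
    """Build one production variant with provisioned concurrency (assumed defaults)."""
    # Provisioned concurrency may not exceed the variant's MaxConcurrency.
    if provisioned_concurrency > max_concurrency:
        raise ValueError("provisioned_concurrency must be <= max_concurrency")
    return {
        "VariantName": "AllTraffic",  # illustrative variant name
        "ModelName": model_name,
        "ServerlessConfig": {
            "MemorySizeInMB": memory_mb,
            "MaxConcurrency": max_concurrency,
            "ProvisionedConcurrency": provisioned_concurrency,
        },
    }


def create_serverless_config(sagemaker_client, config_name, model_name, **kwargs):
    """Create the endpoint config; sagemaker_client is boto3.client('sagemaker')."""
    return sagemaker_client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[build_serverless_variant(model_name, **kwargs)],
    )
```

With a concurrency of 2, the two per-dataset requests from a single caller can be served at the same time; the metrics above show whether that capacity is actually used.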

Best,

Didier

AWS Expert
Answered 8 months ago