Deploying a large-scale ML model


Hi, I am deploying an ML model with a retrieval component on AWS, and it has two parts:

  1. ML model: deployed using SageMaker. The model isn't big, so this part is simple.
  2. Retrieval: the ML model first retrieves information from a database using an ANN algorithm (like Annoy or ScaNN; see the sketch below). The database needs to stay loaded in memory at all times for really fast inference. However, the database is big (around 500 GB). What is the best way to deploy this database? Is SageMaker the best bet?
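For context, this is roughly what the retrieval step looks like with Annoy. A minimal sketch: the dimension, file name, and random data below are illustrative assumptions, not taken from the actual system. One detail that matters for the memory question: Annoy's `load()` memory-maps the index file rather than copying it onto the heap, so the OS page cache decides what actually sits in RAM.

```python
# Minimal Annoy sketch -- dimension, file name, and random data are
# illustrative assumptions, not taken from the actual system.
import random
from annoy import AnnoyIndex

DIM = 128  # assumed embedding dimensionality

# Build the index offline, once (random vectors stand in for real embeddings):
index = AnnoyIndex(DIM, "angular")
for i in range(1000):
    index.add_item(i, [random.gauss(0, 1) for _ in range(DIM)])
index.build(50)  # number of trees: more trees -> better recall, bigger index
index.save("retrieval.ann")

# At query time, load() memory-maps the file instead of reading it into the
# heap, so startup is fast and the OS page cache manages residency:
index = AnnoyIndex(DIM, "angular")
index.load("retrieval.ann")
query = [random.gauss(0, 1) for _ in range(DIM)]
print(index.get_nns_by_vector(query, 10))  # IDs of the 10 nearest items
```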
  • Can you please clarify: what database are you using to store the data, and what kind of data is stored in it?

  • Also, I wonder if you could give an idea of how many times the DB would be queried for one model inference? Once? Many times? The acceptable latency here might guide whether it's better/practical to have the endpoint call out to a separate DB service, or whether it's necessary to try and wedge everything into the endpoint container's RAM.
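If the index does fit on one large-memory instance, a common pattern with SageMaker's framework containers is to load it exactly once when the container starts and reuse it for every invocation. A sketch of an `inference.py` in that style: the function names follow SageMaker's inference-script convention, while the file name and dimension are assumptions.

```python
# inference.py -- sketch for a SageMaker framework container; assumes the
# Annoy index ships inside the model artifact. DIM and the file name are
# illustrative assumptions.
import os
from annoy import AnnoyIndex

DIM = 128  # assumed embedding dimensionality


def model_fn(model_dir):
    """Runs once at container startup: memory-map the index."""
    index = AnnoyIndex(DIM, "angular")
    index.load(os.path.join(model_dir, "retrieval.ann"))
    return index


def predict_fn(query_vector, index):
    """Runs per request: a single in-process ANN lookup, no network hop."""
    return index.get_nns_by_vector(query_vector, 10)
```

If 500 GB won't fit (SageMaker's memory-optimized hosting instances, e.g. the ml.r5 family, top out in the high hundreds of GiB), the alternative is the one the comment above suggests: a separate retrieval service the endpoint calls out to, paying a network hop per query.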

Asked 2 years ago · 96 views
No answers
