Deploy ML time series models effectively


Hi team! I need to deploy a large number of machine learning models (time series models) and I'm looking for an effective way to do it.

In detail, the problem is to build a platform capable of serving many time series with different frequencies, from 5 s to 10 min (maybe beyond that, but that's the range for now). The models differ in the ML framework they use. There are about 1000 ML models, ranging in size from 2 MB to 2 GB, with most models close to 2 GB. How should I design the model serving system to be most effective at an optimal cost?

Asked 1 year ago · Viewed 588 times
1 answer
Accepted Answer

Hello Quan Dang!

This SageMaker documentation page covers the model deployment options and deployment recommendations: https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-model.html#deploy-model-options

For your problem: per-model processing time is short, request payloads are small, the latency requirement is essentially real-time, and there are about 1000 deep learning models of roughly 2 GB each. That rules out async inference, serverless inference, and batch transform, leaving only one option: real-time inference. Within real-time inference, there are 4 hosting options:

- Host a single model per endpoint
- Host multiple models in one container behind one endpoint (multi-model endpoints)
- Host multiple models that use different containers behind one endpoint (multi-container endpoints)
- Host models along with pre-/post-processing logic as a serial inference pipeline behind one endpoint
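
For reference, here is a minimal sketch of the "host a single model" option using the SageMaker Python SDK. The framework, version, instance type, S3 path, role ARN, and endpoint name below are all assumptions to adjust per model, not values from your setup:

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.model import Model

session = sagemaker.Session()
# Hypothetical execution role ARN -- replace with your own.
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"

# Look up a prebuilt inference image for this model's framework
# (framework/version here are assumptions; adjust per model).
image_uri = image_uris.retrieve(
    framework="pytorch",
    region=session.boto_region_name,
    version="2.1",
    py_version="py310",
    instance_type="ml.g4dn.xlarge",
    image_scope="inference",
)

model = Model(
    image_uri=image_uri,
    model_data="s3://my-bucket/ts-models/model-001/model.tar.gz",  # placeholder
    role=role,
    sagemaker_session=session,
)

# One dedicated endpoint per model: lowest latency, but ~1000 endpoints
# serving ~2 GB models each would be the most expensive layout.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
    endpoint_name="ts-model-001",
)
```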

Since the serial inference pipeline addresses pre-/post-processing rather than how many models share an endpoint, we narrow it down to 3 options for deployment. Start by surveying your ML models' deployment details (statistics, for each model, on: framework, inference latency, and GPU usage type).

For models that are frequently accessed (inference latency under ~60 s), choose "Host a single model". Otherwise, for models that aren't frequently accessed: if they use the same ML framework, choose "Host multiple models in one container behind one endpoint"; if they use different ML frameworks, choose "Host multiple models that use different containers behind one endpoint".
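
As a sketch of the multi-model option, the following boto3 calls create one endpoint that serves every artifact stored under a single S3 prefix. This is illustrative, not your exact setup: the bucket, names, role ARN, image URI, and payload shape are placeholders:

```python
import boto3

sm = boto3.client("sagemaker")

# One container lazily loads any artifact found under the S3 prefix;
# Mode="MultiModel" is what enables multi-model hosting.
sm.create_model(
    ModelName="ts-mme",
    ExecutionRoleArn="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    PrimaryContainer={
        "Image": "<framework-inference-image-uri>",  # placeholder
        "Mode": "MultiModel",
        "ModelDataUrl": "s3://my-bucket/ts-models/",  # prefix holding model .tar.gz files
    },
)

sm.create_endpoint_config(
    EndpointConfigName="ts-mme-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "ts-mme",
        "InstanceType": "ml.m5.2xlarge",
        "InitialInstanceCount": 1,
    }],
)

sm.create_endpoint(EndpointName="ts-mme-endpoint", EndpointConfigName="ts-mme-config")
sm.get_waiter("endpoint_in_service").wait(EndpointName="ts-mme-endpoint")

# Each request names the model it wants; SageMaker loads it on demand
# and caches it on the instance.
runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName="ts-mme-endpoint",
    TargetModel="model-042.tar.gz",  # path relative to ModelDataUrl
    ContentType="application/json",
    Body=b'{"series": [1.0, 2.0, 3.0]}',  # example payload, not a real schema
)
print(response["Body"].read())
```

The multi-container option is the same at the API level, except that create_model takes a Containers list together with InferenceExecutionConfig={"Mode": "Direct"}, and invoke_endpoint selects a container with TargetContainerHostname instead of TargetModel.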

AWS
Answered 1 year ago
  • Great point! Will run some experiments and let you know the results!
