Best configuration for inference with PyTorch models


I'm trying to build a public-facing web app that lets users run inference against probably ten or so available PyTorch models. My initial thought was to have a basic front-end web page that communicates with a REST API server on an EC2 instance. But since I started planning this out in more detail, I've found a lot of information about various AWS products; they seem interesting, but it's all pretty over my head.
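
To make the idea concrete, this is roughly what I was picturing for the plain EC2 option: a single FastAPI process that loads the exported models at startup and exposes one REST route. The model names and file paths here are just placeholders I made up.

```
# Minimal sketch of the EC2/REST idea (model names and paths are placeholders).
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import torch

app = FastAPI()

# Load each exported (TorchScript) model once at startup.
MODEL_PATHS = {
    "resnet_a": "models/resnet_a.pt",
    "resnet_b": "models/resnet_b.pt",
}
MODELS = {name: torch.jit.load(path).eval() for name, path in MODEL_PATHS.items()}


class PredictRequest(BaseModel):
    name: str                   # which of the ~10 models to use
    inputs: list[list[float]]   # batch of feature vectors


@app.post("/predict")
def predict(req: PredictRequest):
    model = MODELS.get(req.name)
    if model is None:
        raise HTTPException(status_code=404, detail="unknown model")
    with torch.inference_mode():
        out = model(torch.tensor(req.inputs, dtype=torch.float32))
    return {"outputs": out.tolist()}
```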

I initially came to this site because I had heard about Elastic Inference. After researching it further, it seems Amazon is now encouraging people to use Inferentia2 instead. I realize I could just run an EC2 instance, but I don't know how well that would scale if this app becomes popular. I've also read a bit about SageMaker, API Gateway, and even "serverless" options like Lambda, but I don't really know whether those integrate well with the low-cost inference products AWS offers.
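
For context, my understanding of the SageMaker route is something like the sketch below, based on the SageMaker Python SDK: package the trained model as a model.tar.gz in S3 and deploy it to a managed real-time endpoint. The S3 path, IAM role ARN, framework versions, and instance type are placeholders/assumptions on my part, not something I've actually run.

```
# Hedged sketch of deploying one PyTorch model to a SageMaker real-time endpoint.
from sagemaker.pytorch import PyTorchModel

role = "arn:aws:iam::123456789012:role/MySageMakerRole"  # placeholder role ARN

model = PyTorchModel(
    model_data="s3://my-bucket/models/resnet_a/model.tar.gz",  # placeholder S3 path
    role=role,
    entry_point="inference.py",   # custom model_fn/predict_fn handlers
    framework_version="2.1",      # assumed; match whatever version the model was trained with
    py_version="py310",
)

# Instance type is an assumption; a GPU or Inferentia2 type would go here.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
)

# Invoke the endpoint with a small batch of inputs.
result = predictor.predict([[0.1, 0.2, 0.3]])
```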

Any advice on setting this kind of thing up?
