Best configuration for inferencing with PyTorch models

I'm trying to build a public-facing web app that lets users run inference against roughly ten models. My initial plan was a basic front-end webpage that talks to a REST API server on an EC2 instance. As I planned this out further, I found a lot of information about various AWS products. They seem interesting, but it's all pretty over my head.

I initially came to the site because I heard about Elastic Inference. After researching it more, it seems that Amazon is encouraging people to use Inferentia2 instead. I realize that I could just use an EC2 instance, but I don't know how well that will scale if the app becomes popular. I've also read a bit about SageMaker, API Gateway, and even "serverless" options like Lambda, but I don't know whether those would integrate well with the low-cost inference products that AWS offers.

Any advice on setting this kind of thing up?
