Best practice for using Inferentia in ECS

Currently, we use Elastic Inference for inference on AWS ECS. We set it up through inference_accelerators in ecs.Ec2TaskDefinition. For scaling, we monitor the AcceleratorUtilization metric to decide when to scale out or scale in.

Now that AWS recommends switching to AWS Inf instances, we plan to migrate from EI to Inf and run it on AWS ECS. How should we monitor Inf usage and scale our instances based on it? Are there any predefined metrics for this?

Mosi
asked 5 months ago · 237 views
1 Answer

You can use Neuron Monitor to monitor your Inf utilization. Neuron Monitor integrates with CloudWatch (see this documentation). One metric you can use to drive scaling is NeuronCore utilization. For example, you can average the number of NeuronCores whose utilization is above some threshold and scale out or in based on that.
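As a rough illustration of that threshold idea, here is a minimal Python sketch. It assumes you have already collected per-NeuronCore utilization percentages (for example from Neuron Monitor output or CloudWatch) into plain lists. The function names, the 70% busy threshold, and the 0.75 / 0.25 scale-out and scale-in cutoffs are all hypothetical values chosen for the example, not AWS defaults.

```python
from statistics import mean


def busy_core_fraction(core_utilization, busy_threshold=70.0):
    """Return the fraction of NeuronCores whose utilization (in percent)
    is above busy_threshold. An empty sample counts as 0.0."""
    if not core_utilization:
        return 0.0
    busy = sum(1 for u in core_utilization if u > busy_threshold)
    return busy / len(core_utilization)


def scaling_decision(samples, busy_threshold=70.0,
                     scale_out_at=0.75, scale_in_at=0.25):
    """Average the busy-core fraction over several samples and map it to
    'scale_out', 'scale_in', or 'hold'.

    samples: list of per-core utilization lists, one list per sampling
    period. All thresholds are illustrative assumptions.
    """
    fractions = [busy_core_fraction(s, busy_threshold) for s in samples]
    if not fractions:
        return "hold"
    avg = mean(fractions)
    if avg >= scale_out_at:
        return "scale_out"
    if avg <= scale_in_at:
        return "scale_in"
    return "hold"


if __name__ == "__main__":
    # Two sampling periods on a hypothetical 4-core instance.
    recent = [[90, 85, 80, 20], [95, 90, 75, 72]]
    print(scaling_decision(recent))
```

In practice you would probably publish the busy-core fraction as a custom CloudWatch metric and let an ECS or Auto Scaling policy act on it, rather than making the decision in application code.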

AWS
answered 5 months ago
