The Amazon CloudWatch CPU or GPU utilization metric for my Amazon SageMaker endpoint is greater than 100%.
Resolution
The CloudWatch CPUUtilization and GPUUtilization metrics report the percentage of CPU or GPU units that the containers use, multiplied by the number of CPUs or GPUs on the instance. Each CPU or GPU contributes between 0 and 100%, so on an instance that has more than one CPU or GPU, the metric can be greater than 100%.
The following examples describe how CPU or GPU utilization can be greater than 100%:
- If a non-GPU instance, such as ml.m4.xlarge, has four virtual CPUs (vCPUs), then CPUUtilization is between 0 and 400%.
- If a GPU instance, such as ml.p3.8xlarge, has 32 vCPUs, then CPUUtilization is between 0 and 3200%. If the instance has four GPUs, then GPUUtilization is between 0 and 400%.
- If there are multiple instances, then the default view in CloudWatch shows the average CPU or GPU utilization across all instances, not the sum. For example, if you have five ml.m4.xlarge instances, then CPUUtilization is still between 0 and 400%, not between 0 and 2000%, because each instance has four vCPUs.
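The arithmetic in these examples can be sketched in a few lines of Python. This is a minimal illustration, not an AWS API: the INSTANCE_SPECS table and the max_utilization and normalize helpers are hypothetical names, and the vCPU and GPU counts are only the ones quoted in the examples above. It shows the upper bound of each metric and how a reported value might be rescaled to a 0 to 100% range.

```python
# Hypothetical lookup table; counts are taken from the examples above.
INSTANCE_SPECS = {
    "ml.m4.xlarge": {"vcpus": 4, "gpus": 0},
    "ml.p3.8xlarge": {"vcpus": 32, "gpus": 4},
}

# Maps each metric name to the hardware count that scales it.
METRIC_TO_UNIT = {"CPUUtilization": "vcpus", "GPUUtilization": "gpus"}


def max_utilization(instance_type: str, metric: str) -> float:
    """Return the upper bound (in percent) of the metric for one instance."""
    if metric not in METRIC_TO_UNIT:
        raise ValueError(f"Unsupported metric: {metric}")
    units = INSTANCE_SPECS[instance_type][METRIC_TO_UNIT[metric]]
    # Each CPU or GPU contributes up to 100%.
    return 100.0 * units


def normalize(reported_percent: float, instance_type: str, metric: str) -> float:
    """Rescale a reported value to the 0-100% range for that instance type."""
    upper = max_utilization(instance_type, metric)
    if upper == 0:
        raise ValueError(f"{instance_type} has no units for {metric}")
    return 100.0 * reported_percent / upper


if __name__ == "__main__":
    print(max_utilization("ml.m4.xlarge", "CPUUtilization"))
    print(max_utilization("ml.p3.8xlarge", "CPUUtilization"))
    print(max_utilization("ml.p3.8xlarge", "GPUUtilization"))
    # A reported 250% on a four-vCPU instance, rescaled to 0-100%.
    print(normalize(250.0, "ml.m4.xlarge", "CPUUtilization"))
```

Because the default CloudWatch view averages across instances, the same per-instance bound applies to an endpoint with several instances of one type.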
For more information about the CPUUtilization and GPUUtilization metrics, see Monitor Amazon SageMaker with Amazon CloudWatch.
For a list of how many vCPUs or GPUs are in each instance type, see Amazon SageMaker pricing.