EKS metrics - failed to get cpu utilization


Hi,

Whenever I run into an error, it's convenient to search the internet, or repost.aws, and find a quick explanation. The search often comes up empty, so let's add a result here.

This came up while setting up an HPA (Horizontal Pod Autoscaler).

Example errors:

69s (x19 over 20m)    Normal    SuccessfullyReconciled         TargetGroupBinding/k8s-foobar-fc869b6a82   Successfully reconciled
57s (x3 over 3m13s)   Warning   FailedComputeMetricsReplicas   HorizontalPodAutoscaler/foobar                     invalid metrics (1 invalid out of 1), first error is: failed to get cpu resource metric value: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
57s (x3 over 3m13s)   Warning   FailedGetResourceMetric        HorizontalPodAutoscaler/foobar                      failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
42s (x3 over 2m58s)   Warning   FailedGetResourceMetric        HorizontalPodAutoscaler/foobar                      failed to get cpu utilization: did not receive metrics for targeted pods (pods might be unready)
42s (x3 over 2m58s)   Warning   FailedComputeMetricsReplicas   HorizontalPodAutoscaler/foobar                      invalid metrics (1 invalid out of 1), first error is: failed to get cpu resource metric value: failed to get cpu utilization: did not receive metrics for targeted pods (pods might be unready)
27s                   Warning   FailedComputeMetricsReplicas   HorizontalPodAutoscaler/foobar                      invalid metrics (1 invalid out of 1), first error is: failed to get cpu resource metric value: failed to get cpu utilization: missing request for cpu in container nginx of Pod foobar-545b4848f4-dhl5q
12s (x2 over 27s)     Warning   FailedGetResourceMetric        HorizontalPodAutoscaler/foobar                      failed to get cpu utilization: missing request for cpu in container nginx of Pod foobar-545b4848f4-dhl5q

The confusing part is that "failed to get cpu resource metric value" and "failed to get cpu utilization" make you think the Metrics Server is not working, because the message says it is unable to get CPU utilization.

asked a month ago · 156 views
2 Answers
Accepted Answer

The errors you're encountering with your Horizontal Pod Autoscaler (HPA) in Amazon EKS are common and can be caused by several factors. Let's break down the issues and their potential solutions:

  1. Metrics Server: The error "failed to get cpu utilization: unable to get metrics for resource cpu" often indicates that the Metrics Server is not properly installed or configured in your EKS cluster. The Metrics Server is crucial for HPA to function as it provides the necessary CPU and memory metrics. Ensure that the Metrics Server is installed and running correctly in your cluster.

  2. Pod Readiness: The message "did not receive metrics for targeted pods (pods might be unready)" suggests that the pods targeted by the HPA are not in a ready state. This could be due to various reasons such as application errors, resource constraints, or initialization issues. Check the status and events of your pods to identify any problems preventing them from becoming ready.

  3. Resource Requests: The error "missing request for cpu in container nginx of Pod" indicates that you haven't specified CPU resource requests in your pod specification. HPA requires resource requests to be set for the metrics it's scaling on. Make sure to define CPU requests in your deployment or pod template.
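The three messages above can be told apart mechanically. As a minimal sketch (the function name hpa_error_cause and the labels it prints are hypothetical, not part of kubectl or EKS), a small shell helper could map an HPA event message to its most likely cause:

```shell
# Hypothetical helper: classify an HPA event message by its most likely cause.
# The patterns are substrings taken from the example events in the question.
hpa_error_cause() {
  local msg="$1"
  case "$msg" in
    *"missing request for cpu"*)
      # The pod template has a container without resources.requests.cpu.
      echo "missing-cpu-request" ;;
    *"pods might be unready"*)
      # Targeted pods exist but have not reported metrics (often not Ready yet).
      echo "pods-not-ready" ;;
    *"no metrics returned from resource metrics API"*)
      # Metrics Server is absent or has no samples for these pods yet.
      echo "metrics-unavailable" ;;
    *)
      echo "unknown" ;;
  esac
}
```

For example, hpa_error_cause "failed to get cpu utilization: missing request for cpu in container nginx of Pod foobar-545b4848f4-dhl5q" prints missing-cpu-request. The most specific pattern is checked first, because the longer "invalid metrics" events wrap the same underlying message.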

To resolve these issues:

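  1. Verify Metrics Server installation: Ensure the Metrics Server is properly installed and running in your EKS cluster. You can check its status with kubectl get deployment metrics-server -n kube-system (assuming it was installed under the default name and namespace).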

  2. Check pod status: Investigate why your pods might not be ready. Look at pod events, logs, and describe the pods to identify any issues preventing them from starting successfully.

  3. Set resource requests: In your deployment or pod specification, make sure to set CPU resource requests. For example:

    resources:
      requests:
        cpu: 100m

  4. Wait for metrics: After making these changes, it may take a few minutes for the metrics to become available. The HPA typically checks metrics every 15 seconds, but it might take longer for initial metrics to be collected and reported.

  5. Verify metrics: You can use kubectl top pods to check if CPU metrics are being reported correctly for your pods.
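The checks in this list can be run in one pass. The following is a rough sketch, assuming the Metrics Server was installed under its default name (metrics-server) in kube-system; the function name hpa_diagnose and its two arguments are made up for illustration:

```shell
# Hypothetical wrapper around standard kubectl commands.
# Usage: hpa_diagnose <namespace> <hpa-name>
hpa_diagnose() {
  local ns="$1" hpa="$2"

  # 1. Is the Metrics Server deployed? (default name and namespace assumed)
  kubectl get deployment metrics-server -n kube-system

  # 2. Is the resource metrics API registered and reporting Available?
  kubectl get apiservice v1beta1.metrics.k8s.io

  # 3. Are the targeted pods Ready?
  kubectl get pods -n "$ns"

  # 4. Are CPU metrics being reported for those pods?
  kubectl top pods -n "$ns"

  # 5. What does the HPA itself report (conditions and recent events)?
  kubectl describe hpa "$hpa" -n "$ns"
}
```

For the HPA in the question this would be called as hpa_diagnose default foobar, assuming it lives in the default namespace. The function does not stop at the first failing command, so every check still prints its result.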

By addressing these points, you should be able to resolve the HPA errors and get it functioning correctly with your EKS cluster.
Sources
Metrics for Amazon EKS and Kubernetes - AWS Prescriptive Guidance
How to monitor high resource usage in Amazon EKS tasks and services | AWS re:Post

answered a month ago
reviewed a month ago by AWS EXPERT

The errors are solved by the following:

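When configuring an HPA, make sure "requests" (and, ideally, "limits") are set on the pods' containers. They are usually set in the Deployment's pod template, which then creates the pods.

The HPA computes CPU utilization as a percentage of the "requests" value, so every container in the targeted pods must have one; otherwise it fails with "missing request for cpu in container".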
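As an illustration only, here is a minimal Deployment and HPA pair with the CPU request in place. The names foobar and nginx come from the events in the question; the file name, the app label, the replica bounds, and the 50% target are assumptions chosen for the example:

```shell
# Write an example manifest to a local file (file name is arbitrary).
cat > foobar-hpa-example.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: foobar
spec:
  replicas: 1
  selector:
    matchLabels:
      app: foobar
  template:
    metadata:
      labels:
        app: foobar
    spec:
      containers:
        - name: nginx
          image: nginx
          resources:
            requests:
              cpu: 100m   # the HPA measures utilization against this value
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: foobar
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: foobar
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # percent of the CPU request
EOF

# Apply it once reviewed:
#   kubectl apply -f foobar-hpa-example.yaml
```

Removing the resources block from the Deployment above is enough to reproduce the "missing request for cpu in container nginx" event.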

EKS Auto Mode could consider adding default requests and limits to all pods, so that the HPA never fails in this way. Some other vendors do this. It's one more item in an already long list of things to keep in mind when running Kubernetes.

answered a month ago
