
Kubecost add-on (and others) remain Degraded after EKS cluster creation in auto-mode

I've managed to create an EKS cluster (v1.31) in auto-mode using default node-pools. Also the metrics-server Add-On is automatically installed and running when the cluster is started. I could successfully start the cluster and deploy my pods including an ingress with ALB.

So far, so good. But when I add another add-on (e.g. the latest Kubecost version, which should be compatible with v1.31), the add-on is Degraded after initialization. The add-on's health status reports: InsufficientNumberOfReplicas. Both node pools are nearly empty; I am only using 2% of the resources.

Any idea how I can fix this? Thanks.

asked a year ago, 530 views
2 Answers

Hello,

The 'InsufficientNumberOfReplicas' error generally appears when the required number of replicas is not running, usually because no nodes are available to run them. Please check the points mentioned in the other answer. Note that in auto-mode the node is created only after you create the add-on, so once the replicas have started and are healthy, the add-on status will be updated to healthy. If this does not help, consider reaching out to us via a support ticket so that we can check the resource details and investigate the issue further.

AWS
answered a year ago
  • I don't understand how it is possible that no suitable nodes are available to run replicas in this case. The cluster was newly created using EKS Auto-Mode with all default values (except for my own VPC) and freshly created roles. I haven't deployed any additional pods. I only added the add-on after the cluster (v1.31) was fully initialized.

    From the two available default node pools [general-purpose, system], only about 2% of the resources are being utilized. How is it possible that no nodes are available? The Auto-Mode is supposed to ensure scaling if needed, and the default c6g.large EC2 instances should provide sufficient capacity. Am I misunderstanding something here?

    BTW: In a test, I was able to install the AWS CNI Add-On without any issues before installing the Kubecost Add-On (v2.4.3-eksbuild.1). And I can install my pods and ingress successfully afterwards.

The issue you're experiencing with the Kubecost AddOn and other add-ons remaining in a degraded state after EKS cluster creation in auto-mode is likely related to resource allocation or configuration problems. Here are some steps you can take to troubleshoot and potentially resolve the issue:

  1. Check the pod status: Use the command kubectl get pods -n <namespace> to see the status of the Kubecost pods. Look for any pods that are in a pending or failed state.

  2. Describe the problematic pods: Use kubectl describe pod <pod-name> -n <namespace> to get more detailed information about why the pods are not running properly. This may reveal issues with resource limits, scheduling, or other constraints.

  3. Review the add-on configuration: Ensure that the add-on configuration is correct and compatible with EKS auto-mode. You may need to adjust the resource requests and limits for the Kubecost containers.

  4. Check node resources: Although you mentioned that you're only using 2% of the resources, verify that there are enough resources available on the nodes to schedule the Kubecost pods. Use kubectl describe nodes to see the allocatable resources and current usage.

  5. Examine cluster events: Use kubectl get events --sort-by=.metadata.creationTimestamp to see if there are any relevant events that might explain why the add-on is not deploying correctly.

  6. Verify VPC endpoint configuration: Ensure that your VPC endpoints for EKS are correctly configured, as this can impact add-on functionality.

  7. Check for conflicting policies or webhooks: Look for any admission webhooks or policies that might be blocking the add-on deployment.

  8. Review the EKS add-on health details: Check the Amazon EKS console or use the AWS CLI (aws eks describe-addon) to retrieve the add-on's reported health issues for more information on why it is in a degraded state.

  9. Consider updating the add-on: If you're using an older version of the Kubecost add-on, try updating to the latest version compatible with your EKS version.

  10. Adjust pod priority: If resource constraints are an issue, you may need to adjust the pod priority of the Kubecost agent or scale up your instances to provide more resources.
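
The diagnostic steps above (1, 2, 4, 5 and 8) can be collected into a small shell sketch. This is only an illustration, not an official procedure: the namespace kubecost, the cluster name my-cluster and the add-on name kubecost_kubecost are assumed placeholder values, and the function names are made up for this example. Substitute the values from your own cluster.

```shell
# Sketch only: these defaults are assumptions, override them for your cluster.
NAMESPACE="${NAMESPACE:-kubecost}"
CLUSTER="${CLUSTER:-my-cluster}"
ADDON="${ADDON:-kubecost_kubecost}"

# Steps 1-2: list pods whose phase is not Running (this also matches
# completed pods) and describe each one to see scheduling or resource errors.
describe_unhealthy_pods() {
  local ns="$1" pod
  kubectl get pods -n "$ns" --field-selector='status.phase!=Running' -o name |
    while read -r pod; do
      echo "== $pod =="
      kubectl describe -n "$ns" "$pod"
    done
}

# Step 4: show allocatable resources and current allocations per node.
node_resources() {
  kubectl describe nodes
}

# Step 5: show events in the namespace, oldest first.
recent_events() {
  kubectl get events -n "$1" --sort-by=.metadata.creationTimestamp
}

# Step 8: print the health issues EKS reports for the add-on.
addon_health() {
  aws eks describe-addon --cluster-name "$1" --addon-name "$2" \
    --query 'addon.health.issues' --output json
}

# Run all checks in order with the (assumed) defaults above.
run_checks() {
  describe_unhealthy_pods "$NAMESPACE"
  node_resources
  recent_events "$NAMESPACE"
  addon_health "$CLUSTER" "$ADDON"
}
```

After sourcing the file, run_checks executes everything in sequence; each function can also be called on its own, for example addon_health my-cluster kubecost_kubecost.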

If these steps don't resolve the issue, you may need to contact AWS support for further assistance, especially if you suspect there might be a compatibility issue between the add-on and EKS auto-mode.
Sources
Runtime coverage and troubleshooting for Amazon EKS clusters - Amazon GuardDuty
Troubleshoot EKS Auto Mode - Amazon EKS
Amazon EKS add-ons - Amazon EKS

answered a year ago
