
Questions tagged with Amazon Elastic Kubernetes Service


Ubuntu Managed Nodes creation failure in Fully-Private cluster

Hi, for some reason I am not able to create Ubuntu managed nodes in a fully private cluster, although managed Amazon Linux nodes and all other self-managed nodes join the cluster successfully. I have already followed all the AWS guides and troubleshooting pages, but I am still not successful. I have also run the troubleshooting script; below is the result:

```
HERE IS A SUMMARY OF THE ITEMS THAT REQUIRE YOUR ATTENTION:

[WARNING]: Worker node's AMI ami-0ebb49de26355a371 differs from the public EKS Optimized AMIs. Ensure that the Kubelet daemon is at the same version as your cluster's version 1.23 or only one minor version behind. Please review this URL for further details: https://kubernetes.io/releases/version-skew-policy/ .
[WARNING]: No secondary private IP addresses are assigned to worker node i-0e775ca75fe57bb70, ensure that the CNI plugin is running properly. Please review this URL for further details: https://docs.aws.amazon.com/eks/latest/userguide/pod-networking.html
[WARNING]: As SSM agent is not reachable on worker node, this document did not check the status of Containerd, Docker and Kubelet daemons. Ensure that required daemons (containerd, docker, kubelet) are running on the worker node using command "systemctl status <daemon-name>".
============================================================================================================================================
Here are the detailed steps of the document execution:

[X] Checking EKS cluster test-cluster: EKS Cluster: test-cluster is in Active state.
1. Checking if the cluster Security Group is allowing traffic from the worker node:
   Passed: The cluster Security Group sg-068d83e5a68aff9c7 is allowing traffic from the worker node.
2. Checking DHCP options of the cluster VPC:
   Passed: AmazonProvidedDNS is enabled
3. Checking cluster IAM role arn:aws:iam::782534010321:role/eksctl-test-cluster-cluster-ServiceRole-9WPNFE2Q3N2T for the required permissions:
   Passed: IAM role for cluster test-cluster has the required IAM policies attached.
   Passed: The cluster IAM role arn:aws:iam::782534010321:role/eksctl-test-cluster-cluster-ServiceRole-9WPNFE2Q3N2T has the required trust relationship for the EKS service.
4. Checking control plane Elastic Network Interfaces(ENIs) in the cluster VPC:
   Passed: The cluster Elastic Network Interfaces(ENIs) exist.
5. Cluster Endpoint Private access is disabled for your cluster, checking if the Public CIDR ranges include worker node i-0e775ca75fe57bb70 outbound IP:
   Passed: The cluster allows public access from 0.0.0.0/0
6. Checking cluster VPC for required DNS attributes:
   Passed: Cluster VPC vpc-0cbb4879c588fb52a has the required DNS attributes correctly set.
--------------------------------------------------------------------------------------------------------------------------------------------
[X] Checking worker node i-0e775ca75fe57bb70 state: The instance is Running.
1. Checking if the EC2 instance family is supported:
   Passed: EC2 instance family m5.xlarge is supported.
2. Checking the worker node network configuration:
   Passed: Worker node is created in a private subnet without a NAT Gateway so VPC endpoints need to be used.
   Passed: Checking VPC Endpoints setup:
   Passed: The VPC Endpoint com.amazonaws.eu-west-2.ec2 exists. Checking its configuration:
   Passed: Security groups [{'GroupId': 'sg-01294c96494e79aaa', 'GroupName': 'eksctl-test-cluster-cluster-ClusterSharedNodeSecurityGroup-1TIY3P78RYQOZ'}] applied to VPC Endpoint com.amazonaws.eu-west-2.ec2 is allowing the worker node to reach the endpoint.
   Passed: The default VPC Endpoint Policy is being used.
   Passed: The VPC Endpoint com.amazonaws.eu-west-2.ecr.api exists. Checking its configuration:
   Passed: Security groups [{'GroupId': 'sg-01294c96494e79aaa', 'GroupName': 'eksctl-test-cluster-cluster-ClusterSharedNodeSecurityGroup-1TIY3P78RYQOZ'}] applied to VPC Endpoint com.amazonaws.eu-west-2.ecr.api is allowing the worker node to reach the endpoint.
   Passed: The default VPC Endpoint Policy is being used.
   Passed: The VPC Endpoint com.amazonaws.eu-west-2.ecr.dkr exists. Checking its configuration:
   Passed: Security groups [{'GroupId': 'sg-01294c96494e79aaa', 'GroupName': 'eksctl-test-cluster-cluster-ClusterSharedNodeSecurityGroup-1TIY3P78RYQOZ'}] applied to VPC Endpoint com.amazonaws.eu-west-2.ecr.dkr is allowing the worker node to reach the endpoint.
   Passed: The default VPC Endpoint Policy is being used.
   Passed: The VPC Endpoint com.amazonaws.eu-west-2.sts exists. Checking its configuration:
   Passed: Security groups [{'GroupId': 'sg-01294c96494e79aaa', 'GroupName': 'eksctl-test-cluster-cluster-ClusterSharedNodeSecurityGroup-1TIY3P78RYQOZ'}] applied to VPC Endpoint com.amazonaws.eu-west-2.sts is allowing the worker node to reach the endpoint.
   Passed: The default VPC Endpoint Policy is being used.
   Passed: S3 gateway endpoint ['vpce-0e01febb999cf1735', 'vpce-08ef7ffac589c3d01'] is added to the worker's VPC.
   Passed: Worker node's route table rtb-0d26fefe5a613e64f has the required route for the S3 endpoint vpce-0e01febb999cf1735.
   Passed: Worker node's route table rtb-0d26fefe5a613e64f has the required route for the S3 endpoint vpce-08ef7ffac589c3d01.
3. Checking the IAM instance Profile of the worker node:
   Passed: The instance profile arn:aws:iam::782534010321:instance-profile/eks-d0c1cede-94e5-c6bc-75fd-2f86043a90eb is used with the worker node i-0e775ca75fe57bb70.
   Passed: IAM role arn:aws:iam::782534010321:role/eksctl-test-cluster-nodegroup-ser-NodeInstanceRole-12LZWOL4EJAJX is attached to Instance Profile.
   Passed: IAM role arn:aws:iam::782534010321:role/eksctl-test-cluster-nodegroup-ser-NodeInstanceRole-12LZWOL4EJAJX has the required IAM policies attached.
   Passed: No issues detected with the trust relationship policy of the arn:aws:iam::782534010321:role/eksctl-test-cluster-nodegroup-ser-NodeInstanceRole-12LZWOL4EJAJX role.
4. Checking worker node's UserData bootstrap script:
   Passed: The UserData of the worker node contains the required bootstrap script.
5. Checking the worker node i-0e775ca75fe57bb70 tags:
   Passed: Worker node i-0e775ca75fe57bb70 has the required cluster tags.
6. Checking the AMI version for EC2 instance i-0e775ca75fe57bb70:
   [WARNING]: Worker node's AMI ami-0ebb49de26355a371 differs from the public EKS Optimized AMIs. Ensure that the Kubelet daemon is at the same version as your cluster's version 1.23 or only one minor version behind. Please review this URL for further details: https://kubernetes.io/releases/version-skew-policy/ .
7. Checking worker node i-0e775ca75fe57bb70 Elastic Network Interfaces(ENIs) and Private IP addresses to check if CNI is running:
   [WARNING]: No secondary private IP addresses are assigned to worker node i-0e775ca75fe57bb70, ensure that the CNI plugin is running properly. Please review this URL for further details: https://docs.aws.amazon.com/eks/latest/userguide/pod-networking.html
8. Checking the outbound SG rules for worker node i-0e775ca75fe57bb70
   Passed: The Outbound security group rules for worker node i-0e775ca75fe57bb70 are sufficient to allow traffic to the EKS cluster endpoint
9. Checking if the worker node is running in AWS Outposts subnet
   Passed: Worker node's subnet subnet-092b2d53a74ac655f is not running in AWS Outposts
10. Checking basic NACL rules
    Passed: NACL acl-0507434ac9745b3dc has sufficient rules to allow cluster traffic.
11. Checking STS regional endpoint availability:
    Passed: STS endpoint is activated within region eu-west-2.
12. Checking if Instance Metadata http endpoint is enabled on the worker node:
    Passed: Instance metadata endpoint is enabled on the worker node.
13. Checking if SSM agent is running and reachable on worker node:
    [WARNING]: As SSM agent is not reachable on worker node, this document did not check the status of Containerd, Docker and Kubelet daemons. Ensure that required daemons (containerd, docker, kubelet) are running on the worker node using command "systemctl status <daemon-name>".
============================================================================================================================================
Here is a list of other possible causes that were NOT checked by this document:
[-] Ensure that Instance IAM role is added to aws-auth configmap, please check: https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html.
[-] If your account is a part of AWS Organizations Service, confirm that no Service Control Policy (SCP) is denying required permissions, please check: https://aws.amazon.com/premiumsupport/knowledge-center/eks-node-status-ready/.
```

One strange thing: in point 5 the script says cluster endpoint private access is disabled, but I have already made the cluster fully private:

```
{
    "update": {
        "id": "4c429db1-80bb-48d2-bf98-3f84524c0b83",
        "status": "Successful",
        "type": "EndpointAccessUpdate",
        "params": [
            {
                "type": "EndpointPublicAccess",
                "value": "false"
            },
            {
                "type": "EndpointPrivateAccess",
                "value": "true"
            },
            {
                "type": "PublicAccessCidrs",
                "value": "[\"0.0.0.0/0\"]"
            }
        ],
        "createdAt": "2022-10-03T09:35:42.092000+00:00",
        "errors": []
    }
}
```

Also, when I run `sudo systemctl status kubelet` on the node, I get:

```
Unit kubelet.service could not be found.
```

My cluster config is as below:

```
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: test-cluster
  region: eu-west-2
  version: "1.23"
privateCluster:
  enabled: true
vpc:
  id: vpc-ID
  subnets:
    private:
      hscn-private1-subnet:
        id: subnet-1
      hscn-private2-subnet:
        id: subnet-2
managedNodeGroups:
  - name: serv-test-1
    ami: ami-0ebb49de26355a371
    instanceType: m5.xlarge
    desiredCapacity: 1
    volumeType: gp2
    volumeSize: 50
    privateNetworking: true
    disableIMDSv1: true
    subnets:
      - hscn-private2-subnet
    ssh:
      allow: true
    tags:
      kubernetes.io/cluster/test-cluster: owned
    overrideBootstrapCommand: |
      #!/bin/bash
      /etc/eks/bootstrap.sh test-cluster --kubelet-extra-args '--node-labels=eks.amazonaws.com/nodegroup=serv-test-1,eks.amazonaws.com/nodegroup-image=ami-0ebb49de26355a371' --dns-cluster-ip 10.100.0.10 --apiserver-endpoint {My endpoint} --b64-cluster-ca {My-CA}
```
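As a self-contained cross-check of the point 5 anomaly, it may help to read the live endpoint flags back from `aws eks describe-cluster` rather than relying on the update record. A minimal sketch, with a sample response inlined so it runs without AWS credentials (the sample mirrors the update JSON above; in practice you would capture the real response with the `aws` command shown in the comment):

```shell
#!/bin/sh
# In practice, capture the live values with:
#   aws eks describe-cluster --name test-cluster \
#       --query 'cluster.resourcesVpcConfig' --output json
# A sample response slice is inlined here so the sketch is self-contained.
resp='{"endpointPublicAccess": false, "endpointPrivateAccess": true, "publicAccessCidrs": ["0.0.0.0/0"]}'

# Pull out just the two endpoint-access flags.
printf '%s\n' "$resp" | grep -o '"endpoint[A-Za-z]*Access": [a-z]*'
```

If the live output still shows `endpointPrivateAccess: true`, the troubleshooting script's "Private access is disabled" line would be worth reporting as a false positive.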
0 answers · 0 votes · 17 views · asked a day ago

EKS Node Not Ready

I have an EKS cluster with four t3.large nodes running roughly 50 small pods. Often, when I update the application version from x to y, a few nodes go into the NotReady state. I then have to clean up a few resources and reboot the worker nodes before the situation returns to normal. Any suggestions?

Logs from kube-proxy:

```
I0927 16:12:05.785853 1 proxier.go:790] "SyncProxyRules complete" elapsed="104.231873ms"
I0927 16:18:27.078985 1 trace.go:205] Trace[1094698301]: "iptables ChainExists" (27-Sep-2022 16:16:36.489) (total time: 66869ms):
Trace[1094698301]: [1m6.869976178s] [1m6.869976178s] END
I0927 16:18:27.087821 1 trace.go:205] Trace[1957650533]: "iptables ChainExists" (27-Sep-2022 16:16:36.466) (total time: 67555ms):
Trace[1957650533]: [1m7.555663612s] [1m7.555663612s] END
I0927 16:18:27.124923 1 trace.go:205] Trace[460012371]: "DeltaFIFO Pop Process" ID:monitoring/prometheus-prometheus-node-exporter-gslfb,Depth:36,Reason:slow event handlers blocking the queue (27-Sep-2022 16:18:26.836) (total time: 186ms):
Trace[460012371]: [186.190275ms] [186.190275ms] END
W0927 16:18:27.248231 1 reflector.go:442] k8s.io/client-go/informers/factory.go:134: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0927 16:18:27.272469 1 reflector.go:442] k8s.io/client-go/informers/factory.go:134: watch of *v1.EndpointSlice ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
I0927 16:18:31.339045 1 trace.go:205] Trace[1140734081]: "DeltaFIFO Pop Process" ID:cuberun/cuberun-cuberun,Depth:42,Reason:slow event handlers blocking the queue (27-Sep-2022 16:18:30.696) (total time: 116ms):
Trace[1140734081]: [116.029921ms] [116.029921ms] END
I0927 16:18:32.403993 1 trace.go:205] Trace[903972463]: "DeltaFIFO Pop Process" ID:cuberundemo/cuberun-cuberundemo,Depth:41,Reason:slow event handlers blocking the queue (27-Sep-2022 16:18:31.657) (total time: 196ms):
Trace[903972463]: [196.24798ms] [196.24798ms] END
I0927 16:18:33.233172 1 trace.go:205] Trace[1265312678]: "DeltaFIFO Pop Process" ID:argocd/argocd-metrics,Depth:40,Reason:slow event handlers blocking the queue (27-Sep-2022 16:18:32.738) (total time: 359ms):
Trace[1265312678]: [359.090093ms] [359.090093ms] END
I0927 16:18:33.261077 1 proxier.go:823] "Syncing iptables rules"
I0927 16:18:35.474678 1 proxier.go:790] "SyncProxyRules complete" elapsed="2.867637015s"
I0927 16:18:35.587939 1 proxier.go:823] "Syncing iptables rules"
I0927 16:18:37.014157 1 proxier.go:790] "SyncProxyRules complete" elapsed="1.45321438s"
I0927 16:19:08.904513 1 trace.go:205] Trace[1753182031]: "iptables ChainExists" (27-Sep-2022 16:19:06.254) (total time: 2266ms):
Trace[1753182031]: [2.266311394s] [2.266311394s] END
I0927 16:19:08.904456 1 trace.go:205] Trace[228375231]: "iptables ChainExists" (27-Sep-2022 16:19:06.299) (total time: 2255ms):
Trace[228375231]: [2.255433291s] [2.255433291s] END
I0927 16:19:40.540864 1 trace.go:205] Trace[2069259157]: "iptables ChainExists" (27-Sep-2022 16:19:36.494) (total time: 3430ms):
Trace[2069259157]: [3.430008597s] [3.430008597s] END
I0927 16:19:40.540873 1 trace.go:205] Trace[757252858]: "iptables ChainExists" (27-Sep-2022 16:19:36.304) (total time: 3619ms):
Trace[757252858]: [3.61980147s] [3.61980147s] END
I0927 16:20:09.976580 1 trace.go:205] Trace[2070318544]: "iptables ChainExists" (27-Sep-2022 16:20:06.285) (total time: 3182ms):
Trace[2070318544]: [3.182449365s] [3.182449365s] END
I0927 16:20:09.976592 1 trace.go:205] Trace[852062251]: "iptables ChainExists" (27-Sep-2022 16:20:06.313) (total time: 3154ms):
Trace[852062251]: [3.154369999s] [3.154369999s] END
```
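To quantify the degradation visible in these logs, one can extract the `SyncProxyRules` durations and watch how they grow. A small sketch, using sample lines copied from the log above (in practice you would pipe `kubectl logs` of the kube-proxy pod instead of the inlined variable):

```shell
#!/bin/sh
# Extract kube-proxy iptables sync durations. Sample lines are taken from
# the log above; a healthy sync is typically well under a second, while
# here it jumps to roughly three seconds after the incident.
log='I0927 16:12:05.785853 1 proxier.go:790] "SyncProxyRules complete" elapsed="104.231873ms"
I0927 16:18:35.474678 1 proxier.go:790] "SyncProxyRules complete" elapsed="2.867637015s"
I0927 16:18:37.014157 1 proxier.go:790] "SyncProxyRules complete" elapsed="1.45321438s"'

# Print only the elapsed="..." values, one per line.
printf '%s\n' "$log" | sed -n 's/.*elapsed="\([^"]*\)".*/\1/p'
```

A multi-second sync time, together with the `http2: client connection lost` watch errors, points at node-level resource or network pressure rather than a kube-proxy bug, so it is worth correlating these timestamps with node CPU/memory metrics.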
0 answers · 0 votes · 15 views · asked 6 days ago

Unable to run kubectl & eks commands in a fully private cluster

I have created a fully private VPC (no direct internet access); let's call it VPC-A. This VPC is peered with another VPC, let's call it VPC-B, which has an internet connection and is used as a gateway for VPC-A. I have deployed a fully private cluster (control plane only, no nodes yet) in the private subnets of VPC-A using the [guide](https://eksctl.io/usage/eks-private-cluster/). The problem is that I am not able to run any kubectl or eksctl command, as described in that guide. After digging around the internet I found a few suggestions for accessing the cluster; one is that I must create a machine in the private VPC and access the cluster from there. I also opened several issues on GitHub but did not get a proper answer. Below is one of the experts' answers:

> You can communicate with the K8s API by deploying EC2 instance inside that VPC and defining the EKS K8s API to your kubectl.

Well, I have deployed an instance within the VPC of my cluster, but whenever I run a kubectl command from that instance, I get the following error:

`Unable to connect to the server: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)`

Also, the [EKS fully private cluster guide](https://eksctl.io/usage/eks-private-cluster/) mentions:

> If your setup can reach the EKS API server endpoint via its private address, and has outbound internet access (for EKS:DescribeCluster), all eksctl commands should work.

Can someone please guide me on how to create such a setup? I ran a number of commands to check whether anything is wrong with reaching the server address:

```
nmap -p 443 1E9057EC8C316E£D"@JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com

Starting Nmap 7.80 ( https://nmap.org ) at 2022-09-09 11:11 UTC
Nmap scan report for 1E9057EC8C316E£D"@JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com (192.168.*.*)
Host is up (0.00031s latency).
Other addresses for 1E9057EC8C316E£D"@JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com (not scanned): 192.168.*.*
rDNS record for 192.168.*.*: ip-192-168-*-*.eu-west-*.compute.internal

PORT    STATE SERVICE
443/tcp open  https

Nmap done: 1 IP address (1 host up) scanned in 0.04 seconds
```

Another command:

```
nslookup 1E9057EC8C316E£D"@JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com
Server:  127.0.0.53
Address: 127.0.0.53#53

Non-authoritative answer:
Name:    1E9057EC8C316E£D"@JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com
Address: 192.168.*.*
Name:    1E9057EC8C316E£D"@JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com
Address: 192.168.*.*
```

And another:

```
telnet 1E9057EC8C316E£D"@JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com 443
Trying 192.168.*.*...
Connected to 1E9057EC8C316E£D"@JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com
Escape character is '^]'.
^CConnection closed by foreign host.
```

It is clear that I can reach the API server endpoint from my machine, which is in the same VPC as the API server. But when I run a kubectl command I still get this output:

`Unable to connect to the server: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)`

When I ran `kubectl cluster-info dump` I got the following error message:

`Unable to connect to the server: proxyconnect tcp: dial tcp: lookup socks5h on 127.0.0.53:53: server misbehaving`

Thanks
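The last error is suggestive: kubectl is a Go client, and `proxyconnect tcp: dial tcp: lookup socks5h` means it tried to resolve the literal string `socks5h` as a proxy hostname — which usually indicates a malformed proxy URL in one of the environment variables Go HTTP clients read (`HTTPS_PROXY`, `HTTP_PROXY`, `NO_PROXY`). A hedged sketch of exempting the private endpoint from any proxy; the `NO_PROXY` suffixes below are illustrative placeholders, not the redacted endpoint from the question:

```shell
#!/bin/sh
# Go clients (kubectl included) route requests through HTTPS_PROXY/HTTP_PROXY
# unless the target host matches NO_PROXY. A proxy value like "socks5h" or
# "socks5h:host:1080" (missing the "//") makes Go resolve "socks5h" as a
# hostname, matching the error above.
unset NO_PROXY    # start from a clean slate for this demo
# Append illustrative bypass suffixes for the private EKS endpoint and
# in-VPC addresses (placeholders, adjust to your environment):
NO_PROXY="${NO_PROXY:+${NO_PROXY},}.eks.amazonaws.com,.compute.internal,169.254.169.254"
export NO_PROXY
echo "$NO_PROXY"
```

Checking `env | grep -i proxy` on the instance, fixing or unsetting the malformed variable, and retrying kubectl would confirm or rule out this theory.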
1 answer · 0 votes · 56 views · asked 22 days ago