kubectl error: "dial tcp 172.17.11.97:10250: connect: no route to host"

Hi everyone,

Here's the situation.
We have 3 clusters: "dev", "staging", "production".
All 3 are deployed by our CI with the same Terraform script, and all 3 run Kubernetes v1.11.
Our clusters are deployed with a public VPC and a private VPC.

We've been working with our "dev" cluster for more than a month now, pushing apps into it, etc.

Now we're at the stage of pushing these apps to our "staging" environment, to validate that everything is OK before going live with the "production" cluster, of course.

First, to clean everything and start from the sanest possible base, we killed the "staging" cluster and asked our CI to recreate it.

Then we began to encounter strange, random behavior from this newly created cluster, which by construction is exactly the same as the "dev" cluster.

We deploy our apps with Helm, and most of the time we end up with the following error from Helm just from running the helm version command:

Client: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}
Error: forwarding ports: error upgrading connection: error dialing backend: dial tcp 172.17.13.252:10250: connect: no route to host
ERROR: Job failed: exit code 1
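
For what it's worth, a sanity check one can run from a worker node (assuming SSH access to it; the IP is the one from the error above) is whether the kubelet port is reachable at all, and which route the kernel picks for it:

# run on one of the worker nodes
nc -vz 172.17.13.252 10250      # kubelet port of the node from the error
ip route get 172.17.13.252      # which route the kernel would use for that IP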

If we relaunch the script that deploys our apps 10 times, for example, there's a chance that one of the launches will finally work.

So this is the first problem we're trying to solve with you here.

The second problem, which is surely related to the first one, is that now that some of our apps are deployed, we want to use the kubectl tool to verify that everything is OK.

So, for example, we'll run kubectl get all, which works fine.
Then we'll want to inspect a specific pod: kubectl describe pod/pod-name, which also works fine.
Finally, we'll want to inspect the logs of that specific pod: kubectl logs pod/pod-name. And here we get the following error:

➜  kubectl logs pod/pod-name
Error from server: Get https://172.17.11.97:10250/containerLogs/.../: dial tcp 172.17.11.97:10250: connect: no route to host
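
If it helps: I assume the IP in that error is the internal address of the node hosting the pod, since kubectl logs makes the API server fetch the logs from the kubelet on port 10250 of that node. It can be cross-checked with:

kubectl get nodes -o wide          # the INTERNAL-IP column should contain 172.17.11.97
kubectl get pod pod-name -o wide   # shows which node the pod runs on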

Do you have an idea where the problem could come from?

Thanks,
Jules

jivanic
asked 5 years ago, 328 views
1 Answer

Once again, I'll answer my own question.

The problem was that there's an amazingly error-prone limitation in AWS EKS: you cannot use 172.17.x.x IP addresses for your nodes and pods.
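
If I read the linked issue correctly, the root cause is that Docker's default bridge (docker0) on every worker node claims 172.17.0.0/16 for itself, so the kernel routes any 172.17.x.x destination to the local bridge instead of the VPC, hence the "no route to host". You can see it directly on a node:

# on any worker node
ip route | grep 172.17
# typically prints something like:
# 172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1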

More info:
https://github.com/aws/amazon-vpc-cni-k8s/issues/137
https://stackoverflow.com/questions/53034064/eks-unable-to-pull-logs-from-pods
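
As far as I can tell, the options are either to recreate the VPC/subnets with a CIDR that doesn't overlap 172.17.0.0/16, or to move the Docker bridge to another range on every node. A minimal sketch of the latter (the 192.168.5.1/24 range is just an example; pick anything that doesn't clash with your VPC, and beware that this overwrites an existing daemon.json):

# on each worker node
echo '{ "bip": "192.168.5.1/24" }' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker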

jivanic
answered 5 years ago
