CoreDNS has an etcd plugin (https://coredns.io/plugins/etcd/), which essentially enables dynamic DNS by reading records from etcd.
Since EKS is managed, we can't access the etcd instance on the control plane. That's fine, as I can create my own etcd cluster (and I did). Below is my CoreDNS ConfigMap:
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health
        log
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        etcd {
            path /skydns
            endpoint http://etcd-cluster-ip.default.svc.cluster.local:2379
            fallthrough
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  annotations: {}
  labels:
    eks.amazonaws.com/component: coredns
    k8s-app: kube-dns
  name: coredns
  namespace: kube-system
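For context, the etcd plugin reads records in the SkyDNS layout under the configured `path /skydns`: the DNS labels are reversed and joined into an etcd key, whose value is a JSON record. A minimal sketch of that mapping (the name `db.example.com` and the address are just illustrative):

```python
import json

def skydns_key(name: str, prefix: str = "/skydns") -> str:
    """Reverse the DNS labels and join them under the plugin's path prefix,
    the way the CoreDNS etcd plugin locates a record for a query name."""
    labels = name.rstrip(".").split(".")
    return prefix + "/" + "/".join(reversed(labels))

# A query for "db.example.com" is looked up at this etcd key:
print(skydns_key("db.example.com"))  # /skydns/com/example/db

# The stored value is a JSON record, e.g.:
print(json.dumps({"host": "10.1.2.3", "ttl": 60}))
```

So once the plugin can actually reach the etcd endpoint, publishing a record is just a `put` of such a key/value pair.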
The issue I face now is that the hostname etcd-cluster-ip.default.svc.cluster.local, which is the ClusterIP Service fronting my etcd cluster, cannot be resolved. If I replace that DNS name with the actual ClusterIP, name resolution works as expected and CoreDNS is able to reach etcd.
How can this in-cluster DNS name be resolved? I see the line below in the CoreDNS logs:
{"level":"warn","ts":"2022-03-16T20:44:42.352Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-fd406ba0-cc21-4132-bfef-ca14e3fd4eb3/etcd-cluster-ip.default.svc.cluster.local:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-cluster-ip.default.svc.cluster.local on 10.0.0.2:53: no such host\""}
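The log suggests what is going wrong: the etcd client inside CoreDNS hands the endpoint hostname to the node's upstream resolver (10.0.0.2, the VPC DNS from the node's /etc/resolv.conf), which knows nothing about `cluster.local` names, so the dial fails with `no such host`. The same failure mode can be reproduced from any machine that does not use the cluster DNS:

```python
import socket

# Attempting to resolve a cluster-internal Service name through an
# external resolver fails, just as in the CoreDNS log above.
try:
    socket.getaddrinfo("etcd-cluster-ip.default.svc.cluster.local", 2379)
    print("resolved")
except socket.gaierror as exc:
    print("lookup failed:", exc)
```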