1 Answer
Intermittent "unknown host" exceptions in an EKS cluster can have several causes. The troubleshooting steps below should help you narrow down the root cause:
- Verify DNS Configuration:
  - Ensure that CoreDNS forwards external queries to the VPC resolver. You can check this by inspecting the `coredns` ConfigMap in the `kube-system` namespace of your EKS cluster.
  - Verify that DNS hostnames and DNS resolution are enabled for your default VPC (the `enableDnsHostnames` and `enableDnsSupport` attributes).
  - Check whether any firewall rules, security groups, or network ACLs in your VPC are blocking or rate-limiting DNS traffic (UDP and TCP port 53).
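To make the first check concrete, here is a minimal Python sketch that pulls the `forward` directives out of a Corefile. It assumes you have saved the Corefile text yourself, for example from `kubectl -n kube-system get configmap coredns -o yaml`. The sample Corefile and the function name `forward_targets` are illustrative, not taken from your cluster.

```python
# Illustrative Corefile text; replace it with the Corefile from your own ConfigMap.
SAMPLE_COREFILE = """
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    forward . /etc/resolv.conf
    cache 30
}
"""


def forward_targets(corefile):
    """Map each forwarded zone to its list of upstreams."""
    targets = {}
    for line in corefile.splitlines():
        # Drop trailing comments, then split the directive into tokens.
        tokens = line.split("#", 1)[0].split()
        if len(tokens) >= 3 and tokens[0] == "forward":
            zone = tokens[1]
            # A trailing "{" opens an options block and is not an upstream.
            targets[zone] = [t for t in tokens[2:] if t != "{"]
    return targets
```

An upstream of `/etc/resolv.conf` means CoreDNS uses whatever resolver the node itself is configured with, which on EKS worker nodes is normally the VPC resolver. A hard-coded IP there is worth checking against the VPC's resolver address.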
- Investigate DNS Performance:
  - Monitor the performance of the CoreDNS pods. Look beyond overall resource usage for spikes or inconsistencies in query response times.
  - Check the CoreDNS logs for errors or unusual behavior.
  - Use tools like `dig` or `nslookup` from within the pod to test DNS resolution for your domains and external services.
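Because the failures are intermittent, a single lookup rarely catches them. The following sketch resolves a name repeatedly from inside a pod and summarizes failures, worst-case latency, and the set of addresses returned. It uses only the Python standard library; the function names and the default attempt count and delay are arbitrary choices for illustration.

```python
import socket
import time


def probe(hostname, attempts=20, delay=0.5,
          resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve hostname repeatedly; return (ok, seconds, detail) tuples."""
    results = []
    for i in range(attempts):
        start = time.perf_counter()
        try:
            infos = resolver(hostname, None)
            # Each entry ends with a sockaddr whose first element is the IP.
            detail = sorted({info[4][0] for info in infos})
            ok = True
        except socket.gaierror as exc:
            detail = str(exc)
            ok = False
        results.append((ok, time.perf_counter() - start, detail))
        if i < attempts - 1:
            sleep(delay)
    return results


def summarize(results):
    """Condense probe() output into counts, worst latency, and addresses seen."""
    latencies = [seconds for ok, seconds, _ in results if ok]
    return {
        "attempts": len(results),
        "failures": sum(1 for ok, _, _ in results if not ok),
        "max_latency": max(latencies) if latencies else None,
        "addresses": sorted({ip for ok, _, ips in results if ok for ip in ips}),
    }
```

Calling `summarize(probe("your-service.example.com"))` (hostname is a placeholder) on each node makes it easy to spot a node that fails more often or returns a different set of addresses than the others.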
- Analyze Network Connectivity:
  - Ensure that the Classic Load Balancer is correctly configured, that its registered instances are healthy, and that your DNS records point at its current DNS name.
  - Check the network ACLs and security groups associated with the Classic Load Balancer and the worker nodes to ensure that they are not blocking any necessary traffic.
  - Verify that the worker nodes can reach the Route 53 Resolver (the VPC resolver) and the external services you are using.
- Explore Alternative DNS Providers:
  - Consider temporarily testing with a different DNS provider, such as Google DNS (8.8.8.8, 8.8.4.4) or Cloudflare DNS (1.1.1.1, 1.0.0.1), and see if the issue persists. Note that public resolvers cannot resolve records in private hosted zones.
  - This can help you identify whether the problem is specific to the VPC resolver or a broader DNS-related issue.
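One way to compare resolvers without changing any cluster configuration is to send the same query to each server directly. The sketch below builds a plain DNS A-record query over UDP with the standard library only. It is a simplified illustration: it ignores truncated responses, retries, and IPv6, and the function names are my own.

```python
import socket
import struct


def build_query(hostname, query_id=0x1234):
    """Build a DNS query packet asking for the A record of hostname."""
    # Header: id, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in hostname.rstrip(".").split(".")
    )
    # Question: encoded name, zero terminator, QTYPE=1 (A), QCLASS=1 (IN).
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def skip_name(packet, offset):
    """Return the offset just past a (possibly compressed) DNS name."""
    while True:
        length = packet[offset]
        if length == 0:
            return offset + 1
        if length & 0xC0 == 0xC0:  # a compression pointer occupies two bytes
            return offset + 2
        offset += 1 + length


def parse_response(packet):
    """Return (rcode, [IPv4 addresses]) from a DNS response packet."""
    _id, flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    offset = 12
    for _ in range(qdcount):
        offset = skip_name(packet, offset) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        offset = skip_name(packet, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack(
            "!HHIH", packet[offset:offset + 10])
        offset += 10
        if rtype == 1 and rdlength == 4:  # A record carrying an IPv4 address
            addresses.append(socket.inet_ntoa(packet[offset:offset + 4]))
        offset += rdlength
    return flags & 0x000F, addresses


def query(server, hostname, timeout=2.0):
    """Ask one specific DNS server for the A records of hostname."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(hostname), (server, 53))
        packet, _ = sock.recvfrom(512)
    return parse_response(packet)
```

Running `query("169.254.169.253", name)` (the Amazon-provided resolver) and `query("8.8.8.8", name)` for the same public name, from each node, shows whether the two resolvers agree. An rcode of 0 means no error, and differing address lists point at stale or inconsistent records rather than at connectivity.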
- Compare Cluster Configurations:
  - Carefully compare the configuration and settings of the EKS cluster that is working correctly with the one experiencing the issues.
  - Look for any differences in the VPC setup, network ACLs, security groups, or other relevant configurations that could be causing the discrepancy.
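If you export the same configuration from both clusters as text (for example the CoreDNS ConfigMap or a security group description), a line-by-line diff makes differences easy to see. This is a small sketch using Python's `difflib`; the function name and the default labels are placeholders.

```python
import difflib


def config_diff(working, failing,
                working_name="working-cluster", failing_name="failing-cluster"):
    """Return a unified diff (list of lines) between two exported config texts."""
    return list(difflib.unified_diff(
        working.splitlines(), failing.splitlines(),
        fromfile=working_name, tofile=failing_name, lineterm="",
    ))
```

An empty list means the two exports are identical. Lines starting with `-` exist only in the working cluster and lines starting with `+` only in the failing one.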
- Consider Cluster Debugging:
  - Use the `kubectl` command-line tool to inspect the state of your EKS cluster, pods, and services.
  - You can also use `kubectl describe` and `kubectl logs` to gather more information about the behavior of the CoreDNS pods and the DNS resolution process.
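Once you have CoreDNS log output, for example from `kubectl -n kube-system logs -l k8s-app=kube-dns`, a short script can tally response codes and the names that fail most often. The sketch below assumes the default line formats of the CoreDNS `log` and `errors` plugins. Per-query lines appear only if the `log` plugin is enabled in your Corefile, and the `UPSTREAM_ERROR` label is my own bucket name, not a CoreDNS term.

```python
import re
from collections import Counter

# Matches the quoted query section and response code of a `log` plugin line.
QUERY_LINE = re.compile(r'"(?P<qtype>\S+) IN (?P<name>\S+) [^"]*" (?P<rcode>[A-Z]+) ')
# Matches an `errors` plugin line, e.g. an upstream timeout.
ERROR_LINE = re.compile(r"\[ERROR\] plugin/errors: \d+ (?P<name>\S+) ")


def summarize_logs(lines):
    """Count response codes and failing query names in CoreDNS log lines."""
    rcodes = Counter()
    failed_names = Counter()
    for line in lines:
        match = QUERY_LINE.search(line)
        if match:
            rcodes[match["rcode"]] += 1
            if match["rcode"] != "NOERROR":
                failed_names[match["name"]] += 1
            continue
        match = ERROR_LINE.search(line)
        if match:
            rcodes["UPSTREAM_ERROR"] += 1
            failed_names[match["name"]] += 1
    return rcodes, failed_names
```

A cluster of `SERVFAIL` responses or upstream timeouts on one CoreDNS pod, but not the other, suggests a problem on the node that pod runs on rather than with DNS as a whole.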
- Check for Resource Limits:
  - Ensure that no resource limits or quotas are causing the DNS resolution issues.
  - These could include limits on the number of DNS queries, network bandwidth, or other related resources. For example, the VPC resolver accepts at most 1024 packets per second per network interface, and queries beyond that limit are dropped.
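One limit that can be checked directly on a worker node is the per-interface packet allowance for link-local services such as the VPC resolver. On instances using the ENA driver, `ethtool -S <interface>` reports a `linklocal_allowance_exceeded` counter for packets dropped on that path. The sketch below parses that output. The interface name depends on your node, and the function name is illustrative.

```python
def allowance_counters(ethtool_output):
    """Extract ENA '*_allowance_exceeded' counters from `ethtool -S` output."""
    counters = {}
    for line in ethtool_output.splitlines():
        # Statistic lines look like "     name: value".
        name, sep, value = line.strip().partition(":")
        if sep and name.endswith("_allowance_exceeded") and value.strip().isdigit():
            counters[name] = int(value)
    return counters
```

A `linklocal_allowance_exceeded` value that keeps increasing between two readings on one node would be consistent with DNS queries being dropped at the interface, which typically surfaces as intermittent resolution failures on that node only.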
By following these troubleshooting steps, you should be able to gather more information and identify the root cause of the intermittent "unknown host" exceptions in your EKS cluster. If you're still unable to resolve the issue, you may want to consider reaching out to the AWS support team for further assistance.
answered a year ago

Thank you for your suggestions. I enabled query logging for my VPC resolver. What I observed is that, of the two nodes in my cluster, one always resolves to the old IP address from before I changed my DNS service provider to Route 53. My services are no longer reachable at that IP. Any hints on why this might be happening?