CoreDNS Basic Troubleshooting: Resolving Common Issues (En)
Lately, when deploying solutions, I faced some name resolution issues. While not always challenging to fix, I realized that having a good grasp of Kubernetes’ name resolution concepts can significantly speed up troubleshooting.
Through this article, I want to share my troubleshooting process and improve my understanding of CoreDNS components. The sections covered are:
- Basic Architecture and How It Works
- Kube-DNS & CoreDNS
- Troubleshooting Approaches
- Conclusion
This article is quite extensive, and I appreciate your time in reading it.
1. Basic Architecture and How It Works
First, let's start with the official description:
CoreDNS is a DNS server written in the Go programming language.
Unlike other DNS servers such as BIND, it is highly flexible, and almost all
of its functionality is implemented as plugins. These plugins can run
individually or together to provide DNS functionality.
According to this description, we can select and combine CoreDNS plugins (via the CoreDNS Plugin API) to build a customized DNS resolver. A default CoreDNS installation includes approximately 30 plugins. You can find additional plugins you may need on the following website:
Having covered the background of CoreDNS, let’s now explain how domain name resolution works in Kubernetes.
Within Kubernetes, when a Pod needs to access a Service in the same Namespace, all it needs to do is execute:
# curl aa-svc
But what if the target is in a different namespace? In that case, you need to append the target's namespace to the Service name:
# curl aa-svc.<namespace>
Both forms rely on DNS: the short names only work because the resolver appends the search domains described further below. Whether you're inside or outside a K8S cluster, DNS resolution typically involves these files:
/etc/host.conf
/etc/hosts
/etc/resolv.conf
When resolving a name, a Pod first consults /etc/resolv.conf, which specifies the location of the DNS server. Its content is generated automatically when dnsPolicy: ClusterFirst is set. The nameserver entry inside is the cluster IP of the DNS Service. As a result, every DNS query from this Pod is routed through that cluster IP, whether the target is in the same namespace or a different one.
The default search domains (searched in this order) are:
<namespace>.svc.cluster.local (the Pod's own namespace)
svc.cluster.local
cluster.local
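To make the search order concrete, here is a minimal shell sketch of how a resolver expands a name before sending it to the nameserver. It assumes the glibc-style rule (a name with fewer dots than ndots is tried against the search domains first) and the ndots:5 option that Kubernetes normally writes under ClusterFirst. The candidates function is a hypothetical helper written for this article, and the "default" namespace is only an example.

```shell
#!/bin/sh
# Sketch of resolver name expansion using "search" and "ndots".
# Simplified: real resolvers also deal with timeouts, retries,
# and multiple nameservers.

# candidates NAME NDOTS SEARCH_DOMAIN...
# Prints the fully qualified names that would be tried, in order.
candidates() {
    name=$1
    ndots=$2
    shift 2

    # A trailing dot marks the name as already fully qualified.
    case $name in
        *.) printf '%s\n' "$name"; return ;;
    esac

    # Count the dots in the name.
    dots_only=$(printf '%s' "$name" | tr -cd '.')
    dots=${#dots_only}

    # With at least ndots dots, the name is tried as-is first.
    if [ "$dots" -ge "$ndots" ]; then
        printf '%s.\n' "$name"
    fi
    # Then each search domain is appended in order.
    for domain in "$@"; do
        printf '%s.%s.\n' "$name" "$domain"
    done
    # With fewer dots, the as-is lookup comes last.
    if [ "$dots" -lt "$ndots" ]; then
        printf '%s.\n' "$name"
    fi
}

candidates aa-svc 5 default.svc.cluster.local svc.cluster.local cluster.local
# aa-svc.default.svc.cluster.local.
# aa-svc.svc.cluster.local.
# aa-svc.cluster.local.
# aa-svc.
```

This also shows why a name like aa-svc.<namespace> resolves across namespaces: its second candidate, built from svc.cluster.local, is the Service's full name.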
In summary, Kubernetes offers four DNS policies:
- ClusterFirstWithHostNet: A Pod that runs with hostNetwork: true adopts the Node's resolv.conf by default. Set this policy explicitly if such a Pod should use the cluster DNS instead.
- ClusterFirst: Queries go to the cluster's internal DNS service (CoreDNS) first; names outside the cluster domain are forwarded to the upstream nameservers. This is the policy applied when none is specified.
- Default: The Pod inherits the name resolution configuration of the Node it runs on, that is, the resolv.conf that kubelet is configured to use. Despite its name, it is not the default policy.
- None: Kubernetes DNS settings are ignored, and all DNS configuration must be supplied through the Pod's dnsConfig field.
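The interaction between dnsPolicy and hostNetwork is easy to get wrong, so here is a small sketch that encodes the rules above as a lookup. The effective_dns function and its output labels (cluster, node, dnsConfig) are hypothetical names chosen for illustration; they are not part of kubectl or the Kubernetes API.

```shell
#!/bin/sh
# effective_dns POLICY HOST_NETWORK
# Prints where the Pod's resolver configuration comes from:
#   cluster   -> the cluster DNS Service (CoreDNS)
#   node      -> the Node's resolv.conf
#   dnsConfig -> only what the Pod spec provides in dnsConfig
effective_dns() {
    policy=$1
    host_network=$2
    case $policy in
        ClusterFirst)
            # With hostNetwork: true, ClusterFirst falls back to the
            # behavior of the Default policy.
            if [ "$host_network" = "true" ]; then
                echo node
            else
                echo cluster
            fi
            ;;
        ClusterFirstWithHostNet) echo cluster ;;
        Default) echo node ;;
        None) echo dnsConfig ;;
        *)
            echo "unknown policy: $policy" >&2
            return 1
            ;;
    esac
}

effective_dns ClusterFirst true
effective_dns ClusterFirstWithHostNet true
```

If a host-network Pod unexpectedly fails to resolve Service names, the first case is the usual suspect.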
2. Kube-DNS & CoreDNS
Let’s briefly explain the differences between the two:
(1) Kube-DNS: Kube-DNS also provides DNS name resolution, but kubeadm removed support for it in Kubernetes 1.21, leaving CoreDNS as the only supported option. Below is a basic architecture diagram of Kube-DNS:
Here’s a brief overview of the three main components:
- kubeDNS: It monitors changes to services and endpoints within Kubernetes.
- dnsmasq: This component distinguishes between internal and external domains. For internal domains, it caches DNS queries and forwards them to port 10053.
- Sidecar: The sidecar performs health checks on kubeDNS and dnsmasq and collects monitoring metrics.
(2) CoreDNS: See the description in section 1; I won't repeat it here. The following is a basic operational flowchart for CoreDNS:
(3) Pros & Cons
※ Kube-DNS
Pros:
- Includes dnsmasq, providing a level of performance assurance.
Cons:
- When dnsmasq restarts, it kills the process before restarting the service, which might lead to query failures during the process.
- Excessive or frequent updates to internal files can potentially necessitate a dnsmasq restart, causing operational disruptions.
※ CoreDNS
Pros:
- Customizable with plugins to meet specific requirements.
- The only DNS add-on kubeadm supports from Kubernetes 1.21 onward, and the default for several releases before that.
- More memory-efficient compared to Kube-DNS.
Cons:
- Less efficient caching compared to dnsmasq.
- Slower internal resolution compared to Kube-DNS.
3. Troubleshooting Approaches
#---------------------------------------------
# S3-1. Deploy dnsutils
#---------------------------------------------
[master]# vim dnsutils.yaml
apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
  namespace: default
spec:
  containers:
  - name: dnsutils
    image: registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3
    command:
      - sleep
      - "infinity"
    securityContext:
      capabilities:
        add:
          - NET_RAW
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
[master]# kubectl create -f dnsutils.yaml -n default
[master]# kubectl get pod
#----------------------------------------------------
# S3-2. Ping
#----------------------------------------------------
[master]# kubectl exec -it dnsutils -- /bin/sh
/# ping kubernetes.default
/# ping kubernetes.default.svc.cluster.local
#----------------------------------------------------
# S3-3. Verify that the Service information for the namespace is injected
# into the Pod's environment variables
#----------------------------------------------------
[master]# kubectl exec dnsutils -n default -- env | grep KUBERNETES
#----------------------------------------------------
# S3-4. nslookup
#----------------------------------------------------
/# nslookup kubernetes.default
/# nslookup kubernetes.default.svc.cluster.local
;; connection timed out; no servers could be reached
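The kind of failure nslookup reports already narrows down where to look next. The sketch below sorts captured output into rough categories; classify_lookup and its labels are hypothetical, and the matched strings are the timeout message shown above plus the NXDOMAIN and SERVFAIL status words that nslookup prints for failed answers.

```shell
#!/bin/sh
# classify_lookup: reads nslookup output on stdin and prints a rough label.
#   unreachable -> no DNS server answered; check the CoreDNS Pods,
#                  the kube-dns Service, and kube-proxy
#   nxdomain    -> the server answered, but the name does not exist;
#                  check the Service name and namespace
#   servfail    -> the server answered with an error; check the CoreDNS
#                  logs and its upstream resolver
#   other       -> none of the above patterns matched
classify_lookup() {
    output=$(cat)
    case $output in
        *"connection timed out"*|*"no servers could be reached"*)
            echo unreachable ;;
        *NXDOMAIN*)
            echo nxdomain ;;
        *SERVFAIL*)
            echo servfail ;;
        *)
            echo other ;;
    esac
}

# Example usage against the dnsutils Pod from S3-1 (needs a cluster):
#   kubectl exec dnsutils -- nslookup kubernetes.default 2>&1 | classify_lookup
```

A timeout, as in the output above, means the query never got an answer, which is why the next steps look at the CoreDNS Pods and the path to them rather than at the name being queried.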
#----------------------------------------------------
# S3-5. Verify coredns pod status
#----------------------------------------------------
[master]# kubectl get pods -l k8s-app=kube-dns -n kube-system
=> Running status
#----------------------------------------------------
# S3-6. Verify coredns pod logs
#----------------------------------------------------
[master]# for p in $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name); do kubectl logs --namespace=kube-system $p; done
=> No errors
=> The log output is more verbose than usual because the 'log' plugin
was added to the Corefile via 'kubectl edit configmap coredns -n kube-system'
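Before reading the logs, it helps to confirm whether the log plugin is actually enabled in the Corefile. The following sketch checks a Corefile passed on stdin; has_plugin is a hypothetical helper, and the sample Corefile is an illustrative fragment, not the configuration from my cluster.

```shell
#!/bin/sh
# has_plugin PLUGIN: reads a Corefile on stdin and succeeds if PLUGIN
# appears as a directive at the start of a line (ignoring indentation).
has_plugin() {
    grep -Eq "^[[:space:]]*$1([[:space:]]|\$)"
}

# Illustrative Corefile fragment (assumption, for demonstration only).
sample_corefile='.:53 {
    errors
    log
    forward . /etc/resolv.conf
    loop
}'

if printf '%s\n' "$sample_corefile" | has_plugin log; then
    echo "log plugin enabled"
else
    echo "log plugin not enabled"
fi

# Against a real cluster, the Corefile can be read from the ConfigMap:
#   kubectl get configmap coredns -n kube-system \
#     -o jsonpath='{.data.Corefile}' | has_plugin log
```

The pattern anchors on the whole word, so directives such as loop are not mistaken for log.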
#----------------------------------------------------
# S3-7. Verify endpoint and pod binding
#----------------------------------------------------
[master]# kubectl get pods -l k8s-app=kube-dns -n kube-system -o wide
[master]# kubectl get endpoints kube-dns -n kube-system
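What matters in this step is that the addresses listed in the kube-dns endpoints are exactly the IPs of the running CoreDNS Pods. Comparing the two outputs by eye is error-prone, so here is a sketch that compares them as sets. same_ip_set and normalize are hypothetical helpers, and the addresses in the example are made up.

```shell
#!/bin/sh
# normalize "IP IP ...": prints the addresses one per line, sorted and
# de-duplicated. $1 is left unquoted on purpose so the shell splits it
# on whitespace.
normalize() {
    printf '%s\n' $1 | sort -u
}

# same_ip_set "IPS_A" "IPS_B": succeeds when both whitespace-separated
# lists contain the same addresses, regardless of order.
same_ip_set() {
    [ "$(normalize "$1")" = "$(normalize "$2")" ]
}

# Example with made-up addresses:
if same_ip_set "10.244.0.2 10.244.0.3" "10.244.0.3 10.244.0.2"; then
    echo "endpoints match the CoreDNS Pods"
else
    echo "mismatch between endpoints and CoreDNS Pods"
fi

# Against a real cluster, the two lists can be collected with:
#   pods=$(kubectl get pods -l k8s-app=kube-dns -n kube-system \
#            -o jsonpath='{.items[*].status.podIP}')
#   eps=$(kubectl get endpoints kube-dns -n kube-system \
#            -o jsonpath='{.subsets[*].addresses[*].ip}')
#   same_ip_set "$pods" "$eps"
```

A mismatch, or an empty endpoints list, would suggest that the Service selector or the Pods' readiness is the problem rather than CoreDNS itself.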
#----------------------------------------------------
# S3-8. Determine whether the Pod itself cannot resolve names, or the Pod
# is fine but its requests to kube-dns are not being forwarded. To find
# out, temporarily change the 'nameserver' in '/etc/resolv.conf' to
# another value.
#----------------------------------------------------
Pod /etc/resolv.conf => points to 10.96.0.10 (the clusterIP of the "kube-dns" Service)
[lb01]# kubectl exec -it dnsutils -- /bin/sh
/# vi /etc/resolv.conf
nameserver 8.8.8.8
/# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=56 time=2.774 ms
64 bytes from 8.8.8.8: seq=1 ttl=56 time=2.571 ms
64 bytes from 8.8.8.8: seq=2 ttl=56 time=2.625 ms
64 bytes from 8.8.8.8: seq=3 ttl=56 time=2.512 ms
/# ping www.google.com
ping: bad address 'www.google.com'
#----------------------------------------------------
# S3-9. Verify kube-proxy
#----------------------------------------------------
[master]# kubectl logs kube-proxy-6kdj2 --tail=5 -n kube-system
=> No errors
As the following flowchart shows, a Pod first queries CoreDNS (10.96.0.10); CoreDNS then forwards any query it cannot answer itself to an external resolver.
#--------------------------------------------
# S3-10. Verify whether the Pod can correctly forward requests to
# external resolvers
#--------------------------------------------
[master]# kubectl exec -it nginx-quic-deployment-c5f8b8b44-8hwm9 -- bash
/# yum update
Lastly, note that a Service's Cluster IP is a virtual IP, so it generally does not respond to ping from within a Pod. To test connectivity with ping, target a Pod IP instead:
[master]# kubectl describe pod <pod_name> => find the Pod's IP
[master]# kubectl exec -it nginx-quic-deployment-c5f8b8b44-8hwm9 -- bash
/# ping 192.168.35.9
4. Conclusion
To wrap up, I want to emphasize that understanding how internal name resolution works in Kubernetes is crucial. Many services depend on CoreDNS to communicate with each other, so understanding how CoreDNS operates is vital for cluster administrators.
If your cluster currently uses Kube-DNS and you plan to upgrade Kubernetes, there is official support for migrating to CoreDNS. You can refer to the following link for more information: