Caching Not working for NodeLocal DNSCache.
Problem
- The NodeLocal DNSCache pods are failing to resolve the cached kubernetes service DNS queries.
- The DNS resolution for kubernetes services failing with
SERVFAIL
response.
Environment
- Platform9 Managed Kubernetes – All Versions
Cause
- The TTL of the records coming from CoreDNS is 30 Secs by default, hence any
.cluster.local
records would only be cached for 30s.
kubernetes cluster.local in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
ttl 30
}
- Due to this any record cached in NodeLocal DNSCache pods would only be queryable for 30 Secs before it is expired from the cache.
- The DNS resolutions beyond 30 Secs will fail with a
SERVFAIL
response.
Resolution
- The CoreDNS ConfigMap may be edited to set a higher TTL for any such domains; however, this can result in to a situation where these records will take longer to update in case their endpoint is updated.
- The ConfigMap/Corefile for the node-local-dns component would also need to be updated to allow a >30s maximum for any successful lookup record TTLs, e.g.
cache {
success 9984 60
denial 9984 5
}
A similar issue was reported to upstream at
https://github.com/kubernetes/dns/issues/415#issuecomment-712450686
Was this page helpful?