Caching Not working for NodeLocal DNSCache.

Problem

  • The NodeLocal DNSCache pods are failing to resolve the cached kubernetes service DNS queries.

  • The DNS resolution for kubernetes services failing with SERVFAIL response.

Environment

  • Platform9 Managed Kubernetes – All Versions

Cause

  • The TTL of the records coming from CoreDNS is 30 Secs by default, hence any .cluster.local records would only be cached for 30s.

kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
  • Due to this any record cached in NodeLocal DNSCache pods would only be queryable for 30 Secs before it is expired from the cache.

  • The DNS resolutions beyond 30 Secs will fail with a SERVFAIL response.

Resolution

  • The CoreDNS ConfigMap may be edited to set a higher TTL for any such domains; however, this can result in to a situation where these records will take longer to update in case their endpoint is updated.

  • The ConfigMap/Corefile for the node-local-dns component would also need to be updated to allow a >30s maximum for any successful lookup record TTLs, e.g.

circle-info

Last updated