Search Domain Resolution Failure by Public Nameservers in Onprem Setup

Problem

Setting public nameservers such as 1.1.1.1 and 8.8.8.8 in the /etc/resolv.conf file of the Management Plane master nodes in an airgapped environment causes pods to fail with timeout errors:

CoreDNS pod logs

Liveness/Readiness probe failures are observed on pods such as Glance-Api, Nova-Api-Osapi, and Neutron-Server:

Pod describe - Glance-Api pod

Environment

  • Self-Hosted Private Cloud Director Virtualization - v2025.4 and Higher
  • Self-Hosted Private Cloud Director Kubernetes - v2025.4 and Higher
  • Component: DNS

Cause

In CoreDNS, an error encountered by any of its plugins can prevent DNS name resolution. In airgapped environments, public nameservers are unreachable even though they may remain listed in /etc/resolv.conf (via the /etc/netplan/50-cloud-init.yaml file in this case):

YAML

Here, when the PCD pods attempt to communicate with other pods via DNS, the presence of an additional search domain (e.g., abc.com) and the absence of an internal nameserver cause DNS queries to be forwarded to upstream nameservers (such as 8.8.8.8 and 1.1.1.1) configured on the nodes. Since internet connectivity is disabled, these queries time out, leading to I/O timeout errors.
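
The path a lookup takes can be sketched as follows. This is a minimal model, not the actual resolver or CoreDNS code. It assumes the usual Kubernetes pod resolver settings (ndots:5, cluster search domains followed by the node's search domains) and a CoreDNS configuration that answers cluster.local itself and forwards everything else to the nameservers from /etc/resolv.conf. The namespace pcd, the name keystone.example.internal, and the function names are illustrative assumptions.

```python
def candidate_queries(name, search, ndots=5):
    """Return the FQDNs a stub resolver tries, in order, for `name`."""
    if name.endswith("."):
        return [name]  # already fully qualified: no search expansion
    expanded = [f"{name}.{domain}." for domain in search]
    absolute = name + "."
    # Names with fewer dots than ndots go through the search list first.
    if name.count(".") < ndots:
        return expanded + [absolute]
    return [absolute] + expanded


def is_forwarded(fqdn, cluster_zone="cluster.local."):
    """Simplified model: anything outside the cluster zone goes upstream."""
    return fqdn != cluster_zone and not fqdn.endswith("." + cluster_zone)


# Hypothetical pod search list: cluster domains plus the node's extra domain.
search = ["pcd.svc.cluster.local", "svc.cluster.local", "cluster.local", "abc.com"]

for fqdn in candidate_queries("keystone.example.internal", search):
    target = "upstream nameserver" if is_forwarded(fqdn) else "cluster DNS"
    print(f"{fqdn:55} -> {target}")
```

In this model the first three candidates are answered inside the cluster, while the abc.com candidate and the bare name are handed to the upstream nameservers, which is where the timeout occurs when those servers are unreachable.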

This situation generates a high volume of failed DNS requests, approximately 2,800 in 166 seconds (about 17 per second), as pods repeatedly attempt to resolve names without success. The resulting delays cause further instability within the environment, including failures in PCD pod DNS resolution, readiness and liveness probe failures, and overall management plane instability.

Diagnostics

  • Look for timeout errors in the CoreDNS pod logs that name the DNS servers 1.1.1.1:53 and 8.8.8.8:53:
CoreDNS Pod logs
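
To see which upstream servers account for the timeouts, the log output can be tallied with a short script. This is a rough sketch: it only assumes that each failing line contains the text "i/o timeout" and the upstream address in ip:53 form. The sample lines are illustrative, shaped like typical CoreDNS error output, and are not taken from an affected cluster.

```python
import re
from collections import Counter

# An IPv4 address immediately followed by port 53 (the upstream side of the query).
UPSTREAM = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}):53\b")


def tally_timeouts(log_text):
    """Count i/o timeout lines per upstream nameserver address."""
    counts = Counter()
    for line in log_text.splitlines():
        if "i/o timeout" not in line.lower():
            continue
        for server in UPSTREAM.findall(line):
            counts[server] += 1
    return counts


# Illustrative log lines (assumed format, hypothetical addresses and names).
sample = """\
[ERROR] plugin/errors: 2 keystone.abc.com. A: read udp 10.20.0.5:41234->8.8.8.8:53: i/o timeout
[ERROR] plugin/errors: 2 keystone.abc.com. AAAA: read udp 10.20.0.5:53712->1.1.1.1:53: i/o timeout
[INFO] 10.20.0.9:40022 - 12 "A IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 106 0.0002s
"""

print(tally_timeouts(sample))
```

Saved CoreDNS pod log output can be passed to tally_timeouts in place of the sample text.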

Check the nameserver values in the following configuration files:

Conf files
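
The same check can be scripted. The sketch below is a minimal example that reads resolv.conf-style text, collects the nameserver and search entries, and flags nameservers with globally routable addresses, which an airgapped node cannot reach. The function names and the sample content are assumptions for illustration.

```python
import ipaddress


def parse_resolv_conf(text):
    """Return (nameservers, search_domains) from resolv.conf-style text."""
    nameservers, search = [], []
    for raw in text.splitlines():
        # Drop comments introduced by '#' or ';'.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        fields = line.split()
        if len(fields) < 2:
            continue
        if fields[0] == "nameserver":
            nameservers.append(fields[1])
        elif fields[0] == "search":
            search = fields[1:]  # a later search line replaces an earlier one
    return nameservers, search


def public_nameservers(nameservers):
    """Nameservers whose address is globally routable (not private or loopback)."""
    return [ns for ns in nameservers if ipaddress.ip_address(ns).is_global]


# Illustrative content matching the scenario in this article.
sample = """\
nameserver 1.1.1.1
nameserver 8.8.8.8
search abc.com
"""

servers, domains = parse_resolv_conf(sample)
print("public nameservers:", public_nameservers(servers))
print("search domains:", domains)
```

On a node, the file content can be supplied with open("/etc/resolv.conf").read() instead of the sample string.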

Resolution

  • Remove any unnecessary search domains from your DNS configuration.
  • Configure custom or internal nameservers that are accessible within your environment and capable of resolving the required DNS queries.
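
Before switching to an internal nameserver, it can help to confirm that the server replies from the node at all. The following is a minimal sketch using only the Python standard library: it sends a single A-record query over UDP and reports the response code, or None when nothing comes back within the timeout. The function names and the address used in the usage note are hypothetical.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Encode a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def query_rcode(server, name, timeout=2.0, port=53):
    """Return the DNS response code from `server`, or None if no valid reply arrives."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(512)
        except OSError:  # includes timeouts and unreachable-network errors
            return None
    if len(reply) < 4 or reply[:2] != query[:2]:
        return None
    return reply[3] & 0x0F  # 0 means NOERROR
```

For example, query_rcode("10.0.0.2", "abc.com") returning 0 would indicate that the hypothetical internal server at 10.0.0.2 is reachable and answered the query, while None would point to the same kind of timeout described above.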

Validation

  • All the pods in the Management Plane cluster are in the Running state.
  • No Readiness/Liveness probe failures occur in the OpenStack component pods such as Nova, Neutron, and Glance.