VM Creation Failure Due to Glance Host and Stale OVN NetNS Issue

Problem

VM creation failed at the volume-from-image stage because the Glance host was in a failed state. The failure was linked to a stale OVN metadata network namespace, which caused the pf9-neutron-ovn-metadata-agent service to crash and, in turn, impaired Glance functionality.

Environment

  • Private Cloud Director Virtualization - v2025.4 and Higher
  • Private Cloud Director Kubernetes - v2025.4 and Higher
  • Self-Hosted Private Cloud Director Virtualization - v2025.4 and Higher
  • Self-Hosted Private Cloud Director Kubernetes - v2025.4 and Higher

Cause

The root cause was traced to the Glance host being in a failed state. Additionally, a stale/corrupted OVN metadata namespace caused the pf9-neutron-ovn-metadata-agent service to fail, which contributed to network namespace issues affecting Glance and possibly Cinder volume creation.

Diagnostics

Key findings during the investigation:

  • Glance backend was not visible in the openstack volume backend pool list command output.
  • pf9-neutron-ovn-metadata-agent was in a failed state, with a critical error logged in /var/log/pf9/pf9-neutron-ovn-metadata-agent.log on the affected host.
  • A stale/invalid netns entry was found under /var/run/netns on the affected host.
  • The invalid namespace had no permissions and was inaccessible.
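
The stale-entry check described above can be sketched as a small shell function. This is a minimal sketch, not the exact commands used during the investigation: the function name find_stale_netns is hypothetical, and it assumes GNU find and that a stale entry appears with all permission bits cleared, as reported in the findings.

```shell
# Hypothetical helper: list entries in a netns directory whose permission
# bits are all zero (the "no permissions" symptom noted in the findings).
# The directory defaults to /var/run/netns, the path named in this article.
find_stale_netns() {
    local dir="${1:-/var/run/netns}"
    # -perm 0000 matches only entries whose mode is exactly 000.
    # -printf '%f\n' (GNU find) prints the bare entry name, one per line.
    find "$dir" -mindepth 1 -maxdepth 1 -perm 0000 -printf '%f\n' 2>/dev/null | sort
}
```

Called with no argument on the affected host, the function scans /var/run/netns. Running ls -l /var/run/netns and systemctl status pf9-neutron-ovn-metadata-agent gives the same picture manually.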

Resolution

  • Identified that pf9-neutron-ovn-metadata-agent was in a failed state.
  • Found a stale netns entry with invalid permissions under /var/run/netns.
  • Deleted the stale netns entry on the affected host.
  • Restarted the metadata agent (pf9-neutron-ovn-metadata-agent).
  • Restarted the glance-api service to ensure backend availability.
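
The remediation order above can be sketched as follows. The original commands are not shown in this article, so the sketch rests on assumptions: ip netns delete and systemctl restart are standard tools but are not confirmed as the ones actually used; the helper names run and remediate_stale_netns are hypothetical; and the glance-api systemd unit name is not given here, so it is passed in as an argument rather than guessed. Setting DRY_RUN=1 prints the commands instead of executing them.

```shell
# Unit name taken from this article.
METADATA_UNIT="pf9-neutron-ovn-metadata-agent"

# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: delete the stale netns, then restart the metadata
# agent and the glance-api unit, stopping at the first failure.
# $1 = stale netns name, $2 = glance-api systemd unit (assumption: supplied
# by the operator, since the article does not name the unit).
remediate_stale_netns() {
    local ns="$1" glance_unit="$2"
    if [ -z "$ns" ] || [ -z "$glance_unit" ]; then
        echo "usage: remediate_stale_netns <netns-name> <glance-unit>" >&2
        return 2
    fi
    run ip netns delete "$ns" || return 1
    run systemctl restart "$METADATA_UNIT" || return 1
    run systemctl restart "$glance_unit"
}
```

Running the function with DRY_RUN=1 first makes it possible to review the three commands before executing them as root on the affected host.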

Validation

  • Post-remediation, pf9-neutron-ovn-metadata-agent started successfully without errors.
  • Glance backend was registered properly.
  • VM creation using image-backed volume was tested and succeeded.