# VMs Running in a Specific Compute Host Were Not Reachable

## Problem

Multiple VMs running on a specific compute host were unreachable, both from the compute host itself and from the external network, as confirmed by network connectivity tests. The affected VMs were all part of a common VLAN with port security disabled.

### Environment <a href="#environment" id="environment"></a>

* Private Cloud Director Virtualization - v2025.4 and Higher
* Self-Hosted Private Cloud Director Virtualization - v2025.4 and Higher
* Component: Networking Service

### Cause <a href="#cause" id="cause"></a>

Stale entries in the OVN southbound database (OVN-SB-DB) cause a complete VM network outage. When VMs are connected to a provider network with **port security disabled**, PCD does not recognize attempts to reach the VM IP addresses and prevents the FDB table from learning the associated MAC addresses. As a result, traffic is continuously broadcast, and the affected VM ultimately causes a total network outage for the VMs on that network.

An [upstream issue](https://bugs.launchpad.net/neutron/+bug/2012069) was identified and is tracked in PCD as PCD-4889.

## Diagnostics

* Identify the target VMs on the host that belong to the affected network. Usually, all VMs on the same host that use a specific network are affected.

  ```bash
  openstack server list --all --host <HOST_UUID> | grep -i <NETWORK_NAME>
  ```

* Find recently migrated or created VMs from the list above.

  ```bash
  openstack server event list <VM_UUID>
  ```

* Check if Port Security is disabled.

* Check if the VMs are reachable inside the corresponding subnet.

* Power off the suspect VMs one by one (with confirmation) and check network reachability for the other VMs.

* Once the problematic VM is identified and powered off, the remaining running VMs on the affected network become reachable. Then change the IP and MAC address of the network port attached to the problematic VM.

* In the OVN flows for the affected VM, traces of a stale route are evident, as shown in the following example:

  <pre class="language-bash" data-title="Affected Host - Example:"><code class="lang-bash">65. reg15=0x58,metadata=0x14, priority 100, cookie 0xf8315af4
      output:274
  # In the OVN flow, reg15=0x58 is 88 in decimal.
  </code></pre>

* The entry corresponding to port key 88 was the following, which caused the stale route:

  <pre class="language-bash" data-title="Affected Host"><code class="lang-bash">_uuid               : [UUID]
  dp_key              : 20
  mac                 : "[MAC_ADDRESS]"
  port_key            : 88
  </code></pre>
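
The lookup in the last two steps (converting the `reg15` value from the flow to decimal and finding the matching southbound FDB row) can be scripted. The following is a minimal sketch, not a PCD-provided tool: the function names `reg_to_port_key` and `find_fdb_by_port_key` are illustrative, and it assumes `SBDB` is exported and points at the southbound database, as in the `ovs-alias.rc` file described under Workaround 2.

```bash
#!/usr/bin/env bash
# Illustrative helpers (hypothetical names, not part of PCD or OVN).

# Convert a hex register value from an OVN flow (e.g. reg15=0x58) to decimal.
reg_to_port_key() {
  printf '%d\n' "$1"
}

# List southbound FDB rows whose port_key matches the given register value.
# Assumes SBDB is exported, e.g. tcp:<OVN_REMOTE_IP>:6642.
find_fdb_by_port_key() {
  local port_key
  port_key=$(reg_to_port_key "$1")
  ovn-sbctl --db="$SBDB" find fdb "port_key=${port_key}"
}

# Example: find_fdb_by_port_key 0x58
```

The `_uuid` of the row returned by the lookup is the value needed when destroying the stale entry.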

### Workaround 1 <a href="#workaround" id="workaround"></a>

* Delete the existing port and recreate it with the same IP so that the MAC address of the VM changes.

### Workaround 2 <a href="#workaround" id="workaround"></a>

* To run `ovn-*` commands on hosts onboarded to PCD, complete the following steps.
  * Create an environment file `ovs-alias.rc` as below:

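    ```bash
    # Take the host part (second ':'-separated field) of the ovn-remote setting.
    EXTERNAL_ID=$(sudo ovs-vsctl get open . external_ids:ovn-remote | awk -F: '{print $2}')
    # Northbound (6641) and southbound (6642) database endpoints.
    export NBDB=tcp:${EXTERNAL_ID}:6641
    export SBDB=tcp:${EXTERNAL_ID}:6642
    # Point the ovn-* tools at the remote databases.
    alias ovn-sbctl="ovn-sbctl --db=$SBDB"
    alias ovn-nbctl="ovn-nbctl --db=$NBDB"
    alias ovn-trace="ovn-trace --db=$SBDB"
    ```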
  * Source the rc file and start using the `ovn-*` commands:

    ```
    $ source ovs-alias.rc
    $ ovn-sbctl show
    ```
* Delete the stale FDB entry from the `ovn-sb` pod using the following command:

  <pre class="language-bash" data-title="Affected Host"><code class="lang-bash">$ ovn-sbctl destroy fdb [FDB_UUID]
  </code></pre>
* After this change, network connectivity is restored for the impacted VMs on the host.
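
Because `destroy` removes the row immediately, it can help to display the entry and validate the identifier before deleting it. The following is a hedged sketch under the same assumption that `SBDB` is exported as in `ovs-alias.rc`; `is_uuid` and `destroy_stale_fdb` are hypothetical helper names introduced here for illustration.

```bash
#!/usr/bin/env bash
# Illustrative wrapper (hypothetical names); assumes SBDB is exported.

# Return success only if the argument looks like a UUID.
is_uuid() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Show the FDB row first, then destroy it.
destroy_stale_fdb() {
  local fdb_uuid=$1
  if ! is_uuid "$fdb_uuid"; then
    echo "not a UUID: $fdb_uuid" >&2
    return 1
  fi
  # Print the row so the MAC and port_key can be checked before deletion.
  ovn-sbctl --db="$SBDB" list fdb "$fdb_uuid" || return 1
  ovn-sbctl --db="$SBDB" destroy fdb "$fdb_uuid"
}

# Example: destroy_stale_fdb <FDB_UUID>
```

The guard rejects placeholders such as `[FDB_UUID]` that were pasted without being substituted.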

### Resolution <a href="#resolution" id="resolution"></a>

Instead of disabling port security entirely, we recommend keeping the basic MAC-address validation enabled and applying a security group that allows all traffic. This approach ensures that:

* The VM ports continue to enforce correct MAC-address learning.
* The gateway MAC address is not mistakenly learned on VM interfaces.
* All inbound and outbound traffic continues to flow without restriction.

This provides the required functionality while maintaining the minimal level of protection needed to prevent incorrect MAC entries from causing connectivity issues.
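
As a rough illustration of this recommendation, the OpenStack CLI can create a permissive security group and re-enable port security on a port. This is a sketch under assumptions rather than a prescribed procedure: the group name `allow-all` and the function names are hypothetical, and the comment about egress rules reflects the usual Neutron default, which may be customized in a given deployment.

```bash
#!/usr/bin/env bash
# Illustrative helpers (hypothetical names). SG_NAME is an assumed group name.
SG_NAME="${SG_NAME:-allow-all}"

# Create a security group that permits all ingress traffic.
# New groups normally already contain allow-all egress rules by default.
create_allow_all_group() {
  openstack security group create "$SG_NAME" \
    --description "Allow all traffic while keeping port security enabled"
  openstack security group rule create --ingress --ethertype IPv4 --protocol any "$SG_NAME"
  openstack security group rule create --ingress --ethertype IPv6 --protocol any "$SG_NAME"
}

# Enable port security on a port, attach the group, and show the result.
secure_port() {
  local port_id=$1
  openstack port set --enable-port-security --security-group "$SG_NAME" "$port_id"
  openstack port show "$port_id" -c port_security_enabled -c security_group_ids
}

# Example:
#   create_allow_all_group
#   secure_port <PORT_UUID>
```

The final `openstack port show` call is there to confirm that `port_security_enabled` reports `True` after the change.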

### Validation <a href="#validation" id="validation"></a>

The VMs respond to ping tests and are accessible via SSH and the virsh console.
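
The ping check can be repeated across all affected VMs with a small loop. The sketch below assumes a Linux `ping` that supports `-c` (count) and `-W` (timeout in seconds); `check_reachable` is an illustrative name, and the IP addresses passed to it are placeholders to be replaced with the VM addresses on the affected network.

```bash
#!/usr/bin/env bash
# Illustrative helper (hypothetical name): ping each IP and report the result.
# Returns non-zero if any address did not respond.
check_reachable() {
  local ip failed=0
  for ip in "$@"; do
    if ping -c 3 -W 2 "$ip" >/dev/null 2>&1; then
      echo "OK   $ip"
    else
      echo "FAIL $ip"
      failed=1
    fi
  done
  return "$failed"
}

# Example: check_reachable <VM_IP_1> <VM_IP_2>
```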

