VM Migration Failure That Leaves the VM in "ERROR" State
Problem
After a live migration, the Nova API/database shows the VM status as ERROR, although the VM continues to run successfully on the target hypervisor.
Environment
- Private Cloud Director Virtualization - v2025.4 and Higher
- Self-Hosted Private Cloud Director Virtualization - v2025.4 and Higher
- Component: Compute Service
Cause
This issue occurs when the Nova database and the actual hypervisor state become inconsistent during or after migration. Several underlying problems can lead to this mismatch; the most common are:
- Stale file locks on storage volumes.
- Database state mismatch between Nova conductor and compute host.
- Scheduler exceptions (e.g., NoValidHost) after migration completes.
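Regardless of the specific cause, the instance's migration history usually shows which attempt left the record inconsistent. As a quick first check, list the migrations for the affected VM (the --server filter requires a reasonably recent OpenStack client):
$ openstack server migration list --server <VM_ID>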
Diagnostics
- Stale file locks on storage volumes
- Migration fails with volume/lock errors.
- The /var/log/pf9/ostackhost.log from the source compute host shows:
nova.virt.libvirt.volume.mount [REQUEST_ID service] [instance: INSTANCE_ID] Request to remove attachment (volume-[VOLUME_ID] from [MOUNTPOINT], but we don't think it's in use.
ERROR nova.virt.libvirt.driver [REQUEST_ID service] [instance: [INSTANCE_ID]] Failed to start libvirt guest: libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: [TIMESTAMP] qemu-system-x86_64: -device virtio-blk-pci,.....
Failed to get shared "write" lock
- Confirm Cinder volume status:
$ openstack volume show <VOLUME_ID>
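- As an additional, optional check, inspect what still holds the volume open on the source compute host; the mount path is illustrative and depends on your storage backend:
$ virsh domblklist <INSTANCE_ID>
$ lsof <MOUNTPOINT>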
- Database state mismatch between Nova and hypervisor
- The openstack server show output reports ERROR or the wrong hypervisor, but the VM is running on a different host.
- The /var/log/pf9/ostackhost.log from the source compute host shows:
nova.exception_Remote.UnexpectedTaskStateError_Remote: Conflict updating instance [INSTANCE_ID]. Expected: {'task_state': ['migrating']}. Actual: {'task_state': None}
Traceback (most recent call last):
  File "/var/lib/openstack/lib/python3.10/site-packages/nova/db/main/api.py", line 2368, in _instance_update
    update_on_match(compare,
    raise NoRowsMatched("Zero rows matched for %d attempts" % attempts)
oslo_db.sqlalchemy.update_match.NoRowsMatched: Zero rows matched for 3 attempts
[TIMESTAMP] TRACE nova.compute.manager [instance: [INSTANCE_ID]] nova.exception.UnexpectedTaskStateError: Conflict updating instance [INSTANCE_ID]. Expected: {'task_state': ['migrating']}. Actual: {'task_state': None}
- Check Nova DB entry vs actual hypervisor:
$ openstack server show -c OS-EXT-SRV-ATTR:hypervisor_hostname <VM_ID>
- Verify that the VM is running on the hypervisor reported above and check its state:
$ virsh list --all | grep <INSTANCE_ID>
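- If the VM is not found on the hypervisor Nova reports, check the suspected destination host directly (this assumes shell access to the hypervisors; the hostname is a placeholder):
$ ssh <DESTINATION_HYPERVISOR> "virsh list --all | grep <INSTANCE_ID>"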
For Self-Hosted Private Cloud Director only
- Scheduler exceptions (e.g., NoValidHost) after migration
- Migration logs show NoValidHost, but the VM still boots/migrates at the hypervisor level.
- The nova-scheduler logs show:
$ kubectl logs <NOVA_SCHEDULER_POD> -n <NAMESPACE>
NoValidHost(reason="")
nova.exception.NoValidHost: No valid host was found.
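- To locate the scheduler pod and filter directly for the exception, something like the following can be used (pod and namespace names vary by deployment):
$ kubectl get pods -n <NAMESPACE> | grep nova-scheduler
$ kubectl logs <NOVA_SCHEDULER_POD> -n <NAMESPACE> | grep -i "no valid host"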
Resolution
To restore consistency and recover the VM:
- Set the VM state to Active
$ openstack server set --state active <VM_ID>
- Start the VM:
$ openstack server start <VM_ID>
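- If the VM never actually stopped on the hypervisor, the start command may report that the instance is already active; in that case the state reset alone is sufficient. Optionally confirm that no task is left stuck on the instance (the field should show None):
$ openstack server show -c OS-EXT-STS:task_state <VM_ID>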
Validation
- Confirm the VM is now in ACTIVE state:
$ openstack server show -c status <VM_ID>
| status | ACTIVE |
- Ensure it is hosted correctly:
$ openstack server show -c OS-EXT-SRV-ATTR:hypervisor_hostname <VM_ID>
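- Both validation checks can be combined into a single call if preferred:
$ openstack server show -c status -c OS-EXT-SRV-ATTR:hypervisor_hostname <VM_ID>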
Additional Information
- This procedure is safe: it resets Nova’s recorded state without modifying VM disk or memory.
- Always validate with the virsh command that the VM is running before resetting its state.
- If the issue recurs frequently, investigate the root cause in the Nova conductor, scheduler, and Cinder logs.