VM Migration Failure That Leaves the VM in "ERROR" State
Problem
After a live migration, the Nova API/database shows the VM status as ERROR, although the VM continues to run successfully on the target hypervisor.
Environment
- Private Cloud Director Virtualization - v2025.4 and Higher
- Self-Hosted Private Cloud Director Virtualization - v2025.4 and Higher
- Component: Compute Service
Cause
This issue occurs when the Nova database and the actual hypervisor state become inconsistent during or after migration. Several underlying problems can lead to this mismatch; the most common are:
- Stale file locks on storage volumes.
- Database state mismatch between Nova conductor and compute host.
- Scheduler exceptions (e.g., NoValidHost) after migration completes.
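Regardless of the specific cause, the instance's migration history usually shows which attempt left the record inconsistent. As a quick first check, list the migrations for the affected VM (the --server filter requires a reasonably recent OpenStack client):
$ openstack server migration list --server <VM_ID>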
Diagnostics
- Stale file locks on storage volumes
- Migration fails with volume/lock errors.
- The /var/log/pf9/ostackhost.log from the source compute host shows:
nova.virt.libvirt.volume.mount [REQUEST_ID service] [instance: INSTANCE_ID] Request to remove attachment (volume-[VOLUME_ID] from [MOUNTPOINT], but we don't think it's in use.
ERROR nova.virt.libvirt.driver [REQUEST_ID service] [instance: [INSTANCE_ID]] Failed to start libvirt guest: libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: [TIMESTAMP] qemu-system-x86_64: -device virtio-blk-pci,.....
Failed to get shared "write" lock
- Confirm Cinder volume status:
$ openstack volume show <VOLUME_ID>
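- As an additional, optional check, inspect what still holds the volume open on the source compute host; the mount path is illustrative and depends on your storage backend:
$ virsh domblklist <INSTANCE_ID>
$ lsof <MOUNTPOINT>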
- Database state mismatch between Nova and hypervisor
- The openstack server show output reports ERROR or the wrong hypervisor, but the VM is running on a different host.
- The /var/log/pf9/ostackhost.log from the source compute host shows:
nova.exception_Remote.UnexpectedTaskStateError_Remote: Conflict updating instance [INSTANCE_ID]. Expected: {'task_state': ['migrating']}. Actual: {'task_state': None}
Traceback (most recent call last):
  File "/var/lib/openstack/lib/python3.10/site-packages/nova/db/main/api.py", line 2368, in _instance_update
    update_on_match(compare,
    raise NoRowsMatched("Zero rows matched for %d attempts" % attempts)
oslo_db.sqlalchemy.update_match.NoRowsMatched: Zero rows matched for 3 attempts
[TIMESTAMP] TRACE nova.compute.manager [instance: [INSTANCE_ID]] nova.exception.UnexpectedTaskStateError: Conflict updating instance [INSTANCE_ID]. Expected: {'task_state': ['migrating']}. Actual: {'task_state': None}
- Check Nova DB entry vs actual hypervisor:
$ openstack server show -c OS-EXT-SRV-ATTR:hypervisor_hostname <VM_ID>
- Verify that the VM is running on the hypervisor reported above and check its state:
$ virsh list --all | grep <INSTANCE_ID>
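- If the VM is not found on the hypervisor Nova reports, check the suspected destination host directly (this assumes shell access to the hypervisors; the hostname is a placeholder):
$ ssh <DESTINATION_HYPERVISOR> "virsh list --all | grep <INSTANCE_ID>"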
For Self-Hosted Private Cloud Director only
- Scheduler exceptions (e.g., NoValidHost) after migration
- Migration logs show NoValidHost, but the VM still boots/migrates at the hypervisor level.
- The nova-scheduler logs show:
$ kubectl logs <NOVA_SCHEDULER_POD> -n <NAMESPACE>
NoValidHost(reason="")
nova.exception.NoValidHost: No valid host was found.
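- To locate the scheduler pod and filter directly for the exception, something like the following can be used (pod and namespace names vary by deployment):
$ kubectl get pods -n <NAMESPACE> | grep nova-scheduler
$ kubectl logs <NOVA_SCHEDULER_POD> -n <NAMESPACE> | grep -i "no valid host"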
Resolution
To restore consistency and recover the VM:
- Set the VM state to Active
$ openstack server set --state active <VM_ID>
- Start the VM:
$ openstack server start <VM_ID>
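- If the VM never actually stopped on the hypervisor, the start command may report that the instance is already active; in that case the state reset alone is sufficient. Optionally confirm that no task is left stuck on the instance (the field should show None):
$ openstack server show -c OS-EXT-STS:task_state <VM_ID>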
Validation
- Confirm the VM is now in ACTIVE state:
$ openstack server show -c status <VM_ID>
| status | ACTIVE |
- Ensure it is hosted correctly:
$ openstack server show -c OS-EXT-SRV-ATTR:hypervisor_hostname <VM_ID>
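- Both validation checks can be combined into a single call if preferred:
$ openstack server show -c status -c OS-EXT-SRV-ATTR:hypervisor_hostname <VM_ID>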
Additional Information
- This procedure is safe: it resets Nova’s recorded state without modifying VM disk or memory.
- Always validate with the virsh command that the VM is running before resetting its state.
- If the issue recurs frequently, investigate the root cause in the Nova conductor, scheduler, and Cinder logs.