Unable to Create Clone Volumes From Source Volume

Problem

The VM shows an attached bootable volume in the PCD UI. However, the actual disk file being used by the hypervisor does not match the volume registered in the storage service.

As a result, cloning the volume fails: the volume record exists in Cinder, but it has no valid backend file.

Environment

  • Private Cloud Director Virtualization - v2025.6 and Higher

  • Private Cloud Director Kubernetes - v2025.6 and Higher

  • Self-Hosted Private Cloud Director Virtualization - v2025.6 and Higher

  • Self-Hosted Private Cloud Director Kubernetes - v2025.6 and Higher

  • Component - Block Storage

Cause

This issue is caused by a metadata inconsistency between the Compute service and the Block Storage service, which likely occurred during a failed volume operation. The disk file that virsh dumpxml shows mapped to the bootable /dev/vda device has no corresponding volume record in Block Storage, and therefore cannot be managed through the Block Storage APIs.

Meanwhile, the Cinder-registered volume has no backend file on the shared mount used for Tintri, which confirms the metadata drift between the Compute service and the Block Storage service.

Diagnostics

The following command outputs confirm the mismatch between the control plane metadata and the data plane reality.

  1. Verify the Volume Status: The volume reported as the boot volume shows as in-use, even though a search for its disk file on the Block Storage backend fails:

    $ openstack volume show <VOLUME_UUID_1> --fit
    +--------------------------------+----------------+
    | Field                          | Value          |
    +--------------------------------+----------------+
    | name                           | [VOLUME_NAME]  |
    | status                         | in-use         |
    +--------------------------------+----------------+
  2. Inspect the VM XML on the Hypervisor: Log in to the compute host where the VM is running and inspect the VM domain XML. This reveals the actual disk file in use. Note: The output shows the serial matching the Cinder-reported ID (<VOLUME_UUID_1>), but the source file uses the unmanaged ID (<VOLUME_UUID_2>).

    Hypervisor Host
    $ sudo virsh dumpxml <VM_UUID> | grep -B5 -i volume
    <...>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw' cache='none' io='native'/>
        <source file='/path/to/volume-<VOLUME_UUID_2>' index='2'/>
        <backingStore/>
        <target dev='vda' bus='virtio'/>
    <...>
  3. Confirm the Actual Volume is Unmanaged: Searching for the volume ID found in the virsh dumpxml output confirms that the Block Storage service has no record of this active disk.

    $ openstack volume show <VOLUME_UUID_2> --fit
    No volume with a name or ID of '<VOLUME_UUID_2>' exists.
  4. Check the Disk Format: Inspecting the file on the hypervisor confirms that it is a raw disk.

    $ qemu-img info /path/to/volume-<VOLUME_UUID_2>
    volume-<VOLUME_UUID_2>
    image: /path/to/volume-<VOLUME_UUID_2>
    file format: raw
    virtual size: 160 GiB (171798691840 bytes)
    disk size: 162 GiB

The Nova domain XML on the compute host points to the correct, functional disk file (ID <VOLUME_UUID_2>), but that volume is missing from the Cinder database, leaving it an unmanaged resource. The original Cinder volume record <VOLUME_UUID_1> is a ghost record with no physical file, which is why attempts to clone or snapshot fail.
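The comparison made in steps 1 to 3 can be scripted. The following is a minimal sketch rather than part of the documented procedure: the function names extract_disk_volume_id and check_volume_mapping are hypothetical, and the sketch assumes that the disk source path ends in volume-<UUID> and that libvirt prints attributes in single quotes, as in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; they only parse text and make no API calls.

# Read libvirt domain XML on stdin and print the UUID found in every
# <source file='.../volume-<UUID>'> attribute, one per line.
extract_disk_volume_id() {
  sed -nE "s/.*<source file='[^']*volume-([0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})'.*/\1/p"
}

# Compare the Cinder-reported volume ID ($1) with the first volume ID
# found in the domain XML on stdin.
# Returns 0 on match, 1 on mismatch, 2 if no volume path was found.
check_volume_mapping() {
  local cinder_id="$1" disk_id
  disk_id="$(extract_disk_volume_id | head -n 1)"
  if [ -z "$disk_id" ]; then
    echo "UNKNOWN: no volume-<UUID> source file found in the domain XML"
    return 2
  fi
  if [ "$disk_id" = "$cinder_id" ]; then
    echo "MATCH: hypervisor disk and Cinder record both use $cinder_id"
    return 0
  fi
  echo "MISMATCH: Cinder reports $cinder_id but the hypervisor uses $disk_id"
  return 1
}

# Usage on the hypervisor host (placeholders as in the steps above):
#   sudo virsh dumpxml <VM_UUID> | check_volume_mapping <VOLUME_UUID_1>
```

A MISMATCH result corresponds to the condition described in this article. The reported hypervisor ID can then be passed to openstack volume show, as in step 3.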

Resolution

The resolution involves directly recovering the active disk file, converting it to a standard image format, and creating a new Cinder-managed volume and instance from that recovered image.

  1. On the compute hypervisor, copy the active, raw disk file to a temporary location.

  2. Convert the raw disk image to the recommended qcow2 format using qemu-img to save space and enable sparse image features.

  3. Upload the recovered .qcow2 file to the Image Library.

  4. Launch a new instance from the newly uploaded Image Library image, creating a new boot volume.

  5. Verify that the new instance boots successfully and operates normally.
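The steps above can be sketched as shell commands. This is an illustrative outline under stated assumptions, not the official procedure: the helper names run, recover_disk, and upload_and_boot are hypothetical, the working directory, image name, flavor, network, and server name are placeholders, and the --boot-from-volume option assumes a reasonably recent OpenStack client. Commands are only printed unless DRY_RUN=0 is set, so the sequence can be reviewed first.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; by default (DRY_RUN=1) commands are printed, not run.

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Steps 1-2, on the compute hypervisor: copy the raw disk file to a
# temporary location, then convert the copy to qcow2.
#   $1 = path of the active disk file, $2 = temporary working directory
recover_disk() {
  local src="$1" workdir="$2" base
  base="$(basename "$src")"
  run cp --sparse=always "$src" "$workdir/$base.raw" || return 1
  run qemu-img convert -p -f raw -O qcow2 \
    "$workdir/$base.raw" "$workdir/$base.qcow2"
}

# Steps 3-4, from a host with the openstack CLI and credentials loaded:
# upload the image, then boot a new instance with a new boot volume.
#   $3 = boot volume size in GB; assumed to be at least the virtual size
#        reported by qemu-img info
upload_and_boot() {
  local qcow2="$1" image_name="$2" size_gb="$3"
  local flavor="$4" network="$5" server="$6"
  run openstack image create --disk-format qcow2 --container-format bare \
    --file "$qcow2" "$image_name" || return 1
  run openstack server create --image "$image_name" \
    --boot-from-volume "$size_gb" --flavor "$flavor" \
    --network "$network" --wait "$server"
}

# Example (prints the commands only; all names are placeholders):
#   recover_disk /path/to/volume-<VOLUME_UUID_2> /tmp/recovery
#   upload_and_boot /tmp/recovery/volume-<VOLUME_UUID_2>.qcow2 \
#     recovered-image 160 <FLAVOR> <NETWORK> recovered-vm
```

Step 5 remains a manual check: confirm on the console of the new instance that it boots and that the expected data is present.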

Additional Information

  • This recovery method bypasses the corrupted Block Storage metadata by extracting the live, functional disk data from the hypervisor.

  • Cleaning up the original instance and its associated volume records is required to fully resolve the metadata corruption.

  • If the original instance has a second, non-bootable volume that contains user data, carefully detach it and re-attach it to the recovered instance.
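Moving a secondary data volume can be sketched in the same way. This is a hedged example, not a documented procedure: the helper names run, wait_for_available, and move_data_volume are hypothetical, and the sketch assumes that the data volume itself is healthy in Cinder and becomes available after it is detached. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; by default (DRY_RUN=1) commands are printed, not run.

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Poll until the volume reports status "available" (about 5 minutes at most).
# Skipped entirely in dry-run mode.
wait_for_available() {
  local volume="$1" status attempt
  if [ "${DRY_RUN:-1}" = "1" ]; then
    return 0
  fi
  for attempt in $(seq 1 60); do
    status="$(openstack volume show -f value -c status "$volume")" || return 1
    if [ "$status" = "available" ]; then
      return 0
    fi
    sleep 5
  done
  echo "Timed out waiting for volume $volume to become available" >&2
  return 1
}

# Detach a data volume from the original instance and attach it to the
# recovered instance.
#   $1 = original server, $2 = recovered server, $3 = data volume
move_data_volume() {
  local old_server="$1" new_server="$2" volume="$3"
  run openstack server remove volume "$old_server" "$volume" || return 1
  wait_for_available "$volume" || return 1
  run openstack server add volume "$new_server" "$volume"
}

# Example (prints the commands only; names are placeholders):
#   move_data_volume <ORIGINAL_VM> <RECOVERED_VM> <DATA_VOLUME_UUID>
```

Unmounting the filesystem inside the guest before detaching reduces the risk of losing data that has not yet been written to the volume.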
