Storage Service Troubleshooting Guide

Problem

This troubleshooting guide aims to empower users by providing clear, actionable steps, common error explanations, and best practices to quickly and independently solve Storage-related problems. Specifically, Cinder in Private Cloud Director.

Environment

Private Cloud Director Virtualization - v2025.4 and Higher
Self-Hosted Private Cloud Director Virtualization - v2025.4 and Higher
Component - PCD Storage Service

Deep Dive

Volume Creation Flow

This is the process of provisioning a new block storage device from a storage backend.

API Request: A user sends a request to create a volume via the OpenStack CLI, Private Cloud Director dashboard, or direct API call. The cinder-api service receives this request, authenticates the user with Keystone, and raises a POSThttps://<FQDN>/v3/<TENANT_UUID>/volumes volume request, which further creates a Cinder database entry for the volume with a status of creating. The below cinder-api pod logs sample shows the POST request, Volume size and successful issue for volume creation request.

Sample logs
    
INFO cinder.api.openstack.wsgi [None [REQ-ID] [USER_ID] [TENANT_ID] - - default default] POST https:/<FQDN>/v3/<TENANT_UUID>/volumesINFO cinder.api.v3.volumes [None [REQ-ID] [USER_ID] [TENANT_ID] - - default default] Create volume of 2 GBINFO cinder.volume.api [None [REQ-ID] [USER_ID] [TENANT_ID] - - default default] Availability Zones retrieved successfully.INFO cinder.volume.api [None [REQ-ID] [USER_ID] [TENANT_ID] - - default default] Create volume request issued successfully.
Copy

Cinder-Scheduler: The request is passed to the cinder-scheduler. This component makes a decision on where to store the volumes using filters like Capacity Filters, Availability zone filters and many other filters. More filters can be found here to decide which storage backend (e.g., Ceph, LVM) is the best place to create the volume based on size, type, and availability.
Volume Service Action: The scheduler sends the request to the pf9-cindervolume-base service responsible for the chosen backend. This service is the worker that uses a specific storage driver to command the backend.
Backend Provisioning: The storage backend (the actual storage system) receives the commands and provisions the physical or logical block device. Here, on the underlying Cinder hosts, the /var/log/pf9/cindervolume-base.log will show the requested raw volume specifications, which include Volume name, Volume UUID and Volume size.

Sample logs
    
 
INFO cinder.volume.flows.manager.create_volume [[REQ-ID] None service] Volume [VOLUME_UUID]: being created as raw with specification: {'status': 'creating', 'volume_name': 'volume-[VOLUME_UUID]', 'volume_size': 2}
Copy

Status Update: Once the backend confirms the volume is created, the pf9-cindervolume-base service sends the update request to the cinder database, changing the volume's status to available. Here, on the underlying Cinder hosts, the /var/log/pf9/cindervolume-base.log will show the final status that volume is created.

Sample logs
    
INFO cinder.volume.flows.manager.create_volume [[REQ-ID] None service] Volume volume-[VOLUME_UUID] ([VOLUME_UUID]): created successfullyINFO cinder.volume.manager [[REQ-ID] None service] Created volume successfully.
Copy

Attaching a Volume to VM Flow

This process is a collaboration, primarily between Nova (Compute) and Cinder (Block Storage).

User Request (via Nova): A user requests to attach an existing, available volume to a specific VM. This request goes to the nova-api-osapi service, not the Cinder API.

Sample logs
    
 
INFO nova.osapi_compute.wsgi.server [None [REQ-ID] [USER_ID] [TENANT_ID] - - default default] [IP],[IP] "POST /v2.1/[PROJECT_UUID]/servers/[VM_UUID]/os-volume_attachments HTTP/1.1" status: 200 len: 569 time: 0.8244848
Copy

Nova to Cinder Communication: The pf9-ostackhost service on the host where the VM is running calls the cinder-api to get the connection information for the volume. Once volume information is received it further attach the volume as shown in /var/log/pf9/ostackhost.log logs.

Sample logs
    
 
INFO nova.compute.manager [[REQ-ID] [USER_NAME] [TENANT_NAME]] [instance: [VM_UUID]] Attaching volume [VOLUME_UUID] to /dev/vdx
Copy

Cinder Prepares the Attachment: The cinder-api passes the request to the pf9-cindervolume-base service. Cinder performs necessary actions to "reserve" the volume and prepares it for attachment. It then generates the required connection details (e.g., the iSCSI target, Ceph RBD path). Once that is successful the /var/log/pf9/cindervolume-base.log logs will shows the attachment successful message. Volume status will be "reserved".

Sample logs
    
 
INFO cinder.volume.manager [[REQ-ID] None service] attachment_update completed successfully.INFO cinder.volume.manager [[REQ-ID] None service] Volume connection completed successfully.
Copy

Cinder Responds to Nova: Cinder sends these connection details back to nova-compute to the pf9-ostackhost service on the host.
Nova Makes the Connection: Once pf9-ostackhost receives the connection info, pf9-ostackhost service uses the host's operating system and hypervisor (e.g., QEMU/KVM) to connect the VM to the storage volume. Volume status will be "attaching".
Final Status Update: Once the connection is successful, pf9-ostackhost service informs Cinder, and Cinder updates the volume's status in its database to in-use and records which VM it's attached to.

Volume Deletion Flow

This process is a collaboration, primarily on Cinder (Block Storage).

User Request (via Nova): A user requests to delete an existing volume via the OpenStack CLI, Private Cloud Director dashboard, or direct API call. which validates the user's authentication token with Keystone, and performs a permission check, and changes the volume status in the database to deleting. This request DELETE /v3/{project_id}/volumes/{volume_id} goes to cinder-api service.
Further Validation: Cinder-service checks Volume state. if it is in available, error, error_restoring, error_extending then the Normal delete operation is performed. If the volume state is in-use (attached), then Normal delete will be rejected unless force delete option is used.

Sample logs
    
INFO cinder.api.openstack.wsgi [None [REQ-ID] [USER_NAME] [TENANT_NAME] - - default default] DELETE https:/[FQDN]/v3/[PROJECT_UUID]/volumes/[VOLUME_UUID]INFO cinder.api.v3.volumes [None [REQ-ID] [USER_NAME] [TENANT_NAME] - - default default] Delete volume with id: [VOLUME_UUID]INFO cinder.volume.api [None [REQ-ID] [USER_NAME] [TENANT_NAME] - - default default] Volume info retrieved successfully.INFO cinder.volume.api [None [REQ-ID] [USER_NAME] [TENANT_NAME] - - default default] Delete volume request issued successfully.INFO cinder.api.openstack.wsgi [None [REQ-ID] [USER_NAME] [TENANT_NAME] - - default default] https:/[FQDN]/v3/[PROJECT_UUID]/volumes/[VOLUME_UUID] returned with HTTP 202
Copy

Cinder Prepares for delete: The RPC request is routed to the the pf9-cindervolume-base service hosting the volume (no scheduler step needed for delete). Backend driver/manager attempts to terminate connections and detach (best-effort). If connector cleanup fails, delete may fail with error_deleting. Driver delete_volume() removes the LUN/target/extent from the storage backend. Further the /var/log/pf9/cindervolume-base.log show the volume device mapper is being deleted.

Sample logs
    
INFO cinder.volume.volume_utils [[REQ-ID] None service] Performing secure delete on volume: /dev/mapper/cinder--volumes-volume--[VOLUME_UUID]
Copy

Cinder Volume Deletion Confirmation: On successful backend delete, quotas for volumes and gigabytes are decremented and further the /var/log/pf9/cindervolume-base.log show the volume is successfully deleted.

Sample logs
    
 
INFO cinder.volume.drivers.lvm [[REQ-ID] None service] Successfully deleted volume: [VOLUME_UUID]
Copy

Final Status Update: Cinder service pf9-cindervolume-base sends the database update request to cinder DB.

Procedure

Ensure that openstack and cinder binaries are present on the system.

Check if all cinder volume hosts are enable and running,

Command
    
 
$ openstack volume service list
Copy

List all volumes and grep for the affected volumes and get volume details like hosts information, status, errors using below command:

Command
    
 
$ openstack volume list | grep -i "<Affected_Volume_Name_or_UUID"
Copy

Command
    
 
$ openstack volume show <VOLUME_UUID>
Copy

The management plane has a cinder-api & cinder-scheduler pod to provide the volume service. Check if the a cinder-api & cinder-scheduler pods are running in the workload region namespace. Review all these pods::

Step 3 is applicable only for Self-Hosted Private Cloud Director

- Check if they are in "CrashLoopBackOff/OOMkilled/Pending/Error/Init" state. - Also, verify if all containers in the pods are Running. - See the events section in pod describe output. - Review pods logs using REQ_ID or VM_UUID for relevant details.

Command
    
​x
 
$ kubectl get pods -o wide -n <WORKLOAD_REGION> | grep -i "cinder"​$ kubectl describe -n <WORKLOAD_REGION> <CINDER_API_POD>$ kubectl describe -n <WORKLOAD_REGION> <CINDER_SCHEDULER_POD>​$ kubectl logs -n <WORKLOAD_REGION> <CINDER_API_POD>$ kubectl logs -n <WORKLOAD_REGION> <CINDER_SCHEDULER_POD>
Copy

Once the underlying cinder host is identified review the pf9-cindervolume-base service status it should be up and running.

Command
    
 
$ sudo systemctl status pf9-cindervolume-base
Copy

Review the /var/log/pf9/cindervolume-base.log logs, check if there are any errors related to the Volume UUID.
If these steps prove insufficient to resolve the issue, kindly reach out to the Platform9 Support Team for additional assistance.

Most common causes

Volume Stuck in Creating / Deleting / Detaching State
Volume Attach Failure
Cinder Scheduler Can’t Place Volume
Incorrect storage backend configuration

Last updated on

Was this page helpful?