Multiple Instance Creation Failed Due to Max Scheduling Attempts Exceeded for Cinder Volume
Problem
When multiple VM instances are created at once, some instances deploy successfully while others fail with the following error reported by the Cinder scheduler:
Failed to run task cinder.scheduler.flows.create_volume.ScheduleCreateVolumeTask;volume:create: No valid backend was found. Exceeded max scheduling attempts 3 for resource <VOLUME_UUID>: cinder.exception.NoValidBackend: No valid backend was found. Exceeded max scheduling attempts 3 for resource <VOLUME_UUID>
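The same error can be confirmed from the cinder-scheduler logs on the Management Plane. A minimal sketch, assuming a single scheduler pod whose name starts with cinder-scheduler runs in the region namespace:

# Locate the cinder-scheduler pod and search its logs for the scheduling error.
$ SCHED_POD=$(kubectl get pods -n <REGION_NAMESPACE> -o name | grep cinder-scheduler)
$ kubectl logs -n <REGION_NAMESPACE> "$SCHED_POD" | grep -i "Exceeded max scheduling attempts"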
Environment
- Self-Hosted Private Cloud Director Virtualisation – v2025.4
- Private Cloud Director Virtualisation – v2025.4
Cause
The Cinder scheduler becomes overloaded when multiple concurrent volume creation requests arrive for the backend. The configuration changes described below are part of the product starting with the PCD June release.
Workaround
The resolution for this issue is a two-part process: PART-1 makes changes on the Management Plane, and PART-2 makes changes on the hosts.
SaaS customers should reach out to the Platform9 Support team to implement PART-1 of the resolution.
PART-1
- Cinder pods use cinder.conf from the cinder-etc secret, so the cinder-etc secret needs to be updated. Verify that the secret is available in the namespace:
$ kubectl get secrets -n <REGION_NAMESPACE> | grep -i cinder-etc
- Take a backup of the secret:
$ kubectl get secrets -n <REGION_NAMESPACE> cinder-etc -o yaml > cinder-etc.secret.bk
- Extract the cinder.conf contents from the secret:
$ kubectl get secrets -n <REGION_NAMESPACE> cinder-etc -o json | jq -r '.data."cinder.conf"' | base64 -d > cinder.conf
- Open the cinder.conf file in a file editor and make the below changes in the [DEFAULT] section, as shown below:
[DEFAULT]
scheduler_max_attempts = 10
osapi_volume_workers = 8
service_down_time = 180
report_interval = 60
It is not recommended to increase scheduler_max_attempts beyond 10, because factors such as storage backend network latency and storage IOPS make additional retries unlikely to help.
- Save cinder.conf and encode the file using base64:
$ cat cinder.conf | base64 -w0
- Copy the encoded value from the above command, then edit the secret and replace the old cinder.conf content with the new encoded value:
$ kubectl edit secrets -n <REGION_NAMESPACE> cinder-etc
- Save the secret and verify that the new cinder.conf values are reflected:
$ kubectl get secrets -n <REGION_NAMESPACE> cinder-etc -o json | jq -r '.data."cinder.conf"' | base64 -d
- Now restart the cinder-api and cinder-scheduler pods so that they start using the updated cinder.conf file; a scripted alternative for this step and the secret update is sketched after the pod listing below:
$ kubectl get pods -n <REGION_NAMESPACE> | grep -i cinder
cinder-api-xxxxx-xxxxx           2/2   Running   0   10s
cinder-scheduler-xxxxxxx-xxxxx   1/1   Running   0   10s
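As an alternative to the manual edit-and-encode steps above, the secret update and pod restart can be scripted end to end. A minimal sketch, assuming the edited cinder.conf is in the current directory and that cinder-api and cinder-scheduler are the Deployment names (verify the actual resource names in your environment before running it):

# Re-encode the edited cinder.conf and patch it into the cinder-etc secret.
$ kubectl patch secret cinder-etc -n <REGION_NAMESPACE> --type merge \
    -p "{\"data\":{\"cinder.conf\":\"$(base64 -w0 cinder.conf)\"}}"
# Restart the pods so they pick up the new configuration, then verify.
$ kubectl rollout restart deployment/cinder-api deployment/cinder-scheduler -n <REGION_NAMESPACE>
$ kubectl get pods -n <REGION_NAMESPACE> | grep -i cinder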
PART-2
- On every host that has the Persistent Storage role, open the /opt/pf9/etc/pf9-cindervolume-base/conf.d/cinder.conf file in a file editor and make the below changes in the [DEFAULT] section, as shown below:
[DEFAULT]
scheduler_max_attempts = 10
osapi_volume_workers = 8
service_down_time = 180
report_interval = 60
- Restart the below service on every host that has the Persistent Storage role; a non-interactive sketch of these host changes follows this section:
$ systemctl restart pf9-cindervolume-base.service
The changes made will not persist through an upgrade. Therefore, it is important to reapply these steps immediately after the upgrade to ensure continued functionality.
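The host-side changes can also be applied non-interactively. A minimal sketch, assuming the crudini utility is available on the host (if it is not, edit the file manually as described above):

# Set the required options in the [DEFAULT] section of the host cinder.conf.
$ CONF=/opt/pf9/etc/pf9-cindervolume-base/conf.d/cinder.conf
$ sudo crudini --set "$CONF" DEFAULT scheduler_max_attempts 10
$ sudo crudini --set "$CONF" DEFAULT osapi_volume_workers 8
$ sudo crudini --set "$CONF" DEFAULT service_down_time 180
$ sudo crudini --set "$CONF" DEFAULT report_interval 60
# Restart the Cinder volume service so the changes take effect.
$ sudo systemctl restart pf9-cindervolume-base.service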
Additional Information
Execute these commands carefully, ensuring no unintended characters are added to cinder.conf. If the cinder.conf file is corrupted, restore the cinder-etc secret from the backup file and restart the cinder-api and cinder-scheduler pods.
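A minimal sketch of the restore, using the backup taken in PART-1 (the --force flag deletes and recreates the secret; the Deployment names are assumptions, as noted above):

# Recreate the cinder-etc secret from the backup file.
$ kubectl replace --force -f cinder-etc.secret.bk
# Restart the pods so they load the restored configuration.
$ kubectl rollout restart deployment/cinder-api deployment/cinder-scheduler -n <REGION_NAMESPACE>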