Persistent Storage Volume Creation Fails With VolumeSizeExceedsAvailableQuota (HTTP 413)

Problem

  • A volume operation (creating a new volume, expanding an existing one, or migrating a VM with a tool such as vJailbreak) fails with an HTTP 413 response from the Cinder API.

cinder-api Logs
WARNING cinder.quota_utils [None [REQ_ID] [USER_ID] [TENANT_ID] - - default default] Quota exceeded for [TENANT_ID], tried to create [SIZE_IN_Gi] volume ([USAGE_SIZE] of [TOTAL_SIZE] already consumed).: cinder.exception.OverQuota: Quota exceeded for resources: ['gigabytes']

In SaaS environments, the above log verification is handled by Platform9 on the backend. Contact Platform9 Support for verification details if needed.

  • Via vJailbreak migration:

vJailbreak agent Logs
Failed to migrate VM: failed to add volumes to host: failed to create volume: failed to create volume: Expected HTTP response code [202] when accessing [POST https://[FQDN]/cinder/v3/[TENANT_UUID]/volumes], but got 413 instead: {"overLimit": {"code": 413, "message": "VolumeSizeExceedsAvailableQuota: Requested volume or snapshot exceeds allowed gigabytes quota. Requested [SIZE], quota is [TOTAL_SIZE] and [USAGE_SIZE] has been consumed.", "retryAfter": "0"}}. 
  • Disk utilization reported by df on the persistent storage host shows plenty of free space, yet volume creation still fails with the errors above.

Environment

  • Private Cloud Director Virtualization - v2025.6 and Higher

  • Self-Hosted Private Cloud Director Virtualization - v2025.6 and Higher

  • Component - Quota

Cause

The error is a logical (quota) rejection from Persistent Storage, not a physical capacity problem. Persistent Storage Service tracks two independent quota tiers per tenant, and the request is rejected if either is exceeded:

  1. Tenant-wide gigabytes quota: the total volume capacity allowed across all backends in the project.

  2. Per-backend gigabytes_<BACKEND_NAME> quota: a separate cap for each Persistent Storage backend.

The physical capacity reported by df on the storage host reflects the backend size. Persistent Storage Service does not consult df when admitting a new volume; it only compares the requested size against the tenant's logical quota counters. Expanding the underlying storage (LUN, volume group, or any storage provider volume) increases what the backend can hold but does not raise the Persistent Storage tenant quota. The two must be adjusted independently.
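The admission rule described above can be sketched as follows. This is an illustrative model only, not the service's actual code: the names `QuotaRow` and `exceeded_tiers` and the backend name `ssd_pool` are hypothetical, and the sketch assumes that reserved gigabytes count toward consumption and that a limit of -1 means unlimited.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QuotaRow:
    limit: int        # -1 is treated as unlimited (assumption)
    in_use: int       # gigabytes already consumed
    reserved: int = 0  # gigabytes held by in-flight requests


def exceeded_tiers(requested_gb: int, quotas: Dict[str, QuotaRow], backend: str) -> List[str]:
    """Return the quota keys that would reject the request (empty list means admitted)."""
    exceeded = []
    # Both tiers are checked; physical free space is never an input.
    for key in ("gigabytes", f"gigabytes_{backend}"):
        row = quotas.get(key)
        if row is None or row.limit < 0:
            continue  # no row or unlimited: this tier cannot block
        if row.in_use + row.reserved + requested_gb > row.limit:
            exceeded.append(key)
    return exceeded


# Hypothetical tenant: global cap nearly full, backend cap unlimited.
quotas = {
    "gigabytes": QuotaRow(limit=10000, in_use=9995),
    "gigabytes_ssd_pool": QuotaRow(limit=-1, in_use=5000),
}
print(exceeded_tiers(10, quotas, "ssd_pool"))  # ['gigabytes']
print(exceeded_tiers(5, quotas, "ssd_pool"))   # []
```

Note that free space on the storage host does not appear anywhere in the function, which is why a host with plenty of room can still return a 413.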

Diagnostics

  1. Review the quota and current usage side-by-side for the reported tenant:

Output columns are Limit, In Use, and Reserved. Compare:

  • If the project-wide gigabytes row is close to or at its limit, the global cap is the constraint.

  • If a specific gigabytes_<backend> row is at its limit, that backend's per-backend cap is the constraint.

  • A non-zero Reserved column indicates in-flight allocations that may be stuck.

How to read the example output and the limits it shows:

  • The first row, gigabytes, is the tenant-wide aggregate cap across all Persistent Storage backends. Limit is 10,000 GB and 9,995 GB is in use, leaving only 5 GB of headroom. Any request larger than 5 GB will be rejected with VolumeSizeExceedsAvailableQuota.

  • Every gigabytes_<backend> row is a per-backend cap. All show -1 (unlimited), so none of them is blocking the request. If one of them had a finite Limit and had reached it, that backend's cap would be the bottleneck instead, even if the global row had plenty of headroom. Either tier can trip the 413, so check both.

  • Counter sanity check: the per-backend In Use values should sum to the global In Use. Here, 5000 + 3000 + 1500 + 495 = 9995, which matches the global row exactly.

  • If the sum did not match the global In Use, the counters have drifted from reality. A cinder-manage quota check/sync (admin operation) is needed before raising the cap; otherwise the new limit will be measured against incorrect usage data.

  • Reserved is 0 on every row, so no in-flight allocations are stuck. A non-zero value here indicates volumes mid-creation that may be hung. Investigate these before raising the quota, since they may eventually succeed or fail and shift In Use in either direction.

  • Backends showing 0 in use (dr_test, __DEFAULT__ in the example) are configured but unused.
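The checks in the list above (headroom, counter sum, reservations) can be expressed as a short sketch. The figures are the ones quoted in the example; the backend names `backend_a` through `backend_d` are placeholders because the real names are not given, and `summarize` is a hypothetical helper, not part of any CLI.

```python
from typing import Dict, Tuple

# resource name -> (limit, in_use, reserved); -1 means unlimited
rows: Dict[str, Tuple[int, int, int]] = {
    "gigabytes": (10000, 9995, 0),
    "gigabytes_backend_a": (-1, 5000, 0),  # placeholder backend names
    "gigabytes_backend_b": (-1, 3000, 0),
    "gigabytes_backend_c": (-1, 1500, 0),
    "gigabytes_backend_d": (-1, 495, 0),
    "gigabytes_dr_test": (-1, 0, 0),
    "gigabytes___DEFAULT__": (-1, 0, 0),
}


def summarize(rows: Dict[str, Tuple[int, int, int]]) -> dict:
    """Compute global headroom and compare per-backend usage with the global row."""
    limit, in_use, reserved = rows["gigabytes"]
    # Every key with the "gigabytes_" prefix is a per-backend row.
    per_backend = sum(v[1] for k, v in rows.items() if k.startswith("gigabytes_"))
    return {
        "headroom_gb": None if limit < 0 else limit - in_use - reserved,
        "per_backend_in_use": per_backend,
        "counters_match": per_backend == in_use,
        "rows_with_reservations": sorted(k for k, v in rows.items() if v[2] != 0),
    }


result = summarize(rows)
print(result["headroom_gb"])         # 5
print(result["per_backend_in_use"])  # 9995
print(result["counters_match"])      # True
```

A `counters_match` value of False would point to drifted counters rather than a genuinely low cap.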

  2. Confirm that the request would have landed on the expected backend.

Check the volume type used by the failing request, and which backend it maps to:

Look at the properties for volume_backend_name. The corresponding per-backend quota is gigabytes_<volume_backend_name>.

  3. Check for orphaned volumes that may be inflating in_use.

Volumes in error or error_deleting state, and unattached available volumes left over from failed operations, all count against the quota.
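As a rough sketch of the filter described above, the following picks out volumes that match those criteria from an already-retrieved volume list. The record layout (`id`, `status`, `size`, `attachments`) and the sample volumes are assumptions for illustration. An unattached available volume is only a candidate; it may be intentional, so each one should be reviewed before deletion.

```python
from typing import Dict, List


def orphan_candidates(volumes: List[Dict]) -> List[Dict]:
    """Select volumes in error or error_deleting, plus unattached available volumes."""
    picked = []
    for vol in volumes:
        status = vol["status"]
        unattached = not vol.get("attachments")
        if status in ("error", "error_deleting") or (status == "available" and unattached):
            picked.append(vol)
    return picked


# Hypothetical volume records for one tenant.
volumes = [
    {"id": "vol-1", "status": "in-use", "size": 100, "attachments": [{"server_id": "vm-1"}]},
    {"id": "vol-2", "status": "error", "size": 50, "attachments": []},
    {"id": "vol-3", "status": "available", "size": 200, "attachments": []},
    {"id": "vol-4", "status": "error_deleting", "size": 20, "attachments": []},
]

candidates = orphan_candidates(volumes)
print([v["id"] for v in candidates])       # ['vol-2', 'vol-3', 'vol-4']
print(sum(v["size"] for v in candidates))  # 270 (GB counted against the quota)
```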

Resolution

Pick the path that matches the diagnosis:

  • If the counters are consistent and the cap is genuinely too low:

    • Raise the quota that is constraining the request. To raise the project-wide cap:

    • Set the new value above the committed physical capacity for the backend(s), with headroom for growth.

Use -1 to remove the limit on a specific backend.

  • If orphaned volumes are inflating usage:

    • Delete leftover volumes from prior failed operations:

    • For volumes stuck in error or error_deleting, reset state first:

    • After cleanup, re-run the quota show --usage command to confirm the in_use figure dropped.

  • If counters have drifted from reality:

    • If the per-backend in_use totals don't match the global gigabytes in_use, the cached quota counters have diverged from the actual volume records and must be reconciled before raising the cap.

    • For Self-Hosted users, the above command can also be executed from the cinder-api pod on the control plane.

    • The first command, check, scans the tenant's volume records, compares them against the cached quota counters, and reports any discrepancies. It does not modify anything.

    • The second command, sync, recomputes usage from the actual volume records and rewrites the cached In Use and Reserved values to match what is present in the volume tables.

    • After sync completes, re-run the diagnostic from step 1 of the Diagnostics section; the per-backend In Use values should now sum to the global In Use. Only then is it safe to raise the cap with the quota set command.
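The ordering of the last path (check, sync, re-verify, then raise the cap) can be sketched as a small planner that only assembles command lines and does not execute them. The helper name `resolution_commands` and the values `TENANT_ID` and 12000 are hypothetical. The command forms shown are the upstream `cinder-manage quota check`/`sync` and `openstack quota show`/`set` invocations and are assumed to match this environment; verify the flags against the installed client versions before running anything.

```python
from typing import List


def resolution_commands(project: str, counters_match: bool, new_limit_gb: int) -> List[List[str]]:
    """Return CLI invocations in a safe order: reconcile drifted counters before raising the cap."""
    commands: List[List[str]] = []
    if not counters_match:
        # check is read-only; sync rewrites the cached counters.
        commands.append(["cinder-manage", "quota", "check", "--project-id", project])
        commands.append(["cinder-manage", "quota", "sync", "--project-id", project])
        # Re-verify usage before changing any limit.
        commands.append(["openstack", "quota", "show", "--usage", project])
    # Raise the project-wide gigabytes cap last.
    commands.append(["openstack", "quota", "set", "--gigabytes", str(new_limit_gb), project])
    return commands


for cmd in resolution_commands("TENANT_ID", counters_match=False, new_limit_gb=12000):
    print(" ".join(cmd))
```

When the counters already match, the planner returns only the final quota set step, which corresponds to the first resolution path.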
