Troubleshooting Image Creation Issues
Problem
This guide provides step-by-step instructions for troubleshooting failures during the Image Creation phase in the PCD UI or CLI. This covers scenarios where the "Image Library Host" is reported as unreachable, the creation request is rejected due to metadata conflicts, or a "Remote URL" fetch fails to complete.
Environment
Private Cloud Director Virtualization - v2025.4 and Higher
Self-Hosted Private Cloud Director Virtualization - v2025.4 and Higher
Component: Image Library
Deep Dive: Image Creation & Registration Flow
The image creation process is managed by the Glance service. It follows a strictly orchestrated sequence to ensure that only valid, authorized data reaches the persistent storage backend.
1. User Request & Authentication
A user initiates an image creation via the PCD UI, CLI, or API. The API service receives the request and validates the user's authentication token with the identity service (Keystone).
To confirm this step, check the Glance logs on the host that holds the Image Library role.
API Log Entry (/var/log/pf9/glance-api.log):
INFO glance.api.v2.image_data [REQ-ID] - - - Use the existing user token.
2. Validation & Staging
The platform validates the requested metadata (Disk Format, Container Format) and confirms that the image's virtual size meets the requirements of the targeted project.
Virtual Size Check: The platform uses qemu-img info internally to calculate the actual footprint of the disk.
Staging: Data is initially placed in a temporary staging area (default: /var/lib/glance/os_glance_staging_store/) before being committed to long-term storage.
API Log Entry (/var/log/pf9/glance-api.log):
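The virtual size check can be reproduced by hand before an upload. The following is a minimal sketch, assuming qemu-img is installed on the machine that holds the image file; image_format_and_size is a hypothetical helper name, not a PCD command, and the file path in the example is a placeholder.

```shell
# Hypothetical helper: print the disk format and virtual size reported by qemu-img.
# Assumes the qemu-img binary is installed where the image file lives.
image_format_and_size() {
  image_file="$1"
  # The human-readable output includes "file format:" and "virtual size:" lines.
  qemu-img info "$image_file" | grep -E '^(file format|virtual size):'
}

# Example (placeholder path):
#   image_format_and_size ./my-image.qcow2
```

If the reported file format differs from the Disk Format supplied in the request, that mismatch is a likely reason for a rejection at this stage.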
3. Registry & Database Entry
The API service communicates with the registry to create a UUID for the image. At this point, the image status is set to queued: the record exists in the database but contains no data yet.
4. Storage Backend Hand-off
The Image Service moves the data from the staging area to the local disk or the persistent backend storage (e.g., a Ceph pool, or an NFS mount like /var/lib/glance/images) as per the configuration.
Audit Check: The audit logs record the final outcome of the storage request, including the Username and UUID.
Audit Log Entry (/var/log/pf9/glance-audit.log):
5. Status Finalization
Once the backend confirms the write is complete and the checksum is verified, the status in the database is updated from queued to active. The image is now bootable.
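To double-check an active image against its source file, the recorded checksum can be compared with a locally computed one. This is a sketch under stated assumptions: the OpenStack CLI is configured for the project, the image was stored without conversion (so the checksum field, an MD5 digest, matches the source file), and md5sum is available. image_checksum_matches is a hypothetical helper name.

```shell
# Hypothetical helper: compare a local file's MD5 with the checksum recorded for an image.
# Assumes a configured OpenStack CLI and an image that was stored unmodified.
image_checksum_matches() {
  image_id="$1"
  local_file="$2"
  # Read only the checksum column of the image record.
  recorded=$(openstack image show -f value -c checksum "$image_id") || return 2
  # md5sum prints "<digest>  <file>"; keep the digest only.
  computed=$(md5sum "$local_file" | cut -d ' ' -f 1) || return 2
  if [ -n "$recorded" ] && [ "$recorded" = "$computed" ]; then
    echo "match: $computed"
  else
    echo "mismatch: recorded=$recorded computed=$computed"
    return 1
  fi
}

# Example (placeholders):
#   image_checksum_matches <IMAGE_ID> ./my-image.qcow2
```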
Procedure
1. Verify Service and Host Connectivity
Confirm the Image Services are healthy via both the PCD UI and the Library Host CLI.
Option A: PCD UI (Quick Check)
Navigate to the Service Health section in the PCD Dashboard.
Check the health status of the Image Library services.
Option B: CLI (Detailed Check)
pf9-glance-api (Inactive): The platform cannot create, upload, or fetch images.
pf9-glance-scrubber (Inactive): Deleted images will not be removed from the physical disk, leading to hidden storage exhaustion.
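A minimal sketch of the CLI check on the Image Library host, assuming both services run as systemd units under the names used above. check_image_services is a hypothetical helper name, not a PCD command.

```shell
# Hypothetical helper: report the systemd state of the Image Library services.
# Assumes the services are systemd units named as in this guide.
check_image_services() {
  rc=0
  for unit in pf9-glance-api pf9-glance-scrubber; do
    # "systemctl is-active" prints the unit state (active, inactive, failed, ...).
    state=$(systemctl is-active "$unit" 2>/dev/null) || true
    [ -n "$state" ] || state="unknown"
    echo "$unit: $state"
    # Remember that at least one service is not active.
    [ "$state" = "active" ] || rc=1
  done
  return "$rc"
}

# Example:
#   check_image_services || echo "at least one Image Library service is not active"
```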
2. Triage: Identify the Failure Point
Determine if the failure happened during Authentication, Validation, or the Physical Write.
Status queued: The metadata was accepted, but the data transfer never started or was interrupted.
Status killed/error: The data transfer started but failed (likely storage or network timeout).
No record found: The API rejected the initial request (likely quota or invalid format).
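The triage rules above can be sketched as a small lookup. triage_image_status is a hypothetical helper; the example in the comment assumes the OpenStack CLI is configured for the project, in which case an image that does not exist yields an empty status and falls into the "no record" branch.

```shell
# Hypothetical helper: map an image status to the stage that most likely failed.
triage_image_status() {
  case "$1" in
    queued)
      echo "queued: metadata accepted, data transfer never started or was interrupted" ;;
    killed|error)
      echo "$1: data transfer started but failed, check storage and network" ;;
    active)
      echo "active: image creation completed" ;;
    "")
      echo "no record: the API rejected the request, check quota and image format" ;;
    *)
      echo "$1: status not covered by this guide" ;;
  esac
}

# Example (assumes a configured OpenStack CLI; <IMAGE_ID> is a placeholder):
#   triage_image_status "$(openstack image show -f value -c status <IMAGE_ID> 2>/dev/null)"
```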
3. Interrogate the Image Library Host Logs
Use the Request ID (REQ-ID) to find the exact line of failure in the local logs.
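A minimal sketch of that search, assuming the default API log path named earlier in this guide. find_request is a hypothetical helper name; pass a different file as the second argument to search another log.

```shell
# Hypothetical helper: print every log line that mentions one request ID.
# Usage: find_request <REQ-ID> [log file]
find_request() {
  req_id="$1"
  log_file="${2:-/var/log/pf9/glance-api.log}"
  # -F treats the ID as a literal string; -n prefixes each hit with its line number.
  grep -F -n -- "$req_id" "$log_file"
}

# Example (<REQ-ID> is a placeholder):
#   find_request <REQ-ID>
#   find_request <REQ-ID> | grep -E 'ERROR|WARNING'
```

Reading the matching lines in order usually shows the last step that succeeded before the failure.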
4. Verify Backend Storage Health
Ensure the physical storage destination has capacity and is writable.
The <IMAGE_STORAGE_PATH> can be obtained from the cluster blueprint under the Image Library section in the cluster details.
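A minimal sketch of a capacity and write check for that path. check_storage_path is a hypothetical helper; it creates and removes a small probe file, so run it as the account the image service uses if you want the result to reflect the service's own permissions (an assumption about your setup, not something PCD does automatically).

```shell
# Hypothetical helper: show free space for a path and confirm it is writable.
check_storage_path() {
  path="$1"
  # -P keeps each file system on one line; -h prints human-readable sizes.
  df -P -h "$path" || return 1
  # Create and remove a small probe file to confirm write access.
  probe="$path/.pf9_write_probe.$$"
  if touch "$probe" 2>/dev/null; then
    rm -f "$probe"
    echo "writable: $path"
  else
    echo "NOT writable: $path"
    return 1
  fi
}

# Example:
#   check_storage_path <IMAGE_STORAGE_PATH>
#   check_storage_path /var/lib/glance/os_glance_staging_store
```

The same check applies to the staging directory, since a full staging area fails uploads before they reach final storage.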
5. Check Project Quotas
Confirm the project's quota leaves room for the image, regardless of how much physical disk space is free.
Note: Look for images (count) and gigabytes (total capacity).
Most Common Causes
Glance Service Down: The pf9-glance-api service has crashed, as evidenced by a "Critical" status in the UI Service Health section.
Staging Area Full: The /var/lib/glance/os_glance_staging_store directory is 100% full, causing uploads to fail before they can be moved to final storage.
Scrubber Failure: The pf9-glance-scrubber is down, causing "Ghost" storage usage where deleted images still occupy disk space.
Backend Storage Exhaustion: The persistent storage (NFS, Ceph, etc.) is full, preventing the transition from queued to active.
