Troubleshooting Image Creation Issues

Problem

This guide provides step-by-step instructions for troubleshooting failures during the Image Creation phase in the PCD UI or CLI. It covers scenarios where the "Image Library Host" is reported as unreachable, the creation request is rejected because of metadata conflicts, or a "Remote URL" fetch fails to complete.

Environment

  • Private Cloud Director Virtualization - v2025.4 and Higher

  • Self-Hosted Private Cloud Director Virtualization - v2025.4 and Higher

  • Component: Image Library

Deep Dive: Image Creation & Registration Flow

The image creation process is managed by the Glance service. It follows a strictly orchestrated sequence to ensure that only valid, authorized data reaches the persistent storage backend.

1. User Request & Authentication

Note: Check the Glance logs on the host that has the Image Library role.

A user initiates an image creation via the PCD UI, CLI, or API. The API service receives the request and validates the user’s authentication token with the identity service (Keystone).

  • API Log Entry (/var/log/pf9/glance-api.log):

    INFO glance.api.v2.image_data [REQ-ID] - - - Use the existing user token.

2. Validation & Staging

The platform validates the requested metadata (Disk Format, Container Format) and confirms that the image's virtual size meets the requirements of the targeted project.

  • Virtual Size Check: The platform uses qemu-img info internally to determine the virtual size of the disk, that is, the full capacity the image expands to, which can be larger than the image file itself.

  • Staging: Data is initially placed in a temporary staging area (default: /var/lib/glance/os_glance_staging_store/) before being committed to long-term storage.

  • API Log Entry: Validation and staging messages are written to /var/log/pf9/glance-api.log.
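To reproduce the virtual size check by hand before an upload, the output of qemu-img info can be inspected directly. The sketch below assumes qemu-img is installed on the machine that holds the source file. The helper name virtual_size_bytes is illustrative and not part of PCD.

```shell
# Hypothetical helper: extract the top-level virtual size (in bytes) from
# the human-readable output of `qemu-img info`, read on stdin.
# The match is anchored at the start of the line, so any indented nested
# sections in the output are ignored.
virtual_size_bytes() {
    sed -n 's/^virtual size:.*(\([0-9][0-9]*\) bytes).*/\1/p' | head -n 1
}
```

For example, qemu-img info /path/to/image.qcow2 | virtual_size_bytes prints the virtual size in bytes, which can then be compared against the limits of the target project. The file path here is a placeholder.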

3. Registry & Database Entry

The API service communicates with the registry to create a unique UUID for the image. At this point, the image status is set to queued. It exists in the database but contains no data yet.

4. Storage Backend Hand-off

The Image Service moves the data from the staging area to the local disk or the persistent backend storage (e.g., a Ceph pool, or an NFS mount like /var/lib/glance/images) as per the configuration.

  • Audit Check: The audit log (/var/log/pf9/glance-audit.log) records the final outcome of the storage request, including the username and the image UUID.

5. Status Finalization

Once the backend confirms the write is complete and the checksum is verified, the status in the database is updated from queued to active. The image is now bootable.

Procedure

1. Verify Service and Host Connectivity

Confirm the Image Services are healthy via both the PCD UI and the Library Host CLI.

Option A: PCD UI (Quick Check)

  • Navigate to the Service Health section in the PCD Dashboard.

  • Check the health status of the Image Library services.

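Option B: CLI (Detailed Check)

On the Image Library host, check the state of the following services. If either is inactive, the impact is as follows: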

  • pf9-glance-api (Inactive): The platform cannot create, upload, or fetch images.

  • pf9-glance-scrubber (Inactive): Deleted images will not be removed from the physical disk, leading to hidden storage exhaustion.
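For Option B, one way to read the state of both services is systemctl is-active. This is a minimal sketch that assumes the two services run as systemd units under exactly the names listed above. The function name check_image_services is illustrative.

```shell
# Report the systemd state of each Image Library service named above.
# Assumption: both run as systemd units under these exact names.
check_image_services() {
    for svc in pf9-glance-api pf9-glance-scrubber; do
        # `systemctl is-active` prints the unit state (active, inactive,
        # failed, ...) and exits non-zero when the unit is not active.
        state=$(systemctl is-active "$svc" 2>/dev/null) || true
        printf '%s: %s\n' "$svc" "${state:-unknown}"
    done
}
```

Run check_image_services on the Image Library host. For any unit that is not reported as active, systemctl status followed by the unit name shows its most recent log lines.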

2. Triage: Identify the Failure Point

Determine whether the failure happened during authentication, validation, or the physical write by checking the current status of the image:

  • Status queued: The metadata was accepted, but the data transfer never started or was interrupted.

  • Status killed/error: The data transfer started but failed (likely storage or network timeout).

  • No record found: The API rejected the initial request (likely quota or invalid format).
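The triage list above can be expressed as a small lookup. This sketch assumes the status has already been read, for example with openstack image show <IMAGE_UUID> -f value -c status if the OpenStack CLI is configured for the environment, which this guide does not confirm. The function name triage_image_status is illustrative.

```shell
# Map an image status to the likely failure point, following the triage
# list above. An empty argument stands for "no record found".
triage_image_status() {
    case "$1" in
        queued)       echo "metadata accepted; data transfer never started or was interrupted" ;;
        killed|error) echo "data transfer started but failed; check storage and network timeouts" ;;
        active)       echo "image created successfully" ;;
        "")           echo "no record; API rejected the request, check quota and image format" ;;
        *)            echo "status not covered by this triage list: $1" ;;
    esac
}
```

For example, triage_image_status queued points at an upload that never started or was interrupted, so the next step is the API log on the Image Library host.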

3. Interrogate the Image Library Host Logs

Use the Request ID (REQ-ID) to find the exact line of failure in the local logs.
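A minimal sketch of that search, using the two log paths named earlier in this guide. Whether the audit log carries the same request ID as the API log is an assumption, and the function name find_request_lines is illustrative.

```shell
# Search the Image Library logs for every line carrying a given request ID.
# Defaults to the two log files named in this guide; other files can be
# passed as extra arguments.
find_request_lines() {
    req_id=$1
    shift
    if [ "$#" -eq 0 ]; then
        set -- /var/log/pf9/glance-api.log /var/log/pf9/glance-audit.log
    fi
    # -F: match the ID literally; -H/-n: prefix hits with file name and line number.
    grep -FHn -- "$req_id" "$@"
}
```

Run find_request_lines <REQ-ID> on the Image Library host. The output lists each matching line with its file and line number, so the first failing entry for the request can be located in sequence.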

4. Verify Backend Storage Health

Ensure the physical storage destination has capacity and is writable.

Note: The <IMAGE_STORAGE_PATH> can be obtained from the cluster blueprint under the Image Library section in the cluster details.
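Capacity and writability can be checked together. The sketch below reports the usage of the filesystem behind a path and then creates and removes a small probe file in it. The function name check_storage_path and the probe file name are illustrative.

```shell
# Print usage for the filesystem holding a path and probe that it is writable.
# The probe creates and removes a small temporary file, which also catches
# read-only or stale mounts that a plain permission check can miss.
check_storage_path() {
    path=$1
    if [ ! -d "$path" ]; then
        echo "$path: not a directory or not mounted"
        return 1
    fi
    # -P forces one output line per filesystem; column 5 is the use percentage.
    used=$(df -P "$path" | awk 'NR == 2 { print $5 }')
    if probe=$(mktemp "$path/.write_probe.XXXXXX" 2>/dev/null); then
        rm -f "$probe"
        writable=yes
    else
        writable=no
    fi
    echo "$path: used=$used writable=$writable"
}
```

Run it as check_storage_path <IMAGE_STORAGE_PATH>, and again against the staging directory (default: /var/lib/glance/os_glance_staging_store), since a full staging area blocks uploads before they reach final storage.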

5. Check Project Quotas

Confirm that the project's quota leaves room for the image. Quotas are administrative limits and apply regardless of how much physical disk space is free.

  • Note: Check both images (count) and gigabytes (total capacity).
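Once the limit and current usage are known, the headroom check is simple arithmetic. How those numbers are obtained depends on the deployment: openstack quota show <PROJECT> is one possibility if the OpenStack CLI is configured, which this guide does not confirm. The helper name quota_fits is illustrative.

```shell
# Hypothetical helper: report whether a request fits within a quota.
# Arguments: limit, current usage, requested amount (same unit for all three).
# A limit of -1 is treated as unlimited, following the OpenStack convention.
quota_fits() {
    limit=$1 used=$2 requested=$3
    if [ "$limit" -lt 0 ]; then
        echo "unlimited"
    elif [ $((used + requested)) -le "$limit" ]; then
        echo "fits: $((limit - used - requested)) left after this image"
    else
        echo "exceeds quota by $((used + requested - limit))"
    fi
}
```

For example, quota_fits 100 90 20 reports that a 20 GB image exceeds a 100 GB quota by 10 when 90 GB is already in use. The same call works for the image count by passing 1 as the requested amount.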

Most Common Causes

  • Glance Service Down: The pf9-glance-api service has crashed, as evidenced by a "Critical" status in the UI Service Health section.

  • Staging Area Full: The /var/lib/glance/os_glance_staging_store directory is 100% full, causing uploads to fail before they can be moved to final storage.

  • Scrubber Failure: The pf9-glance-scrubber is down, causing "Ghost" storage usage where deleted images still occupy disk space.

  • Backend Storage Exhaustion: The persistent storage (NFS, Ceph, etc.) is full, preventing the transition from queued to active.
