[Outdated] Troubleshooting Heat Stack Issues

Problem

This guide provides step-by-step instructions for troubleshooting and resolving stack issues in Private Cloud Director.

Environment

  • Private Cloud Director Virtualization – v2025.4 and Higher

  • Self-Hosted Private Cloud Director Virtualization – v2025.4 and Higher

Procedure

When troubleshooting stack issues, follow these steps:

1. Identify the Stack Status

$ openstack stack list

Look for statuses like CREATE_IN_PROGRESS, CREATE_FAILED, or ROLLBACK_IN_PROGRESS.

2. Get Stack Information

$ openstack stack show <stack_name or id>

Review stack parameters, outputs, and overall status.

Parameters: Ensure required inputs like image name, flavor, or network ID are correct and exist.

Outputs: Confirm expected outputs (like IP addresses or resource IDs) are present. Missing outputs may indicate failed resource creation.
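
To narrow the review to the fields mentioned above, a small wrapper can help. This is a minimal sketch, assuming the standard `openstack` CLI with the Heat (python-heatclient) plugin; the function name `stack_summary` is hypothetical.

```shell
# Hypothetical helper: show status, status reason, and parameters, then list outputs.
# Usage: stack_summary <stack_name_or_id>
stack_summary() {
  # -c limits the displayed fields to the ones of interest.
  openstack stack show "$1" -c stack_status -c stack_status_reason -c parameters
  # Lists the output keys defined by the template.
  openstack stack output list "$1"
}
```

An empty output list on a stack that defines outputs usually points back to a resource that was never created.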

3. Check Stack Events for Failures
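
No command is shown for this step, so the following is a minimal sketch rather than the original procedure. It assumes the standard `openstack` CLI with the Heat (python-heatclient) plugin and its default log-style event output; the function name `failed_events` and the nesting depth of 2 are illustrative choices.

```shell
# Hypothetical helper: print only the failure events for a stack.
# Usage: failed_events <stack_name_or_id>
failed_events() {
  # --nested-depth also returns events from nested stacks (2 is an arbitrary depth).
  openstack stack event list --nested-depth 2 "$1" | grep '_FAILED'
}
```

The reason text on the first failed event is usually the most useful, because later failures are often consequences of it.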

4. Inspect Individual Resource Status

Identify which resource caused the failure. Find out if any resource is stuck in CREATE_IN_PROGRESS or CREATE_FAILED.
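
One way to find such resources is to list everything that has not reached a `*_COMPLETE` state. This is a sketch, assuming the standard `openstack` CLI with the Heat plugin; the function name `stuck_resources` is hypothetical.

```shell
# Hypothetical helper: list resources that are not in a *_COMPLETE state.
# Usage: stuck_resources <stack_name_or_id>
stuck_resources() {
  # -f value prints plain rows; -c keeps only the name, type, and status columns.
  openstack stack resource list --nested-depth 2 "$1" \
    -f value -c resource_name -c resource_type -c resource_status \
    | grep -v '_COMPLETE'
}
```

For any resource this prints, `openstack stack resource show <stack> <resource>` returns the full record, including the status reason.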

5. Check Where the Stack Failed

Identify the exact resource(s) and reason(s) for failure during stack creation.
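
The Heat plugin for the `openstack` CLI provides a command that summarizes failed resources together with their reasons. The wrapper below is a sketch under that assumption; the function name `show_failures` is hypothetical.

```shell
# Hypothetical helper: show each failed resource with its failure reason.
# Usage: show_failures <stack_name_or_id>
show_failures() {
  # --long adds more detail for each failed resource.
  openstack stack failures list --long "$1"
}
```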

6. Check Heat Component Pod Status and Logs

Info: This step is applicable only for self-hosted PCD environments.

Search the Heat pod logs for errors related to the stack ID or resource ID, and look for stack tracebacks or API-related errors.
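
The exact namespace and pod names depend on the deployment and are not given here, so the sketch below takes the namespace as an argument and assumes that Heat pod names contain the string `heat`. Both the function name `heat_log_errors` and the one-hour log window are illustrative.

```shell
# Hypothetical helper: search Heat pod logs for an ID or a traceback.
# Usage: heat_log_errors <namespace> <stack_or_resource_id>
heat_log_errors() {
  local ns="$1" id="$2" pod
  # Assumption: Heat pods can be recognized by "heat" in their names.
  for pod in $(kubectl get pods -n "$ns" -o name | grep -i heat); do
    echo "== $pod"
    # Match either the given ID or any Python traceback from the last hour.
    kubectl logs -n "$ns" "$pod" --all-containers --since=1h \
      | grep -i -e "$id" -e 'traceback'
  done
}
```

Running `kubectl get pods -n <namespace>` first shows whether the Heat pods are in the Running state at all.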

7. Validate Stack Template

Ensure the template syntax is correct before deployment.
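
Heat can validate a template without creating anything. The following is a minimal sketch, assuming the standard `openstack` CLI with the Heat plugin; the function name `validate_template` and the file names in the usage line are hypothetical.

```shell
# Hypothetical helper: validate a template, optionally with an environment file.
# Usage: validate_template <template.yaml> [environment.yaml]
validate_template() {
  if [ -n "${2:-}" ]; then
    # -e supplies the environment file so parameter defaults are checked too.
    openstack orchestration template validate -t "$1" -e "$2"
  else
    openstack orchestration template validate -t "$1"
  fi
}
```

A non-zero exit status together with an error message indicates a syntax or schema problem in the template.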

8. Check Quotas

Verify if quotas are causing resource creation failures.

Check both compute and network quotas for the project, and compare them against the requested values in the stack.

Key quotas to check:

  • Compute:

    • vCPUs: Requested vCPUs ≤ Available vCPUs

    • RAM: Requested RAM ≤ Available RAM

    • Instances: Total number of VMs within allowed limit

  • Network:

    • Ports: Requested number of ports ≤ Available quota

    • Security Groups: Total security groups ≤ quota

    • Floating IPs: Requested number ≤ quota
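
The comparisons above can be sketched as follows, assuming the standard `openstack` CLI. The function names `show_quotas` and `fits_quota` are hypothetical, and the numbers passed to `fits_quota` must be read manually from the command output and the stack template.

```shell
# Hypothetical helper: print configured quotas and current compute usage.
# Usage: show_quotas <project_name_or_id>
show_quotas() {
  openstack quota show "$1"
  # Absolute limits include both maximums and current usage for compute.
  openstack limits show --absolute
}

# Hypothetical check: succeeds if requested + used stays within the limit.
# Usage: fits_quota <requested> <used> <limit>
fits_quota() {
  # A limit of -1 means unlimited.
  [ "$3" -eq -1 ] || [ $(( $1 + $2 )) -le "$3" ]
}
```

For example, `fits_quota 4 12 16` succeeds because 4 requested vCPUs plus 12 in use does not exceed a limit of 16, while `fits_quota 5 12 16` fails.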

9. Confirm Resource Availability

Ensure the referenced images and flavors exist.
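
A quick existence check can be sketched as below, assuming the standard `openstack` CLI. The function name `check_image_and_flavor` is hypothetical, and the image and flavor names come from the stack parameters.

```shell
# Hypothetical helper: confirm that an image and a flavor exist.
# Usage: check_image_and_flavor <image_name_or_id> <flavor_name_or_id>
check_image_and_flavor() {
  # Each show command exits non-zero when the object cannot be found.
  openstack image show "$1" -f value -c status || echo "image not found: $1"
  openstack flavor show "$2" -f value -c name || echo "flavor not found: $2"
}
```

An image that exists but is not in the `active` status can also cause server creation to fail.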

10. Check Network Connectivity

Ensure network components are operational and properly configured.
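
The checks below are one possible starting point, assuming the standard `openstack` CLI with networking commands available. The function name `check_network` is hypothetical, and the network name comes from the stack parameters.

```shell
# Hypothetical helper: inspect a network, its subnets, and the project's routers.
# Usage: check_network <network_name_or_id>
check_network() {
  # status should be ACTIVE and admin_state_up should be UP.
  openstack network show "$1" -c name -c status -c admin_state_up -c subnets
  # A network without subnets cannot provide addresses to new ports.
  openstack subnet list --network "$1"
  openstack router list
}
```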

11. Mark Failed Resource as Unhealthy and Attempt Stack Update

If applicable, update the stack with corrected parameters. If the issue persists, consider deleting and redeploying the stack.
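
Marking a resource unhealthy tells Heat to replace it on the next stack update. The following is a sketch, assuming the standard `openstack` CLI with the Heat plugin; the function name `recreate_resource` and the reason text are hypothetical.

```shell
# Hypothetical helper: mark one resource unhealthy, then update the stack in place.
# Usage: recreate_resource <stack_name_or_id> <resource_name>
recreate_resource() {
  openstack stack resource mark unhealthy "$1" "$2" \
    "marked for replacement during troubleshooting" &&
  # --existing reuses the current template and parameters; --wait blocks until done.
  openstack stack update --existing --wait "$1"
}
```

To apply corrected parameters instead, pass the template with `-t` and the new values with `--parameter` to `openstack stack update`.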

If these steps do not resolve the issue, please contact the Platform9 Support Team for further assistance.

Most common causes:

  • Template syntax errors (YAML/JSON issues, missing parameters)

  • Resource conflicts (duplicate names, unavailable images)

  • Quota limits exceeded (compute, network, or storage)

  • Networking issues (missing subnets, no floating IPs)

  • Heat engine service failures

  • API rate limits exceeded

  • Delays in dependent resource creation

  • Authentication failures (expired tokens, invalid credentials)
