VMHA Stuck in "Waiting"

Problem

VMHA remains stuck in the "waiting" state during enablement.

Environment

  • Private Cloud Director Virtualization – v2025.4 and higher
  • Self-Hosted Private Cloud Director Virtualization – v2025.4 and higher

Cause

A decommissioned host was still listed in Nova's service records. VMHA therefore tried to use that host during setup, which raised an error and left VMHA stuck in the "waiting" state.

Diagnostics

For SaaS customers, contact the Platform9 Support team to confirm whether you are hitting the issue described in this article.

  1. Check the VMHA logs (hamgr.log) for error entries that reference the decommissioned host.
  2. List compute services and check whether any hypervisor shows a "Status" of disabled and a "State" of down. Identify services that are down, disabled, or associated with non-existent or decommissioned hosts. In this example, HOST2.EXAMPLE.COM is the decommissioned node.
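The listing is typically produced with the OpenStack client; a minimal sketch (the host name is the illustrative decommissioned node from this article):

```shell
# List all Nova compute services with their Status and State columns.
openstack compute service list

# Narrow the view to the nova-compute service on the suspect host.
openstack compute service list --service nova-compute --host HOST2.EXAMPLE.COM
```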
  3. List hypervisors and validate host mapping. In this example, the node HOST2.EXAMPLE.COM is in a down state; check its associated service ID to validate the host mapping.
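A sketch of the hypervisor check with the OpenStack client (again using the example host name):

```shell
# List hypervisors and their state; a decommissioned node shows as "down".
openstack hypervisor list

# Inspect the down hypervisor to find its service ID and host mapping.
openstack hypervisor show HOST2.EXAMPLE.COM
```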

Resolution

  1. Identify the stale compute service entry from the compute service listing. In this example, the node HOST2.EXAMPLE.COM is down.
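The stale entry can be identified with the service listing; note the ID column, which is needed for the deletion step:

```shell
# Find the ID of the nova-compute service on the decommissioned host.
openstack compute service list --service nova-compute
```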
  2. Delete the stale service. After deleting the stale entry, at least two working hypervisors must remain, as required to enable VMHA.
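Deletion is done by service ID; a sketch, where <SERVICE_ID> is the ID of the stale nova-compute entry from the previous listing:

```shell
# Remove the stale compute service record from Nova.
# Replace <SERVICE_ID> with the ID shown by `openstack compute service list`.
openstack compute service delete <SERVICE_ID>

# Confirm at least two healthy hypervisors remain (required to enable VMHA).
openstack compute service list --service nova-compute
```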
  3. Wait for VMHA to retry the operation automatically, or disable and re-enable VMHA to trigger a fresh attempt.

Validation

  1. Ensure VMHA state transitions from waiting to enabled.
  2. Confirm no additional stale hosts remain.

Additional Information

  1. A minimum of two working hypervisors is required to enable VMHA.