Unable to Create VM on Compute Host - Aggregate Sync Issue
Problem
When attempting to create a VM on a newly added compute node, the following error is observed in the UI or through the CLI:
No valid host was found. There are not enough hosts available.
Additionally, the nova-scheduler pod logs show the following message:
Got no allocation candidates from the Placement API. This could be due to insufficient resources or a temporary occurrence as compute nodes start up.
The above log message indicates that the resource provider mapping for the new or existing compute node is incorrect.
Environment
- Platform9 Private Cloud Director - v2025.4 and Higher
- Self-Hosted Private Cloud Director Virtualization - v2025.4 and Higher
- Component - Compute
Cause
This issue may occur due to host-aggregate synchronization problems. Specifically, the compute host may not be properly recognised by the Placement API as part of the aggregate, which prevents the scheduler from locating valid hosts for allocation.
For SaaS users who do not have access to the DB, an alternative way to identify the affected host is to migrate an existing test VM to the new or existing hosts; a migration targeting the affected host will fail to schedule, as shown in the example below.
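A minimal sketch of that check, assuming an existing test VM and admin credentials; migration flags may vary with the client version:
$ openstack server migrate --live-migration --host <NEW_HOSTNAME> <TEST_VM_NAME_OR_ID>
$ openstack server show <TEST_VM_NAME_OR_ID> -c status -c OS-EXT-SRV-ATTR:host
If the scheduler cannot place the VM on the target host, the migration fails with the same "No valid host was found" error, identifying the affected host.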
For Self-Hosted PCD, validate whether the resource provider mapping is complete by comparing the data for the existing nodes with the new node in the placement DB, using the steps below:
- Log into the database
$ kubectl exec -it deploy/mysqld-exporter -n <REGION_NAMESPACE> -c mysqld-exporter -- mysql resmgr -u root -p<password>
- Switch to the placement DB
MySQL [resmgr]> use placement;
- Check whether the resource provider mapping has been created in the resource_provider_aggregates and resource_providers tables. If the resource details are not mapped correctly, the created_at and updated_at timestamps in these two tables will show a discrepancy when compared with the timestamp at which the node was onboarded.
MySQL [placement]> select * from resource_provider_aggregates;
+---------------------+------------+----------------------+--------------+
| created_at | updated_at | resource_provider_id | aggregate_id |
+---------------------+------------+----------------------+--------------+
| 2025-06-07 19:15:45 | NULL | 144 | 3 |
| 2025-06-07 19:15:10 | NULL | 151 | 3 |
| 2025-06-09 22:45:13 | NULL | 4671 | 3 |
| 2025-06-09 23:33:38 | NULL | 4672 | 3 |
| 2025-06-09 23:51:00 | NULL | 4673 | 3 |
+---------------------+------------+----------------------+--------------+
MySQL [placement]> select * from resource_providers;
+---------------------+---------------------+------+---------------------------+---------------------+------------+------------------+--------------------+
| created_at | updated_at | id | uuid | name | generation | root_provider_id | parent_provider_id |
+---------------------+---------------------+------+---------------------------+---------------------+------------+------------------+--------------------+
| 2025-05-28 12:42:40 | 2025-06-05 17:23:20 | 144 | [resource_provider_UUID1] | [HOST1.EXAMPLE.COM] | 432 | 144 | NULL |
| 2025-05-28 12:59:01 | 2025-06-10 17:38:18 | 151 | [resource_provider_UUID2] | [HOST2.EXAMPLE.COM] | 711 | 151 | NULL |
| 2025-06-09 08:16:12 | 2025-06-10 17:23:21 | 4671 | [resource_provider_UUID3] | [HOST3.EXAMPLE.COM] | 1058 | 4671 | NULL |
| 2025-06-09 08:16:52 | 2025-06-10 17:38:18 | 4672 | [resource_provider_UUID4] | [HOST4.EXAMPLE.COM] | 951 | 4672 | NULL |
| 2025-06-09 08:55:40 | 2025-06-10 17:23:20 | 4673 | [resource_provider_UUID5] | [HOST5.EXAMPLE.COM] | 434 | 4673 | NULL |
+---------------------+---------------------+------+---------------------------+---------------------+------------+------------------+--------------------+
In the sample output above, the updated_at timestamp (2025-06-05 17:23:20) in the resource_providers table for id 144 is older than the created_at timestamp (2025-06-07 19:15:45) in resource_provider_aggregates, which highlights the discrepancy. In this case, host1.example.com is affected by the issue.
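As a convenience, the manual comparison above can be expressed as a single query; the following is only a sketch of that check, not a supported tool:
MySQL [placement]> SELECT rp.name, rp.updated_at AS provider_updated_at, rpa.created_at AS mapping_created_at
    -> FROM resource_providers rp
    -> JOIN resource_provider_aggregates rpa ON rpa.resource_provider_id = rp.id
    -> WHERE rpa.created_at > rp.updated_at;
Any host returned by this query is a candidate for the resolution steps below.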
Resolution
The issue can be resolved by removing the problematic compute host from its host aggregate and re-adding it. This action triggers Nova to re-notify Placement, after which VMs can be scheduled onto the host successfully.
Steps:
- List All Aggregates
$ openstack aggregate list
- Inspect a Specific Aggregate and check whether the problematic host is part of it:
$ openstack aggregate show <AGGREGATE_NAME_OR_ID>
- Remove the Host from the Aggregate
$ openstack aggregate remove host <AGGREGATE_NAME_OR_ID> <HOSTNAME>
- Add the Host Back to the Aggregate
$ openstack aggregate add host <AGGREGATE_NAME_OR_ID> <HOSTNAME>
This action re-registers the host within the aggregate and updates its visibility in the Placement API.
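To confirm the change took effect, re-check the aggregate membership. If the osc-placement CLI plugin is installed, the provider-to-aggregate mapping can also be checked directly; the provider UUID below is the one returned for the host:
$ openstack aggregate show <AGGREGATE_NAME_OR_ID>
$ openstack resource provider list --name <HOSTNAME>
$ openstack resource provider aggregate list <RESOURCE_PROVIDER_UUID>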
Additional Information
Ensure that the nova-compute service is running and registered:
$ openstack compute service list --service nova-compute
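The output can be narrowed to the affected host (assuming the host name is known); the service should show Status enabled and State up:
$ openstack compute service list --service nova-compute --host <HOSTNAME>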
Check for any errors in the ostackhost log on the hosts:
/var/log/pf9/ostackhost.log
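A quick filter for relevant entries; the search terms are only a suggestion:
$ grep -iE "error|placement|aggregate" /var/log/pf9/ostackhost.log | tail -n 50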
For Self-Hosted PCD users, check the below pod logs:
$ kubectl logs <NOVA_SCHEDULER_POD> -n <REGION_NAMESPACE>
$ kubectl logs <NOVA_API_OSAPI_POD> -n <REGION_NAMESPACE>
$ kubectl logs <NOVA_CONDUCTOR_POD> -n <REGION_NAMESPACE>
$ kubectl logs <PLACEMENT_API_POD> -n <REGION_NAMESPACE>
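For example, the scheduler log can be filtered for the message quoted in the Problem section:
$ kubectl logs <NOVA_SCHEDULER_POD> -n <REGION_NAMESPACE> | grep -i "allocation candidates"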