Hypervisor Role Deauthorization and Reauthorization

In PCD, applying the hypervisor role to a host automatically configures the compute and network services on that host according to the settings defined in the Cluster Blueprint.

Up to the April release, we have seen issues when the hypervisor role is authorized and deauthorized multiple times on the same host. These issues fall mainly into the following areas:

A) Improper cleanup of the state file (/opt/pf9/data/state/compute_id), which defines the unique compute ID for a host. When the same host is reauthorized, the newer compute_id fails to create a resource provider record in the placement API.

What to do if this happens:

  • Deauthorize the hypervisor role from the host cleanly
  • Decommission the host
  • Reauthorize the host
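
To confirm that this is the failure being hit, the ID in the state file can be compared with what placement has registered for the host. The following is a minimal sketch, not a supported procedure. It assumes that the state file holds the compute ID as plain text, that this ID is what placement uses as the resource provider UUID, and that the osc-placement plugin is installed so that `openstack resource provider list` is available. The function names are hypothetical.

```shell
# Sketch only: assumes the state file contains the compute ID as plain text
# and that the osc-placement plugin is installed for the openstack CLI.
STATE_FILE="${STATE_FILE:-/opt/pf9/data/state/compute_id}"

# Print the compute ID stored on this host, or nothing if the file is absent.
local_compute_id() {
    if [ -r "$STATE_FILE" ]; then
        tr -d '[:space:]' < "$STATE_FILE"
    fi
}

# Print the UUID that placement has registered under the given hostname, if any.
placement_provider_uuid() {
    openstack resource provider list --name "$1" -f value -c uuid
}

# Print "match", "mismatch", or "no-provider" for the given hostname.
check_compute_id() {
    cid_local="$(local_compute_id)"
    cid_remote="$(placement_provider_uuid "$1")"
    if [ -z "$cid_remote" ]; then
        echo "no-provider"
    elif [ "$cid_local" = "$cid_remote" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}
```

Running check_compute_id <hostname> on the affected host and getting "mismatch" would suggest a leftover resource provider for the same hostname, which is the situation the steps above are meant to clear.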

B) Forceful decommission of the host using the pcdctl CLI without the role check (pcdctl decommission-node -r), which leaves stale compute service records in the database. These records cause conflicts when the same host is added back with the same hostname.

What to do if this happens:

  • As a best practice, forceful decommission is not recommended. If it was done anyway and stale compute services are left in the database, delete them using the command below

# openstack compute service delete <uuid>

  • Restart the pf9-ostackhost service

# systemctl restart pf9-ostackhost
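
The <uuid> for the delete command can be looked up with `openstack compute service list`. Below is a minimal sketch that finds the nova-compute service records for one hostname and deletes them. The function names and the DRY_RUN switch are hypothetical, and it assumes every nova-compute record found for that hostname is stale, which only holds after the host has been decommissioned.

```shell
# Sketch only: DRY_RUN=1 (the default) prints what would be deleted.
DRY_RUN="${DRY_RUN:-1}"

# List the IDs of nova-compute service records registered for a hostname.
compute_service_ids() {
    openstack compute service list --host "$1" --service nova-compute -f value -c ID
}

# Delete every nova-compute service record found for the hostname.
delete_stale_compute_services() {
    for svc_id in $(compute_service_ids "$1"); do
        if [ "$DRY_RUN" = "1" ]; then
            echo "would delete compute service $svc_id"
        else
            openstack compute service delete "$svc_id"
        fi
    done
}
```

Calling delete_stale_compute_services <hostname> first with the default dry run, and then with DRY_RUN=0 once the list looks right, keeps the deletion deliberate.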

C) Forceful decommission of the host using the pcdctl CLI without the role check (pcdctl decommission-node -r), which leaves a non-empty instance list in the virsh list output. The role then fails on the check that the hypervisor has no existing instances.

What to do if this happens:

  • As a best practice, forceful decommission is not recommended. If it was done anyway and stale instances are left on the host, delete them using the commands below. List all VMs on the host, including those that are shut off

# virsh list --all

Destroy and undefine those VMs

# virsh destroy < id or name from above command >

# virsh undefine < id or name from above command >

  • Restart the pf9-ostackhost service

# systemctl restart pf9-ostackhost
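
When several instances are left behind, the destroy and undefine steps can be looped. The sketch below works on domain names rather than numeric IDs, because `virsh list --all` shows no ID for domains that are shut off. The function names and the DRY_RUN switch are hypothetical, and the loop assumes every domain on the host is a leftover that should be removed and that domain names contain no whitespace.

```shell
# Sketch only: removes EVERY libvirt domain on the host when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

# Names of all domains libvirt knows about, running or shut off.
all_domain_names() {
    virsh list --all --name | sed '/^$/d'
}

# Stop and undefine each leftover domain.
remove_leftover_domains() {
    for dom in $(all_domain_names); do
        if [ "$DRY_RUN" = "1" ]; then
            echo "would destroy and undefine $dom"
        else
            # destroy returns an error for a domain that is already shut off
            virsh destroy "$dom" || true
            virsh undefine "$dom"
        fi
    done
}
```

Running remove_leftover_domains with the default dry run first shows which domains would be affected before anything is removed.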

From the June release onward, follow the general guidelines below:

Option 1: If the host has only the hypervisor role and that role needs to be deauthorized, the suggested deauthorize/reauthorize process is as follows

  • Deauthorize the hypervisor role from the host cleanly using the Private Cloud Director console
  • Decommission the host using pcdctl decommission-node (do NOT use the -r option to skip role checks)
  • Onboard the host using the instructions in the UI to add a new host
  • Reauthorize the host with the hypervisor role using the Private Cloud Director console

Option 2: If the host has multiple roles and the hypervisor role needs to be deauthorized while retaining the existing hostconfig and other roles (Image Library, Persistent Storage, etc.), the suggested deauthorize/reauthorize process is as follows

  • Deauthorize the hypervisor role from the host cleanly using the Private Cloud Director console
  • Reauthorize the host with the hypervisor role using the Private Cloud Director console
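
After either option, one way to confirm that the reauthorization took effect is to check that the host has exactly one nova-compute service record and that it reports as enabled and up. This is a minimal sketch and not part of the documented procedure. The function name is hypothetical, and it assumes the openstack CLI is configured against the PCD region in question.

```shell
# Sketch only: succeeds when the host has a single nova-compute record
# whose Status is "enabled" and whose State is "up".
compute_service_is_up() {
    svc_state="$(openstack compute service list --host "$1" --service nova-compute -f value -c Status -c State)"
    [ "$svc_state" = "enabled up" ]
}
```

For example, compute_service_is_up <hostname> && echo "compute service healthy". A failure here with more than one record listed for the host would point back to the stale compute service case described above.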