Hostagent Certificate Rotation Failing due to Comms Connection Failures
Problem
- Empty host agent certificates are generated, causing the node to be disconnected from the management plane.
- pf9-comms tunnels are broken due to missing host agent certificates.
- Nodelet TLS certificate generation is impacted due to missing host agent certificates.
Environment
- Platform9 Managed Kubernetes - v5.6
Cause
- The probable root cause is that vouch returned an empty data set while certificates were being requested for hostagent, which left /etc/pf9/certs/hostagent/cert.pem empty. This also impacted pf9-comms.service, because comms uses the hostagent certificates to create the tunnels that communicate with management plane services. Nodelet, in turn, uses the tunnel to reach the vouch service, which signs the PMK-related certificates. Because the hostagent certificates were missing, the comms tunnel was broken, and that broke nodelet certificate generation as well.
- There could also be a comms or nginx issue in reaching vouch.
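To confirm that a node is in the state described above, it can help to check whether the hostagent certificate file is actually empty. The following is a minimal sketch, not an official Platform9 tool: the function name cert_status is hypothetical, and only the file path comes from this article. Run it as a user who can read the certificate directory, otherwise an unreadable path is reported as missing.

```shell
# Hypothetical helper: report whether a certificate file is missing, empty, or present.
cert_status() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ ! -s "$1" ]; then
        # -s is true only for files larger than zero bytes.
        echo "empty"
    else
        echo "present"
    fi
}

# Path taken from this article.
cert_status /etc/pf9/certs/hostagent/cert.pem
```

A result of "empty" matches the symptom in this article. If the file is present, openssl x509 -in /etc/pf9/certs/hostagent/cert.pem -noout -enddate can be used to inspect its expiry date.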
Resolution
- This issue is tracked in CORE-1303 and CORE-1304 and will be fixed in the PMK 5.10 release. Reach out to Platform9 support for the latest updates on these issues.
- As a workaround, use the backup certificate and key pair, named cert.pem.0 and key.pem.0, in the /etc/pf9/certs/hostagent/ directory to restore the older certificates. The same must be done for the certificates in /etc/pf9/certs/ca.
- After copying the certificates, restart the pf9-comms and pf9-hostagent services on the host.
sudo systemctl restart pf9-comms
sudo systemctl restart pf9-hostagent
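The copy step that precedes the restart could be sketched as follows. This is an illustration under assumptions, not a Platform9-provided script: the function name restore_from_backup and the .broken suffix used to preserve the current files are hypothetical, and the sketch assumes the directory holds cert.pem.0 and key.pem.0 as described for /etc/pf9/certs/hostagent/. The article does not list the file names under /etc/pf9/certs/ca, so verify them on the host before applying the same approach there.

```shell
# Hypothetical helper: restore cert.pem and key.pem from their .0 backups in a directory.
restore_from_backup() {
    dir="$1"
    # First pass: refuse to proceed if any backup is missing or empty.
    for name in cert.pem key.pem; do
        if [ ! -s "$dir/$name.0" ]; then
            echo "backup $dir/$name.0 is missing or empty" >&2
            return 1
        fi
    done
    # Second pass: keep the current file for later inspection, then restore.
    for name in cert.pem key.pem; do
        if [ -e "$dir/$name" ]; then
            # The .broken suffix is an assumption made for this sketch.
            cp -p "$dir/$name" "$dir/$name.broken" || return 1
        fi
        # -p preserves mode, ownership and timestamps where permitted.
        cp -p "$dir/$name.0" "$dir/$name" || return 1
    done
}
```

From a root shell, this would be called as restore_from_backup /etc/pf9/certs/hostagent, followed by the two restart commands above. Checking the backups before touching anything avoids replacing one empty certificate with another.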