Nodelet Phase got Stuck at Cert Generation Phase due to no Response from Vault.

Problem

  • When a node is rebooted or on Nodelet Phases restart, the Certificate Signing Requests are failing on the nodes with the error Certificate is not signed by CA.
Javascript
Copy

Environment

  • Platform9 Managed Kubernetes
  • Platform9 Edge Cloud

Cause

  • During nodelet cert generation phase, one of the task is to sign the certificates generated on the node by the vault.
  • During this process, the certificate signing request may not complete and may result in an empty response if the node is unable to connect to the vault through communication.
  • Enabling verbose logging for nodelet phases will help to identify the task. Look for curl requests similar to the example below.
Javascript
Copy
  • Running the below curl command manually will return an empty response like below.
Javascript
Copy

Resolution

  • Among other factors noted, the most frequently observed issue is communication failure between the node and the management plane. Check comms.log
Bash
Copy
  • Ensure that there is communication between node and the management plane via pf9-comms service.
  • The communication between node and Management plane can be checked using below command.
Bash
Copy
Type to search, ESC to discard
Type to search, ESC to discard
Type to search, ESC to discard