How To Ensure Nodelet Service Status Checks are Not Honored

Problem

The pf9-nodeletd service (Platform9 Kubernetes Management Agent Service), which is responsible for bootstrapping and running the PMK stack on a node, has various status checks configured. These run every minute to detect any issue with a specific part of the phase being implemented. If a particular status check fails, that check is retried up to 10 times until it succeeds. After that, the entire stack (i.e. all phases) is re-run.
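
The default behaviour described above can be pictured with a small shell model. This is only an illustrative sketch, not the nodelet source code: the function name handle_failed_check and the check command passed to it are hypothetical, and the sketch does not wait between retries.

```shell
# Illustrative model only; not taken from the pf9-nodeletd implementation.
MAX_RETRIES=10   # the article states a failed check is retried up to 10 times

# handle_failed_check CHECK_CMD
#   CHECK_CMD: hypothetical command that exits 0 when the phase is healthy.
# Prints "recovered" if a retry succeeds, otherwise "rerun-all-phases".
handle_failed_check() {
    local check_cmd="$1"
    local attempt=1
    while [ "$attempt" -le "$MAX_RETRIES" ]; do
        if "$check_cmd"; then
            echo "recovered"
            return 0
        fi
        attempt=$((attempt + 1))
    done
    # All retries failed: the whole stack (all phases) would be re-run.
    echo "rerun-all-phases"
    return 1
}
```

Passing a command that always fails prints rerun-all-phases, which is the situation the override in this article is meant to suppress.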

If a node is required not to follow this behavior, it can be worked around by putting an override in place.

Environment

  • Platform9 Managed Kubernetes - v4.0 and Higher

Procedure

Create the file /etc/pf9/nodelet/override.yaml on the node and set the PF9_STATUS_THRESHOLD_SECONDS argument in it.

The argument takes integer values only. The value is the number of seconds, counted from the last successful status check, during which the pf9-nodeletd service will ignore a failing status check.

Example: If set,

PF9_STATUS_THRESHOLD_SECONDS = 3600 #1 hour

  • The status check of a Nodelet phase at 9 am is successful.
  • The status check for a Nodelet phase fails at 9:01 am, but the time is within the threshold, so the pf9-nodeletd service will report this as a successful check.
  • The status check at 9:02 am would be treated the same way.
  • If the status check at 10 am fails for a phase, the pf9-nodeletd service will consider this a failed check and restart the associated script in an attempt to bring it back to a healthy state. If the status check for the phase at 10 am succeeds on its own, the pf9-nodeletd service will not attempt to restart the phase.
  • If the status check for a phase at 10 am was successful and the status check at 10:01 am fails, the failure is again within the threshold measured from the last successful status check, so the pf9-nodeletd service will consider it a successful status check.
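
The timeline above can be modeled with a short shell sketch. It is an illustration under stated assumptions, not the service's actual code: the function effective_status and its arguments are hypothetical, and the boundary rule (a failure exactly one threshold after the last success counts as failed) is inferred from the 10 am case in the example.

```shell
# Illustrative model only; names and the boundary rule are assumptions.
# effective_status LAST_SUCCESS NOW CHECK_PASSED THRESHOLD
#   LAST_SUCCESS, NOW: times in seconds (here: seconds since midnight)
#   CHECK_PASSED: 1 if the raw status check passed, 0 if it failed
#   THRESHOLD: value of PF9_STATUS_THRESHOLD_SECONDS
# Prints how the check would be treated: "success" or "failed".
effective_status() {
    local last_success="$1"
    local now="$2"
    local check_passed="$3"
    local threshold="$4"
    if [ "$check_passed" -eq 1 ]; then
        echo "success"
    elif [ $((now - last_success)) -lt "$threshold" ]; then
        echo "success"   # failure ignored: still inside the threshold window
    else
        echo "failed"    # threshold elapsed: the phase would be restarted
    fi
}

THRESHOLD=3600
NINE_AM=$((9 * 3600))
effective_status "$NINE_AM" $((NINE_AM + 60)) 0 "$THRESHOLD"     # 9:01 am failure, prints "success"
effective_status "$NINE_AM" $((NINE_AM + 3600)) 0 "$THRESHOLD"   # 10 am failure, prints "failed"
```

A real success at 10 am moves LAST_SUCCESS forward to 10 am, which is why a failure at 10:01 am is again treated as successful.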

Once the override file is created, restart the pf9-nodeletd service for the override to take effect.
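
A minimal sketch of the restart, assuming pf9-nodeletd is registered as a systemd unit on the node and that the user has sudo rights. The wrapper function restart_nodeletd is a hypothetical helper, not part of the product.

```shell
# Assumes pf9-nodeletd is managed by systemd on this node.
restart_nodeletd() {
    # Restart the service so it re-reads /etc/pf9/nodelet/override.yaml.
    sudo systemctl restart pf9-nodeletd
    # Print the unit state to confirm the service came back up.
    systemctl is-active pf9-nodeletd
}
```

Run restart_nodeletd on the node after saving the override file.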
