Impact, Validation and Repair Action if pf9-nodeletd and pf9-comms Services are Down

Problem

What is the impact when the pf9-nodeletd and pf9-comms services are down, how can it be validated, and what repair action is needed?

Environment

  • Platform9 Managed Kubernetes - All Versions

  • Nodeletd

  • Comms

Answer

  • If the pf9-nodeletd service goes down on its own, the pf9-hostagent service will start it back up.

Example:

# systemctl stop pf9-nodeletd
# tail -f hostagent.log
2022-02-14 22:47:58,035 - session.py INFO - Already converged. Idling...
2022-02-14 22:49:04,208 - session.py INFO - --- Converging ---
2022-02-14 22:49:07,696 - pf9_app.py INFO - Setting the desired service state
2022-02-14 22:49:07,697 - pf9_app.py INFO - Setting service state pf9-nodeletd.1.20.11-pmk.2038. Command: sudo systemctl start pf9-nodeletd
2022-02-14 22:49:10,246 - session.py INFO - Converge succeeded
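
The recovery shown in the log can also be confirmed from systemd rather than by reading hostagent.log. The following is a minimal sketch, not a Platform9 tool: the helper name `wait_for_active` and the default timeout and interval values are assumptions chosen for illustration. It polls `systemctl is-active` until the unit is active again or the timeout expires.

```shell
#!/bin/sh
# Hypothetical helper: poll a systemd unit until it reports "active"
# or the timeout expires. Defaults (180s, 5s) are assumptions.
# Usage: wait_for_active <unit> [timeout_seconds] [interval_seconds]
wait_for_active() {
    wfa_unit="$1"
    wfa_timeout="${2:-180}"
    wfa_interval="${3:-5}"
    wfa_waited=0
    while [ "$wfa_waited" -le "$wfa_timeout" ]; do
        # "is-active --quiet" exits 0 only when the unit is active.
        if systemctl is-active --quiet "$wfa_unit"; then
            echo "$wfa_unit is active after ${wfa_waited}s"
            return 0
        fi
        sleep "$wfa_interval"
        wfa_waited=$((wfa_waited + wfa_interval))
    done
    echo "$wfa_unit still not active after ${wfa_timeout}s" >&2
    return 1
}

# Example, run as root on the node:
#   systemctl stop pf9-nodeletd
#   wait_for_active pf9-nodeletd 300 10
```

A non-zero return suggests that pf9-hostagent did not restart the service within the chosen window, in which case checking the state of pf9-hostagent itself is the next step.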

Info

If both the pf9-nodeletd and pf9-hostagent services go down or are stopped, the PMK stack will continue to run, but stack management operations on that node, such as status checks on the nodeletd phases, will not occur.

  • If the pf9-comms service is stopped, the pf9-hostagent service will start it back up.


Info

If both the pf9-comms and pf9-hostagent services go down or are stopped, the PMK stack will continue to run, but the node will report as offline to the management plane.
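
To validate which of the cases above applies on a node, the states of the three services can be checked together. The sketch below is illustrative only: the function names `unit_state` and `report_impact` are hypothetical, and the messages simply restate the behaviour described in this article. It relies only on `systemctl is-active`.

```shell
#!/bin/sh
# Hypothetical helper: print "active" or "inactive" for a unit.
# Any non-active state (failed, inactive, unknown) maps to "inactive".
unit_state() {
    if systemctl is-active --quiet "$1"; then
        echo active
    else
        echo inactive
    fi
}

# Hypothetical helper: print one line per service with the expected impact.
report_impact() {
    ri_hostagent=$(unit_state pf9-hostagent)
    echo "pf9-hostagent: $ri_hostagent"
    for ri_svc in pf9-nodeletd pf9-comms; do
        ri_state=$(unit_state "$ri_svc")
        if [ "$ri_state" = active ]; then
            echo "$ri_svc: active"
        elif [ "$ri_hostagent" = active ]; then
            echo "$ri_svc: down; pf9-hostagent is running and should start it back up"
        elif [ "$ri_svc" = pf9-nodeletd ]; then
            echo "$ri_svc: down with pf9-hostagent; PMK stack keeps running, node management and nodeletd status checks stop"
        else
            echo "$ri_svc: down with pf9-hostagent; PMK stack keeps running, node reports offline to the management plane"
        fi
    done
}

# Example, run on the node:
#   report_impact
```

If pf9-hostagent is reported as inactive, starting it with `systemctl start pf9-hostagent` is the likely repair action, since it is the service that brings the other two back up.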

Additional Information

Platform9 Host Components
