Impact, Validation and Repair Action if pf9-nodeletd and pf9-comms Services are Down
Problem
Impact, validation and repair action if pf9-nodeletd and pf9-comms services are down.
Environment
- Platform9 Managed Kubernetes - All Versions
- Nodeletd
- Comms
Answer
- If the pf9-nodeletd service goes down on its own, the pf9-hostagent service will start it back up.
Example:
    # systemctl stop pf9-nodeletd
    # tail -f hostagent.log
    2022-02-14 22:47:58,035 - session.py INFO - Already converged. Idling...
    2022-02-14 22:49:04,208 - session.py INFO - --- Converging ---
    2022-02-14 22:49:07,696 - pf9_app.py INFO - Setting the desired service state
    2022-02-14 22:49:07,697 - pf9_app.py INFO - Setting service state pf9-nodeletd.1.20.11-pmk.2038. Command: sudo systemctl start pf9-nodeletd
    2022-02-14 22:49:10,246 - session.py INFO - Converge succeeded
    # systemctl status pf9-nodeletd
    ● pf9-nodeletd.service - Platform9 Kubernetes Management Agent Service
       Loaded: loaded (/usr/lib/systemd/system/pf9-nodeletd.service; disabled; vendor preset: disabled)
       Active: active (running) since Mon 2022-02-14 22:49:07 UTC; 17s ago

If both the pf9-nodeletd and pf9-hostagent services go down or are stopped, the PMK stack continues to run, but management operations for the stack on that node, such as status checks on nodeletd phases, will not occur.
- If the pf9-comms service is stopped, the pf9-hostagent service will start it back up.
    # systemctl stop pf9-comms
    # less comms.log
    [2022-02-14 22:52:56.848] [INFO] comms - Caught SIGTERM ... exiting cleanly now.
    [2022-02-14 22:53:11.330] [INFO] sniMapWatcher - IPv6 settings found on host. Will listen on IPv4 & IPv6
    [2022-02-14 22:53:11.389] [INFO] comms - pf9-comms started at Mon Feb 14 2022 22:53:11 GMT+0000 (Coordinated Universal Time)
    # tail -f hostagent.log
    2022-02-14 22:53:10,056 - session.py WARNING - Not sending status message because channel is closed
    2022-02-14 22:53:10,056 - pf9_app.py INFO - Setting the desired service state
    2022-02-14 22:53:10,056 - pf9_app.py INFO - Setting service state pf9-comms.5.3.0-975.046bd33. Command: sudo systemctl start pf9-comms
    2022-02-14 22:53:12,546 - session.py INFO - Converge succeeded
    2022-02-14 22:53:12,546 - session.py WARNING - Not sending status message because channel is closed
    # systemctl status pf9-comms
    ● pf9-comms.service - Platform9 Communications Service
       Loaded: loaded (/usr/lib/systemd/system/pf9-comms.service; disabled; vendor preset: disabled)
       Active: active (running) since Mon 2022-02-14 22:53:10 UTC; 2 min ago

If both the pf9-comms and pf9-hostagent services go down or are stopped, the PMK stack continues to run, but the node itself will report as offline to the management plane.
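The validation steps above can be sketched as two small shell helpers. This is only an illustration under stated assumptions: the function names `hostagent_restarts` and `pf9_impact` are hypothetical and not part of any Platform9 tooling, the log parsing relies on the hostagent.log line format shown in the examples, and the impact messages paraphrase this article. The location of hostagent.log is not stated here, so it is left as a placeholder. The case where only pf9-hostagent is down is not described in this article and is flagged as such.

```shell
#!/bin/sh
# Hypothetical helpers; not part of any Platform9 tooling.

# Read hostagent.log content on stdin and print one line per start command
# that pf9-hostagent issued: "<date> <time> <service>".
# Relies on the log line format shown in the examples above.
hostagent_restarts() {
    awk '/Command: sudo systemctl start / { print $1, $2, $NF }'
}

# Print the expected impact for a given set of service states.
# Arguments: states of pf9-hostagent, pf9-nodeletd and pf9-comms,
# as printed by `systemctl is-active` (for example "active" or "inactive").
pf9_impact() {
    hostagent_state=$1
    nodeletd_state=$2
    comms_state=$3

    if [ "$hostagent_state" = "active" ]; then
        if [ "$nodeletd_state" = "active" ] && [ "$comms_state" = "active" ]; then
            echo "ok: all services are running"
        else
            echo "self-healing: pf9-hostagent should start the stopped service on its next converge"
        fi
        return 0
    fi

    # pf9-hostagent is down, so nothing restarts the other services.
    if [ "$nodeletd_state" != "active" ]; then
        echo "impact: PMK stack keeps running, but node management and nodeletd status checks stop"
    fi
    if [ "$comms_state" != "active" ]; then
        echo "impact: PMK stack keeps running, but the node reports as offline to the management plane"
    fi
    if [ "$nodeletd_state" = "active" ] && [ "$comms_state" = "active" ]; then
        echo "warning: pf9-hostagent is not running (case not covered by this article)"
    fi
    return 0
}

# Example usage on a node (log path left as a placeholder):
#   hostagent_restarts < /path/to/hostagent.log
#   pf9_impact "$(systemctl is-active pf9-hostagent)" \
#              "$(systemctl is-active pf9-nodeletd)" \
#              "$(systemctl is-active pf9-comms)"
```

Keeping the decision logic separate from the `systemctl is-active` calls means the mapping can be checked without touching a live node.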