Impact, Validation and Repair Action if pf9-nodeletd and pf9-comms Services are Down
Problem
Impact, validation and repair action if pf9-nodeletd and pf9-comms services are down.
Environment
- Platform9 Managed Kubernetes - All Versions
- Nodeletd
- Comms
Answer
- If the pf9-nodeletd service goes down on its own, the pf9-hostagent service will start it back up.
Example:
# systemctl stop pf9-nodeletd
# tail -f hostagent.log
2022-02-14 22:47:58,035 - session.py INFO - Already converged. Idling...
2022-02-14 22:49:04,208 - session.py INFO - --- Converging ---
2022-02-14 22:49:07,696 - pf9_app.py INFO - Setting the desired service state
2022-02-14 22:49:07,697 - pf9_app.py INFO - Setting service state pf9-nodeletd.1.20.11-pmk.2038. Command: sudo systemctl start pf9-nodeletd
2022-02-14 22:49:10,246 - session.py INFO - Converge succeeded
# systemctl status pf9-nodeletd
● pf9-nodeletd.service - Platform9 Kubernetes Management Agent Service
Loaded: loaded (/usr/lib/systemd/system/pf9-nodeletd.service; disabled; vendor preset: disabled)
Active: active (running) since Mon 2022-02-14 22:49:07 UTC; 17s ago
If both the pf9-nodeletd and pf9-hostagent services go down or are stopped, the PMK stack will continue to run, but management operations for the stack on that node, such as status checks on nodeletd phases, will not occur.
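To validate the automatic restart without watching hostagent.log by hand, a small polling loop can be used. The following is a minimal sketch, not part of any Platform9 tooling: the function name wait_until and the timeout and interval values are assumptions chosen for illustration, and the only external command relied on is the standard systemctl is-active --quiet.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds or a timeout passes.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    # Run the given command; stop as soon as it exits with status 0.
    while ! "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1    # gave up: the command never succeeded in time
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Example (assumed values: wait up to 180 seconds, check every 5 seconds):
# wait_until 180 5 systemctl is-active --quiet pf9-nodeletd \
#     && echo "pf9-nodeletd is running again"
```

If the loop times out, it is worth checking whether pf9-hostagent itself is running, since the restart depends on it.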
- If the pf9-comms service is stopped, the pf9-hostagent service will start it back up.
# systemctl stop pf9-comms
# less comms.log
[2022-02-14 22:52:56.848] [INFO] comms - Caught SIGTERM ... exiting cleanly now.
[2022-02-14 22:53:11.330] [INFO] sniMapWatcher - IPv6 settings found on host. Will listen on IPv4 & IPv6
[2022-02-14 22:53:11.389] [INFO] comms - pf9-comms started at Mon Feb 14 2022 22:53:11 GMT+0000 (Coordinated Universal Time)
# tail -f hostagent.log
2022-02-14 22:53:10,056 - session.py WARNING - Not sending status message because channel is closed
2022-02-14 22:53:10,056 - pf9_app.py INFO - Setting the desired service state
2022-02-14 22:53:10,056 - pf9_app.py INFO - Setting service state pf9-comms.5.3.0-975.046bd33. Command: sudo systemctl start pf9-comms
2022-02-14 22:53:12,546 - session.py INFO - Converge succeeded
2022-02-14 22:53:12,546 - session.py WARNING - Not sending status message because channel is closed
# systemctl status pf9-comms
● pf9-comms.service - Platform9 Communications Service
Loaded: loaded (/usr/lib/systemd/system/pf9-comms.service; disabled; vendor preset: disabled)
Active: active (running) since Mon 2022-02-14 22:53:10 UTC; 2 min ago
If both the pf9-comms and pf9-hostagent services go down or are stopped, the PMK stack will continue to run, but the node will report as offline to the management plane.
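The cases described in this article can be summarized as a small decision function. This is only a sketch under stated assumptions: describe_impact is a hypothetical name, the messages are paraphrases of the text above, and the inputs are expected to be the state strings printed by systemctl is-active (for example "active" or "inactive"). It does not cover situations the article does not describe.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Platform9 tooling).
# Arguments: states of pf9-hostagent, pf9-nodeletd and pf9-comms, in that
# order, as printed by `systemctl is-active`.
describe_impact() {
    hostagent=$1
    nodeletd=$2
    comms=$3

    if [ "$hostagent" = "active" ]; then
        if [ "$nodeletd" = "active" ] && [ "$comms" = "active" ]; then
            echo "ok: all three services are running"
        else
            echo "recovering: pf9-hostagent should restart the stopped service"
        fi
        return 0
    fi

    # pf9-hostagent is down, so nothing restarts the other two services.
    echo "warning: pf9-hostagent is not running"
    if [ "$nodeletd" != "active" ]; then
        echo "impact: node management and nodeletd status checks do not occur"
    fi
    if [ "$comms" != "active" ]; then
        echo "impact: node reports as offline to the management plane"
    fi
    return 1
}

# Example invocation on a node:
# describe_impact "$(systemctl is-active pf9-hostagent)" \
#                 "$(systemctl is-active pf9-nodeletd)" \
#                 "$(systemctl is-active pf9-comms)"
```

A non-zero return value indicates that pf9-hostagent is down and a manual systemctl start of the affected services is needed.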
Additional Information