Node Disconnected From Management Plane Due to Hostagent Certificate Expiry
Problem
The node is disconnected from the management plane. The errors below appear in comms.log (the sni-* entries), followed by hostagent.log (the session.py and slave.py entries).
[2024-03-23 04:13:11.086] [DEBUG] sni-prometheus.v0.example.platform9.net-::1-9118-24 - New client socket details - local port: 9118 , remote port: 44162
[2024-03-23 04:13:11.101] [ERROR] sni-broker.v0.example.platform9.net-::1-5672-4 - TLS socket for client 4610 error: Error: 139771737151360:error:14094415:SSL routines:ssl3_read_bytes:sslv3 alert certificate expired:../deps/openssl/openssl/ssl/record/rec_layer_s3.c:1544:SSL alert number 45
[2024-03-23 04:13:11.101] [DEBUG] sni-broker.v0.example.platform9.net-::1-5672-4 - Server socket for client 4610 closed normally remaining: 0
[2024-03-23 04:13:11.102] [DEBUG] sni-broker.v0.example.platform9.net-::1-5672-4 - Client 4610 socket closed normally remaining: 0
[2024-03-23 04:13:11.241] [DEBUG] sni-prometheus.v0.example.platform9.net-::1-9118-24 - CONNECT via proxy 10.42.25.62:8083 to example.platform9.net:443 succeeded.
[2024-03-23 04:13:11.241] [INFO] sni-prometheus.v0.example.platform9.net-::1-9118-24 - Server socket for client 4611 established, numServers: 332
[2024-03-23 04:13:11.245] [DEBUG] sni-prometheus.v0.example.platform9.net-::1-9118-24 - CONNECT via proxy 10.42.25.62:8083 to example.platform9.net:443 succeeded.
[2024-03-23 04:13:11.245] [INFO] sni-prometheus.v0.example.platform9.net-::1-9118-24 - Server socket for client 4612 established, numServers: 333
[2024-03-23 04:13:11.743] [ERROR] sni-prometheus.v0.example.platform9.net-::1-9118-24 - TLS socket for client 4253 error: Error: Client network socket disconnected before secure TLS connection
2024-03-23 04:13:32,770 - session.py INFO - Already converged. Idling
2024-03-23 04:13:32,770 - session.py WARNING - Not sending status message because channel is closed
2024-03-23 04:13:32,770 - session.py INFO - Using the default virtual host '/' on the AMQP broker localhost
2024-03-23 04:13:33,410 - slave.py ERROR - Connection error. Retrying in 10 seconds.
Traceback (most recent call last):
File "/opt/pf9/hostagent/lib/python3.9/site-packages/bbslave/slave.py", line 127, in reconnect_loop
start(config, log, app_db, agent_app_db, app_cache,
File "/opt/pf9/hostagent/lib/python3.9/site-packages/bbslave/session.py", line 770, in start
dual_channel_io_loop(log,
File "/opt/pf9/hostagent/lib/python3.9/site-packages/bbcommon/amqp.py", line 245, in dual_channel_io_loop
conn.ioloop.start()
File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/adapters/select_connection.py", line 461, in start
self._poller.start()
File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/adapters/select_connection.py", line 721, in start
self.poll()
File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/adapters/select_connection.py", line 1114, in poll
self._dispatch_fd_events(fd_event_map)
File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/adapters/select_connection.py", line 831, in _dispatch_fd_events
handler(fileno, events)
File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/adapters/base_connection.py", line 410, in _handle_events
self._handle_read()
File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/adapters/base_connection.py", line 460, in _handle_read
return self._on_terminate(
File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/connection.py", line 2119, in _on_terminate
self.callbacks.process(0,
File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/callback.py", line 60, in wrapper
return function(*tuple(args), **kwargs)
File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/callback.py", line 92, in wrapper
return function(*args, **kwargs)
File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/callback.py", line 236, in process
callback(*args, **keywords)
File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/connection.py", line 1856, in _on_connection_error
raise exceptions.AMQPConnectionError(error_message or
pika.exceptions.AMQPConnectionError: (-1, 'EOF')
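To confirm the symptom quickly on an affected node, the logs can be searched for the two signatures shown in the excerpts above. The following is a minimal sketch: the helper name `find_cert_expiry_errors` is hypothetical, and the log locations in the usage comment are assumed defaults that may differ on your installation.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that match the error signatures
# from the excerpts above ("alert certificate expired", "AMQPConnectionError").
# Usage: find_cert_expiry_errors LOGFILE...
find_cert_expiry_errors() {
    for log in "$@"; do
        if [ ! -r "$log" ]; then
            echo "$log: not readable"
            continue
        fi
        # grep -c prints the number of matching lines; -E enables alternation.
        # "|| true" keeps the zero-match case from aborting under "set -e".
        count=$(grep -c -E 'alert certificate expired|AMQPConnectionError' "$log" || true)
        echo "$log: $count matching line(s)"
    done
}

# Example (assumed default log paths, adjust as needed):
# find_cert_expiry_errors /var/log/pf9/comms/comms.log /var/log/pf9/hostagent.log
```

A non-zero count in both files is consistent with the expired-certificate condition described in this article, but it is not proof on its own; check the certificate dates as well.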
Environment
- Platform9 Managed Kubernetes - v5.6.8 or higher
Cause
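The hostagent certificate on the node has expired. The validity dates can be checked with openssl; in the output below, notAfter is earlier than the timestamps of the errors above.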
% openssl x509 -in /etc/pf9/certs/hostagent/cert.pem -noout -dates
notBefore=Mar 22 17:05:58 2023 GMT
notAfter=Mar 21 17:06:28 2024 GMT
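The same check can be scripted so that expiry is detected from an exit status instead of reading the dates by eye. This sketch relies on the `-checkend` option of `openssl x509`, which exits 0 when the certificate will not expire within the given number of seconds. The function name `check_cert_expiry` and its return codes are hypothetical choices made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report whether a PEM certificate has expired.
# Returns 0 = valid, 1 = expired, 2 = file missing or not a certificate.
check_cert_expiry() {
    cert="$1"
    if [ ! -r "$cert" ]; then
        echo "cannot read $cert"
        return 2
    fi
    # Reject files that openssl cannot parse as an X.509 certificate.
    if ! openssl x509 -in "$cert" -noout >/dev/null 2>&1; then
        echo "not a certificate: $cert"
        return 2
    fi
    # -checkend 0 exits 0 only if the certificate is still valid right now.
    if openssl x509 -in "$cert" -noout -checkend 0 >/dev/null 2>&1; then
        echo "valid"
        return 0
    fi
    echo "expired"
    return 1
}

# Example (path from the article, assumed to be absolute):
# check_cert_expiry /etc/pf9/certs/hostagent/cert.pem
```

Passing a larger value to `-checkend` (for example 2592000 for 30 days) would turn this into an early-warning check, which may help catch the problem before the node disconnects.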
Resolution
- Replace the existing certificate files in the /etc/pf9/certs/hostagent/ directory with valid, unexpired ones.
- Restart the pf9-hostagent service.
$ systemctl restart pf9-hostagent
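Before overwriting anything, it can be worth keeping a copy of the current certificate directory so that the change is reversible. The sketch below reflects an assumption about a reasonable workflow, not a documented Platform9 procedure. `backup_cert_dir` is a hypothetical helper, and the commands in the usage comments assume root privileges.

```shell
#!/bin/sh
# Hypothetical helper: copy a directory to a timestamped sibling
# and print the path of the copy.
backup_cert_dir() {
    src="${1%/}"    # strip one trailing slash, if present
    if [ ! -d "$src" ]; then
        echo "not a directory: $src" >&2
        return 1
    fi
    dest="$src.bak.$(date +%Y%m%d%H%M%S)"
    # -R copies recursively; -p preserves mode, timestamps, and (where permitted) ownership.
    cp -Rp "$src" "$dest" && echo "$dest"
}

# Example, run as root (path from the article, assumed to be absolute):
# backup_cert_dir /etc/pf9/certs/hostagent
# ...replace the certificate files, then:
# systemctl restart pf9-hostagent
# systemctl is-active pf9-hostagent    # prints "active" once the service is running
```

After the restart, the AMQP connection errors in hostagent.log should stop recurring if the new certificate is accepted.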
Additional Information
This is a known issue, tracked internally as Jira CORE-1304. Open a support ticket to check on its progress.