Node Disconnected From Management Plane Due To Hostagent Certificate Expiry
Problem
The node is disconnected from the management plane. The following errors appear in comms.log and hostagent.log.
comms.log:
[2024-03-23 04:13:11.086] [DEBUG] sni-prometheus.v0.example.platform9.net-::1-9118-24 - New client socket details - local port: 9118 , remote port: 44162
[2024-03-23 04:13:11.101] [ERROR] sni-broker.v0.example.platform9.net-::1-5672-4 - TLS socket for client 4610 error: Error: 139771737151360:error:14094415:SSL routines:ssl3_read_bytes:sslv3 alert certificate expired:../deps/openssl/openssl/ssl/record/rec_layer_s3.c:1544:SSL alert number 45
[2024-03-23 04:13:11.101] [DEBUG] sni-broker.v0.example.platform9.net-::1-5672-4 - Server socket for client 4610 closed normally remaining: 0
[2024-03-23 04:13:11.102] [DEBUG] sni-broker.v0.example.platform9.net-::1-5672-4 - Client 4610 socket closed normally remaining: 0
[2024-03-23 04:13:11.241] [DEBUG] sni-prometheus.v0.example.platform9.net-::1-9118-24 - CONNECT via proxy 10.42.25.62:8083 to example.platform9.net:443 succeeded.
[2024-03-23 04:13:11.241] [INFO] sni-prometheus.v0.example.platform9.net-::1-9118-24 - Server socket for client 4611 established, numServers: 332
[2024-03-23 04:13:11.245] [DEBUG] sni-prometheus.v0.example.platform9.net-::1-9118-24 - CONNECT via proxy 10.42.25.62:8083 to example.platform9.net:443 succeeded.
[2024-03-23 04:13:11.245] [INFO] sni-prometheus.v0.example.platform9.net-::1-9118-24 - Server socket for client 4612 established, numServers: 333
[2024-03-23 04:13:11.743] [ERROR] sni-prometheus.v0.example.platform9.net-::1-9118-24 - TLS socket for client 4253 error: Error: Client network socket disconnected before secure TLS connection

hostagent.log:
2024-03-23 04:13:32,770 - session.py INFO - Already converged. Idling
2024-03-23 04:13:32,770 - session.py WARNING - Not sending status message because channel is closed
2024-03-23 04:13:32,770 - session.py INFO - Using the default virtual host '/' on the AMQP broker localhost
2024-03-23 04:13:33,410 - slave.py ERROR - Connection error. Retrying in 10 seconds.
Traceback (most recent call last):
  File "/opt/pf9/hostagent/lib/python3.9/site-packages/bbslave/slave.py", line 127, in reconnect_loop
    start(config, log, app_db, agent_app_db, app_cache,
  File "/opt/pf9/hostagent/lib/python3.9/site-packages/bbslave/session.py", line 770, in start
    dual_channel_io_loop(log,
  File "/opt/pf9/hostagent/lib/python3.9/site-packages/bbcommon/amqp.py", line 245, in dual_channel_io_loop
    conn.ioloop.start()
  File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/adapters/select_connection.py", line 461, in start
    self._poller.start()
  File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/adapters/select_connection.py", line 721, in start
    self.poll()
  File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/adapters/select_connection.py", line 1114, in poll
    self._dispatch_fd_events(fd_event_map)
  File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/adapters/select_connection.py", line 831, in _dispatch_fd_events
    handler(fileno, events)
  File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/adapters/base_connection.py", line 410, in _handle_events
    self._handle_read()
  File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/adapters/base_connection.py", line 460, in _handle_read
    return self._on_terminate(
  File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/connection.py", line 2119, in _on_terminate
    self.callbacks.process(0,
  File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/callback.py", line 60, in wrapper
    return function(*tuple(args), **kwargs)
  File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/callback.py", line 92, in wrapper
    return function(*args, **kwargs)
  File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/callback.py", line 236, in process
    callback(*args, **keywords)
  File "/opt/pf9/hostagent/lib/python3.9/site-packages/pika/connection.py", line 1856, in _on_connection_error
    raise exceptions.AMQPConnectionError(error_message or
pika.exceptions.AMQPConnectionError: (-1, 'EOF')
Environment
- Platform9 Managed Kubernetes - v5.6.8 or higher
Cause
The hostagent certificate on the node has expired. The notAfter date reported by openssl is earlier than the timestamps of the errors above:
% openssl x509 -in /etc/pf9/certs/hostagent/cert.pem -noout -dates
notBefore=Mar 22 17:05:58 2023 GMT
notAfter=Mar 21 17:06:28 2024 GMT
Resolution
- Replace the expired certificate files in the /etc/pf9/certs/hostagent/ directory.
- Restart the pf9-hostagent service.
$ systemctl restart pf9-hostagent
Additional Information
This is a known issue, tracked internally as Jira CORE-1304. To check on its progress, open a support ticket.
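Until the issue is fixed, it may help to check the certificate's expiry ahead of time rather than waiting for the node to disconnect. The following is a minimal sketch, not an official Platform9 tool: the function name check_cert_expiry and the 30-day default threshold are assumptions made for illustration. It relies on the -checkend option of openssl x509, which exits with status 0 only if the certificate is still valid the given number of seconds from now.

```shell
#!/bin/sh
# Hypothetical helper (not part of Platform9): reports whether a PEM
# certificate will still be valid a given number of days from now.
# Usage: check_cert_expiry <path-to-cert.pem> [days]   (days defaults to 30,
# an arbitrary threshold chosen for this sketch)
check_cert_expiry() {
    cert="$1"
    days="${2:-30}"

    # Return 2 if the file is missing or unreadable.
    if [ ! -r "$cert" ]; then
        echo "UNREADABLE: $cert"
        return 2
    fi

    # -checkend N exits 0 if the certificate is still valid N seconds from now.
    if openssl x509 -in "$cert" -noout -checkend "$((days * 86400))" >/dev/null 2>&1; then
        echo "OK: $cert is valid for more than $days days"
        return 0
    fi

    # openssl also exits non-zero if the file cannot be parsed as a certificate.
    echo "WARNING: $cert expires within $days days, has expired, or could not be parsed"
    return 1
}
```

For example, running check_cert_expiry /etc/pf9/certs/hostagent/cert.pem 30 on a node prints a WARNING line and returns 1 when the hostagent certificate has 30 days or less remaining. The return code makes the function easy to call from a cron job or monitoring check.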