Kubelet Stops Posting Node Status

Problem

A node intermittently reports its status as NotReady. Rebooting the node fixes this issue.

Environment

  • Platform9 Managed Kubernetes - v5.3.0 and Higher

Cause

This issue is most often seen when a surge in the number of resources on the node causes the kubelet to issue many concurrent API requests.

Resolution

This issue is addressed in the Platform9 Managed Kubernetes (PMK) 5.1 release, which supports Kubernetes 1.19.6 and is compiled with Golang 1.15.2.

Workaround:

Increase the value of the --http2-max-streams-per-connection flag from the default of 1000 to 2000 on all master nodes, one node at a time. The steps are as follows:

  1. Stop the pf9-hostagent service on the node.
  2. Stop the pf9-nodeletd service.
  3. Stop the pf9-kube services.
  4. Edit the /opt/pf9/pf9-kube/conf/masterconfig/base/master.yaml file and change the value of the --http2-max-streams-per-connection flag to 2000.
  5. Once the master.yaml file is edited, start pf9-hostagent, which also starts pf9-nodeletd and the pf9-kube services.
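The file edit in the steps above can be scripted. The following is a minimal sketch, not an official Platform9 procedure. It assumes the flag appears in master.yaml in the form --http2-max-streams-per-connection=<number>; verify the actual layout of the file before using it. The function names set_max_streams and show_max_streams are hypothetical helpers introduced here for illustration.

```shell
# Sketch only. Assumption: the flag is written in the file as
#   --http2-max-streams-per-connection=<number>
# Run this only after the pf9 services on the node have been stopped.

FLAG="--http2-max-streams-per-connection"

# set_max_streams FILE VALUE
# Rewrites the numeric value of the flag in place and keeps a FILE.bak backup.
set_max_streams() {
    file="$1"
    value="$2"

    # Refuse to touch the file if the flag is not present in the assumed form.
    if ! grep -Eq -- "${FLAG}=[0-9]+" "$file"; then
        echo "flag ${FLAG} not found in ${file}" >&2
        return 1
    fi

    # -i.bak edits in place and saves the original as FILE.bak.
    sed -i.bak -E "s/(${FLAG}=)[0-9]+/\1${value}/" "$file"
}

# show_max_streams FILE
# Prints the line(s) that carry the flag so the change can be checked by eye.
show_max_streams() {
    grep -E -- "${FLAG}=[0-9]+" "$1"
}
```

A possible invocation on a master node would be set_max_streams /opt/pf9/pf9-kube/conf/masterconfig/base/master.yaml 2000, followed by show_max_streams on the same path to confirm the new value before starting pf9-hostagent again. Keeping the .bak copy makes it easy to revert if the node does not come back as expected.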

Note: After making the change, wait until the master node reports as Ready and the cluster moves from the pending state to the connected state. Then repeat the same steps on the next master node in the cluster.
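The wait for the node to report Ready can be sketched with kubectl wait. This is an illustration under assumptions: the node name master-1 and the 600s default timeout are placeholders, the helper name wait_node_ready is hypothetical, and the check covers only the node condition, not the cluster's pending or connected state.

```shell
# Sketch only. KUBECTL can be overridden to point at a specific kubectl binary.

# wait_node_ready NODE [TIMEOUT]
# Blocks until the node reports the Ready condition or the timeout expires.
# Returns kubectl's exit status (non-zero on timeout or error).
wait_node_ready() {
    node="$1"
    timeout="${2:-600s}"   # assumed default; adjust to the environment
    "${KUBECTL:-kubectl}" wait --for=condition=Ready "node/${node}" --timeout="${timeout}"
}
```

For example, wait_node_ready master-1 && echo "safe to move to the next master" proceeds only once the edited master is Ready again.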

Additional Information

This is an issue in the Golang http2 library code, and there is currently no upstream fix available in Kubernetes versions 1.17.9 (go1.13.9) and 1.18.10 (go1.13.15).
