Kubernetes Node in NotReady State After Reboot for Containerd Runtime Cluster

Problem

  • A Kubernetes (master or worker) node that has been rebooted (e.g. due to a maintenance activity) is showing as NotReady.

$ kubectl get nodes
NAME          STATUS     ROLES    AGE   VERSION
master1       NotReady   master   34d   v1.21.3
master2       Ready      master   34d   v1.21.3
master3       Ready      master   34d   v1.21.3
  • A description of the node similarly reports KubeletNotReady due to the CNI plugin being uninitialized.

$ kubectl describe node master1
.....
Conditions:
Ready                False   Mon, 13 Jun 2022 21:42:05 +0000   Mon, 13 Jun 2022 21:32:01 +0000   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized

Environment

  • Platform9 Managed Kubernetes - v5.4 and higher

  • Kubernetes - 1.21 versions earlier than v1.21.3-pmk.183

  • Runtime - Containerd

Cause

Due to an upstream issue in containerd, the CNI config is not reloaded when the CNI configuration directory is deleted and recreated during the Platform9 Kubernetes stack initialization.

Resolution

  • This issue is fixed in pf9-kube-1.21.3-pmk.183 and later releases.

Workaround

  1. Verify that the CNI configuration directory referenced by containerd is not empty.

For Flannel based clusters the directory should contain the following files:

For Calico based clusters the directory should contain the following files:
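The check in step 1 can be sketched as a small shell helper. This is an illustration under assumptions, not a Platform9-provided command: `check_cni_conf_dir` is a hypothetical function name, and `/etc/cni/net.d` is assumed because it is containerd's default CNI configuration directory. If the containerd configuration on the node sets a different `conf_dir`, pass that path instead.

```shell
# Hypothetical helper: report whether the CNI configuration directory exists
# and contains at least one entry.
# Assumption: /etc/cni/net.d is the directory containerd reads (its default).
check_cni_conf_dir() {
  dir="${1:-/etc/cni/net.d}"

  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi

  # ls -A lists every entry (including dotfiles) except "." and ".."
  if [ -z "$(ls -A "$dir")" ]; then
    echo "empty: $dir"
    return 1
  fi

  # Show the files so they can be compared with the expected list
  ls -l "$dir"
}
```

Running `check_cni_conf_dir` on the affected node lists the directory contents, or prints `missing:` or `empty:` and returns a non-zero status when the directory is absent or has no files.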

  2. Restart the containerd service on the affected node.

Info: Restarting containerd does not restart any running containers.
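A minimal sketch of the restart step, assuming containerd runs as a systemd unit named `containerd` and that the user has sudo access on the node. The function name `restart_containerd` is hypothetical and used only for illustration.

```shell
# Hypothetical helper: restart containerd and confirm the unit came back up.
# Assumption: the service is managed by systemd under the name "containerd".
restart_containerd() {
  sudo systemctl restart containerd || return 1

  # is-active prints the unit state and exits non-zero unless it is "active"
  systemctl is-active containerd
}
```

Call `restart_containerd` on the affected node. A non-zero exit status means the restart failed or the unit did not return to the active state.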

  3. Verify the status of the node after restarting containerd.
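The verification can be sketched with `kubectl wait`, which blocks until the node reports the `Ready` condition. The helper name `wait_for_node_ready`, the default node name `master1` (taken from the example output above), and the 300-second timeout are illustrative assumptions.

```shell
# Hypothetical helper: wait for a node to become Ready, then print its status.
# Assumption: kubectl is configured with access to the affected cluster.
wait_for_node_ready() {
  node="${1:-master1}"

  # Block until the Ready condition is True, or give up after 5 minutes
  kubectl wait --for=condition=Ready "node/$node" --timeout=300s || return 1

  # Print the node's current status line
  kubectl get node "$node"
}
```

For example, `wait_for_node_ready master1` returns once the node is Ready. If the timeout expires first, the function returns a non-zero status and the node needs further investigation.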
