Node NotReady With Error "container runtime is down, PLEG is not healthy"
Problem
- A Kubernetes node is in a "NotReady" state.
- The Kubelet process on the corresponding node is in a defunct state.
- The node is also exhibiting a high load average (relative to the number of CPUs), as observed via top or in the pf9-muster log:
[host] instances:0 loadavg:173.62 proc_active:16 proc_total:4218
- The Kubelet log reports "container runtime is down, PLEG is not healthy":
I1110 XX:XX:XX.581159 25606 kubelet_node_status.go:430] Recording NodeNotReady event message for node <IP>
I1110 XX:XX:XX.581176 25606 setters.go:518] Node became not ready: {Type:Ready Status:False LastHeartbeatTime:YYYY-YY-YY XX:XX:XX.581137659 +0530 IST m=+4777925.197815516 LastTransitionTime:YYYY-YY-YY XX:XX:XX.581137659 +0530 IST m=+4777925.197815516 Reason:KubeletNotReady Message:container runtime is down,PLEG is not healthy: pleg was last seen active 21m34.729124335s ago; threshold is 3m0s}
- In the Docker log, a "broken pipe" error is observed:
<host> dockerd[20998]: time="YYYY-YY-YYTXX:XX:XX.510007829+05:30" level=error msg="Handler for GET /v1.31/containers/json returned error: write unix /var/run/docker.sock->@: write: broken pipe"
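These symptoms can be confirmed directly on the affected host. The sketch below uses only standard Linux tooling (no Platform9-specific paths) to compare the load average against the CPU count and to look for defunct (zombie) processes such as the stuck kubelet:

```shell
# Compare the 1-minute load average with the CPU count; a load average far
# above the CPU count matches the pf9-muster line shown above.
cpus=$(nproc)
load=$(cut -d' ' -f1 /proc/loadavg)
echo "loadavg=${load} cpus=${cpus}"

# A defunct process shows state "Z" (zombie) in ps output; a defunct kubelet
# would appear here along with its parent PID.
ps -eo stat,pid,ppid,comm | awk '$1 ~ /^Z/'
```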
Environment
- Platform9 Managed Kubernetes - v3.6.0 and Higher
- Docker
Cause
In every iteration, the PLEG health check calls docker ps to detect container state changes and docker inspect to get the details of those containers. After finishing each iteration, it updates a timestamp. If the timestamp has not been updated for 3 minutes, the health check fails.
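The timestamp logic can be illustrated with a minimal sketch (the threshold matches the "3m0s" in the kubelet message above; the variable names are illustrative, not kubelet internals):

```shell
# Minimal sketch of the PLEG health check: a timestamp is recorded after each
# completed relist iteration, and the runtime is reported unhealthy once that
# timestamp is older than the threshold (3m0s in the log above).
THRESHOLD=180                       # seconds; "threshold is 3m0s"
last_relist=$(date +%s)             # updated after every finished iteration

now=$(date +%s)
age=$(( now - last_relist ))
if [ "$age" -gt "$THRESHOLD" ]; then
    echo "PLEG is not healthy: pleg was last seen active ${age}s ago; threshold is ${THRESHOLD}s"
else
    echo "PLEG healthy: last relist ${age}s ago"
fi
```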
In most occurrences of this issue, PLEG cannot finish all of its tasks within the 3-minute window, causing Docker socket connection errors that eventually leave the pf9-kubelet service in a defunct state.
In scenarios where the cluster node flaps between the Ready and NotReady states due to PLEG issues caused by a high load average, the pf9-nodelet service continuously monitors the health of the pf9-kubelet service via the phase "Configure and start kubelet". If there is an issue with the service, pf9-nodelet attempts to restart that phase. In the long term, however, investigating the cause of the high load average is the best way to prevent the node from flapping.
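As a starting point for that investigation, note that processes stuck in uninterruptible sleep (state "D"), for example blocked on I/O or on the Docker socket, raise the load average without consuming CPU. A quick, generic check:

```shell
# List processes in uninterruptible sleep (state "D"); a large, persistent
# count explains a high load average even when the CPUs are mostly idle.
dstate=$(ps -eo stat,pid,comm | awk '$1 ~ /^D/')
dcount=$(printf '%s' "$dstate" | grep -c . || true)
echo "processes in D state: ${dcount}"
if [ -n "$dstate" ]; then
    printf '%s\n' "$dstate"
fi
```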
Resolution
The defunct kubelet process cannot be cleared except by rebooting the node.
- Reboot the node.
Additional Information
Use one of the scripts below to measure how long each container takes to inspect (the first targets containerd via crictl, the second targets Docker):
# Containerd runtime (crictl):
TIMEFORMAT=%R
time (/opt/pf9/pf9-kube/bin/crictl -r unix:///run/containerd/containerd.sock ps | grep -v POD | awk '{print $1, $7}') |
while read id name; do
    echo -e "\nChecking Container: $name : $id"
    RESP=$(time /opt/pf9/pf9-kube/bin/crictl -r unix:///run/containerd/containerd.sock inspect $id 2>&1 > /dev/null)
    echo -e "Took$RESP above secs for $name ID: $id \n"
done
echo -e "Total Time"

# Docker runtime:
TIMEFORMAT=%R
time docker ps --format "{{.ID}}\t{{.Names}}" |
while read id name; do
    echo -e "\nChecking Container: $name : $id"
    RESP=$(time docker inspect $id 2>&1 > /dev/null)
    echo -e "Took$RESP above secs for $name ID: $id \n"
done
echo -e "Total Time"