Failed to Bring Up the Kubernetes Stack Due to Kube-Proxy Container Failure

Problem

  • The Kubernetes (PMK) stack cannot be started on a cluster node because the kube-proxy container fails to start with the error shown below.

[2023-01-28 05:39:06] time="2023-01-28T05:39:06Z" level=fatal msg="failed to create shim task: OCI runtime create failed: runc create failed: mountpoint for devices not found: unknown"
[2023-01-28 05:39:06] Waiting for "pf9ctr_run          run --name proxy          --detach=true         --net=host --privileged --volume /etc/pf9/kube.d/kubeconfigs/kube-proxy.yaml:/etc/kubernetes/pf9/kube-proxy/kube-proxy.yaml k8s.gcr.io/kube-proxy:v1.23.8 kube-proxy --kubeconfig=/etc/kubernetes/pf9/kube-proxy/kube-proxy.yaml --v=2 --hostname-override=10.192.105.25 --proxy-mode ipvs                          --cluster-cidr 10.20.0.0/16 --bind-address 0.0.0.0 --ipvs-strict-arp" to evaluate to true ...

Environment

  • Platform9 Managed Kubernetes - v5.6 and Above

  • K8s - v1.23

Cause

  • Sometimes a stale file named proxy is left behind in the nerdctl names directory on the node, even though no proxy container is running there.

/var/lib/nerdctl/1935db59/names/k8s.io$ ll
total 12
drwxr-xr-x 2 root root 4096 Feb 23 01:58 ./
drwxr-xr-x 4 root root 4096 Feb 23 21:29 ../
-rw-r--r-- 1 root root   64 Feb 23 01:58 proxy
  • This file blocks the pf9-kubelet phases script from deploying kube-proxy on the node.

  • An upstream issue has been reported for this on GitHub.
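
To check whether a node is affected, a small helper along these lines can list any proxy name files under the nerdctl data directory. This is a minimal sketch, not a Platform9 tool: the function name find_name_files is hypothetical, and the layout /var/lib/nerdctl/<hash>/names/<namespace>/ is assumed from the listing above (the hash component, 1935db59 in that listing, may differ from node to node).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every name file called <name> under the
# nerdctl data root. Layout assumed from the listing above:
#   <root>/<hash>/names/<namespace>/<name>
find_name_files() {
  local root="${1:-/var/lib/nerdctl}" name="${2:-proxy}"
  local f
  for f in "$root"/*/names/*/"$name"; do
    # An unmatched glob stays literal, so test that the path really exists.
    if [ -f "$f" ]; then
      printf '%s\n' "$f"
    fi
  done
  return 0
}

# Example (read-only, run on the affected node):
# find_name_files /var/lib/nerdctl proxy
```

A file reported here is only stale if no proxy container actually exists on the node, so confirm that before treating it as the cause.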

Resolution

  • This issue is tracked internally as PMK-5558, and the engineering team is working on a fix.

  • Until a fix is available, the workaround is to manually remove the stale proxy file. This unblocks the pf9-kube scripts, which can then start the Kubernetes stack on the node.
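
The workaround can be sketched as a short shell function. This is an illustration under stated assumptions rather than an official procedure: the function name remove_stale_name is hypothetical, the directory is the one from the listing in the Cause section (the 1935db59 component may differ on other nodes), and the file is moved to a temporary backup instead of being deleted so it can be restored if needed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move a stale nerdctl name file out of the names
# directory. Arguments: <names_dir> [name], where name defaults to "proxy".
remove_stale_name() {
  local names_dir="$1" name="${2:-proxy}"
  local target="${names_dir}/${name}"
  if [ ! -f "$target" ]; then
    echo "no stale file at ${target}"
    return 0
  fi
  # Keep a backup copy rather than deleting the file outright.
  local backup
  backup="$(mktemp -d)/${name}.bak"
  mv -- "$target" "$backup"
  echo "moved ${target} to ${backup}"
}

# Example (run as root, only after confirming no proxy container is running):
# remove_stale_name /var/lib/nerdctl/1935db59/names/k8s.io proxy
```

Once the file is out of the way, the pf9-kube scripts should be able to create the proxy container when they next retry.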
