Node in NotReady state and nodelet phases stuck at "Configure and start kube-proxy" stage
Problem
- Node in NotReady state and nodelet phases stuck at "Configure and start kube-proxy" stage.
Environment
- Platform9 Managed Kubernetes - v5.6.4 and higher
Cause
- The node could not configure and start kube-proxy. The following errors appear in the logs:
{"L":"ERROR","T":"2023-08-15T19:05:49.939-0700","C":"command/command.go:124","M":"Error: exit status 1"}
{"L":"ERROR","T":"2023-08-15T19:05:49.939-0700","C":"command/command.go:127","M":"Exit Status: 1"}
{"L":"ERROR","T":"2023-08-15T19:05:49.939-0700","C":"bash_script_based_phases/bash_script_base.go:89","M":"Error running phase: /opt/pf9/pf9-kube/phases/kube_proxy_start.sh"}
{"L":"INFO","T":"2023-08-15T19:05:49.939-0700","C":"bash_script_based_phases/bash_script_base.go:144","M":"[2023-08-15 19:05:31] Kernel module ip_vs_sh loaded"}
{"L":"INFO","T":"2023-08-15T19:05:49.939-0700","C":"bash_script_based_phases/bash_script_base.go:144","M":"[2023-08-15 19:05:31] Kernel module nf_conntrack_ipv4 loaded"}
{"L":"INFO","T":"2023-08-15T19:05:49.939-0700","C":"bash_script_based_phases/bash_script_base.go:144","M":"[2023-08-15 19:05:31] Ensuring container 'proxy' is destroyed"}
{"L":"INFO","T":"2023-08-15T19:05:49.939-0700","C":"bash_script_based_phases/bash_script_base.go:144","M":"[2023-08-15 19:05:31] time=\"2023-08-15T19:05:31-07:00\" level=fatal msg=\"name \\\"proxy\\\" is already used by ID \\\"9fc3f2596a2e9ccac58a55c731093ec836af9d5b370a31e3bded93abdbaec594\\\"\""}
- This is caused by a known upstream bug: https://github.com/containerd/nerdctl/issues/499
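To confirm that a node is hitting this condition, the log can be scanned for the "is already used by ID" message shown above. The following is a minimal Python sketch, assuming the log contains one JSON object per line with the message in the "M" field, as in the excerpt. The function name find_name_collisions is hypothetical and not part of any Platform9 tooling.

```python
import json
import re

# Matches the name-collision message from the log excerpt above.
# The quotes around the name and ID may or may not be backslash-escaped.
NAME_IN_USE = re.compile(
    r'name \\?"(?P<name>[^"\\]+)\\?" is already used by ID \\?"(?P<id>[0-9a-f]+)\\?"'
)


def find_name_collisions(lines):
    """Return (container_name, container_id) pairs found in JSON log lines."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log entry
        if not isinstance(entry, dict):
            continue
        match = NAME_IN_USE.search(str(entry.get("M", "")))
        if match:
            hits.append((match.group("name"), match.group("id")))
    return hits
```

Passing an open log file to find_name_collisions returns an empty list when the message is absent. A pair whose first element is "proxy" indicates the condition described in this article.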
Resolution
- As a workaround, move the stale file named proxy, present in /var/lib/nerdctl/<ID>/names/k8s.io/, to another directory (such as /tmp) on the node.
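The workaround can be scripted as follows. This is a minimal Python sketch rather than an official tool. It assumes that <ID> is a single directory level under /var/lib/nerdctl and that the script runs with enough privileges to move the file. The function name move_stale_name_files and the backup file naming are hypothetical. By default it only reports what it would move.

```python
import shutil
import time
from pathlib import Path


def move_stale_name_files(name="proxy", root="/var/lib/nerdctl",
                          backup_dir="/tmp", dry_run=True):
    """Move <root>/<ID>/names/k8s.io/<name> files into backup_dir.

    Returns a list of (source, destination) path strings. With
    dry_run=True nothing is moved; the list shows the planned moves.
    """
    results = []
    # Assumption: <ID> is exactly one directory level below root.
    for path in sorted(Path(root).glob(f"*/names/k8s.io/{name}")):
        if not path.is_file():
            continue
        store_id = path.parents[2].name  # the <ID> directory name
        # Hypothetical naming scheme that keeps backups from colliding.
        dest = Path(backup_dir) / f"nerdctl-{store_id}-{name}.{int(time.time())}"
        if not dry_run:
            shutil.move(str(path), str(dest))
        results.append((str(path), str(dest)))
    return results
```

Calling move_stale_name_files() first and reviewing the returned list, then calling move_stale_name_files(dry_run=False), keeps the file recoverable from /tmp if it needs to be restored.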
Additional Information
- An internal JIRA ticket, PMK-5969, has been raised to track this issue.