Nodelet Phases Restart on Master Node Stuck at "Wait for k8s services and network to be up" Stage
Problem
Scaling the master nodes in an existing cluster fails, and the subsequent nodelet phases restart gets stuck at the "Wait for k8s services and network to be up" stage.
The kubelet logs show the following errors:
I0507 19:41:50.579598 28415 kubelet.go:416] "Attempting to sync node with API server"
I0507 19:41:50.579604 28415 kubelet.go:278] "Adding static pod path" path="/etc/pf9/kube.d/master.yaml"
I0507 19:41:50.579614 28415 file.go:68] "Watching path" path="/etc/pf9/kube.d/master.yaml"
I0507 19:41:50.579620 28415 kubelet.go:289] "Adding apiserver pod source"
I0507 19:41:50.579626 28415 apiserver.go:42] "Waiting for node sync before watching apiserver pods"
E0507 19:41:50.579718 28415 file_linux.go:61] "Unable to read config path" err="path does not exist, ignoring" path="/etc/pf9/kube.d/master.yaml"
E0507 19:41:50.579720 28415 file.go:98] "Unable to read config path" err="path does not exist, ignoring" path="/etc/pf9/kube.d/master.yaml"
Environment
- Platform9 Managed Kubernetes - v5.6
- Kubernetes - v1.23
Cause
- The static pod (control-plane pod) path pointed to a non-existent file, /etc/pf9/kube.d/master.yaml:
cat /etc/pf9/kube.d/master.yaml
cat: /etc/pf9/kube.d/master.yaml: No such file or directory
- This is caused by a known bug, PMK-5807.
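The check above can be sketched as a small script that reads the configured static pod path and reports whether it exists on the node. This is a minimal sketch, not a Platform9 tool: the function names `extract_static_pod_path` and `check_static_pod_path` are hypothetical, and it assumes the kubelet configuration is available as YAML text with `staticPodPath` on a line of its own.

```shell
# Hypothetical helper: read kubelet configuration YAML on stdin and print
# the value of the first staticPodPath key, with any quotes removed.
extract_static_pod_path() {
    sed -n 's/^[[:space:]]*staticPodPath:[[:space:]]*//p' | head -n 1 | tr -d "\"'"
}

# Hypothetical helper: report whether the given path exists on this host.
# Returns 0 if it exists, 1 if it does not.
check_static_pod_path() {
    if [ -e "$1" ]; then
        echo "OK: $1 exists"
        return 0
    fi
    echo "MISSING: $1 does not exist"
    return 1
}
```

On an affected master, the two helpers could be combined as `check_static_pod_path "$(kubectl get cm master-default-kubelet-config -n kube-system -o yaml | extract_static_pod_path)"`, assuming the configmap stores the kubelet configuration as a multi-line YAML string.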
Resolution
- The issue is fixed in Platform9 Managed Kubernetes v5.7. Reference JIRA [PMK-5807].
Additional Information
- The nodelet.log file shows the following error:
Running command 'cgexec -g cpu:pf9-kube-status sudo /opt/pf9/pf9-kube/setup_env_and_run_script.sh /opt/pf9/pf9-kube/phases/wait_for_k8s_services.sh status' from wd: ''
STDOUT:
--- /opt/pf9/pf9-kube/phases/wait_for_k8s_services.sh status at 2024-05-07 19:21:37 ---
[2024-05-07 19:21:38] + operation=status
[2024-05-07 19:21:38] + case $operation in
[2024-05-07 19:21:38] + status
[2024-05-07 19:21:38] + source network_plugin.sh
[2024-05-07 19:21:38] ++ [[ -z calico ]]
[2024-05-07 19:21:38] ++ source network_plugins/calico/calico.sh
[2024-05-07 19:21:38] + network_running
[2024-05-07 19:21:38] + '[' master == none ']'
[2024-05-07 19:21:38] + return 0
[2024-05-07 19:21:38] + '[' master == worker ']'
[2024-05-07 19:21:38] + kubernetes_api_available
[2024-05-07 19:21:38] + ADMIN_CERTS=/etc/pf9/kube.d/certs/admin
[2024-05-07 19:21:38] + '[' master == master ']'
[2024-05-07 19:21:38] + api_endpoint=localhost
[2024-05-07 19:21:38] + timeout 60 curl --silent https://localhost:443/livez --cacert /etc/pf9/kube.d/certs/admin/ca.crt --key /etc/pf9/kube.d/certs/admin/request.key --cert /etc/pf9/kube.d/certs/admin/request.crt --fail
[2024-05-07 19:21:38] + return 1
Error: exit status 1
Exit Status: 1
Running command 'cgexec -g cpu:pf9-kube-status sudo /opt/pf9/pf9-kube/setup_env_and_run_script.sh /opt/pf9/pf9-kube/phases/wait_for_k8s_services.sh status' from wd: ''
STDOUT:
--- /opt/pf9/pf9-kube/phases/wait_for_k8s_services.sh status at 2024-05-07 19:21:39 ---
- As a workaround, update the static pod file path in the master-default-kubelet-config configmap.
# kubectl edit cm master-default-kubelet-config -n kube-system
configmap/master-default-kubelet-config edited
-- staticPodPath: "/etc/pf9/kube.d/master.yaml"
++ staticPodPath: "/etc/pf9/kube.d/pod-manifests/master.yaml"
- Restart the PMK Stack.
# systemctl stop pf9-hostagent; systemctl stop pf9-nodeletd
# /opt/pf9/nodelet/nodeletd phases stop
# systemctl start pf9-hostagent