Worker Node "NotReady" Issue
Problem
Troubleshoot a node in the NotReady state or a cluster NodeGroup stuck in the ScalingUp state.
Environment
- Private Cloud Director - v2025.4 and higher.
- Kubernetes cluster 1.31.2 or higher.
Procedure
- Get the OpenStack VM console logs using the command below and check them for errors or other messages. For example, the log below shows that the worker node joined the cluster successfully.
$ openstack console log show <Worker-node-VM-ID>
[..] cloud-init[1004]: [...] [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[..] cloud-init[1004]: [...] [kubelet-start] Starting the kubelet
[..] cloud-init[1004]: [...] [patches] Applied patch of type "application/strategic-merge-patch+json" to target "kubeletconfiguration"
[..] cloud-init[1004]: [...] [kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[..] cloud-init[1004]: [...] [kubelet-check] The kubelet is healthy after 505.386495ms
[..] cloud-init[1004]: [...] [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap
[..] cloud-init[1004]: [...]
[..] cloud-init[1004]: [...] This node has joined the cluster:
[..] cloud-init[1004]: [...] * Certificate signing request was sent to apiserver and a response was received.
[..] cloud-init[1004]: [...] * The Kubelet was informed of the new secure connection details.
[..] cloud-init[1004]: [...]
[..] cloud-init[1004]: [...] Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
[..] cloud-init[1004]: [...]
[..] cloud-init[1004]: [...] Cloud-init v. 24.4.1-0ubuntu0~22.04.2 finished at Fri, 23 May 2025 03:14:57 +0000. Datasource DataSourceOpenStackLocal [net,ver=2]. Up 49.81 seconds
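When the console log is long, it can help to test for the join confirmation line directly instead of reading the whole output. The following is a minimal sketch, assuming the confirmation text matches the sample above; `join_status` is a hypothetical helper name, not part of the OpenStack CLI.

```shell
# join_status is a hypothetical helper, not part of any CLI.
# It reads a console log on stdin and prints "joined" if the kubeadm
# join confirmation message is present, otherwise "not-joined".
join_status() {
  if grep -q 'This node has joined the cluster'; then
    echo "joined"
  else
    echo "not-joined"
  fi
}

# Example usage (requires a configured openstack CLI):
#   openstack console log show <Worker-node-VM-ID> | join_status
```

A "not-joined" result only means the message was not found in the log; the surrounding cloud-init lines still need to be read to find the actual error.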
- Run
$ kubectl describe node <node-name>
and check the Events section for more information. For example, in the case below the node is "NotReady" because the kubelet is unable to get disk statistics for the filesystem where container images are stored.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Starting 17m kube-proxy
Normal NodeAllocatableEnforced 18m kubelet Updated Node Allocatable limit across pods
Warning InvalidDiskCapacity 18m kubelet invalid capacity 0 on image filesystem
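On a busy node the Events section can be long, so it may be useful to print only the Warning rows. The sketch below assumes the output layout shown above (an "Events:" heading followed by rows whose first column is the event type); `warning_events` is a hypothetical helper name, not a kubectl feature.

```shell
# warning_events is a hypothetical helper, not a kubectl feature.
# It reads `kubectl describe node` output on stdin and prints only the
# rows of the Events section whose Type column is "Warning".
warning_events() {
  awk '/^Events:/ { in_events = 1; next }
       in_events && $1 == "Warning"'
}

# Example usage:
#   kubectl describe node <node-name> | warning_events
```

The filter only starts matching after the "Events:" heading, so the word "Warning" appearing elsewhere in the node description is ignored.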
- Troubleshoot the issue based on the errors shown in the Events section, and review the most common causes listed below.
- If these steps do not resolve the issue, contact the Platform9 Support Team for additional assistance.
Most common causes:
- The image version and the cluster version do not match. Use
$ kubectl get nodes
and verify that the VERSION column corresponds to your Kubernetes cluster's version. Try a different image with the current version, or deploy a new cluster with an alternative version.
- Insufficient resources (CPU, memory, storage) available on the underlying PCD-V host.
- The kubelet or the container runtime service is down on the VM.
- Port security is disabled on the worker node's VM network.
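The version check in the first cause can be scripted when a cluster has many nodes. This is a minimal sketch, assuming the default `kubectl get nodes` column order (NAME, STATUS, ROLES, AGE, VERSION); `mismatched_nodes` is a hypothetical helper name, and `v1.31.2` in the usage line is only an example value.

```shell
# mismatched_nodes is a hypothetical helper, not a kubectl feature.
# It reads `kubectl get nodes --no-headers` output on stdin and prints the
# name and version of every node whose VERSION column (5th field) differs
# from the expected version passed as the first argument.
mismatched_nodes() {
  awk -v want="$1" '$5 != want { print $1, $5 }'
}

# Example usage:
#   kubectl get nodes --no-headers | mismatched_nodes v1.31.2
#
# For the kubelet cause, assuming the VM uses systemd and you have shell
# access to it (the container runtime unit name depends on the image):
#   systemctl is-active kubelet
#   journalctl -u kubelet --no-pager | tail -n 50
```

Empty output from `mismatched_nodes` means every listed node reports the expected version.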