Pods on Master Node Stuck in NodeAffinity Status After Master Node is Rebooted

Problem

After a master node is restarted, some pods on that node remain stuck in NodeAffinity status.

Environment

  • Platform9 Managed Kubernetes - v5.0 and Higher
  • Kubelet

Cause

During node initialization, the nodes are temporarily available for scheduling without the labels required to match the deployment's node selector. Depending on how long the nodes remain schedulable without those labels, the scheduler floods the cluster with pods that end up in NodeAffinity status. This stops once the nodes are fully initialized and the pods are scheduled successfully.

This is a known issue tracked in upstream Jira 92067. The patch for this bug is not yet included in Platform9 Managed Kubernetes. We have created an internal Jira to backport the patch and will update this document once it is available in Platform9 Managed Kubernetes.

Resolution

As a workaround, manually clean up the pods stuck in NodeAffinity status.

By default, the pods are scheduled on the node once the node is fully initialized with the proper labels.
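The manual cleanup can be scripted. The following is a minimal sketch, not an official Platform9 tool. It assumes that `kubectl` is installed and pointed at the affected cluster, and that the stuck pods report `NodeAffinity` in their `status.reason` field (the value shown in the STATUS column of `kubectl get pods`). The function names `find_node_affinity_pods`, `delete_command`, and `cleanup` are hypothetical and exist only for illustration.

```python
import json
import subprocess
from typing import Dict, List, Tuple


def find_node_affinity_pods(pod_list: Dict) -> List[Tuple[str, str]]:
    """Return (namespace, name) for every pod whose status.reason is NodeAffinity.

    Assumption: stuck pods carry reason "NodeAffinity" in their status.
    """
    stuck = []
    for pod in pod_list.get("items", []):
        if pod.get("status", {}).get("reason") == "NodeAffinity":
            meta = pod["metadata"]
            stuck.append((meta["namespace"], meta["name"]))
    return stuck


def delete_command(namespace: str, name: str) -> List[str]:
    """Build the kubectl command that deletes a single pod."""
    return ["kubectl", "delete", "pod", name, "--namespace", namespace]


def cleanup(dry_run: bool = True) -> None:
    """List pods in all namespaces, then print or run a delete per stuck pod."""
    raw = subprocess.run(
        ["kubectl", "get", "pods", "--all-namespaces", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    for namespace, name in find_node_affinity_pods(json.loads(raw)):
        cmd = delete_command(namespace, name)
        if dry_run:
            # Only show what would be deleted.
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `cleanup()` prints the delete commands without running them, so the list can be reviewed first. Calling `cleanup(dry_run=False)` deletes the pods.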
