Node went into NotReady state after updating CPU manager policy
Problem
- Node went into NotReady state after updating CPU manager policy.
Environment
- Platform9 Managed Kubernetes - v5.4 and higher
Cause
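- The CPU Manager does not support offlining and onlining CPUs at runtime. If the set of online CPUs on the node changes, the node must be drained and the CPU manager manually reset by deleting the state file cpu_manager_state in the kubelet root directory. The same reset is needed when the CPU manager policy is changed, because the state file still holds the state written under the old policy.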
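As a rough illustration of the cause, the sketch below compares the policy recorded in the state file with the policy you intend to configure. It assumes the state file is JSON with a policyName field, as in upstream kubelet checkpoints; verify this against a file on your own node. The function names recorded_policy and needs_reset are hypothetical helpers, not part of any Kubernetes or Platform9 tooling.

```python
import json
from pathlib import Path


def recorded_policy(state_file):
    """Return the policy name stored in the state file, or None if the file is absent.

    Assumption: the file is JSON and carries a "policyName" key.
    """
    path = Path(state_file)
    if not path.exists():
        return None
    return json.loads(path.read_text()).get("policyName")


def needs_reset(state_file, desired_policy):
    """True when a state file exists and was written under a different policy."""
    recorded = recorded_policy(state_file)
    return recorded is not None and recorded != desired_policy
```

For example, needs_reset("/var/lib/kubelet/cpu_manager_state", "static") would return True on a node whose state file was written under the "none" policy, which suggests the file should be removed before kubelet is restarted.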
Resolution
Because the CPU manager policy is applied only when kubelet spawns new pods, changing it from "none" to "static" does not affect existing pods. To change the CPU manager policy on a node properly, perform the following steps:
- Drain the node.
- Stop kubelet.
- Remove the old CPU manager state file. The path to this file is /var/lib/kubelet/cpu_manager_state by default. This clears the state maintained by the CPUManager so that the cpu-sets set up by the new policy won’t conflict with it.
- Edit the kubelet configuration to change the CPU manager policy to the desired value.
- Start kubelet.
This procedure is described in the official Kubernetes documentation.
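The resolution steps can be sketched as an ordered, dry-run command plan. This is only an illustration under assumptions: the systemd unit name "kubelet" and the node name "worker-1" are placeholders and may differ in your environment, the configuration edit is left as a manual step, and the final uncordon step is not part of the list above (it is included because draining leaves the node cordoned). The helpers cpu_policy_change_plan and render are hypothetical names.

```python
import shlex

# Default location of the state file, as given in the steps above.
DEFAULT_STATE_FILE = "/var/lib/kubelet/cpu_manager_state"


def cpu_policy_change_plan(node, state_file=DEFAULT_STATE_FILE, kubelet_service="kubelet"):
    """Return ordered (description, argv) steps; argv is None for manual steps.

    Assumption: kubelet runs as a systemd unit named by `kubelet_service`.
    """
    return [
        ("Drain the node", ["kubectl", "drain", node, "--ignore-daemonsets"]),
        ("Stop kubelet", ["systemctl", "stop", kubelet_service]),
        ("Remove the old CPU manager state file", ["rm", "-f", state_file]),
        ("Edit the kubelet configuration (manual)", None),
        ("Start kubelet", ["systemctl", "start", kubelet_service]),
        # Not in the original steps: drain leaves the node cordoned.
        ("Allow scheduling again (added step)", ["kubectl", "uncordon", node]),
    ]


def render(plan):
    """Format the plan as numbered lines without executing anything."""
    lines = []
    for number, (description, argv) in enumerate(plan, start=1):
        command = shlex.join(argv) if argv else "(no command, done by hand)"
        lines.append(f"{number}. {description}: {command}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(cpu_policy_change_plan("worker-1")))
```

Nothing here runs the commands; printing the plan first makes it easier to confirm the order (stop kubelet before removing the state file, start it only after the configuration has been edited) before executing each step by hand.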