Nodes Fluctuating Between Ready and NotReady State Due to Kernel Memory Leak Issue
Problem
- The Red Hat / CentOS nodes in the cluster fluctuate between the Ready and NotReady states.
- The SSH terminal freezes when running any Linux command. The following error message appears in the system logs when the system hits this bug:
kernel: XFS: 6(238873) possible memory allocation deadlock size 144 in kmem_alloc (mode:0x8250)
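To confirm that a node is hitting this bug, the system logs can be scanned for the message above. The following is a minimal sketch, assuming the log lines have already been read into a list of strings (for example from /var/log/messages). The function name `find_xfs_deadlock_warnings` and the returned field names are illustrative and not part of any Platform9 tooling.

```python
import re

# Pattern built from the sample message shown in this article. Only the
# allocation size and mode are captured; the token before them is skipped.
DEADLOCK_RE = re.compile(
    r"XFS: .*possible memory allocation deadlock "
    r"size (?P<size>\d+) in kmem_alloc \(mode:(?P<mode>0x[0-9a-fA-F]+)\)"
)


def find_xfs_deadlock_warnings(log_lines):
    """Return one dict per log line that matches the XFS deadlock warning."""
    hits = []
    for line in log_lines:
        match = DEADLOCK_RE.search(line)
        if match:
            hits.append({
                "size": int(match.group("size")),
                "mode": match.group("mode"),
                "line": line.rstrip("\n"),
            })
    return hits
```

A non-empty result on a node that is also flapping between Ready and NotReady suggests the issue described here, though it does not rule out other causes.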
Environment
- Platform9 Managed Kubernetes - All Versions
- Red Hat / CentOS 7 with kernel < 3.10.0-1075.el7
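Whether a node falls in the affected range can be checked by comparing its kernel release string (the output of `uname -r`) against the 3.10.0-1075.el7 threshold listed above. This is a rough sketch that assumes the usual EL7 release format such as `3.10.0-957.21.3.el7.x86_64`; the helper names `parse_el7_release` and `is_affected_kernel` are hypothetical.

```python
import re

# Threshold taken from the Environment section: kernels older than
# 3.10.0-1075.el7 are listed as affected.
FIXED_RELEASE = (3, 10, 0, 1075)


def parse_el7_release(release):
    """Turn '3.10.0-957.21.3.el7.x86_64' into (3, 10, 0, 957, 21, 3)."""
    match = re.match(r"^(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)\.el7", release)
    if match is None:
        raise ValueError("not an EL7 kernel release string: %r" % release)
    version, build = match.groups()
    return tuple(int(part) for part in (version + "." + build).split("."))


def is_affected_kernel(release):
    """True if the release sorts before the 3.10.0-1075.el7 threshold."""
    return parse_el7_release(release) < FIXED_RELEASE
```

On a live node the input could come from `platform.release()` in the Python standard library, which returns the same string as `uname -r`.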
Answer
- This is caused by a known Red Hat / CentOS 7.6 kernel memory leak and can be resolved by adding cgroup.memory=nokmem to the kernel command line in the GRUB configuration.
- Engage the system administrator to add this parameter in GRUB. If the issue persists, reach out to the respective community or OS support.
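As an illustration of the change, the sketch below appends the parameter to the `GRUB_CMDLINE_LINUX` line of a GRUB defaults file and checks whether a running kernel has picked it up. It assumes the standard EL7 layout, where that line lives in /etc/default/grub and the running command line is exposed in /proc/cmdline. The function names are hypothetical, and the code operates on strings only, so reading and writing the actual files is left to the administrator. After editing the file, the GRUB configuration still has to be regenerated with `grub2-mkconfig` and the node rebooted before the parameter takes effect.

```python
import re

KERNEL_ARG = "cgroup.memory=nokmem"


def add_kernel_arg(grub_default_text, arg=KERNEL_ARG):
    """Return the defaults-file text with `arg` appended to GRUB_CMDLINE_LINUX.

    Simplification: assumes a single double-quoted value on one line with
    no escaped quotes inside it.
    """
    lines = grub_default_text.splitlines()
    for i, line in enumerate(lines):
        match = re.match(r'^GRUB_CMDLINE_LINUX="(.*)"\s*$', line)
        if match is None:
            continue
        current = match.group(1).split()
        if arg not in current:  # keep the edit idempotent
            current.append(arg)
        lines[i] = 'GRUB_CMDLINE_LINUX="{}"'.format(" ".join(current))
        return "\n".join(lines) + "\n"
    raise ValueError("no GRUB_CMDLINE_LINUX line found")


def is_arg_active(proc_cmdline, arg=KERNEL_ARG):
    """True if `arg` appears as a whole token in the kernel command line."""
    return arg in proc_cmdline.split()
```

After the reboot, passing the contents of /proc/cmdline to `is_arg_active` confirms that the kernel was actually booted with the parameter.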
Additional Information
- Refer to the link below for more details on the issue.