Pods Stuck in the Terminating State Due to Volume Unmount Error
Problem
- Pods get stuck in the Terminating state after deletion, with the error below:
E0413 09:01:03.159172 8845 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/projected/$pod_id-kube-api-access-kxdxg podName:$pod_id nodeName:}" failed. No retries permitted until 2022-04-13 09:01:03.659142773 -0500 CDT m=+783674.310558707 (durationBeforeRetry 500ms). Error: "UnmountVolume.TearDown failed for volume \"kube-api-access-kxdxg\" (UniqueName: \"kubernetes.io/projected/$pod_id-kube-api-access-kxdxg\") pod \"$pod_id\" (UID: \"$pod_id\") : unlinkat /var/lib/kubelet/pods/$pod_id/volumes/kubernetes.io~projected/kube-api-access-kxdxg: device or resource busy"
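To confirm which pods are affected, list pods stuck in Terminating and search the kubelet log on the hosting node for the UnmountVolume error. A minimal sketch, assuming kubectl access to the cluster and that the kubelet runs as a systemd unit named kubelet (the unit name can differ between setups):
$ kubectl get pods --all-namespaces | grep Terminating
# journalctl -u kubelet | grep "UnmountVolume.TearDown failed"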
Environment
- Platform9 Managed Kubernetes - All Versions
- Operating System: RHEL or CentOS 7.4 onwards
Answer
- This is a known issue on RHEL and CentOS systems. Starting with the RHEL 7.4 kernel, a new sysctl parameter is available to overcome this behaviour.
- The parameter is may_detach_mounts and its value is set to 0 by default.
- It can be enabled by running the below command on the affected Kubernetes cluster nodes.
# echo 1 > /proc/sys/fs/may_detach_mounts
# cat /proc/sys/fs/may_detach_mounts
1
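Note that writing to /proc/sys applies the setting only until the next reboot. To make it persistent, the same value can be set through a sysctl drop-in file; a minimal sketch, where the file name 99-kubelet.conf is an arbitrary choice:
# echo "fs.may_detach_mounts = 1" > /etc/sysctl.d/99-kubelet.conf
# sysctl --system
# sysctl fs.may_detach_mounts
fs.may_detach_mounts = 1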
- The kubelet will retry unmounting the projected volumes after may_detach_mounts is enabled on the node.
- This can take a few minutes; once the unmount succeeds, the terminating pod should be deleted from the node.
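To verify, check that no pods remain in Terminating and that the previously stuck pod is gone. A short example, where $pod_name and $namespace are placeholders for the affected pod:
$ kubectl get pods --all-namespaces | grep Terminating
$ kubectl get pod $pod_name -n $namespace
Error from server (NotFound): pods "$pod_name" not found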