Service 'pf9-kubelet' Keeps Restarting

Problem

The pf9-kubelet service is continuously restarting.

The following errors appear in /var/log/pf9/kube/kube.log on the node:

watcher.go:146] Failed to watch directory "/sys/fs/cgroup/memory/system.slice/run-[UUID].scope": inotify_add_watch /sys/fs/cgroup/memory/system.slice/run-[UUID].scope: no space left on device
kubelet.go:1365] Failed to start cAdvisor inotify_add_watch /sys/fs/cgroup/memory/system.slice/run-[UUID].scope: no space left on device

Environment

  • Platform9 Managed Kubernetes - v4.0 and higher

  • kubelet

Cause

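The kubelet service fails to start because the node has run out of inotify watches, the kernel resource that lets programs monitor files and directories for changes. Despite its wording, the "no space left on device" message does not refer to disk space: inotify_add_watch returns this error (ENOSPC) when the per-user limit on inotify watches has been reached.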

Resolution

  1. Identify the current setting.

# cat /proc/sys/fs/inotify/max_user_watches

  2. The output resembles the following:

8192

  3. Increase the value.

  4. Make the change persistent across reboots.
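
The last two steps can be sketched as follows. This is a minimal sketch rather than the article's original commands: the value 1048576 and the file name /etc/sysctl.d/99-inotify.conf are assumptions chosen for illustration, and the helper names persist_line and apply_watches are hypothetical. Pick a value that suits the node's workload.

```shell
#!/bin/sh
# Example value only (assumption); the right limit depends on the workload.
NEW_WATCHES="${NEW_WATCHES:-1048576}"
KEY="fs.inotify.max_user_watches"

# Print the sysctl configuration line that persists the setting.
persist_line() {
    printf '%s = %s\n' "$KEY" "$1"
}

# Raise the limit at runtime, write it to a sysctl.d file (hypothetical
# file name), then reload all sysctl configuration. Must be run as root.
apply_watches() {
    sysctl -w "$KEY=$1" &&
    persist_line "$1" > /etc/sysctl.d/99-inotify.conf &&
    sysctl --system
}
```

Run as root, apply_watches "$NEW_WATCHES" changes the running kernel immediately and keeps the setting after a reboot. Repeating the cat command from the first step should then show the new value.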

Additional Information

  • This is a known bug, #10421, in Kubernetes.

  • Use the attached script inotify_watcher_count.sh to find out which application is using the inotify resource and the exact count of inotify watches on a system.

    • Note: The script requires root access to parse the details from the /proc filesystem.
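
If the attached script is not at hand, the same information can be approximated by counting the "inotify" lines in each process's fdinfo entries. The following is a rough sketch of that idea, not the contents of inotify_watcher_count.sh. The function name count_inotify_watches and its optional argument (an alternative proc root, useful only for testing) are hypothetical.

```shell
#!/bin/sh
# Print "<watch count> <pid> <command>" for each process that holds
# inotify watches, highest count first. Run as root to see all processes.
count_inotify_watches() {
    proc_root="${1:-/proc}"
    for fdinfo_dir in "$proc_root"/[0-9]*/fdinfo; do
        [ -d "$fdinfo_dir" ] || continue
        pid=${fdinfo_dir%/fdinfo}
        pid=${pid##*/}
        # Each watch is listed as a line starting with "inotify " in fdinfo.
        count=$(cat "$fdinfo_dir"/* 2>/dev/null | grep -c '^inotify ' || true)
        [ "$count" -gt 0 ] || continue
        name=$(cat "$proc_root/$pid/comm" 2>/dev/null || true)
        printf '%s %s %s\n' "$count" "$pid" "${name:-unknown}"
    done | sort -rn
}
```

Calling count_inotify_watches | head lists the heaviest consumers. Because the max_user_watches limit applies per user, the counts for all processes owned by the same user add up against it.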
