Tuning Kubelet Garbage Collection & Eviction Thresholds for Devicemapper
Problem
Kubelet does not perform garbage collection when Docker is the underlying container runtime and is configured with the Devicemapper storage driver.
Environment
- Platform9 Managed Kubernetes – All Versions
- Kubelet
- Docker
- Devicemapper
Cause
Due to an alleged discrepancy in the Kubernetes code, and based on observations made when querying the Kubelet resource metrics, it appears that Kubelet does not record image filesystem usage against the Devicemapper thin-pool; instead, it reports capacity based on the root disk. Because the root disk is typically much larger than the thin-pool, the garbage collection and eviction thresholds may never be reached.
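To observe this on an affected worker, you can query the Kubelet Summary API through the API server proxy. The command below is a verification sketch: it assumes kubectl access to the cluster and jq installed locally, and <node-name> is a placeholder for the affected worker node.
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary" | jq '{nodeFs: .node.fs, imageFs: .node.runtime.imageFs}'
If the capacityBytes reported under imageFs matches the root disk rather than the Devicemapper thin-pool, Kubelet is sizing image garbage collection against the wrong filesystem.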
Resolution
Option A (Recommended): Switch to a Supported Storage Driver (Overlay2)
- Stop the Hostagent and Nodelet daemon services on each worker node.
systemctl stop pf9-{hostagent,nodeletd}
The node will now show as offline in the Platform9 UI, and you may receive a host-down notification.
- Issue a stop for the Nodelet phases.
sudo /opt/pf9/nodelet/nodeletd phases stop
All running pods will be drained and all running containers destroyed. Kubelet will no longer report its status, and the Docker daemon will also be brought down.
- Follow Steps #2-#4 from Configuring Docker with the overlay2 Storage Driver.
- Start the Hostagent service.
systemctl start pf9-hostagent
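Once the Hostagent is running and the node has reconverged, you can optionally confirm that Docker is now using the overlay2 driver (a quick check, not part of the documented procedure):
docker info | grep -i "storage driver"
The output should report Storage Driver: overlay2.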
Option B: Tune Kubelet Parameters for Garbage Collection & Eviction Thresholds
- Run the docker info command on the worker node and identify the Data loop file.
docker info | grep /var/lib
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Docker Root Dir: /var/lib/docker
WARNING: the devicemapper storage-driver is deprecated, and will be removed in a future release.
WARNING: devicemapper: usage of loopback devices is strongly discouraged for production use.
Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
- Check the size of the disk/partition on which the data loop file exists and note it down.
df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos00-root 1.4T 86G 1.3T 7% /
- Check the size of the data loop file itself and note it down as well.
ls -lh /var/lib/docker/devicemapper/devicemapper/data
-rw-------. 1 root root 100G Jul 13 11:52 /var/lib/docker/devicemapper/devicemapper/data
- Back up the current worker ConfigMap, worker-default-kubelet-config.
kubectl get configmap -n kube-system worker-default-kubelet-config -o yaml > worker-default-kubelet-config.yaml
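Should you later need to roll back to the original settings, the saved manifest can be re-applied. This is a recovery sketch: you may need to strip server-populated metadata fields such as resourceVersion from the file before applying it.
kubectl apply -f worker-default-kubelet-config.yaml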
- Edit the worker-default-kubelet-config ConfigMap and set the following parameters for Garbage Collection (GC) and Eviction Thresholds. (A worked example of how these values were derived follows the snippet.)
kubectl edit configmap -n kube-system worker-default-kubelet-config
evictionHard:
  imagefs.available: "89%"       # evictionSoft - 5
evictionSoft:
  imagefs.available: "94%"       # 100 - ((imagefs * 0.85) / rootdiskfs * 100)
evictionSoftGracePeriod:
  imagefs.available: "5m30s"
imageGCHighThresholdPercent: 4   # (100 - evictionSoft) - X
imageGCLowThresholdPercent: 1    # must be lower than imageGCHighThresholdPercent
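For reference, this is how the values above fall out of the example sizes gathered earlier (a 100G data loop file on a 1.4T root partition). The 0.85 factor and the 2-point offset below the soft threshold come from the formulas in the comments and are tuning choices rather than hard requirements; adjust them to your own disk sizes.
# imagefs = 100G (data loop file), rootdiskfs = 1400G (1.4T root partition)
# evictionSoft  imagefs.available = 100 - ((100 * 0.85) / 1400 * 100) = 100 - 6.07 ≈ 94%
# evictionHard  imagefs.available = 94 - 5 = 89%
# imageGCHighThresholdPercent     = (100 - 94) - 2 = 4
# imageGCLowThresholdPercent      = 1 (must stay below imageGCHighThresholdPercent)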
- If Kubelet does not consume the updated configuration automatically, restart the Kubelet service on the worker(s).
systemctl restart pf9-kubelet
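To confirm that the new thresholds are live, the node's effective Kubelet configuration can be read back through the API server proxy. This is a verification sketch: jq is assumed to be installed, and <node-name> is a placeholder for the worker node.
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/configz" | jq '.kubeletconfig | {evictionHard, evictionSoft, imageGCHighThresholdPercent, imageGCLowThresholdPercent}'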
Troubleshooting
Scenario: Kubelet Crashed
If Kubelet has crashed with an unexplained stack trace or error, it is likely that there was a mistake in the configuration. Take the following steps to restore the worker(s).
- Back up the Kubelet dynamic configuration directory.
tar -czvf dynamic-config-$(date +%s).tgz /var/opt/pf9/kube/kubelet-config/dynamic-config
- Recursively remove the directory.
rm -rf /var/opt/pf9/kube/kubelet-config/dynamic-config
- Restart the Kubelet service.
systemctl restart pf9-kubelet
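After the restart, confirm that Kubelet is healthy again before moving on (a quick check with standard tooling; the node name returned by kubectl get nodes is specific to your environment):
systemctl status pf9-kubelet --no-pager
kubectl get nodes
The service should be active (running) and the worker should return to the Ready state.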