Tuning Kubelet Garbage Collection & Eviction Thresholds for Devicemapper

Problem

Kubelet does not perform garbage collection with Docker as the underlying Container Runtime using the Device Mapper storage driver.

Environment

  • Platform9 Managed Kubernetes – All Versions
  • Kubelet
  • Docker
  • Devicemapper

Cause

Based on an alleged discrepancy in the Kubernetes code, and on observations made when querying the Kubelet resource metrics, Kubelet does not appear to record image filesystem usage against the Device Mapper (DM) thin pool. Instead, it bases the disk capacity on the root disk.
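
A minimal sketch of how this observation could be checked, assuming kubectl access to the cluster and jq on the workstation. It reads the Kubelet Summary API and prints the capacity Kubelet reports for the image filesystem next to the capacity of the node root filesystem. Identical values on a Device Mapper node would be consistent with the behavior described here. The function name summary_capacities and the NODE placeholder are illustrative and do not come from the original article.

```shell
# Hypothetical helper (not from the original article).
# Reads Kubelet Summary API JSON on stdin and prints the capacity that
# Kubelet reports for the image filesystem and for the node root filesystem.
summary_capacities() {
  jq -r '"imageFs=\(.node.runtime.imageFs.capacityBytes) rootFs=\(.node.fs.capacityBytes)"'
}

# Usage (NODE is a placeholder for the worker node name):
#   kubectl get --raw "/api/v1/nodes/${NODE}/proxy/stats/summary" | summary_capacities
```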

Resolution

Option A: Configure Docker with the overlay2 Storage Driver

  1. Stop the Hostagent and Nodelet daemon services on each worker node.

The node will now show as offline in the Platform9 UI, and you may receive a host-down notification.

  2. Issue a stop for the Nodelet phases.

All running pods will be drained and all running containers destroyed. Kubelet will stop reporting its status, and the Docker daemon will also be brought down.

  3. Follow Steps #2-#4 from Configuring Docker with the overlay2 Storage Driver.
  4. Start the Hostagent service.
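
The service handling in the steps above can be sketched as follows. This is a hedged illustration rather than the article's original commands: the unit names pf9-hostagent and pf9-nodeletd and the nodeletd binary path are assumptions to verify on your hosts, and the run, stop_node_services and start_hostagent helpers are hypothetical. Commands are only printed unless DRY_RUN=0 is set.

```shell
# Assumed names; verify them on your hosts before use.
HOSTAGENT_UNIT="${HOSTAGENT_UNIT:-pf9-hostagent}"
NODELET_UNIT="${NODELET_UNIT:-pf9-nodeletd}"
NODELETD_BIN="${NODELETD_BIN:-/opt/pf9/nodelet/nodeletd}"

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Stop the Hostagent and Nodelet services, then stop the Nodelet phases.
# Stopping the phases drains pods and brings Docker down.
stop_node_services() {
  run sudo systemctl stop "$HOSTAGENT_UNIT" "$NODELET_UNIT" &&
    run sudo "$NODELETD_BIN" phases stop
}

# Start the Hostagent service once the storage driver has been reconfigured.
start_hostagent() {
  run sudo systemctl start "$HOSTAGENT_UNIT"
}
```

Calling stop_node_services first with the default dry run shows exactly what would be executed, which is a cheap safeguard on a node that still hosts workloads.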

Option B: Tune Kubelet Parameters for Garbage Collection & Eviction Thresholds

  1. Run the docker info command on the worker node and identify the Data loop file.
  2. Check the size of the disk/partition on which the data loop file resides and note it down.
  3. Check the size of the data loop file itself and note it down as well.
  4. Back up the current worker ConfigMap, worker-default-kubelet-config.
  5. Edit the worker-default-kubelet-config ConfigMap and set the Garbage Collection (GC) and Eviction Threshold parameters.
  6. If Kubelet does not consume the updated configuration automatically, restart the Kubelet service on the worker(s).
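
The steps above can be sketched in shell as follows. This is an illustration under stated assumptions, not the article's original commands. The helper names data_loop_file and scale_threshold are hypothetical, and the kube-system namespace and the pf9-kubelet unit name are assumptions. The scaling rule is one plausible reading of why both sizes are noted down: if Kubelet measures usage against the root disk, a threshold meant as a percentage of the thin pool has to be reduced by the ratio of pool size to disk size. imageGCHighThresholdPercent and imageGCLowThresholdPercent are existing KubeletConfiguration fields; the example numbers are made up.

```shell
# Hypothetical helper: extract the "Data loop file" path from
# `docker info` output supplied on stdin.
data_loop_file() {
  awk -F': ' '/Data loop file/ { print $2; exit }'
}

# Hypothetical helper: scale a percentage that was chosen relative to the
# thin pool so that it applies to the root-disk capacity instead.
#   $1 = threshold in percent of the thin pool
#   $2 = data loop file size, $3 = disk size (same unit for both)
# Prints the scaled percentage, rounded down to an integer.
scale_threshold() {
  awk -v pct="$1" -v pool="$2" -v disk="$3" \
    'BEGIN { if (disk <= 0) exit 1; printf "%d\n", (pct * pool) / disk }'
}

# Usage on a worker node (shown as comments so that nothing runs on paste):
#   LOOP=$(docker info 2>/dev/null | data_loop_file)
#   df -h "$(dirname "$LOOP")"   # size of the disk/partition holding the loop file
#   ls -lh "$LOOP"               # apparent size of the (sparse) loop file
#   kubectl -n kube-system get configmap worker-default-kubelet-config -o yaml \
#     > worker-default-kubelet-config.bak.yaml
#   scale_threshold 85 100 500   # 85% of a 100 GB pool on a 500 GB disk
#   kubectl -n kube-system edit configmap worker-default-kubelet-config
#   sudo systemctl restart pf9-kubelet   # only if the change is not picked up
```

With the example numbers, scale_threshold 85 100 500 prints 17, so under this reading imageGCHighThresholdPercent would be set to about 17 rather than 85. Treat the result as a starting point and confirm it against the usage Kubelet actually reports.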

Troubleshooting

Scenario: Kubelet Crashed

If Kubelet has crashed with an unexplained stack trace or error, the updated configuration most likely contains an error. Take the following steps to restore the worker(s).

  1. Back up the Kubelet dynamic configuration directory.
  2. Recursively remove the directory.
  3. Restart the Kubelet service.
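
A minimal sketch of the recovery steps above, with assumptions labeled. The dynamic configuration directory is whatever path Kubelet was started with through its --dynamic-config-dir flag, so it is passed in as an argument rather than guessed. The pf9-kubelet unit name is an assumption, and the run and reset_dynamic_config helpers are hypothetical. Commands are only printed unless DRY_RUN=0 is set.

```shell
KUBELET_UNIT="${KUBELET_UNIT:-pf9-kubelet}"   # assumed unit name; verify on your host

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Back up the dynamic configuration directory, remove it, restart Kubelet.
#   $1 = path of the Kubelet dynamic configuration directory
reset_dynamic_config() {
  dir="$1"
  # Refuse to act on an empty path or on the filesystem root.
  if [ -z "$dir" ] || [ "$dir" = "/" ]; then
    echo "usage: reset_dynamic_config <dynamic-config-dir>" >&2
    return 1
  fi
  backup="${dir%/}.bak.$(date +%Y%m%d%H%M%S)"
  # Each step runs only if the previous one succeeded, so the directory
  # is never removed without a completed backup.
  run sudo cp -a "$dir" "$backup" &&
    run sudo rm -rf "$dir" &&
    run sudo systemctl restart "$KUBELET_UNIT"
}
```

Running the function once in the default dry-run mode prints the three commands with the resolved backup path, which makes it easy to confirm the target directory before anything is deleted.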

Additional Information
