How to Enable Graceful Node Shutdown?
Problem
- How to Enable Graceful Node Shutdown in a PMK cluster?
Environment
- Platform9 Managed Kubernetes - v5.2 Onward
- Kubernetes - v1.20 and Above
Procedure
- To enable the Graceful Node Shutdown feature, changes must be made to the node's Kubelet configuration.
- For a cluster running Kubernetes v1.20, the GracefulNodeShutdown feature must be explicitly enabled under featureGates, because it is an alpha feature in that release. From Kubernetes v1.21 onward it is a beta feature and is enabled by default.
- To enable GracefulNodeShutdown under featureGates, edit the required master/worker configmap. Along with the feature gate, the values shutdownGracePeriod and shutdownGracePeriodCriticalPods must be set. Both default to zero, which leaves graceful shutdown inactive.
So that this change is applied safely, without interference from the services that track the status of the pf9-kubelet service, first stop the pf9-hostagent and pf9-nodeletd services on ALL the worker or master nodes, depending on which configmap is being edited.
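The article does not show a command for the stop step. The following is a minimal sketch, assuming both services are systemd units with the same names used in the text (the start command later in this procedure uses systemctl in the same way). The function name and the SYSTEMCTL override are hypothetical helpers introduced here for illustration.

```shell
#!/bin/sh
# Sketch only: assumes pf9-hostagent and pf9-nodeletd are systemd units
# with exactly these names (the names are taken from this article).
PF9_SERVICES="pf9-hostagent pf9-nodeletd"

# Stop each Platform9 service on the local node, aborting on the first failure.
# SYSTEMCTL is a hypothetical override for dry runs, e.g. SYSTEMCTL="echo systemctl".
stop_pf9_services() {
    for svc in $PF9_SERVICES; do
        ${SYSTEMCTL:-sudo systemctl} stop "$svc" || return 1
    done
}
```

Run `stop_pf9_services` on every node covered by the configmap being edited. Setting `SYSTEMCTL="echo systemctl"` beforehand prints the commands instead of executing them, which may be useful for a dry run.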
- There are two default-kubelet-config configmaps in the kube-system namespace: master-default-kubelet-config for master nodes and worker-default-kubelet-config for worker nodes.
- Edit the appropriate configmap according to the requirement. In this example, the worker configmap is changed.
    # kubectl edit configmap worker-default-kubelet-config -n kube-system
    configmap/worker-default-kubelet-config edited
    # kubectl get configmap worker-default-kubelet-config -n kube-system -o yaml
    apiVersion: v1
    data:
      kubelet: |
        apiVersion: kubelet.config.k8s.io/v1beta1
        kind: KubeletConfiguration
        ...
        featureGates:
          DynamicKubeletConfig: true
          GracefulNodeShutdown: true
        maxPods: 200
        shutdownGracePeriod: 30s
        shutdownGracePeriodCriticalPods: 10s
- Start the pf9-hostagent service on all the worker/master nodes where the service was stopped initially. This will eventually start the pf9-nodeletd service.
    sudo systemctl start pf9-hostagent
- After editing the configmap, verify from the worker node whether the GracefulNodeShutdown feature has been enabled.
    # less /var/log/pf9/kubelet/kubelet.INFO | grep "feature gates"
    I1208 22:01:52.214565 22305 feature_gate.go:243] feature gates: &{map[]}
    I1208 22:01:52.217698 22305 feature_gate.go:243] feature gates: &{map[DynamicKubeletConfig:true]}
    I1208 22:01:52.217779 22305 feature_gate.go:243] feature gates: &{map[DynamicKubeletConfig:true]}
    I1208 22:01:52.220247 22305 feature_gate.go:243] feature gates: &{map[DynamicKubeletConfig:true GracefulNodeShutdown:true]}
    I1208 22:01:52.220355 22305 feature_gate.go:243] feature gates: &{map[DynamicKubeletConfig:true GracefulNodeShutdown:true]}
    I1208 22:01:52.243430 22305 feature_gate.go:243] feature gates: &{map[DynamicKubeletConfig:true GracefulNodeShutdown:true]}
    I1208 22:01:52.243550 22305 feature_gate.go:243] feature gates: &{map[DynamicKubeletConfig:true GracefulNodeShutdown:true]}
    [root@worker01 ~]# systemd-inhibit --list
    Who: kubelet (UID 0/root, PID 22305/kubelet)
    What: shutdown
    Why: Kubelet needs time to handle node shutdown
    Mode: delay
    1 inhibitors listed.
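The inhibitor check can also be scripted. The sketch below is an illustration, not part of the Platform9 tooling: the function name is hypothetical, and it assumes the multi-line Who/What/Why/Mode output format shown in this article. Some systemd versions print a table instead, which this sketch does not parse.

```shell
#!/bin/sh
# Succeeds (exit 0) if the text on stdin lists a kubelet inhibitor
# for "shutdown" in "delay" mode; fails (exit 1) otherwise.
# Assumes the Who:/What:/Mode: record layout shown in this article.
has_kubelet_shutdown_inhibitor() {
    awk '
        /Who:/  { who  = ($0 ~ /kubelet/) }
        /What:/ { what = ($0 ~ /shutdown/) }
        /Mode:/ { if (who && what && $0 ~ /delay/) found = 1 }
        END     { exit(found ? 0 : 1) }
    '
}
```

On a node, it could be used as `systemd-inhibit --list | has_kubelet_shutdown_inhibitor && echo "kubelet shutdown inhibitor present"`.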