Ability to Define Specific CPUs/Cores for the ETCD Container
Problem
In production environments, it must be possible to pin the ETCD container to specific CPUs or cores so that it does not interfere with other applications that use guaranteed CPUs assigned through the kubelet’s CPU manager policy.
Environment
- Platform9 Edge Cloud - v5.3.
- Kubernetes - v1.20.
Cause
Since ETCD runs as a Docker container and is not managed by the kubelet, it does not inherit the kubelet CPU manager policy and, by default, can use all available CPUs on the system.
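As a minimal sketch of how to confirm this on a master node, the container's cpuset can be read from Docker's `HostConfig.CpusetCpus` field, which is empty when no restriction is set. The helper name `describe_cpuset` is hypothetical, and the container name `etcd` is taken from the validation commands in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: interpret the CpusetCpus value Docker reports for a container.
# An empty value means no cpuset restriction, so the container may run on any CPU.
describe_cpuset() {
    local cpuset="$1"
    if [[ -z "$cpuset" ]]; then
        echo "unrestricted (all CPUs)"
    else
        echo "restricted to CPUs $cpuset"
    fi
}

# Usage on a master node (requires Docker and a container named "etcd"):
#   describe_cpuset "$(sudo docker inspect --format '{{.HostConfig.CpusetCpus}}' etcd)"
```

Before the workaround is applied, the usage line is expected to report an unrestricted container.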
Workaround
Note: The change below does not persist across cluster upgrades.
Follow the steps below to set specific CPUs for the ETCD container. Repeat them on every master node in the cluster, one node at a time, so that quorum is maintained.
- Stop the stack and set the required permissions:
$ sudo systemctl stop pf9-hostagent pf9-nodeletd
$ sudo /opt/pf9/nodelet/nodeletd phases stop
$ sudo chmod 644 /opt/pf9/pf9-kube/master_utils.sh
- Edit the file /opt/pf9/pf9-kube/master_utils.sh to add the --cpuset-cpus argument:
$ sudo vi /opt/pf9/pf9-kube/master_utils.sh
function ensure_etcd_running()
{
local node_endpoint=$1
mkdir -p "$ETCD_DATA_DIR"
chmod 0700 "$ETCD_DATA_DIR"
local etcd_log_level="info"
if [[ "${DEBUG}" == 'true' ]]; then
etcd_log_level="debug"
fi
# ETCD_LOG_LEVEL: --debug flag and ETCD_DEBUG to be deprecated in v3.5
# ETCD_LOGGER: default logger capnslog to be deprecated in v3.5, using zap
# ETCD_ENABLE_V2: Need this for flannel's compatibility with etcd v3.4.14
# Added --cpuset-cpus to pin the ETCD container to CPUs 0-3
local run_opts="--net=host \
--detach=true \
--cpuset-cpus=0-3 \
--volume /etc/ssl:/etc/ssl \
--volume /etc/pki:/etc/pki \
--volume /etc/pf9/kube.d/certs/etcd:/certs/etcd \
--volume /etc/pf9/kube.d/certs/apiserver:/certs/apiserver \
--volume ${ETCD_DATA_DIR}:/var/etcd/data \
-e ETCD_LOG_LEVEL=${etcd_log_level} \
-e ETCD_LOGGER=zap \
-e ETCD_ENABLE_V2=true \
-e ETCD_PEER_CLIENT_CERT_AUTH=true"
$ sudo grep cpuset /opt/pf9/pf9-kube/master_utils.sh
--cpuset-cpus=0-3 \
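Because a malformed edit could prevent the ETCD container from starting, it may be worth checking the file before restarting the stack. The following is a sketch only: `check_cpuset_edit` is a hypothetical helper, and it assumes the flag value is a plain CPU list such as `0-3` or `0,2,4-7`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if FILE passes a bash syntax check and
# contains a --cpuset-cpus flag whose value is a CPU list like "0-3" or "0,2,4-7".
check_cpuset_edit() {
    local file="$1"
    # bash -n parses the script without executing it.
    bash -n "$file" || return 1
    # The value must be followed by whitespace, a line-continuation backslash,
    # or the end of the line.
    grep -Eq -- '--cpuset-cpus=[0-9]+(-[0-9]+)?(,[0-9]+(-[0-9]+)?)*([[:space:]]|\\|$)' "$file"
}

# Usage on a master node:
#   check_cpuset_edit /opt/pf9/pf9-kube/master_utils.sh && echo "edit looks valid"
```

The check is purely textual. It does not confirm that the listed CPUs exist on the host.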
- Start the stack:
$ sudo systemctl start pf9-hostagent
- Validate after the stack has started (12815 below is the PID of the etcd process in this example):
$ sudo docker inspect etcd | grep CpusetCpus
"CpusetCpus": "0-3",
$ taskset -pc 12815
pid 12815's current affinity list: 0-3
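The validation above can be scripted roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, the container is assumed to be named `etcd` as in the commands above, and the final comparison assumes Docker and `taskset` print the CPU list in the same notation (for example `0-3` in both).

```shell
#!/usr/bin/env bash
# Extract the CPU list from a line of `taskset -pc PID` output,
# e.g. "pid 12815's current affinity list: 0-3" -> "0-3".
affinity_from_taskset() {
    local line="$1"
    # Strip everything up to and including the last ": ".
    echo "${line##*: }"
}

# Hypothetical check: compare the container's configured cpuset with the
# CPU affinity of its main process. Requires Docker and a container named "etcd".
verify_etcd_affinity() {
    local cpuset pid affinity
    cpuset=$(sudo docker inspect --format '{{.HostConfig.CpusetCpus}}' etcd) || return 1
    pid=$(sudo docker inspect --format '{{.State.Pid}}' etcd) || return 1
    affinity=$(affinity_from_taskset "$(taskset -pc "$pid")")
    # Plain string comparison; assumes both tools use the same list notation.
    [[ "$cpuset" == "$affinity" ]]
}

# Usage on a master node:
#   verify_etcd_affinity && echo "etcd is pinned as configured"
```

Reading the PID from `docker inspect` avoids looking it up by hand before running `taskset`.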
Resolution
To persist the change across upgrades, the internal configuration will be updated in the SMCP-5.11 release.
Additional Information
The ETA for the SMCP-5.11 release is not confirmed. You can track progress by opening a support request that mentions Jira ID AIR-659.