Enable ETCD Encryption in Existing PMK Clusters
Problem
- How to enable ETCD secret encryption on an existing PMK cluster?
Environment
- Platform9 Managed Kubernetes - v5.7 and higher.
- etcd - v3.x and higher.
Plan for a downtime window of around 45 minutes to 1 hour per master node to complete the process. Applying the suggested changes below should not take that long by itself; the buffer is to allow for troubleshooting in case any issues occur after restarting the PF9 stack or when the changes are applied at the cluster level.
Procedure
1. Check etcd Version:
- Proceed with the next steps only if the etcd version is v3.x or higher.
# ETCDCTL_API=3 /opt/pf9/pf9-kube/bin/etcdctl version --endpoints=https://127.0.0.1:2379
The steps that need to be executed on the master nodes can either be performed by the customer or, if ARS is enabled, by Platform9 after taking access of the respective master nodes.
2. Back up the existing etcd database:
- This step is purely a safety measure and verifies that etcd is healthy before starting the secret-encryption process (commands to be executed from a master node).
# ETCDCTL_API=3 /opt/pf9/pf9-kube/bin/etcdctl member list --cacert=/etc/pf9/kube.d/certs/etcdctl/etcd/ca.crt --cert=/etc/pf9/kube.d/certs/etcdctl/etcd/request.crt --key=/etc/pf9/kube.d/certs/etcdctl/etcd/request.key --endpoints=https://127.0.0.1:2379 --write-out=table
# mkdir ~/etcd_backup
# ETCDCTL_API=3 /opt/pf9/pf9-kube/bin/etcdctl snapshot save ~/etcd_backup/snapshot-$(date +%Y-%m-%d_%H:%M:%S_%Z).db --cacert=/etc/pf9/kube.d/certs/etcdctl/etcd/ca.crt --cert=/etc/pf9/kube.d/certs/etcdctl/etcd/request.crt --key=/etc/pf9/kube.d/certs/etcdctl/etcd/request.key --endpoints=https://127.0.0.1:2379
//=> Verify the generated db.
# ETCDCTL_API=3 /opt/pf9/pf9-kube/bin/etcdctl snapshot status ~/etcd_backup/snapshot-*.db --write-out=table
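Should a restore ever be required, the snapshot taken above can be restored with etcdctl. The target data directory and the surrounding stop/start of services are environment-specific, so treat the following as a minimal sketch only (the --data-dir path here is an illustrative assumption, not the PF9 default) and involve Platform9 support before restoring on a live cluster.
//=> Minimal sketch; stop the etcd member and confirm the correct data directory first.
# ETCDCTL_API=3 /opt/pf9/pf9-kube/bin/etcdctl snapshot restore ~/etcd_backup/snapshot-<timestamp>.db --data-dir=/var/opt/pf9/kube/etcd-restored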
3. Check that all the secrets get listed:
# kubectl get secrets --all-namespaces
4. Enable the Encryption Provider:
- Generate a 32-byte random key and base64 encode it. Use this key as a secret in the encryption provider.
# head -c 32 /dev/urandom | base64
2MpxxxxxxxxxxxxxPZJk==
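Before placing the key in the configuration, it can be worth confirming that it decodes to exactly 32 bytes, since an AES-CBC key of the wrong length can prevent the kube-apiserver from starting. A minimal check, using the placeholder value from above (substitute your own generated key):
# echo "2MpxxxxxxxxxxxxxPZJk==" | base64 -d | wc -c
//=> A valid key prints 32.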
- On every master node, create an encryption provider configuration file as shown below. Set the secret value to the random key generated in the previous step.
# vi /var/opt/pf9/kube/apiserver-config/encryption-provider.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: 2MpxxxxxxxxxxxxxPZJk=
      - identity: {}
- Set the ownership and permissions of the file.
# chown pf9:pf9group /var/opt/pf9/kube/apiserver-config/encryption-provider.yaml
# chmod 640 /var/opt/pf9/kube/apiserver-config/encryption-provider.yaml
5. Add the kube-apiserver flag
- To add the required kube-apiserver flags, please reach out to the Platform9 support team.
The Platform9 support team will add the "--encryption-provider-config=/var/opt/pf9/kube/apiserver-config/encryption-provider.yaml" kube-apiserver flag for the respective cluster in the qbert database.
6. Restart the PF9 Stack
- Once the required kube-apiserver flags have been added by the Platform9 support team, load the changes at the cluster level. This involves restarting the PF9 stack by running the commands below on ALL the master nodes.
DO NOT RUN THE COMMANDS SIMULTANEOUSLY ON ALL THE MASTER NODES. Proceed to the next master only after the changes have been applied successfully on the current one.
It is PREFERRED to apply the changes last on the master node that holds the VIP. A quick way to identify that node is shown below.
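To check which master currently holds the VIP, one option is to look for the cluster's virtual IP on each node's network interfaces (a sketch; substitute the actual VIP of the cluster):
# ip addr show | grep -w "<cluster-VIP>"
//=> The node where this prints a matching inet entry currently holds the VIP.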
# systemctl stop pf9-hostagent pf9-nodeletd
# /opt/pf9/nodelet/nodeletd phases stop
# systemctl start pf9-hostagent
7. Perform Validations
- It takes some time for the changes to be fully applied to the node after restarting the PF9 stack. To verify, check whether the k8s-master-xxx pod has been recreated with the flag that the Platform9 support team added in the qbert database.
❯ kubectl get pod -n kube-system k8s-master-da1t-gen-kub-con004 -o yaml
apiVersion: v1
kind: Pod
[...]
  name: k8s-master-da1t-gen-kub-con004
  namespace: kube-system
spec:
  containers:
  [...]
  - command:
    - kube-apiserver
    [...]
    - --encryption-provider-config=/var/opt/pf9/kube/apiserver-config/encryption-provider.yaml
    [...]
    name: kube-apiserver
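As a quicker spot-check than reading the full pod YAML, the flag can be grepped for directly (the pod name below is from the example above and will differ per cluster):
# kubectl get pod -n kube-system k8s-master-da1t-gen-kub-con004 -o yaml | grep encryption-provider-config
//=> Should print the --encryption-provider-config flag added by the support team.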
- Also verify that all the Kubernetes secrets are getting loaded/listed with the commands below.
# kubectl get secret -A
- If the above command returns an error like the one below, perform the next step.
# kubectl get secret -A
Error from server (InternalError): Internal error occurred: unable to transform key "/registry/secrets/kubernetes-dashboard/kubernetes-dashboard-certs": no matching prefix found
- Run the command below, which rewrites all the secrets with the newly added encryption key, and then check again that all the secrets get listed. If an error occurs due to a conflicting write, retry the command; it is safe to run more than once. (This step is required only if you noticed the error mentioned in the previous step.)
# kubectl get secrets -A -oyaml | kubectl replace -f -
# kubectl get secret -A
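If the bulk replace keeps hitting conflicting writes, one way to narrow the blast radius is to rewrite the secrets one namespace at a time. A sketch, assuming a bash shell with an admin kubeconfig on the node (namespaces that contain no secrets may print a harmless "no objects passed to replace" message):
# for ns in $(kubectl get ns -o jsonpath='{.items[*].metadata.name}'); do kubectl get secrets -n "$ns" -o json | kubectl replace -f -; done
//=> Re-runs are safe; only namespaces that still fail need to be repeated.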
8. Check Encryption is Working as Expected
- Going forward, data will be encrypted when it is written to etcd. Once the kube-apiserver has been reloaded as shown above, any newly created or updated secret should be encrypted when stored. To check this, create a new secret and then use the etcdctl command to retrieve the contents of the secret data as shown below.
# kubectl create secret generic secret1 -n default --from-literal=mykey=mydata
# ETCDCTL_API=3 etcdctl \
    --cacert=/etc/pf9/kube.d/certs/etcdctl/etcd/ca.crt \
    --cert=/etc/pf9/kube.d/certs/etcdctl/etcd/request.crt \
    --key=/etc/pf9/kube.d/certs/etcdctl/etcd/request.key \
    get /registry/secrets/default/secret1 | hexdump -C
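In the hexdump output, the stored value should begin with the k8s:enc:aescbc:v1:key1 prefix rather than plaintext YAML/JSON. As a quick non-interactive check (a sketch assuming GNU grep is available for the -a binary-input flag):
# ETCDCTL_API=3 etcdctl \
    --cacert=/etc/pf9/kube.d/certs/etcdctl/etcd/ca.crt \
    --cert=/etc/pf9/kube.d/certs/etcdctl/etcd/request.crt \
    --key=/etc/pf9/kube.d/certs/etcdctl/etcd/request.key \
    get /registry/secrets/default/secret1 | grep -a -c "k8s:enc:aescbc:v1:key1"
//=> Prints 1 (or more) when the value is encrypted with key1; 0 means it is still stored in plaintext.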
Secret key rotation
Changing an encryption key for Kubernetes without incurring downtime requires a multi-step operation, especially with a highly-available deployment where multiple kube-apiserver processes/pods are running.
- Generate a new 32-byte random key and base64 encode it. Use this key as a secret in the encryption provider.
# head -c 32 /dev/urandom | base64
JF4G/xxxxxxxxxxxTdo8=
- Add the newly generated key (key2 in this case) in the encryption provider file on each master node.
# cat /var/opt/pf9/kube/apiserver-config/encryption-provider.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: 2MpxxxxxxxxxxxxxPZJk=
            - name: key2
              secret: JF4G/xxxxxxxxxxxTdo8=
      - identity: {}
- Delete the k8s-master-<node-ip-address> pods one at a time, preferably starting with the pods that do not host the VIP. This starts a new apiserver container with the updated encryption provider file in it. Make sure the apiserver pod has restarted and all containers are in 'READY' state before moving on to the next one; a sketch of this check follows below.
- Make a secure backup of the new encryption key. If you lose all copies of this key, you would need to delete all the resources that were encrypted under the lost key, and workloads may not operate as expected while encryption at rest is broken.
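A sketch of the per-node restart check referenced above (substitute the actual pod name; the wait may need to be re-run if it is issued before the replacement pod object reappears):
# kubectl delete pod -n kube-system k8s-master-<node-ip-address>
# kubectl wait --for=condition=Ready pod/k8s-master-<node-ip-address> -n kube-system --timeout=10m
//=> Proceed to the next master only after the wait reports the pod as Ready.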
- Make the new key2 the first entry in the keys array so that it is used for encryption-at-rest for new writes.
[...]
          keys:
            - name: key2
              secret: JF4G/xxxxxxxxxxxTdo8=
            - name: key1
              secret: 2MpxxxxxxxxxxxxxPZJk=
- Repeat the pod-deletion step above to ensure each control plane host now encrypts using the new key2.
- As a privileged user, run the command below to re-encrypt all existing secrets with the new key2.
# kubectl get secrets --all-namespaces -o json | kubectl replace -f -
- After you have updated all existing secrets to use the new key and have made a secure backup of the new key, remove the old decryption key from the encryption-provider.yaml configuration file.
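After the re-encryption above has completed, the file on each master node would end up looking like the sketch below, with key2 as the only AES-CBC key (placeholder values carried over from earlier). Restart the k8s-master pods once more, as before, so the apiservers stop accepting the old key.
# cat /var/opt/pf9/kube/apiserver-config/encryption-provider.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key2
              secret: JF4G/xxxxxxxxxxxTdo8=
      - identity: {}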
General Notes
- The process above is based on the Kubernetes documentation: Encrypting Confidential Data at Rest