Enable ETCD Encryption in Existing PMK Clusters
Problem
- How to enable ETCD secret encryption on an existing PMK cluster?
Environment
- Platform9 Managed Kubernetes - v5.7 and higher
- ETCD - v3.x and higher
Plan for a downtime window of around 45 minutes to 1 hour per master node to complete the process.
Applying the suggested changes themselves should not take that long; the extra time is a buffer in case any issues occur after restarting the PF9 stack or when the changes are applied at the cluster level.
Procedure
1. Check etcd Version:
- Proceed with the next steps only if the etcd version is v3.x or higher.
ETCDCTL_API=3 /opt/pf9/pf9-kube/bin/etcdctl version --endpoints=https://127.0.0.1:2379
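- The output should resemble the following (version numbers are illustrative and will vary per cluster):
etcdctl version: 3.x.x
API version: 3.x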
The steps that need to be executed on the master nodes can either be performed by the customer OR, if ARS is enabled, Platform9 can perform them by taking access of the respective master nodes.
2. Backup existing etcd database:
- This step is purely a safety precaution and verifies that etcd is healthy before starting the secret encryption process (commands are to be executed from the master node).
# ETCDCTL_API=3 /opt/pf9/pf9-kube/bin/etcdctl member list --cacert=/etc/pf9/kube.d/certs/etcdctl/etcd/ca.crt --cert=/etc/pf9/kube.d/certs/etcdctl/etcd/request.crt --key=/etc/pf9/kube.d/certs/etcdctl/etcd/request.key --endpoints=https://127.0.0.1:2379 --write-out=table
# mkdir ~/etcd_backup
# ETCDCTL_API=3 /opt/pf9/pf9-kube/bin/etcdctl snapshot save ~/etcd_backup/snapshot-$(date +%Y-%m-%d_%H:%M:%S_%Z).db --cacert=/etc/pf9/kube.d/certs/etcdctl/etcd/ca.crt --cert=/etc/pf9/kube.d/certs/etcdctl/etcd/request.crt --key=/etc/pf9/kube.d/certs/etcdctl/etcd/request.key --endpoints=https://127.0.0.1:2379
- Verify the generated snapshot db:
# ETCDCTL_API=3 /opt/pf9/pf9-kube/bin/etcdctl snapshot status ~/etcd_backup/snapshot-*.db --write-out=table
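- The status output is a small table summarizing the snapshot; an illustrative example (values will differ per cluster) looks like:
+----------+----------+------------+------------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| xxxxxxxx |  xxxxxx  |    xxxx    |   xx MB    |
+----------+----------+------------+------------+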
3. Check that all the secrets get listed:
# kubectl get secrets --all-namespaces
4. Enabling Encryption Provider:
- Generate a 32-byte random key and base64 encode it. Use this key as a secret in the encryption provider.
# head -c 32 /dev/urandom | base64
2MpxxxxxxxxxxxxxPZJk==
- On every master node, create an encryption provider configuration file as shown below. Set the secret value to the random key generated above.
# vi /var/opt/pf9/kube/apiserver-config/encryption-provider.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: 2MpxxxxxxxxxxxxxPZJk=
      - identity: {}
- Set the ownership and permissions of the file.
# chown pf9:pf9group /var/opt/pf9/kube/apiserver-config/encryption-provider.yaml
# chmod 640 /var/opt/pf9/kube/apiserver-config/encryption-provider.yaml
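- To confirm, list the file; it should be owned by pf9:pf9group with mode 640 (-rw-r-----). Size and date below are placeholders:
# ls -l /var/opt/pf9/kube/apiserver-config/encryption-provider.yaml
-rw-r----- 1 pf9 pf9group <size> <date> /var/opt/pf9/kube/apiserver-config/encryption-provider.yaml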
5. Add kube-apiserver flag
- To add the required kube-apiserver flag, please reach out to the Platform9 support team. The Platform9 support team will add the "--encryption-provider-config=/var/opt/pf9/kube/apiserver-config/encryption-provider.yaml" kube-apiserver flag for the respective cluster in the qbert database.
6. Restart the PF9 Stack
- Once the required kube-apiserver flag has been added by the Platform9 support team, load the changes at the cluster level. This involves restarting the PF9 stack by running the below commands on ALL the master nodes.
DO NOT RUN THE COMMANDS SIMULTANEOUSLY ON ALL THE MASTER NODES. Only after the changes have been applied successfully on one master should you proceed to the next master.
It is PREFERRED to apply the changes on the master node that holds the VIP last!
# systemctl stop pf9-hostagent pf9-nodeletd
# /opt/pf9/nodelet/nodeletd phases stop
# systemctl start pf9-hostagent
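- Wait for the node to converge after the restart. As a quick check (from any machine with kubeconfig access to the cluster), verify that the node reports Ready and that its k8s-master pod has come back up before moving on to the next master:
# kubectl get nodes
# kubectl get pods -n kube-system | grep k8s-master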
7. Perform Validations
- It takes some time for the changes to be fully applied to the node after restarting the PF9 stack. To verify, check whether the k8s-master-xxx pod has been recreated with the flag that the Platform9 support team added in the qbert database.
❯ kubectl get pod -n kube-system k8s-master-da1t-gen-kub-con004 -o yaml
apiVersion: v1
kind: Pod
[ ]
  name: k8s-master-da1t-gen-kub-con004
  namespace: kube-system
spec:
  containers:
  [ ]
  - command:
    - kube-apiserver
    [ ]
    - --encryption-provider-config=/var/opt/pf9/kube/apiserver-config/encryption-provider.yaml
    [ ]
    name: kube-apiserver
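- As a quicker alternative check, the pod definition can be grepped for the flag (the pod name will differ per cluster):
❯ kubectl get pod -n kube-system k8s-master-da1t-gen-kub-con004 -o yaml | grep encryption-provider-config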
- Also verify that all the Kubernetes secrets are getting loaded/listed with the below command.
# kubectl get secret -A
- If the above command gives an error like the one below, perform the next step.
# kubectl get secret -A
Error from server (InternalError): Internal error occurred: unable to transform key "/registry/secrets/kubernetes-dashboard/kubernetes-dashboard-certs": no matching prefix found
- Run the below command to replace all the secrets so that they are re-written with the new encryption key, and then check again that all the secrets get listed. If an error occurs due to a conflicting write, retry the command; it is safe to run it more than once. (This step is required only if you notice the error mentioned in the previous step.)
# kubectl get secrets -A -oyaml | kubectl replace -f -
# kubectl get secret -A
8. Check Encryption is Working as Expected
- Going forward, data will be encrypted when it is written to etcd. Once the kube-apiserver has been reloaded as shown above, any newly created or updated secret should be encrypted when stored. To check this, create a new secret and then use the etcdctl command to retrieve the contents of the secret data as shown below.
# kubectl create secret generic secret1 -n default --from-literal=mykey=mydata
# ETCDCTL_API=3 /opt/pf9/pf9-kube/bin/etcdctl \
--cacert=/etc/pf9/kube.d/certs/etcdctl/etcd/ca.crt \
--cert=/etc/pf9/kube.d/certs/etcdctl/etcd/request.crt \
--key=/etc/pf9/kube.d/certs/etcdctl/etcd/request.key \
--endpoints=https://127.0.0.1:2379 \
get /registry/secrets/default/secret1 | hexdump -C
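- In the hexdump output, the stored value should begin with the prefix k8s:enc:aescbc:v1:key1, confirming that the secret was encrypted with the aescbc provider using key1; the plain-text mydata value should not appear anywhere in the dump. As an optional quick way to see just the prefix (same certificate flags as above):
# ETCDCTL_API=3 /opt/pf9/pf9-kube/bin/etcdctl --cacert=/etc/pf9/kube.d/certs/etcdctl/etcd/ca.crt --cert=/etc/pf9/kube.d/certs/etcdctl/etcd/request.crt --key=/etc/pf9/kube.d/certs/etcdctl/etcd/request.key --endpoints=https://127.0.0.1:2379 get /registry/secrets/default/secret1 --print-value-only | head -c 30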
Secret key rotation
Changing an encryption key for Kubernetes without incurring downtime requires a multi-step operation, especially in a highly-available deployment where multiple kube-apiserver processes/pods are running.
- Generate a new 32-byte random key and base64 encode it. Use this key as a secret in the encryption provider.
# head -c 32 /dev/urandom | base64
JF4G/xxxxxxxxxxxTdo8=
- Add the newly generated key (key2 in this case) to the encryption provider file on each master node.
# cat /var/opt/pf9/kube/apiserver-config/encryption-provider.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: 2MpxxxxxxxxxxxxxPZJk=
            - name: key2
              secret: JF4G/xxxxxxxxxxxTdo8=
      - identity: {}
- Delete the k8s-master-<node-ip-address> pods one at a time, preferably starting with the pods that do not host the VIP (see the example commands below). This starts a new apiserver container with the updated encryption provider file. Make sure the apiserver pod has restarted and all its containers are in ‘READY’ state before moving on to the next one.
- Make a secure backup of the new encryption key. If you lose all copies of this key, you would need to delete all the resources that were encrypted under the lost key, and workloads may not operate as expected during the time that at-rest encryption is broken.
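- To delete and watch a master pod as described above, one node at a time (illustrative; the pod name follows the k8s-master-<node-ip-address> pattern in your cluster):
# kubectl delete pod -n kube-system k8s-master-<node-ip-address>
# kubectl get pod -n kube-system k8s-master-<node-ip-address> -w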
- Make the new key2 the first entry in the keys array so that it is used for encryption-at-rest for new writes.
[ ]
          keys:
            - name: key2
              secret: JF4G/xxxxxxxxxxxTdo8=
            - name: key1
              secret: 2MpxxxxxxxxxxxxxPZJk=
- Repeat the pod deletion step above (delete the k8s-master pods one at a time) to ensure each control plane host now encrypts using the new key2.
- As a privileged user, run the below command to re-encrypt all existing secrets with the new key2.
# kubectl get secrets --all-namespaces -o json | kubectl replace -f -
- After you have updated all existing secrets to use the new key and have made a secure backup of the new key, remove the old decryption key from the encryption-provider.yaml configuration file.
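- After removing key1, the configuration file would look like the following (illustrative, using the placeholder values from above). Delete the k8s-master pods once more, one at a time, so that the kube-apiserver reloads the file and no longer holds the old key.
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key2
              secret: JF4G/xxxxxxxxxxxTdo8=
      - identity: {}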
General Notes
- The above process is based on the Kubernetes documentation: Encrypting Confidential Data at Rest