Enable ETCD Encryption in Existing PMK Clusters

Problem

  • How to enable ETCD secret encryption on an existing PMK cluster?

Environment

  • Platform9 Managed Kubernetes (PMK) v5.7 and higher
  • etcd v3.x and higher

Plan for a downtime window of approximately 45 minutes to 1 hour per master node to complete the process.

Applying the suggested changes below should not take that long; the extra time is a buffer in case any issues occur after restarting the PF9 stack or when the changes are applied at the cluster level.

Procedure

1. Check etcd Version:

  • Proceed with the next steps only if the etcd version is v3.x or higher.
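For example, on the master node (a sketch assuming etcd runs as a container named etcd; adjust for the container runtime in use, e.g. docker or crictl):

  # Confirm the etcd image tag / binary version is v3.x or higher.
  sudo docker ps --filter name=etcd --format '{{.Names}} {{.Image}}'
  sudo docker exec etcd etcdctl version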

The steps that need to be executed on the master nodes can either be performed by the customer or, if ARS is enabled, by Platform9 after taking access of the respective master nodes.

2. Back up the existing etcd database:

  • This step is purely a safety measure and verifies that etcd is healthy before starting the secret encryption process. Run the command from the master node.
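A sketch of the backup and health check, assuming the etcd client endpoint is https://127.0.0.1:4001 (adjust if your etcd listens on 2379) and using placeholder certificate paths:

  # Save a snapshot of the etcd database before making any changes.
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://127.0.0.1:4001 \
    --cacert=<path-to-etcd-ca.crt> \
    --cert=<path-to-etcd-client.crt> \
    --key=<path-to-etcd-client.key> \
    snapshot save /root/etcd-backup-$(date +%F).db

  # Verify etcd health before proceeding.
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://127.0.0.1:4001 \
    --cacert=<path-to-etcd-ca.crt> \
    --cert=<path-to-etcd-client.crt> \
    --key=<path-to-etcd-client.key> \
    endpoint health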

3. Check that all the secrets get listed:

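For example, from the master node with a kubeconfig that has cluster-admin access:

  # The command should complete without errors and list secrets from every namespace.
  kubectl get secrets --all-namespaces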

4. Enable the Encryption Provider:

  1. Generate a 32-byte random key and base64 encode it. Use this key as a secret in the encryption provider.
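A minimal sketch using standard tools on the master node:

  # Generate a 32-byte random key and base64 encode it; keep the output for the next step.
  head -c 32 /dev/urandom | base64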
  2. On every master node, create an encryption provider configuration file as shown below, using the random key generated in the first step as the secret value.
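A sketch of the file, assuming the aescbc provider and the path referenced by the kube-apiserver flag in step 5; replace <BASE64_ENCODED_SECRET> with the value generated above:

  # /var/opt/pf9/kube/apiserver-config/encryption-provider.yaml
  apiVersion: apiserver.config.k8s.io/v1
  kind: EncryptionConfiguration
  resources:
    - resources:
        - secrets
      providers:
        - aescbc:
            keys:
              - name: key1
                secret: <BASE64_ENCODED_SECRET>
        - identity: {}

The identity provider is listed last so that secrets written before encryption was enabled can still be read.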
  3. Set the ownership and permissions of the file.
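A sketch, assuming the pf9 service user and group own the Platform9 configuration directories on the host (verify the expected ownership for your deployment):

  # Hand the file to the pf9 service account and make it readable by the owner only.
  sudo chown pf9:pf9group /var/opt/pf9/kube/apiserver-config/encryption-provider.yaml
  sudo chmod 600 /var/opt/pf9/kube/apiserver-config/encryption-provider.yaml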

5. Add kube-apiserver flag

  • To add the required kube-apiserver flags, please reach out to the Platform9 support team.

The Platform9 support team will add the "--encryption-provider-config=/var/opt/pf9/kube/apiserver-config/encryption-provider.yaml" kube-apiserver flag for the respective cluster in our qbert database.

6. Restart the PF9 Stack

  • Once the required kube-apiserver flag has been added by the Platform9 support team, load the changes at the cluster level. This includes restarting the PF9 stack by running the below commands on ALL the master nodes.

DO NOT RUN THE COMMANDS SIMULTANEOUSLY ON ALL THE MASTER NODES. Proceed to the next master only after the changes have been applied successfully on the current one.

It is PREFERRED to apply the changes on the master node that holds the VIP last!

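A sketch, assuming the stack is managed through systemd; the service names below are assumptions, so confirm them (for example with systemctl list-units 'pf9-*') or with Platform9 support before running:

  # Restart the Platform9 services on this master so the new kube-apiserver flag is applied.
  sudo systemctl restart pf9-hostagent
  sudo systemctl restart pf9-nodeletd

Wait until the control plane containers on this node are back up and healthy before repeating the commands on the next master.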

7. Perform Validations

  1. It takes some time for the changes to be fully applied to the node after the PF9 stack is restarted. To verify, check whether the k8s-master-xxx pod has been recreated with the flag that the Platform9 support team added in the qbert database.
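For example, substituting the node's address in the pod name:

  # The recreated control plane pod should reference the encryption provider file.
  kubectl -n kube-system get pod k8s-master-<node-ip-address> -o yaml | grep encryption-provider-config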
  2. Also verify that all the Kubernetes secrets are loaded/listed with the below command.
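For example:

  # All secrets should list without API server errors.
  kubectl get secrets --all-namespaces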
  3. If the above command returns an internal error from the API server, perform the next step.
  4. Run the below command, which rewrites all the secrets using the newly added encryption key, and then check again that all the secrets get listed. If an error occurs due to a conflicting write, retry the command; it is safe to run it more than once. (This step is required only if you notice the error mentioned in the previous step.)
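The standard way to force re-encryption is to read every secret and write it back unchanged:

  # Rewrites all secrets so they are stored encrypted with the key from the provider config.
  kubectl get secrets --all-namespaces -o json | kubectl replace -f -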

8. Check That Encryption Is Working as Expected

  • Going forward, data will be encrypted when it is written to etcd. Once the kube-apiserver has been reloaded as shown above, any newly created or updated secret should be encrypted when stored. To check this, create a new secret and then use the etcdctl command to retrieve the contents of the secret data as shown below.
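A sketch, reusing the etcdctl endpoint and certificate placeholders from step 2; the secret name and namespace are only examples:

  # Create a test secret.
  kubectl create secret generic secret1 -n default --from-literal=mykey=mydata

  # Read the raw record from etcd; with aescbc encryption enabled the stored value
  # should begin with the k8s:enc:aescbc:v1: prefix instead of showing "mydata" in plain text.
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://127.0.0.1:4001 \
    --cacert=<path-to-etcd-ca.crt> \
    --cert=<path-to-etcd-client.crt> \
    --key=<path-to-etcd-client.key> \
    get /registry/secrets/default/secret1 | hexdump -C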

Secret key rotation

Changing an encryption key for Kubernetes without incurring downtime requires a multi-step operation, especially in the presence of a highly-available deployment where multiple kube-apiserver processes/pods are running.

  1. Generate a new 32-byte random key and base64 encode it. Use this key as a secret in the encryption provider.
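As in the initial setup, a minimal sketch:

  # Generate the new key (key2) and base64 encode it.
  head -c 32 /dev/urandom | base64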
  2. Add the newly generated key (key2 in this case) to the encryption provider file on each master node.
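A sketch of the updated providers section, with key2 added as a second key while key1 stays first and continues to be used for new writes at this stage:

  providers:
    - aescbc:
        keys:
          - name: key1
            secret: <EXISTING_BASE64_KEY>
          - name: key2
            secret: <NEW_BASE64_KEY>
    - identity: {}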
  3. Delete the k8s-master-<node-ip-address> pods one at a time, preferably starting with the pods that do not host the VIP. This starts a new apiserver container with the updated encryption provider file. Make sure the apiserver pod has restarted and all containers are in the 'READY' state before moving on to the next one.
  4. Make a secure backup of the new encryption key. If you lose all copies of this key, you would need to delete all the resources that were encrypted under the lost key, and workloads may not operate as expected while at-rest encryption is broken.
  5. Make the new key2 the first entry in the keys array so that it is used for encryption at rest for new writes.
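A sketch of the reordered providers section; key2 now encrypts new writes and key1 is kept so existing data can still be decrypted:

  providers:
    - aescbc:
        keys:
          - name: key2
            secret: <NEW_BASE64_KEY>
          - name: key1
            secret: <EXISTING_BASE64_KEY>
    - identity: {}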
  6. Perform step 3 above again to ensure each control plane host now encrypts using the new key2.
  7. As a privileged user, run the below command to re-encrypt all existing secrets with the new key2.
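For example:

  # Reads every secret and writes it back, re-encrypting it with key2.
  kubectl get secrets --all-namespaces -o json | kubectl replace -f -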
  8. After you have updated all existing secrets to use the new key and have made a secure backup of it, remove the old decryption key from the encryption-provider.yaml configuration file.
