How To Re-generate Certificates on LTS1 patch#13 if Hostagent Certificates are Expired
Problem
- The hostagent certificate and other component certificates have expired, so all nodes are marked as disconnected and upgrades cannot be performed.
- Performing a higher version upgrade does not regenerate the hostagent certificates, even when the certificates are not expired.
Environment
- Platform9 Edge Cloud - v5.3 and Higher
- Airctl
- Hostagent Certificate
Procedure
There is a product bug on patch 12 where the hostagent certificates are not regenerated during same version or higher version upgrades. Follow the steps in this KB if you are running a deployment on patch 12.
Starting with patch 13, these certificates are regenerated during both a same version upgrade and a higher version upgrade.
- Login/SSH to Deployment Unit Host (DU Host).
- Start MongoDB container and exec into it.
    # docker start airctl-mongo
    # docker exec -it airctl-mongo bash
- Now inside airctl-mongo, run the below commands:
    bash$ mongo
    > use pf9
    switched to pf9 DB
    > db.secrets.getIndexes()
If the db.secrets.getIndexes() output shows multiple indexes, skip the step below that creates a unique tag index. If the output shows only one index, create a unique tag index and verify it using the commands below.
    > db.secrets.createIndex({"tag": 1}, {unique:true})
    > db.secrets.getIndexes()
- Now open another terminal on the Deployment Unit Host, start the pf9deployExec container using the command below, and exec into it to export the listed environment variables:
    # docker run --name pf9deployExec -d -v /opt/pf9/airctl:/airctl -v /opt/pf9/airctl/ansible-stack:/ansible-stack --network host pf9-deploy:latest sleep 1000000
    # docker exec -it pf9deployExec bash
    bash$ export PF9_ANSIBLE_DIR=/ansible-stack
    bash$ export PF9DEPLOY_CONF_FILE=/airctl/conf/pf9deploy.ini
    bash$ export DU_FQDN=<your DU_FQDN>
    bash$ export SHORTNAME=<shortname>
    bash$ export PF9DEPLOY_CRYPT_BACKEND=null
    bash$ /app/deployutil.py check-certs --shortname $SHORTNAME --debug
- The above check-certs command shows the certificate versions and indicates whether they are expiring.
- Now generate certs and check using:
    bash$ /app/deployutil.py generate-certs --fqdn $DU_FQDN --debug
    bash$ /app/deployutil.py check-certs --shortname $SHORTNAME --debug
- Now dump the MongoDB data using airctl:
    # /opt/pf9/airctl/airctl advanced-du save-mongo --config /opt/pf9/airctl/conf/airctl-config.yaml
- Once the new certs are generated, stop the pf9deployExec and airctl-mongo containers and delete the pf9deployExec container using:
    # docker stop pf9deployExec
    # docker stop airctl-mongo
    # docker rm pf9deployExec
    # docker ps -a
- Now we can start the Same Version or Higher Version upgrade using the steps mentioned in the Upgrade documentation. Only follow the steps on that page up to the **Upgrade DU** section.
For such cases, skip the below steps of manually copying the certificate to individual hosts and proceed directly with the very final step which is Configure Host and HostAgent Upgrade.
When performing a SAME VERSION upgrade, make sure to use a pristine QCOW2 image when running the upgrade command.
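The index check in the MongoDB step above can also be scripted rather than typed interactively. The following is a minimal sketch, assuming the legacy `mongo` shell is on the PATH (that is, it runs inside the airctl-mongo container) and that it accepts the non-interactive `--quiet --eval` form. The function name `ensure_tag_index` is hypothetical and not part of airctl.

```shell
# Sketch: create the unique "tag" index on pf9.secrets only when a single
# index is present, mirroring the manual decision described in the procedure.
# Assumption: the legacy `mongo` shell is available (inside airctl-mongo).
ensure_tag_index() {
    local count

    # print() writes the number of indexes currently defined on db.secrets.
    count=$(mongo pf9 --quiet --eval 'print(db.secrets.getIndexes().length)') || return 1

    # Refuse to continue if the shell returned anything other than a number.
    case "$count" in
        ''|*[!0-9]*)
            echo "unexpected getIndexes output: $count" >&2
            return 1
            ;;
    esac

    if [ "$count" -gt 1 ]; then
        echo "skip: $count indexes already present"
    else
        mongo pf9 --quiet --eval 'db.secrets.createIndex({"tag": 1}, {unique: true})' >/dev/null || return 1
        echo "created unique tag index"
    fi
}
```

Running `ensure_tag_index` followed by the manual `db.secrets.getIndexes()` check still lets you confirm the result by eye.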
Copying new hostagent certificates to individual hosts:
Once these certificates are generated, they can be checked by the following steps:
- SSH into the DU Host.
- Then, SSH into the DU VM. The IP address of the DU VM is the dhcpEndIp in /opt/pf9/airctl/conf/airctl-config.yaml. The default value is 192.168.120.254.
- Go to the /etc/pf9/certs directory and check for v* certs, which are created as part of the upgrade or manually using the workaround above. If there is no such directory here, then the new certificates weren't generated successfully.
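The check for v* directories can be sketched as a small helper that prints the newest one. This assumes the directories are named with a numeric suffix (v1, v2, v10, and so on), which the source does not state explicitly, and that GNU `sort` with `-V` (version sort) is available. The function name `latest_cert_dir` is hypothetical.

```shell
# Sketch: print the newest v* certificate directory under a base directory.
# Assumption: directories are named v<number>; sort -V orders v2 before v10,
# whereas plain byte order would place v10 before v2.
latest_cert_dir() {
    local base="${1:-/etc/pf9/certs}"

    # ls prints nothing (and its error is discarded) when no v* entry exists.
    ls -d "$base"/v* 2>/dev/null | sort -V | tail -n 1
}
```

On the DU VM, `latest_cert_dir` with no argument inspects /etc/pf9/certs; empty output means no versioned certificate directory was found.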
After verifying that these certificates are present in the DU VM, return to the DU host and run the script below. It copies the new certificates from the DU VM to each individual host.
    airctlConfig="/opt/pf9/airctl/conf/airctl-config.yaml"
    duUser=$(awk '/^sshUser:/ {print $2}' $airctlConfig)
    dhcpEndpoint=$(grep "^# dhcpEndIp:" $airctlConfig | cut -d " " -f 3- | sed 's/# //')
    ipList="/home/$duUser/ipList.txt"
    certPrefix="/home/$duUser/certPrefix.txt"
    tmpSrcDirDU="/tmp/certs"

    # Fetching nodeHostnames and storing them in a text file
    HostNameLine=$(grep -n "^nodeHostnames:" $airctlConfig | cut -d: -f1)
    sed -n "${HostNameLine},$ p" $airctlConfig | awk '/- /{print $2}' | while read -r hostname; do
        echo "$hostname" >> $ipList
    done
    sed -i '$ d' $ipList

    ssh-keyscan $dhcpEndpoint >> ~/.ssh/known_hosts

    # ssh into the DU VM
    sshStatus=$(ssh $duUser@$dhcpEndpoint echo ok 2>&1)
    if [ "$sshStatus" != ok ]; then
        echo "Unable to ssh into the DU VM. Make sure it is present in the ~/.ssh/known_hosts"
        exit 1
    fi

    # Fetching latest certificate directory from the DU VM
    # (sort -V orders v2 before v10; sort -n does not compare the "v" prefix numerically)
    latestCertsDir=$(ssh $duUser@$dhcpEndpoint "ls -d /etc/pf9/certs/v* 2>/dev/null | sort -V | tail -1")

    # Check if the directory exists
    if [ -n "$latestCertsDir" ]; then
        echo "Latest certs directory: $latestCertsDir"
    else
        echo "No cert directories found. Generate new certs before running this script"
        exit 1
    fi
    newCertPrefix=$(basename $latestCertsDir)

    # Store temporary certificates on CDU for later transfer to hosts
    mkdir -p $tmpSrcDirDU
    scp -r $duUser@$dhcpEndpoint:$latestCertsDir $tmpSrcDirDU

    # Iterating over one host at a time, updating the certs in the host and storing the older certs in /tmp/ directory in the host.
    while read -r hostname; do
        echo "Copying certs to host: $hostname"
        echo "Adding hostname to ~/.ssh/known_hosts"
        ssh-keyscan $hostname >> ~/.ssh/known_hosts

        # Creating temporary directory to store new certificates from the CDU and backing up the old certificates on the host to /tmp/temp-date-XXXXXX directory
        ssh -n $duUser@$hostname "if [ -d /home/$duUser/tempcerts ]; then sudo rm -rf /home/$duUser/tempcerts; fi"
        ssh -n $duUser@$hostname "mkdir -p /home/$duUser/tempcerts/ca /home/$duUser/tempcerts/hostagent"
        ssh -n $duUser@$hostname 'TMPDIR=$(mktemp -d "/tmp/temp-$(date +%Y-%m-%d_%H-%M-%S)-XXXXXX"); sudo cp -r /etc/pf9/certs/ $TMPDIR; sudo cp /etc/pf9/hostagent.conf $TMPDIR; sudo chmod 0440 $TMPDIR/certs/hostagent/cert.pem $TMPDIR/certs/hostagent/key.pem; sudo chown pf9:pf9group $TMPDIR/certs/hostagent/key.pem $TMPDIR/certs/hostagent/cert.pem $TMPDIR/certs/ca/cert.pem $TMPDIR/hostagent.conf'
        echo "old certificates and hostagent.conf have been backed up to /tmp directory"

        scp $tmpSrcDirDU/$newCertPrefix/hostagent/cert.pem $duUser@$hostname:/home/$duUser/tempcerts/hostagent/cert.pem
        scp $tmpSrcDirDU/$newCertPrefix/hostagent/key.pem $duUser@$hostname:/home/$duUser/tempcerts/hostagent/key.pem
        scp $tmpSrcDirDU/$newCertPrefix/ca/cert.pem $duUser@$hostname:/home/$duUser/tempcerts/ca/cert.pem

        # Copy new certs to the /etc/pf9/certs directory from the temporary directory
        echo "Copying new certs to /etc/pf9/certs"
        ssh -n $duUser@$hostname "sudo chown pf9:pf9group /home/$duUser/tempcerts/hostagent/cert.pem /home/$duUser/tempcerts/hostagent/key.pem /home/$duUser/tempcerts/ca/cert.pem && sudo chmod 0440 /home/$duUser/tempcerts/hostagent/cert.pem /home/$duUser/tempcerts/hostagent/key.pem"
        ssh -n $duUser@$hostname "sudo cp /home/$duUser/tempcerts/ca/cert.pem /etc/pf9/certs/ca/cert.pem && sudo cp /home/$duUser/tempcerts/hostagent/key.pem /etc/pf9/certs/hostagent/key.pem && sudo cp /home/$duUser/tempcerts/hostagent/cert.pem /etc/pf9/certs/hostagent/cert.pem"
        echo "Successfully copied new certs to /etc/pf9/certs"

        # Update the cert_version in /etc/pf9/hostagent.conf to reflect the latest version
        echo "Updating cert version in /etc/pf9/hostagent.conf"
        replaceVersionCmd="sudo sed -i '/^\[ssl\]$/,/^\[/ s/cert_version=.*/cert_version=${newCertPrefix}/' /etc/pf9/hostagent.conf"
        ssh -n $duUser@$hostname "$replaceVersionCmd"
        echo "Successfully updated cert version in /etc/pf9/hostagent.conf"

        # Killing the sidekick process and restarting pf9-comms, pf9-hostagent
        echo "Restarting pf9-comms and pf9-hostagent"
        ssh -n $duUser@$hostname "sidekickPid=\$(ps aux | grep \"sidekick\" | grep -v grep | awk '{print \$2}'); sudo kill \$sidekickPid"
        ssh -n $duUser@$hostname 'sudo systemctl restart pf9-comms'
        ssh -n $duUser@$hostname 'sudo systemctl restart pf9-hostagent'

        echo "Removing temporary certs directory"
        ssh -n $duUser@$hostname "sudo rm -rf /home/$duUser/tempcerts"
    done < $ipList

    # Removing temporary certs directory and ip list from the CDU
    rm -rf $ipList $tmpSrcDirDU
- Once this script runs successfully, SSH into the individual hosts as the centos user and then check the certificate validity in the
/etc/pf9/certs directory. This certificate validity should match the validity of the new certificate in /etc/pf9/certs/v*/ that was generated previously on the DU VM. Additionally, /etc/pf9/hostagent.conf should now reflect the latest cert_version.
- The pf9-comms and pf9-hostagent services should also be up and running. This can be verified by running this command:
    systemctl status pf9-comms pf9-hostagent
- The older host certificates (stored in /etc/pf9/certs/ before replacement) and hostagent.conf will be stored in the /tmp/temp-date-XXXXXX directory for backup.
- Final Step: Perform the host upgrade using the steps mentioned in Configure Host and HostAgent Upgrade.
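The post-copy checks on each host (certificate validity and cert_version) can be sketched with two small helpers. This is an illustrative sketch, not part of the product: the function names are hypothetical, the `[ssl]` section and `cert_version` key are taken from the sed expression in the copy script, and `openssl` is assumed to be installed on the host.

```shell
# Sketch: read cert_version from the [ssl] section of a hostagent.conf-style file.
cert_version_of() {
    awk -F= '
        # Track whether the current section header is exactly [ssl].
        /^\[/ { in_ssl = ($0 == "[ssl]") }
        # Inside [ssl], print the value of the cert_version key and stop.
        in_ssl && $1 ~ /^[ \t]*cert_version[ \t]*$/ {
            gsub(/[ \t]/, "", $2)
            print $2
            exit
        }
    ' "$1"
}

# Sketch: print the expiry (notAfter) date of a PEM certificate.
cert_not_after() {
    openssl x509 -noout -enddate -in "$1" | cut -d= -f2
}
```

For example, `cert_version_of /etc/pf9/hostagent.conf` should print the same v* name as the newest certificate directory on the DU VM, and `cert_not_after /etc/pf9/certs/hostagent/cert.pem` should print the same date on the host as for the matching certificate on the DU VM. Reading these files may require sudo.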