How To Re-generate Certificates on LTS1 patch#13 if Hostagent Certificates are Expired
Problem
- The hostagent certificate and other component certificates have expired, so all nodes are marked as disconnected and upgrades cannot be performed.
- Performing a higher version upgrade does not regenerate the hostagent certificates, even when the certificates have not expired.
Environment
- Platform9 Edge Cloud - v5.3 and Higher
- Airctl
- Hostagent Certificate
Procedure
There is a product bug on patch 12 where the hostagent certificates are not regenerated during same or higher version upgrades. Follow the steps in this KB if you are running a deployment on patch 12.
Starting with patch 13, these certificates are regenerated during both a same version upgrade and a higher version upgrade.
- Login/SSH to Deployment Unit Host (DU Host).
- Start MongoDB container and exec into it.
# docker start airctl-mongo
# docker exec -it airctl-mongo bash
- Inside the airctl-mongo container, run the commands below:
bash$ mongo
> use pf9
switched to pf9 DB
> db.secrets.getIndexes()
If the db.secrets.getIndexes() output lists multiple indexes, skip the step below that creates a unique tag index. If the output lists only one index, create a unique tag index and verify it using the commands below.
> db.secrets.createIndex({"tag": 1}, {unique:true})
> db.secrets.getIndexes()
- Open another terminal on the Deployment Unit Host, start the pf9deployExec container using the command below, and exec into it to export the listed environment variables:
# docker run --name pf9deployExec -d -v /opt/pf9/airctl:/airctl -v /opt/pf9/airctl/ansible-stack:/ansible-stack --network host pf9-deploy:latest sleep 1000000
# docker exec -it pf9deployExec bash
bash$ export PF9_ANSIBLE_DIR=/ansible-stack
bash$ export PF9DEPLOY_CONF_FILE=/airctl/conf/pf9deploy.ini
bash$ export DU_FQDN=<your DU_FQDN>
bash$ export SHORTNAME=<shortname>
bash$ export PF9DEPLOY_CRYPT_BACKEND=null
bash$ /app/deployutil.py check-certs --shortname $SHORTNAME --debug
- The check-certs command above shows the certificate version and indicates whether it is expiring.
- Now generate certs and check using:
bash$ /app/deployutil.py generate-certs --fqdn $DU_FQDN --debug
bash$ /app/deployutil.py check-certs --shortname $SHORTNAME --debug
- Now dump the MongoDB data using airctl:
# /opt/pf9/airctl/airctl advanced-du save-mongo --config /opt/pf9/airctl/conf/airctl-config.yaml
- Once the new certs are generated, stop the pf9deployExec and airctl-mongo containers and delete the pf9deployExec container using:
# docker stop pf9deployExec
# docker stop airctl-mongo
# docker rm pf9deployExec
# docker ps -a
- Now start the Same Version or Higher Version upgrade using the steps in the Upgrade documentation. Follow the steps on that page only up to the **Upgrade DU** section.
For such cases, skip the below steps of manually copying the certificate to individual hosts and proceed directly with the very final step which is Configure Host and HostAgent Upgrade.
When performing a SAME VERSION upgrade, be sure to use a pristine QCOW2 image when running the upgrade command.
Copying new hostagent certificates to individual hosts:
Once these certificates are generated, they can be checked by the following steps:
- SSH into the DU Host.
- Then, SSH into the DU VM. The IP address of the DU VM is the dhcpEndIp in /opt/pf9/airctl/conf/airctl-config.yaml. The default value is 192.168.120.254.
- Go to the /etc/pf9/certs directory and check for v* certs, which are created as part of the upgrade or manually using the workaround above. If there is no such directory here, then the new certificates weren't generated successfully.
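When several v* directories exist, the newest one is the one that matters. The following is a minimal sketch of how that could be picked out, assuming the directories are named v followed by an integer (the article only says v*, so the numeric suffix is an assumption). The function name latest_cert_version is hypothetical and is not part of airctl or deployutil.py.

```shell
#!/bin/sh
# Hypothetical helper: print the highest-numbered v<N> directory name
# under the given certs root. Assumes directory names look like v1, v2, v10.
latest_cert_version() {
  certs_root=$1
  latest=$(
    for dir in "$certs_root"/v*; do
      # Skip the unexpanded glob and anything that is not a directory.
      [ -d "$dir" ] || continue
      name=${dir##*/}
      printf '%s\n' "${name#v}"
    done | grep -E '^[0-9]+$' | sort -n | tail -n 1
  )
  # Return non-zero when no versioned directory was found.
  [ -n "$latest" ] || return 1
  printf 'v%s\n' "$latest"
}
```

On the DU VM this could be called as latest_cert_version /etc/pf9/certs. Sorting on the numeric part avoids v10 being ordered before v2, which is what a plain text sort would do.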
After verifying that these certificates are present in the DU VM, return to the DU host and run the script below. It copies the new certificates from the DU VM to each individual host.
airctlConfig="/opt/pf9/airctl/conf/airctl-config.yaml"
duUser=$(awk '/^sshUser:/ {print $2}' $airctlConfig)
dhcpEndpoint=$(grep "^# dhcpEndIp:" $airctlConfig | cut -d " " -f 3- | sed 's/# //')
ipList="/home/$duUser/ipList.txt"
certPrefix="/home/$duUser/certPrefix.txt"
tmpSrcDirDU="/tmp/certs"
# Fetching nodeHostnames and storing them in a text file
HostNameLine=$(grep -n "^nodeHostnames:" $airctlConfig | cut -d: -f1)
sed -n "${HostNameLine},$ p" $airctlConfig | awk '/- /{print $2}' | while read -r hostname; do
echo "$hostname" >> $ipList
done
sed -i '$ d' $ipList
ssh-keyscan $dhcpEndpoint >> ~/.ssh/known_hosts
# ssh into the DU VM
sshStatus=$(ssh $duUser@$dhcpEndpoint echo ok 2>&1)
if [ "$sshStatus" != ok ]; then
echo "Unable to ssh into the DU VM. Make sure it is present in the ~/.ssh/known_hosts"
exit 1
fi
# Fetching latest certificate directory from the DU VM
latestCertsDir=$(ssh $duUser@$dhcpEndpoint "ls -d /etc/pf9/certs/v* 2>/dev/null | sort -V | tail -1")
newCertPrefix=$(basename "$latestCertsDir")
# Check if the directory exists
if [ -n "$latestCertsDir" ]; then
echo "Latest certs directory: $latestCertsDir"
else
echo "No cert directories found. Generate new certs before running this script"
exit 1
fi
# Store temporary certificates on CDU for later transfer to hosts
mkdir -p $tmpSrcDirDU
scp -r $duUser@$dhcpEndpoint:$latestCertsDir $tmpSrcDirDU
# Iterating over one host at a time, updating the certs in the host and storing the older certs in /tmp/ directory in the host.
while read -r hostname; do
echo "Copying certs to host: $hostname"
echo "Adding hostname to ~/.ssh/known_hosts"
ssh-keyscan $hostname >> ~/.ssh/known_hosts
# Creating temporary directory to store new certificates from the CDU and backing up the old certificates on the host to /tmp/temp-date-XXXXXX directory
ssh -n $duUser@$hostname "if [ -d /home/$duUser/tempcerts ]; then sudo rm -rf /home/$duUser/tempcerts; fi"
ssh -n $duUser@$hostname "mkdir -p /home/$duUser/tempcerts/ca /home/$duUser/tempcerts/hostagent"
ssh -n $duUser@$hostname 'TMPDIR=$(mktemp -d "/tmp/temp-$(date +%Y-%m-%d_%H-%M-%S)-XXXXXX"); sudo cp -r /etc/pf9/certs/ $TMPDIR; sudo cp /etc/pf9/hostagent.conf $TMPDIR; sudo chmod 0440 $TMPDIR/certs/hostagent/cert.pem $TMPDIR/certs/hostagent/key.pem; sudo chown pf9:pf9group $TMPDIR/certs/hostagent/key.pem $TMPDIR/certs/hostagent/cert.pem $TMPDIR/certs/ca/cert.pem $TMPDIR/hostagent.conf'
echo "Old certificates and hostagent.conf have been backed up to the /tmp directory"
scp $tmpSrcDirDU/$newCertPrefix/hostagent/cert.pem $duUser@$hostname:/home/$duUser/tempcerts/hostagent/cert.pem
scp $tmpSrcDirDU/$newCertPrefix/hostagent/key.pem $duUser@$hostname:/home/$duUser/tempcerts/hostagent/key.pem
scp $tmpSrcDirDU/$newCertPrefix/ca/cert.pem $duUser@$hostname:/home/$duUser/tempcerts/ca/cert.pem
# Copy new certs to the /etc/pf9/certs directory from the temporary directory
echo "Copying new certs to /etc/pf9/certs"
ssh -n $duUser@$hostname "sudo chown pf9:pf9group /home/$duUser/tempcerts/hostagent/cert.pem /home/$duUser/tempcerts/hostagent/key.pem /home/$duUser/tempcerts/ca/cert.pem && sudo chmod 0440 /home/$duUser/tempcerts/hostagent/cert.pem /home/$duUser/tempcerts/hostagent/key.pem"
ssh -n $duUser@$hostname "sudo cp /home/$duUser/tempcerts/ca/cert.pem /etc/pf9/certs/ca/cert.pem && sudo cp /home/$duUser/tempcerts/hostagent/key.pem /etc/pf9/certs/hostagent/key.pem && sudo cp /home/$duUser/tempcerts/hostagent/cert.pem /etc/pf9/certs/hostagent/cert.pem"
echo "Successfully copied new certs to /etc/pf9/certs"
# Update the cert_version in /etc/pf9/hostagent.conf to reflect the latest version
echo "Updating cert version in /etc/pf9/hostagent.conf"
replaceVersionCmd="sudo sed -i '/^\[ssl\]$/,/^\[/ s/cert_version=.*/cert_version=${newCertPrefix}/' /etc/pf9/hostagent.conf"
ssh -n $duUser@$hostname ''"$replaceVersionCmd"''
echo "Successfully updated cert version in /etc/pf9/hostagent.conf"
# Killing the sidekick process and restarting pf9-comms, pf9-hostagent
echo "Restarting pf9-comms and pf9-hostagent"
ssh -n $duUser@$hostname "sidekickPid=\$(ps aux | grep \"sidekick\" | grep -v grep | awk '{print \$2}'); sudo kill \$sidekickPid"
ssh -n $duUser@$hostname 'sudo systemctl restart pf9-comms'
ssh -n $duUser@$hostname 'sudo systemctl restart pf9-hostagent'
echo "Removing temporary certs directory"
ssh -n $duUser@$hostname "sudo rm -rf /home/$duUser/tempcerts"
done < $ipList
# Removing temporary certs directory and ip list from the CDU
rm -rf $ipList $tmpSrcDirDU
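The sed expression in the script rewrites cert_version only between the [ssl] header and the next section header. The sketch below shows one way to preview that change without modifying the file, assuming hostagent.conf uses INI-style sections and a cert_version=<value> line with no spaces around the equals sign, as the script's own expression implies. The function name preview_cert_version is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the config with cert_version replaced inside
# the [ssl] section only. The file itself is left untouched (no sed -i).
preview_cert_version() {
  conf_file=$1
  new_version=$2
  # The address range starts at the [ssl] header and ends at the next line
  # that begins with "[", so keys in other sections are not rewritten.
  sed '/^\[ssl\]$/,/^\[/ s/cert_version=.*/cert_version='"$new_version"'/' "$conf_file"
}
```

For example, preview_cert_version /etc/pf9/hostagent.conf v3 | diff /etc/pf9/hostagent.conf - would show only the line that the script is about to change (v3 here is a placeholder value).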
- Once this script runs successfully, SSH into the individual hosts as the centos user and check the certificate validity in the /etc/pf9/certs directory. It should match the validity of the new certificate in /etc/pf9/certs/v*/ that was generated previously on the DU VM. Additionally, /etc/pf9/hostagent.conf should now reflect the latest cert_version.
- The pf9-comms and pf9-hostagent services should also be up and running. This can be verified by running this command:
systemctl status pf9-comms pf9-hostagent
- The older host certificates (stored in /etc/pf9/certs/ before replacement) and hostagent.conf will be stored in the /tmp/temp-date-XXXXXX directory for backup.
- Final Step: Perform the host upgrade using the steps in Configure Host and HostAgent Upgrade.