Management Plane Upgrade to Same/Higher Version Fails with Resmgr Service Not Coming Up
Problem
- While following the Upgrade Process Documentation, the upgrade fails with the error below:
TASK [pf9-configure : Wait for resmgr service] *********************************
2025-02-03T23:32:44.466-0800 info INFO:pf9deploy.server.util.shell:| Tuesday 04 February 2025 07:02:42 +0000 (0:00:01.849) 0:12:17.577 ******
2025-02-03T23:32:44.466-0800 info INFO:pf9deploy.server.util.shell:| fatal: [airctl-1.pf9.localnet]: FAILED! => {"changed": false, "elapsed": 1800, "msg": "Timeout when waiting for 127.0.0.1:8083"}
Environment
- Platform9 Edge Cloud - v5.3.0 and Higher
Diagnostic Steps
- Check whether the logs of the pf9-resmgr service inside the DU VM contain the error below (use it as a string-search reference):
Feb 03 09:03:13 airctl-1.pf9.localnet pf9-resmgr[3472]: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 4: invalid continuation byte
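The string search above can be sketched as a small shell helper. This is a minimal sketch, not a Platform9 tool: the function name has_resmgr_decode_error is hypothetical, and the usage line assumes the service logs to the systemd journal under a unit named pf9-resmgr (inferred from the log line format, not confirmed by this article).

```shell
# Sketch: report whether log text contains the UnicodeDecodeError signature.
# Reads from stdin, so it works with journalctl output or a saved log file.
# Returns 0 when the signature is found, non-zero otherwise.
has_resmgr_decode_error() {
    grep -q "UnicodeDecodeError: 'utf-8' codec can't decode byte"
}

# Example usage inside the DU VM (the unit name pf9-resmgr is an assumption):
#   journalctl -u pf9-resmgr --no-pager | has_resmgr_decode_error && echo "signature found"
```

The helper matches only the fixed part of the message, because the byte value and position may differ between environments.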
Resolution
- Roll back the environment to the latest available backup, which is taken when the upgrade starts.
- Identify the bbslave password in the following file on the DU Host:
.airctl/mongo/secrets.json
- On the DU Host, run the commands below in sequence:
# docker start airctl-mongo
# docker exec -it airctl-mongo bash
bash$ mongo
> use pf9
> db.secrets.find().pretty()
- The above command prints all the secrets stored in MongoDB; identify the secret with the tag below:
airctl-1-pf9-localnet-rabbit_bbslave_password
- Confirm that the password stored in this MongoDB record matches the one in the file
.airctl/mongo/secrets.json
- If the two passwords differ, update the MongoDB record with the command below:
db.secrets.updateOne({"tag":"airctl-1-pf9-localnet-rabbit_bbslave_password"}, {$set: {"record": {"HYAT!": "<password present in secrets.json file>", "binary": false}}})
- Exit the MongoDB shell.
- On the DU Host, run the command below:
# /opt/pf9/airctl/airctl advanced-du save-mongo --config /opt/pf9/airctl/conf/airctl-config.yaml
- Attempt the Upgrade again.
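The lookup of the bbslave record in the steps above can also be done without an interactive shell. The following is a minimal sketch under stated assumptions: the function names build_secret_query and show_bbslave_secret are hypothetical, the container is expected to ship the legacy mongo shell (as used in the steps above), and the tag is the one shown in this article.

```shell
# Sketch: look up the bbslave secret without opening an interactive mongo shell.
# TAG is the tag named in this article; it may differ in other environments.
TAG="airctl-1-pf9-localnet-rabbit_bbslave_password"

# Build the JavaScript that is passed to the legacy mongo shell via --eval.
# getSiblingDB("pf9") is the non-interactive equivalent of "use pf9".
build_secret_query() {
    printf 'db.getSiblingDB("pf9").secrets.find({"tag": "%s"}).forEach(printjson)' "$1"
}

# Run the query inside the airctl-mongo container (requires docker on the DU Host
# and the container to be running, for example after "docker start airctl-mongo").
show_bbslave_secret() {
    docker exec airctl-mongo mongo --quiet --eval "$(build_secret_query "$TAG")"
}

# Example usage on the DU Host:
#   show_bbslave_secret
#   grep -n bbslave .airctl/mongo/secrets.json
```

The printed record can then be compared by eye with the matching entry in secrets.json. The sketch is read-only; the update itself is still done with the updateOne command given in the steps.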
Additional Information
- Reach out to Platform9 Support if the failures continue.