Configure Host Command Fails with 502 Error Code
Problem
After executing the configure host command [Reference KB] as part of the LTS3 installation, the host-status shows the host in a false status with the following error:
Error preparing node Error: Unable to install hostagent. Invalid status code when identifiying hostagent type: 502
Environment
- Self-Managed Cloud Platform9 (SMCP) v5.9.0 and v5.9.1
Cause
While troubleshooting, it was identified that the nginx service inside the nginx pod does not start because it tries to resolve s3-us-west-1.amazonaws.com, which fails in an air-gapped environment.
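As a quick way to confirm this from the affected node, DNS resolution of the S3 endpoint can be probed directly; this is a diagnostic sketch, not from the original article, and the helper name can_resolve is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if the given hostname resolves, non-zero otherwise.
can_resolve() {
  getent hosts "$1" > /dev/null
}

# In an air-gapped environment this lookup is expected to fail,
# which is what blocks the nginx service from starting.
if can_resolve "s3-us-west-1.amazonaws.com"; then
  echo "s3-us-west-1.amazonaws.com resolves"
else
  echo "s3-us-west-1.amazonaws.com does NOT resolve (expected when air-gapped)"
fi
```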
Answer
This issue is fixed in SMCP v5.9.2 and later releases.
Workaround
1. If the on-prem setup does not have a private DNS, empty the /etc/resolv.conf file and add nameserver {NODE_IP} to it.
2. Otherwise, if there is a custom DNS setup, add the entries of the custom nameservers to the /etc/resolv.conf file.
3. Run /opt/pf9/airctl/airctl advanced-ddu create-mgmt --config airctl config as mentioned in the published documentation to get the management cluster up and running.
4. After the mgmt-cluster is up and running, append the s3 URL entry in nodelet-bootstrap-config.yaml, i.e., add the entry 34.35.69.42 s3-us-west-1.amazonaws.com to the file nodelet-mgmt-cluster.yaml. The dns field should look like the below snippet.
dns:
  corednsHosts:
    - 34.35.69.42 s3-us-west-1.amazonaws.com
The DU FQDN entry should not be present until airctl start is run, so adding just the s3 entry should suffice. If, however, the entry is present from previous runs, it is safe to leave it there.
5. Run /opt/pf9/airctl/airctl start --config airctl config for the DU to start.
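The resolver changes in steps 1-2 can be sketched as a small script. RESOLV_CONF and NODE_IP are parameters so the sketch can be exercised against a scratch file instead of the live /etc/resolv.conf, and the function name point_resolver_at_node is an assumption, not part of airctl:

```shell
#!/usr/bin/env bash
# Sketch of workaround steps 1-2: point the resolver at the node itself
# when no private DNS exists. RESOLV_CONF defaults to the real file but
# can be overridden for a dry run.
RESOLV_CONF="${RESOLV_CONF:-/etc/resolv.conf}"

point_resolver_at_node() {
  local node_ip="$1"
  # Step 1: empty the file, then add the node as the only nameserver.
  : > "$RESOLV_CONF"
  echo "nameserver ${node_ip}" >> "$RESOLV_CONF"
}

# Usage (step 1, no private DNS):
#   point_resolver_at_node "{NODE_IP}"
# For step 2 (custom DNS), append the custom nameserver entries instead:
#   echo "nameserver <custom-ns-ip>" >> "$RESOLV_CONF"
```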
Additional Notes
To check through automation that the management cluster is up and running after running the below command:
/opt/pf9/airctl/airctl advanced-ddu create-mgmt --config airctl config
Wait for the pods to be in a running state using the following command:
sudo kubectl --kubeconfig /etc/nodelet/<clustername>/certs/admin.kubeconfig wait --for=condition=ready pod -l <label> -n kube-system
The list of labels for the pods:
'k8s-app=calico-kube-controllers'
'k8s-app=calico-node'
'k8s-app=calico-typha' # skip this check if node count is less than 3 as it does not work on 2 node cluster due to replica count
'k8s-app=kube-dns'
'k8s-app=kube-dns-autoscaler'
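The per-label checks above can be wrapped in a single loop. This is a sketch: the function name wait_for_system_pods and the 300s timeout are assumptions, while the label list, namespace, and kubeconfig path come from the article:

```shell
#!/usr/bin/env bash
# Sketch: wait for each kube-system pod label from the list above to be Ready.
# Skips calico-typha when the cluster has fewer than 3 nodes, per the note above.
wait_for_system_pods() {
  local kubeconfig="$1" node_count="$2"
  local labels=(
    'k8s-app=calico-kube-controllers'
    'k8s-app=calico-node'
    'k8s-app=calico-typha'
    'k8s-app=kube-dns'
    'k8s-app=kube-dns-autoscaler'
  )
  local label
  for label in "${labels[@]}"; do
    if [[ "$label" == "k8s-app=calico-typha" && "$node_count" -lt 3 ]]; then
      echo "skipping $label (node count < 3)"
      continue
    fi
    kubectl --kubeconfig "$kubeconfig" wait --for=condition=ready pod \
      -l "$label" -n kube-system --timeout=300s || return 1
  done
}

# Usage:
#   wait_for_system_pods /etc/nodelet/<clustername>/certs/admin.kubeconfig 3
```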