Hostagent Stuck While Executing an Extension

Problem

  • A Platform9 managed host is shown as offline in the Clarity UI.
  • The Hostagent service (pf9-hostagent) has a forked process that is running a script from /opt/pf9/hostagent/extensions.
  • The forked process is in the 'D' (uninterruptible sleep) state.
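
To check for this symptom on the host, the process table can be filtered for entries in the D state whose command line references the extensions directory. The following is a minimal Python sketch, assuming a Linux host where the procps `ps` command is available. The function names `find_stuck_extensions` and `list_stuck_extensions` are hypothetical and are not part of Hostagent.

```python
import subprocess

EXTENSIONS_DIR = "/opt/pf9/hostagent/extensions"  # path taken from the article


def find_stuck_extensions(ps_output, needle=EXTENSIONS_DIR):
    """Return (pid, ppid, stat, command) tuples for D-state processes.

    ps_output is the text printed by `ps -eo pid,ppid,stat,args`.
    """
    stuck = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue
        pid, ppid, stat, args = fields
        # STAT begins with the state letter; "D" is uninterruptible sleep.
        if stat.startswith("D") and needle in args:
            stuck.append((int(pid), int(ppid), stat, args))
    return stuck


def list_stuck_extensions():
    """Run ps on the local host and filter its output (assumes procps ps)."""
    output = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_stuck_extensions(output)
```

Calling `list_stuck_extensions()` on the affected host returns one tuple per matching process. The parent PID in each tuple can be compared with the PID of the pf9-hostagent service to confirm that the process was forked by Hostagent.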

Environment

  • Platform9 Managed OpenStack - All Versions
  • Platform9 Managed Kubernetes - All Versions

Cause

Various factors can cause one of the Hostagent extension scripts to fail to complete, leaving its process stuck in this state.

Resolution

  1. Inspect the script in question and identify whether any of its commands can be run manually.

Example: /opt/pf9/hostagent/extensions/fetch_mounted_nfs.py

The corresponding command would be as follows.

  2. If the command executes successfully, proceed to Step #3. Otherwise, you will need to identify what is causing the command to fail.
  3. Stop the pf9-hostagent service.
  4. Kill any remaining forked processes (if present).

You may need to add the -9 or -SIGKILL flag if any of the processes remain.

  5. Start the pf9-hostagent service.
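
The resolution steps above can be sketched in Python as follows. This is a minimal sketch under assumptions that the article does not state: the host uses systemd, so the service is controlled with `systemctl`, and the PIDs of any leftover forked processes have already been identified. The helper names `command_completes`, `recovery_commands`, and `print_plan` are hypothetical. The sketch only builds and prints the stop, kill, and start commands; it does not run them.

```python
import subprocess

SERVICE = "pf9-hostagent"  # service name from the article


def command_completes(argv, timeout=10.0):
    """Run argv and report whether it finishes within timeout seconds.

    Returns the exit code, or None if the command is still running when
    the timeout expires (a hint that the same command may hang when the
    extension script runs it).
    """
    proc = subprocess.Popen(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Best effort only: a process blocked in the kernel may not exit.
        proc.kill()
        return None


def recovery_commands(pids, force=False):
    """Build the stop / kill / start commands as argv lists.

    Assumes a systemd host. force=True adds -9 (SIGKILL) to kill.
    """
    commands = [["systemctl", "stop", SERVICE]]
    if pids:
        kill = ["kill", "-9"] if force else ["kill"]
        commands.append(kill + [str(pid) for pid in pids])
    commands.append(["systemctl", "start", SERVICE])
    return commands


def print_plan(pids, force=False):
    """Print the commands for review instead of executing them."""
    for argv in recovery_commands(pids, force):
        print(" ".join(argv))
```

For example, `print_plan([1350], force=True)`, where 1350 is a made-up PID, prints `systemctl stop pf9-hostagent`, `kill -9 1350`, and `systemctl start pf9-hostagent` on separate lines. Printing the plan first keeps the operator in control of when each command is run, typically as root.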