Platform9 Host Components
PF9 Host Side Services
This is a list of the services that run on the master/worker nodes.
pf9-hostagent
The pf9-hostagent service is required on every host managed by Platform9. It installs and configures the Platform9 applications on the host using roles: the hostagent receives a request from the management plane to apply a node role on the host, performs the update, and reports the success or failure of the role apply back to the management plane.
When the management plane pushes a role, the hostagent downloads the role's packages, installs them, and executes the configuration scripts using the role data that arrives as part of the payload. The hostagent also sends a periodic heartbeat to the management plane via the pf9-comms service; the heartbeat includes CPU, memory, storage, and network information.
Additionally, the pf9-hostagent monitors the services running for the applied role and verifies that their configuration matches the configuration stored in the management plane's host-role mapping database. For Platform9 Managed Kubernetes (PMK), the host role is named pf9-kube. The hostagent keeps the role configuration applied on the host consistent with the information stored in the management plane. For example:
- The keepalived configuration for the VIP: if the VIP network interface is modified in the configuration file on the host and keepalived is restarted, the hostagent resets the change back to the original network interface during its next check.
- It also monitors pf9-comms and restarts it if it is found dead.
The pf9-hostagent is also responsible for updating roles on the host. Role operations such as applying, modifying, and deleting a host role (e.g., pf9-kube) are directed from the management plane through the pf9-hostagent. During an upgrade from Platform9, the hostagent receives role updates and uses them to update the pf9 components on the host.
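On a managed host, the hostagent's state can be inspected directly. A minimal sketch; the systemd unit name matches the service described above, while the log path under /var/log/pf9/ is an assumption:

```bash
# Confirm the hostagent service is running.
systemctl status pf9-hostagent

# Follow its activity; the exact log path is an assumption, based on the
# convention that Platform9 host services log under /var/log/pf9/.
tail -f /var/log/pf9/hostagent.log
```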
pf9-comms
pf9-comms is the single point of communication with the management plane for the Platform9-managed host services. To let host-side services such as pf9-hostagent talk to the management plane, pf9-comms maintains a tunnel from the host that channels all host-side Platform9 traffic to the management plane.
The only network requirement is an outbound TCP 443 connection from the host to the management plane. pf9-comms acts as a multiplexer: it intercepts the requests arriving on the localhost ports of the host's pf9 services and sends them over TCP port 443 to the management plane.
On a PMK cluster, the pf9-hostagent and pf9-muster services connect to localhost ports that pf9-comms multiplexes. An ingress service in the management plane intercepts the requests from pf9-comms, demultiplexes them, and passes them on to the intended receiver for further processing. pf9-comms uses TLS with trusted certificates to connect to the management plane ingress service. The lsof output below, captured on a managed host, shows pf9-comms holding the outbound TCP 443 (https) connections to the management plane while local services such as pf9-hostd and pf9-muster connect to it over localhost:
```
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
pf9-comms 17951 pf9 37u IPv4 46812429 0t0 TCP mav-3-1:41768->airctl-1.pf9.localnet:https (ESTABLISHED)
pf9-comms 17951 pf9 41u IPv4 46812664 0t0 TCP mav-3-1:41784->airctl-1.pf9.localnet:https (ESTABLISHED)
pf9-comms 17951 pf9 43u IPv4 46812674 0t0 TCP mav-3-1:41788->airctl-1.pf9.localnet:https (ESTABLISHED)
pf9-hostd 17814 pf9 10u IPv6 47533401 0t0 TCP localhost:38486->localhost:amqp (ESTABLISHED)
pf9-muste 28887 pf9 3u IPv4 47534314 0t0 TCP localhost:49962->localhost:amqp (ESTABLISHED)
pf9-muste 28887 pf9 5u IPv4 47538143 0t0 TCP localhost:50316->localhost:amqp (ESTABLISHED)
pf9-comms 96362 pf9 14u IPv4 47532817 0t0 TCP localhost:amqp (LISTEN)
pf9-comms 96362 pf9 15u IPv6 47532818 0t0 TCP localhost:amqp (LISTEN)
pf9-comms 96362 pf9 34u IPv6 47533404 0t0 TCP localhost:amqp->localhost:38486 (ESTABLISHED)
pf9-comms 96362 pf9 36u IPv4 47533407 0t0 TCP localhost:amqp->localhost:49962 (ESTABLISHED)
pf9-comms 96362 pf9 38u IPv4 47538961 0t0 TCP localhost:amqp->localhost:50316 (ESTABLISHED)
```
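Because the only network requirement is outbound TCP 443, connectivity from the host to the management plane can be checked directly. A minimal sketch, using airctl-1.pf9.localnet from the output above as a stand-in for your management plane FQDN:

```bash
# Verify the host can open an outbound TCP 443 connection to the management plane.
nc -zv airctl-1.pf9.localnet 443

# Optionally confirm the TLS handshake and inspect the presented certificate.
openssl s_client -connect airctl-1.pf9.localnet:443 </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -issuer
```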
pf9-sidekick
pf9-sidekick runs in parallel with pf9-comms but operates independently, serving as a backup channel for some of the hostagent's operations. If the management plane loses communication with pf9-hostagent, pf9-sidekick allows diagnosis and recovery through its support for bundle uploads and remote command execution.
The service is typically used for debugging and provides a secondary channel for executing commands on managed hosts. It connects to a management plane component named sidekickserver.
pf9-muster
Muster is a monitoring and troubleshooting tool. It reports statistics such as memory and load usage on the host, and it exposes a limited API that allows the Platform9 Support team to send whitelisted commands for troubleshooting. Communication goes through pf9-comms.
pf9-nodelet
Nodelet comes into action after the cluster create operation, once the host has been discovered and authorized in the management plane. It reads the YAML files under the /etc/pf9/nodelet/ directory for the configuration options needed to set up the host as a Kubernetes master or worker node. Nodelet writes /etc/pf9/kube.env, which the component phase scripts consume, and it is responsible for starting, restarting, and stopping the Kubernetes components in a controlled manner.
It can take corrective actions such as a partial restart or partial rollback of the Kubernetes components if they fail during startup. For example, if docker is not running, nodelet only re-runs the chain of components up to the docker configuration phase.
Once the node has been added to a cluster successfully, nodelet continues to monitor the Kubernetes stack to make sure the components keep running properly. Each status check is invoked one minute after the previous run. Nodelet creates a cgroup named pf9-kube-status to limit the CPU used by these status checks.
Nodelet starts the pf9 components when the pf9-kube role is pushed to the host, working together with the hostagent. It configures the pf9 Kubernetes components in a phased manner and takes corrective action when failures occur along the way. It keeps track of the consecutive phases and retries only the phases where failures were observed; if a phase fails in nine consecutive attempts, nodelet performs a full cleanup and restarts all components from the beginning on the tenth attempt.
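The retry policy can be summarized with an illustrative sketch. This is pseudocode for the behavior described above, not actual nodelet source; converged, retry_failed_phases, and full_cleanup_and_restart_all_phases are hypothetical helpers:

```bash
# Illustrative pseudocode only: selective retry with a full reset every 10th attempt.
attempt=0
until converged; do                       # hypothetical: all phases healthy
  attempt=$((attempt + 1))
  if [ $((attempt % 10)) -eq 0 ]; then
    full_cleanup_and_restart_all_phases   # hypothetical: 10th attempt starts from scratch
  else
    retry_failed_phases                   # hypothetical: re-run only the phases that failed
  fi
done
```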
pf9-kubelet
The Kubernetes kubelet runs as a systemd service on Platform9-managed Kubernetes hosts. On the master nodes, this service starts the three control-plane containers, kube-apiserver, kube-scheduler, and kube-controller-manager, in a static pod. On the worker nodes, this service communicates with the Kubernetes API server on the masters to manage the Kubernetes pods on that specific node.
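On a master node, the resulting control-plane containers are visible to the container runtime. A minimal sketch; the name filter assumes upstream Kubernetes component naming:

```bash
# The kubelet itself runs as a systemd unit (name per the section above).
systemctl status pf9-kubelet

# List the control-plane containers started from the static pod on a master.
docker ps --format '{{.Names}}' | grep -E 'apiserver|scheduler|controller'
```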
keepalived
This service is responsible for keeping the master VIP highly available. keepalived is installed and started only when the VIP and VIP interface fields are specified among the cluster bootstrapping options.
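Which master currently holds the VIP can be confirmed by checking the configured interface. A minimal sketch, with eth0 standing in for the VIP interface chosen at cluster creation:

```bash
# keepalived runs as a regular systemd service on the masters.
systemctl status keepalived

# On the node holding the VIP, the address appears on the configured interface
# (eth0 here is a placeholder for the interface selected during bootstrapping).
ip addr show eth0 | grep inet
```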
Standalone Docker Containers
etcd
etcd is the key-value data store backing all Kubernetes cluster data. In Platform9, etcd runs as a docker container on the master nodes, in a one-, three-, or five-node configuration.
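The etcd container can be inspected on any master node. A minimal sketch; the member list invocation may need endpoint and certificate flags depending on how your etcd is configured:

```bash
# Locate the etcd container on a master node.
docker ps --filter name=etcd

# Query cluster membership from inside the container
# (<etcd-container> is a placeholder; extra TLS flags may be required).
docker exec <etcd-container> etcdctl member list
```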
kube-proxy
This is the Kubernetes network proxy. It runs as a docker container on every node and implements part of the Kubernetes Service concept: kube-proxy maintains network rules on the nodes that allow network sessions inside or outside the cluster to reach your Pods.
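The rules that kube-proxy maintains can be observed directly on a node. A minimal sketch, assuming the default iptables proxy mode:

```bash
# Find the kube-proxy container on the node.
docker ps --filter name=kube-proxy

# Count the per-Service chains kube-proxy has programmed (iptables mode assumed).
iptables-save | grep -c KUBE-SVC
```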
pf9-bouncer
This service runs as a docker container and receives authentication validation requests from the Kubernetes API server. It is the cluster-level authentication service configured by default on Platform9 clusters, using Keystone as the identity provider. If the authentication token in a Kubernetes API request is validated successfully, the request is passed on to the API server for processing.
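The validation flow can be exercised by calling the Kubernetes API with a Keystone-issued token. A minimal sketch; $TOKEN and the API endpoint are placeholders:

```bash
# The API server forwards the bearer token to bouncer for validation and only
# processes the request if the token is accepted.
curl -k -H "Authorization: Bearer $TOKEN" \
  https://<cluster-api-endpoint>/api/v1/namespaces
```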
Pf9 Namespaces and Pods
| Namespace | Example Pod Name | Purpose of Pod |
|---|---|---|
| pf9-olm | packageserver-844d4fb848-zmpxd | Internal OLM pod that processes OLM package installation |
| pf9-olm | packageserver-844d4fb848-cpr4b | Internal OLM pod that processes OLM package installation |
| pf9-olm | platform9-operators-df5bl | OLM repository pod; the OLM operator fetches packages from this repo to install operators |
| pf9-olm | olm-operator-fbd9c955c-j85zb | OLM operator: watches OLM subscription objects and installs operators in response |
| pf9-olm | catalog-operator-d59cf9dfb-5t7s8 | OLM catalog operator: works with the OLM operator pod, processing and validating OLM packages before installation |
| pf9-olm | prometheus-operator-54dd4d9b-kxv9r | Prometheus operator, installed through OLM by creating an OLM subscription object |
| pf9-operators | monhelper-5c8558c46-p5hw2 | Pf9 helper pod, created as part of the OLM package for the Prometheus operator; installs and configures the monitoring objects such as Prometheus, Prometheus rules, Alertmanager, and Grafana |
| pf9-monitoring | prometheus-system-0 | The Prometheus instance installed through the Prometheus operator |
| pf9-monitoring | grafana-986c774cf-p8w58 | Grafana, for UI visualization of Prometheus metrics; pre-configured to talk to the Prometheus instance above, with built-in dashboards to visualize all scraped metrics |
| pf9-monitoring | alertmanager-sysalert-0 | Alertmanager instance, configured with Prometheus to receive alerts; the user needs to configure alert targets to deliver alerts to the expected destination |
| pf9-monitoring | node-exporter-llgkj | Node Exporter daemonset pod; exports metrics for each kube node, which the Prometheus object scrapes |
| pf9-monitoring | node-exporter-nbw5k | Node Exporter daemonset pod; exports metrics for each kube node, which the Prometheus object scrapes |
| pf9-monitoring | node-exporter-rjtdg | Node Exporter daemonset pod; exports metrics for each kube node, which the Prometheus object scrapes |
| pf9-monitoring | kube-state-metrics-595cb5cc | kube-state-metrics exporter; exposes Kubernetes cluster-level metrics, which the Prometheus object scrapes |
| platform9-system | pf9-sentry | The Pf9 UI queries the HTTP server running inside this pod to get the list of CSI drivers |
| pf9-addons | pf9-addon-operator-7f9784f867-ktn6z | Addon operator; supports numerous addon types such as coredns, metrics-server, dashboard, and auto-scaler |
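The pods above can be listed per namespace with kubectl, for example:

```bash
# Inspect the Platform9 system pods, namespace by namespace (names per the table above).
kubectl get pods -n pf9-olm
kubectl get pods -n pf9-monitoring
kubectl get pods -n pf9-addons
```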
Etcd-backup
If this feature is enabled, the etcd-backup directory is configured while bootstrapping the cluster. The backup frequency defaults to every 24 hours but is configurable to suit the client's needs. The etcd backup files are saved under /etc/pf9/etcd-backup on the master node. These backups are useful for recovering the cluster state after an unforeseen event such as etcd corruption.
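Backups can be inspected on the master node itself. A minimal sketch; the snapshot file name is a placeholder, and the exact etcdctl flags vary by etcd version:

```bash
# List the on-host etcd backups (path per the text above).
ls -lh /etc/pf9/etcd-backup/

# Check a snapshot's integrity and key count (file name is a placeholder).
ETCDCTL_API=3 etcdctl snapshot status /etc/pf9/etcd-backup/<snapshot-file> --write-out=table
```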