Multi-Master Architecture

Creating and operating a highly available Kubernetes cluster requires multiple control plane (master) nodes. To achieve this, each master node must be able to communicate with every other master, and the control plane must be addressable by a single IP address. This can be achieved through external load balancers, VRRP, or through an ingress controller within the Kubernetes cluster itself. Each of these architectures has limitations; however, only VRRP allows a multi-master cluster to be set up repeatably on any virtual or physical infrastructure.

PMK uses the Virtual Router Redundancy Protocol (VRRP) with Keepalived to provide a virtual IP (VIP) that fronts the active master node in a multi-master Kubernetes cluster. At any point in time, the VRRP protocol associates one of the master nodes with the virtual IP to which the clients (kubelet, users, Qbert) connect.
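PMK configures Keepalived on the masters automatically, so no manual setup is required. The snippet below is only an illustrative sketch of the kind of VRRP instance involved, not the configuration PMK actually generates; the interface name (eth0), virtual_router_id, password, and virtual IP (10.0.0.100) are placeholder assumptions.

```
vrrp_instance K8S_API_VIP {
    state BACKUP            # masters start as BACKUP; VRRP elects the active one
    interface eth0          # physical interface label chosen at cluster creation
    virtual_router_id 51    # must match on every master in the cluster
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass changeme
    }
    virtual_ipaddress {
        10.0.0.100          # the reserved virtual IP fronting the API server
    }
}
```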

Multi-Master Reserved IP Address Requirement

To deploy a multi-master cluster, you must reserve an IP address for the cluster's virtual IP (VIP) interface. The virtual IP fronts the active master via VRRP/Keepalived and provides high availability across the master nodes.

The virtual IP must be reserved: if any other network device is provisioned and claims the IP, the cluster will become unavailable. To avoid blocked connectivity, make sure firewalls allow ingress and egress traffic to and from the virtual IP, and that any infrastructure virtualization is not blocking this traffic at the virtual machine layer.
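As a quick sanity check, you can verify that the virtual IP is reachable on the API server port from the networks your clients use. The Python sketch below uses placeholder values (a VIP of 10.0.0.100 and port 443); substitute the reserved virtual IP and port for your own cluster.

```python
import socket

# Placeholder values -- substitute the reserved virtual IP for your cluster
# and the port your Kubernetes API server listens on (commonly 443 or 6443).
VIP = "10.0.0.100"
PORT = 443

# A plain TCP connect test: if a firewall or the virtualization layer is
# blocking traffic to the virtual IP, this will time out or be refused.
try:
    with socket.create_connection((VIP, PORT), timeout=5):
        print(f"TCP connection to {VIP}:{PORT} succeeded")
except OSError as exc:
    print(f"Could not reach {VIP}:{PORT}: {exc}")
```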

Virtual IP Addressing with VRRP

Multi-master VRRP Diagram

During cluster creation, PMK binds the virtual IP to a specific physical interface on each master node, specified by the administrator. The virtual IP should be reachable from the network that this physical interface connects to. The label of the interface (for example, eth0) must be provided when creating the cluster, and every master must use the same interface label for the virtual IP binding.
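Because every master must expose the same interface label, it can be useful to verify the label on each node before creating the cluster. The following Python sketch assumes the label eth0 and simply checks that an interface with that name exists on the node it runs on.

```python
import socket

# The interface label supplied at cluster creation time; "eth0" is only an
# example -- use the label you actually configured.
EXPECTED_INTERFACE = "eth0"

# socket.if_nameindex() lists the network interfaces on this node (Linux).
# Run this on every master to confirm the same label is present everywhere.
present = {name for _, name in socket.if_nameindex()}

if EXPECTED_INTERFACE in present:
    print(f"OK: interface {EXPECTED_INTERFACE} exists on this master")
else:
    print(f"MISSING: {EXPECTED_INTERFACE} not found; available: {sorted(present)}")
```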

When the cluster is running, all client requests to the Kubernetes API server are sent only to the active master node (the master currently mapped to the virtual IP). If that master goes down, VRRP elects a new active master and remaps the virtual IP to it, making it the target of all new client requests.

To create a highly available cluster, we recommend designing your cluster with 3 or 5 master nodes.

While Managed Kubernetes supports creation of a single master bare metal cluster today, we do not recommend this configuration for production deployments.

To scale a single-master cluster, a virtual IP must be provided at the time of cluster creation.

For production deployments, we recommend creating clusters with 3 or 5 master nodes. Once a cluster is created with at least 3 master nodes, you can scale up the number of masters after cluster creation.

Etcd cluster configuration

In a multi-master cluster, Platform9 runs an instance of etcd on each of the master nodes. For the etcd cluster to be healthy, a quorum (majority) of etcd nodes must be up and running at all times (for example, 2 out of 3 masters must be running). Losing quorum results in a non-functional etcd cluster, which in turn causes the Kubernetes cluster to stop functioning. It is therefore recommended to create production clusters with 3 or 5 master nodes.

See the etcd documentation for more information.

Cluster master configuration vs tolerance for loss of masters

As discussed in the Etcd cluster configuration section above, Platform9 runs an instance of etcd on each master node. For the etcd cluster to function properly, a majority of nodes (a quorum) must agree on updates to the cluster state. Hence, the number of masters you configure your cluster with has a direct impact on how well the cluster can tolerate the loss of a master node.

A cluster can lose one or more master nodes in the following scenarios:

  1. One or more master nodes go down
  2. A network partition results in masters not being able to communicate with each other

For a cluster with n members, quorum is defined by the formula (n/2)+1, where the division rounds down. For any odd-sized cluster, adding one node always increases the number of nodes necessary for quorum. Although adding a node to an odd-sized cluster appears better because there are more machines, the fault tolerance is worse: the same number of nodes may fail without losing quorum, but there are now more nodes that can fail.
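To make the arithmetic concrete, the short Python sketch below computes the quorum size and resulting fault tolerance for cluster sizes 1 through 5; note that going from 3 to 4 members raises the quorum requirement without improving fault tolerance.

```python
# Quorum and fault tolerance for an etcd cluster of n members:
#   quorum          = (n // 2) + 1   (integer division, i.e. floor(n/2) + 1)
#   fault tolerance = n - quorum
for n in range(1, 6):
    quorum = (n // 2) + 1
    tolerance = n - quorum
    print(f"{n} member(s): quorum={quorum}, tolerates {tolerance} failure(s)")

# Expected output:
#   1 member(s): quorum=1, tolerates 0 failure(s)
#   2 member(s): quorum=2, tolerates 0 failure(s)
#   3 member(s): quorum=2, tolerates 1 failure(s)
#   4 member(s): quorum=3, tolerates 1 failure(s)
#   5 member(s): quorum=3, tolerates 2 failure(s)
```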

If the cluster is in a state where it cannot tolerate any more failures, adding a node before removing nodes is dangerous: if the new node fails to register with the cluster (for example, its address is misconfigured), quorum will be permanently lost. The following table maps the number of masters a cluster is configured with to the loss of masters it can tolerate.

Number of Masters | Impact of loss of masters
1 | A single-master cluster is never recommended for production environments, as it cannot tolerate the loss of any master. It is only recommended for test environments where you wish to quickly deploy a cluster and can tolerate the possibility of cluster downtime caused by the master going down. You cannot add more master nodes to a cluster that is created with a single master node today; you need to start with a cluster that has at least 2 master nodes before you can add any more masters to it.
2 | A 2-master cluster cannot tolerate the loss of any master. Losing 1 master causes quorum to be lost, so the etcd cluster, and hence the Kubernetes cluster, will not function.
3 | 3 masters is the minimum we recommend for a highly available cluster. A 3-master cluster can tolerate the loss of at most 1 master at a given time. In that case, the remaining 2 masters retain a majority and will elect a new active master if necessary.
4 | A 4-master cluster can tolerate the loss of at most 1 master at a given time. In that case, the remaining 3 masters will have a majority and will elect a new active master if necessary.
5 | A 5-master cluster can tolerate the loss of at most 2 masters at a given time. In that case, the remaining 4 or 3 masters will have a majority and will elect a new active master if necessary.

For more information, see the etcd FAQ documentation.
