Kubernetes on-premises: why and how

Overview

The best Kubernetes architecture for your organization depends on your needs and goals. Kubernetes is often described as a cloud-native technology, and it certainly qualifies as one. However, the cloud-native concept does not exclude the use of on-premises infrastructure in cases where it makes sense. Depending on your organization’s needs regarding compliance, locality, current architecture, and cost for running your workloads, there may be significant advantages to running Kubernetes deployments on-premises.

Kubernetes has achieved an unprecedented adoption rate, due in part to the fact that it substantially simplifies the deployment and management of microservices. Almost equally important is that it allows users who are unable to utilize the public cloud to operate in a “cloud-like” environment. It does this by decoupling dependencies and abstracting infrastructure away from your application stack, giving you the portability and the scalability that are associated with cloud-native applications.

Why Run Kubernetes On-premises

Why do organizations choose to run Kubernetes in their own data centers, compared to the relative “cake-walk” with public cloud providers? There are typically a few important reasons why an enterprise may choose to invest in a Kubernetes on-premises strategy:

1. Compliance & Data Privacy

Some organizations simply can’t use the public cloud, as they are bound by stringent regulations related to compliance and data privacy issues. For example, the GDPR compliance rules may prevent enterprises from serving customers in the European region using services hosted in certain public clouds.

2. Business Policy Reasons

Business policy needs, such as having to run your workloads at specific geographical locations, may make it difficult to use public clouds. Additionally, some enterprises may not be able to utilize public cloud offerings from a specific cloud provider due to their business policies related to competition.

3. Being Cloud Agnostic to Avoid Lock-in

Many enterprises may not wish to be tied to a single cloud provider and hence may want to deploy their applications across multiple clouds, including an on-premises private cloud. This could potentially reduce business continuity risk due to issues with a specific cloud provider. It also gives you leverage around price negotiation with your cloud providers.

4. Cost

Cost is probably the most important reason to run Kubernetes on-premises. Running all of your applications in the public cloud can get expensive at scale. Specifically, if your applications rely on ingesting and processing large amounts of data, such as with an AI/ML application, a public cloud can get extremely expensive. If you have existing data centers on-premises or in a co-location-hosted facility, running Kubernetes on-premises can be an effective way to reduce your operational costs.

According to a 2021 report from a16z, “It’s becoming evident that while cloud clearly delivers on its promise early on in a company’s journey, the pressure it puts on margins can start to outweigh the benefits, as a company scales and growth slows. Because this shift happens later in a company’s life, it is difficult to reverse as it’s a result of years of development focused on new features and not infrastructure optimization.”

An effective Kubernetes strategy running on-premises in your own data centers can be used to transform your business and modernize your applications for cloud-native – while improving infrastructure utilization and saving costs at the same time.

Challenges Running Kubernetes On-premises

There is a downside to running Kubernetes on-premises, however. Do-It-Yourself (DIY), or self-managed, Kubernetes is known for its steep learning curve and operational complexity. When using Kubernetes on AWS or Azure, your public cloud provider essentially abstracts the complexities from you. Running Kubernetes on-premises means you’re on your own. Here are specific areas where this challenge can be most apparent:

  1. Etcd – You must manage a highly available etcd cluster and take frequent backups to ensure business continuity in case the cluster goes down and the etcd data is lost.
  2. Load balancing – Load balancing may be needed both for your cluster master nodes and for your application services running on Kubernetes. Depending on your existing networking setup, you may want to use a load balancer such as F5 or a software load balancer such as MetalLB.
  3. Availability – It’s critical to ensure that your Kubernetes infrastructure is highly available and can withstand data center and infrastructure downtimes. This would mean having multiple master nodes per cluster, and, when relevant, having multiple Kubernetes clusters across different availability zones.
  4. Auto-scaling – Auto-scaling based on workload needs can help save resources. This is difficult to achieve for bare metal Kubernetes clusters unless you are using a bare metal automation platform such as open-source Ironic or Platform9’s Managed Bare Metal.
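  5. Networking – Networking is very specific to your data center configuration, so the cluster's network plugin, IP address ranges, and routing have to be fitted to the network you already run.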
  6. Persistent storage – The majority of your production workloads running on Kubernetes will require persistent storage – block or file storage. The good news is that most of the popular enterprise storage vendors have CSI plugins and supported integrations with Kubernetes. You will need to work with your storage vendor to identify the right plugin and install any needed components before you can integrate your existing storage solution with Kubernetes on-premises.
  7. Upgrades – Upstream Kubernetes ships a new minor version roughly every four months (about three releases a year), and you will need to upgrade your clusters regularly to stay on a supported version. A version upgrade may create issues if a newer version introduces API incompatibilities. A staged upgrade strategy, where your development/test clusters are upgraded before your production clusters, is recommended.
  8. Monitoring – You will need to invest in tooling to monitor the health of the clusters in your on-premises Kubernetes environment. Most monitoring and log management tools have specific capabilities for Kubernetes monitoring. If you are already using Datadog, Splunk, or similar tools, you can use them to monitor your on-premises Kubernetes implementation. Alternatively, consider an open-source monitoring stack designed for Kubernetes clusters, such as Prometheus and Grafana.
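
The staged upgrade strategy in item 7 can be sketched as a small planning helper. This is a minimal illustration rather than a real upgrade tool: the `Cluster` type, the stage names, and the cluster names in the usage note are all hypothetical. It also assumes the upstream guidance that minor versions are upgraded one at a time rather than skipped.

```python
from dataclasses import dataclass

# Hypothetical stage names, ordered from least to most critical.
STAGE_ORDER = {"dev": 0, "test": 1, "prod": 2}


@dataclass(frozen=True)
class Cluster:
    name: str
    stage: str  # one of the keys in STAGE_ORDER
    minor: int  # current Kubernetes minor version, e.g. 27 for v1.27


def minor_steps(current: int, target: int) -> list[int]:
    """Return every minor version to pass through, one step at a time."""
    if target < current:
        raise ValueError("downgrades are not planned here")
    return list(range(current + 1, target + 1))


def upgrade_plan(clusters: list[Cluster], target: int) -> list[tuple[str, int]]:
    """Order upgrades so that, for each minor version, dev and test
    clusters are upgraded before any production cluster."""
    if not clusters:
        return []
    by_stage = sorted(clusters, key=lambda c: STAGE_ORDER[c.stage])
    lowest = min(c.minor for c in clusters)
    plan = []
    for minor in minor_steps(lowest, target):
        for cluster in by_stage:
            # Skip clusters that are already at or past this version.
            if cluster.minor < minor:
                plan.append((cluster.name, minor))
    return plan
```

For example, `upgrade_plan([Cluster("prod-a", "prod", 27), Cluster("dev-a", "dev", 27)], 29)` schedules `dev-a` before `prod-a` for v1.28, and again for v1.29. In practice you would also add a soak period between the stages.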

Best Practices for Kubernetes On-premises

Below you will find a set of best practices to run Kubernetes on-premises. Depending on your environment configuration, some or all of these may apply to you.

Integrating with Existing Environment

Kubernetes enables users to run clusters on a diverse range of on-premises infrastructure. You can repurpose your existing environment to integrate with Kubernetes, either by using virtual machines or by creating your own cluster from scratch on bare metal. To do this, you need a deep understanding of the specifics of deploying Kubernetes in your existing environment, including your servers, storage systems, and networking infrastructure, to get a well-configured production Kubernetes environment.

The three most popular ways to deploy Kubernetes on-premises are:

  1. Virtual machines on your existing VMware vSphere environment
  2. Linux physical servers running Ubuntu, CentOS, or RHEL Linux
  3. Virtual machines on other types of IaaS environments on-premises, such as OpenStack.

Running Kubernetes on physical servers can give you native hardware performance, which may be critical for certain types of workloads. However, it may limit your ability to quickly scale your infrastructure. If bare metal performance is important to you and you need to run Kubernetes clusters at scale, consider investing in a bare metal automation platform such as Ironic or Metal3, or a managed bare metal stack such as Platform9 Managed Bare Metal.

Running Kubernetes on virtual machines in your private cloud on VMware or KVM can give you the elasticity of the cloud, as you can dynamically scale your Kubernetes clusters up or down based on workload demand. Clusters created on virtual machines are also easy to set up and tear down, making it easy to create ephemeral test environments for developers.

Staffing Your Team

The Cloud Native Computing Foundation (CNCF) has introduced certifications such as Certified Kubernetes Administrator (CKA) and Certified Kubernetes Application Developer (CKAD). These certifications are a good way to assess Kubernetes skills. A great way to ensure that you have the right skills for your on-premises Kubernetes implementation is to train or hire team members with these certifications.

You should also plan for a DIY enterprise Kubernetes project to take months, or even years, as you work to tame and effectively manage the open-source components at scale. If not appropriately planned for, this can drive up costs and delay time to market.

Node Configuration

For a test deployment, Kubernetes can run on a single server that acts as both the master and a worker node for the cluster. To run a meaningful application in practice, you will need at least three servers: one for the control plane components (kube-apiserver, etcd, kube-scheduler, and kube-controller-manager) and two worker nodes, each running the kubelet, where your workloads run.

  • While master components can run on any machine, best practice dictates using a separate set of servers for the master nodes and not running any of your application containers on these machines.
  • One key feature of Kubernetes is the ability to recover from failures without losing data. Its etcd datastore does this with the Raft consensus algorithm, a ‘political’ system of leaders, elections, and terms in which a change is committed only once a majority of members (a quorum) agrees, and it needs “good” hardware to do so reliably. To be both available and recoverable, allocate three master nodes with 4GB RAM and 16GB SSD each to this task; three is the bare minimum and seven the maximum for master nodes.
  • An SSD is recommended here since etcd writes to disk, and the smallest delay can adversely affect performance. Lastly, always have an odd number of cluster members so a majority can be reached.
  • For production environments, you need a dedicated load balancer node (HAProxy, for example), as well as a client machine to run automation.
  • It’s also a good idea to get substantially more power than what Kubernetes’ minimum requirements call for. Modern Kubernetes servers typically feature two CPUs with 32 cores each, 2TB of error-correcting RAM, and at least four SSDs, eight SATA SSDs, and a couple of 10G network cards.
  • It is best practice to run your clusters in a multi-master fashion in production to ensure high availability and resiliency of the master components themselves. This means you’ll need at least three master nodes (an odd number, to ensure quorum). You’ll also need to monitor the masters and fix any issues when one of the replicas is down.
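
The odd-number advice above follows from simple majority arithmetic, sketched below. The function names are made up for this illustration; the only rule it assumes is that a quorum is a strict majority of voting members.

```python
def quorum(members: int) -> int:
    """Smallest strict majority of a cluster with `members` voting members."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    # Print members, quorum size, and tolerated failures for 1..7 members.
    for n in range(1, 8):
        print(n, quorum(n), failures_tolerated(n))
```

Three members tolerate one failure, and so do four, because the fourth member raises the quorum from two to three without adding any tolerance. That is why odd sizes such as three, five, or seven are the usual choices.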

Etcd Configuration

etcd is an open-source distributed key-value store and the persistent storage for Kubernetes. Kubernetes uses etcd to store all cluster-related data. This includes all the information that exists on your pods, nodes, and cluster. Accounting for this store is mission-critical, to say the least, since it’s the last line of defense in case of cluster failure. Managing highly available, secured etcd clusters for large-scale production deployments is one of the key operational complexities you need to handle when managing Kubernetes on your own infrastructure.
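
Because etcd is the last line of defense, regular snapshots are worth automating. The sketch below only builds the `etcdctl snapshot save` command line; it does not run it. The backup directory, endpoint, and certificate paths in the usage note are assumptions for illustration and will differ in your installation.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def snapshot_command(backup_dir, endpoint, cacert, cert, key, now=None):
    """Build an argv list for `etcdctl snapshot save` with a timestamped file."""
    now = now or datetime.now(timezone.utc)
    # Timestamped file name so successive snapshots never overwrite each other.
    target = PurePosixPath(backup_dir) / f"etcd-{now:%Y%m%dT%H%M%SZ}.db"
    return [
        "etcdctl",
        f"--endpoints={endpoint}",
        f"--cacert={cacert}",
        f"--cert={cert}",
        f"--key={key}",
        "snapshot",
        "save",
        str(target),
    ]
```

A scheduled job on an etcd node could pass the result to `subprocess.run(cmd, check=True)`, for example with `snapshot_command("/var/backups/etcd", "https://127.0.0.1:2379", "/etc/etcd/pki/ca.crt", "/etc/etcd/pki/client.crt", "/etc/etcd/pki/client.key")` (all paths hypothetical), and then copy the snapshot off the host.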

For production use, where availability and redundancy are important factors, running etcd as a cluster is critical. Bringing up a secure etcd cluster, particularly on-premises, involves downloading the right binaries, writing the initial cluster configuration on each etcd node, and then setting up and starting etcd. This is in addition to configuring the certificate authority and certificates for secure connections. For an easier way to run an etcd cluster on-premises, check out the open-source etcdadm tool.
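
To make "writing the initial cluster configuration on each etcd node" concrete, here is a minimal sketch that derives the per-node etcd flags for a static cluster. The member names, IP addresses, and cluster token are hypothetical. It assumes the default etcd ports (2380 for peers, 2379 for clients) and leaves out the TLS certificate flags that a secure cluster also needs.

```python
def etcd_flags(members: dict[str, str], name: str,
               token: str = "etcd-cluster-1") -> list[str]:
    """Return the static-bootstrap flags for the member called `name`.

    `members` maps member name to IP address and must be identical
    on every node so that all members agree on the cluster layout.
    """
    ip = members[name]
    # The same comma-separated member list is passed to every node.
    initial_cluster = ",".join(
        f"{member}=https://{addr}:2380" for member, addr in members.items()
    )
    return [
        f"--name={name}",
        f"--initial-advertise-peer-urls=https://{ip}:2380",
        f"--listen-peer-urls=https://{ip}:2380",
        f"--listen-client-urls=https://{ip}:2379,https://127.0.0.1:2379",
        f"--advertise-client-urls=https://{ip}:2379",
        f"--initial-cluster-token={token}",
        f"--initial-cluster={initial_cluster}",
        "--initial-cluster-state=new",
    ]
```

Calling `etcd_flags({"etcd-1": "10.0.0.11", "etcd-2": "10.0.0.12", "etcd-3": "10.0.0.13"}, "etcd-2")` yields the flags for the second member. Only the name and the node's own URLs change from node to node, while `--initial-cluster` stays the same everywhere.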

Repositories

If you are deploying offline or in an air-gapped environment, you’ll need to have your own repositories in place for Docker, Kubernetes, and any other open-source tools you may be using. This includes Helm chart repositories for Kubernetes manifests, as well as binary repositories.
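
In an air-gapped setup, image references in your manifests have to point at the internal mirror instead of the public registries. The helper below sketches one possible rewrite rule. The mirror host name in the usage note is hypothetical, and the rule assumes the common Docker convention that the first path component is a registry host only if it contains a dot or a colon, or is `localhost`.

```python
def mirrored_reference(image: str, mirror: str) -> str:
    """Rewrite an image reference so it is pulled from `mirror`.

    The source registry host is dropped and the repository path is kept,
    so the mirror must not hold two images with the same path.
    """
    first, _, rest = image.partition("/")
    has_registry = bool(rest) and (
        "." in first or ":" in first or first == "localhost"
    )
    if has_registry:
        path = rest                  # strip the original registry host
    elif rest:
        path = image                 # e.g. "bitnami/nginx:1.25"
    else:
        path = f"library/{image}"    # bare Docker Hub name such as "nginx:1.25"
    return f"{mirror}/{path}"
```

With a mirror at `registry.internal.example:5000`, the reference `registry.k8s.io/kube-apiserver:v1.28.0` becomes `registry.internal.example:5000/kube-apiserver:v1.28.0`. Keeping the source host as a path prefix instead would avoid name collisions, at the cost of longer references.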

Storage and Networking

Keep in mind that when running Kubernetes in your own data center on-premises, you will need to manage all of the storage integrations, load balancers, and DNS.

In addition, each one of these components – from storage to networking – needs its own monitoring and alerting systems, and you will need to set up your internal processes to monitor, troubleshoot and fix any common issues that might arise in these related services to ensure the health of your environments.

Container Registry

A container registry enables you to store container images for your applications in a secure and highly available manner. Even when deploying Kubernetes clusters on-premises, you can use hosted registry options such as Amazon ECR or Docker Hub. If your container registry must be hosted on-premises, open-source Harbor is a good option, although you must assess the complexity involved in deploying your own registry.

UI

You will also want to install the Kubernetes dashboard, which is one of the most useful and popular add-ons. The dashboard is not installed by default and must be configured separately. Once installed, it provides visibility into all the containerized workloads deployed on your cluster. It also lets you access container logs, which helps with debugging.

Troubleshooting

When something goes wrong, always start with the logs: check the syslog files on the affected nodes for messages from the kubelet, the container runtime, and the control plane components.
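
As a first pass over a syslog file, a short filter can surface lines that mention both a Kubernetes component and a problem keyword. This is only a sketch: the component list and the keywords are assumptions to adapt, and on systemd-based hosts the same messages may live in the journal instead (for example, `journalctl -u kubelet`).

```python
import re

# Assumed component names and problem keywords; adjust for your environment.
COMPONENTS = ("kubelet", "etcd", "kube-apiserver", "containerd")
PROBLEM = re.compile(r"\b(error|failed|fatal|panic)\b", re.IGNORECASE)


def suspicious_lines(lines, components=COMPONENTS):
    """Return lines that name a watched component and contain a problem keyword."""
    hits = []
    for line in lines:
        if any(name in line for name in components) and PROBLEM.search(line):
            hits.append(line.rstrip("\n"))
    return hits
```

Since an open file is an iterable of lines, `suspicious_lines(open("/var/log/syslog"))` works on distributions that write a syslog file at that path. Treat the result as a starting point for a closer read, not as a diagnosis.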

Additional Services

This stage can be a lot of fun, since you get to experiment with all the tools in the industry, or a major pain, depending on the complexity of your infrastructure and processes.

Weave Net (from Weaveworks) and Flannel are both great networking tools, while Istio and Linkerd are popular service mesh options. Grafana and Prometheus help with monitoring, and there are a number of tools to automate CI/CD, such as Jenkins, Bamboo, and Jenkins X.

Security is a major concern. Every open source component needs to be scanned for threats and vulnerabilities. Additionally, keeping track of version updates and patches and then managing their introduction can be labor-intensive, especially if you have a lot of additional services running.

Note that bare-bones Kubernetes is never enough for real-world production applications. A complete on-premises Kubernetes infrastructure needs proper DNS, load balancing, Ingress, and Kubernetes role-based access control (RBAC), alongside a slew of additional components that make the deployment process quite daunting for IT.

Once Kubernetes is deployed, you still need to add monitoring, tracing, and logging, along with all the associated operations for troubleshooting, such as handling capacity shortages, ensuring HA, taking backups, and more.

Conclusion

In conclusion, Kubernetes helps on-premises data centers benefit from cloud-native applications and infrastructure, irrespective of hosting or public cloud providers. They can run on OpenStack, KVM, VMware vSphere, or even bare metal and still reap the cloud-native benefits that come from integrating with Kubernetes.

Kubernetes On-Premises With Platform9

Platform9 Managed Kubernetes (PMK) addresses a number of the above best practices in a single, easy-to-use container management and orchestration platform that lets you manage Kubernetes clusters on any infrastructure, anywhere. Check out our PMK page for more details on PMK features, including useful product demo videos. Getting started is easy.

Further Readings

Kubernetes on Bare Metal: Why and How

This post dives deeper into the benefits of running Kubernetes on bare metal, how that compares with running Kubernetes on virtual machines, and additional details.


7 Key Considerations for Kubernetes in Production

A complete Kubernetes infrastructure needs proper DNS, load balancing, Ingress, and Kubernetes role-based access control (RBAC), alongside a slew of additional components that make the deployment process quite daunting for IT. Once Kubernetes is deployed, you must add monitoring and all the associated operations playbooks to fix problems as they occur, such as running out of capacity, ensuring HA, backups, and more. Finally, the cycle repeats whenever the community releases a new version of Kubernetes and your production clusters need to be upgraded without risking any application downtime.

Bare-bones Kubernetes is never enough for real-world production applications. In this blog post you’ll learn 7 Key Considerations for Kubernetes in Production.


Kubernetes Upgrade: The Definitive Guide to Do-It-Yourself

Often you are required to upgrade the Kubernetes cluster to keep up with the latest security features and bug fixes, as well as benefit from new features being released on an on-going basis. This is especially important when you have installed a really outdated version or if you want to automate the process and always be on top of the latest supported version.

In general, when operating an HA Kubernetes Cluster, the upgrade process involves two separate tasks which may not overlap or be performed simultaneously: upgrading the Kubernetes Cluster; and, if needed, upgrading the etcd cluster which is the distributed key-value backing store of Kubernetes. In this blog post you’ll see how to perform those tasks with minimal disruptions.


Top Considerations for Migrating Kubernetes Across Platforms

Migrating Kubernetes may include moving from one public cloud vendor to another; from a private data center to the cloud or vice versa; from a data center or cloud to a colocation facility; or across private data centers. It could be a wholesale, one-time migration of your application to a new environment or a dynamic and ongoing migration between environments. Regardless of target, strategy, or reason, migration requires careful consideration, and you’ll benefit from the use of third-party tools and managed platforms. There are many considerations in terms of data, differences in connectivity, cloud vendors, platform or bare-metal services, and so on.

