In this post you will learn:
- Why Run Kubernetes On-premises
- Challenges Running Kubernetes On-premises
- Best Practices for On-premise Kubernetes Implementation
- Kubernetes in Production Needs
- Additional Services
The best Kubernetes architecture for your organization depends on your needs and goals. Kubernetes is often described as a cloud-native technology, and it certainly qualifies as one. However, the cloud-native concept does not exclude the use of on-premises infrastructure in cases where it makes sense. Depending on your organization’s needs regarding compliance, locality and cost for running your workloads, there may be significant advantages to running Kubernetes deployments on-premises.
Kubernetes has achieved an unprecedented adoption rate, due in part to the fact that it substantially simplifies the deployment and management of microservices. Almost equally important is that it allows users who are unable to utilize the public cloud to operate in a “cloud-like” environment. It does this by decoupling dependencies and abstracting infrastructure away from your application stack, giving you the portability and the scalability that’s associated with cloud-native applications.
Why Run Kubernetes On-premises
Why do organizations choose the path of running Kubernetes in their own data centers, compared to the relative “cake-walk” with public cloud providers? There are typically a few important reasons why an Enterprise may choose to invest in a Kubernetes on-premises strategy:
1. Compliance & Data Privacy
Some organizations simply can’t use the public cloud, as they are bound by stringent regulations related to compliance and data privacy issues. For example, the GDPR compliance rules may prevent enterprises from serving customers in the european region using services hosted in certain public clouds.
2. Business Policy Reasons
Business policy needs, such as having to run your workloads at specific geographical locations, may make it difficult to use public clouds. Some enterprises may not be able to utilize public cloud offerings from a specific cloud provider due to their business policies related to competition.
3. Being Cloud Agnostic to Avoid Lock-in
Many enterprises may wish to not be tried to a single cloud provider, and hence may want to deploy their applications across multiple clouds, including an on-premises private cloud. This reduces your risk of business continuity impact due to issues with a specific cloud provider. It also gives you leverage around price negotiation with your cloud providers.
This is probably the most important reason to run Kubernetes on-premises. Running all of your applications in the public clouds can get pretty expensive at scale. Specifically if your applications rely on ingesting and processing a large amount of data, such as an AI/ML application, it can get extremely expensive to run it in a public cloud. If you have existing data centers on-premises or in a co-location hosted facility, running Kubernetes on-premises can be an effective way to reduce your operational costs. For more information on this topic, see this latest report from a16z: The Cost of Cloud, a Trillion Dollar Paradox Quoted from the report ” It’s becoming evident that while cloud clearly delivers on its promise early on in a company’s journey, the pressure it puts on margins can start to outweigh the benefits, as a company scales and growth slows. Because this shift happens later in a company’s life, it is difficult to reverse as it’s a result of years of development focused on new features, and not infrastructure optimization”
An effective strategy to run Kubernetes on-premises on your own data centers can be used to transform your business and modernize your applications for cloud-native – while improving infrastructure utilization and saving costs at the same time.
Challenges Running Kubernetes On-premises
There is a downside to running Kubernetes on-premises however, specially if you are going the Do-It-Yourself (DIY) route. Kubernetes is known for its steep learning curve and operational complexity. When using Kubernetes on AWS or Azure – your public cloud provider essentially hides all the complexities from you. Running Kubernetes on-premises means you’re on your own for managing these complexities. Here are specific areas where the complexities are involved:
- Etcd – manage highly available etcd cluster. You need to take frequent backups to ensure business continuity in case the cluster goes down and the etcd data is lost.
- Load balancing – Load balancing may be needed both for your cluster master nodes and your application services running on Kubernetes. Depending on your existing networking setup, you may want to use a load balancer such as F5 or use a software load balancer such as metallb.
- Availability – Its critical to ensure that your Kubernetes infrastructure is highly available and can withstand data center and infrastructure downtimes. This would mean having multiple master nodes per cluster, and when relevant, having multiple Kubernetes clusters, across different availability zones.
- Auto-scaling – Auto-scaling for the nodes of your cluster can help save resources because the clusters can automatically expand and contract depending on workload needs. This is difficult to achieve for bare metal Kubernetes clusters, unless you are using a bare metal automation platform such as open source Ironic or Platform9’s Managed Bare Metal.
- Networking – Networking is very specific to your data center configuration.
- Persistent storage – Majority of your production workloads running on Kubernetes will require persistent storage – block or file storage. The good news is that most of the popular enterprise storage vendors have CSI plugins and supported integrations with Kubernetes. You will need to work with your storage vendor to identify the right plugin and install any additional needed components before you can integration your existing storage solution with Kubernetes on-premises.
- Upgrades – You will need to upgrade your clusters roughly every 3 months when a new upstream version of Kubernetes is released. The version upgrade may create issues if there are API incompatibilities introduced with newer version of Kubernetes. A staged upgrading strategy where your development / test clusters are upgraded first before upgrading your production clusters is recommended.
- Monitoring – You will need to invest in tooling to monitor the health of your Kubernetes clusters in your on-premise Kubernetes environment. If you have existing monitoring and log management tools such as Datadog or Splunk, most of them have specific capabilities around Kubernetes monitoring. Or you may consider investing in an open source monitoring stack designed to help you monitor Kubernetes clusters such as Prometheus and Grafana.
Best Practices for Kubernetes On-premises
Below you will find a set of best practices to run Kubernetes on-premises. Depending on your environment configuration, some or all of these may apply to you.
Integrating with Existing Environment
Kubernetes enables users to run clusters on a diverse set of infrastructures on-premises . So you can “repurpose” your environment to integrate with Kubernetes – using virtual machines or creating your own cluster from scratch on bare metal. But you would need to build a deep understanding of the specifics of deploying Kubernetes on your existing environment, including your servers, storage systems, and networking infrastructure, to get a well configured production Kubernetes environment.
The three most popular ways to deploy Kubernetes on-premises are:
- Virtual machines on your existing VMware vSphere environment
- Linux physical servers running Ubuntu, CentOS or RHEL Linux
- Virtual machines on other types of IaaS environments on-premises such as OpenStack.
Running Kubernetes on physical servers can give you native hardware performance, which may be critical for certain types of workloads, however it may limit your ability to quickly scale your infrastructure. If getting bare metal performance is important to you, and if you need to run Kubernetes clusters at scale, then consider investing in a bare metal automation platform such as Ironic , Metal3 or a managed bare metal stack such as Platform9 Managed Bare Metal.
Running Kubernetes on virtual machines in your private cloud on VMware or KVM can give you the elasticity of cloud, as you can dynamically scale your Kubernetes clusters up or down based on workload demand. Clusters created on virtual machines are also easy to setup and tear down, making it easy to create ephemeral test environments for developers.
Staffing Your Team
The CNCF has introduced certifications like Certified Kubernetes Administrator (CKA) and Certified Kubernetes Application Developer (CKAD) that can be achieved by passing a test. The certifications are a good way to assess ones credibility wrt Kubernetes skills. If you are able to hire or train employees with CNCF Certifications, that can be a great option to ensure that you have the right talent for your on-premise Kubernetes project.
You should also plan for the fact that DIY enterprise Kubernetes projects often balloon to months-long (and even years-long) projects trying to tame and effectively manage the open source components at scale. If not planned for appropriately, this can accumulate costs and delay time to market.
Deploy Kubernetes Anywhere for Free
Kubernetes can run on one server that can act as both a master and a worker node for the cluster for a test deployment. But to run a meaningful application in practice, you will needs at least three: one for all the master components which include all the control plane components like the kube-apiserver, etcd, kube-scheduler and kube-controller-manager, and two for the worker nodes where you run kubelet.
- While master components can run on any machine, best practice dictates using a separate set of servers for the master nodes and not running any of your application containers on these machines.
- One key feature of Kubernetes is the ability to recover from failures without losing data. It does this with a ‘political’ system of leaders, elections, and terms – referred to as quorum – which requires “good” hardware to properly fulfill this capability. To be both available and recoverable, best practice dictates allocating three nodes as master nodes with 4GB RAM and 16GB SSD each to this task, with three being the bare minimum and seven the maximum for master nodes.
- An SSD is recommended here since etcd writes to disk, and the smallest delay adversely affects performance. Lastly, always have an odd number of cluster members so a majority can be reached.
- For production environments, you would need a dedicated HAProxy load balancer node, as well as a client machine to run automation.
- It’s also a good idea to get a lot more power than what Kubernetes’ minimum requirements call for. Modern Kubernetes servers typically feature two CPUs with 32 cores each, 2TB of error-correcting RAM and at least four SSDs, eight SATA SSDs, and a couple of 10G network cards.
- It is best practice to run your clusters in a multi-master fashion in Production – to ensure high availability and resiliency of the master components themselves. This means you’ll need at least 3 Master nodes (an odd number, to ensure quorum). You’ll further need to monitor the master(s) and fixing any issues in case one of the replicas are down.
etcd is an open-source distributed key-value store, and the persistent storage for Kubernetes. Kubernetes uses etcd to store all cluster-related data. This includes all the information that exists on your pods, nodes, and cluster. Accounting for this store is mission-critical, to say the least, since it’s the last line of defense in case of cluster failure. Managing highly available, secured etcd clusters for large-scale production deployments is one of the key operational complexities you need to handle when managing Kubernetes on your own infrastructure. For production use, where availability and redundancy are important factors, running etcd as a cluster is critical. Bringing up a secure etcd cluster – particularly on-premises – involves downloading the right binaries, writing the initial cluster configuration on each etcd node, and setting and bringing up etcd. This is in addition to configuring the certificate authority and certificates for secure connections. For an easier way to run etcd cluster on-prem, check out the open-source etcdadm tool.
In case you’re deploying offline or in an air-gapped environment, you’ll need to have your own repositories in place for docker, Kubernetes and any other open-source tools you may be using. This includes helm chart repositories for Kubernetes manifests, as well as binary repositories.
Storage and Networking
Keep in mind that when running Kubernetes on your own in your own data center on-premises, you will need to manage all the storage integrations, load balancers, DNS.
In addition, each one of these components – from storage to networking – needs its own monitoring and alerting systems, and you will need to set up your internal processes to monitor, troubleshoot and fix any common issues that might arise in these related services to ensure the health of your environments.
A container registry enables you to store container images for your applications in a secure and highly available manner. Even when deploying Kubernetes clusters on-premises, you could use hosted registry options such as ECR, docker hub etc. If your container registry must be hosted on-premises, open source Harbor is a good option, although you must access for complexity involved in deploying your own registry.
You also definitely want to install the Kubernetes dashboard which is one of the most useful and popular add-ons. The dashboard is not installed by default, and must be configured separately. Once installed, the dashboard can provide great visibility into all your containerized workloads deployed on your cluster. It will also let you access container logs that can help with debugging.
Best practices include always checking logs when something goes wrong by looking in your syslog files.
This stage can be a lot of fun since you get to experiment with all the tools in the industry, or a major pain — depending on your infrastructure and processes complexity.
Weaveworks and Flannel are both great networking tools, while Istio and Linkerd are popular service mesh options. Grafana and Prometheus help with monitoring and there are a number of tools to automate CI/CD like Jenkins, Bamboo, and JenkinsX.
Security is a major concern. Every open source component needs to be scanned for threats and vulnerabilities. Additionally, keeping track of version updates and patches and then managing their introduction can be labor-intensive, especially if you have a lot of additional services running.
Note that bare-bone Kubernetes is never enough for real-world production applications. A complete Kubernetes infrastructure on-prem needs proper DNS, load balancing, Ingress and K8’s role-based access control (RBAC), alongside a slew of additional components that then makes the deployment process quite daunting for IT.
Once Kubernetes is deployed comes the addition of monitoring, tracing, logging, and all the associated operations for troubleshooting — such as when running out of capacity, ensuring HA, backups, and more.
In conclusion, Kubernetes helps on-premise data centers benefit from cloud-native applications and infrastructure, irrespective of hosting or public cloud providers. They could be on Openstack, KVM, VMware vSphere or even bare metal and still reap the cloud-native benefits that come from integrating with Kubernetes.
Kubernetes On-Premises With Platform9
Platform9 Managed Kubernetes (PMK) tries to address a number of the above best practices in a single, easy to use, SaaS managed platform, that can let you manage one or many Kubernetes clusters, across your infrastructure on-premises, in the public clouds, or both. Check out our PMK page for more details on PMK features. You can also find more information about PMK including useful product demo videos here Getting started is easy. Create your free account today.
Kubernetes on Bare Metal: Why and How
This post dives deeper into details of benefits of running Kubernetes on bare metal, comparison of running Kubernetes on bare metal vs virtual machines, and additional details.
Read more: Kubernetes on Bare Metal: Why and How
7 Key Considerations for Kubernetes in Production
A complete Kubernetes infrastructure needs proper DNS, load balancing, Ingress and Kubernetes role-based access control (RBAC), alongside a slew of additional components that then makes the deployment process quite daunting for IT. Once Kubernetes is deployed comes the addition of monitoring and all the associated operations playbooks to fix problems as they occur — such as when running out of capacity, ensuring HA, backups, and more. Finally, the cycle repeats again, whenever there’s a new version of Kubernetes released by the community, and your production clusters need to be upgraded without risking any application downtime.
Bare-bone Kubernetes is never enough for real-world production applications. In this blog post you’ll learn 7 Key Considerations for Kubernetes in Production.
Kubernetes Upgrade: The Definitive Guide to Do-It-Yourself
Often you are required to upgrade the Kubernetes cluster to keep up with the latest security features and bug fixes, as well as benefit from new features being released on an on-going basis. This is especially important when you have installed a really outdated version or if you want to automate the process and always be on top of the latest supported version.
In general, when operating an HA Kubernetes Cluster, the upgrade process involves two separate tasks which may not overlap or be performed simultaneously: upgrading the Kubernetes Cluster; and, if needed, upgrading the etcd cluster which is the distributed key-value backing store of Kubernetes. In this blog post you’ll see how to perform those tasks with minimal disruptions.
Top Considerations for Migrating Kubernetes Across Platforms
Migration Kubernetes may include moving from one public cloud vendor to another; from a private data center to the cloud or vice-versa; from a data center or cloud to a colocation facility; or across private data centers. It could be a wholesale, one-time migration of your application to a new environment or a dynamic and ongoing migration between environments. Regardless of target, strategy, or reason, migration requires careful consideration and you’ll benefit through the use of third-party tools and managed platforms. There are many considerations in terms of data, differences in connectivity, cloud vendors, platform or bare-metal services, and so on.
- See the Future of Cluster Management with Platform9 Experts at ArgoCon 2022 - August 31, 2022
- Supercloud 2022: Is Kubernetes an Enabler or a Blocker? - August 17, 2022
- Meet Emilia A’ Bell – Chief Revenue Officer, Platform9 - July 28, 2022