Kubernetes networking has matured a lot since its inception. At a quick glance, Kubernetes architecture encompasses all the components you need – like load balancer integration, egress gateways, network security policies, multiple ways to handle ingress traffic, and routing within the cluster. Kubernetes has the ability to layer these components and combine them to make a holistic network that supports almost any scenario that organizations need to successfully leverage Kubernetes as their container orchestration platform.
Conceptually, all these components work as you would imagine. But like all things Kubernetes, there are different approaches that can be taken – like NodePort versus an Ingress Controller – to ultimately achieve the same result of moving traffic from outside the cluster to a running application instance. Each approach carries its own benefits and trade-offs.
Another option for ingress traffic is the LoadBalancer service type, but it only works out-of-the-box on the major public clouds, since it relies on the cloud provider's controller to provision the load balancer and external IP.
Networking in Kubernetes is one of the best examples of where relying on a distribution or managed offering makes a lot of sense, as there is as much art as science involved in making all the components work seamlessly together. All the required components are in the Kubernetes ecosystem. But the magic is in the delivery.
It All Revolves Around a Service
A “service” is defined as the combination of a group of pods and a policy to access them. This sounds simple, and that is intentional. A service needs three things: a name (mychatapp-service), a way to identify the pods in its group (typically a label like irc=forever), and a way to access those pods (port 6667 via TCP). It can get more complicated once you start talking about health checks and all the background processes, but for the purposes of understanding networking, those details aren’t important.
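Using the names above, a minimal Service manifest might look like the following sketch (the service name, label, and port come from the example; everything else is boilerplate):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mychatapp-service
spec:
  selector:
    irc: forever        # any pod carrying this label joins the service's group
  ports:
    - protocol: TCP
      port: 6667        # port the service exposes on its ClusterIP
      targetPort: 6667  # port the pods are actually listening on
```

With no `type` specified, this defaults to a ClusterIP service – reachable only from inside the cluster, as described next.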
Once a service has been established, Kubernetes assigns it a ClusterIP, which is an IP address that is accessible only within the Kubernetes cluster. Now other containers within the cluster can start to access the service through its ClusterIP and not care about how many pods are supporting the service or the exact nodes they are running on. But what if the service is for external clients not in the cluster? The reality is that most services aren’t created for internal-only consumption.
We now have a dilemma: we need options to expose the service to the world. Thankfully, Kubernetes has multiple options. The problem with multiple options – each with its own unique approach – is that a choice needs to be made on which approach to use. Mixing and matching multiple types in one cluster does increase complexity, and complexity always makes ongoing management more challenging.
Now let’s discuss those options.
The Easiest Way is Via a NodePort
NodePort is named quite literally, like many other functional components within Kubernetes. It is an open port on every worker node in the cluster. When traffic is received on that open port, it is forwarded to the ClusterIP of the service it represents. In a single-node cluster this is very straightforward. In a multi-node cluster the internal routing can get more complicated, and you may want to introduce an external load balancer so you can spread traffic across all the nodes and handle node failures more gracefully.
NodePort is great, but it has a few limitations. The first is that clients connect directly to nodes, so you need to track the individual node IPs. The second is that it only exposes one service per port. The third is that the ports available to NodePort are in the 30,000 to 32,767 range.
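As a sketch, exposing the chat service from earlier via a NodePort only requires setting the service type (the `nodePort` value here is illustrative – if omitted, Kubernetes picks one from the range automatically):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mychatapp-service
spec:
  type: NodePort
  selector:
    irc: forever
  ports:
    - protocol: TCP
      port: 6667        # ClusterIP port, still reachable inside the cluster
      targetPort: 6667  # port on the pods
      nodePort: 30667   # must fall within 30000-32767 unless the default range is changed
```

Clients then reach the service at `<any-node-ip>:30667`.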
A LoadBalancer Service is the Cloud Default
The LoadBalancer service type is the default method for many Kubernetes installations in the cloud, and it works great. It supports multiple protocols and multiple ports per service. But by default it uses a dedicated IP for every service, and each IP gets its own load balancer configured in the cloud. This adds cost and overhead that can be overkill for a cluster running multiple services – which is almost every cluster these days.
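A minimal sketch of a LoadBalancer service manifest – the external IP is assigned automatically by the cloud provider (or by MetalLB, discussed below):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mychatapp-service
spec:
  type: LoadBalancer   # triggers provisioning of an external load balancer and IP
  selector:
    irc: forever
  ports:
    - protocol: TCP
      port: 6667       # the service controls the exact external port it wants
      targetPort: 6667
```

Once provisioned, the assigned address appears in the service's `status.loadBalancer` field.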
Load Balancers in the Cloud vs on Bare Metal
There is an advantage to using the Kubernetes LoadBalancer feature on the biggest public clouds. Through the cloud controller, Kubernetes will automatically provision and deprovision the required external IP and associated load balancer, and connect it to the right nodes in the cluster.
If you are running on-premises, especially on bare metal, there is no load-balancer service or pool of IPs sitting idle and waiting for general usage. This is where the open source project MetalLB comes into play. It has been designed from the ground up to address this need for Kubernetes specifically. It is still a fairly young project and requires experience to make it stable and reliable. Platform9 includes MetalLB and has the expertise to support on-premises deployments.
MetalLB even runs within Kubernetes to make it easier to manage and maintain. To use MetalLB, you need a pool of IP addresses it can distribute, and a few ports open. It also supports Border Gateway Protocol (BGP) for more complex networking scenarios – for example, when multiple Kubernetes clusters are involved in the environment.
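As a sketch, a simple layer-2 MetalLB configuration looks like the following (this is the classic ConfigMap style; newer MetalLB releases configure the same thing through CRDs, and the address range here is purely illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250   # pool of IPs MetalLB may hand out to LoadBalancer services
```

With this in place, any service of `type: LoadBalancer` in the cluster receives an address from the pool.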
In these multi-cluster scenarios, BGP route advertisement can determine which cluster acts as primary for a specific service, with the other clusters serving as failover targets.
Ingress is Different
While ingress – in normal networking terminology – refers to any inbound traffic, in Kubernetes it strictly refers to the API that manages traffic routing rules, such as SSL termination. The ingress controller is the application deployed in the cluster to implement those rules.
Ingress isn’t a service type like NodePort, ClusterIP, or LoadBalancer. Ingress actually acts as a proxy to bring traffic into the cluster, then uses internal service routing to get the traffic where it is going. Under the hood, Ingress will use a NodePort or LoadBalancer service to expose itself to the world so it can act as that proxy.
Here is an example of how ingress works: A deployment would define a new service. It would then tell Ingress that new.app.example.com is the external DNS that maps to the service. The service wants to receive traffic on TCP port 8181. The Ingress controller then sets up those rules so when it receives a request asking for new.app.example.com:8181, it knows where to send the payload and URI information for processing.
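The example above can be sketched as an Ingress resource (the resource and service names are illustrative; only the hostname and port come from the example):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: new-app-ingress
spec:
  rules:
  - host: new.app.example.com      # external DNS name that maps to the service
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: new-app-service  # internal service receiving the traffic
            port:
              number: 8181         # port the service wants traffic on
```

The ingress controller watches for resources like this and reconfigures its proxy rules accordingly.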
The actual rules can get much more complicated. Out-of-the-box, they typically stick to host- and path-based routing like the example above, although finer-grained layer-7 rules involving cookies and specific query parameters on the URI are becoming more prevalent – especially when a service mesh is involved. Service meshes, like Istio, allow very fine-grained control of how traffic is sent to one or more versions of a service – including blue/green, A/B, canary, or even payload-based routing.
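As an illustration of that fine-grained control, a canary rollout in Istio can be sketched with a VirtualService like the following (names are illustrative, and the `v1`/`v2` subsets would be defined in a companion DestinationRule):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: mychatapp
spec:
  hosts:
  - mychatapp-service
  http:
  - route:
    - destination:
        host: mychatapp-service
        subset: v1   # stable version keeps most of the traffic
      weight: 90
    - destination:
        host: mychatapp-service
        subset: v2   # canary version receives a small slice
      weight: 10
```

Shifting the weights over time gradually moves traffic from the stable version to the canary.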
As an additional benefit, service meshes can even route services between Kubernetes clusters without using Ingress or any of the other methods discussed here.
Commonly used ingress controllers are NGINX, Contour, and HAProxy. A more comprehensive list is available in Kubernetes’ documentation on Ingress Controllers.
Now, Which To Use?
Now that we have reviewed the three ways to bring traffic into a cluster – NodePort, LoadBalancer, and Ingress – it helps to have a cheat sheet of sorts to quickly compare the key points that make the decision a little easier.
| | NodePort | LoadBalancer | Ingress |
|---|---|---|---|
| Supported by core Kubernetes | Yes | Yes | Yes |
| Works on every platform Kubernetes will deploy to | Yes | Only a few public clouds out-of-the-box; the MetalLB project allows use on-premises | Yes |
| Direct access to service | Yes | Yes | No |
| Proxies each service through a third party (NGINX, HAProxy, etc.) | No | No | Yes |
| Multiple ports per service | No | Yes | Yes |
| Multiple services per IP | Yes | No | Yes |
| Allows use of standard service ports (80, 443, etc.) | No | Yes | Yes |
| Have to track individual node IPs | Yes | No | Yes when using NodePort; No when using LoadBalancer |
At the end of the day, it comes down to a couple of decisions.
NodePort wins on simplicity, but you need to open firewall rules to allow access to ports 30,000 to 32,767, and you need to know the IPs of the individual worker nodes.
LoadBalancer, when on a public cloud or backed by MetalLB, works great, and the service can control the exact port it wants to use. The downside is cost: every service gets its own load balancer and external IP, which adds up quickly on a public cloud.
Ingress is becoming the most commonly used method, combined with the LoadBalancer service – especially now that MetalLB is available – as it minimizes the number of external IPs in use while still allowing every service to have its own name and/or URI routing.
Platform9 allows organizations to use any and all of these services across public clouds – and, thanks to MetalLB, across on-premises virtual machine and bare metal deployments. Whether Platform9 is managing Kubernetes for an organization or providing expert-level support for its certified Kubernetes distribution, you can be assured that your applications will work seamlessly in any scenario, from single-node clusters in development to multi-cluster, multi-datacenter scenarios in production.