Kubernetes deployments can scale to support services and applications that run in a broad array of environments, from multi-cloud deployments to resource-constrained edge computing environments. But as businesses look to save money by optimizing infrastructure resources, deploying thousands of microservices across geographically distributed servers can become increasingly complicated.
If your company will have (or already has) a large-scale Kubernetes deployment, here are four crucial things to consider.
1. Make Sure It Scales
Kubernetes can use autoscaling to adjust the number of nodes in a cluster. As the demand for computing resources changes, the autoscaler increases or decreases the number of nodes. The Kubernetes Cluster Autoscaler, for example, adds nodes when pods cannot be scheduled because the existing nodes lack capacity. Similarly, nodes that stay underutilized for some period of time are removed from the cluster, provided their pods can run elsewhere. Adjusting the number of nodes in a cluster is referred to as horizontal scaling.
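The scale-up and scale-down decision can be sketched in a few lines. This is a toy model, not the algorithm any real autoscaler runs: it assumes scale-up is triggered by pods that cannot be scheduled (the trigger the Kubernetes Cluster Autoscaler uses) and scale-down by nodes that have stayed idle. The names `ClusterState` and `desired_node_count` and all parameters are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    node_count: int          # nodes currently in the cluster
    unschedulable_pods: int  # pods waiting because no node has room for them
    idle_node_count: int     # nodes that have been idle past the grace period


def desired_node_count(state: ClusterState, pods_per_node: int,
                       min_nodes: int, max_nodes: int) -> int:
    """Return the node count a simple horizontal autoscaler would aim for."""
    if state.unschedulable_pods > 0:
        # Scale up: add enough nodes to hold the waiting pods (ceiling division).
        extra_nodes = -(-state.unschedulable_pods // pods_per_node)
        target = state.node_count + extra_nodes
    else:
        # Scale down: release nodes that have been idle.
        target = state.node_count - state.idle_node_count
    # Never leave the configured bounds.
    return max(min_nodes, min(max_nodes, target))
```

For instance, a five-node cluster with seven waiting pods and room for three pods per node would grow to eight nodes under this model, as long as the upper bound allows it.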
Another way to scale is to use servers with more resources. For example, instead of deploying nodes with 16 CPUs and 96 GB of memory, you could use nodes with 64 CPUs and 400 GB of memory. This is called vertical scaling.
Scaling is an important consideration because it directly impacts the availability of services. A resource-constrained cluster doesn’t have the capacity to process additional workloads. Over-provisioning is an option, but it’s a costly one. A better approach is to ensure you’ve instrumented the cluster so you can collect metrics about its state and automatically respond to changing workloads.
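As one illustration of responding to metrics automatically, the Kubernetes Horizontal Pod Autoscaler (which scales pod replicas rather than nodes) documents a proportional rule: the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up, with no change when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below restates that rule in Python. The function name and the floor of one replica are assumptions made for the example.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Proportional scaling rule: ceil(current_replicas * current / target)."""
    ratio = current_metric / target_metric
    # Skip scaling when the metric is already close enough to its target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Assumption for this sketch: never scale below one replica.
    return max(1, math.ceil(current_replicas * ratio))
```

With four replicas averaging 90% CPU against a 60% target, the rule asks for six replicas. At 63% it changes nothing, because the ratio falls inside the tolerance.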
2. Secure It Properly
Security is always a consideration when deploying services. As an administrator of Kubernetes clusters, you’ll need to attend to multiple security mechanisms, including access controls, encryption, and secrets management.
Access controls depend on identity management. There must be a way to represent users and service accounts within the cluster. To streamline identity management, users should be assigned to roles or groups that have permissions assigned to them.
You should also consider how you’ll enforce the principle of least privilege—granting only the permissions a user needs to perform their job and no more. In addition to these authorization considerations, you’ll also need to deploy authentication methods that support the way users employ the cluster.
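The role-based model described above can be sketched as a small data structure: roles hold permissions, subjects are bound to roles, and a request is allowed only if one of the subject's roles grants it. The role names, subjects, and the `is_allowed` helper are hypothetical and only loosely mirror how Kubernetes RBAC behaves, where permissions are likewise additive and anything not granted is denied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Permission:
    verb: str      # e.g. "get", "list", "update"
    resource: str  # e.g. "pods", "deployments"


# Hypothetical roles, each granting only what the job requires.
ROLES = {
    "viewer": {Permission("get", "pods"), Permission("list", "pods")},
    "deployer": {Permission("get", "deployments"),
                 Permission("update", "deployments")},
}

# Hypothetical bindings of users and service accounts to roles.
BINDINGS = {
    "alice": {"viewer"},
    "ci-bot": {"viewer", "deployer"},
}


def is_allowed(subject: str, verb: str, resource: str) -> bool:
    """Allow a request only if one of the subject's roles grants it."""
    wanted = Permission(verb, resource)
    return any(wanted in ROLES.get(role, set())
               for role in BINDINGS.get(subject, set()))
```

Because nothing is permitted by default, least privilege comes down to keeping each role's permission set as small as the job allows: `alice` can read pods but cannot update deployments, and an unknown subject can do nothing.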
Also, plan for how and when you’ll use encryption. Sensitive and confidential information should be encrypted at rest, as well as in transit.
You should plan to provide a mechanism for storing secrets, such as database passwords and API keys. Developers may be used to storing secrets in configuration files and setting environment variables with those secret values, but a centralized repository for managing secrets is more secure.
3. Keep It Available
The formal definition of availability is the percentage of time a system is ready for use. This way of thinking about availability is useful when working with service-level agreements (SLAs). It’s also an appropriate way to think of availability from a user’s perspective—a system is available if they can use it. A developer’s perspective is slightly different.
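The percentage definition translates directly into arithmetic, which helps when reading an SLA. The sketch below computes availability from observed uptime and converts an SLA target into a downtime budget. The function names and the 30-day period are assumptions chosen for illustration.

```python
def availability_percent(uptime_minutes: float, total_minutes: float) -> float:
    """Availability as the percentage of time the system was ready for use."""
    return 100.0 * uptime_minutes / total_minutes


def allowed_downtime_minutes(sla_percent: float, period_days: int = 30) -> float:
    """Downtime budget implied by an SLA target over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0
```

A 99.9% target over 30 days leaves a budget of about 43.2 minutes of downtime, while 99% leaves about 7.2 hours.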
Developers have a more expansive view of availability. It includes ensuring a production environment is functioning and able to meet the workload on the system at any time. Developers also depend on development and test environments being available to do their work. To ensure developers have the necessary environments available to them, it’s important to create repeatable processes for deploying clusters and services.
The repeatable processes for development environments may differ from those used in production environments. Site reliability engineers (SREs), for example, may apply a specific set of design principles to production environments, including different levels of health checking, monitoring, and alerting. SLAs will likely be different as well.
Developers will also likely have different needs than SREs. For example, developers shouldn’t have administrative access to a production cluster, but they should have administrative privileges on a cluster in their development environment, rather than depend on others to configure and maintain it.
4. Keep It Humming
When planning for Kubernetes at scale, consider how you’ll maintain appropriate levels of performance. Specifically, is your system able to meet compute, storage, and network needs at any point in time? Think about performance at both an application and a cluster level.
At the application level, deployments need to perform well. A deployment consists of multiple pods, so the deployment performs well only if its pods do. With a sufficient number of pods, however, the deployment can continue to meet workload demands even if a small number of pods are not functioning as expected.
At the cluster level, you should consider how to maintain the overall performance of a cluster. This is largely a function of how well the nodes perform, but other cluster-level properties, such as how fast a cluster can autoscale, can also affect the overall performance of the system.
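The idea of running enough pods to absorb a few failures can be made concrete with a rough sizing calculation. This is a back-of-the-envelope sketch: the function name, the request-rate figures, and the assumption that every pod handles the same load are all illustrative.

```python
import math


def replicas_needed(peak_rps: float, rps_per_pod: float,
                    tolerated_failures: int = 1) -> int:
    """Pods required to serve peak load while some pods are unhealthy."""
    if rps_per_pod <= 0:
        raise ValueError("rps_per_pod must be positive")
    # Pods needed to carry the load, plus headroom for failed pods.
    return math.ceil(peak_rps / rps_per_pod) + tolerated_failures
```

A service peaking at 950 requests per second, with pods that each handle 100, needs ten pods for the load and two more to tolerate two simultaneous failures.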
The geographic location of the cluster nodes that Kubernetes manages is closely related to the latency that clients experience. For example, pods hosted on nodes located in Europe will typically offer faster DNS resolution times and lower latencies to customers in that region.
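Distance puts a hard floor under latency, which a short calculation makes visible. The sketch below assumes signals travel through optical fiber at roughly 200 km per millisecond (about two-thirds of the speed of light in a vacuum) and ignores routing, queuing, and processing delays, so real round trips will be slower. The region names and distances are hypothetical.

```python
FIBER_SPEED_KM_PER_MS = 200.0  # approximate signal speed in optical fiber


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time imposed by propagation delay alone."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_MS


def nearest_region(distances_km: dict[str, float]) -> str:
    """Pick the region with the shortest distance to the client."""
    return min(distances_km, key=distances_km.get)
```

A client 6,000 km from the nearest node cannot see a round trip faster than about 60 ms, while a client 500 km away has a floor of about 5 ms, which is why placing nodes close to customers matters.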
Make sure these best practices are high on your list, and your Kubernetes experience will be rewarding instead of frustrating.