Tackling Kubernetes Underutilization: Cutting EKS Costs by 50%

How Platform9’s Elastic Machine Pool (EMP) Works: A Technical Summary

Kubernetes is one of the most inefficient consumers of infrastructure, especially in public clouds. Average utilization on AWS Elastic Kubernetes Service (EKS) is 30% or less, and this idle capacity is the primary source of compute waste. It, in turn, drives up cloud infrastructure costs by as much as 50%. Platform9 Elastic Machine Pool (EMP) addresses Kubernetes underutilization by re-architecting how cloud infrastructure is consumed at the compute layer.

EMP increases resource utilization by up to 70% by combining dedicated AWS bare metal servers, advanced virtualization techniques, and intelligent real-time optimization. This innovative approach to workload consolidation enables Kubernetes users to realize game-changing cost savings and efficiency gains.

In a recent Platform9 cloud native livestream event, Kamesh Pemmaraju and Shamsher Ansari discussed in depth how EMP works under the hood. They specifically explored why EMP is a revolutionary approach to the EKS underutilization problem, one that none of the existing tools on the market solve. Watch the full video below.

Here are the main takeaways from this livestream:

The role of Bare Metal in EMP

EMP uses dedicated AWS bare metal servers as the foundation for a highly efficient virtualization layer, creating “Elastic VMs” (EVMs) that look and feel exactly like regular EC2 VMs. AWS bare metal gives EMP complete control, without the noisy neighbors and resource constraints of multi-tenant virtualization. Crucially, EMP improves resource utilization at the bare metal layer without requiring any changes to your applications.

EMP is 100% compatible with EKS and AWS infrastructure. You can continue to use AWS services like EBS, EFS, and VPC. No changes are needed to existing operational tooling, automation, or upgrade processes. Furthermore, EMP’s EVMs can also operate in parallel to EC2 instances.

A diagram showing the difference between EKS with and without EMP. EMP deploys an alternate virtualization layer, using AWS bare metal underneath, creating “Elastic VMs” (EVMs) that look and feel exactly like regular EC2 VMs.

How EMP works: Resource overprovisioning

Overprovisioning means allocating more virtualized compute resources than are physically present on the underlying bare metal host. EMP safely overprovisions vCPUs and memory on EVMs (by as much as 2x) beyond the actual capacity of the underlying bare metal servers. This packs workloads more densely onto fewer physical hosts.
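To make the arithmetic concrete, here is a minimal sketch of how an overcommit ratio changes the number of EVMs a single host can carry. The host and EVM sizes, the `Shape` type, and the helper names are illustrative assumptions, not EMP's actual data model or API.

```python
from dataclasses import dataclass


@dataclass
class Shape:
    """A CPU/memory size, used here for both hosts and EVMs (hypothetical type)."""
    vcpus: int
    memory_gib: int


def allocatable(host: Shape, overcommit: float) -> Shape:
    """Virtual capacity that can be handed out at a given overcommit ratio."""
    return Shape(int(host.vcpus * overcommit), int(host.memory_gib * overcommit))


def evms_per_host(host: Shape, evm: Shape, overcommit: float) -> int:
    """How many EVMs of one size fit, limited by the scarcer resource."""
    cap = allocatable(host, overcommit)
    return min(cap.vcpus // evm.vcpus, cap.memory_gib // evm.memory_gib)


# Illustrative sizes only: a 96 vCPU / 384 GiB host and 4 vCPU / 16 GiB EVMs.
host = Shape(vcpus=96, memory_gib=384)
evm = Shape(vcpus=4, memory_gib=16)

print(evms_per_host(host, evm, overcommit=1.0))  # 24 EVMs with no overcommit
print(evms_per_host(host, evm, overcommit=2.0))  # 48 EVMs at 2x overcommit
```

The 2x ratio is safe only as long as the EVMs do not all hit their peak at once, which is the assumption the rest of this section relies on.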

Diagram showing how EMP works to provide higher utilized bare metal capacity in aggregate

Overprovisioning solves two problems:

The bin-packing problem

Unallocated space is created when a developer “requests” pod resources that do not match the available EC2 instance sizes. This is known as the bin-packing problem. Because EVMs are “elastic,” they match the resource requirement without requiring the developer to change their requests. Keep in mind that there is no need to “right-size” VM instances using this technique.
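A small sketch of the mismatch described above, assuming a fixed-size node and identical pods. The node and pod sizes are made up for illustration.

```python
from typing import Tuple


def stranded_capacity(node_cpu: float, pod_cpu: float) -> Tuple[int, float]:
    """Return (pods that fit, vCPUs left unallocated) for one fixed-size node."""
    pods = int(node_cpu // pod_cpu)
    return pods, node_cpu - pods * pod_cpu


# An 8 vCPU node and pods that each request 3 vCPUs (illustrative numbers).
pods, leftover = stranded_capacity(node_cpu=8.0, pod_cpu=3.0)
print(pods, leftover)       # 2 pods fit, 2.0 vCPUs cannot be allocated
print(leftover / 8.0)       # 0.25, a quarter of the node is stranded
```

With a fixed instance size, the leftover 2 vCPUs are paid for but cannot host another pod of this shape. An elastic VM sized to the request avoids that remainder.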

Allocated but unused capacity

To maintain app SLAs, developers frequently set high resource requests and “limits.” This results in unused resources because average usage is typically lower than peak usage. Kubernetes does not automatically “free” capacity that a pod has reserved but is not using, resulting in waste. Even though individual EVMs may be underutilized, EMP maximizes bare metal capacity at the aggregate level, thereby reducing waste.
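The gap between reserved and used capacity can be illustrated with a toy example. The usage figures and EVM names below are invented. The point is only that the combined peak of several workloads is usually lower than the sum of their individual limits.

```python
# Hourly CPU usage (in vCPUs) for three hypothetical EVMs whose peaks
# fall in different hours. All numbers are illustrative.
usage = {
    "evm-a": [1, 1, 4, 1],
    "evm-b": [4, 1, 1, 1],
    "evm-c": [1, 4, 1, 1],
}
limit_per_evm = 4  # each EVM is sized for its own peak

# Capacity needed if every EVM gets its full limit reserved on the host.
sum_of_limits = limit_per_evm * len(usage)

# Capacity actually needed at the busiest hour across all EVMs combined.
aggregate_by_hour = [sum(hour) for hour in zip(*usage.values())]
peak_aggregate = max(aggregate_by_hour)

print(sum_of_limits)       # 12
print(aggregate_by_hour)   # [6, 6, 6, 3]
print(peak_aggregate)      # 6
```

In this example, sizing for the sum of limits reserves 12 vCPUs, while the host never serves more than 6 at once. That difference is the headroom an aggregate-level approach can reclaim.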

How EMP works: Real-time resource rebalancing

To maintain high utilization levels, EMP leverages live migration to automatically rebalance workloads across the bare metal cluster.

A diagram showing how EMP rebalances resources in real-time

An advanced algorithm analyzes real-time performance data and fluidly redistributes elastic VMs based on current demand. When the combined resource usage of the EVMs on a bare metal node grows beyond what that node can serve, EMP will “live migrate” some EVMs from that bare metal server to another. This decision uses both real-time metrics and historical usage trends. Your pods stay alive without churn, and the app SLA is not compromised.

Metrics like true vCPU and memory consumption guide optimal placement decisions. This real-time optimization based on precise utilization metrics eliminates the static allocation issues that plague Kubernetes. The cluster transforms into an efficient, dynamic, self-optimizing pool of resources that maximizes utilization and reduces costs in real time.
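The post does not describe EMP's placement algorithm in detail, so the following is only a simple greedy heuristic in the same spirit: when a host's measured usage exceeds a threshold, move its smallest EVMs to the least-loaded host. The function name, the 80% threshold, and the data layout are assumptions for illustration, and historical trends are not modeled.

```python
def plan_migrations(hosts, capacity, threshold=0.8):
    """Plan live migrations as a list of (evm, source_host, destination_host).

    hosts: dict mapping host name -> dict of EVM name -> measured vCPU usage.
    capacity: physical vCPUs per host (all hosts assumed identical).
    threshold: fraction of capacity a host may use before it is rebalanced.
    """
    moves = []
    load = {h: sum(evms.values()) for h, evms in hosts.items()}
    budget = threshold * capacity

    # Visit the most heavily loaded hosts first.
    for src in sorted(hosts, key=load.get, reverse=True):
        # Try the smallest EVMs first to keep each migration cheap.
        for evm, used in sorted(hosts[src].items(), key=lambda kv: kv[1]):
            if load[src] <= budget:
                break  # this host is back under the threshold
            dst = min(load, key=load.get)  # least-loaded host right now
            if dst == src or load[dst] + used > budget:
                continue  # nowhere suitable to place this EVM
            moves.append((evm, src, dst))
            load[src] -= used
            load[dst] += used
    return moves


# Two hypothetical 10 vCPU hosts. bm-1 is fully used, bm-2 is mostly idle.
cluster = {
    "bm-1": {"evm-a": 5, "evm-b": 3, "evm-c": 2},
    "bm-2": {"evm-d": 2},
}
print(plan_migrations(cluster, capacity=10))  # [('evm-c', 'bm-1', 'bm-2')]
```

A production rebalancer would also need to weigh memory, migration cost, and usage forecasts. The sketch only shows the basic shape of a usage-driven placement decision.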

Storage and Network considerations during live migration

Migrating running workloads between hosts imposes additional requirements to prevent disruption:

Advanced Storage Migration

EMP leverages advanced AWS storage, such as Provisioned IOPS EBS volumes (for example, io2), to coordinate data synchronization during live VM migrations. This keeps the exact dataset available to applications, even while a VM moves hosts. Ongoing disk writes are mirrored to ensure no data loss.

IP and Network Continuity

Live VM migration also updates IP addressing and networking behind the scenes to keep applications seamlessly connected. As the VM activates on the destination host, networking reconnects while the outgoing host disengages. This transition retains IP continuity to avoid any downtime.

The entire migration process orchestrates storage, network and compute movements in a precisely choreographed sequence. This atomic transition shifts VMs between hosts while maintaining storage, network and IP integrity as well as application runtime continuity.

Limitations of Tools like Karpenter

Karpenter, an open-source project from AWS, focuses on automating the provisioning and scaling of worker nodes based on resource demands. It leverages AWS services and aims to optimize node group management in Kubernetes. Karpenter employs bin-packing algorithms to efficiently allocate resources, dynamically adjusting the number of nodes to match the workload. However, limitations arise in its static nature of resource allocations, potentially leading to underutilization or over-provisioning in dynamic cluster environments. Despite its automation capabilities, Karpenter might face challenges in adapting to the constantly changing nature of workloads, and adjustments often require manual intervention.

Pod Disruptions – when and how they happen

Making resource changes can lead to pod disruptions. When developers, DevOps teams or administrators manually adjust resource configurations, such as CPU or memory limits and requests, it triggers a process of updating the running pods or containers. This process involves rescheduling or restarting the affected pods to apply the new resource specifications.

For instance, if the resource limits are lowered to optimize resource utilization, Kubernetes may need to move the pods to nodes with available resources, causing temporary downtime during the migration process. Similarly, if resource limits are increased to meet higher demands, it may result in restarting pods on nodes with sufficient capacity. These operations introduce a level of disruption, potentially leading to application downtime or performance issues.

While Karpenter addresses the bin-packing issue, it does not solve the underutilization issue, and it still involves manual processes and the risk of app disruption.

The EMP difference

Platform9’s Elastic Machine Pool (EMP) stands out as a revolutionary solution, fundamentally distinct from other tools in its approach to optimizing Kubernetes clusters.

In summary, the unique differentiation capabilities of EMP include:

Innovative Approach

Platform9’s Elastic Machine Pool (EMP) distinguishes itself with a novel approach to Kubernetes optimization, introducing an alternate virtualization layer and real-time resource rebalancing, which are not currently available in public cloud infrastructure.

Maximized Utilization

EMP’s utilization-maximizing capabilities are unparalleled, addressing the shortcomings of traditional tools such as Kubernetes Descheduler and Karpenter. By deploying Elastic VMs (EVMs) on AWS Bare Metal, EMP eliminates the bin-packing problem and dynamically adjusts resource allocations, ensuring optimal utilization.

No more app disruptions

Unlike other tools that may cause application disruption during resource adjustments, EMP’s advanced features, such as live migration of EVMs, enable seamless, uninterrupted operation while maintaining application SLAs.

Compatibility and co-existence

EMP is a versatile and comprehensive solution due to its compatibility with existing AWS infrastructure and ability to work in conjunction with other optimization tools. It optimizes not only at the compute layer but also collaborates with other tools to save an additional 35%.

This is the first and only solution that provides this combination. The technological advancements boost Kubernetes efficiency to new heights. The user experience, however, remains unchanged. This seamless transition to a significantly improved utilization and cost profile, with no application changes or outages, is truly revolutionary.

Dig Deeper

Are you ready to take action?

Let our team walk you through a live demo session, addressing your unique challenges and discussing how EMP can be tailored to your specific EKS needs. Don’t miss this chance to revolutionize your EKS cost optimization and achieve unmatched efficiency.

Book your live EMP demo & discussion now

Kamesh Pemmaraju
