Java app performance over the decades

A graphic representing Java on Kubernetes - a technical landscape of devices, servers, and a cloud environment

In 2008, I had my first encounter with performance monitoring. Back then it was called Systems Management, and I was using BMC Patrol! By 2010 we had entered the era of Application Performance Management. I was consulting and using HP LoadRunner, HP Diagnostics, HP BPM (you get the idea) and CA Wily. This was a fantastic period of my life.

While employed at JDS Australia, I had the opportunity to work with performance testing teams and production application support teams. We conducted load testing and set up production monitoring for applications across various industries, including telecommunications, mining, retail, banking, insurance, and healthcare. It was an exhilarating experience!

Fast forward to 2020, when the hot new term “observability”, along with SLOs and SLAs, became all the rage. Now, in 2024, SRE teams have their hands full. To be honest, not much has changed.

This backstory is crucial because there has been one consistent challenge that every generation of support and engineering teams has had to face: Java.

My most memorable experience with Java’s challenges was a transformation project in 2008/09. We virtualized client-server Java applications using Microsoft App-V, ran them on VMware VMs, and served them to end users via Citrix.

The project was a success. A single Windows OS was running multiple versions of Java and multiple instances of the applications, and the Citrix farm shrank from a few hundred physical nodes to roughly 70 VMs! (Side note: building the monitoring stack for that using BMC Patrol was no simple task. Luckily, the business users were quick to pick up the phone when things went sideways.)

2006 – 2010: The P2V Era

When virtualization was introduced, Java was a primary target. Why? Because of the way memory was requested and subsequently managed. Running on a physical server, the JVM would request a maximum amount of RAM at startup, and over the following hours your observability tool of choice would record actual usage fluctuating well below that maximum. Noticing these drastic fluctuations, architects moved these applications from physical to virtual (P2V) so that the “unused” RAM could be used by another VM. Perfect.

In the physical world, Java performance could be diagnosed in isolation, as nothing around it was shared except the database and the network.

-Xmx: This setting specifies the maximum heap size. For example, -Xmx1024m caps the JVM heap at 1 GB.
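To see that ceiling from inside a running application, a minimal sketch like the one below can help. The class name HeapCeiling and the helper toMiB are my own placeholders, not part of any tool mentioned here. It uses only the standard Runtime API, and the reported maximum can come out slightly below the -Xmx value depending on the garbage collector in use.

```java
public class HeapCeiling {
    /** Converts a byte count to whole mebibytes (hypothetical helper). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();               // ceiling the heap may grow to (-Xmx or a JVM default)
        long committed = rt.totalMemory();       // heap currently reserved by the JVM
        long used = committed - rt.freeMemory(); // part of the committed heap holding objects
        System.out.printf("max=%d MiB committed=%d MiB used=%d MiB%n",
                toMiB(max), toMiB(committed), toMiB(used));
    }
}
```

Running it as java -Xmx1024m HeapCeiling and comparing "max" with "used" makes the gap between what was requested and what is actually needed visible.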

2010 – 2019: The VM Era

The early movement of Java applications to virtual environments coincided with the emergence of multiple profiling and monitoring tools. As the enterprise grew, so did the demand for a deep, real-time understanding of Java performance. One key element of this period was dynamic bytecode instrumentation via the -javaagent flag. It is the gold standard that still reigns today.

java -javaagent:path/to/opentelemetry-javaagent.jar -jar myapp.jar
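For readers who have never looked behind that flag, here is a minimal sketch of what an agent looks like, using only the standard java.lang.instrument API. It is not how OpenTelemetry or any commercial agent is implemented. CountingAgent is a hypothetical name, and the agent only counts the classes passed to it without rewriting any bytecode. To load it with -javaagent, the class has to be packaged in a jar whose manifest contains the line Premain-Class: CountingAgent.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicLong;

public class CountingAgent {
    /** Number of classes handed to the transformer so far. */
    static final AtomicLong LOADED = new AtomicLong();

    /** Transformer that observes class loading but leaves the bytecode alone. */
    static final class Counter implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            LOADED.incrementAndGet();
            return null; // null tells the JVM to keep the original bytecode
        }
    }

    static final Counter COUNTER = new Counter();

    /** Called by the JVM before the application's main method when loaded via -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(COUNTER);
        Runtime.getRuntime().addShutdownHook(new Thread(
                () -> System.err.println("classes seen by agent: " + LOADED.get())));
    }
}
```

A real monitoring agent returns modified bytecode from transform to inject timing and tracing calls. That is the "dynamic" part, and it is why no application code change is needed.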

The advent of DevOps and the ability to investigate Java application performance in real-time shifted the focus from RAM allocation to the application’s code. If you were savvy, you also considered how the Java application interacted with the underlying databases.

Little credence was given to those of us who delved deep into the full stack, looking into VM provisioning practices, contention, and storage performance. While critical and often the root cause of issues, this approach was not as exciting as diving into API latency in real-time.

2015 – 2024: The Cloud Instance Era

The V2V2C period, the loved and dreaded virtual-to-cloud migration (I added a V because the VM was technically re-virtualized in the process), didn’t change much. Most companies decided that outsourcing their entire datacenter and moving to the cloud would save money and improve performance thanks to elastic access to resources.

With infrastructure performance considered solved, all focus centered on the JVM. That meant bytecode instrumentation if you could afford it; if not, long live the log file, which Splunk had made cool again.

Unnoticed, or ignored due to low interest rates and growing economies, was the drastic increase in waste caused by the JVM.

The move to VMs enabled over-provisioning. However, when companies migrated the same assets to the cloud without any changes (a common practice to this day), it created a boon for the public cloud industry. The savings from over-provisioning were lost, but the benefits of elastic horizontal scaling for cloud instances were gained. Yet each new instance brought the same amount of waste, as the JVM could not be dynamically resized.

2020 – 2024: The Java Microservice Era

Today, the solution seems obvious: simply convert VMs to containers (V2C), and bam, the waste is gone.

Unfortunately, it’s not that simple.

In 2024, there are three ways a JVM might find itself running on Kubernetes:

In the cloud, inside a container that runs on an instance.
On-prem, inside a container that runs on a virtual machine.
On-prem, inside a container on bare metal.

At this point I’ll state my argument:

JVMs have always been given headroom, captured in the -Xmx flag. Even the best companies that run rigorous performance load tests add headroom. In 2009, this headroom resulted in unused capacity on a physical server, causing more power use, wasted data center space and excess carbon emissions.

In 2015, on a VM, headroom was likely absorbed by overprovisioning or, for some very memory-sensitive applications, left in place to ensure optimal performance. By 2019, firmly in the cloud, headroom became financial waste.

In 2021, in a container, headroom resurfaced:

In the cloud, it resulted in waste.
On a VM in the data center, it was likely captured in overprovisioning.
On a physical server, headroom led to unused capacity, power usage, wasted data center space, and excess carbon emissions.
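The arithmetic behind that headroom is simple enough to sketch. The numbers below are purely illustrative assumptions (a 4096 MiB -Xmx, a 2560 MiB observed peak, ten replicas), and HeadroomWaste is a hypothetical name. The sketch also considers the heap only, while a real JVM uses additional memory for metaspace, thread stacks and so on.

```java
public class HeadroomWaste {
    /**
     * Heap headroom that is reserved but never used, summed over all replicas.
     * All values are in MiB; peakUsedMiB would come from your monitoring data.
     */
    static long wastedMiB(long xmxMiB, long peakUsedMiB, int replicas) {
        long perJvm = Math.max(0L, xmxMiB - peakUsedMiB); // never negative
        return perJvm * replicas;
    }

    public static void main(String[] args) {
        // Illustrative figures only, not measurements from a real system.
        long waste = wastedMiB(4096, 2560, 10);
        System.out.println("unused heap headroom: " + waste + " MiB"); // prints 15360 MiB
    }
}
```

In this example, every replica added by horizontal scaling brings another 1536 MiB that is paid for but never touched.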

It’s now performance and cost

Solving JVM performance has not changed in 20 years. The solution has always been self-evident: test, apply, refine, and observe, as outlined in ITIL v2. However, with each new technology and the persistent belief that “infrastructure” is the solution, companies worldwide have deferred addressing it. Admittedly, a few may have chosen not to modernize and accepted the costs, but they are in the minority.

By ignoring the issue and focusing on advancing infrastructure, companies running Java in the cloud on Kubernetes now face performance and cost challenges. These challenges are intrinsically linked and not easily solved without significant investment in processes, tooling, and transformation.

So what can be done?

One approach is to pair the Horizontal Pod Autoscaler (HPA) with the new (still alpha) In-Place Update of Pod Resources feature, which resizes a pod’s CPU and memory without recreating it.
Another is to implement a feedback loop and constantly update Requests and Limits with a bin-packing algorithm to ensure minimal overall waste in the OS.
Use Spot Instances or a tool like Spot Ocean to automate a few infrastructure strategies.
Or, leverage virtualization inside your public cloud by combining AWS bare metal with KVM and bringing those nodes into your EKS cluster.
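To make the feedback-loop idea a little more concrete, here is a toy sketch of its two halves: deriving a memory request from an observed peak plus a margin, and packing those requests onto nodes with a first-fit decreasing heuristic. This is an illustration under my own assumptions, not how the Kubernetes scheduler or any commercial tool works. RequestPacker, suggestRequestMiB and pack are hypothetical names, and the figures are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class RequestPacker {
    /** Observed peak plus a safety margin in percent, rounded up (integer math). */
    static long suggestRequestMiB(long observedPeakMiB, int marginPercent) {
        return (observedPeakMiB * (100 + marginPercent) + 99) / 100;
    }

    /** First-fit decreasing: returns the requests placed on each node. */
    static List<List<Long>> pack(List<Long> requestsMiB, long nodeCapacityMiB) {
        List<Long> sorted = new ArrayList<>(requestsMiB);
        sorted.sort(Comparator.reverseOrder());      // largest requests first
        List<List<Long>> nodes = new ArrayList<>();  // requests per node
        List<Long> free = new ArrayList<>();         // remaining capacity per node
        for (long req : sorted) {
            if (req > nodeCapacityMiB) {
                throw new IllegalArgumentException("request exceeds node capacity: " + req);
            }
            int target = -1;
            for (int i = 0; i < free.size(); i++) {  // first node with enough room
                if (free.get(i) >= req) { target = i; break; }
            }
            if (target < 0) {                        // nothing fits, open a new node
                nodes.add(new ArrayList<>());
                free.add(nodeCapacityMiB);
                target = nodes.size() - 1;
            }
            nodes.get(target).add(req);
            free.set(target, free.get(target) - req);
        }
        return nodes;
    }

    public static void main(String[] args) {
        List<Long> requests = Arrays.asList(3000L, 1000L, 2000L, 2500L, 1500L);
        System.out.println(pack(requests, 4096)); // [[3000, 1000], [2500, 1500], [2000]]
    }
}
```

The loop part is rerunning this as usage data changes. The hard part in practice is that a JVM sized by a fixed -Xmx does not shrink just because its request did.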

Author

Chris Jones