Kubernetes FinOps: Java memory management in containerized environments


If you’ve spent much time working in the enterprise application space lately, either as a developer or an operator, you’ve probably run across Java applications in containerized environments like Kubernetes. If so, you’ve probably used terms like “wrestled with Java applications” to more accurately describe the experience! Java as a language and a platform is almost three decades old, still actively developed and has a huge installed base. But certain fundamental parts of Java, particularly memory management, just don’t play well inside containers.  What’s going on in there, and what can you do to at least put some ointment on the pain?

Background: Java memory management is different

How memory allocation usually works

In a typical system, processes allocate and deallocate memory on the fly, as they need more or less. The kernel handles not-yet-allocated or released memory, either assigning it to other processes or using it as cache for system-level activities like filesystem I/O. These processes with memory needs that change over time can run together even when their maximum memory requirements add up to more memory than the system has. This approach works as long as they don’t all need their maximum memory at once.

How memory allocation works in Java

Java processes add a layer of abstraction to ordinary operating system memory handling. The Java virtual machine (JVM), which is part of the Java runtime, manages memory itself, handing it out to and reclaiming it from the Java code running inside it. Typically the runtime sizes its heap as a percentage of the memory on the system (tunable with Java runtime command-line arguments such as -Xms and -Xmx) and claims part of it up front at startup. It then requests more memory from the OS (if available) as the running Java program needs it, up to the configured maximum.
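To make that layering concrete, here is a minimal sketch that asks the JVM for its own view of the heap through the standard java.lang.Runtime API. The class name HeapView and the helper toMiB are just names chosen for this illustration, and the numbers you see will depend on your JVM version, flags, and machine.

```java
// Illustrative sketch: prints the JVM's own accounting of its heap.
class HeapView {
    /** Converts a byte count to whole mebibytes (hypothetical helper). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();          // ceiling the heap may grow to (for example from -Xmx)
        long committed = rt.totalMemory();  // heap the JVM currently holds
        long used = committed - rt.freeMemory(); // part of that heap occupied by objects

        System.out.println("max heap:       " + toMiB(max) + " MiB");
        System.out.println("committed heap: " + toMiB(committed) + " MiB");
        System.out.println("used heap:      " + toMiB(used) + " MiB");
    }
}
```

The gap between "committed" and "used" is memory the JVM is holding but the Java program is not currently using, which is exactly the part the OS cannot see into.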

There are more aspects to how all the Java runtime and JVM’s memory is structured, like heap vs. non-heap memory, but they’re beyond the scope of this blog post.

Garbage collection, and why Java workloads are difficult to memory-manage

The host OS sees the entire amount of memory the JVM manages as in use. However, inside the JVM, it may be almost completely unused – either not yet allocated or used by unreferenced objects.

There is a very recent proposal to assist the OS in knowing which parts of memory inside the JVM are free by zeroing them out after garbage collection. This will make things like VM snapshots and restores faster since zeroed memory doesn’t need to actually be read or written. However, the prerequisite garbage-collect would be an expensive operation in many situations. In any case, this is a long way from being merged.

Unused memory inside a JVM is typically not returned to the host OS except, in some cases (as discussed below), after garbage collection. The JVM does not even know what memory is used until it traces Java object references. Objects that can no longer be reached through any reference are “eligible” for garbage collection. Typically, the collector marks the objects that are still reachable, then frees up the memory consumed by everything else in a second pass.
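The trace-then-reclaim idea can be sketched as a toy mark-and-sweep over a hand-built object graph. This is a teaching model only, not how any real JVM collector is implemented: the Node class, the "heap" list, and the method names are all invented for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Toy model of tracing garbage collection; not a real JVM collector.
class ToyMarkSweep {
    /** A hypothetical heap object that may reference other objects. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Mark phase: follow references from the roots and record everything reachable. */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> live = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (live.add(current)) {          // first visit: queue its references too
                pending.addAll(current.refs);
            }
        }
        return live;
    }

    /** Sweep phase: remove every unmarked object from the heap and report its name. */
    static List<String> sweep(List<Node> heap, Set<Node> live) {
        List<String> reclaimed = new ArrayList<>();
        for (Iterator<Node> it = heap.iterator(); it.hasNext(); ) {
            Node candidate = it.next();
            if (!live.contains(candidate)) {
                reclaimed.add(candidate.name);
                it.remove();
            }
        }
        return reclaimed;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node child = new Node("child");
        Node orphan = new Node("orphan");
        root.refs.add(child);

        List<Node> heap = new ArrayList<>(List.of(root, child, orphan));
        List<String> reclaimed = sweep(heap, mark(List.of(root)));
        System.out.println("reclaimed: " + reclaimed);
    }
}
```

Note that nothing in the sweep step says what happens to the reclaimed space afterwards. In the toy it simply disappears from a list; in a real JVM it stays inside the heap unless the collector decides to hand it back.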

The default garbage collector: G1

Under the current default garbage collector (called “garbage-first” or G1), a full garbage collection is an expensive operation that halts the running application until it completes (pauses of hundreds of milliseconds are common). For this reason G1 rarely runs a full collection until it starts to hit the limit of allocatable memory. For example, this happens when the heap approaches the maximum size defined by the Java command-line arguments at runtime, or when the runtime tries to allocate more memory from the OS and the system call returns an error.
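If you want to see how often collection actually happens in your own application, the standard java.lang.management API exposes per-collector counters. The sketch below is one way to read them; the class name GcActivity and the describe helper are made up for this example, and the collector names printed depend on which collector your JVM is running.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Illustrative sketch: reports how often each collector has run so far.
class GcActivity {
    /** Formats one collector's counters; a count of -1 means the JVM does not report it. */
    static String describe(GarbageCollectorMXBean gc) {
        return gc.getName() + ": " + gc.getCollectionCount()
                + " collections, " + gc.getCollectionTime() + " ms total";
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(describe(gc));
        }
    }
}
```

Sampling these counters over time, rather than reading them once, gives a rough picture of how much of your application's life is spent in collection.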

Garbage collection can also be forced externally by a user or system process explicitly calling for it, but this is generally not advisable with G1, because of the effect on the running application.

Notably, even when it garbage-collects, G1 does not release unused memory back to the OS by default. From a practical standpoint this makes sense. Memory that was needed once will likely be needed again. Memory (re)allocation is an expensive operation that’s not guaranteed to succeed if other processes request that memory in the meantime. In addition, even if memory is available to return, it might be fragmented into small pieces, so returning it in usable form would involve defragmenting it first, which is another computationally expensive process.

Alternative garbage collectors

Although G1’s default behavior is not to return unused memory, it can be configured to do so after a full garbage collection. But it will only do that if some preconditions are met that will ensure minimal impact on application performance.

Alternative garbage collectors have been developed to return memory more aggressively or while preserving some dimension of Java application performance. Using an alternative collector can allow garbage collection to happen more frequently and/or with a lower impact on important parameters of a given application’s performance. Alternative garbage collectors include Shenandoah and ZGC.

G1 was itself an alternative garbage collector early in its history and only later became the default (in Java 9).

Note that, as alluded to above, garbage collection itself (regardless of the particular collector used) does not automatically release memory back to the OS. It only marks memory as unused internally to the JVM. How that unused memory is dealt with is up to the particular GC implementation. The user configures it through runtime arguments. The garbage collector may retain it for some period of time (potentially forever), or it may be quickly freed to the OS.
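Because so much of this behavior hangs on runtime arguments, it helps to confirm what a running JVM was actually started with. Here is a small sketch using the standard RuntimeMXBean; the class name GcFlagAudit and the prefix filter are choices made for this example, and flags set through other channels (such as environment variables the launcher reads) may or may not show up depending on your setup.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch: lists the memory- and GC-related arguments this JVM was started with.
class GcFlagAudit {
    /** Keeps only heap-size options and -XX: options from a list of JVM arguments. */
    static List<String> memoryFlags(List<String> jvmArgs) {
        return jvmArgs.stream()
                .filter(arg -> arg.startsWith("-Xmx")
                        || arg.startsWith("-Xms")
                        || arg.startsWith("-XX:"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> startupArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("memory-related JVM arguments: " + memoryFlags(startupArgs));
    }
}
```

An empty list means the JVM is running on defaults, which is worth knowing before you start tuning.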

Java memory in containerized environments

Although the challenges of Java memory management were present in bare-metal and virtualized environments, as container adoption grew and Java workloads started to be run in these more tightly-constrained settings, memory issues became much more prominent. But that greater visibility also drew attention and effort to some solutions.

A walk down container memory lane

In the early days of Docker and Kubernetes, first attempts at running Java apps in containers would often immediately get OOMKilled on startup because of Java’s memory-preallocation behavior. In a container with memory limits, even though the JVM was inside a container it still saw the host’s memory resources as “available” and by default, tried to preallocate based on that. This often caused the container to hit its memory limit during or just after the preallocation, resulting in an immediate OOMKill. To fix this, most people explicitly set the maximum heap memory in the container arguments – tedious and error-prone to say the least.

Not long after Kubernetes started to see mass adoption and this issue became a frequent pain for Kubernetes operators, Java gained container-awareness (starting with Java 10, and backported to later updates of Java 8). The JVM would see the container limit instead of the system memory as the total amount of available memory, making it behave better on startup. But even with container awareness, the additional abstraction Java imposes on memory management continues to frustrate Kubernetes operators.
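As a rough worked example of what container awareness changes, the sketch below estimates the default maximum heap from a memory limit. It assumes the heap ceiling is a simple percentage of the limit, with 25 percent as the default for -XX:MaxRAMPercentage; the JVM's real sizing logic has more cases (very small containers, for instance, are handled differently), so treat this as back-of-the-envelope arithmetic. The class and method names are invented for the example.

```java
// Back-of-the-envelope sketch; the real JVM heuristics have more cases than this.
class ContainerHeapEstimate {
    /** Estimates the heap ceiling as a plain percentage of the memory limit. */
    static long estimateMaxHeapBytes(long memoryLimitBytes, double maxRamPercentage) {
        if (memoryLimitBytes <= 0 || maxRamPercentage <= 0 || maxRamPercentage > 100) {
            throw new IllegalArgumentException("limit must be positive and percentage in (0, 100]");
        }
        return (long) (memoryLimitBytes * (maxRamPercentage / 100.0));
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long containerLimit = 2048L * mib;  // hypothetical 2 GiB container memory limit

        // Assumed default of 25 percent: 512 MiB of a 2 GiB limit.
        System.out.println("default estimate: "
                + estimateMaxHeapBytes(containerLimit, 25.0) / mib + " MiB");
        // A more generous setting some operators choose: 1536 MiB.
        System.out.println("at 75 percent:    "
                + estimateMaxHeapBytes(containerLimit, 75.0) / mib + " MiB");
        // What this JVM actually settled on, for comparison.
        System.out.println("this JVM reports: "
                + Runtime.getRuntime().maxMemory() / mib + " MiB");
    }
}
```

Whatever percentage you pick, remember the heap is not the JVM's only memory consumer, so leaving headroom under the container limit is still necessary.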

Java garbage collection and containerized environments

The fact that the used or free status of memory allocated by the JVM is opaque to everything outside it means that tools and techniques for optimizing memory allocation of Kubernetes applications, from basics like the Vertical Pod Autoscaler to advanced features in development like in-place pod resizing, either don’t work at all, or don’t work as well, with typical Java defaults.

Paths forward for running Java applications well in Kubernetes

Fortunately there are a few techniques for running Java apps in Kubernetes that will help maximize your return on optimization effort in nearly any environment, containerized or not.

Investigate alternative garbage-collection options

As noted previously, various alternative garbage collectors implement different collection processes and parameters to try to allow garbage collection to happen more often or more effectively without compromising critical dimensions of application performance.  While an overview of alternative collectors is beyond the scope of this article, one of them will likely give you better behavior than the default – even enabling the G1 collector’s memory-freeing behavior may be sufficient for your needs.

Tell the JVM in your containers how to manage memory better

Whether or not you use an alternative garbage-collector configuration, there are some knobs you can tweak to make the JVM manage memory more like you want it to:

  • For cases where the JVM is trying to avoid long application pauses for garbage collection, but your application can actually tolerate those pauses well, you can lower -XX:GCTimeRatio to let the JVM spend a larger share of its run time on garbage collection (for a value of N, the target is roughly 1/(1+N) of total time).
  • Where the JVM is holding on to memory you would rather have it release, lower -XX:MaxHeapFreeRatio to tell the JVM the maximum percentage of the heap it should leave free after a collection. When free space exceeds that percentage the heap is shrunk, and collectors that support it return the excess to the OS.
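The arithmetic behind those two knobs can be sketched in a few lines. This is a simplified model for building intuition, assuming the GC time goal is 1/(1+N) for a GCTimeRatio of N and that a heap over its free-space ceiling shrinks just far enough to get back under it; real collectors also respect minimum heap size, region granularity, and other constraints. All names here are invented for the example.

```java
// Simplified model of two tuning knobs; real collectors apply additional constraints.
class HeapTuningMath {
    /** Share of total run time the JVM aims to spend in GC for a given GCTimeRatio. */
    static double gcTimeGoalFraction(int gcTimeRatio) {
        if (gcTimeRatio < 0) throw new IllegalArgumentException("ratio must be non-negative");
        return 1.0 / (1.0 + gcTimeRatio);
    }

    /** Percentage of the committed heap that is currently free. */
    static double freePercent(long usedBytes, long committedBytes) {
        if (committedBytes <= 0 || usedBytes < 0 || usedBytes > committedBytes) {
            throw new IllegalArgumentException("need 0 <= used <= committed and committed > 0");
        }
        return 100.0 * (committedBytes - usedBytes) / committedBytes;
    }

    /** Committed size the heap would shrink to so free space no longer exceeds the ceiling. */
    static long shrinkTargetBytes(long usedBytes, long committedBytes, int maxHeapFreeRatio) {
        if (freePercent(usedBytes, committedBytes) <= maxHeapFreeRatio) {
            return committedBytes;  // already within the ceiling, nothing to release
        }
        return (long) Math.ceil(usedBytes / (1.0 - maxHeapFreeRatio / 100.0));
    }

    public static void main(String[] args) {
        System.out.printf("GCTimeRatio=99 -> GC goal %.1f%% of run time%n",
                100 * gcTimeGoalFraction(99));
        System.out.printf("GCTimeRatio=4  -> GC goal %.1f%% of run time%n",
                100 * gcTimeGoalFraction(4));

        long mib = 1024L * 1024L;
        long used = 200 * mib;        // hypothetical live data after a collection
        long committed = 1000 * mib;  // hypothetical heap currently held by the JVM
        System.out.println("shrink target at MaxHeapFreeRatio=50: "
                + shrinkTargetBytes(used, committed, 50) / mib + " MiB");
    }
}
```

In this model, a heap holding 200 MiB of live data inside 1000 MiB of committed memory is 80 percent free, so a ceiling of 50 percent would let it shrink to about 400 MiB.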

Make your application tolerant of more aggressive garbage-collection by making it more cloud-native

The main reason garbage collection has such a negative impact is that applications don’t tolerate the pause it incurs well. Depending on how garbage-collection happens, they might become unresponsive for relatively long periods. Hence, operators try not to enable aggressive garbage-collection.  However, Java applications are not the only ones affected by the issue of an unresponsive workload. Any application can stall for various reasons, from I/O starvation on the node running the workload to something external that the application depends on (like a database) responding slowly.

In the cloud-native world, people have developed well-understood methods for working around these issues. These are usually some variation of running multiple replicas, having data replication and failover for critical services, or caching data to avoid relatively long round-trips across a network link. It may require significant work to rearchitect things. But enabling your Java application (or things that depend on it) to handle a blip because of garbage collection is likely to pay off in other congestion or failure scenarios.
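As one small illustration of the "multiple replicas plus failover" idea, here is a sketch of a caller that gives each replica a short time budget and moves on to the next one if it stalls. It is deliberately minimal: the replicas are modeled as plain Supplier objects, the names are invented for the example, and a production client would normally lean on its HTTP or RPC library's own timeout and retry support instead.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Minimal failover sketch; replicas are stand-ins for real remote calls.
class ReplicaFailover {
    /** Tries each replica in order, giving up on one that fails or exceeds the time budget. */
    static <T> T callWithFailover(List<Supplier<T>> replicas, long timeoutMillis) {
        Exception lastFailure = null;
        for (Supplier<T> replica : replicas) {
            CompletableFuture<T> attempt = CompletableFuture.supplyAsync(replica);
            try {
                return attempt.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException | ExecutionException e) {
                lastFailure = e;  // stalled (for example mid-GC pause) or failed: try the next one
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for a replica", e);
            }
        }
        throw new IllegalStateException("no replica answered in time", lastFailure);
    }

    /** Hypothetical replica that pauses before answering, like a JVM stuck in a long collection. */
    static Supplier<String> pausedReplica(long pauseMillis) {
        return () -> {
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "answer from paused replica";
        };
    }

    public static void main(String[] args) {
        Supplier<String> healthy = () -> "answer from healthy replica";
        List<Supplier<String>> replicas = List.of(pausedReplica(500), healthy);
        System.out.println(callWithFailover(replicas, 100));
    }
}
```

The same pattern covers the slow database or starved node mentioned above, which is why the investment tends to pay off beyond garbage collection.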

Whichever of these options you choose (or even something not covered above!), the big takeaway is that you have options that can enhance the effectiveness of your other optimization tools. Experiment and see what works for you!

Additional reading

Baeldung: Does GC Release Back Memory to OS?

Previously on the Platform9 blog:

Java app performance over the decades

Kubernetes FinOps: Right-sizing Kubernetes workloads

Kubernetes FinOps: Resource management challenges

Documentation:

Release Note: Java Improvements for Docker Containers

ZGC

Shenandoah GC

Joe Thompson
