JVM performance watch roundup May 2025

2025-06-03

Posted by Ingo Kegel

I post notes on JVM performance-related news on social media whenever a patch, JEP, or benchmark catches my eye. This blog post collects everything shared in May 2025. Each section rewrites the original thread in plain text.

Follow me on social media to get these updates as soon as they go out.

1. Streamlined ahead-of-time caching with single-step command-line option

JEP 514 proposes a streamlined command-line interface for generating ahead-of-time (AOT) caches in the JVM. Until now, AOT cache creation has required a two-step process: first recording a training run with -XX:AOTMode=record, then generating the cache with -XX:AOTMode=create. The new -XX:AOTCacheOutput=<file> option combines both steps into a single JVM invocation.

This integrated approach automatically manages a temporary config file for the training phase. Additionally, the new JDK_AOT_VM_OPTIONS environment variable will allow developers to customize the cache creation phase independently of the training phase. This is useful for scenarios where cache generation might benefit from different hardware or memory configurations.

The existing two-step workflow remains supported, making it possible to record on a small instance and then create the cache on a more powerful machine.
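As a rough sketch, the two workflows could look like this on the command line. The names app.jar, com.example.App, app.aot and app.aotconf are placeholders, and -Xmx4g is only an arbitrary illustration of a creation-phase option. The -XX:AOTConfiguration and -XX:AOTCache options are not mentioned above; they are assumed here from the two-step workflow of the earlier AOT cache JEP. The commands are wrapped in shell functions so that nothing runs until you call one of them.

```shell
# Placeholders (assumptions): app.jar, com.example.App, app.aot, app.aotconf.

# One-step workflow (JEP 514): training run and cache creation in one invocation.
aot_one_step() {
    java -XX:AOTCacheOutput=app.aot -cp app.jar com.example.App "$@"
}

# Same as above, but with extra options that apply only to the cache creation phase.
# -Xmx4g is an arbitrary example value, not a recommendation.
aot_one_step_tuned() {
    JDK_AOT_VM_OPTIONS="-Xmx4g" java -XX:AOTCacheOutput=app.aot -cp app.jar com.example.App "$@"
}

# Two-step workflow, step 1: record a training run into a configuration file.
aot_record() {
    java -XX:AOTMode=record -XX:AOTConfiguration=app.aotconf -cp app.jar com.example.App "$@"
}

# Two-step workflow, step 2: create the cache, possibly on a different machine.
aot_create() {
    java -XX:AOTMode=create -XX:AOTConfiguration=app.aotconf -XX:AOTCache=app.aot -cp app.jar
}

# Production run that uses the cache, regardless of how it was created.
aot_run() {
    java -XX:AOTCache=app.aot -cp app.jar com.example.App "$@"
}
```

With the two-step variant, only app.aotconf has to travel from the recording machine to the machine that calls aot_create.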

2. JFR stack sampling overhaul with cooperative safepoint sampling

JFR stack sampling is undergoing a significant rewrite aimed at improving safety, reliability, and scalability in production environments. JEP 518 introduces a cooperative stack sampling mechanism that eliminates the old crash-prone asynchronous heuristics in favor of sampling at safepoints.

In the current implementation, JFR suspends threads at arbitrary points and reconstructs the call stack heuristically. This approach often results in invalid traces and occasional JVM crashes, particularly when classes are unloaded during sampling. In the new design, the sampler records just the program counter and stack pointer of the target thread and marks the thread so that it stops at its next safepoint. The actual stack reconstruction is deferred to the thread itself at that safe location.

This change allows JFR to allocate memory safely during stack parsing. While some limitations such as incomplete traces for intrinsics or native methods remain, the risk of JVM crashes is effectively removed.

For JProfiler, this improvement means JFR recordings will become a more interesting data source, which is especially useful in environments where native agents are restricted.
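To look at the execution samples discussed above, a recording can be started and inspected with the standard JFR tooling. This is only a sketch: app.jar and rec.jfr are placeholder names, the 60 second duration is arbitrary, and it assumes that JEP 518 changes the sampling mechanism without requiring any new command-line option.

```shell
# Placeholders (assumptions): app.jar, rec.jfr, and the 60 s duration.

# Start the application with a time-limited flight recording.
jfr_record() {
    java -XX:StartFlightRecording:duration=60s,filename=rec.jfr -jar app.jar "$@"
}

# Print the Java execution samples, including their stack traces.
jfr_samples() {
    jfr print --events jdk.ExecutionSample rec.jfr
}
```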

3. Generational Shenandoah GC is graduating from experimental to production

Generational Shenandoah is moving from experimental to production status. Initially introduced as an experimental feature in JEP 404 (JDK 24), generational Shenandoah aims to improve throughput and pause times for allocation-heavy workloads by dividing the heap into young and old generations.

Until now, enabling generational mode has required unlocking experimental options:

-XX:+UseShenandoahGC
-XX:+UnlockExperimentalVMOptions
-XX:ShenandoahGCMode=generational

According to JEP 521, generational Shenandoah is now considered stable enough to drop the -XX:+UnlockExperimentalVMOptions requirement, which simplifies its activation. However, single-generation Shenandoah remains the default mode.
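Side by side, the difference is one flag. The sketch below assumes a placeholder app.jar and wraps both variants in shell functions so that nothing runs until called.

```shell
# Placeholder (assumption): app.jar.

# Experimental status (JEP 404): the unlock flag is required.
shenandoah_generational_experimental() {
    java -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions \
         -XX:ShenandoahGCMode=generational -jar app.jar "$@"
}

# Production status (JEP 521): the unlock flag is no longer needed.
shenandoah_generational_product() {
    java -XX:+UseShenandoahGC -XX:ShenandoahGCMode=generational -jar app.jar "$@"
}
```

Leaving out -XX:ShenandoahGCMode=generational still selects single-generation Shenandoah.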

Runs of benchmarks such as DaCapo, SPECjbb2015, and Heapothesys have shown encouraging results. For memory-sensitive or low-latency workloads, generational mode is worth evaluating.

4. JFR method-level timing and tracing

JEP 520 introduces method-level timing and tracing to JDK Flight Recorder (JFR) via JVM-managed bytecode instrumentation, starting with Java 25. With the jdk.MethodTiming and jdk.MethodTrace events, developers can now record exact invocation counts, execution times, and stack traces for targeted methods. Targets are configured using class names, method references, or annotations.

This feature is particularly useful for pinpointing performance bottlenecks in critical or infrequently called code such as startup logic, connection pools, or custom endpoints. For example, the filter ::<clinit> targets all static initializers, and an annotation filter such as @RequestMapping selects the methods that carry that annotation.
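On the command line, the filters described above could be passed as event settings when the recording is started. This is a sketch under assumptions: app.jar and recording.jfr are placeholders, java.util.HashMap::resize is just an example target, and the annotation is written with its fully qualified Spring name on the assumption that filters need the full class name. The arguments are quoted because < and > would otherwise be interpreted by the shell.

```shell
# Placeholders (assumptions): app.jar, recording.jfr, and the chosen filter targets.

# Trace every invocation of a single method, with stack traces.
trace_method() {
    java "-XX:StartFlightRecording:jdk.MethodTrace#filter=java.util.HashMap::resize,filename=recording.jfr" \
         -jar app.jar "$@"
}

# Time all static initializers.
time_static_initializers() {
    java "-XX:StartFlightRecording:jdk.MethodTiming#filter=::<clinit>,filename=recording.jfr" \
         -jar app.jar "$@"
}

# Time methods selected by annotation (fully qualified name assumed to be required).
time_annotated() {
    java "-XX:StartFlightRecording:jdk.MethodTiming#filter=@org.springframework.web.bind.annotation.RequestMapping,filename=recording.jfr" \
         -jar app.jar "$@"
}

# Inspect the results with the jfr tool.
print_traces() {
    jfr print --events jdk.MethodTrace --stack-depth 20 recording.jfr
}
print_timings() {
    jfr print --events jdk.MethodTiming recording.jfr
}
```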

JProfiler users benefit from JEP 520 by gaining some high-resolution CPU data when analyzing JFR snapshots. This aligns with JProfiler’s existing support for both agent-based instrumentation and JVMTI sampling.
