Bundling a minimal 'bare bones' JVM with your Application

One reason people dislike Java for desktop applications is probably the extra requirement of a JVM: it is generally not bundled with any of the major operating systems on the market today, and even when one is installed it can be the wrong version. This is really annoying for end users who just want something that ‘works out of the box’.

I got interested in this idea: what if your application could offer a seamless experience, as if it were a native binary on the platform, by embedding the JVM as part of the package, so that everything besides the loader can still be written in Java?

The impetus is that the full JVM is usually a huge piece of bloat, and most of the time you don’t need every feature that is present. So what about packing only the components that really have to be present for an application to work?

For a simple application that just prints ‘Hello World’, how many files does the JVM really need in order to function? To find out, I started experimenting: I took my Java installation and removed files one at a time, keeping only those whose absence caused the JVM to fail. By trial and error, here’s what I found to be required for a minimal functioning JVM:

70k     jre/bin/java
103k    jre/bin
29k     jre/lib/i386/native_threads/libhpi.so
33k     jre/lib/i386/native_threads
4.3M    jre/lib/i386/client/libjvm.so
4.4M    jre/lib/i386/client
91k     jre/lib/i386/libzip.so
50k     jre/lib/i386/libverify.so
148k    jre/lib/i386/libjava.so
4.1k    jre/lib/i386/jvm.cfg
4.7M    jre/lib/i386
467k    jre/lib/rt.jar*

What you’re seeing here is the output from a Linux version of the JVM. The point to note is that file prefixes and extensions differ between operating systems, although the base names should largely remain the same. For example, on Windows 'java' will be 'java.exe', and a library like 'libjvm.so' will correspond to 'jvm.dll' instead.
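If you want to see what the platform you are on calls these libraries, the standard System.mapLibraryName method applies the local prefix and extension to a base name. This is only a small sketch; the class name LibraryNames and the helper fileNameFor are made up for illustration, and the base names are taken from the listing above.

```java
public class LibraryNames {
    /** Returns the platform-specific file name for a native library base name. */
    public static String fileNameFor(String baseName) {
        // System.mapLibraryName adds the platform's own prefix and suffix,
        // for example "lib" + name + ".so" on Linux or name + ".dll" on Windows.
        return System.mapLibraryName(baseName);
    }

    public static void main(String[] args) {
        // Base names of the native libraries from the listing above.
        String[] baseNames = {"jvm", "java", "zip", "verify", "hpi"};
        for (String baseName : baseNames) {
            System.out.println(baseName + " -> " + fileNameFor(baseName));
        }
    }
}
```

This only tells you the naming convention; whether a given library actually exists in a particular JVM build is something you still have to check in the installation itself.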

The files add up to a little over 5MB. But to get to this figure I had to take a number of additional steps, so in some ways this isn’t actually a fully functioning JVM, for these reasons:

  1. I’ve left out most of the ‘core’ dynamic libraries, like awt, sound, network, io, nio and the various other libraries that reside in the /lib/i386/ directory. If the application attempts to use the corresponding Java classes, the JVM will fail, because the underlying libraries that provide the actual implementation aren’t present.

  2. Because those native libraries aren’t around, there isn’t any need for the corresponding class files either, so I’ve removed them from rt.jar, the jar package where the Java system classes reside. The original rt.jar is probably 20MB or so, so in this case it is worth the trouble.

The caveat is that this is an error-prone process. The set of classes the JVM uses is guesstimated by running the java launcher with the ‘-verbose:class’ flag, capturing the resulting list of loaded classes, and then removing everything else from 'rt.jar'. The list of classes loaded looks something like this:

% java -verbose:class HelloWorld
[Opened /opt/sun-jdk-]
[Opened /opt/sun-jdk-]
[Opened /opt/sun-jdk-]
[Opened /opt/sun-jdk-]
[Loaded java.lang.Object from /opt/sun-jdk-]
[Loaded java.io.Serializable from /opt/sun-jdk-]
[Loaded java.lang.Comparable from /opt/sun-jdk-]

  ... [ lines truncated for brevity ] ...

[Loaded java.security.Principal from /opt/sun-jdk-]
[Loaded java.security.cert.Certificate from /opt/sun-jdk-]
[Loaded HelloWorld from file:/home/vincent/code/]
Hello World
[Loaded java.lang.Shutdown from /opt/sun-jdk-]
[Loaded java.lang.Shutdown$Lock from /opt/sun-jdk-]

Many people may be surprised that a simple 'HelloWorld' application needs to load so many class files before it runs. This, together with the series of auxiliary operations the JVM performs before the application starts, accounts for the ‘slow start’ phenomenon that we normally encounter with Java apps. To use this information, you’ll probably need to pipe the output into a file and perform some manipulation before you can pass the list to ‘jar’ for repackaging.
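The manipulation can be as simple as turning each "[Loaded X from Y]" line into a jar entry name. Here is a rough sketch of that step, not the exact script I used: the class name VerboseClassList, the method entries and the "rt.jar" suffix filter are all my own choices for illustration, and it assumes the line format shown above (newer JVMs log class loading in a different format).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VerboseClassList {
    // Matches lines such as "[Loaded java.lang.Object from /some/path/rt.jar]".
    private static final Pattern LOADED =
            Pattern.compile("^\\[Loaded (\\S+) from (.*)\\]$");

    /**
     * Converts captured -verbose:class lines into jar entry names, keeping only
     * classes whose source ends with sourceSuffix (for example "rt.jar").
     * Duplicates are dropped and first-seen order is preserved.
     */
    public static List<String> entries(List<String> lines, String sourceSuffix) {
        Set<String> result = new LinkedHashSet<>();
        for (String line : lines) {
            Matcher m = LOADED.matcher(line.trim());
            if (m.matches() && m.group(2).endsWith(sourceSuffix)) {
                // java.lang.Object -> java/lang/Object.class
                result.add(m.group(1).replace('.', '/') + ".class");
            }
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) throws IOException {
        // args[0]: a file holding the captured output of "java -verbose:class ..."
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (String entry : entries(lines, "rt.jar")) {
            System.out.println(entry);
        }
    }
}
```

Redirect the printed list into a file, unpack the original rt.jar into a scratch directory, and from inside that directory the list can be handed to jar as an argument file (jar cf rt.jar @classes.list, where classes.list is whatever you named the file).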

The error-proneness I mentioned lies in the fact that applications can dynamically load other classes at any point in time, which means required class files can be missing because they were not captured in the profile of a single execution. This shouldn’t be a big problem for you as the developer, since it should be relatively straightforward to find out which system packages your application uses.
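One way to guard against this is a small smoke test that you run against the trimmed JVM, asking it to load every system class you know your application may reach dynamically. The sketch below is only an illustration: the class name RuntimeCheck and the method tryLoad are made up, and the list of class names to check is something you would have to supply yourself.

```java
public class RuntimeCheck {
    /**
     * Tries to load a class by name without initializing it.
     * Returns null when the class (or something it links against) is missing.
     */
    public static Class<?> tryLoad(String className) {
        try {
            return Class.forName(className, false, RuntimeCheck.class.getClassLoader());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Pass the class names your application might load at run time.
        boolean allPresent = true;
        for (String name : args) {
            boolean present = tryLoad(name) != null;
            allPresent &= present;
            System.out.println(name + " -> " + (present ? "present" : "MISSING"));
        }
        // A non-zero exit code makes the check easy to use from a build script.
        System.exit(allPresent ? 0 : 1);
    }
}
```

A class that shows up as MISSING here is one that the single ‘-verbose:class’ run never touched, so it has to be added back to the trimmed 'rt.jar' by hand. Note that this only checks for class files; it does not prove that the native library behind a class is present as well.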

After all that trimming, the entire embedded runtime adds up to about 5MB, which is a sensible size for embedding into your application. But compared to the offline Java installer, which only comes to about 20MB compressed, the savings aren’t really that substantial, given that downloads are getting cheaper and faster by the day. I’m sure your YouTube bandwidth use easily exceeds that on any given day, so it may well not be worth the effort to integrate the JVM into your application; an installation script that seamlessly installs the JVM alongside it may serve just as well.