November 19, 2013

How to be Unsafe with Java Memory Management


Have you ever wondered about the internals of Java memory management? Do you ask yourself weird questions like:

  • How much space does a class take up in memory?
  • How much space do my objects consume in machine memory?
  • What’s the deal with the alignment of object properties in memory?

In our article today, we'll cover all those questions and a few more as we explore Java memory usage and management.

 

The Class and Object Structure of Java in Memory

For Java geeks like us here at RebelLabs, these mysterious questions have been orbiting our minds for a long time. If you are interested in the instrumentation of classes, knowing how classes are laid out in memory makes it easier to read specific fields from memory, or to hack those fields on the fly. This means that you can actually change the data, or even the code, in memory!

Other points that might pique your interest are “Off-Heap Memory” and “High Performance Serialization” implementations, which are a couple of good examples built on the object memory structure. This article covers the ways to access the memory addresses of classes and their instances, the layouts of these classes and instances in memory, and a detailed explanation of how object fields are laid out. We’re hoping to explain the content as simply as possible, but even so this article isn’t for Java beginners, and knowledge of some Java programming principles is needed.

N.B. The descriptions below of class and object layouts are specific to Java SE 7, so don’t assume that any of it applies to past or future versions of Java. For your convenience, we’ve placed the sample code for this article in a GitHub project, which you can find here: https://github.com/serkan-ozal/ocean-of-memories/tree/master/src/main/java/com/zeroturnaround/rebellabs/oceanofmemories/article1

How Does Direct Memory Access Work in Java?

Java was initially designed as a safe, managed environment. Nevertheless, the Java HotSpot VM contains a “backdoor” that provides a number of low-level operations to manipulate memory and threads directly. This backdoor class, sun.misc.Unsafe, is widely used by the JDK itself in packages like java.nio and java.util.concurrent. However, using this backdoor in a production environment is strongly discouraged, because the API is extremely dangerous, non-portable, and volatile. The Unsafe class provides an easy way to look into HotSpot JVM internals and to do some tricks. Sometimes it can be used to study VM internals without debugging C++ code, and sometimes it can be leveraged for profiling and development tools.

How to be Unsafe

The sun.misc.Unsafe class is so unsafe that the JDK developers added special checks to restrict access to it. Its constructor is private, and the caller of the factory method getUnsafe() must have been loaded by the bootstrap class loader. Classes loaded by the bootstrap loader are not associated with any ClassLoader object, so their class loader is reported as null. As the if check inside getUnsafe() in the snippet below shows, any caller whose class loader is not null gets a SecurityException, which keeps intruders out.


public final class Unsafe {
   ...
   private Unsafe() {}
   private static final Unsafe theUnsafe = new Unsafe();
   ...
   public static Unsafe getUnsafe() {
      Class cc = sun.reflect.Reflection.getCallerClass(2);
      if (cc.getClassLoader() != null)
          throw new SecurityException("Unsafe");
      return theUnsafe;
   }
   ...
}


Fortunately, there is a theUnsafe field that can be used to retrieve the Unsafe instance. We can easily write a helper method to do this via reflection, as seen below. (Source: http://highlyscalable.wordpress.com/2012/02/02/direct-memory-access-in-java/)


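import java.lang.reflect.Field;
import sun.misc.Unsafe;

public static Unsafe getUnsafe() {
   try {
       // theUnsafe is the private static singleton declared inside sun.misc.Unsafe
       Field f = Unsafe.class.getDeclaredField("theUnsafe");
       f.setAccessible(true);         // bypass the private modifier
       return (Unsafe) f.get(null);   // static field, so no instance is needed
   } catch (Exception e) {
       // Fail loudly instead of falling through without a return value
       throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
   }
}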


Some useful features of Unsafe

  • VM “intrinsification”, i.e. CAS (compare-and-swap) as used in lock-free hash tables. For example, the compareAndSwapInt method is declared native, and the HotSpot JIT compiler can replace calls to it with the processor's special CAS instructions. You can read more about CAS here: http://en.wikipedia.org/wiki/Compare-and-swap.
  • The sun.misc.Unsafe functionality of the host VM can be used to allocate uninitialized objects (via the “allocateInstance” method) and then treat the constructor invocation like any other method call.
  • You can track the data from the native address. It’s possible to retrieve an object’s memory address using the sun.misc.Unsafe class, and operate on its fields directly via the unsafe get/put methods!
  • With the allocateMemory method, memory can be allocated off-heap. For example, the DirectByteBuffer constructor internally calls it when the allocateDirect method is invoked.
  • The methods, arrayBaseOffset and arrayIndexScale, can be used to develop arraylets, a technique for efficiently breaking up large arrays into smaller objects to limit the real-time cost of scan, update or move operations on large objects.
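To make a few of these features concrete, here is a minimal sketch that exercises allocateMemory, allocateInstance and compareAndSwapInt. It assumes a HotSpot JVM on which sun.misc.Unsafe is still reachable through reflection (recent JDKs deprecate these methods and may print warnings). The class names UnsafeFeaturesDemo and Counter are hypothetical and exist only for this illustration.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafeFeaturesDemo {

    // Hypothetical class, used only for this illustration.
    static class Counter {
        volatile int value = 42; // assigned by the field initializer, which runs in the constructor
    }

    // Obtains the Unsafe singleton through its private static "theUnsafe" field.
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // allocateMemory: write a long into off-heap memory and read it back.
    static long offHeapRoundTrip(long value) {
        Unsafe u = unsafe();
        long address = u.allocateMemory(8); // 8 bytes outside the Java heap
        try {
            u.putLong(address, value);
            return u.getLong(address);
        } finally {
            u.freeMemory(address); // off-heap memory is not garbage collected
        }
    }

    // allocateInstance: the constructor is skipped, so the initializer never runs.
    static int valueWithoutConstructor() {
        try {
            Counter c = (Counter) unsafe().allocateInstance(Counter.class);
            return c.value;
        } catch (InstantiationException e) {
            throw new IllegalStateException(e);
        }
    }

    // compareAndSwapInt: atomically replace 'expected' with 'update' in Counter.value.
    static boolean casValue(Counter c, int expected, int update) {
        try {
            Unsafe u = unsafe();
            long offset = u.objectFieldOffset(Counter.class.getDeclaredField("value"));
            return u.compareAndSwapInt(c, offset, expected, update);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("off-heap round trip: " + offHeapRoundTrip(123456789L));
        System.out.println("value without constructor: " + valueWithoutConstructor());
        Counter c = new Counter();
        System.out.println("CAS 42 -> 43 succeeded: " + casValue(c, 42, 43));
        System.out.println("CAS 42 -> 44 succeeded: " + casValue(c, 42, 44));
    }
}
```

The second CAS is expected to fail because the field no longer holds 42 after the first one succeeds. Note that the instance created by allocateInstance sees the default value 0 rather than 42, which shows that no constructor code ran.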

In the sections that follow, where we get the memory address of a class, we give some examples of using “Unsafe”.

OOPs and Compressed OOPs

Java objects in the heap are represented by an Ordinary Object Pointer (OOP). An OOP is a managed pointer to a memory location inside the Java heap, which is allocated as a single, continuous address range in terms of the JVM process’s virtual address space.

An OOP is normally the same size as a native machine pointer, which means 64 bits on an LP64 system. On an ILP32 system, maximum heap size is somewhat less than 4 gigabytes, which is insufficient for many applications.

Managed pointers in the Java heap point to objects which are aligned on 8-byte address boundaries. Compressed OOPs represent managed pointers in many, but not all, places in the JVM as 32-bit object offsets from the 64-bit Java heap base address. Because they're object offsets rather than byte offsets, they can be used to address up to four billion objects (not bytes), or a heap size of up to about 32 gigabytes. To use them, they must be scaled by a factor of 8 and added to the Java heap base address to find the object to which they refer. Object sizes using compressed OOPs are comparable to those in ILP32 mode.
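The scaling described above is plain arithmetic, which the following sketch illustrates. It is not HotSpot code: the class name CompressedOopMath and the heap base address are assumptions chosen for the example, and the special handling of null (which a real JVM encodes as 0) is left out.

```java
final class CompressedOopMath {

    // Objects are aligned on 8-byte boundaries, so offsets are scaled by 8 (a shift of 3).
    static final int OBJECT_ALIGNMENT_SHIFT = 3;

    // Turns a 32-bit object offset into a full address: base + (offset * 8).
    static long decode(long heapBase, int narrowOop) {
        long unsignedOffset = narrowOop & 0xFFFFFFFFL; // treat the int as unsigned
        return heapBase + (unsignedOffset << OBJECT_ALIGNMENT_SHIFT);
    }

    // Inverse of decode: (address - base) / 8, stored in 32 bits.
    static int encode(long heapBase, long address) {
        return (int) ((address - heapBase) >>> OBJECT_ALIGNMENT_SHIFT);
    }

    // 2^32 addressable objects * 8 bytes of alignment = 2^35 bytes.
    static long maxAddressableBytes() {
        return (1L << 32) << OBJECT_ALIGNMENT_SHIFT;
    }

    public static void main(String[] args) {
        long heapBase = 0x0000000800000000L; // hypothetical heap base address
        long address = heapBase + 0x1000;    // an 8-byte aligned address inside the heap

        int narrow = encode(heapBase, address);
        System.out.println("compressed: 0x" + Integer.toHexString(narrow));
        System.out.println("decoded:    0x" + Long.toHexString(decode(heapBase, narrow)));
        System.out.println("max heap:   " + (maxAddressableBytes() >> 30) + " GiB"); // prints 32 GiB
    }
}
```

This is where the figure of about 32 gigabytes comes from: 32 bits of object offset multiplied by the 8-byte alignment.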

Compressed OOPs are supported and enabled by default in Java SE 6u23 and later. In Java SE 7, the use of compressed OOPs is the default for 64-bit JVM processes when “-Xmx” isn't specified and for values of “-Xmx” less than 32 gigabytes. For JDK 6 before the 6u23 release, use the “-XX:+UseCompressedOops” flag with the java command to enable the feature. (http://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html)

In our examples, we disable compressed OOPs support by using the “-XX:-UseCompressedOops” VM option. So the size of our pointers for 64-bit samples is 8 bytes.
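If you want to check which pointer size your own JVM is using, one possible probe is sketched below. It relies on the assumption that, on HotSpot, the size of one slot in an Object[] equals the size of a heap reference: 4 bytes on a 32-bit JVM or with compressed OOPs, 8 bytes otherwise. The class name ReferenceSizeProbe is hypothetical.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class ReferenceSizeProbe {

    // Obtains the Unsafe singleton through its private static "theUnsafe" field.
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Bytes occupied by one element of an Object[], i.e. by one heap reference.
    static int referenceSize() {
        return unsafe().arrayIndexScale(Object[].class);
    }

    // Bytes in a native machine pointer (4 on a 32-bit JVM, 8 on a 64-bit JVM).
    static int nativePointerSize() {
        return unsafe().addressSize();
    }

    public static void main(String[] args) {
        System.out.println("native pointer size: " + nativePointerSize() + " bytes");
        System.out.println("heap reference size: " + referenceSize() + " bytes");
    }
}
```

Running it once with “-XX:-UseCompressedOops” and once without should show the heap reference size change on a 64-bit JVM while the native pointer size stays the same.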


A Few Things About the SampleClass

Within this article we’ll be using a sample class (named SampleClass, as you might expect) to give examples of retrieving the address of its object, listing the layout of its fields, and so on. The class is a simple one that contains three primitive fields, and it extends SampleBaseClass to demonstrate how inheritance affects the memory layout. The class definitions are given here, and the code of these classes can also be found on GitHub:


public final class SampleClass extends SampleBaseClass {

	private final static byte b = 100;

	private int i = 5;
	private long l = 10;

	public SampleClass() {

	}

	public SampleClass(int i, long l) {
		this.i = i;
		this.l = l;
	}

	public int getI() {
		return i;
	}

	public void setI(int i) {
		this.i = i;
	}

	public long getL() {
		return l;
	}

	public void setL(long l) {
		this.l = l;
	}

	public static byte getB() {
		return b;
	}
}

 


public class SampleBaseClass {

	protected short s = 20;
}


How Can We Get the Memory Address of a Class?

There is no simple way to get the memory address of a Java class. In order to get it, tricky stuff must be done and sacrifices must be made! Here we explain two methods of achieving this trick.

Method #1: In the JVM, every object has a pointer to its class, but only to its concrete class and not to its interface or abstract class. If we get the memory address of an object, we can get the address of its class easily. This method is useful only for classes whose instances can be created. Neither interfaces nor abstract classes can be used in this way.


For 32 bit JVM:
	_mark	: 4 byte constant
	_klass	: 4 byte pointer to class 

For 64 bit JVM:
	_mark	: 8 byte constant
	_klass	: 8 byte pointer to class

For 64 bit JVM with compressed-oops:
	_mark	: 8 byte constant
	_klass	: 4 byte pointer to class


The second field in the object's memory layout (at offset 4 from the object's address on a 32-bit JVM, and at offset 8 on a 64-bit JVM) points to the class definition of the object in memory. To read the value at this offset, you can use the “sun.misc.Unsafe” class. The SampleClass used here is listed in the section above.


For 32 bit JVM:
	SampleClass sampleClassObject = new SampleClass();
	int addressOfSampleClass = unsafe.getInt(sampleClassObject, 4L);

For 64 bit JVM:
	SampleClass sampleClassObject = new SampleClass();
	long addressOfSampleClass = unsafe.getLong(sampleClassObject, 8L);

For 64 bit JVM with compressed-oops:
	SampleClass sampleClassObject = new SampleClass();
	// getInt returns a signed int; mask it so the 32-bit value is not sign-extended
	long addressOfSampleClass = unsafe.getInt(sampleClassObject, 8L) & 0xFFFFFFFFL;


Method #2: With this technique, the address of any class (interface, annotation, abstract class, enum) can be found. In Java 7, the memory address of a class definition is stored inside its java.lang.Class object: for a 32-bit JVM, it is the 4 bytes at offset 80; for a 64-bit JVM, the 8 bytes at offset 160; and for a 64-bit JVM with compressed-oops, the 4 bytes at offset 84.

These aren't defined offsets, but the fields are documented in the class file parser as "hidden" fields (there are actually three fields here: class, arrayClass, resolvedConstructor). They just happen to be at that offset because there are 18 non-static reference-type fields in java.lang.Class.

The code samples to retrieve the addresses are listed here:


For 32 bit JVM:
	int addressOfSampleClass = unsafe.getInt(SampleClass.class, 80L);

For 64 bit JVM:
	long addressOfSampleClass = unsafe.getLong(SampleClass.class, 160L);

For 64 bit JVM with compressed-oops:
	// getInt returns a signed int; mask it so the 32-bit value is not sign-extended
	long addressOfSampleClass = unsafe.getInt(SampleClass.class, 84L) & 0xFFFFFFFFL;


How Can We Get the Memory Address of an Object?

Getting the memory address of an object is trickier than getting the address of the class. In order to get the address, we use a helper array of type java.lang.Object[] whose length is 1.

Here are the steps for getting the address of an object:

  1. Set the target object as the first (and only) element of the helper array. Since the element is a complex (non-primitive) type, its address is stored in the array at index 0.
  2. Then get the base offset of the helper array. The base offset of an array is the offset from the starting address of the array object to the starting point of its elements.
  3. We need to check the JVM address size:
    • If the JVM is 32 bit, read an integer value from <address_of_array> + <base_offset_of_array> via the sun.misc.Unsafe class. This 4-byte integer value is the address of the target object.
    • If the JVM is 64 bit, read a long value from <address_of_array> + <base_offset_of_array> via the sun.misc.Unsafe class. This 8-byte long value is the address of the target object.

For 32 bit JVM
	Object helperArray[] 	= new Object[1];
	helperArray[0] 		= targetObject;
	long baseOffset 		= unsafe.arrayBaseOffset(Object[].class);
	int addressOfObject	= unsafe.getInt(helperArray, baseOffset);

For 64 bit JVM
	Object helperArray[] 	= new Object[1];
	helperArray[0] 		= targetObject;
	long baseOffset 		= unsafe.arrayBaseOffset(Object[].class);
	long addressOfObject	= unsafe.getLong(helperArray, baseOffset);


You can think of targetObject as an instance of SampleClass, which was given in the above section. But also keep in mind that the object could be any instance of any class!
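The two variants can be folded into one helper that picks the read width at run time. The sketch below is one way to do that, under the assumption that the slot size reported by arrayIndexScale tells us how wide a reference is. The class name ObjectAddressDemo is hypothetical. Keep in mind that with compressed OOPs enabled the value read is the compressed (undecoded) reference, and that the garbage collector may move the object at any time, so the result is only a snapshot.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class ObjectAddressDemo {

    // Obtains the Unsafe singleton through its private static "theUnsafe" field.
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Reads the raw reference stored in slot 0 of a one-element helper array.
    static long addressOf(Object target) {
        Unsafe u = unsafe();
        Object[] helperArray = { target };
        long baseOffset = u.arrayBaseOffset(Object[].class); // where element 0 starts
        int slotSize = u.arrayIndexScale(Object[].class);    // 4 or 8 bytes per reference
        if (slotSize == 4) {
            // 32-bit JVM, or 64-bit JVM with compressed OOPs: read 4 bytes, unsigned
            return u.getInt(helperArray, baseOffset) & 0xFFFFFFFFL;
        }
        // 64-bit JVM without compressed OOPs: read the full 8-byte pointer
        return u.getLong(helperArray, baseOffset);
    }

    public static void main(String[] args) {
        Object targetObject = new Object();
        System.out.println("raw reference: 0x" + Long.toHexString(addressOf(targetObject)));
    }
}
```

To reproduce the article's 64-bit numbers, run it with “-XX:-UseCompressedOops” so that the 8-byte branch is taken.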


Class Memory Layout

32-bit JVM


	[header                ] 4  byte
	[klass pointer         ] 4  byte (pointer)
	[C++ vtbl ptr          ] 4  byte (pointer)
	[layout_helper         ] 4  byte
	[super check offset    ] 4  byte 
	[name                  ] 4  byte (pointer)
	[secondary super cache ] 4  byte (pointer)
	[secondary supers      ] 4  byte (pointer)
	[primary supers        ] 32 byte (8 length array of pointer)
	[java mirror           ] 4  byte (pointer)
	[super                 ] 4  byte (pointer)
	[first subklass        ] 4  byte (pointer)
	[next sibling          ] 4  byte (pointer)
	[modifier flags        ] 4  byte
	[access flags          ] 4  byte


64-bit JVM


	[header                ] 8  byte
	[klass pointer         ] 8  byte (4 byte for compressed-oops)
	[C++ vtbl ptr          ] 8  byte (4 byte for compressed-oops)
	[layout_helper         ] 4  byte
	[super check offset    ] 4  byte 
	[name                  ] 8  byte (4 byte for compressed-oops)
	[secondary super cache ] 8  byte (4 byte for compressed-oops)
	[secondary supers      ] 8  byte (4 byte for compressed-oops)
	[primary supers        ] 64 byte (32 byte for compressed-oops)
                                         {8 length array of pointer}
	[java mirror           ] 8  byte (4 byte for compressed-oops)
	[super                 ] 8  byte (4 byte for compressed-oops)
	[first subklass        ] 8  byte (4 byte for compressed-oops)
	[next sibling          ] 8  byte (4 byte for compressed-oops)
	[modifier flags        ] 4  byte
	[access flags          ] 4  byte


The memory layout of the SampleClass can be detailed as below for 32-bit JVM. 128 bytes are listed from its starting address.

[Image: Java class memory layout of SampleClass]

And the memory layout of the SampleBaseClass can be detailed as below for 32-bit JVM. Also 128 bytes are listed from its starting address.

[Image: Java class memory layout of SampleBaseClass]

We will only explain the important fields. The colored numbers also match the legend mapping given below.

header always has the constant value of “0x00000001”.

klass pointer is the pointer (0x389708a8 for both classes) to the definition of the java.lang.Class class in memory, since this memory structure refers to a class.

C++ vtbl ptr is the pointer to the virtual table definition structure of a defined class. A virtual table is a mechanism used to support dynamic dispatch, which is the process of selecting which implementation of a polymorphic method to call at runtime.

layout helper refers to the shallow size of the instance. This size is calculated by considering the field layout alignment mechanism of the JVM. In our environment, object alignment size is 8 bytes.

super is the pointer to the class definition of the superclass, which is SampleBaseClass in our example. In this example, the value of this field is 0x34104b70, which is the address of the class definition of SampleBaseClass, as you can see. For the SampleBaseClass class, this field value is 0x38970000, which is the address of the java.lang.Object class, because in Java every class is a subclass of the Object class ☺.

modifier flags refer to the Java class modifiers, which can be “public”, “protected”, “private”, “abstract”, “static”, “final” and “strictfp”. The “modifier flags” value of a class is calculated by applying a bitwise OR to the modifier values of the target class. In our example, SampleClass is a “public” and “final” class, hence its “modifier flags” value is “0x00000001 | 0x00000010 = 0x00000011”. SampleBaseClass is only “public”, so its “modifier flags” value is “0x00000001”. The values for the modifiers are listed below.

	Modifier     Value
	Public       0x00000001
	Private      0x00000002
	Protected    0x00000004
	Static       0x00000008
	Final        0x00000010
	Abstract     0x00000400
	Strict       0x00000800
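The same bitwise OR can be reproduced from plain Java with the constants in java.lang.reflect.Modifier. In the sketch below, java.lang.String stands in for SampleClass because it is also a public final class, and java.lang.Object stands in for a class that is only public. The class name ModifierFlagsDemo is hypothetical.

```java
import java.lang.reflect.Modifier;

final class ModifierFlagsDemo {

    // The value the article computes for a public final class.
    static int publicFinalFlags() {
        return Modifier.PUBLIC | Modifier.FINAL;
    }

    // Formats flags the way they are written in the article, e.g. 0x00000011.
    static String hex(int flags) {
        return String.format("0x%08x", flags);
    }

    public static void main(String[] args) {
        System.out.println("PUBLIC | FINAL = " + hex(publicFinalFlags()));       // prints 0x00000011

        // java.lang.String is declared "public final class String".
        int stringFlags = String.class.getModifiers();
        System.out.println("String flags   = " + hex(stringFlags)
                + " (" + Modifier.toString(stringFlags) + ")");

        // java.lang.Object is only "public".
        System.out.println("Object flags   = " + hex(Object.class.getModifiers()));
    }
}
```

Modifier.toString turns a flag value back into keywords, which is a convenient way to double-check a value read from memory.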

Field Layouts & Alignments

Unlike C/C++, Java doesn't have a sizeof operator to calculate how much space is consumed by primitive types or objects, which would be useful for I/O operations, memory management and so on. Actually, having such an operator wouldn't make much sense, since the sizes of the primitive types are defined in the language specification and Java doesn't have pointers that could be used for memory copying and pointer arithmetic.

There are two ways to measure how much space an object consumes: shallow size and deep size. Shallow size is the size of the object itself with its own fields, including the reference slots it holds but not the objects those references point to. That’s what deep size is for: it extends the shallow size with the sizes of the objects that are referenced from within the class.

In the Sun Java Virtual Machine, every object except arrays has a two-word header: one word of flags and one word holding a reference to the object's class. When a new object is created by new Object() on a 32-bit JVM, 8 bytes are consumed from the heap for these two header words.

When a class extends the Object class, things get a little more complicated and also more interesting. After the 8 bytes of header, the attributes of the class are laid out in the heap. But they are not laid out according to their definition order.

The primitive types get aligned in the following order:

  • doubles and longs
  • ints and floats
  • shorts and chars
  • booleans and bytes

Then references to other objects are laid out within the heap space of the object. Finally, the JVM rounds the size of the object up to an 8-byte granularity.

Take this example:

class BooleanClass { byte a; }

This will be stretched to a size of 16 bytes, with 7 bytes of padding:

	Header        : 8 bytes
	value of byte : 1 byte
	padding       : 7 bytes
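The rounding in this example can be written down as a small calculation. The sketch below only models the final step, rounding header plus field bytes up to the 8-byte object granularity, and assumes the 8-byte header of a 32-bit JVM; it does not model the alignment of individual fields. The class name ShallowSizeMath is hypothetical.

```java
final class ShallowSizeMath {

    static final long OBJECT_ALIGNMENT = 8; // object sizes are multiples of 8 bytes

    // Rounds size up to the next multiple of alignment.
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Header plus raw field bytes, rounded up to the object granularity.
    static long shallowSize(long headerBytes, long fieldBytes) {
        return alignUp(headerBytes + fieldBytes, OBJECT_ALIGNMENT);
    }

    // Bytes added at the end of the object purely to reach the granularity.
    static long padding(long headerBytes, long fieldBytes) {
        return shallowSize(headerBytes, fieldBytes) - (headerBytes + fieldBytes);
    }

    public static void main(String[] args) {
        long header = 8;    // two 4-byte header words on a 32-bit JVM
        long byteField = 1; // the single "byte a" field
        System.out.println("shallow size: " + shallowSize(header, byteField) + " bytes"); // prints 16
        System.out.println("padding:      " + padding(header, byteField) + " bytes");     // prints 7
    }
}
```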


More on OOPs

Some basic information about OOPs was given earlier in the “OOPs and Compressed OOPs” section, so we assume that you now know the JVM's OOP terminology. Let's have an in-depth look at OOPs.

An OOP consists of two machine-word-sized fields (4 bytes each on a 32-bit JVM, 8 bytes each on a 64-bit JVM) called the "Mark" and the "Klass". These fields are followed by the member fields of the instance. An array has an extra header word before its elements: the array’s length. The "Mark" field is used in garbage collection (in the mark part of mark-and-sweep) and the "Klass" field is used as a pointer to class metadata. Both primitive and reference fields are laid out after the OOP header, and object references are, of course, also OOPs. ( http://www.infoq.com/articles/Introduction-to-HotSpot )

KlassOOPs

The "Klass" field is a pointer to the metadata (such as field definitions, methods of this class, which are expressed as a C++ virtual method table) for this class. Since carrying details of all fields and methods of every instance would be very inefficient, "KlassOOP" is a good way of sharing that information among instances.

It’s also important to note that the "KlassOOPs" are different from the Class objects that are the result of class loading operations. The difference between the two can be summarized as:

  • Class objects (e.g. String.class) are just regular Java objects. They are represented as OOPs like any other Java objects ("InstanceOOPs") and have the same behaviour as any other objects, and they can be put into Java variables.
  • "KlassOOPs" are the JVM's representation of class metadata. For example, they carry the methods of the class in a vtable structure. It is not possible to obtain a reference to a "KlassOOP" directly from Java code because they live in the Permgen area of the heap. The easy way to remember this distinction is to consider a "KlassOOP" as the JVM-level “mirror” of the Class object for the relevant class.

MarkOOPs

The "Mark" field of the OOP header is a pointer to a structure which holds housekeeping information about the OOP. On a 32-bit JVM, the bitfields of the mark structure look like this:

  • Hash (25 bits): Comprise the hashCode() value of the object.
  • Age (4 bits): The age of the object (in terms of number of garbage collections the object has survived).
  • Biased_lock (1 bit) + Lock (2 bits): Indicate the synchronization lock status of the object

Java 5 introduced a new approach to object synchronization, called "Biased Locking" (and it was made the default in Java 6). The idea is based on the observed runtime behaviour of objects: in many cases, an object is only ever locked by one thread. In "Biased Locking", an object is “biased” towards the first thread that locks it, and this thread then achieves much better locking performance. The thread which has acquired the bias is recorded in the "Mark" header:

  • JavaThread* : 23 bits
  • Epoch : 2 bits
  • Age : 4 bits
  • Biased_lock : 1 bit
  • Lock : 2 bits

If another thread attempts to lock the object, then the bias is revoked (it will not be reacquired) and from then on all threads must explicitly lock/unlock the object. The possible states for the object are as follows:

  • Unlocked
  • Biased
  • Lightweight Locked
  • Heavyweight Locked
  • Marked (only possible during Garbage Collection)
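The bitfields and states above can be decoded with a few shifts and masks. The sketch below assumes the 32-bit unlocked layout with the fields packed from the most significant bit downwards (hash, age, biased_lock, lock), and assumes the lock-bit encodings used by HotSpot (01 unlocked or biased, 00 lightweight locked, 10 heavyweight locked, 11 marked). Those bit positions and encodings are not spelled out in this article, so treat them as assumptions. The class name MarkWord32 is hypothetical.

```java
final class MarkWord32 {

    // Lowest 2 bits: the lock state.
    static int lockBits(int mark) {
        return mark & 0x3;
    }

    // Next bit: set when the object is biased towards a thread.
    static int biasedLockBit(int mark) {
        return (mark >>> 2) & 0x1;
    }

    // Next 4 bits: number of garbage collections survived (unlocked layout only).
    static int age(int mark) {
        return (mark >>> 3) & 0xF;
    }

    // Top 25 bits: identity hash code (unlocked layout only).
    static int hash(int mark) {
        return mark >>> 7;
    }

    // Maps the low bits to one of the five states listed above.
    static String state(int mark) {
        switch (lockBits(mark)) {
            case 0:  return "Lightweight Locked";
            case 1:  return biasedLockBit(mark) == 1 ? "Biased" : "Unlocked";
            case 2:  return "Heavyweight Locked";
            default: return "Marked";
        }
    }

    public static void main(String[] args) {
        // Hypothetical mark word: hash 0x123456, age 3, not biased, unlocked.
        int mark = (0x123456 << 7) | (3 << 3) | 0x1;
        System.out.println("state: " + state(mark));
        System.out.println("age:   " + age(mark));
        System.out.println("hash:  0x" + Integer.toHexString(hash(mark)));
    }
}
```

When the object is biased or locked, the upper bits hold a thread or lock pointer instead of the hash, so hash() and age() are only meaningful for the unlocked layout.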

32-bit JVM


[mark          ] 4  byte
[klass pointer ] 4  byte (pointer)
[fields        ] values of all fields including fields from super classes


64-bit JVM


[mark          ] 8  byte
[klass pointer ] 8  byte (4 byte for compressed-oops)
[fields        ] values of all fields including fields from super classes


Let’s chat a bit about deep size calculation, and move on by introducing inheritance. We’ll continue with our SampleClass and SampleBaseClass on a 32-bit JVM. The memory layout of the SampleClass object is given below. Please have a look at the code and attributes of both classes for a better understanding.

[Image: memory layout of a SampleClass object]

The mark field is the first word; it contains the object's identity hash code plus some flags such as lock state and age.

The klass field has the value 0x32104cc0 which is the pointer to SampleClass class definition.

The field ‘s’ has the value 20 (0x0014), which is the value of field s from the superclass (SampleBaseClass). Fields from the superclass are laid out first and are never mixed with subclass fields. The superclass's field block ends on a boundary equal to the “WORD” size of the system: after its fields are laid out, padding bytes are added to fill the gap up to 4-byte granularity if the “WORD” size is 4 bytes, or up to 8-byte granularity if the “WORD” size is 8 bytes. Here, 2 padding bytes (0x0000) are added to reach 4-byte (“WORD” size) granularity.

The field ‘i’ has the value 5 (0x00000005). Class attributes are ordered like this: first longs and doubles; then ints and floats; then chars and shorts; then bytes and booleans; and last the references, as stated before. The attributes are aligned to their own granularity. When the first field of a subclass is a double or long and the superclass doesn't end on an 8-byte boundary, the JVM makes an exception and tries to put an int, then shorts, then bytes, and then references at the beginning of the space reserved for the subclass until it fills the gap. This is why integer fields can be laid out before long fields, and why field i is laid out before field l here.

The field l is laid out after the field i, and here 10 (0x000000000000000a) is the value of field l.
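You can inspect the field offsets on your own JVM with objectFieldOffset. The sketch below uses simplified copies of SampleClass and SampleBaseClass nested inside a hypothetical FieldOffsetsDemo class, and assumes sun.misc.Unsafe is reachable through reflection. The offsets it prints depend on the JVM version, bitness and flags, so they will not necessarily match the Java 7 layout described above.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;
import sun.misc.Unsafe;

final class FieldOffsetsDemo {

    // Simplified copies of the article's classes (instance fields only).
    static class SampleBaseClass {
        protected short s = 20;
    }

    static final class SampleClass extends SampleBaseClass {
        private int i = 5;
        private long l = 10;
    }

    // Obtains the Unsafe singleton through its private static "theUnsafe" field.
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Maps "DeclaringClass.field" to its byte offset from the start of the object.
    static Map<String, Long> instanceFieldOffsets(Class<?> type) {
        Unsafe u = unsafe();
        Map<String, Long> offsets = new LinkedHashMap<>();
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers()) && !f.isSynthetic()) {
                    offsets.put(c.getSimpleName() + "." + f.getName(), u.objectFieldOffset(f));
                }
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> e : instanceFieldOffsets(SampleClass.class).entrySet()) {
            System.out.println(e.getKey() + " -> offset " + e.getValue());
        }
    }
}
```

Whatever the exact numbers are, each field should sit at an offset that is a multiple of its own size, which is the alignment rule described above.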


More to Come on Java Memory Management

We hope you’ve enjoyed this deeper look into this very cool niche in Java. Hopefully you’ve been exposed to some new ideas, and you now know more about the Unsafe backdoor to do a number of low-level operations for manipulating memory and threads directly. You can now easily get the memory address of a Class or an instance of it by navigating through the object of a class or by using the predefined offset addresses of the classes within the memory.

Now you can also reveal the memory layout of a class with its fields and show how they are aligned to consume the minimum possible memory. We have used our SampleClass definition throughout the article to make it more understandable and consistent for the reader. We also gave the related examples for both 32-bit and 64-bit JVMs, so we hope that this article reaches more people.

We’re planning to publish another article in this series, one which will take you to the off-heap memory management architecture. You will get to know what off-heap is, and we’ll even demonstrate the byte buffer methodology by storing a ton of data in memory, plus a comparison of the metrics on read/write access, sequential or random, with the traditional approach, and finally a comparison with the GC.

Thanks for tuning in! It would be great to continue this conversation in the comments below, so if you read this and love/hate what we’re saying, just say so. Talk to us on Twitter @jrebel_java.

