
Java Bytecode: Objects and Calling Methods
In this article, we'll discuss Java bytecode: how to use objects and call methods, and how to pass parameters to methods and return values.
Looking Back at Java Bytecode
After my recent post on Java Bytecode Fundamentals, I received some valuable feedback from the community. The topic turned out to be quite popular and it seems that Java developers miss the old days of programming in assembly language and fitting the program into 64k of memory. I thought it would be a good idea to revise the post a little bit and touch on some aspects that were left uncovered.
The Java Bytecode Fundamentals post covered the very fundamental aspects of Java bytecode: how to obtain the listings, how to read the mnemonics, the constant table, frame structure, local variable table, etc. After compiling Java source code with javac, you obtain Java executables, the *.class files. Looking at the contents of those files, you can actually recognize some of the opcodes and even full frames, like in the screenshot below.
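Because a class file is plain binary data, its structure can also be inspected from code. The following is a minimal sketch, not part of the original post: it reads the first four bytes of a class file, which the class file format defines as the magic number 0xCAFEBABE. The class name ClassFileHeader and the helper readMagic are made up for illustration, and java.lang.String is used only because its class file is always available.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ClassFileHeader {

    /** Reads the first four bytes of the class file that defines the given class. */
    static int readMagic(Class<?> type) {
        // Resource name relative to the class's own package, e.g. "String.class".
        String name = type.getName();
        String resource = name.substring(name.lastIndexOf('.') + 1) + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("No class file found for " + name);
            }
            // DataInputStream reads big-endian, which matches the class file format.
            return new DataInputStream(in).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Every valid class file starts with the same four bytes.
        System.out.println(Integer.toHexString(readMagic(String.class)));
    }
}
```

Everything after those four bytes (version numbers, constant pool, methods) is what javap decodes into the listings shown in this article.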
What Is Javap?
javap is the standard disassembler from the JDK tools, which can be used to view the mnemonic representation of a compiled Java class. You cannot do much with the results that javap provides, but you can read and understand them fairly well.
DISCLAIMER: If you haven't read the first article, it would be a good idea to take a look at that one first. Then, proceed to the text below for a greater understanding.
Now, let's take a dive into more specific aspects of Java bytecode: using classes, calling methods, and how the stack is involved in the whole process of passing the parameters to the methods.
Using Objects & Calling Methods
Creating class instances, calling methods, obtaining field values: each of these operations has a dedicated opcode in the Java bytecode instruction set. Our aim is to reveal the code constructs that produce these bytecode instructions. Let's create a simple example: a class Scheduler calls a method on another class, JobImpl, via its interface Job. JobImpl then implements some logic to produce the result.
//Scheduler.java
public class Scheduler {
    Job job = new JobImpl();

    public void main() {
        String result = (String) job.execute();
        print(result);
    }

    private static void print(String message) {
        System.out.println(message);
    }
}

//Job.java
public interface Job {
    Object execute();
}

// JobImpl.java
import java.util.Random;

public class JobImpl implements Job {
    public Object execute() {
        Integer value = createRandomValue();
        return incValue(value);
    }

    private Integer createRandomValue() {
        return new Random().nextInt(42);
    }

    private Integer incValue(Integer value) {
        return value + 1;
    }
}
To see the bytecode listings for Scheduler and JobImpl, we'll use javap as follows:
javap -c -private Scheduler
javap -c -private JobImpl
The -private option is required because the source code includes some private methods, for which bytecode listings would not be printed otherwise. We're not using -verbose, for brevity.
public class Scheduler extends java.lang.Object{
Job job;

public Scheduler();
  Code:
   0: aload_0
   1: invokespecial #1; //Method java/lang/Object."<init>":()V
   4: aload_0
   5: new #2; //class JobImpl
   8: dup
   9: invokespecial #3; //Method JobImpl."<init>":()V
   12: putfield #4; //Field job:LJob;
   15: return

public void main();
  Code:
   0: aload_0
   1: getfield #4; //Field job:LJob;
   4: invokeinterface #5, 1; //InterfaceMethod Job.execute:()Ljava/lang/Object;
   9: checkcast #6; //class java/lang/String
   12: astore_1
   13: aload_1
   14: invokestatic #7; //Method print:(Ljava/lang/String;)V
   17: return

private static void print(java.lang.String);
  Code:
   0: getstatic #8; //Field java/lang/System.out:Ljava/io/PrintStream;
   3: aload_0
   4: invokevirtual #9; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   7: return
}
Let's start our review with Scheduler's constructor, which was generated by the compiler. Some may have expected the generated constructor to be empty, but there's the job field to be initialized. The first few lines of the constructor are the same as if we had an empty class without any fields:
0: aload_0
1: invokespecial #1; //Method java/lang/Object."<init>":()V
Next, we can see the field initializer included in the constructor. First, the reference to the Scheduler instance is loaded with aload_0 again, because the previously loaded copy was popped from the stack by the invokespecial call.
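The effect of placing the initializer inside the constructor, right after the superclass constructor call, can be observed from plain Java. The sketch below is an illustration only; the names InitializerOrder, Parent, Child, and record are hypothetical and do not appear in the original example.

```java
public class InitializerOrder {

    static final StringBuilder LOG = new StringBuilder();

    /** Appends a step to the log and returns it, so it can be used in an initializer. */
    static String record(String step) {
        LOG.append(step).append(';');
        return step;
    }

    static class Parent {
        Parent() {
            record("Parent constructor");
        }
    }

    static class Child extends Parent {
        // The compiler places this initializer in Child's constructor,
        // after the call to the superclass constructor.
        String field = record("field initializer");

        Child() {
            record("Child constructor body");
        }
    }

    /** Creates a Child and returns the order in which the steps ran. */
    static String trace() {
        LOG.setLength(0);
        new Child();
        return LOG.toString();
    }

    public static void main(String[] args) {
        // prints "Parent constructor;field initializer;Child constructor body;"
        System.out.println(trace());
    }
}
```

The order matches the Scheduler constructor listing: first invokespecial on the superclass <init>, then the initializer code, then whatever the constructor body contains.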
On the next line, we can see the new instruction, which creates a new object of the type identified by the class reference in the constant pool. Indeed, the constant #2 refers to JobImpl. The reference to the newly created object is pushed onto the stack. The instructions that follow, invokespecial and putfield, will each pop that reference off the stack. Therefore, we need the reference to the JobImpl instance on the stack twice. This is done using the dup instruction.
5: new #2;            // create the instance of JobImpl
8: dup                // duplicate the reference on the stack
9: invokespecial #3;  // call "<init>" and pop one copy of the reference
12: putfield #4;      // store the object reference in the job field, and pop the stack
Opcode 0xBB: new
As you may have noticed, the new opcode only allocates the object and pushes a reference to it; in order to initialize the object, it is still required to call <init> on that object reference. In fact, the four-instruction sequence (new/dup/invokespecial/astore) is a common pattern when an object is new'ed and stored into a local variable. You can read bytecode faster if you remember this rule :)
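As a small illustration of that pattern, here is a sketch of a Java statement that typically compiles to the four-instruction sequence. The class NewObjectPattern and its build method are hypothetical, and the bytecode in the comments is what javac is expected to emit, not output captured for this article; running javap -c on the compiled class is the way to confirm it.

```java
public class NewObjectPattern {

    static String build() {
        // Expected bytecode for the next statement:
        //   new           java/lang/StringBuilder
        //   dup
        //   invokespecial java/lang/StringBuilder."<init>":()V
        //   astore_0      (slot 0, since this static method has no parameters)
        StringBuilder sb = new StringBuilder();
        sb.append("created");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints "created"
    }
}
```

In the Scheduler constructor the last instruction is putfield instead of astore, because the reference ends up in a field rather than in a local variable.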
invokeinterface (0xB9) and invokestatic (0xB8)
If we continue reading the bytecode listing for the Scheduler class, in the main() method we can see a couple of instructions related to method calls: invokeinterface and invokestatic. Scheduler's field job was declared using the Job interface, so all calls on it are interface calls. Therefore, we see the execute() method being called using the invokeinterface instruction rather than invokevirtual, which is explained later.
0: aload_0
1: getfield #4; //Field job:LJob;
4: invokeinterface #5, 1; //InterfaceMethod Job.execute:()Ljava/lang/Object;
invokestatic is used to call class methods, i.e., methods declared with the static keyword. The method is identified by a reference in the constant pool, so there's no need to load a target object reference onto the stack; only the parameters have to be passed in.
13: aload_1
14: invokestatic #7; //Method print:(Ljava/lang/String;)V
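The two call styles can be put side by side in a compact sketch. The names InvocationKinds, Greeter, PoliteGreeter, and shout are invented for this illustration, and the comments state which instruction javac is expected to choose for each call.

```java
import java.util.Locale;

public class InvocationKinds {

    interface Greeter {
        String greet(String name);
    }

    static class PoliteGreeter implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // A class method: callers use invokestatic and push only the argument.
    static String shout(String text) {
        return text.toUpperCase(Locale.ROOT) + "!";
    }

    static String run() {
        Greeter greeter = new PoliteGreeter();
        // The declared type is an interface, so this call is an invokeinterface:
        // the receiver is pushed first, then the argument.
        String greeting = greeter.greet("bytecode");
        // No receiver here, so this call is an invokestatic.
        return shout(greeting);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "HELLO, BYTECODE!"
    }
}
```

What decides the instruction is the declared type of the variable, not the runtime class of the object: greeter holds a PoliteGreeter, yet the call goes through the interface.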
invokevirtual (0xB6)
Further on in the Scheduler class bytecode listing, we find the print(..) method, which includes one more instruction related to method invocation on an object reference: invokevirtual. If you see the invokevirtual opcode, you can be pretty sure that the method is being called directly on the class instance, without using an interface, and that the method is not private. In our example, we can see that invokevirtual is called on an instance of the java.io.PrintStream class:
4: invokevirtual #9; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
invokespecial (0xB7)
In the example above, you probably spotted the invokespecial instruction in use. This instruction is also used to invoke an instance method on an object reference. Here's a good place for a question: what is the difference between invokespecial and invokevirtual?
The answer can be easily found if one reads the Java VM Spec carefully:
The difference between the invokespecial and the invokevirtual instructions is that invokevirtual invokes a method based on the class of the object. The invokespecial instruction is used to invoke instance initialization methods as well as private methods and methods of a superclass of the current class.
In other words, invokespecial is used to call methods without dynamic binding, in order to invoke a particular class's version of a method.
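The difference is easiest to see with an overridden method. In the sketch below (the classes SpecialVsVirtual, Base, and Derived are hypothetical), an ordinary call is dispatched on the runtime class of the object, while a super call is bound to the superclass version and is compiled to invokespecial.

```java
public class SpecialVsVirtual {

    static class Base {
        String name() {
            return "Base";
        }
    }

    static class Derived extends Base {
        @Override
        String name() {
            return "Derived";
        }

        // super.name() compiles to invokespecial: it runs Base.name()
        // regardless of the runtime class of this object.
        String superName() {
            return super.name();
        }
    }

    public static void main(String[] args) {
        Base ref = new Derived();
        // invokevirtual: the method is selected by the object's class.
        System.out.println(ref.name());                  // prints "Derived"
        System.out.println(((Derived) ref).superName()); // prints "Base"
    }
}
```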
Passing the Parameters to Methods, Returning Values
For the last topic in this post, let's look at how parameters are passed to methods and how the result is returned. For this example, we will use the JobImpl class introduced earlier in the article. Using javap -c -private JobImpl, the bytecode listing is printed as follows:
public class JobImpl extends java.lang.Object implements Job{
public JobImpl();
  Code:
   0: aload_0
   1: invokespecial #1; //Method java/lang/Object."<init>":()V
   4: return

public java.lang.Object execute();
  Code:
   0: aload_0
   1: invokespecial #2; //Method createRandomValue:()Ljava/lang/Integer;
   4: astore_1
   5: aload_0
   6: aload_1
   7: invokespecial #3; //Method incValue:(Ljava/lang/Integer;)Ljava/lang/Integer;
   10: areturn

private java.lang.Integer createRandomValue();
  Code:
   0: new #4; //class java/util/Random
   3: dup
   4: invokespecial #5; //Method java/util/Random."<init>":()V
   7: bipush 42
   9: invokevirtual #6; //Method java/util/Random.nextInt:(I)I
   12: invokestatic #7; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
   15: areturn

private java.lang.Integer incValue(java.lang.Integer);
  Code:
   0: aload_1
   1: invokevirtual #8; //Method java/lang/Integer.intValue:()I
   4: iconst_1
   5: iadd
   6: invokestatic #7; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
   9: areturn
}
The incValue(..) method is the one we're looking for: it takes a parameter and returns a value. The incValue(..) method is called from execute() using the invokespecial instruction because it is declared private (javac targeting Java 11 or later emits invokevirtual for such calls instead). So how is the parameter passed to incValue?
Let's read the execute() method code:
0: aload_0            // load the reference to this
1: invokespecial #2;  // call createRandomValue()
4: astore_1           // store the result to local variable #1
5: aload_0
6: aload_1            // load the value of local variable #1 to the stack
7: invokespecial #3;  //Method incValue:(Ljava/lang/Integer;)Ljava/lang/Integer;
10: areturn
Before calling invokespecial, the program loads the reference to this (aload_0) and then the value of local variable number 1 (aload_1) onto the stack. This is how the parameter is passed. invokespecial pops the stack once for the target object reference and once for each parameter it is about to consume, according to the method signature. That said, aload is the kind of instruction that prepares the parameters for a method call.
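To make the stack traffic tangible, the incValue(..) listing can be replayed by hand with an explicit stack. This is only a rough model under stated assumptions: the class OperandStackSketch is hypothetical, the real JVM keeps primitive ints on the operand stack without boxing, and the sketch uses Integer objects only because a Java Deque can hold nothing else.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class OperandStackSketch {

    /** Replays the bytecode of incValue(Integer) step by step. */
    static Integer incValue(Integer argument) {
        // Slot 0 would hold "this" in the real frame; slot 1 holds the parameter.
        Object[] locals = {null, argument};
        Deque<Object> stack = new ArrayDeque<>();

        stack.push(locals[1]);                      // aload_1

        Integer receiver = (Integer) stack.pop();   // invokevirtual Integer.intValue
        stack.push(receiver.intValue());            //   pops the receiver, pushes the int

        stack.push(1);                              // iconst_1

        int right = (Integer) stack.pop();          // iadd pops two ints
        int left = (Integer) stack.pop();
        stack.push(left + right);                   //   and pushes their sum

        int sum = (Integer) stack.pop();            // invokestatic Integer.valueOf
        stack.push(Integer.valueOf(sum));           //   pops the int, pushes the Integer

        return (Integer) stack.pop();               // areturn
    }

    public static void main(String[] args) {
        System.out.println(incValue(41)); // prints 42
    }
}
```

Each invoke step follows the rule described above: it pops what the method signature requires (plus the receiver for instance methods) and pushes the result, if any.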
To return a value, the program uses the areturn instruction, which returns an object reference from a method. The instruction is prefixed with the 'a' character to indicate that we are dealing with an object reference here. If we tried to return a value of type int, the instruction would have been ireturn. There are also lreturn, dreturn, and freturn, used for long, double, and float respectively. The mechanics of the instruction are as follows: the result is popped from the operand stack of the current frame and pushed onto the operand stack of the invoker's frame. The interpreter then returns control to the invoker.
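As a quick reference, the sketch below pairs Java return types with the return instruction javac is expected to emit for each. The class ReturnKinds and its methods are hypothetical; compiling it and running javap -c should show the instructions named in the comments.

```java
public class ReturnKinds {

    static int answer() {
        return 42;          // ireturn (also used for boolean, byte, char, and short)
    }

    static long bigAnswer() {
        return 42L;         // lreturn
    }

    static float ratio() {
        return 0.5f;        // freturn
    }

    static double preciseRatio() {
        return 0.5;         // dreturn
    }

    static String label() {
        return "answer";    // areturn: any object reference, including arrays
    }

    static void log() {
        // a void method ends with a plain "return" instruction
    }

    public static void main(String[] args) {
        log();
        System.out.println(label() + " = " + answer()); // prints "answer = 42"
    }
}
```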
Final Thoughts
In the two sections above, we had a chance to observe how objects are created at the bytecode level, what the opcodes for method invocation are, how parameters are passed to method invocations, and how the return value is passed back. The important part to understand is how the stack is involved in these operations, with the local variable table also taking part in the game.
Additional Resources
There are some great articles on this topic elsewhere. I'd like to give credit to Peter Haggar's article at developerWorks. That is a superb piece, though a little outdated by now. Also, there's Ted Neward's The Working Developers Guide to Java Bytecode, which explains all that I mentioned above, and even more. With all that said, even if such articles exist and you can find good coverage of the topic in the documentation of bytecode crunching utilities like ObjectWeb ASM, it is still worth referring to The Java Virtual Machine Specification in order to get the information "from the source."
Finally, check out our blog post about mastering Java byte code.