January 18, 2016

Java Streams: The Real Powerhouse in Java 8


Welcome to the new year of wonderful online presentations about software engineering, Java, best practices, tools and technology choices brought to you by the Virtual JUG – the online Java User Group. In this article we present a recap of the Virtual JUG presentation by Dr. Venkat Subramaniam on the topic of when to use Java streams and how to make the most of them.

About the Presentation

The first session in 2016 was a real treat! None other than the man himself, Dr. Venkat Subramaniam, delivered a fast-paced, astonishing presentation: Streams: The Real Powerhouse in Java 8. This was Venkat’s second time on Virtual JUG; the first time, he spoke about creating reactive applications. This session, however, was all about the Java 8 streams API: what streams represent, what the common code patterns around them are, and how to get the most out of them in your codebase.

Java 8 Introduces the Java Stream API

The official Java 8 release came with a myriad of features, the most prominent of which are undoubtedly lambdas and the Java stream API. Many projects upgraded to Java 8 just to leverage the sweet lambda syntax, or because existing frameworks updated themselves to use them. Java streams are no less important.

The whole idea of Java streams is to enable functional-style operations on streams of elements. A stream is an abstraction, not a data structure. It’s not a collection where you can store elements. The most important difference between a stream and a data structure is that a stream doesn’t hold the data. For example, you cannot point to a location in the stream where a certain element exists. You can only specify the functions that operate on that data. A stream is an abstraction of a non-mutable collection of functions applied in some order to the data.

When to Use Java Streams

Java streams represent a pipeline through which the data will flow and the functions to operate on the data. As such, they can be used in any number of applications that involve data-driven functions. In the example below, the Java stream is used as a fancy iterator:


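List<Integer> numbers = Arrays.asList(1, 2, 3, 4);
List<Integer> result = numbers.stream()
  .filter(e -> (e % 2) == 0)       // keep only the even values
  .map(e -> e * 2)                 // double each remaining value
  .collect(Collectors.toList());   // result is [4, 8]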


In this example we select only the even values by using the filter method, and double them by mapping with a function that doubles its input. What does this give us? The streams API gives us the power to specify a sequence of operations on the data in individual steps. We don’t specify any conditional processing code, we are not tempted to write large complex functions, and we don’t care about the data flow. In fact, we only bother ourselves with one data processing step at a time: we compose the functions, and the streams framework moves the data through them. The example above shows one of the most important patterns you’ll end up using with streams:

  • Raise a collection to a stream
  • Ride the stream: filter values, transform values, limit the output
  • Compose small individual operations
  • Collect the result back into a concrete collection
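
The steps in this pattern can be sketched as one small, self-contained program. The class name PipelineSketch, the helper firstEvenDoubled, and the sample numbers are illustrative assumptions, not code from the talk:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PipelineSketch {
    // Hypothetical helper: returns at most `max` even values from the input, each doubled.
    static List<Integer> firstEvenDoubled(List<Integer> numbers, int max) {
        return numbers.stream()                 // raise the collection to a stream
            .filter(e -> e % 2 == 0)            // ride the stream: keep the even values
            .map(e -> e * 2)                    // transform: double each value
            .limit(max)                         // limit the output
            .collect(Collectors.toList());      // collect back into a concrete list
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        System.out.println(firstEvenDoubled(numbers, 3)); // prints [4, 8, 12]
    }
}
```

Each step is a small operation on its own line, so adding or removing one does not disturb the others.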

Common Operations in Java Streams

In Java 8 you can easily obtain a stream from any collection by calling the stream() method. After that there are a couple of fundamental functions that you’ll encounter all the time.

Here are some common operations in Java streams:

  • Filter - returns a new stream that contains some of the elements of the original. It accepts a predicate that decides which elements should appear in the new stream and drops the rest. In imperative code we would use conditional logic to specify what should happen if an element satisfies the condition. In the functional style we don’t bother with ifs: we filter the stream and work only on the values we require.
  • Map - transforms the stream elements into something else. It accepts a function to apply to each and every element of the stream and returns a stream of the values that function produced. This is the bread and butter of the Java streams API: map allows you to perform a computation on the data inside a stream.
  • Reduce - (also sometimes called a fold) performs a reduction of the stream to a single value. Want to sum all the integer values in the stream? Use the reduce function. Want to find the maximum in the stream? Reduce is your friend.
  • Collect - is the way to get out of the streams world and obtain a concrete collection of values, like the list in the example above.
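
To make the four operations concrete, here is a minimal sketch that uses each of them on a list of integers. The class CommonOperations, its method names, and the sample values are assumptions made for illustration only:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CommonOperations {
    // filter: keep only the elements that satisfy the predicate
    static List<Integer> evens(List<Integer> values) {
        return values.stream().filter(v -> v % 2 == 0).collect(Collectors.toList());
    }

    // map: apply a function to every element and stream the results
    static List<String> labels(List<Integer> values) {
        return values.stream().map(v -> "#" + v).collect(Collectors.toList());
    }

    // reduce: fold all elements into a single value, starting from the identity 0
    static int sum(List<Integer> values) {
        return values.stream().reduce(0, Integer::sum);
    }

    // reduce without an identity yields an Optional, because the stream may be empty
    static int max(List<Integer> values, int ifEmpty) {
        return values.stream().reduce(Integer::max).orElse(ifEmpty);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(3, 8, 5, 2);
        System.out.println(evens(values));   // prints [8, 2]
        System.out.println(labels(values));  // prints [#3, #8, #5, #2]
        System.out.println(sum(values));     // prints 18
        System.out.println(max(values, -1)); // prints 8
    }
}
```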

Of course you won’t use all of these functions every time you encounter a stream, but you have them available to use at will.

Potential Issues With Java Streams

There are some caveats to using the Java streams API, though, and Venkat showed us a great example of stream processing getting a tad out of hand. Imagine we have the following class Person:


class Person { 
   Gender gender; String name; 
   public Gender getGender() { return gender; }
   public String getName() { return name; }
}
enum Gender { MALE, FEMALE, OTHER }


This is a typical Java bean with some getters on the fields. Now, suppose we have a list of these persons and want to get the list of uppercase names of all the “FEMALE” people in that list. Easy, right?


List<String> names = new ArrayList<>();
List<Person> people = …
people.stream()
  .filter(p -> p.getGender() == Gender.FEMALE)
  .map(Person::getName)
  .map(String::toUpperCase)
  .forEach(name -> names.add(name)); // mutates the shared list from inside the pipeline


The code reads so naturally: we just follow the specification of what we have to do at every step. The problem, though, is the mutation of shared state. We know nothing about the nature of the stream in our hands, and if the stream is parallel, several threads will add elements to the names list concurrently. ArrayList is not thread-safe, so this can lead to errors.

Using Java Stream Collect to Avoid Concurrency Errors

Instead, we should collect the stream into the resulting list, which makes concurrency and mutability the responsibility of the streams framework. Here’s an example of how to do so:


List<Person> people = …
List<String> names = people.stream()
  .filter(p -> p.getGender() == Gender.FEMALE)
  .map(Person::getName)
  .map(String::toUpperCase)
  .collect(Collectors.toList());
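
For a version you can run, here is a minimal sketch of the same pipeline on a parallel stream. The wrapper class CollectSketch, the Person constructor, and the sample people are assumptions added for illustration; the article's Person class declares only the fields and getters:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CollectSketch {
    enum Gender { MALE, FEMALE, OTHER }

    static class Person {
        private final String name;
        private final Gender gender;
        // Constructor is an assumption; the original bean does not show one.
        Person(String name, Gender gender) { this.name = name; this.gender = gender; }
        String getName() { return name; }
        Gender getGender() { return gender; }
    }

    // Invented sample data for the demonstration.
    static List<Person> samplePeople() {
        return Arrays.asList(
            new Person("Sara", Gender.FEMALE),
            new Person("Bob", Gender.MALE),
            new Person("Paula", Gender.FEMALE),
            new Person("Paul", Gender.MALE),
            new Person("Jill", Gender.FEMALE));
    }

    static List<String> upperCaseFemaleNames(List<Person> people) {
        return people.parallelStream()                    // parallel on purpose
            .filter(p -> p.getGender() == Gender.FEMALE)
            .map(Person::getName)
            .map(String::toUpperCase)
            .collect(Collectors.toList());                // no shared list is mutated by our code
    }

    public static void main(String[] args) {
        System.out.println(upperCaseFemaleNames(samplePeople())); // prints [SARA, PAULA, JILL]
    }
}
```

Because the source list is ordered, collect keeps the encounter order even when the stream runs in parallel, so the result is the same as in the sequential case.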


In general, the Collectors class provides almost all the primitives necessary to transform a stream into a concrete collection. One of the examples Venkat showed was the toMap() collector. You might wonder how an element can be transformed into the key-value pair required for the map. Easy: you specify one function that turns the element into the key and another function that creates the value. Here’s an example that collects the same stream of people into a map:


List<Person> people = …
Map<String, Person> names = people.stream()
  .collect(Collectors.toMap(p -> p.getName(), p -> p)); 


The first function given to the toMap method transforms the element into the key and the second to the value for the map.
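
One detail worth knowing: the two-argument toMap throws an IllegalStateException when two elements produce the same key. The three-argument overload takes a merge function that resolves such collisions. The sketch below shows both; the class ToMapSketch, the Person constructor, and the sample data are assumptions for illustration, and keeping the first person is just one possible merge policy:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapSketch {
    enum Gender { MALE, FEMALE, OTHER }

    static class Person {
        private final String name;
        private final Gender gender;
        // Constructor is an assumption; the original bean does not show one.
        Person(String name, Gender gender) { this.name = name; this.gender = gender; }
        String getName() { return name; }
        Gender getGender() { return gender; }
    }

    // Invented sample data in which two people share the name "Sara".
    static List<Person> sampleWithDuplicate() {
        return Arrays.asList(
            new Person("Sara", Gender.FEMALE),
            new Person("Bob", Gender.MALE),
            new Person("Sara", Gender.OTHER));
    }

    // Two-argument form: fails with IllegalStateException on a duplicate key.
    static Map<String, Person> byName(List<Person> people) {
        return people.stream()
            .collect(Collectors.toMap(p -> p.getName(), p -> p));
    }

    // Three-argument form: the merge function picks the value to keep (here, the first).
    static Map<String, Person> byNameKeepFirst(List<Person> people) {
        return people.stream()
            .collect(Collectors.toMap(p -> p.getName(), p -> p, (first, second) -> first));
    }

    public static void main(String[] args) {
        Map<String, Person> people = byNameKeepFirst(sampleWithDuplicate());
        System.out.println(people.size());                   // prints 2
        System.out.println(people.get("Sara").getGender());  // prints FEMALE
    }
}
```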

Intermediate and Terminal Operations

One of the virtues of Java streams is that they are lazily evaluated. Operations that return another stream, such as filter and map, are called intermediate. They are not evaluated when they are specified; instead, the computation happens only when the result of the operation is needed. This means that if we just write code like:


Stream<String> names = people.stream()
  .filter(p -> p.getGender() == Gender.FEMALE)
  .map(Person::getName)
  .map(String::toUpperCase);


None of the names will be collected or converted to upper case right away. When does the computation occur, you might ask? When a terminal operation is called. All operations that return something other than a stream are terminal. Operations like forEach, collect, and reduce are terminal. This laziness makes streams particularly efficient at handling large amounts of data.
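
A quick way to see the laziness for yourself is to count how often the mapping function runs before and after a terminal operation. This is a minimal sketch; the class LazinessSketch, the helper name, and the sample names are assumptions for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazinessSketch {
    // Returns {calls before collect, calls after collect} for the mapping function.
    static int[] mapCallsBeforeAndAfterCollect(List<String> names) {
        AtomicInteger calls = new AtomicInteger();  // counts invocations of the mapper

        // Only intermediate operations so far: nothing is evaluated yet.
        Stream<String> pipeline = names.stream()
            .map(name -> {
                calls.incrementAndGet();
                return name.toUpperCase();
            });
        int before = calls.get();

        // The terminal operation pulls every element through the pipeline.
        pipeline.collect(Collectors.toList());
        int after = calls.get();

        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] counts = mapCallsBeforeAndAfterCollect(Arrays.asList("Sara", "Paula", "Jill"));
        System.out.println(counts[0]); // prints 0
        System.out.println(counts[1]); // prints 3
    }
}
```

The counter is used only to observe the pipeline; it is an AtomicInteger so the sketch stays correct even if the stream were made parallel.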

On top of that, you can almost always try to parallelize stream processing by converting the stream into a parallel stream with the parallel() method. Note that although the stream doesn’t have to be parallelizable, the method to parallelize it is always there. So, depending on the internal nature of the stream, you may get performance benefits.
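
As a rough sketch of what switching to a parallel stream looks like, the example below runs the same stateless pipeline either sequentially or in parallel and gets the same answer both ways. The class ParallelSketch, the helper sumOfSquares, and the sample values are illustrative assumptions; whether the parallel variant is actually faster depends on the data and the workload:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class ParallelSketch {
    // Sums the squares of the values, optionally on a parallel stream.
    static long sumOfSquares(List<Integer> values, boolean parallel) {
        Stream<Integer> stream = values.stream();
        if (parallel) {
            stream = stream.parallel();     // same pipeline, parallel execution
        }
        return stream
            .mapToLong(Integer::longValue)  // switch to a primitive LongStream
            .map(v -> v * v)                // stateless function: safe to run in parallel
            .sum();
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4);
        System.out.println(sumOfSquares(values, false)); // prints 30
        System.out.println(sumOfSquares(values, true));  // prints 30
    }
}
```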

Final Thoughts

There are pitfalls to running every stream operation in parallel, because parallel streams by default use the common ForkJoinPool, which is shared across the whole JVM, to perform the operations in the background. Thus, you can easily make one particular stream pipeline a bit faster but sacrifice the performance of the rest of the JVM without even realizing it!

Naturally, solving problems with functional programming requires a different way of thinking. But with a bit of experimentation you can definitely get a handle on it. You may often struggle to come up with a functional solution, but once you find it, you realize that it’s not particularly complicated. And the next time, solving a similar problem will be much easier.

Additional Resources

One of the best resources for learning about streams is, surprisingly, the javadoc of the java.util.stream package. It will guide you through the common stream idioms and explain how streams are lazily evaluated, the difference between intermediate and terminal stream operations, the possibilities for parallelizing a stream, and so on.

Agile Learner is a website where you can access many more presentations by Venkat. It is a commercial resource, so you’d need to purchase access to the videos. But it might totally be worth it! We have previously published a one-page cheat sheet that covers the best practices for using lambdas, streams and Optionals in Java 8. Print it out and keep it handy near your desk to remember some typical idioms.

Want to learn more about the latest features in Java? Be sure to visit our resource hub, Exploring New Features in Java. It covers everything from JEPs, to JDK enhancement projects for releases since Java 8.

Interview with Venkat Subramaniam

After the session, Venkat joined me for our regular interview with the Virtual JUG speaker. Besides asking him about his favorite JVM language and the most terrible habit a programmer can have, we discussed a real-life experiment on the readability of imperative code vs. code written in the functional style. Venkat shares his opinion on the adoption of Java 8 and how he thinks teams should go about adding Java 8 streams to their code.

 
