Java Streams Cheat Sheet
In our post today, we're focusing on one of the most important and substantial features in the Java 8 release, the Streams API. These operation pipelines are tricky, so having a Java Streams cheat sheet can help keep the operations straight. You can download the cheat sheet by clicking the image or button at the bottom of the page.
What Is a Java Stream?
Let’s start off with what a stream actually is and what a stream isn’t! Here are some important points which shape how you should think about a stream:
- A Stream is a pipeline of functions that can be evaluated
- A Stream can transform data
- A Stream is not a data structure
- A Stream cannot mutate data
Most importantly, a stream isn’t a data structure. You can often create a stream from collections to apply a number of functions on a data structure, but a stream itself is not a data structure. That’s so important, I mentioned it twice! A stream can be composed of multiple functions that create a pipeline that data that flows through. This data cannot be mutated. That is to say the original data structure doesn't change. However the data can be transformed and later stored in another data structure or perhaps consumed by another operation.
We stated that a stream is a pipeline of functions, or operations. These operations can either be classed as an intermediate operation or a terminal operation. The difference between the two is in the output which the operation creates. If an operation outputs another stream, to which you could apply a further operation, we call it an intermediate operation. However, if the operation outputs a concrete type or produces a side effect, it is a terminal type. A subsequent stream operation cannot follow a terminal operation, obviously, as a stream is not returned by the terminal operation!
An intermediate operation is always lazily executed. That is to say they are not run until the point a terminal operation is reached. We’ll look in more depth at a few of the most popular intermediate operations used in a stream.
- filter - the filter operation returns a stream of elements that satisfy the predicate passed in as a parameter to the operation. The elements themselves before and after the filter will have the same type, however the number of elements will likely change.
- map - the map operation returns a stream of elements after they have been processed by the function passed in as a parameter. The elements before and after the mapping may have a different type, but there will be the same total number of elements.
- distinct - the distinct operation is a special case of the filter operation. Distinct returns a stream of elements such that each element is unique in the stream, based on the equals method of the elements.
Here’s a table that summarises this, including a couple of other common intermediate operations.
|Function||Preserves count||Preserves type||Preserves order|
A terminal operation is always eagerly executed. This operation will kick off the execution of all previous lazy operations present in the stream. Terminal operations either return concrete types or produce a side effect. For instance, a reduce operation which calls the Integer::sum operation would produce an Optional
|Function||Output||When to use|
|reduce||concrete type||to cumulate elements|
|collect||list, map or set||to group elements|
|forEach||side effect||to perform a side effect on elements|
Java Stream Examples
Let’s take a look at a couple of examples and see what our functional code examples using streams would look like.
Exercise 1: Get the unique surnames in uppercase of the first 15 book authors that are 50 years old or older.
So you know, the source of our stream, library, is an ArrayList. Check out the code and follow along with the description. From this list of books, we first need to map from books to the book authors which gets us a stream of Authors and then filter them to just get those authors that are 50 or over. We’ll map the surname of the Author, which returns us a stream of Strings. We’ll map this to uppercase Strings and make sure the elements are unique in the stream and grab the first 15. Finally we return this as a list using toList from
library.stream() .map(book -> book.getAuthor()) .filter(author -> author.getAge() >= 50) .map(Author::getSurname) .map(String::toUpperCase) .distinct() .limit(15) .collect(toList()));
Did you notice that as we read through the code we can describe what it’s doing line by line. Imagine doing that with imperative code!
Exercise 2: Print out the sum of ages of all female authors younger than 25.
library.stream() .map(Book::getAuthor) .filter(author -> author.getGender() == Gender.FEMALE) .map(Author::getAge) .filter(age -> age < 25) .reduce(0, Integer::sum)
Using the same original stream we once again map the elements from Books to Authors and filter just on those authors that are female. Next we map the elements from Authors to author ages which gives us a stream of ints. We filter ages to just those that are less than 25 and use a reduce operation and Integer::sum to total the ages.
Parallel Java Streams
You can parallelise the work you do in a stream in a couple of ways. Firstly you can get a parallel stream directly from your source by calling the parallelStream() method directly as shown:
Alternatively, you can call an intermediate operation on an existing stream which spawns off threads and executes further operations in parallel, as shown:
One important thing to note is that parallel streams achieve parallelism through threads using the existing common ForkJoinPool. As a result there are possible complications as we detailed in this previous RebelLabs post. Using parallel streams can cause concurrency issues depending on what you’re doing in your stream as well. Make sure you need to use a parallel stream for a big enough job, rather than using them by default. Also given you’re using the common ForkJoinPool, be sure not to run any blocking operations.
A classic example of a potential concurrency issue when using parallel streams is when updating a shared mutable variables from a forEach operation. Let’s consider the following code:
myList = new ArrayList<>(); library.parallelStream().forEach(e -> myList.add(e));
Would you expect multiple threads to concurrently access an ArrayList without issue normally? Of course you wouldn’t! So you shouldn’t expect it to within a parallel stream. In fact, it’s best not to do this even in a non-parallel stream, just in case someone tries to make it parallel in future. It tends to be safer to collect this into a List using the collect() operation.
Anyway, don't want to hold you any longer, I know you're pretty convinced by now that Java 8 streams are paramount to the best practices of development and you can get all the information about them in this concise 1 page cheat sheet! Or, if you're looking for additional cheat sheets, we have a great Java cheat sheet collection available via the link.