July 3, 2014

Java Parallel Streams Are Bad for Your Health!

This post continues the series that started with sneaky default methods in interfaces, which when used unwisely can cause your application to turn into a code mess that you don't want to look at. Today we will look at parallel streams and how using them unwisely can cause problems.

Parallel Streams Overview

As we all know, Java 8 delivers three major features everyone is eager to use: lambdas, the stream API, and default methods in interfaces. Sadly, all of them can easily be abused and can actually be a detriment to your code if you add them to your toolbelt carelessly. If you want to get an overview of the pitfalls that await you with serial stream processing, check out this post on the jOOQ blog by Lukas Eder.

But for now let’s focus on the parallel execution that the stream API is praised for. Allegedly, it might speed up some tasks your application executes by utilizing multiple threads from the default ForkJoinPool.

A Problematic Example of Java Parallel Streams

Here’s a classic example of the awesomeness that parallel streams promise you. In this example we want to query multiple search engines and return the output from the first to reply.

public static String query(String question) {
  // base URLs of the search engines to query
  List<String> engines = new ArrayList<>();
  // get element as soon as it is available
  Optional<String> result = engines.stream().parallel().map((base) -> {
    String url = base + question;
    // open connection and fetch the result
    return WS.url(url).get();
  }).findAny();
  return result.get();
}

Nice, isn’t it? But let’s dig a bit deeper and check what happens in the background. Parallel streams are processed by the parent thread that ordered the operation and, additionally, by the threads in the JVM’s default fork-join pool: ForkJoinPool.commonPool().
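
To see which threads actually take part, a small sketch along these lines can help. The class name WhoRunsMyStream and the range size are made up for illustration, and the worker thread names it prints are what OpenJDK builds typically use rather than something the specification promises.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the original post.
final class WhoRunsMyStream {

    // Runs a trivial parallel stream and records the name of every thread
    // that processed at least one element.
    static Set<String> threadsUsed() {
        Set<String> names = ConcurrentHashMap.newKeySet();
        IntStream.range(0, 10_000)
                 .parallel()
                 .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        // Usually lists the calling thread plus some common pool workers.
        threadsUsed().forEach(System.out::println);
    }
}
```

On a multi-core machine the output usually mixes the calling thread with a few common pool workers, which is exactly the sharing the rest of this post worries about.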

However, an important aspect to notice here is that querying a search engine is a blocking operation. So at some point every worker thread will call the get() operation and sit right there waiting for the results to come back.

Hang on, isn’t this what we wanted in the first place? Instead of going through the list and waiting for each url to respond sequentially, we wait on all of the responses at the same time. Saving your time, just like using JRebel does (sorry couldn’t resist :-) ).

Parallel Waiting and ForkJoin

However, one side effect of such parallel waiting is that instead of just the main thread waiting, the ForkJoin pool workers wait too. And given the current ForkJoin pool implementation, which doesn’t compensate for workers stuck waiting by spawning fresh ones, at some point all the threads in ForkJoinPool.commonPool() will be exhausted.

This means that the next time you call the query method above while any other parallel stream processing is running, the performance of the second task will suffer!

However, don’t rush to blame the ForkJoinPool implementation. In a different use case you’d be able to give it a ManagedBlocker instance, so that it knows when to compensate for workers stuck in a blocking call, and get your scalability back.
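
For readers who haven’t met ManagedBlocker before, here is a minimal sketch of how a blocking call could be wrapped. The helper class BlockingCall and its run method are hypothetical names invented for this example; only ForkJoinPool.ManagedBlocker and ForkJoinPool.managedBlock come from the JDK.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.function.Supplier;

// Hypothetical helper: wraps a blocking call so the pool is told about it.
final class BlockingCall<T> implements ForkJoinPool.ManagedBlocker {
    private final Supplier<T> call;
    private T result;
    private boolean done;

    private BlockingCall(Supplier<T> call) {
        this.call = call;
    }

    @Override
    public boolean block() {
        // The potentially slow, blocking work happens here.
        result = call.get();
        done = true;
        return true; // no further blocking is necessary
    }

    @Override
    public boolean isReleasable() {
        return done;
    }

    // Runs the supplier through ForkJoinPool.managedBlock, which lets the
    // pool activate a spare worker while the current one is blocked.
    static <T> T run(Supplier<T> call) {
        BlockingCall<T> blocker = new BlockingCall<>(call);
        try {
            ForkJoinPool.managedBlock(blocker);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while blocking", e);
        }
        return blocker.result;
    }
}
```

Inside a stream lambda, the blocking part would then be written as something like BlockingCall.run(() -> fetch(url)), where fetch stands in for whatever blocking call you make. Whether the extra threads this may create are acceptable is a separate decision.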

Now, the interesting bit is that it doesn’t take a parallel stream with blocking calls to stall the performance of your system. Any long-running function used to map over a collection can produce the same issue.

Consider this example:

long a = IntStream.range(0, 100).parallel().mapToLong(x -> {
  // keep the worker busy for a long time on every element
  for (int i = 0; i < 100_000_000; i++) {
    System.out.println("X:" + i);
  }
  return x;
}).sum();

This code suffers from the same problem as our networking attempt. No lambda execution is instantaneous, and during all that time the workers won’t be available to other components of the system.

This means that any system that relies on parallel streams can suffer unpredictable latency spikes when someone else occupies the common ForkJoin pool.

Limiting Parallelism in ForkJoinPool

Indeed, if you’re creating an otherwise single-threaded program and know exactly when you intend to use parallel streams, then you might think that this issue is kind of superficial. However, many of us deal with web applications, various frameworks, and heavy application servers.

How can a server that is designed to host multiple independent applications, which do who knows what, offer you predictable parallel stream performance if it doesn’t control the inputs?

One way to do this is to limit the parallelism that the ForkJoinPool offers you. You can do it yourself by supplying the JVM option -Djava.util.concurrent.ForkJoinPool.common.parallelism=1, so that the common pool's parallelism is limited to one and no gain from parallelization can tempt you into using it incorrectly.
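
If you want to check what the running JVM actually ended up with, a tiny sketch like this one prints the common pool's target parallelism. The class name CommonPoolInfo is made up for the example; ForkJoinPool.getCommonPoolParallelism() is the JDK method doing the work.

```java
import java.util.concurrent.ForkJoinPool;

// Hypothetical demo class: reports the common pool's target parallelism.
final class CommonPoolInfo {

    static int commonParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        System.out.println("common pool parallelism: " + commonParallelism());
    }
}
```

Without the system property, the reported value is normally derived from the number of available processors, so running it with and without the property is a quick way to confirm that the setting took effect.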

Alternatively, a parallelStream() implementation that accepts the ForkJoinPool to run on might be a workaround for that. Unfortunately, the JDK does not currently offer one.
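
There is a commonly used trick that gets close, though it leans on behavior the stream API does not document: a parallel stream started from inside a task of a ForkJoinPool tends to run in that pool rather than in the common one. The sketch below assumes that behavior holds on your JDK; the class DedicatedPoolStream and the method sumInPool are hypothetical names for illustration.

```java
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

// Hypothetical demo class. Relies on undocumented behavior: a parallel
// stream started inside a ForkJoinPool task runs in that same pool.
final class DedicatedPoolStream {

    // Sums 1..1000 in the given pool and records which threads did the work.
    static int sumInPool(ForkJoinPool pool, Set<String> seenThreads) {
        Callable<Integer> work = () ->
            IntStream.rangeClosed(1, 1_000)
                     .parallel()
                     .peek(i -> seenThreads.add(Thread.currentThread().getName()))
                     .sum();
        try {
            // The caller blocks here; the stream itself runs on pool workers.
            return pool.submit(work).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("stream pipeline failed", e.getCause());
        }
    }
}
```

A caller would create its own pool, for example new ForkJoinPool(2), pass it in, and shut it down when finished. Because this is an implementation detail rather than a contract, treat it as a stopgap, not as a substitute for real API support.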

Final Thoughts

Parallel streams are unpredictable and complex to use correctly. Almost any use of parallel streams can affect the performance of other unrelated system components in an unpredictable way. I have no doubt that there are people who can manage to use them to their benefit, clearly and correctly. However, I’d think twice before typing stream.parallel() into my code and would look twice when reviewing the code containing it.

Do you think I'm over-dramatizing the issue? Let's talk on Twitter: @shelajev.

Additional Resources

Looking for additional ways to improve Java application performance? This webinar discusses some common performance-boosting techniques for Java microservices.

Looking for additional reading on using streams? This vJUG session from Dr. Venkat Subramaniam is a great place to start.