Performance Antipatterns in Java | The N+1 Problem
In our latest Java developer report, over half of Java developers reported application performance requirements in development. This is a clear indication that developers are being asked to play a bigger role in application performance. But why? As applications (and the tools used to build them) get more complex, and data demands keep growing, it’s crucial that development teams produce applications that perform efficiently at baseline and at scale. But there are plenty of obstacles to that goal: performance antipatterns like the N+1 Problem can slow down or break even the best applications.
In this article, we’re looking at the N+1 problem, common culprits behind N+1 issues in Java, and how developers can easily identify and fix this issue during development.
Performance Antipatterns in Java
Performance antipatterns typically center around inefficient or superfluous queries that compound at load or scale. These patterns can occur for a variety of reasons, but the end result can range from poor performance to cascading failure.
The antipattern we’re looking at today is called the N+1 Problem. It's marked by a series of excess requests to a database and frequently accompanies object-relational mapping (ORM) tools.
What Is an N+1 Problem?
An N+1 Problem, also known as an N+1 Select Problem or N+1 Query, happens when a service requests a number of rows (N) of data from a database with one query, then requests the dependent data for each of those rows individually, issuing N more queries.
In the example from the distributed Spring PetClinic demo application below, we see the vet.specialties method requesting 24 rows from the database, then looping through and requesting 24 individual rows from the database with individual queries.
That’s 24 calls, plus the initial call to the database for 24 rows of data itself — or, succinctly, N + 1 calls. Those individual row calls, of course, could have been accomplished by a single call.
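To make the arithmetic concrete, here is a minimal sketch that counts queries for both access patterns. It does not use the PetClinic code or a real database: `FakeVetDatabase` and all of its methods are hypothetical stand-ins that only increment a counter each time a "query" is issued.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NPlusOneDemo {

    // Hypothetical in-memory stand-in for a database; it only counts "queries".
    static class FakeVetDatabase {
        private final Map<Integer, List<String>> specialtiesByVet = new HashMap<>();
        int queryCount = 0;

        FakeVetDatabase(int vetCount) {
            for (int id = 1; id <= vetCount; id++) {
                specialtiesByVet.put(id, List.of("specialty-" + id));
            }
        }

        // One query that returns all N vet ids.
        List<Integer> findAllVetIds() {
            queryCount++;
            return new ArrayList<>(specialtiesByVet.keySet());
        }

        // One query per vet: this is the call that gets repeated N times.
        List<String> findSpecialtiesForVet(int vetId) {
            queryCount++;
            return specialtiesByVet.get(vetId);
        }

        // One query that returns the specialties of every requested vet at once.
        Map<Integer, List<String>> findSpecialtiesForVets(List<Integer> vetIds) {
            queryCount++;
            Map<Integer, List<String>> result = new HashMap<>();
            for (int id : vetIds) {
                result.put(id, specialtiesByVet.get(id));
            }
            return result;
        }
    }

    // N+1 pattern: 1 query for the vets, then 1 query per vet.
    static int runNPlusOne(int vetCount) {
        FakeVetDatabase db = new FakeVetDatabase(vetCount);
        for (int vetId : db.findAllVetIds()) {
            db.findSpecialtiesForVet(vetId);
        }
        return db.queryCount;
    }

    // Batched pattern: 1 query for the vets, 1 query for all specialties.
    static int runBatched(int vetCount) {
        FakeVetDatabase db = new FakeVetDatabase(vetCount);
        db.findSpecialtiesForVets(db.findAllVetIds());
        return db.queryCount;
    }

    public static void main(String[] args) {
        System.out.println("N+1 queries:     " + runNPlusOne(24)); // 25
        System.out.println("Batched queries: " + runBatched(24));  // 2
    }
}
```

With 24 vets, the looped version issues 25 queries while the batched version issues 2, and the gap widens linearly as N grows.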
What Causes the N+1 Problem in Java Applications?
There are a few common culprits, and they typically have to do with ORM tools or frameworks and the way they generate queries.
1. ORM Frameworks
Object-oriented languages like Java often need to work with relational databases. That means either a developer or database administrator writes (optimized) SQL requests by hand, or they use an intermediary layer, like an ORM framework, that generates compatible requests for that database. While functionally great, ORM frameworks have a reputation for creating unoptimized queries, including N+1 queries.
2. Lazy Loading
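ORM frameworks like Hibernate can use FetchType.LAZY in their generated database requests, and it is the default for collection associations. Because a lazily loaded association is not fetched together with its parent, the framework issues a separate query for it each time the requesting service first accesses it on an entity. Looping over N entities and touching the association on each one therefore produces N additional queries.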
3. Developers and Database Administrators
The truth is that these ORM frameworks, while handy, can’t replace the honest efforts of a developer or database administrator writing the requests themselves. And, because ORM tools generate queries without much oversight from the developer (beyond checking that they work), developers often don’t see this issue until it comes up in production.
How to Fix N+1 Problems in Java Applications
In this example, we’ll look at how to identify an N+1 issue using the distributed version of the Spring PetClinic demo application and our distributed tracing tool, XRebel. XRebel will allow us to easily pinpoint a request that’s taking too long to complete.
If we investigate further in the application, we can see that we are spending about the same amount of time initializing the Pets as we are initializing the Visits within the OwnerController.process.find.form.
Finding the N+1 Problem
Granted, this request takes only 86.5ms to process. However, we can see that processing the two sets of query tables takes twice as long as it should. Next, we'll proceed to the I/O view, where we can further investigate the queries and determine whether there is an issue.
Looking at the queries, we can see that the find-all-owners call makes a single query that returns 13 rows of data.
Looking at the visits table, we see that we are fetching all the visits that a pet has. To fetch the visits, a separate query is issued for each pet. If, for example, an owner has 3 separate pets, then 3 additional queries will be executed to fetch the visit information for those pets. This is known as an N+1 problem: a single query fetches all N pets, and N additional queries fetch the visits of those pets.
I added a new owner, Spencer Last, and gave that owner three new pets: N+1, Dopey, and Jake. In the I/O view, we can see that the query count has increased to 16.
Fixing the N+1 Problem
To remedy this, we will change the fetching strategy in the Pet class from Eager to Lazy, so that a pet's visits are queried only when they are actually accessed.
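The effect of that switch can be sketched without an ORM. In a real JPA mapping, the strategy is set through the `fetch` attribute of the association annotation (`FetchType.EAGER` or `FetchType.LAZY`). The classes below are hypothetical stand-ins, not the PetClinic entities: `queryVisits` only counts how often a per-pet visits query would run.

```java
import java.util.List;
import java.util.function.Supplier;

class FetchStrategyDemo {
    static int queryCount = 0;

    // Hypothetical stand-in for a per-pet "select visits where pet_id = ?" query.
    static List<String> queryVisits(int petId) {
        queryCount++;
        return List.of("visit-for-pet-" + petId);
    }

    // Eager: the visits query runs as soon as the pet is loaded.
    static class EagerPet {
        final List<String> visits;

        EagerPet(int id) {
            this.visits = queryVisits(id);
        }
    }

    // Lazy: the visits query is deferred until getVisits() is first called.
    static class LazyPet {
        private final Supplier<List<String>> loader;
        private List<String> visits;

        LazyPet(int id) {
            this.loader = () -> queryVisits(id);
        }

        List<String> getVisits() {
            if (visits == null) {
                visits = loader.get(); // first access triggers the query
            }
            return visits;
        }
    }

    // Loading N eager pets costs N visits queries, used or not.
    static int loadEagerly(int petCount) {
        queryCount = 0;
        for (int id = 1; id <= petCount; id++) {
            new EagerPet(id);
        }
        return queryCount;
    }

    // Loading N lazy pets costs nothing if the page never shows visits.
    static int loadLazilyWithoutVisits(int petCount) {
        queryCount = 0;
        for (int id = 1; id <= petCount; id++) {
            new LazyPet(id);
        }
        return queryCount;
    }

    // Caveat: touching visits on every lazy pet brings the N queries back.
    static int loadLazilyAndReadVisits(int petCount) {
        queryCount = 0;
        for (int id = 1; id <= petCount; id++) {
            new LazyPet(id).getVisits();
        }
        return queryCount;
    }

    public static void main(String[] args) {
        System.out.println("eager:            " + loadEagerly(3));
        System.out.println("lazy, unused:     " + loadLazilyWithoutVisits(3));
        System.out.println("lazy, read in loop: " + loadLazilyAndReadVisits(3));
    }
}
```

The sketch also shows why lazy loading is listed among the culprits earlier: it only helps when the association is not read for every row. When the data is needed for all N items, fetching it in one joined or batched query is the usual alternative.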
After making the change, we return to the application, refresh the page, and look at our results.
When comparing the previous request with the updated code, we can immediately see that the time spent in the Loader.doQueryAndInitializeNonLazyCollections method trace has dropped by more than half.
We can also see in the XRebel Comparison view that the AbstractPersistentCollection.forceInitialization branch has been removed from the request call tree. Next, if we proceed to the I/O view, we can see that we have significantly reduced the number of queries to 7!
Now we are fetching the Owners and Pets in one query and the Pet types in another.
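When owners and pets come back from a single joined query, the result is a flat list of rows that is regrouped into owners with their pets in memory. The following is a rough sketch of that regrouping step only, under stated assumptions: `OwnerPetRow` is a hypothetical row type, the second owner in the sample data is invented for illustration, and this is not how Hibernate implements it internally. In JPQL, a `JOIN FETCH` clause is one common way to request such a joined result.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class JoinedRowsDemo {

    // Hypothetical row of an owners-joined-with-pets result set.
    static final class OwnerPetRow {
        final String ownerName;
        final String petName;

        OwnerPetRow(String ownerName, String petName) {
            this.ownerName = ownerName;
            this.petName = petName;
        }
    }

    // Sample rows as a single joined query might return them.
    static List<OwnerPetRow> sampleRows() {
        return List.of(
            new OwnerPetRow("Spencer Last", "N+1"),
            new OwnerPetRow("Spencer Last", "Dopey"),
            new OwnerPetRow("Spencer Last", "Jake"),
            new OwnerPetRow("Example Owner", "Rex")); // invented for illustration
    }

    // Groups the flat rows by owner, preserving the order of first appearance.
    static Map<String, List<String>> groupPetsByOwner(List<OwnerPetRow> rows) {
        Map<String, List<String>> petsByOwner = new LinkedHashMap<>();
        for (OwnerPetRow row : rows) {
            petsByOwner
                .computeIfAbsent(row.ownerName, name -> new ArrayList<>())
                .add(row.petName);
        }
        return petsByOwner;
    }

    public static void main(String[] args) {
        // One pass over one result set; no per-owner follow-up queries.
        System.out.println(groupPetsByOwner(sampleRows()));
    }
}
```

The trade-off is that a joined result repeats the owner columns on every pet row, so it moves more data per query in exchange for far fewer round trips.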
Looking for additional reading on solving your application performance issues? Be sure to check out our available Java resources:
- How To Improve JPA Performance
- JPA Performance Best Practices
- Testing Microservices in Java
- Exploring Java Hibernate
- SQL vs. NoSQL