December 6, 2023

How to Improve JPA Performance

Developers often cite subpar JPA performance as one of their biggest productivity hurdles. However, many of those issues can be mitigated by harnessing features in the latest JPA version to avoid problems and boost performance. Learn more in this blog post.

What is JPA?
Common Causes of JPA Performance Issues
Issue #1: Too Many SQL Queries
Issue #2: Updating Entities Individually
Issue #3: Processing Data in the Database
Final Thoughts on JPA Performance

What is JPA?

JPA, originally the Java Persistence API and now Jakarta Persistence, is the Jakarta EE (formerly Java EE) specification for mapping Java objects to relational databases. The latest version, JPA 3.1, was released in spring 2022.

📚 Further reading: Jakarta EE Overview

Common Causes of JPA Performance Issues

If you take a close look at performance issues in JPA, you will often find the same few root causes:

Using too many SQL queries to fetch the required entities from the database, i.e., the n+1 query problem
Updating entities one by one instead of using a single statement
Doing data-heavy processing on the Java side rather than the database side

Luckily, there is no need to suffer from these inefficiencies. JPA has always offered a way to handle these kinds of issues, and newer JPA versions have introduced some additional features that can be used to gain significant performance improvements.

Many of these features have been available since JPA 2.1, so you do not need the latest version, JPA 3.1, to benefit from them.

Issue #1: Too Many SQL Queries

Performing too many SQL queries to fetch all required entities is one of the most common causes of JPA performance issues.

If implemented incorrectly, even the most innocent-looking query can trigger dozens or hundreds of SQL queries to the database. As you will see in this section, the cause does not even have to be an explicit query; a couple of incorrectly configured annotations are enough.

Imagine the following piece of code in your project:


List<Author> authors = this.em.createQuery("SELECT a FROM Author a",
		Author.class).getResultList();

for (Author a : authors) {
	System.out.println("Author "
			+ a.getFirstName()
			+ " "
			+ a.getLastName()
			+ " wrote "
			+ a.getBooks()
					.stream()
					.map(b -> b.getTitle() + "("
							+ b.getReviews().size() + " reviews)")
					.collect(Collectors.joining(", ")));
}

The code snippet above prints out the names of all authors and the titles of their books, along with the number of reviews for each book. That snippet looks really simple. How many queries do you think are sent to the database? One? Maybe two (one for each type of entity)? The answer? It depends.

What is the n+1 Query Problem?

Using my small example database with only 11 authors and 6 books in it, this code triggers 12 queries: one to get all authors and 11 more to get the books of each of the 11 authors. This issue is known as the n+1 query problem, and it can easily occur with any library that you use for database access.

Worse, the number of queries grows with the size of the dataset, so the problem is exacerbated in production.
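
The arithmetic behind those 12 queries can be sketched with a toy simulation. It involves no JPA at all: NPlusOneDemo and QueryCountingDb are hypothetical names, the "database" is an in-memory map, and it merely counts how often it is asked for data. The lazy variant mimics one query for the authors plus one getBooks() query per author, while the single-query variant mimics fetching authors and books together.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class: compares query counts for lazy and single-query loading.
final class NPlusOneDemo {

    // Builds authors with ids 1..n; only the first six have a book (made-up data).
    static Map<Integer, List<String>> sampleData(int authors) {
        Map<Integer, List<String>> data = new TreeMap<>();
        for (int id = 1; id <= authors; id++) {
            data.put(id, id <= 6 ? List.of("Book " + id) : List.of());
        }
        return data;
    }

    // 1 query for the authors + 1 query per author for the books.
    static int lazyQueryCount(int authors) {
        QueryCountingDb db = new QueryCountingDb(sampleData(authors));
        for (int id : db.findAllAuthorIds()) {
            db.findBooksOf(id);
        }
        return db.queryCount();
    }

    // 1 query that returns authors and books together.
    static int singleQueryCount(int authors) {
        QueryCountingDb db = new QueryCountingDb(sampleData(authors));
        db.findAllAuthorsWithBooks();
        return db.queryCount();
    }

    public static void main(String[] args) {
        System.out.println("lazy loading: " + lazyQueryCount(11) + " queries");
        System.out.println("single query: " + singleQueryCount(11) + " queries");
    }
}

// Toy stand-in for a database (an assumption for illustration, not a JPA API).
final class QueryCountingDb {
    private final Map<Integer, List<String>> booksByAuthor;
    private int queryCount = 0;

    QueryCountingDb(Map<Integer, List<String>> booksByAuthor) {
        this.booksByAuthor = booksByAuthor;
    }

    // Mimics: SELECT a FROM Author a
    List<Integer> findAllAuthorIds() {
        queryCount++;
        return new ArrayList<>(booksByAuthor.keySet());
    }

    // Mimics the query a lazy getBooks() call triggers for one author.
    List<String> findBooksOf(int authorId) {
        queryCount++;
        return booksByAuthor.get(authorId);
    }

    // Mimics one query that fetches authors together with their books.
    Map<Integer, List<String>> findAllAuthorsWithBooks() {
        queryCount++;
        return booksByAuthor;
    }

    int queryCount() {
        return queryCount;
    }
}
```

With 11 authors the lazy variant reports 12 queries and the single-query variant reports 1. With 1,000 authors the lazy count would grow to 1,001 while the single-query count stays at 1.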

The good news is, we have multiple options to avoid this scenario by fetching all the required entities with one query. One of the newest and, from my point of view, the best way to solve this problem is to use a @NamedEntityGraph.

An entity graph specifies a graph of entities that shall be fetched from the database in a query-independent way. That means you create a standalone definition of an entity graph and combine it with a query when you need it. The snippet below shows how to define a @NamedEntityGraph that fetches the books of an author.


@Entity
@NamedEntityGraph(name = "graph.AuthorBooks", attributeNodes = @NamedAttributeNode("books"))
public class Author implements Serializable {
…
}

You can now provide this graph as a hint to the entity manager and get the authors and all their books in one query. As you have seen in the definition of the graph, I only provided the name of the property that contains the related entities. Therefore I use the @NamedEntityGraph as a loadgraph, so that all the other attributes are fetched with their defined fetch type, as follows:


EntityGraph<?> graph = this.em.getEntityGraph("graph.AuthorBooks");

// Hint name for JPA 2.x; with Jakarta Persistence 3.x use "jakarta.persistence.loadgraph"
List<Author> authors = this.em
		.createQuery("SELECT DISTINCT a FROM Author a", Author.class)
		.setHint("javax.persistence.loadgraph", graph).getResultList();

This example shows a very simple entity graph, and you will probably use more complex graphs in a real application. But this is not a problem. You can define more complex ones by declaring multiple @NamedAttributeNodes, and you can also use the @NamedSubgraph annotation to create a graph with multiple levels.

For some use cases you might also need a more dynamic way to define the entity graph, e.g. based on some input parameters. In these cases it makes more sense to use a Java API to programmatically define the EntityGraph.

Issue #2: Updating Entities Individually

Updating entities one by one is another common reason for performance issues in JPA. As Java developers, we are used to working with objects and thinking in an object-oriented way. While this is a good way to implement complex logic and applications, it is also a common cause of performance degradation when working with a database.

From an object-oriented point of view, it is perfectly acceptable to perform update and delete operations on the entities. But this is very inefficient when you have to update a huge set of entities. The persistence provider will create one update statement for each updated entity and send them to the database with the next flush operation.

SQL provides a more efficient way to do this. It allows you to construct an update statement that updates multiple entities at once. And you can do the same with the CriteriaUpdate and CriteriaDelete statements introduced in JPA 2.1.
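
The difference in statement count can be illustrated with a toy simulation. This is only a sketch with no JPA involved: BulkUpdateDemo and AuthorTable are hypothetical names, the "table" is an in-memory map, and the id >= 3 filter and the " - updated" suffix are picked purely for illustration. One method mimics a provider that sends one UPDATE per changed entity, the other mimics a single UPDATE with a WHERE clause.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongPredicate;
import java.util.function.UnaryOperator;

// Hypothetical demo class: compares statement counts for two update styles.
final class BulkUpdateDemo {

    // Toy "author" table (id -> first name) that counts executed statements.
    static final class AuthorTable {
        final Map<Long, String> firstNames = new LinkedHashMap<>();
        int statements = 0;

        // Mimics: UPDATE author SET first_name = ? WHERE id = ?
        void updateOne(long id, String newName) {
            statements++;
            firstNames.put(id, newName);
        }

        // Mimics: UPDATE author SET first_name = change(first_name) WHERE filter(id)
        int updateWhere(LongPredicate filter, UnaryOperator<String> change) {
            statements++;
            int rows = 0;
            for (Map.Entry<Long, String> row : firstNames.entrySet()) {
                if (filter.test(row.getKey())) {
                    row.setValue(change.apply(row.getValue()));
                    rows++;
                }
            }
            return rows;
        }
    }

    // Creates a table with ids 1..n and made-up names.
    static AuthorTable sampleTable(int authors) {
        AuthorTable table = new AuthorTable();
        for (long id = 1; id <= authors; id++) {
            table.firstNames.put(id, "Author" + id);
        }
        return table;
    }

    // Entity-by-entity style: one statement per author with id >= 3.
    static int statementsOneByOne(int authors) {
        AuthorTable table = sampleTable(authors);
        for (long id = 3; id <= authors; id++) {
            table.updateOne(id, table.firstNames.get(id) + " - updated");
        }
        return table.statements;
    }

    // Bulk style: a single statement covers every author with id >= 3.
    static int statementsBulk(int authors) {
        AuthorTable table = sampleTable(authors);
        table.updateWhere(id -> id >= 3, name -> name + " - updated");
        return table.statements;
    }

    public static void main(String[] args) {
        System.out.println("one by one: " + statementsOneByOne(11) + " statements");
        System.out.println("bulk:       " + statementsBulk(11) + " statements");
    }
}
```

For 11 authors the entity-by-entity style issues 9 statements and the bulk style issues 1. In a real application the entity-by-entity style would usually also have to load the entities first, which the sketch leaves out.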

If you have used criteria queries before, you will feel very familiar with the new CriteriaUpdate and CriteriaDelete statements. The update and delete operations are created in nearly the same way as the criteria queries introduced in JPA 2.0.

As you can see in the following code snippet, you need to get a CriteriaBuilder from the entity manager and use it to create a CriteriaUpdate object. This is done in a similar way to the CriteriaQuery. The main differences are the set methods which are used to define the update operations.


CriteriaBuilder cb = this.em.getCriteriaBuilder();
// create update
CriteriaUpdate<Author> update = cb.createCriteriaUpdate(Author.class);
// set the root class
Root<Author> a = update.from(Author.class);
// set update and where clause
update.set(Author_.firstName, cb.concat(a.get(Author_.firstName), " - updated"));
update.where(cb.greaterThanOrEqualTo(a.get(Author_.id), 3L));

// perform update
Query q = this.em.createQuery(update);
q.executeUpdate();

For CriteriaDelete operations, you just need to call the createCriteriaDelete method on the CriteriaBuilder to get a CriteriaDelete object and use it to define the FROM and WHERE parts of the query, similar to the previous example.

Issue #3: Processing Data in the Database

Another common source of performance problems is that we, as Java developers, tend to implement all the logic of our application in Java. Don’t get me wrong, there are lots of good reasons to do it this way. But there can also be good reasons to perform some part of the logic in the database and send only the result to the business tier.

There are multiple ways to perform logic in the database. You can do a lot of things with plain SQL, and if that is not enough, you can still call database-specific functions and stored procedures. Here I will take a closer look at stored procedures or, to be more precise, at the way you can call them.

There was no real support for it in JPA 2.0. Native queries were the only way you could call a stored procedure. This was changed with the introduction of @NamedStoredProcedureQuery and the more dynamic StoredProcedureQuery in JPA 2.1. In this post, I will focus on the annotation based definition of stored procedure calls with @NamedStoredProcedureQuery.

As you can see in the following code snippet, the definition of a @NamedStoredProcedureQuery is pretty straightforward. You need to define the name of the named query, the name of the stored procedure in the database, and the input and output parameters. In this example, I’m calling the stored procedure calculate with the input parameters x and y, and I expect the output parameter sum. Other supported parameter modes are INOUT for parameters that are used for both input and output, and REF_CURSOR to retrieve result sets.


@NamedStoredProcedureQuery(
	name = "calculate",
	procedureName = "calculate",
	parameters = {
		@StoredProcedureParameter(mode = ParameterMode.IN, type = Double.class, name = "x"),
		@StoredProcedureParameter(mode = ParameterMode.IN, type = Double.class, name = "y"),
		@StoredProcedureParameter(mode = ParameterMode.OUT, type = Double.class, name = "sum") })

The @NamedStoredProcedureQuery is used in a similar way to @NamedQuery. You need to provide the name of the query to the createNamedStoredProcedureQuery method of the entity manager to get a StoredProcedureQuery object for this query. This can then be used to set the input parameters with the setParameter methods and to call the stored procedure with the execute method afterwards.


StoredProcedureQuery query = this.em.createNamedStoredProcedureQuery("calculate");
query.setParameter("x", 1.23d);
query.setParameter("y", 4.56d);
query.execute();
Double sum = (Double) query.getOutputParameterValue("sum");

Final Thoughts on JPA Performance

JPA makes it very easy to store and retrieve data from a database. While this is great to get a project started quickly and to solve the vast majority of its requirements, it also makes it easy to implement a very inefficient persistence tier. Some of the most common problems include using too many queries to get the required data, updating entities one by one and implementing all of the logic within Java.
