Here's a quick post. I'm putting it here since it has been tremendously useful for me, and I hope it will help others as well.
There have been several questions on SO about the merge() and persist() operations. But I find that it is not just about one resulting in an UPDATE SQL command, and the other resulting in an INSERT SQL command.
Results of merge() and persist()
Instead of pasting relevant sections from the JPA 2.1 specification in this post, I've created a summary table of the results of the said operations. This is based on sections 3.2.2 (Persisting an Entity Instance) and 3.2.7.1 (Merging Detached Entity State) of the JPA 2.1 specification.
Operation | State    | Result
--------- | -------- | ---------------------------------
persist   | new      | becomes managed
persist   | managed  | ignored (but cascaded)
persist   | removed  | becomes managed
persist   | detached | throws exception or commit fails
Operation | State    | Result
--------- | -------- | ---------------------------------
merge     | new      | becomes managed
merge     | managed  | ignored (but cascaded)
merge     | removed  | throws exception or commit fails
merge     | detached | becomes managed
As you can see, the merge() and persist() operations treat new and managed entities the same way. They differ only in the way they treat removed and detached entities.
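To make the last two rows of each table concrete, here's a minimal sketch (a hypothetical Employee entity and an EntityManagerFactory emf, with resource-local transactions, are assumed):
EntityManager em1 = emf.createEntityManager();
Employee detached = em1.find(Employee.class, 1L);
em1.close(); // detached from here on

EntityManager em2 = emf.createEntityManager();
em2.getTransaction().begin();
// em2.persist(detached); // throws an exception, or the commit fails
Employee managed = em2.merge(detached); // state is copied to a managed instance
em2.getTransaction().commit();
em2.close();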
Closing Thoughts
So, the next time you think that persist() results in an INSERT, and merge() results in an UPDATE, think again!
Here's my take. I personally prefer to use the merge() operation to handle new, managed, and detached entities.
But for removed entities (which can only happen when you programmatically remove them), I use persist(). Then again, it's rare (in my experience) to remove an entity and then reverse its removal in the same persistence context.
In the first part of the series, I showed how transactions work in plain-vanilla JDBC. And then I showed how Spring manages JDBC-based transactions. In this second part of the series, I'll show how transactions work in plain-vanilla JPA first. And then show how Spring manages JPA-based transactions.
Funds Transfer
To help illustrate transactions, I'll be using the same case study of transferring funds from one bank account to another. Here, we show code snippets of debit, credit, and transfer methods.
... class BankAccountService {
    public void transfer(MonetaryAmount amount, ...) {
        debit(amount, ...);
        credit(amount, ...);
        ...
    }
    public void credit(MonetaryAmount amount, AccountId accountId) {
        ...
    }
    public void debit(MonetaryAmount amount, AccountId accountId) {
        ...
    }
    ...
}
JPA Transactions
In plain-vanilla JPA, transactions are started by calling getTransaction().begin() on the EntityManager. The code snippet below illustrates this.
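Something like this (a sketch; entityManagerFactory and the elided operations are placeholders):
import javax.persistence.EntityManager;
...
EntityManager em = entityManagerFactory.createEntityManager();
try {
    em.getTransaction().begin();
    ... // persist, merge, remove, etc.
    em.getTransaction().commit();
} catch (RuntimeException e) {
    if (em.getTransaction().isActive()) {
        em.getTransaction().rollback(); // undo changes
    }
    throw e;
} finally {
    em.close();
}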
Note that the EntityManager holds a persistence context from the point it is created, but no transaction is active until begin() is called. Until then, changes made through operations such as persist(), merge(), and remove() will not reach the database. Queries (e.g. find()) can still be performed.
Objects returned from queries can be changed, although the JPA specification is somewhat unclear about what happens to these changes when no transaction has been started.
Now, let's apply JPA to the funds transfer case study.
We define a BankAccount entity to handle the debit() and credit() behavior.
import javax.persistence.*;
@Entity
... class BankAccount {
    @Id ...;
    ...
    public void debit(MonetaryAmount amount) {...}
    public void credit(MonetaryAmount amount) {...}
    ...
}
We add an EntityManagerFactory to BankAccountService to enable the creation of EntityManagers when needed.
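A sketch of what the service might look like at this point (parameters simplified; from and to are assumed account IDs):
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;

... class BankAccountService {
    private EntityManagerFactory emf; // injected via constructor
    ...
    public void transfer(MonetaryAmount amount, AccountId from, AccountId to) {
        EntityManager em = emf.createEntityManager();
        try {
            em.getTransaction().begin();
            em.find(BankAccount.class, from).debit(amount);
            em.find(BankAccount.class, to).credit(amount);
            em.getTransaction().commit();
        } catch (RuntimeException e) {
            if (em.getTransaction().isActive()) {
                em.getTransaction().rollback();
            }
            throw e;
        } finally {
            em.close();
        }
    }
    ...
}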
The transfer, credit, and debit methods could sure use a template class (something like JdbcTemplate) to remove all the boilerplate code. Spring previously provided a JpaTemplate class, but it was deprecated as of Spring 3.1, in favor of native EntityManager usage (typically obtained through @PersistenceContext).
So, let's do just that — use EntityManager obtained through @PersistenceContext.
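A sketch of the reworked service (again, parameters simplified):
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

... class BankAccountService {
    @PersistenceContext
    private EntityManager em; // injected; no longer created (nor closed) by us
    ...
    public void transfer(MonetaryAmount amount, AccountId from, AccountId to) {
        em.getTransaction().begin(); // still managing the transaction by hand
        em.find(BankAccount.class, from).debit(amount);
        em.find(BankAccount.class, to).credit(amount);
        em.getTransaction().commit();
    }
    ...
}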
Our code is a little bit simpler. Since we didn't create an EntityManager, we don't have to close it. But we are still calling getTransaction().begin(). Is there a better way? And how does an EntityManager get injected into the object in the first place?
From my previous post in this series, the astute reader is probably already thinking of having Spring do the work for us. And rightfully so!
EntityManager and @PersistenceContext
We tell Spring to inject an EntityManager from the EntityManagerFactory by adding a PersistenceAnnotationBeanPostProcessor (either through XML <bean>, or simply using a Java-based configuration via @Configuration classes loaded via AnnotationConfigApplicationContext).
When using XML-based configuration, a PersistenceAnnotationBeanPostProcessor is transparently activated by the <context:annotation-config /> element. And this element also gets transparently activated by <context:component-scan />.
When using Java-based @Configuration, the AnnotationConfigApplicationContext is used. And with it, annotation config processors are always registered (one of which is the aforementioned PersistenceAnnotationBeanPostProcessor).
By adding a single bean definition, the Spring container will act as a JPA container and inject an EntityManager from your EntityManagerFactory.
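With Java-based configuration, that single bean definition might look something like this (a sketch; the persistence unit name "bank" is a placeholder):
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class JpaConfig {
    @Bean
    public EntityManagerFactory entityManagerFactory() {
        // the automatically registered PersistenceAnnotationBeanPostProcessor
        // finds this factory and uses it for @PersistenceContext injection
        return Persistence.createEntityManagerFactory("bank");
    }
}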
JPA and @Transactional
Now that we have an EntityManager, how can we tell Spring to begin transactions for us?
We tell Spring to start transactions by marking methods as @Transactional (or marking the class as @Transactional, which makes all public methods transactional). This is consistent with the way Spring enables transactions with JDBC.
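With that, the service becomes something like this (a sketch; parameters simplified):
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.springframework.transaction.annotation.Transactional;

@Transactional
... class BankAccountService {
    @PersistenceContext
    private EntityManager em;
    ...
    public void transfer(MonetaryAmount amount, AccountId from, AccountId to) {
        em.find(BankAccount.class, from).debit(amount);
        em.find(BankAccount.class, to).credit(amount);
    }
    ...
}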
Wow, that was nice! Our code just got a lot shorter.
And just as explained in the first part of this series, when Spring encounters this annotation, it proxies the object (usually referred to as a Spring-managed bean). The proxy starts a transaction (if there is no on-going transaction) for methods that are marked as @Transactional, and ends the transaction when the method returns successfully.
A call to debit() will use a transaction. A separate call to credit() will use a transaction. But what happens when a call to transfer() is made?
Since the transfer() method is marked as @Transactional, Spring will start a transaction. This same transaction will be used for calls to debit() and credit(). In other words, debit(amount) and credit(amount) will not start a new transaction. It will use the on-going transaction (since there is one).
But wait! How does Spring know when to inject a proper entity manager? Is it only injected when a transactional method is invoked?
Shared EntityManager
In one of my training classes, I tried the following to better understand how Spring injects an EntityManager via @PersistenceContext. And I believe it will help others too. So, here's what I tried:
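(A sketch; the class and method names are illustrative.)
import javax.annotation.PostConstruct;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

public class EntityManagerInspector {
    @PersistenceContext
    private EntityManager em;

    @PostConstruct
    public void printEntityManager() {
        System.out.println(em); // what exactly did Spring inject?
    }
}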
An output of something like this was displayed on the console after the application context started.
Shared EntityManager proxy for target factory [...]
So what is this shared entity manager?
When the application context starts, Spring injects a shared entity manager. The shared EntityManager will behave just like an EntityManager fetched from an application server's JNDI environment, as defined by the JPA specification. It will delegate all calls to the current transactional EntityManager, if any; otherwise, it will fall back to a newly created EntityManager per operation.
Going back to our question. Spring doesn't inject the right entity manager at the right time. It always injects a shared entity manager. But this shared entity manager is transaction-aware. It delegates to the current transactional EntityManager, if there is an on-going transaction.
Conclusion
This concludes the two-part series. I hope that by starting off with the plain-vanilla versions of JDBC and JPA (sans DAOs and repositories), I was able to make it clearer how Spring is able to manage transactions behind the scenes. And that by having a clearer idea of what Spring is doing behind the scenes, you can troubleshoot better, understand why you get a TransactionRequiredException saying "No transactional EntityManager available", and apply better fixes to your applications.
I've been meaning to write about this for quite some time now. In the training courses I've been privileged to conduct, I've noticed that course participants have had the most difficulty trying to understand how Spring manages transactions. In the first part of this series, I'll start by showing how transactions work in plain-vanilla JDBC. And then show how Spring manages JDBC-based transactions.
In the second part of this series, I'll show how transactions work in plain-vanilla JPA. And then show how Spring manages JPA-based transactions.
Funds Transfer
To help illustrate transactions, I'll be using the often-used case study of transferring funds from one bank account to another. Here, we show code snippets of the debit, credit, and transfer methods.
... class BankAccountService {
    public void transfer(MonetaryAmount amount, ...) {
        debit(amount, ...);
        credit(amount, ...);
        ...
    }
    public void credit(MonetaryAmount amount, AccountId accountId) {
        ...
    }
    public void debit(MonetaryAmount amount, AccountId accountId) {
        ...
    }
    ...
}
JDBC Transactions
In plain-vanilla JDBC, transactions are started by setting the Connection's auto-commit mode to false (or to manual commit mode). Once the Connection's auto-commit is set to false, subsequent database changes are not committed until a call to commit() is made. The changes brought about by the SQL operations will not be seen by other connections until the changes are committed. The code snippet below illustrates this.
import java.sql.*;
...
try (Connection conn = dataSource.getConnection()) {
    conn.setAutoCommit(false);
    try {
        ... // SQL operations here (inserts, updates, deletes, etc)
        // database changes will not be saved until commit
        conn.commit(); // save changes (all-or-nothing)
    } catch (SQLException e) {
        // rollback changes; note that conn is still in scope here,
        // unlike in a catch block placed outside the try-with-resources
        conn.rollback();
        throw e;
    }
}
Now, let's apply this to the funds transfer case study.
The transfer() method is now overloaded with an additional connection parameter.
import java.sql.*;
...
... class BankAccountService {
    public void transfer(MonetaryAmount amount, ...)
            throws SQLException {...}
    public void transfer(
            MonetaryAmount amount, ..., Connection conn)
            throws SQLException {...}
    ...
}
The method that does not accept a connection creates a new connection, and calls the same method that accepts a connection parameter. The method that accepts a connection uses it to carry out the debit and credit operations.
The debit() and credit() methods are also overloaded. Just like the transfer() methods, the one that does not accept a connection parameter creates a new connection, and calls the one that accepts a connection object. The method that accepts a connection uses it to carry out SQL operations.
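A sketch of one such overloaded pair (the SQL, and the MonetaryAmount and AccountId accessors, are assumptions):
public void credit(MonetaryAmount amount, AccountId accountId)
        throws SQLException {
    try (Connection conn = dataSource.getConnection()) {
        conn.setAutoCommit(false);
        try {
            credit(amount, accountId, conn); // delegate to the overload
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }
}

public void credit(MonetaryAmount amount, AccountId accountId, Connection conn)
        throws SQLException {
    // uses the given connection; the caller controls commit/rollback
    try (PreparedStatement stmt = conn.prepareStatement(
            "UPDATE bank_account SET balance = balance + ? WHERE id = ?")) {
        stmt.setBigDecimal(1, amount.asBigDecimal()); // assumed accessor
        stmt.setLong(2, accountId.longValue());       // assumed accessor
        stmt.executeUpdate();
    }
}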
Notice that we'll need to pass connection objects all over the place. Yes, this looks ugly. But stay with me for a few more minutes and read on.
A pattern emerges from these methods. Whenever a database transaction needs to be made, a connection is either provided or established, then used to carry out SQL operations (or database changes), and the changes are committed (or rolled back) by the method that established the connection.
The method accepting a connection object gives up control of the transaction to its caller (as in the case of credit(amount) calling credit(amount, connection)). The scope of transactions can span one method call, or several method calls (as in the case of transfer(amount) calling debit(amount, connection) and credit(amount, connection)).
As already mentioned, this looks ugly and error-prone. The connection object is passed all over the place. So, how can we improve on this?
Spring-managed JDBC Transactions
First, we'll use Spring's JdbcTemplate to help us deal with JDBC boilerplate code and SQLExceptions. Second, we'll use Spring's declarative transaction management.
When we start using a JdbcTemplate inside BankAccountService, we no longer need to handle connections explicitly, and we can remove those overloaded methods. The connection object actually gets hidden away in a thread-local object.
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.transaction.annotation.Transactional;
@Transactional
... class BankAccountService {
    private JdbcTemplate jdbcTemplate; // injected via constructor
    ...
    public void transfer(MonetaryAmount amount, ...) {
        ...
    }
    public void credit(MonetaryAmount amount, AccountId accountId) {
        jdbcTemplate.update(...); // SQL operations for credit
        ...
    }
    public void debit(MonetaryAmount amount, AccountId accountId) {
        jdbcTemplate.update(...); // SQL operations for debit
        ...
    }
    ...
}
How does JdbcTemplate get a Connection?
For each method call, JdbcTemplate retrieves a connection from the DataSource provided (either as a constructor argument, or via the setDataSource(DataSource) property). After the JdbcTemplate method returns, the connection is closed. Without transactions, the connection's auto-commit mode remains true. The good thing is, JdbcTemplate is transaction-aware. That means it will participate in the on-going transaction, if there is one (i.e. use the on-going transaction's connection).
So, how do we set the connection's auto-commit mode to false, so that a transaction can be started and span more than one method call to JdbcTemplate?
To use transactions, we do not set the connection's auto-commit mode to false. Instead, we tell Spring to start transactions. In turn, Spring will retrieve a connection, set its auto-commit mode to false, and keep using this same connection until the transaction is completed (either with all changes saved/committed, or all changes rolled back).
So, how do we tell Spring to start transactions, and what does @Transactional do?
As you might have guessed, the @Transactional annotation tells Spring when to start a transaction (by setting the connection's auto-commit mode to false).
We tell Spring to start transactions by marking methods as @Transactional (or marking the class as @Transactional, which makes all public methods transactional). And for Spring to start noticing @Transactional annotations, we'll need to add <tx:annotation-driven /> in our XML-based configuration, or add @EnableTransactionManagement in our Java-based @Configuration.
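As a sketch, the Java-based configuration might look something like this (the DataSource bean is assumed to be defined elsewhere):
import javax.sql.DataSource;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.datasource.DataSourceTransactionManager;
import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.annotation.EnableTransactionManagement;

@Configuration
@EnableTransactionManagement
public class TransactionConfig {
    @Bean
    public PlatformTransactionManager transactionManager(DataSource dataSource) {
        // handles the connection: auto-commit false, then commit or rollback
        return new DataSourceTransactionManager(dataSource);
    }

    @Bean
    public JdbcTemplate jdbcTemplate(DataSource dataSource) {
        return new JdbcTemplate(dataSource);
    }
}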
When Spring encounters this annotation, it proxies the object (usually referred to as a Spring-managed bean). The proxy starts a transaction (if there is no on-going transaction) for methods that are marked as @Transactional, and ends the transaction when the method returns successfully.
Since JdbcTemplate is transaction-aware, it knows how to get the on-going transaction's connection (if one exists), and not create a new one.
...
import org.springframework.transaction.annotation.Transactional;
@Transactional
... class BankAccountService {
    ...
    public void transfer(MonetaryAmount amount, ...) {
        ...
        debit(amount, ...);
        credit(amount, ...);
        ...
    }
    public void credit(MonetaryAmount amount, AccountId accountId) {...}
    public void debit(MonetaryAmount amount, AccountId accountId) {...}
    ...
}
So, a call to debit() will use a transaction. A separate call to credit() will use a transaction. But what happens when a call to transfer() is made?
Since the transfer() method is marked as @Transactional, Spring will start a transaction. This same transaction will be used for calls to debit() and credit(). In other words, debit(amount) and credit(amount) will not start a new transaction. It will use the on-going transaction (since there is one).
Whenever I search the web for storing and retrieving BLOBs to and from a database, I usually get a complicated sample that uses byte arrays (byte[]) and JDBC code. Since I usually deal with web-based enterprise applications, I need to store BLOBs (e.g. images, spreadsheets, documents, PDFs) to a database and retrieve them later (e.g. as an <img> tag in an HTML page, or downloaded from a URL).
JDBC and BLOBs
The typical way of storing BLOBs with JDBC is shown below. Note that the created objects need to be closed properly to prevent resource leaks.
Connection connection = ...getConnection();
try {
    PreparedStatement stmt =
        connection.prepareStatement("INSERT...");
    try {
        // Sometimes, a byte[] is used here :(
        InputStream bytes = new FileInputStream("..."); // e.g. image file
        try {
            ... // other parameters set
            // Unsure on how to create a Blob and use setBlob(...)
            stmt.setBinaryStream(..., bytes);
            stmt.execute();
        } finally {
            bytes.close();
        }
    } finally {
        stmt.close();
    }
} finally {
    connection.close();
}
Note that the code above is possibly an over-simplification. Most often, the BLOB is embedded with another persistent object (e.g. a Person with a photo that is stored as a BLOB).
To retrieve the BLOB, the following code is typically seen. Note that the BLOB is only available while the JDBC connection stays open. Thus, some developers resort to using byte arrays (byte[]) to temporarily store and transfer them, before the connection gets closed.
Connection connection = ...getConnection();
try {
    PreparedStatement stmt =
        connection.prepareStatement("SELECT...");
    try {
        ... // other parameters set
        ResultSet rs = stmt.executeQuery();
        try {
            InputStream bytes = rs.getBinaryStream(...);
            // Sometimes, a byte[] is used with rs.getBytes() :(
            try {
                // Must use bytes before JDBC connection gets closed
            } finally {
                bytes.close();
            }
        } finally {
            rs.close();
        }
    } finally {
        stmt.close();
    }
} finally {
    connection.close();
}
The astute reader can see how error-prone BLOB handling is with JDBC. The lesser-known Connection.createBlob() (introduced in Java 6 / JDBC 4.0) and ResultSet.getBlob(...) are not often used.
File Systems and BLOBs
Another often-seen way of handling BLOBs in Java is to store them as files in a file system. But this has some drawbacks, like the need to generate unique file names, and possible untracked tampering of file contents. In some cases, storing them as files may not be permitted when running in a server with a security manager (e.g. a servlet container like Tomcat with a SecurityManager). The security manager protects the server from trojan servlets, JSPs, JSP beans, tag libraries, or even inadvertent mistakes (e.g. <% System.exit(1); %> in a JSP).
I find the above handling of BLOBs to be over-complicated. Because of this, I wanted to make BLOB handling easier, and less error-prone. I was also inspired by how Google's AppEngine for Java handles BLOBs. Thus, I started looking at the BLOB-related code I have worked with over the past years, and came up with a simple API I call jBlubble. The main interface is BlobstoreService.
The code is available at my GitHub account. As of this writing, a JDBC-implementation of the API is available. Other implementations are also welcome.
The createBlob methods abstract the JDBC-related code of using java.sql.Blob, and make sure allocated resources are closed properly. To store BLOBs uploaded to a webapp, the code looks something like:
// Servlet configured to support multipart/form-data
// HttpServletRequest
request.getPart("...").write(fileName);
// Open an input stream with the file (created from uploaded part)
InputStream in = new FileInputStream(fileName);
try {
    ... blobKey = blobstoreService.createBlob(in, ...);
} finally {
    in.close();
}
To serve the previously stored BLOB, the code looks something like:
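Something like this, in a servlet (a sketch; serveBlob and getBlobInfo stand in for the higher-level serving method described further below, and may not match the actual API names):
// in a servlet's doGet(request, response)
String blobKey = request.getParameter("id");
BlobInfo blobInfo = blobstoreService.getBlobInfo(blobKey);
response.setContentType(blobInfo.getContentType());
blobstoreService.serveBlob(blobKey, response.getOutputStream());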
In some cases, like when generating reports, the generated output can be stored as a BLOB too, like so:
... JasperPrint print = ...;
... JRExporter exporter = ...; // new JRPdfExporter();
... blobKey = blobstoreService.createBlob(
    new BlobstoreWriteCallback() {
        @Override
        public long writeToOutputStream(OutputStream out) ... {
            ...
            exporter.setParameter(JRExporterParameter.JASPER_PRINT, print);
            exporter.setParameter(JRExporterParameter.OUTPUT_STREAM, out);
            exporter.exportReport();
            ...
        }
    }, ...);
// generated blobKey can then be
// used to identify the generated report
Since report generation can take its time (especially when the report has several pages), it is a good idea to store the report in a database, rather than directly writing to a servlet output stream. When the report is complete, it can be downloaded (and re-downloaded) anytime.
The key design consideration was to abstract the persistence-related specifics (like JDBC) and use a simpler API. A callback interface was used to support writing to OutputStreams and to ensure that they get closed properly. A higher-level method was used to serve the BLOBs directly to an output stream. The BlobInfo contains a timestamp field that can be used to support the Last-Modified HTTP header for caching purposes. At the moment, BLOBs are immutable and cannot be updated. To change one, create a new BLOB, delete the old one, and reference the new one.
Instead of embedding the BLOB with their related persistent objects (or entities), I find it better to simply reference the BLOB. So, instead of this...
@Entity
public class Person {
    @Id private Long id;
    @Lob byte[] photo;
    ...
}
... I find it better to just reference the BLOB like this.
@Entity
public class Person {
    @Id private Long id;
    String photoId; // retrieved via BlobstoreService
    ...
}
I've started replacing existing BLOB handling code with the API we've developed. So far, it has simplified things. I hope this will help others too. Now, it's time for a cold one.
In the training courses I've given, providing concrete examples (not just UML diagrams) helps the course participants understand better. In this post, I'm sharing a Java implementation of the revenue recognition problem (as described in Martin Fowler's PoEAA book). The problem is used to explain three domain logic patterns: transaction script, table module, and domain model.
Revenue Recognition Problem
Just in case you haven't read the book, here's an excerpt from the book that explains the revenue recognition problem, taken from Martin Fowler's PoEAA book (page 112).
Revenue recognition is a common problem in business systems. It's all about when you can actually count the money you receive on your books. If I sell you a cup of coffee, it's a simple matter: I give you the coffee, I take your money, and I count the money on the books that nanosecond. For many things it gets complicated, however. Say you pay me a retainer to be available that year. Even if you pay me some ridiculous fee today, I may not be able to put it on my books right away because the service is to be performed over the course of a year. One approach might be to count only one-twelfth (1/12) of that fee for each month in the year, since you might pull out of the contract after a month when you realize that writing has atrophied my programming skills.
The rules for revenue recognition are many, various, and volatile. Some are set by regulation, some by professional standards, and some by company policy. Revenue tracking ends up being quite a complex problem.
I don't fancy delving into the complexity right now, so instead we'll imagine a company that sells three (3) kinds of products: word processors, databases, and spreadsheets. According to the rules, when you sign a contract for a word processor, you can book all the revenue right away. If it's a spreadsheet, you can book one-third (1/3) today, one-third in sixty (60) days, and one-third in ninety (90) days. If it's a database, you can book one-third today, one-third in thirty (30) days, and one-third in sixty (60) days. There's no basis for these rules other than my own fevered imagination. I'm told that the real rules are equally rational.
Emphasis, formatting, and numbers added for clarity
Domain Logic Patterns Example (Java-implementation)
I used the same integration test (AbstractRevenueRecognitionServiceFacadeTests) and applied it against three different implementations: transaction script, table module, and domain model.
Here are the three (3) patterns:
transaction script — organizes business logic by procedures where each procedure handles a single request from the presentation.
domain model — an object model of the domain that incorporates both behavior and data.
table module — a single instance that handles the business logic for all rows in a database table or view.
For transaction script and table module implementations, I used table data gateways.
For domain model, I used data mapper (via JPA/Hibernate). I tried to stick as close as possible to the database schema provided. In fact, all three patterns work with the same database schema.
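To give a flavor of the transaction script approach, here's a condensed sketch (the gateway names and product-type codes are illustrative, not the actual code in the repository):
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.LocalDate;

public class CalculateRevenueRecognitions {
    private ContractGateway contractGateway;              // table data gateways
    private RevenueRecognitionGateway recognitionGateway;

    public void calculateRevenueRecognitions(long contractId) {
        Contract contract = contractGateway.findById(contractId);
        BigDecimal total = contract.getRevenue();
        LocalDate signed = contract.getDateSigned();
        String type = contract.getProductType();
        if ("S".equals(type)) {        // spreadsheet: 1/3 now, +60, +90 days
            insertThirds(contractId, total, signed, 60, 90);
        } else if ("D".equals(type)) { // database: 1/3 now, +30, +60 days
            insertThirds(contractId, total, signed, 30, 60);
        } else {                       // word processor: book everything today
            recognitionGateway.insert(contractId, total, signed);
        }
    }

    private void insertThirds(long contractId, BigDecimal total,
            LocalDate signed, int secondOffset, int thirdOffset) {
        BigDecimal third = total.divide(BigDecimal.valueOf(3), 2, RoundingMode.DOWN);
        recognitionGateway.insert(contractId, third, signed);
        recognitionGateway.insert(contractId, third, signed.plusDays(secondOffset));
        // put any remainder from rounding into the last recognition
        recognitionGateway.insert(contractId,
                total.subtract(third).subtract(third), signed.plusDays(thirdOffset));
    }
}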
The transaction script implementation can be found under the revenue.recognition.transaction.script package. The table module implementation is under the revenue.recognition.table.module package. The domain model is found under the following packages: revenue.recognition.interfaces.facade.internal, revenue.recognition.application.impl, and revenue.recognition.domain.model.
Please feel free to review the implementation, and let me know of any items I've missed.
Note that having three (3) different patterns in one application is unnecessary (and quite confusing). I've only done this to better compare the different approaches to organizing domain logic.
"Program to an 'interface', not an 'implementation'." (Gang of Four 1995:18)
Composition over inheritance: "Favor 'object composition' over 'class inheritance'." (Gang of Four 1995:20)
I was thinking about how I could make those two phrases stick. A thought occurred to me: perhaps if we make those principles rhyme, they'll be easier for developers to remember, and more effective.
Rhyme-as-Reason Effect
In one study [1], researchers compared the perceived truthfulness of rhyming vs. similar non-rhyming statements. For example:
"Caution and measure will win you treasure" vs "Caution and measure will win you riches"
The study suggested that "rhyme, like repetition, affords statements an enhancement in processing fluency that can be misattributed to heightened conviction about their truthfulness."
In other words, rhyming makes statements easier to understand, which, in turn, makes them appear more accurate.
Object-Oriented Design Principles with Poetry
Here's my attempt at making those two principles rhyme.
Program to an interface, not to an implementation.
An implementation is good and fine, an interface is a wiser design.
Programming to an implementation is logical, programming to an interface is exceptional.
Programming to an implementation is what is needed, programming to an interface is why we succeeded.
Favor composition, over inheritance.
Inheritance if you must, in composition we trust.
Next time…
So the next time I get a chance to run another training course on object-oriented design techniques, I'll give these rhymes, some good times.
If you have rhymes of your own, please feel free to hit the comments.
[1] McGlone, M. S., & Tofighbakhsh, J. (2000). Birds of a feather flock conjointly (?): Rhyme as reason in aphorisms. Psychological Science, 11, 424–428.
Recently, I've been reviewing a funds transfer system that uses the S.W.I.F.T. standard message types. After reviewing it, I find that splitting the domain entity from its draft form would make it easier to maintain. Let me explain further.
Background of Funds Transfer System
The system allows users to record funds transfer requests as S.W.I.F.T. messages. Note that the system users are familiar with the said message types, and are not afraid to enter the values into a form that displays all the possible fields. To put things in perspective, a typical message type like MT 103 (Single Customer Credit Transfer) has about 25 fields. Some fields can have several formats. For example, the Ordering Customer (field 50a), can be in one of three formats: option A, option K, and option F. If we include every possible format, MT 103 would easily have over 70 fields.
Modeling Entities
Based on Eric Evans's DDD book, domain entities have a clear identity and a life-cycle with state transitions that we care about.
The funds transfer system modeled each MT as a domain entity. The MT domain entity has a clear identity, and a life-cycle (from draft to approved). It is mapped to its own table, with its own corresponding columns to represent its fields. The message fields are implemented as properties. All fields are mutable (via setter methods).
The system validates the MTs, and displays which fields have errors. But the errors did not stop the system from persisting/storing the message. The system allowed the draft message to be stored for further modification (until it is approved).
This analysis led me to re-think how the system modeled messages as entities. While it is tempting to think of them as rich domain entities that represent a real-world funds/credit transfer, they're really not. They're actually request forms. Much like the real-world paper forms that one fills out, submits for approval, and possibly re-submits after corrections.
Given this, I would model the draft message as another domain entity that I'll call MessageTypeForm (or MT103Form for MT 103). This domain entity would use a map of fields, each keyed by an alphanumeric tag still in compliance with S.W.I.F.T. messages (e.g. get("50a")), rather than a separate property for each field (e.g. getOrderingCustomer()). By using a map of fields, the entity would require less code, and can still be persisted (or mapped to table rows).
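A sketch of such a map-backed form entity (the mapping annotations are one possible way to persist it):
import java.util.HashMap;
import java.util.Map;
import javax.persistence.*;

@Entity
public class MT103Form {
    @Id @GeneratedValue
    private Long id;

    // each field keyed by its S.W.I.F.T. tag (e.g. "50a" for Ordering Customer)
    @ElementCollection
    private Map<String, String> fields = new HashMap<>();

    public String get(String fieldTag) {
        return fields.get(fieldTag);
    }

    public void set(String fieldTag, String value) {
        fields.put(fieldTag, value);
    }
}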
The CreditTransfer (or SingleCustomerCreditTransfer) would be another entity (not the same as the entity with a map of fields — MT103Form). This credit transfer entity shall have a clear identity and a life-cycle (e.g. being amended or cancelled). It can have a reference (by ID, and not by type) to the MT103Form entity from which it was created. It is also easier to establish invariants on this entity, as opposed to having all properties being mutable.
The validation of MT103Form will have to change from using properties (e.g. getOrderingCustomer()) to using a get method with a key parameter (to determine the message field it wants to retrieve) (e.g. get("50a")).
Deeper Insight
The "draft/form" entity may look like a bag of getters and setters. But it has an identity, and a life-cycle (from draft to approved). So, it is an entity. Many "draft/form" instances will exist in the system simultaneously. The different instances may even have the same field values, but it is important for us to be able to track individual "draft/form" instances.
Discovering the distinction between the "draft/form" and "fund/credit transfer" entities has made things easier to implement. Having two separate domain entities has made the model better reflect the problem it is trying to solve.
This deeper insight would not have been possible without the help of the domain experts, and my team mates: Tin, Anson, Richie, and Tina. Thanks guys!
In my recent training sessions on the (core) Spring Framework, I was asked, "If there was one thing that a (Java) Spring developer should know, what should that be?" That question caught me off guard. Yes, the (core) Spring Framework does cover a lot of areas (e.g. beans, configuration, aspect-oriented programming, transactions). And it was difficult for me to point out just one thing. I ended up mentioning everything that we covered in our (3 day) training course.
As I gave that question more thought, I began to think about the most important one. I ended up thinking of how Spring uses aspects to add behavior to managed objects (usually called beans) as the most important. This is how the Spring Framework supports transactions, security, scope, Java-based configuration, among others. And I'm sharing my thoughts here in this post.
ORM and Lazy Loading Exceptions
Most developers who use some form of ORM have encountered an exception that signifies that child entities could not be loaded (e.g. LazyInitializationException).
Some developers who have encountered this use the "open session in view" (OSIV) pattern to keep the session open and prevent this exception from happening. But I find this to be overkill. Worse, some developers consider the "open session in view" pattern to be the only solution. A possible underlying cause for this misconception is that the developer is not armed with the knowledge to use the Spring Framework effectively to keep the ORM session open longer.
In the case of JPA, the "open entity manager in view" pattern will create an entity manager at the beginning of the request, bind it to the request thread, and close it when the response is completed.
So, if not the OSIV pattern, what would be a better solution?
The short answer is to use the Spring Framework to keep the session open for the duration that you need it (e.g. @Transactional). Keep on reading as I'll provide a longer answer.
Services and Repositories
In a layered architecture, a typical design pattern is to define a domain or application service (usually defined as an interface) to provide business functionality (e.g. start using a shopping cart, adding items to that shopping cart, searching for products). Domain and application service implementations would typically delegate the retrieval/persistence of domain entities to repositories.
Presentation Layer → Business Layer → Data Access (or Infrastructure) Layer
Repositories (or data access objects) are also defined as interfaces to retrieve/persist domain entities (i.e. provide ORM and CRUD access). Naturally, repository implementations use ORM libraries (e.g. JPA/Hibernate, MyBatis) to retrieve and persist domain entities. With this, the implementation uses the ORM framework's classes to connect to the persistent store, retrieve/persist the entity, and close the connection (called a session in Hibernate). There's no problem of lazy loading failures at this point.
The problem of lazy loading failures occurs when the service retrieves a domain entity using the repository, and wants to load child entities (after the repository method has returned). By the time the repository returns the domain entity, the ORM session has been closed. Because of this, attempts to access/load child entities in the domain service cause an exception.
The code snippets below illustrate how a lazy loading exception can occur when the child items of an order entity are lazily loaded after being returned by the repository.
@Entity
public class Order {
    @OneToMany // defaults to FetchType.LAZY
    private List<OrderItem> items;
    …
    public List<OrderItem> getItems() {…}
}
public class SomeApplicationServiceImpl implements SomeApplicationService {
    private OrderRepository orderRepository;
    …
    @Override
    public void method1(…) {
        …
        order = orderRepository.findById(...);
        order.getItems(); // <-- Lazy loading exception occurs!
        …
    }
    …
}
public class OrderRepositoryImpl implements OrderRepository {
    @PersistenceContext
    private EntityManager em;
    …
    @Override
    public Order findById(...) {...}
    …
}
The repository implementation explicitly uses JPA for its ORM (as illustrated with the use of an EntityManager).
At this point, some developers may opt to use eager fetch to prevent the lazy initialization exception. Telling the ORM to eagerly fetch the child items of an order entity will work. But sometimes, we don't need to load the child items. And eagerly loading this might be unnecessary overhead. It would be great to only load it when we need it.
To prevent the lazy initialization exception (and not be forced to eagerly fetch), we'll need to keep the ORM session open until the calling service method returns. In Spring, it can be as simple as annotating the service method as @Transactional to keep the session open. I find that this approach is better than using "open session in view" pattern (or being forced to use eager fetching), since it keeps the session open only for the duration that we intend it to be.
public class SomeApplicationServiceImpl implements SomeApplicationService {
    private OrderRepository orderRepository;
    …
    @Override
    @Transactional // <-- open the session (if it's not yet open)
    public void method1(…) {
        …
        order = orderRepository.findById(...);
        order.getItems(); // <-- Lazy loading exception should not happen
        …
    }
    …
}
Domain Entities in the Presentation Layer
Even after keeping the ORM session open in the service layer (beyond the repository implementation objects), the lazy initialization exception can still occur when we expose the domain entities to the presentation layer. Again, because of this, some developers prefer the OSIV approach, since it will also prevent lazy initialization exceptions in the presentation layer.
But why would you want to expose domain entities in the presentation layer?
From experience, I've worked with teams who prefer to expose domain entities in the presentation layer. This usually leads to an anemic domain model, since presentation layer frameworks need a way to bind input values to the object. This forces domain entities to have getter and setter methods, and a zero-argument constructor. Having getters and setters makes invariants difficult to enforce. For simple domains, this is workable. But for more complex domains, a richer domain model would be preferred, as it would be easier to enforce invariants.
In a richer domain model, the objects that represent the presentation layer input/output values are actually data transfer objects (DTOs). They represent inputs (or commands) that are carried out in the domain layer. With this in mind, I prefer to use DTOs and maintain a richer domain model. Thus, I don't really run into lazy initialization exceptions in the presentation layer.
Aspects to add behavior to managed objects
Spring intercepts calls to these @Transactional annotated methods to ensure that an ORM session is open.
Transactions (or simply keeping an ORM session open) are not the only behavior provided using aspects. There's security, scope, Java-based configuration, and others. Knowing that the Spring Framework uses aspects to add behavior is one of the key reasons why we let Spring manage the POJOs that we develop.
Conclusion
There you go. That, for me, is the one most important thing that a Spring Framework developer should know when using the core. Now that I've given my opinion, how about you? What do you think is the one most important thing to know when tackling core Spring? Cheers!