Friday, December 30, 2016

Isolating the Domain Logic

In one design patterns class, I had an interesting discussion about modelling domain logic. Specifically, it was about isolating the domain logic. An application would typically be divided into three parts:

  1. Presentation (e.g. desktop GUI, browser, web service)
  2. Domain logic
  3. Infrastructure (e.g. persistence storage, e-mail)

The class found it interesting that the dependency arrows were pointing towards the domain logic part. They asked, “Is the diagram intentionally made wrong? Shouldn’t the domain logic part be dependent on the persistence storage?” It was a great question. And I wanted to share and post the discussion and explanation here.

Often Misunderstood

Most developers would usually have this misunderstanding in mind.

Misunderstood
vs.
Proper

And this misunderstanding is largely due to the sequence of operations. It usually starts with a trigger (e.g. a user clicking a button or a link) in the presentation layer, which then calls something within the domain logic layer, which then calls something within the infrastructure layer (e.g. update a database table record).

While this is the correct sequence of operations, there’s something subtle in the way in which the domain logic layer can be implemented. This has something to do with dependency inversion.

Dependency Inversion Principle

The domain logic layer may need something from the infrastructure layer, like some form of access to retrieve from persistence storage. The usual patterns for this are: DAO and repository. I won’t explain these two patterns here. Instead, I would point out that the interface definitions are placed within the domain logic layer, and their implementations are placed in another separate layer.

Placing the (DAO and repository) interface definitions inside the domain logic layer means that it is the domain logic layer that defines it. It is the one that dictates which methods are needed, and what return types are expected. This also marks the boundaries of the domain logic.

This separation between interface and implementation may be subtle, but key. Placing just the interface definitions allows the domain logic part to be free from infrastructure details, and allows it to be unit-tested without actual implementations. The interfaces can have mock implementations during unit testing. This subtle difference makes a big difference in rapid verification of (the development team’s understanding of) business rules.

This separation is the classic dependency inversion principle in action. Domain logic (higher-level modules) should not depend on DAO and repository implementations (low-level modules). Both should depend on abstractions. The domain logic defines the abstractions, and infrastructure implementations depend on these abstractions.

Most novice teams I’ve seen, place the DAO and repository interfaces together with their infrastructure-specific implementations. For example, say we have an StudentRepository and its JPA-specific implementation StudentJpaRepository. I would usually find novice teams placing them in the same package. While this is fine, since the application will still compile successfully. But the separation is gone, and domain logic is no longer isolated.

Now that I’ve explained why and how the domain logic part does not depend on the infrastructure part, I’d like to touch on how the presentation part is accidentally entangled with the domain logic.

Separated Presentation

Another thing I often see with novice teams is how they end up entangling their domain logic with their presentation. And this results into this nasty cyclic dependency. This cyclic dependency is more logical than physical. Which makes it all the more difficult to detect and prevent.

I won’t use a rich GUI presentation example here, since Martin Fowler has already written a great piece on it. Instead, I’ll use a web-browser-based presentation as an example.

Most web-based systems would use a web framework for its presentation. These frameworks usually implement some form of MVC (model-view-controller). The model used is usually the model straight from the domain logic part. Unfortunately, most MVC frameworks require something about the model. In the Java world, most MVC frameworks require that the model follow JavaBean conventions. Specifically, it requires the model to have a public zero-arguments constructor, and getters and setters. The zero-arguments constructor and setters are used to automatically bind parameters (from HTTP POST) to the model. The getters are used in rendering the model in a view.

Because of this implied requirement by MVC frameworks used in the presentation, developers would add a public zero-arguments constructor, getter and setters, to all their domain entities. And they would justify this as being required. Unfortunately, this gets the in the way of implementing domain logic. It gets entangled with the presentation. And worse, I’ve seen domain entities being polluted with code that emits HTML-encoded strings (e.g. HTML code with less-than and greater-than signs encoded) and XML, just because of presentation.

If it is all right to have your domain entity implemented as a JavaBean, then it would be fine to have it used directly in your presentation. But if the domain logic gets a bit more complicated, and requires the domain entity to lose its JavaBean-ness (e.g. no more public zero-arguments constructor, no more setters), then it would be advisable for the domain logic part to implement domain logic, and have the presentation part adapt by creating another JavaBean object to satisfy its MVC needs.

An example I use often is a UserAccount that is used to authenticate a user. In most cases, when a user wishes to change the password, the old password is also needed. This helps prevent unauthorized changing of the password. This is clearly shown in the code below.

public class UserAccount {
  ...
  public void changePassword(
      String oldPassword, String newPassword) {…}
}

But this does not follow JavaBean conventions. And if the MVC presentation framework would not work well with the changePassword method, a naive approach would be to remove the erring method and add a setPassword method (shown below). This weakens the isolation of the domain logic, and causes the rest of the team to implement it all over the place.

public class UserAccount {
  ...
  public void setPassword(String password) {…}
}

It’s important for developers to understand that the presentation depends on the domain logic. And not the other way around. If the presentation has needs (e.g. JavaBean convention), then it should not have the domain logic comply with that. Instead, the presentation should create additional classes (e.g. JavaBeans) that have knowledge of the corresponding domain entities. But unfortunately, I still see a lot of teams forcing their domain entities to look like JavaBeans just because of presentation, or worse, having domain entities create JavaBeans (e.g. DTOs) for presentation purposes.

Arrangement Tips

Here’s a tip in arranging your application. Keep your domain entities and repositories in one package. Keep your repository and other infrastructure implementations in a separate package. Keep your presentation-related classes in its own package. Be mindful of which package depends on which package. The package that contains the domain logic is preferrably at the center of it all. Everything else depends on it.

When using Java, the packages would look something like this:

  • com.acme.myapp.context1.domain.model
    • Keep your domain entities, value objects, and repositories (interface definitions only) here
  • com.acme.myapp.context1.infrastructure.persistence.jpa
    • Place your JPA-based repository and other JPA persistence-related implementations here
  • com.acme.myapp.context1.infrastructure.persistence.jdbc
    • Place your JDBC-based repository and other JDBC persistence-related implementations here
  • com.acme.myapp.context1.presentation.web
    • Place your web/MVC presentation components here. If the domain entities needed for presentation do not comply with MVC framework requirements, create additional classes here. These additional classes will adapt the domain entities for presentation-purposes, and still keep the domain entities separated from presentation.

Note that I’ve used context1, since there could be several contexts (or sub-systems) in a given application (or system). I’ll discuss about having multiple contexts and having multiple models in a future post.

That’s all for now. I hope this short explanation can shed some light to those who wonder why their code is arranged and split in a certain way.

Thanks to Juno Aliento for helping me with the class during this interesting discussion.

Happy holidays!

Thursday, October 27, 2016

Architectural Layers and Modeling Domain Logic

As I was discussing the PoEAA patterns used to model domain logic (i.e. transaction script, table module, domain model), I noticed that people get the impression (albeit wrong impression) that the domain model pattern is best. So, they set out to apply it on everything.

Not Worthy of Domain Model Pattern

Let's get real. The majority of sub-systems are CRUD-based. Only a certain portion of the system requires the domain model implementation pattern. Or, put it in another way, there are parts of the application that just needs forms over data, and some validation logic (e.g. required/mandatory fields, min/max values on numbers, min/max length on text). For these, the domain model is not worth the effort.

For these, perhaps an anemic domain model would fit nicely.

Anemic Domain Model Isn't As Bad As It Sounds

The anemic domain model isn't as bad as it sounds. There, I said it (at least here in my blog post).

But how does it look like?

package com.acme.bc.domain.model;
...
@Entity
class Person {
 @Id ... private Long id;
 private String firstName;
 private String lastName;
 // ...
 // getters and setters
}
...
interface PersonRepository /* extends CrudRepository<Person, Long> */ {
 // CRUD methods (e.g. find, find/pagination, update, delete)
}
package com.acme.bc.infrastructure.persistence;
...
class PersonRepositoryJpa implements PersonRepository {
 ...
}

In the presentation layer, the controllers can have access to the repository. The repository does its job of abstracting persistence details.

package com.acme.bc.interfaces.web;

@Controller
class PersonsController {
 private PersonRepository personRepository;
 public PersonsController(PersonRepository personRepository) {...}
 // ...
}

In this case, having the Person class exposed to the presentation layer is perfectly all right. The presentation layer can use it directly, since it has a public zero-arguments constructor, getters and setters, which are most likely needed by the view.

And there you have it. A simple CRUD-based application.

Do you still need a service layer? No. Do you still need DTO (data transfer objects)? No. In this simple case of CRUD, you don't need additional services or DTOs.

Yes, the Person looks like a domain entity. But it does not contain logic, and is simply used to transfer data. So, it's really just a DTO. But this is all right since it does the job of holding the data stored-to and retrieved-from persistence.

Now, if the business logic starts to get more complicated, some entities in the initially anemic domain model can become richer with behavior. And if so, those entities can merit a domain model pattern.

Alternative to Anemic Domain Model

As an alternative to the anemic domain model (discussed above), the classes can be moved out of the domain logic layer and in to the presentation layer. Instead of naming it PersonRepository, it is now named PersonDao.

package com.acme.bc.interfaces.web;

@Entity
class Person {...}

@Controller
class PersonsController {
 private PersonDao personDao;
 public PersonsController(PersonDao personDao) {...}
 // ...
}

interface PersonDao /* extends CrudRepository<Person, Long> */ {
 // CRUD methods (e.g. find, find/pagination, update, delete)
}
package com.acme.bc.infrastructure.persistence;

class PersonDaoJpa implements PersonDao {
 ...
}

Too Much Layering

I think that it would be an overkill if you have to go through a mandatory application service that does not add value.

package com.acme.bc.interfaces.web;
...
@Controller
class PersonsController {
 private PersonService personService;
 public PersonsController(PersonService personService) {...}
 // ...
}
package com.acme.bc.application;
...
@Service
class PersonService {
 private PersonRepository personRepository;
 public PersonService(PersonRepository personRepository) {...}
 // expose repository CRUD methods and pass to repository
 // no value add
}

Application Services for Transactions

So, when would application services be appropriate? The application services are responsible for driving workflow and coordinating transaction management (e.g. by use of the declarative transaction management support in Spring).

If you find the simple CRUD application needing to start transactions in the presentation-layer controller, then it might be a good sign to move them into an application service. This usually happens when the controller needs to update more than one entity that does not have a single root. The usual example here is transferring amounts between bank accounts. A transaction is needed to ensure that debit and credit both succeed, or both fail.

package sample.domain.model;
...
@Entity
class Account {...}
...
interface AccountRepository {...}
package sample.interfaces.web;
...
@Controller
class AccountsController {
 private AccountRepository accountRepository;
 ...
 @Transactional
 public ... transfer(...) {...}
}

If you see this, then it might be a good idea to move this (from the presentation layer) to an application-layer service.

package sample.interfaces.web;
...
@Controller
class AccountsController {
 private AccountRepository accountRepository;
 private TransferService transferService;
 ...
 public ... transfer(...) {...}
}
package sample.application;
...
@Service
@Transactional
class TransferService {
 private AccountRepository accountRepository;
 ...
 public ... transfer(...) {...}
}
package sample.domain.model;
...
@Entity
class Account {...}
...
interface AccountRepository {...}

Domain Model Pattern (only) for Complex Logic

I'll use the double-entry accounting as an example. But I'm sure there are more complex logic that's better suited.

Let's say we model journal entries and accounts as domain entities. The account contains a balance (a monetary amount). But this amount is not something that one would simply set. A journal entry needs to be created. When the journal entry is posted, it will affect the specified accounts. The account will then update its balance.

package ….accounting.domain.model;
...
/** Immutable */
@Entity
class JournalEntry {
 // zero-sum items
 @ElementCollection
 private Collection<JournalEntryItem> items;
 ...
}
...
/** A value object */
@Embeddable
class JournalEntryItem {...}
...
interface JournalEntryRepository {...}
...
@Entity
class Account {...}
...
interface AccountRepository {...}
...
@Entity
class AccountTransaction {...}
...
interface AccountTransactionRepository {...}

Now, in this case, a naive implementation would have a presentation-layer controller create a journal entry object, and use a repository to save it. And at some point in time (or if auto-posting is used), the corresponding account transactions are created, with account balances updated. All this needs to be rolled into a transaction (i.e. all-or-nothing).

Again, this transaction is ideally moved to an application service.

package ….accounting.application;

@Service
@Transactional
class PostingService {...}

If there's a need to allow the user to browse through journal entries and account transactions, the presentation-layer controller can directly use the corresponding repositories. If the domain entities are not suitable for the view technology (e.g. it doesn't follow JavaBean naming conventions), then the presentation-layer can define DTOs that are suitable for the view. Be careful! Don't change the domain entity just to suit the needs of the presentation-layer.

package ….interfaces.web;

@Controller
class AccountsController {
 private AccountRepository accountRepository;
 private AccountTransactionRepository accountTransactionRepository;
 private PostingService postingService;
  ...
}

In Closing...

So, there you have it. Hopefully, this post can shed some light on when (and when not) to use domain model pattern.

Tuesday, August 23, 2016

Spring Security OAuth2 with Google

I needed to create a web app using Spring MVC and secure it using OAuth2 with Google as a provider for authentication. Saket's Blog (posted back in September 2014) provided a good guide. But I needed something slightly different. I needed one that uses Maven (not Gradle) and minus Spring Boot. So, I thought it would just be a simple thing to do. But as I found out, it was not so simple, and I'm writing some details here to help others in using OAuth2 to secure their Spring MVC web apps.

Here's what I configured to make my web application use OAuth2 with Google as the provider.

  • Enable Spring Security with @EnableWebSecurity.
  • Add an OAuth2ClientAuthenticationProcessingFilter bean to the security filter chain just before the filter security interceptor. This authentication processing filter is configured to know where the authorization code resource can be found. This makes it possible for it to throw an exception that redirects the user to the authorization server for authentication and authorization.
  • Set an authentication entry point (specifically a LoginUrlAuthenticationEntryPoint) that redirects to the same URL as the one being detected by the OAuth2ClientAuthenticationProcessingFilter. Say, we choose the path “/oauth2/callback”. This path should be the one used by both authentication entry point and authentication processing filter.
  • Add @EnableOAuth2Client to create an OAuth2ClientContextFilter bean and make an OAuth2ClientContext available in request scope. To make request scope possible in the security filter chain, add a RequestContextListener or RequestContextFilter.
  • Add the OAuth2ClientContextFilter bean to the security filter chain just after the exception translation filter. This filter handles the exception that redirects the user (thrown by the authentication process filter). It handles this exception by sending a redirect.

Authorization Code Resource

The authentication processing filter needs to know where to redirect the user for authentication. So, a bean is configured and injected into the authentication process filter.

@Configuration
@EnableWebSecurity
@EnableOAuth2Client
@PropertySource("classpath:google-oauth2.properties")
public class ... extends WebSecurityConfigurerAdapter {
  ...
  @Value("${oauth2.clientId}")
  private String clientId;
  @Value("${oauth2.clientSecret}")
  private String clientSecret;
  @Value("${oauth2.userAuthorizationUri}")
  private String userAuthorizationUri;
  @Value("${oauth2.accessTokenUri}")
  private String accessTokenUri;
  @Value("${oauth2.tokenName}")
  private String tokenName;
  @Value("${oauth2.scope}")
  private String scope;
  @Value("${oauth2.userInfoUri}")
  private String userInfoUri;

  @Value("${oauth2.filterCallbackPath}")
  private String oauth2FilterCallbackPath;

  @Bean
  @Description("Authorization code resource")
  public OAuth2ProtectedResourceDetails authorizationCodeResource() {
    AuthorizationCodeResourceDetails details = new AuthorizationCodeResourceDetails();
    ...
    details.setClientId(clientId);
    details.setClientSecret(clientSecret);
    details.setUserAuthorizationUri(userAuthorizationUri);
    details.setAccessTokenUri(accessTokenUri);
    details.setTokenName(tokenName);
    String commaSeparatedScopes = scope;
    details.setScope(parseScopes(commaSeparatedScopes));
    details.setAuthenticationScheme(AuthenticationScheme.query);
    details.setClientAuthenticationScheme(AuthenticationScheme.form);
    return details;
  }

  private List<String> parseScopes(String commaSeparatedScopes) {...}

  ...

  @Bean
  @Description("Enables ${...} expressions in the @Value annotations"
      + " on fields of this configuration. Not needed if one is"
      + " already available.")
  public static PropertySourcesPlaceholderConfigurer propertySourcesPlaceholderConfigurer() {
    return new PropertySourcesPlaceholderConfigurer();
  }
}

Note that the authorization code resource details are externalized. These details include the URI for authentication, the URI to exchange an authorization code with an access token, client ID, and client secret.

Authentication Processing Filter

With an authorization code resource bean configured, we configure an authentication processing filter bean that will redirect to the authorization code resource when the incoming request is not yet authenticated. Note that the authentication processing filter is injected with an OAuth2RestTemplate that points to the authorization code resource.

@Configuration
@EnableWebSecurity
@EnableOAuth2Client
@PropertySource("classpath:google-oauth2.properties")
public class ... extends WebSecurityConfigurerAdapter {
  @Autowired
  private OAuth2ClientContext oauth2ClientContext;
  ...
  @Bean
  @Description("Authorization code resource")
  public OAuth2ProtectedResourceDetails authorizationCodeResource() {
    AuthorizationCodeResourceDetails details = new AuthorizationCodeResourceDetails();
    ...
    return details;
  }

  @Bean
  @Description("Filter that checks for authorization code, "
      + "and if there's none, acquires it from authorization server")
  public OAuth2ClientAuthenticationProcessingFilter
        oauth2ClientAuthenticationProcessingFilter() {
    // Used to obtain access token from authorization server (AS)
    OAuth2RestOperations restTemplate = new OAuth2RestTemplate(
        authorizationCodeResource(),
        oauth2ClientContext);
    OAuth2ClientAuthenticationProcessingFilter filter =
        new OAuth2ClientAuthenticationProcessingFilter(oauth2FilterCallbackPath);
    filter.setRestTemplate(restTemplate);
    // Set a service that validates an OAuth2 access token
    // We can use either Google API's UserInfo or TokenInfo
    // For this, we chose to use UserInfo service
    filter.setTokenServices(googleUserInfoTokenServices());
    return filter;
  }

  @Bean
  @Description("Google API UserInfo resource server")
  public GoogleUserInfoTokenServices googleUserInfoTokenServices() {
    GoogleUserInfoTokenServices userInfoTokenServices =
        new GoogleUserInfoTokenServices(userInfoUri, clientId);
    return userInfoTokenServices;
  }
  ...
}

Note that the access token is further checked by using it to access a secured resource (provided by a resource server). In this case, the Google API to retrieve user information like email and photo is used.

Arguably, the authorization code resource does not need to be configured as a bean, since it is only used by the authentication processing filter.

Authentication Entry Point

The authentication processing filter and the authentication entry point are configured to detect the same request path.

@Configuration
@EnableWebSecurity
@EnableOAuth2Client
@PropertySource("classpath:google-oauth2.properties")
public class ... extends WebSecurityConfigurerAdapter {
  ...
  public OAuth2ProtectedResourceDetails authorizationCodeResource() {...}

  @Bean
  @Description("Filter that checks for authorization code, "
      + "and if there's none, acquires it from authorization server")
  public OAuth2ClientAuthenticationProcessingFilter
        oauth2ClientAuthenticationProcessingFilter() {
    ...
    OAuth2ClientAuthenticationProcessingFilter filter =
        new OAuth2ClientAuthenticationProcessingFilter(oauth2FilterCallbackPath);
    ...
    return filter;
  }
  ...
  @Bean
  public AuthenticationEntryPoint authenticationEntryPoint() {
    return new LoginUrlAuthenticationEntryPoint(oauth2FilterCallbackPath);
  }
  ...
}

So, how does this all work?

This is how the security filter chain will look like with the added custom filters. Note that for brevity, not all filters were included.

Web Browser
SecurityContextPersistenceFilter
LogoutFilter
ExceptionTranslationFilter
OAuth2ClientContextFilter
OAuth2ClientAuthenticationProcessingFilter
FilterSecurityInterceptor
Secured Resource

So, here's what happens at runtime. The client referred to here, is a web application that uses OAuth2 for authentication.

  1. Request for a secured resource on the client is received. It travels through the security filter chain until FilterSecurityInterceptor. The request has not been authenticated yet (i.e. security context does not contain an authentication object), and the FilterSecurityInterceptor throws an exception (AuthenticationCredentialsNotFoundException). This authentication exception travels up the security filter chain, and is handled by ExceptionTranslationFilter. It detects that an authentication exception occured, and delegates to the authentication entry point. The configured authentication entry point (LoginUrlAuthenticationEntryPoint) redirects the user to a new location (e.g. “/oauth2/callback”). The request for a secured resource is saved, and request processing completes.
    Web Browser
    SecurityContextPersistenceFilter
    LogoutFilter
    ExceptionTranslationFilter Delegate to authentication entry point
    ↓ ↑
    OAuth2ClientContextFilter
    ↓ ↑
    OAuth2ClientAuthenticationProcessingFilter
    ↓ ↑
    FilterSecurityInterceptor Throws exception!
     
    Secured Resource
  2. Since a redirect is the response of the previous request, a request to the new location is made. This request travels through the security filter chain until OAuth2ClientAuthenticationProcessingFilter determines that it is a request for authentication (e.g. it matches “/oauth2/callback”). Upon checking the request, it determines that there’s no authorization code, and throws an exception (UserRedirectRequiredException) that contains a URL to the authorization code resource (e.g. https://accounts.google.com/o/oauth2/v2/auth?client_id=https://accounts.google.com/o/oauth2/v2/auth?client_id=…&redirect_uri=http://…/…/oauth2/callback&response_type=code&scope=…&state=…). This exception is handled by OAuth2ClientContextFilter. And request processing completes.
    Web Browser
    SecurityContextPersistenceFilter
    LogoutFilter
    ExceptionTranslationFilter
    OAuth2ClientContextFilter Handle exception by sending redirect
    ↓ ↑
    OAuth2ClientAuthenticationProcessingFilter Throws exception!
     
    FilterSecurityInterceptor
     
    Secured Resource
  3. Just as before, the redirect is followed. This time, it is a redirect to the authorization server (e.g. https://accounts.google.com/o/oauth2/v2/auth). The user is asked to authenticate (if not yet authenticated).
  4. Next, the user is asked to allow/authorize the client to have access to his/her information. After the user decides to allow/authorize the client, the authorization server redirects back to the client (based on the redirect_uri parameter).
  5. Request on the client is received. It travels through the security filter chain until OAuth2ClientAuthenticationProcessingFilter determines that it is a request for authentication (e.g. it matches “/oauth2/callback”). It finds that the request contains an authorization code, and proceeds to exchange the authorization code for an access token. Furthermore, it validates the access token by accessing a resource (on a resource server), and creates an Authentication object (with Principal and GrantedAuthority objects). This will be stored in the session and in the security context. And request processing completes with a redirect to the saved request (from #1).
    Web Browser
    SecurityContextPersistenceFilter
    LogoutFilter
    ExceptionTranslationFilter
    OAuth2ClientContextFilter
    OAuth2ClientAuthenticationProcessingFilter Exchanges authorization code with access token; creates authentication object and stores it in session
     
    FilterSecurityInterceptor
     
    Secured Resource
  6. Just as before, the redirect is followed. It travels through the security filter chain. This time, the FilterSecurityInterceptor allows the request to proceed, since there is an authentication object in the security context (retrieved from session). The secured resource is provided to the user (e.g. render a view/page of the secured resource).
    Web Browser
    SecurityContextPersistenceFilter
    LogoutFilter
    ExceptionTranslationFilter
    OAuth2ClientContextFilter
    OAuth2ClientAuthenticationProcessingFilter
    FilterSecurityInterceptor
    Secured Resource :)

Code and Credits

The code for the sample web application can be found here at my GitHub account.

Again, thanks to Saket's Blog.

Also, check out Security Architecture with Spring at Java Code Geeks.

Thursday, July 21, 2016

One-shot Delete with Hibernate (JPA)

In older versions of Hibernate, I can see the one-shot delete indicated in the manual. But newer versions no longer have this section. I'm not sure why. So, in this post, I take a look if it still works.

The one-shot delete section says:

Deleting collection elements one by one can sometimes be extremely inefficient. Hibernate knows not to do that in the case of an newly-empty collection (if you called list.clear(), for example). In this case, Hibernate will issue a single DELETE.

Suppose you added a single element to a collection of size twenty and then remove two elements. Hibernate will issue one INSERT statement and two DELETE statements, unless the collection is a bag. This is certainly desirable.

However, suppose that we remove eighteen elements, leaving two and then add thee new elements. There are two possible ways to proceed

  • delete eighteen rows one by one and then insert three rows
  • remove the whole collection in one SQL DELETE and insert all five current elements one by one

Hibernate cannot know that the second option is probably quicker. It would probably be undesirable for Hibernate to be that intuitive as such behavior might confuse database triggers, etc.

Fortunately, you can force this behavior (i.e. the second strategy) at any time by discarding (i.e. dereferencing) the original collection and returning a newly instantiated collection with all the current elements.

One-shot-delete does not apply to collections mapped inverse="true".

The inverse="true" is for (Hibernate Mapping) XML. But in this post, we'll see how "one-shot delete" works in JPA (with Hibernate as the provider).

We will try different approaches and see which one will result to a one-shot delete.

  1. Bi-directional one-to-many
  2. Uni-directional one-to-many (with join table)
  3. Uni-directional one-to-many (with no join table)
  4. Uni-directional one-to-many (using ElementCollection)

We'll use a Cart entity with many CartItems.

Bi-directional One-to-Many

For this, we have references from both sides.

@Entity
public class Cart { ...
 @OneToMany(mappedBy="cart", cascade=ALL, orphanRemoval=true)
 Collection<OrderItem> items;
}

@Entity
public class CartItem { ...
 @ManyToOne Cart cart;
}

To test this, we insert one row to the table for Cart, and three or more rows to the table for CartItem. Then, we run the test.

public class CartTests { ...
 @Test
 public void testOneShotDelete() throws Exception {
  Cart cart = entityManager.find(Cart.class, 53L);
  for (CartItem item : cart.items) {
   item.cart = null; // remove reference to cart
  }
  cart.items.clear(); // as indicated in Hibernate manual
  entityManager.flush(); // just so SQL commands can be seen
 }
}

The SQL commands shown had each item deleted individually (and not as a one-shot delete).

delete from CartItem where id=?
delete from CartItem where id=?
delete from CartItem where id=?

Discarding the original collection did not work either. It even caused an exception.

public class CartTests { ...
 @Test
 public void testOneShotDelete() throws Exception {
  Cart cart = entityManager.find(Cart.class, 53L);
  // remove reference to cart
  cart.items = new LinkedList<CartItem>(); // discard, and use new collection
  entityManager.flush(); // just so SQL commands can be seen
 }
}
javax.persistence.PersistenceException:
    org.hibernate.HibernateException:
        A collection with cascade="all-delete-orphan" was no longer referenced by the owning entity instance: ….Cart.items

I tested this with Hibernate 4.3.11 and HSQL 2.3.2. If your results vary, please hit the comments.

Uni-directional One-to-Many (With Join Table)

For this, we make changes to the mapping. This causes a join table to be created.

@Entity
public class Cart { ...
 @OneToMany(cascade=ALL)
 Collection<OrderItem> items;
}

@Entity
public class CartItem { ...
 // no @ManyToOne Cart cart;
}

Again, we insert one row to the table for Cart, and three or more rows to the table for CartItem. We also have to insert appropriate records to the join table (Cart_CartItem). Then, we run the test.

public class CartTests { ...
 @Test
 public void testOneShotDelete() throws Exception {
  Cart cart = entityManager.find(Cart.class, 53L);
  cart.items.clear(); // as indicated in Hibernate manual
  entityManager.flush(); // just so SQL commands can be seen
 }
}

The SQL commands shown had the associated rows in the join table deleted (with one command). But the rows in the table for CartItem still exist (and did not get deleted).

delete from Cart_CartItem where cart_id=?
// no delete commands for CartItem

Hmmm, not exactly what we want, since the rows in the table for CartItem still exist.

Uni-directional One-to-Many (No Join Table)

Starting with JPA 2.0, the join table can be avoided in a uni-directional one-to-many by specifying a @JoinColumn.

@Entity
public class Cart { ...
 @OneToMany(cascade=CascadeType.ALL, orphanRemoval=true)
 @JoinColumn(name="cart_id", updatable=false, nullable=false)
 Collection<OrderItem> items;
}

@Entity
public class CartItem { ...
 // no @ManyToOne Cart cart;
}

Again, we insert one row to the table for Cart, and three or more rows to the table for CartItem. Then, we run the test.

public class CartTests { ...
 @Test
 public void testOneShotDelete() throws Exception {
  Cart cart = entityManager.find(Cart.class, 53L);
  cart.items.clear(); // as indicated in Hibernate manual
  entityManager.flush(); // just so SQL commands can be seen
 }
}

Discarding the original collection also did not work either. It also caused the same exception (as with bi-directional one-to-many).

javax.persistence.PersistenceException:
    org.hibernate.HibernateException:
        A collection with cascade="all-delete-orphan" was no longer referenced by the owning entity instance: ….Cart.items

Uni-directional One-to-Many (with ElementCollection)

JPA 2.0 introduced @ElementCollection. This allows one-to-many relationships to be established with the many-side being either @Basic or @Embeddable (i.e. not an @Entity).

@Entity
public class Cart { ...
 @ElementCollection // @OneToMany for basic and embeddables
 @CollectionTable(name="CartItem") // defaults to "Cart_items" if not overridden
 Collection<OrderItem> items;
}

@Embeddable // not an entity!
public class CartItem {
 // no @Id
 // no @ManyToOne Cart cart;
 private String data; // just so that there are columns we can set
}

Again, we insert one row to the table for Cart, and three or more rows to the table for CartItem. Then, we run the test.

public class CartTests { ...
 @Test
 public void testOneShotDelete() throws Exception {
  Cart cart = entityManager.find(Cart.class, 53L);
  cart.items.clear(); // as indicated in Hibernate manual
  entityManager.flush(); // just so SQL commands can be seen
 }
}

Yey! The associated rows for CartItem were deleted in one shot.

delete from CartItem where Cart_id=?

Closing Thoughts

One-shot delete occurs with uni-directional one-to-many using ElementCollection (where the many-side is an embeddabled, and not an entity).

In the uni-directional one-to-many with join table scenario, deleting entries in a join table doesn't add much value.

I'm not sure why one-shot delete works (or why it works this way) in Hibernate. But I do have a guess. And that is the underlying JPA provider could not do a one-shot delete because it could not ensure that the many-side entity is not referenced by other entities. Unlike the ElementCollection, the many-side is not an entity and cannot be referenced by other entities.

Now, this does not mean that you have to use ElementCollection all the time. Perhaps the one-shot delete only applies to aggregate roots. In those cases, using Embeddable and ElementCollection might be appropriate for a collection of value objects that make up an aggregate. When the aggregate root is removed, then it would be good to see that the "child" objects should be removed as well (and in an efficient manner).

I wish there was a way in JPA to indicate that the child entities are privately owned and can be safely removed when the parent entity is removed (e.g. similar to @PrivateOwned in EclipseLink). Let's see if it will be included in a future version of the API.

Hope this helps.

Wednesday, July 20, 2016

Reference by Identity in JPA

In a previous post, I mentioned that I opted to reference other aggregates by their primary key, and not by type. I usually use this approach (a.k.a. disconnected domain model) when working with large or complex domain models. In this post, let me try to explain further how it can be done in JPA. Note that the resulting DDL scripts will not create a foreign key constraint (unlike the one shown in the previous post).

Reference by Identity

In most JPA examples, every entity references another entity, or is being referenced by another entity. This results into an object model that allows traversal from one entity to any other entity. This can cause unwanted traversals (and unwanted cascade of persistence operations). As such, it would be good to prevent this, by referencing other entities by ID (and not by type).

The code below shows how OrderItem references a Product entity by its primary key (and not by type).

@Entity
public class Product {
 @Id private Long id;
 // ...
}

@Entity
public class Order {
 // ...
 @OneToMany(mappedBy="order")
 private Collection<OrderItem> items;
}

@Entity
public class OrderItem {
 // ...
 @ManyToOne
 private Order order;
 // @ManyToOne
 // private Product product;
 private Long productId;
 // ...
}

There are several ways to get the associated Product entities. One way is to use a repository to find products given the IDs (ProductRepository with a findByIdIn(List<Long> ids) method). As mentioned in previous comments, please be careful not to end up with the N+1 selects problem.

Custom identity types can also be used. The example below uses ProductId. It is a value object. And because of JPA, we needed to add a zero-arguments constructor.

@Embeddable
public class ProductId {
 private Long id;
 public ProductId(long id) {
  this.id = id;
 }
 public long getValue() { return id; }
 // equals and hashCode
 protected ProductId() { /* as required by JPA */ }
}

@Entity
public class Product {
 @EmbeddedId private ProductId id;
 // ...
}

@Entity
public class Order { // ...
 @OneToMany(mappedBy="order")
 private Collection<OrderItem> items;
}

@Entity
public class OrderItem {
 // ...
 @ManyToOne
 private Order order;
 // @ManyToOne
 // private Product product;
 @Embedded private ProductId productId;
 // ...
}

But this will not work if you're using generated values for IDs. Fortunately, starting with JPA 2.0, there are some tricks around this, which I'll share in the next section.

Generated IDs

In JPA, when using non-@Basic types as @Id, we can no longer use @GeneratedValue. But using a mix of property and field access, we can still use generated value and ProductId.

@Embeddable
@Access(AccessType.FIELD)
public class ProductId {...}

@Entity
@Access(AccessType.FIELD)
public class Product {
 @Transient private ProductId id;
 public ProductId getId() { return id; }
 // ...
 private Long id_;
 @Id
 @GeneratedValue(strategy=...)
 @Access(AccessType.PROPERTY)
 protected Long getId_() { return id_; }
 protected void setId_(Long id_) {
  this.id_ = id_;
  this.id = new ProductId(this.id_);
 }
}

@Entity
public class Order { // ...
 @OneToMany(mappedBy="order")
 private Collection<OrderItem> items;
}

@Entity
public class OrderItem {
 // ...
 @ManyToOne
 private Order order;
 // @ManyToOne
 // private Product product;
 @Embedded private ProductId productId;
 // ...
}

The trick involves using property access for the generated ID value (while keeping the rest with field access). This causes JPA to use the setter method. And in it, we initialize the ProductId field. Note that the ProductId field is not persisted (marked as @Transient).

Hope this helps.

Monday, January 11, 2016

JPA Pitfalls / Mistakes

From my experience, both in helping teams and conducting training, here are some pitfalls/mistakes I have encountered that caused some problems in Java-based systems that use JPA.

  • Requiring a public no-arg constructor
  • Always using bi-directional associations/relationships
  • Using @OneToMany for collections that can become huge

Requiring a Public No-arg Constructor

Yes, a JPA @Entity requires a zero-arguments (or default no-args) constructor. But this can be made protected. You do not have to make it public. This allows better object-oriented modeling, since you are not forced to have a publicly accessible zero-arguments constructor.

The entity class must have a no-arg constructor. The entity class may have other constructors as well. The no-arg constructor must be public or protected. [emphasis mine]

If the entity being modeled has some fields that need to be initialized when it is created, this should be done through its constructor.

Let's say we're modeling a hotel room reservation system. In it, we probably have entities like room, reservation, etc. The reservation entity will likely require start and end dates, since it would not make much sense to create one without the period of stay. Having the start and end dates included as arguments in the reservation's constructor would allow for a better model. Keeping a protected zero-arguments constructor would make JPA happy.

@Entity
public class Reservation { ...
 public Reservation(
   RoomType roomType, DateRange startAndEndDates) {
  if (roomType == null || startAndEndDates == null) {
   throw new IllegalArgumentException(...);
  } ...
 }
 ...
 protected Reservation() { /* as required by ORM/JPA */ }
}

It also helps to add a comment in the zero-arguments constructor to indicate that it was added for JPA-purposes (technical infrastructure), and that it is not required by the domain (business rules/logic).

Although I could not find it mentioned in the JPA 2.1 spec, embeddable classes also require a default (no-args) constructor. And just like entities, the required no-args constructor can be made protected.

@Embeddable
public class DateRange { ...
 public DateRange(Date start, Date end) {
  if (start == null || end == null) {
   throw new IllegalArgumentException(...);
  }
  if (start.after(end)) {
   throw new IllegalArgumentException(...);
  } ...
 }
 ...
 protected DateRange() { /* as required by ORM/JPA */ }
}

The DDD sample project also hides the no-arg constructor by making it package scope (see Cargo entity class where no-arg constructor is near the bottom).

Always Using Bi-directional Associations/Relationships

Instructional material on JPA often show a bi-directional association. But this is not required. For example, let's say we have an order entity with one or more items.

@Entity
public class Order {
 @Id private Long id;
 @OneToMany private List<OrderItem> items;
 ...
}

@Entity
public class OrderItem {
 @Id private Long id;
 @ManyToOne private Order order;
 ...
}

It's good to know that bi-directional associations are supported in JPA. But in practice, it becomes a maintenance nightmare. If order items do not have to know its parent order object, a uni-directional association would suffice (as shown below). The ORM just needs to know how to name the foreign key column in the many-side table. This is provided by adding the @JoinColumn annotation on the one-side of the association.

@Entity
public class Order {
 @Id Long id;
 @OneToMany
 @JoinColumn(name="order_id", ...)
 private List<OrderItem> items;
 ...
}

@Entity
public class OrderItem {
 @Id private Long id;
 // @ManyToOne private Order order;
 ...
}

Making it uni-directional makes it easier since the OrderItem no longer needs to keep a reference to the Order entity.

Note that there may be times when a bi-directional association is needed. In practice, this is quite rare.

Here's another example. Let's say you have several entities that refer to a country entity (e.g. person's place of birth, postal address, etc.). Obviously, these entities would reference the country entity. But would country have to reference all those different entities? Most likely, not.

@Entity
public class Person {
 @Id Long id;
 @ManyToOne private Country countryOfBirth;
 ...
}

@Entity
public class PostalAddress {
 @Id private Long id;
 @ManyToOne private Country country;
 ...
}

@Entity
public class Country {
 @Id ...;
 // @OneToMany private List<Person> persons;
 // @OneToMany private List<PostalAddress> addresses;
}

So, just because JPA supports bi-directional association does not mean you have to!

Using @OneToMany For Collections That Can Become Huge

Let's say you're modeling bank accounts and its transactions. Over time, an account can have thousands (if not millions) of transactions.

@Entity
public class Account {
 @Id Long id;
 @OneToMany
 @JoinColumn(name="account_id", ...)
 private List<AccountTransaction> transactions;
 ...
}

@Entity
public class AccountTransaction {
 @Id Long id;
 ...
}

With accounts that have only a few transactions, there doesn't seem to be any problem. But over time, when an account contains thousands (if not millions) of transactions, you'll most likely experience out-of-memory errors. So, what's a better way to map this?

If you cannot ensure the maximum number of elements in the many-side of the association can all be loaded in memory, better use the @ManyToOne on the opposite side of the association.

@Entity
public class Account {
 @Id Long id;
 // @OneToMany private List<AccountTransaction> transactions;
 ...
}

@Entity
public class AccountTransaction {
 @Id Long id;
 @ManyToOne
 private Account account;
 ...
 public AccountTransaction(Account account, ...) {...}

 protected AccountTransaction() { /* as required by ORM/JPA */ }
}

To retrieve the possibly thousands (if not millions) of transactions of an account, use a repository that supports pagination.

@Transactional
public interface AccountTransactionRepository {
 Page<AccountTransaction> findByAccount(
  Long accountId, int offset, int pageSize);
 ...
}

To support pagination, use the Query object's setFirstResult(int) and setMaxResults(int) methods.

Summary

I hope these notes can help developers avoid making these mistakes. To summarize:

  • Requiring a public The JPA-required no-arg constructor can be made public or protected. Consider making it protected if needed.
  • Always using Consider uni-directional over bi-directional associations/relationships.
  • Using Avoid @OneToMany for collections that can become huge. Consider mapping the @ManyToOne-side of the association/relationship instead, and support pagination.