Wednesday, June 20, 2018

Dealing with Domain Objects in Spring MVC

I was recently surprised to find a code base where every domain entity had a public default constructor (i.e. a zero-arguments constructor) and getters and setters for all fields. As I dug deeper, I found that the domain entities are the way they are largely because the team believed the web/MVC framework required it. This seems like a good opportunity to clear up some misconceptions.

Specifically, we'll look at the following cases:

  1. No setter for generated ID field (i.e. the generated ID field has a getter but no setter)
  2. No default constructor (e.g. no public zero-arguments constructor)
  3. Domain entity with child entities (e.g. child entities are not exposed as a modifiable list)

Binding Web Request Parameters

First, some specifics and some background. Let's base this on a specific web/MVC framework - Spring MVC. When using Spring MVC, its data binding binds request parameters by name. Let's use an example.

@Controller
@RequestMapping("/accounts")
... class ... {
    ...
    @PostMapping
    public ... save(@ModelAttribute Account account, ...) {...}
    ...
}

Given the above controller mapped to "/accounts", where can an Account instance come from?

Based on documentation, Spring MVC will get an instance using the following options:

  • From the model if already added via Model (e.g. via a @ModelAttribute method in the same controller).
  • From the HTTP session via @SessionAttributes.
  • From a URI path variable passed through a Converter.
  • From the invocation of a default constructor.
  • (For Kotlin only) From the invocation of a "primary constructor" with arguments matching Servlet request parameters; argument names are determined via JavaBeans @ConstructorProperties or via runtime-retained parameter names in the bytecode.

Assuming an Account object has not been added to the model or the session, and that there is no @ModelAttribute method, Spring MVC will end up instantiating one using its default constructor and binding web request parameters by name. For example, suppose the request contains "id" and "name" parameters. Spring MVC will try to bind them to the "id" and "name" bean properties by invoking the "setId" and "setName" methods, respectively. This follows JavaBeans conventions.
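For illustration, here is the JavaBean shape that this default binding path expects (a hypothetical, simplified Account; the rest of this post is largely about what to do when your entities do not look like this):

```java
// Hypothetical JavaBean-style Account: the shape Spring MVC's default
// binding path expects (zero-arguments constructor plus name-matched setters).
// In real code the class would be public so the binder can introspect it.
class Account {
    private Long id;
    private String name;

    public Account() {} // invoked by the data binder

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; } // bound from the "id" parameter

    public String getName() { return name; }
    public void setName(String name) { this.name = name; } // bound from "name"
}
```

With a class like this, a request carrying id=7&name=alice effectively results in new Account() followed by setId(7L) and setName("alice").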

No Setter Method for Generated ID Field

Let's start with something simple. Let's say that we have an Account domain entity. It has an ID field that is generated by the persistent store, and only provides a getter method (but no setter method).

@Entity
... class Account {
    @Id @GeneratedValue(...) private Long id;
    ...
    public Account() { ... }
    public Long getId() { return id; }
    // but no setId() method
}

So, how can we have Spring MVC bind request parameters to an Account domain entity? Are we forced to have a public setter method for a field that is generated and read-only?

In our HTML form, we will not place the "id" as a request parameter. We will place it as a path variable instead.

We use a @ModelAttribute method. It is called prior to the request handling method. And it supports pretty much the same parameters as a regular request handling method. In our case, we use it to retrieve an Account domain entity with the given unique identifier, and use it for further binding. Our controller would look something like this.

@Controller
@RequestMapping("/accounts")
... class ... {
    ...
    @ModelAttribute
    public Account populateModel(
            HttpMethod httpMethod,
            @PathVariable(required=false) Long id) {
        if (id != null) {
            return accountRepository.findById(id).orElseThrow(...);
        }
        if (httpMethod == HttpMethod.POST) {
            return new Account();
        }
        return null;
    }

    @PutMapping("/{id}")
    public ... update(...,
            @ModelAttribute @Valid Account account, ...) {
        ...
        accountRepository.save(account);
        return ...;
    }

    @PostMapping
    public ... save(@ModelAttribute @Valid Account account, ...) {
        ...
        accountRepository.save(account);
        return ...;
    }
    ...
}

When updating an existing account, the request would be a PUT to "/accounts/{id}" URI. In this case, our controller needs to retrieve the domain entity with the given unique identifier, and provide the same domain object to Spring MVC for further binding, if any. The "id" field will not need a setter method.

When adding or saving a new account, the request would be a POST to "/accounts". In this case, our controller needs to create a new domain entity with some request parameters, and provide the same domain object to Spring MVC for further binding, if any. For new domain entities, the "id" field is left null. The underlying persistence infrastructure will generate a value upon storing. Still, the "id" field will not need a setter method.

In both cases, the @ModelAttribute method populateModel is called prior to the mapped request handling method. Because of this, we needed to use parameters in populateModel to determine which case it is being used in.

No Default Constructor in Domain Object

Let's say that our Account domain entity does not provide a default constructor (i.e. no zero-arguments constructor).

... class Account {
    public Account(String name) {...}
    ...
    // no public default constructor
    // (i.e. no public zero-arguments constructor)
}

So, how can we have Spring MVC bind request parameters to an Account domain entity? It does not provide a default constructor.

We can use a @ModelAttribute method. In this case, we want to create an Account domain entity with request parameters, and use it for further binding. Our controller would look something like this.

@Controller
@RequestMapping("/accounts")
... class ... {
    ...
    @ModelAttribute
    public Account populateModel(
            HttpMethod httpMethod,
            @PathVariable(required=false) Long id,
            @RequestParam(required=false) String name) {
        if (id != null) {
            return accountRepository.findById(id).orElseThrow(...);
        }
        if (httpMethod == HttpMethod.POST) {
            return new Account(name);
        }
        return null;
    }

    @PutMapping("/{id}")
    public ... update(...,
            @ModelAttribute @Valid Account account, ...) {
        ...
        accountRepository.save(account);
        return ...;
    }

    @PostMapping
    public ... save(@ModelAttribute @Valid Account account, ...) {
        ...
        accountRepository.save(account);
        return ...;
    }
    ...
}

Domain Entity with Child Entities

Now, let's look at a domain entity that has child entities. Something like this.

... class Order {
    private Map<..., OrderItem> items;
    public Order() {...}
    public void addItem(int quantity, ...) {...}
    ...
    public Collection<OrderItem> getItems() {
        return Collections.unmodifiableCollection(items.values());
    }
}

... class OrderItem {
    private int quantity;
    // no public default constructor
    ...
}

Note that the items in an order are not exposed as a modifiable list. Spring MVC supports indexed properties and binds them to an array, list, or other naturally ordered collection. But in this case, the getItems method returns an unmodifiable collection, which means an exception would be thrown by any attempt to add or remove items. So, how can we have Spring MVC bind request parameters to an Order domain entity? Are we forced to expose the order items as a mutable list?

Not really. We should refrain from diluting the domain model with presentation-layer concerns (like Spring MVC). Instead, we make the presentation layer a client of the domain model. To handle this case, we create another type that complies with Spring MVC's binding conventions, and keep our domain entities agnostic of the presentation layer.

... class OrderForm {
    public static OrderForm fromDomainEntity(Order order) {...}
    ...
    // public default constructor
    // (i.e. public zero-arguments constructor)
    private List<OrderFormItem> items;
    public List<OrderFormItem> getItems() { return items; }
    public void setItems(List<OrderFormItem> items) { this.items = items; }
    public Order toDomainEntity() {...}
}

... class OrderFormItem {
    ...
    private int quantity;
    // public default constructor
    // (i.e. public zero-arguments constructor)
    // public getters and setters
}

Note that it is perfectly all right to create a presentation-layer type that knows about the domain entity. But it is not all right to make the domain entity aware of presentation-layer objects. More specifically, presentation-layer OrderForm knows about the Order domain entity. But Order does not know about presentation-layer OrderForm.
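To make the conversion concrete, here is a minimal sketch of the two mapping methods (the types are deliberately simplified to a single quantity field; the real classes would carry more state):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simplified domain entity: items are only added through addItem,
// and are exposed as an unmodifiable collection.
class Order {
    private final List<OrderItem> items = new ArrayList<>();
    public void addItem(int quantity) { items.add(new OrderItem(quantity)); }
    public List<OrderItem> getItems() { return Collections.unmodifiableList(items); }
}

class OrderItem {
    private final int quantity;
    OrderItem(int quantity) { this.quantity = quantity; }
    public int getQuantity() { return quantity; }
}

// Presentation-layer form: JavaBean-shaped so Spring MVC can bind to it.
// It knows about Order; Order knows nothing about it.
class OrderForm {
    private List<OrderFormItem> items = new ArrayList<>();

    public OrderForm() {} // public zero-arguments constructor for binding

    public static OrderForm fromDomainEntity(Order order) {
        OrderForm form = new OrderForm();
        for (OrderItem item : order.getItems()) {
            OrderFormItem formItem = new OrderFormItem();
            formItem.setQuantity(item.getQuantity());
            form.getItems().add(formItem);
        }
        return form;
    }

    public List<OrderFormItem> getItems() { return items; }
    public void setItems(List<OrderFormItem> items) { this.items = items; }

    public Order toDomainEntity() {
        Order order = new Order();
        for (OrderFormItem formItem : items) {
            order.addItem(formItem.getQuantity()); // go through the domain method
        }
        return order;
    }
}

class OrderFormItem {
    private int quantity;
    public int getQuantity() { return quantity; }
    public void setQuantity(int quantity) { this.quantity = quantity; }
}
```

Because toDomainEntity goes through Order.addItem, the domain entity's invariants stay in force; the form never reaches into Order's internals.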

Here's what our controller will look like.

@Controller
@RequestMapping("/orders")
... class ... {
    ...
    @ModelAttribute
    public OrderForm populateModel(
            HttpMethod httpMethod,
            @PathVariable(required=false) Long id,
            @RequestParam(required=false) String name) {
        if (id != null) {
            return OrderForm.fromDomainEntity(
                orderRepository.findById(id).orElseThrow(...));
        }
        if (httpMethod == HttpMethod.POST) {
            return new OrderForm(); // new Order()
        }
        return null;
    }

    @PutMapping("/{id}")
    public ... update(...,
            @ModelAttribute @Valid OrderForm orderForm, ...) {
        ...
        orderRepository.save(orderForm.toDomainEntity());
        return ...;
    }

    @PostMapping
    public ... save(@ModelAttribute @Valid OrderForm orderForm, ...) {
        ...
        orderRepository.save(orderForm.toDomainEntity());
        return ...;
    }
    ...
}

Closing Thoughts

As I've mentioned in previous posts, it is all right to have your domain objects look like JavaBeans with public default zero-arguments constructors, getters, and setters. But if the domain logic starts to get complicated, and requires that some domain objects lose their JavaBean-ness (e.g. no more public zero-arguments constructor, no more setters), do not worry. Define new JavaBean types to satisfy presentation-related concerns. Do not dilute the domain logic.

That's all for now. I hope this helps.

Thanks again to Juno for helping me out with the samples. The relevant pieces of code can be found on GitHub.

Wednesday, January 24, 2018

JasperReports: The Tricky Parts

If you have been programming in Java long enough, chances are you needed to generate reports for business users. In my case, I've seen several projects use JasperReports® Library to generate reports in PDF and other file formats. Recently, I've had the privilege of observing Mike and his team use the said reporting library and the challenges they faced.

JasperReports in a Nutshell

In a nutshell, generating reports using JasperReports (JR) involves three steps:

  1. Load compiled report (i.e. load a JasperReport object)
  2. Run report by filling it with data (results in a JasperPrint object)
  3. Export filled report to a file (e.g. use JRPdfExporter to export to PDF)

In Java code, it looks something like this (here, the report is compiled from its JRXML source at runtime, rather than loaded pre-compiled).

JasperReport compiledReport = JasperCompileManager.compileReport(
        "sample.jrxml");
Map<String, Object> parameters = ...;
java.sql.Connection connection = dataSource.getConnection();
try {
    JasperPrint filledReport = JasperFillManager.fillReport(
            compiledReport, parameters, connection);
    JasperExportManager.exportReportToPdf(
            filledReport, "report.pdf");
} finally {
    connection.close();
}

Thanks to the facade classes, this looks simple enough. But looks can be deceiving!

Given the above code snippet (and the outlined three steps), which part do you think takes the most time and memory? (Sounds like an interview question.)

If you answered (#2) filling with data, you're correct! If you answered #3, you're also correct, since #3 is proportional to #2.

IMHO, most online tutorials only show the easy parts. In the case of JR, there seems to be a lack of discussion on the more difficult and tricky parts. Here, with Mike's team, we encountered two difficulties: out-of-memory errors and long-running reports. What made these difficulties particularly memorable was that they only showed up in production (not during development). I hope that by sharing them, they can be avoided in the future.

Out of Memory Errors

The first challenge was reports running out of memory. During development, the test data used to run the report tends to be small compared to real operating data. So, design for that.

In our case, all reports were run with a JRVirtualizer. This way, it will flush to disk/file when the maximum number of pages/objects in memory has been reached.

During the process, we also learned that the virtualizer needs to be cleaned up. Otherwise, there will be several temporary files lying around. And we can only clean up these temporary files after the report has been exported to a file.

Map<String, Object> parameters = ...;
JRVirtualizer virtualizer = new JRFileVirtualizer(100);
try {
    parameters.put(JRParameter.REPORT_VIRTUALIZER, virtualizer);
    ...
    ... filledReport = JasperFillManager.fillReport(
            compiledReport, parameters, ...);
    // cannot cleanup virtualizer at this point
    JasperExportManager.exportReportToPdf(filledReport, ...);
} finally {
    virtualizer.cleanup();
}

For more information, please see Virtualizer Sample - JasperReports.

Note that JR is not always the culprit when out-of-memory errors occur while running reports. Sometimes, we would hit an out-of-memory error even before JR was used. We saw how JPA can be misused to load the entire dataset for the report (via Query.getResultList() and TypedQuery.getResultList()). Again, the error does not show up during development, since the dataset is still small. But when the dataset is too large to fit in memory, we get out-of-memory errors. We opted to avoid using JPA for generating reports. I guess we'll just have to wait until JPA 2.2's Query.getResultStream() becomes available. I wish JPA's Query.getResultList() returned Iterable instead. That way, one entity can be mapped at a time, instead of the entire result set.

For now, avoid loading the entire dataset. Load one record at a time. In the process, we went back to good ol' JDBC. Good thing JR uses ResultSets well.
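To make the record-at-a-time idea concrete, here is a toy sketch with no JPA or JDBC involved (the page-fetching function stands in for a database cursor; all names here are made up): memory use is bounded by the page size rather than the full result size.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.IntFunction;

// Toy record-at-a-time iterator: fetches one page at a time from a
// hypothetical source (the IntFunction maps a page index to its records,
// standing in for a DB cursor), so only one page is ever held in memory.
class PagedIterator<T> implements Iterator<T> {
    private final IntFunction<List<T>> fetchPage; // page index -> records
    private List<T> currentPage;
    private int pageIndex = 0;
    private int offsetInPage = 0;

    PagedIterator(IntFunction<List<T>> fetchPage) {
        this.fetchPage = fetchPage;
        this.currentPage = fetchPage.apply(0);
    }

    @Override
    public boolean hasNext() {
        if (offsetInPage < currentPage.size()) {
            return true; // still records left in the current page
        }
        if (currentPage.isEmpty()) {
            return false; // last fetch returned nothing; we are done
        }
        currentPage = fetchPage.apply(++pageIndex); // advance to next page
        offsetInPage = 0;
        return !currentPage.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return currentPage.get(offsetInPage++);
    }
}
```

The same shape is what a streaming query result (or a JDBC ResultSet with a sensible fetch size) gives you for free.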

Long Running Reports

The second challenge was long-running reports. Again, this probably doesn't happen during development, where a report that runs for about 10 seconds already feels long. But with real operating data, it can run for about 5-10 minutes. This is especially painful when the report is generated upon an HTTP request. If the report can start writing to the response output stream within the timeout period (usually 60 seconds, or up to 5 minutes), then it has a good chance of being received by the requesting user (usually via a browser). But if it takes more than 5 minutes to fill the report and another 8 minutes to export it to a file, then the user will just see a timed-out HTTP request, and log it as a bug. Sound familiar?

Keep in mind that reports can run for a few minutes. So, design for that.

In our case, we launch reports on a separate thread. For reports that are triggered by an HTTP request, we respond with a page that contains a link to the generated report. This avoids the time-out problem. When the user clicks on this link and the report is not yet complete, s/he will see that the report is still being generated. But when the report is complete, s/he will be able to see the generated report file.

ExecutorService executorService = ...;
... = executorService.submit(() -> {
    Map<String, Object> parameters = ...;
    try {
        ...
        ... filledReport = JasperFillManager.fillReport(
                compiledReport, parameters, ...);
        JasperExportManager.exportReportToPdf(filledReport, ...);
    } finally {
        ...
    }
});

We also had to add the ability to stop/cancel a running report. Good thing JR has code that checks for Thread.interrupted(). So, simply interrupting the thread will make it stop. Of course, you'll need to write some tests to verify (expect JRFillInterruptedException and ExportInterruptedException).
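The cancellation mechanics can be sketched without JR at all. In the toy below, the busy loop stands in for a long report fill and checks the interrupt flag the way JR's filler does internally; cancelling the Future with mayInterruptIfRunning=true interrupts the worker thread and the task stops. (All names here are made up for illustration.)

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of interrupt-based cancellation. The loop stands in for a long
// report fill; JR's own code periodically checks the interrupt flag the
// same way and aborts with JRFillInterruptedException.
class CancellableReportDemo {
    static final AtomicBoolean stoppedByInterrupt = new AtomicBoolean(false);

    static void run() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<?> runningReport = executor.submit(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                // simulate filling one more report page
            }
            stoppedByInterrupt.set(true); // interrupted -> stop filling
        });
        try {
            Thread.sleep(100);          // let the "report" run a bit
            runningReport.cancel(true); // interrupt the worker thread
            executor.shutdown();
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```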

And while we were at it, we rediscovered ways to add "listeners" to the report generation (e.g. FillListener and JRExportProgressMonitor) and provide the user some progress information.

We also created utility test classes to generate large amounts of data by repeating a given piece of data over and over. This is useful to help the rest of the team develop JR applications that are designed for handling long runs and out-of-memory errors.

Further Design Considerations

Another thing to consider is the opening and closing of the resource needed when filling the report. This could be a JDBC connection, a Hibernate session, a JPA EntityManager, or a file input stream (e.g. CSV, XML). Illustrated below is a rough sketch of my design considerations.

1. Compiling
         - - - - - - - - - - - - - -\
         - - - -\                    \
2. Filling       > open-close         \
         - - - -/   resource           > swap to file
                                      /
3. Exporting                         /
         - - - - - - - - - - - - - -/

We want to isolate #2 and define decorators that would open the resource, fill the report, and close the opened resource in a finally block. The resource that is opened may depend on the <queryString> element (if present) inside the report. In some cases, where there is no <queryString> element, there is probably no need to open a resource.

<queryString language="hql">
    <![CDATA[ ... ]]>
</queryString>
...
<queryString language="csv">
    <![CDATA[ ... ]]>
</queryString>

Furthermore, we also want to combine #2 and #3 as one abstraction. This single abstraction makes it easier to decorate with enhancements, like flushing the created page objects to files, and load them back during exporting. As mentioned, this is what the JRVirtualizer does. But we'd like a design where this is transparent to the object(s) using the combined-#2-and-#3 abstraction.
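Here is one rough sketch of that design (all of these types are hypothetical, not JR API): the Consumer is the combined fill-and-export abstraction (#2 and #3), and the decorator opens the resource before it and closes it in a finally block afterwards.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch; none of these types come from JR. The Consumer is
// the combined fill-and-export step (#2 and #3); this decorator wraps it
// with open/close of the resource the fill needs (a JDBC connection, a
// Hibernate session, a file input stream, ...), closing in finally.
class ResourceManagedReportRunner implements Runnable {
    private final Supplier<AutoCloseable> openResource;  // e.g. open a connection
    private final Consumer<AutoCloseable> fillAndExport; // combined #2 + #3

    ResourceManagedReportRunner(Supplier<AutoCloseable> openResource,
                                Consumer<AutoCloseable> fillAndExport) {
        this.openResource = openResource;
        this.fillAndExport = fillAndExport;
    }

    @Override
    public void run() {
        AutoCloseable resource = openResource.get(); // open
        try {
            fillAndExport.accept(resource);          // fill + export
        } finally {
            try {
                resource.close();                    // always close
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }
}
```

Whether a resource needs opening at all can then be decided by the presence of the <queryString> element, without the filling and exporting code knowing about it.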

Acknowledgements

That's all for now. Again, thanks to Mike and his team for sharing their experiences. Yup, he's the same guy who donates his app's earnings to charity. Also, thanks to Claire for the ideas on testing by repeating a given data again and again. The relevant pieces of code can be found on GitHub.

Tuesday, January 2, 2018

DataSource Routing with Spring @Transactional

I was inspired by Carl Papa's use of aspects with the Spring Framework to determine the DataSource to use (either read-write or read-only). So, I'm writing this post.

I must admit that I have long been familiar with Spring's AbstractRoutingDataSource, but I did not have a good idea of where it could be used. Thanks to Carl and his team, and one of their projects, I now know a good use case.

@Transactional

With Spring, read-only transactions are typically marked with annotations.

public class ... {
    @Transactional(readOnly=true)
    public void ...() {...}

    @Transactional // read-write
    public void ...() {...}
}

To take advantage of this, we use Spring's TransactionSynchronizationManager to determine if the current transaction is read-only or not.

AbstractRoutingDataSource

Here, we use Spring's AbstractRoutingDataSource to route to the read-only replica if the current transaction is read-only. Otherwise, it routes to the default which is the master.

public class ... extends AbstractRoutingDataSource {
    @Override
    protected Object determineCurrentLookupKey() {
        if (TransactionSynchronizationManager
                .isCurrentTransactionReadOnly() ...) {
            // return key to a replica
        }
        return null; // use default
    }
    ...
}

Upon using the above approach, we found out that the TransactionSynchronizationManager is one step behind because Spring will have already called DataSource.getConnection() before a synchronization is established. Thus, a LazyConnectionDataSourceProxy needs to be configured as well.

As we were discussing this, we wondered if there was another way to determine whether the current transaction is read-only (without resorting to LazyConnectionDataSourceProxy). So, we came up with an experimental approach where an aspect captures the TransactionDefinition (from the @Transactional annotation, if any) in a thread-local variable, and an AbstractRoutingDataSource routes based on the captured information.
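A framework-free sketch of that experimental idea (the aspect and the Spring types are elided; all names below are made up for illustration): the "aspect" records the read-only flag in a ThreadLocal before running the transactional work, and the routing lookup reads it.

```java
// Framework-free sketch of the experimental approach. The real version
// uses an aspect around @Transactional methods and an
// AbstractRoutingDataSource; this only shows the thread-local capture.
class CapturedTransactionDefinition {
    private static final ThreadLocal<Boolean> READ_ONLY = new ThreadLocal<>();

    // The "aspect": capture the read-only flag, run the transactional
    // work, and always clear the thread-local afterwards.
    static void runWithReadOnlyFlag(boolean readOnly, Runnable work) {
        READ_ONLY.set(readOnly);
        try {
            work.run();
        } finally {
            READ_ONLY.remove();
        }
    }

    // What a routing determineCurrentLookupKey() would consult.
    static String currentLookupKey() {
        return Boolean.TRUE.equals(READ_ONLY.get())
                ? "replica"  // read-only -> route to a replica
                : null;      // null = use the default (master)
    }
}
```

Because the flag is captured before any DataSource.getConnection() call, this avoids the one-step-behind problem described above.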

The relevant source code can be found on GitHub. Thanks again, Carl! BTW, Carl is also an award-winning movie director. Wow, talent definitely knows no boundaries.