Sunday, December 01, 2013

GAE/J: Packaging your entity classes in separate jar files


I enjoy designing modularized applications. It allows me to visualize the architecture and understand the links between the various modules. In addition to benefitting the code base, it helps me appreciate the existing system's design better.

When using JPA (DataNucleus) as your persistence technology on top of Google's App Engine stack, you are most likely to end up with a single persistence.xml file and all your entity classes bundled in the same jar.

But what if you wanted to modularize your code and package groups of entity classes in their own jar files?

In this article, we are going to write a simple extension to the DataNucleus persistence provider that allows us to load entity classes that have been packaged into multiple jars.

Let us assume that, after refactoring and modularizing your code, you end up with a structure similar to the following:

  - project_root
      |
      |-> framework-model (produces framework-model.jar)
             |
             |-> AttributeEntity.class // Used for storing attributes
             |-> UserEntity.class      // Used for storing user entries
             |-> BinaryResource.class  // Used for storing small binaries
      |
      |-> catalog-model (produces catalog-model.jar)
             |
             |-> ProductCategoryEntity.class // The product categories.
             |-> ProductEntity.class    // The products



In the above example, I have two jars that contain the entities that I wish to consume within the application. So far so good, but unfortunately, just enhancing the classes and packaging them in two separate jars will not do the trick.

At this stage, to move forward we need to take a closer look at the entries that go into the persistence.xml file. From the documentation, the only information that looks useful is the persistence provider: org.datanucleus.api.jpa.PersistenceProviderImpl.

Here is where the beauty of open source projects shines the most. Because DataNucleus is open source, it is easy to view the source of the persistence provider implementation and understand its bootstrapping logic.

Internally, the provider implementation constructs a PersistenceUnitMetaData object that contains information about all the entity classes that have been enhanced. To identify the enhanced entity definitions, the provider inspects every archive that contains a "persistence.xml" file. Since we cannot create multiple persistence.xml files that point to the same PU name, we need to look at another way of solving the problem.
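
To make the idea of "archives that contain a given file" concrete, here is a minimal sketch using only the JDK. It is not the DataNucleus code; the class name ResourceLocator and its methods are hypothetical, and it simply asks a class loader for every copy of a resource and derives the root of the jar or directory that holds it.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper, not part of DataNucleus: finds classpath roots holding a resource. */
final class ResourceLocator {

    private ResourceLocator() {}

    /** Returns one URL per classpath entry (jar or directory) that contains resourceName. */
    static List<URL> findAll(ClassLoader loader, String resourceName) throws IOException {
        return Collections.list(loader.getResources(resourceName));
    }

    /** Strips the resource path from its URL, leaving the root of the jar or directory. */
    static String rootOf(URL resourceUrl, String resourceName) {
        String external = resourceUrl.toExternalForm();
        if (!external.endsWith(resourceName)) {
            throw new IllegalArgumentException(external + " does not end with " + resourceName);
        }
        return external.substring(0, external.length() - resourceName.length());
    }

    public static void main(String[] args) throws IOException {
        String name = "META-INF/persistence.xml";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        // Print the root of every archive or directory that ships a persistence.xml.
        for (URL url : findAll(loader, name)) {
            System.out.println(rootOf(url, name));
        }
    }
}
```

For a resource inside a jar, the root has the form jar:file:/path/to/some.jar!/, which is the archive whose classes would then be inspected.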

Custom Persistence Provider Implementation


Given that the persistence provider is the application's entry point into the JPA stack, and that we are trying to bootstrap the environment with entity classes that sit in different jars, the logical solution to our problem is to write a custom persistence provider implementation.

With a little bit of debugging and reading the source code, I came up with my own implementation - GAEPersistenceProviderImpl.

The basic idea behind this implementation is:
  • Search for all jars that contain a META-INF/orm.xml file
    The fun part is that you can have an empty orm.xml file and the system will not complain!
  • Use Reflections to scan all the classes within the jar files identified above.
  • When the application requests the EntityManagerFactory, collect all classes that have been annotated with @Entity and use that information to construct the metadata that DataNucleus requires
    This is where the Reflections project comes in handy: it can scan the classes without loading them
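
The groundwork for the first two steps can be sketched with the JDK alone. This is an illustration under assumptions, not the actual GAEPersistenceProviderImpl: the class MarkedJarScanner is hypothetical, and it only checks a jar for the META-INF/orm.xml marker and lists the class names found in it. Deciding which of those classes carry @Entity (the part handled by Reflections above) is deliberately left out.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper, not the author's provider: lists classes in jars that opt in via a marker. */
final class MarkedJarScanner {

    static final String MARKER = "META-INF/orm.xml";
    private static final String CLASS_SUFFIX = ".class";

    private MarkedJarScanner() {}

    /** Returns the class names in the jar, or an empty list when the marker file is absent. */
    static List<String> candidateClassNames(File jar) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar)) {
            if (jarFile.getEntry(MARKER) == null) {
                return names; // jar did not opt in, skip it
            }
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String entryName = entries.nextElement().getName();
                if (entryName.endsWith(CLASS_SUFFIX)) {
                    // Convert an entry path such as a/b/Foo.class into the class name a.b.Foo
                    String path = entryName.substring(0, entryName.length() - CLASS_SUFFIX.length());
                    names.add(path.replace('/', '.'));
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Usage: pass one or more jar paths on the command line.
        for (String path : args) {
            System.out.println(path + " -> " + candidateClassNames(new File(path)));
        }
    }
}
```

Only names are read here, so no class is loaded while scanning; a real implementation would still need to filter the list down to the annotated entities before handing it to DataNucleus.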

Update the persistence.xml file


The next step was to specify this custom provider in the persistence.xml file.

    <persistence xmlns="http://java.sun.com/xml/ns/persistence"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 version="2.0">
        <persistence-unit name="transactions-optional">
            <provider>rogue.app.framework.support.datanucleus.GAEPersistenceProviderImpl</provider>
            <properties>
               ...
            </properties>
        </persistence-unit>
    </persistence>


And voila, every jar file that contained a META-INF/orm.xml file was scanned for entities and passed along to the DataNucleus implementation.

Modularization - achieved!

Hope this was useful to you.