A system with a complex domain model often benefits from a layer, such as the one provided by Data Mapper (165), that isolates domain objects from details of the database access code. In such systems it can be worthwhile to build another layer of abstraction over the mapping layer where query construction code is concentrated. This becomes more important when there are a large number of domain classes or heavy querying. In these cases particularly, adding this layer helps minimize duplicate query logic.
A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. Client objects construct query specifications declaratively and submit them to Repository for satisfaction. Objects can be added to and removed from the Repository, as they can from a simple collection of objects, and the mapping code encapsulated by the Repository will carry out the appropriate operations behind the scenes. Conceptually, a Repository encapsulates the set of objects persisted in a data store and the operations performed over them, providing a more object-oriented view of the persistence layer. Repository also supports the objective of achieving a clean separation and one-way dependency between the domain and data mapping layers.
How It Works
Repository is a sophisticated pattern that makes use of a fair number of the other patterns described in this book. In fact, it looks like a small piece of an object-oriented database and in that way it‘s similar to Query Object (316), which development teams may be more likely to encounter in an object-relational mapping tool than to build themselves. However, if a team has taken the leap and built Query Object (316), it isn‘t a huge step to add a Repository capability. When used in conjunction with Query Object (316), Repository adds a large measure of usability to the object-relational mapping layer without a lot of effort.
In spite of all the machinery behind the scenes, Repository presents a simple interface. Clients create a criteria object specifying the characteristics of the objects they want returned from a query. For example, to find person objects by name we first create a criteria object, setting each individual criterion like so: criteria.equals(Person.LAST_NAME, "Fowler"), and criteria.like(Person.FIRST_NAME, "M"). Then we invoke repository.matching(criteria) to return a list of domain objects representing people with the last name Fowler and a first name starting with M. Various convenience methods similar to matching (criteria) can be defined on an abstract repository; for example, when only one match is expected soleMatch(criteria) might return the found object rather than a collection. Other common methods include byObjectId(id), which can be trivially implemented using soleMatch.
To code that uses a Repository, it appears as a simple in-memory collection of domain objects. The fact that the domain objects themselves typically aren‘t stored directly in the Repository is not exposed to the client code. Of course, code that uses Repository should be aware that this apparent collection of objects might very well map to a product table with hundreds of thousands of records. Invoking all() on a catalog system‘s ProductRepository might not be such a good idea.
Repository replaces specialized finder methods on Data Mapper (165) classes with a specification-based approach to object selection [Evans and Fowler]. Compare this with the direct use of Query Object (316), in which client code may construct a criteria object (a simple example of the specification pattern), add() that directly to the Query Object (316), and execute the query. With a Repository, client code constructs the criteria and then passes them to the Repository, asking it to select those of its objects that match. From the client code‘s perspective, there‘s no notion of query "execution"; rather there‘s the selection of appropriate objects through the "satisfaction" of the query‘s specification. This may seem an academic distinction, but it illustrates the declarative flavor of object interaction with Repository, which is a large part of its conceptual power.
Under the covers, Repository combines Metadata Mapping (329) with a Query Object (316) to automatically generate SQL code from the criteria. Whether the criteria know how to add themselves to a query, the Query Object (316) knows how to incorporate criteria objects, or the Metadata Mapping (306) itself controls the interaction is an implementation detail.
The object source for the Repository may not be a relational database at all, which is fine as Repository lends itself quite readily to the replacement of the data-mapping component via specialized strategy objects. For this reason it can be especially useful in systems with multiple database schemas or sources for domain objects, as well as during testing when use of exclusively in-memory objects is desirable for speed.
Repository can be a good mechanism for improving readability and clarity in code that uses querying extensively. For example, a browser-based system featuring a lot of query pages needs a clean mechanism to process HttpRequest objects into query results. The handler code for the request can usually convert the HttpRequest into a criteria object without much fuss, if not automatically; submitting the criteria to the appropriate Repository should require only an additional line or two of code.
When to Use It
In a large system with many domain object types and many possible queries, Repository reduces the amount of code needed to deal with all the querying that goes on. Repository promotes the Specification pattern (in the form of the criteria object in the examples here), which encapsulates the query to be performed in a pure object-oriented way. Therefore, all the code for setting up a query object in specific cases can be removed. Clients need never think in SQL and can write code purely in terms of objects.
However, situations with multiple data sources are where we really see Repository coming into its own. Suppose, for example, that we‘re sometimes interested in using a simple in-memory data store, commonly when we wants to run a suite of unit tests entirely in memory for better performance. With no database access, many lengthy test suites run significantly faster. Creating fixture for unit tests can also be more straightforward if all we have to do is construct some domain objects and throw them in a collection rather than having to save them to the database in setup and delete them at teardown.
It‘s also conceivable, when the application is running normally, that certain types of domain objects should always be stored in memory. One such example is immutable domain objects (those that can‘t be changed by the user), which once in memory, should remain there and never be queried for again. As we‘ll see later in this chapter, a simple extension to the Repository pattern allows different querying strategies to be employed depending on the situation.
Another example where Repository might be useful is when a data feed is used as a source of domain objects梥ay, an XML stream over the Internet, perhaps using SOAP, might be available as a source. An XMLFeedRepositoryStrategy might be implemented that reads from the feed and creates domain objects from the XML.
Further Reading
The specification pattern hasn‘t made it into a really good reference source yet. The best published description so far is [Evans and Fowler]. A better description is currently in the works in [Evans].
Example: Finding a Person‘s Dependents (Java)
From the client object‘s perspective, using a Repository is simple. To retrieve its dependents from the database a person object creates a criteria object representing the search criteria to be matched and sends it to the appropriate Repository.
public class Person { public List dependents() { Repository repository = Registry.personRepository(); Criteria criteria = new Criteria(); criteria.equal(Person.BENEFACTOR, this); return repository.matching(criteria); } }
Common queries can be accommodated with specialized subclasses of Repository. In the previous example we might make a PersonRepository subclass of Repository and move the creation of the search criteria into the Repository itself.
public class PersonRepository extends Repository { public List dependentsOf(aPerson) { Criteria criteria = new Criteria(); criteria.equal(Person.BENEFACTOR, aPerson); return matching(criteria); } }
The person object then calls the dependents() method directly on its Repository.
public class Person { public List dependents() { return Registry.personRepository().dependentsOf(this); } }
Example: Swapping Repository Strategies (Java)
Because Repository‘s interface shields the domain layer from awareness of the data source, we can refactor the implementation of the querying code inside the Repository without changing any calls from clients. Indeed, the domain code needn‘t care about the source or destination of domain objects. In the case of the in-memory store, we want to change the matching() method to select from a collection of domain objects the ones satisfy the criteria. However, we‘re not interested in permanently changing the data store used but rather in being able to switch between data stores at will. From this comes the need to change the implementation of the matching() method to delegate to a strategy object that does the querying. The power of this, of course, is that we can have multiple strategies and we can set the strategy as desired. In our case, it‘s appropriate to have two: RelationalStrategy, which queries the database, and InMemoryStrategy, which queries the in-memory collection of domain objects. Each strategy implements the RepositoryStrategy interface, which exposes the matching() method, so we get the following implementation of the Repository class:
abstract class Repository { private RepositoryStrategy strategy; protected List matching(aCriteria) { return strategy.matching(aCriteria); } }
A RelationalStrategy implements matching() by creating a Query Object from the criteria and then querying the database using it. We can set it up with the appropriate fields and values as defined by the criteria, assuming here that the Query Object knows how to populate itself from criteria:
public class RelationalStrategy implements RepositoryStrategy { protected List matching(Criteria criteria) { Query query = new Query(myDomainObjectClass()) query.addCriteria(criteria); return query.execute(unitOfWork()); } }
An InMemoryStrategy implements matching() by iterating over a collection of domain objects and asking the criteria at each domain object if it‘s satisfied by it. The criteria can implement the satisfaction code using reflection to interrogate the domain objects for the values of specific fields. The code to do the selection looks like this:
public class InMemoryStrategy implements RepositoryStrategy { private Set domainObjects; protected List matching(Criteria criteria) { List results = new ArrayList(); Iterator it = domainObjects.iterator(); while (it.hasNext()) { DomainObject each = (DomainObject) it.next(); if (criteria.isSatisfiedBy(each)) results.add(each); } return results; } }