Tuesday, September 9, 2008

Using the repository pattern to achieve persistence ignorance in practice

I recently experimented with migrating a project from Linq2Sql to Linq2NHibernate. It’s a small Windows time tracker application that features offline capability.

The original app, built a year ago, used Linq2Sql’s class designer to create domain classes from existing database tables. Along with the domain classes, it created a DataContext class:

public partial class DomainDataContext : System.Data.Linq.DataContext
{
 public System.Data.Linq.Table<customer> Customers
 {
  get { return this.GetTable<customer>(); }
 }
 public System.Data.Linq.Table<project> Projects
 {
  get { return this.GetTable<project>(); }
 }
}

Table<T> is in fact Microsoft’s implementation of the repository pattern. I have two issues with Table<T> as a repository implementation. One, I like my repositories to take the shape of a collection, more in line with what repositories were originally: a facade that lets you access data through a collection metaphor. The method names should be Add, Remove and Clear, as you would expect from a normal collection. In Linq2Sql, MS renamed those to InsertOnSubmit, DeleteOnSubmit and so on.

Issue number two with Table<T> is that the methods Insert/DeleteOnSubmit are not defined in an interface but on Table<T> directly. That means I have to rely on a concrete class. Bad OOD karma! The thing is, these methods are really part of another pattern, Unit of Work. There is a muddy mismatch between the two and a need for a unified way to access data through repositories.

In order to be accessed in a manner closer to real collections, I could let each repository implement ICollection<T>:

public interface Repositories : IDisposable
{
 ICollection<customer> Customers { get; }
 ICollection<project> Projects { get; }
}

That’s all well and dandy as long as my repositories are simple in-memory collections or in-memory collections persisted using Xml. If I want to switch to repositories backed by Linq2Sql or Linq2NHibernate, trouble arises: each time a repository is queried, the whole table is loaded into RAM and filtered there. Ooops. The trouble has to do with the way that Linq compiles queries.

Linq is able to choose between running queries in-memory or capturing the query expression in an expression tree then translating it into Sql for execution on the database server. The (not so secret) secret consists of two interfaces, IEnumerable<T> and IQueryable<T>.

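If the collection you query against is typed as IQueryable<T>, the query is captured as an expression tree and handed to the collection’s query provider, which in the case of Linq2Sql translates it to Sql. If the collection is typed only as IEnumerable<T>, the query is run in memory when the GetEnumerator() method is called. It is the compile-time type of the collection that decides which of the two happens.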
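To make the difference concrete, here is a minimal sketch using only an in-memory list. The class and method names (QueryBindingDemo, OutermostCall, Run) are made up for illustration; the Linq calls themselves are the standard ones from System.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class QueryBindingDemo
{
    // Returns the name of the outermost method call recorded in the query's expression tree.
    public static string OutermostCall(IQueryable<int> query)
    {
        var call = query.Expression as MethodCallExpression;
        return call == null ? "(none)" : call.Method.Name;
    }

    public static void Run()
    {
        var numbers = new List<int> { 1, 2, 3, 4 };

        // Static type IEnumerable<int>: binds to Enumerable.Where, which takes a
        // compiled delegate and filters in memory while the result is enumerated.
        IEnumerable<int> asEnumerable = numbers;
        IEnumerable<int> inMemory = asEnumerable.Where(n => n > 2);

        // Static type IQueryable<int>: binds to Queryable.Where, which takes an
        // expression tree and only records the call for the provider to interpret.
        IQueryable<int> asQueryable = numbers.AsQueryable();
        IQueryable<int> recorded = asQueryable.Where(n => n > 2);

        Console.WriteLine(inMemory is IQueryable<int>); // False
        Console.WriteLine(OutermostCall(recorded));     // Where
        Console.WriteLine(recorded.Count());            // 2
    }
}
```

With an ORM-backed provider, the recorded expression is what gets translated to Sql; with ICollection<T> there is nothing recorded to translate, so the filtering can only happen after the rows have been loaded.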

When switching from in-memory collections to ORM-backed repositories, I can no longer let my repositories implement ICollection<T> only, since Table<T> and NHibernate’s Linq<T>() give me IQueryable<T> rather than ICollection<T>. In other words, I’m forced to change my interface to:

public interface Repositories : IDisposable
{
 IQueryable<customer> Customers { get; }
 IQueryable<project> Projects { get; }
}

Only now, I’m back to having repositories that are queryable but do not include any way to add or delete objects.

What I would really like is a way to leave my Repositories interface alone while still being able to switch between database persistence, file based persistence, no persistence, pen-and-paper based persistence, coffee based persistence … anyway, you get the point.

What I need is a new interface:

public interface QueryableCollection<T> : IQueryable<T>, ICollection<T> { }

Allowing me to declare my repositories as:

public interface Repositories : IDisposable
{
 QueryableCollection<customer> Customers { get; }
 QueryableCollection<project> Projects { get; }
}

That way I can easily swap persistence mechanisms, or even have two different schemes running at the same time.

Here is my repository implementation for NHibernate:

public class NHRepositories : Repositories, ConnectionProvider
{
 private readonly ISession _session;



 public QueryableCollection<customer> Customers
 {
  get { return new NHRepositoryAdapter<customer>(_session); }
 }

 public QueryableCollection<project> Projects
 {
  get { return new NHRepositoryAdapter<project>(_session); }
 }

}

NHRepositoryAdapter exposes NHibernate’s Query<T> as a QueryableCollection<T>:

internal class NHRepositoryAdapter<T> : QueryableCollection<T>
{
 private readonly ISession _session;

 public NHRepositoryAdapter(ISession session)
 {
  _session = session;
 }

 public IEnumerator<T> GetEnumerator()
 {
  return _session.Linq<T>().GetEnumerator();
 }
}

To satisfy the in-memory collections, I made an adapter to expose an IList<T> as a QueryableCollection<T> using Linq’s built-in AsQueryable() method:

public class QueryableList<T> : IList<T>, QueryableCollection<T>
{
 private readonly List<T> _list;
 private readonly IQueryable<T> _queryable;

 public QueryableList()
 {
  _list = new List<T>();
  _queryable = _list.AsQueryable();
 }

 public IEnumerator<T> GetEnumerator()
 {
  return _list.GetEnumerator();
 }

 public Expression Expression
 {
  get { return _queryable.Expression; }
 }

}
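The listing above shows only part of the wrapper. As a rough sketch of what the full delegation might look like, here is a simplified variant that implements just QueryableCollection<T> (not IList<T>). The names InMemoryRepository and InMemoryDemo are hypothetical and not taken from the app; the interface is repeated so the example stands on its own.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public interface QueryableCollection<T> : IQueryable<T>, ICollection<T> { }

// Hypothetical in-memory repository: queries go to the AsQueryable() wrapper,
// collection operations go to the underlying list.
public class InMemoryRepository<T> : QueryableCollection<T>
{
    private readonly List<T> _list = new List<T>();
    private readonly IQueryable<T> _queryable;

    public InMemoryRepository()
    {
        // List<T> is not an IQueryable<T>, so AsQueryable() returns a separate
        // wrapper that reads from _list whenever a query is executed.
        _queryable = _list.AsQueryable();
    }

    // IQueryable<T> members, delegated to the wrapper.
    public Expression Expression { get { return _queryable.Expression; } }
    public Type ElementType { get { return _queryable.ElementType; } }
    public IQueryProvider Provider { get { return _queryable.Provider; } }

    // ICollection<T> members, delegated to the list.
    public int Count { get { return _list.Count; } }
    public bool IsReadOnly { get { return false; } }
    public void Add(T item) { _list.Add(item); }
    public bool Remove(T item) { return _list.Remove(item); }
    public void Clear() { _list.Clear(); }
    public bool Contains(T item) { return _list.Contains(item); }
    public void CopyTo(T[] array, int arrayIndex) { _list.CopyTo(array, arrayIndex); }

    public IEnumerator<T> GetEnumerator() { return _list.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class InMemoryDemo
{
    // Adds through the collection side and filters through the queryable side.
    public static List<string> Run()
    {
        QueryableCollection<string> customers = new InMemoryRepository<string>();
        customers.Add("Acme");
        customers.Add("Initech");
        return customers.Where(name => name.StartsWith("A")).ToList(); // contains only "Acme"
    }
}
```

Calling code only ever sees QueryableCollection<T>, so the same Add and Where calls would work unchanged against an ORM-backed implementation.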

Couldn’t I just implement my repositories by inheriting List<T>, implementing IQueryable<T> and then delegating calls to IQueryable<T>’s members to Queryable.AsQueryable()? That would save the tedious wrapper code. Unfortunately, that results in stack overflow errors when Linq calls the getters for the three properties Expression, Provider and ElementType defined in IQueryable<T>. I suppose the reason is that AsQueryable is in fact an extension method and thus doesn’t obey normal inheritance rules. Calling base.AsQueryable() gives the same result as this.AsQueryable() even though the getters have been overridden in the subclass.
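A likely more precise explanation is that Queryable.AsQueryable() returns its argument unchanged when that argument already implements IQueryable<T>, so a getter written as this.AsQueryable().Expression ends up calling itself. The sketch below checks just that behaviour. SelfQueryingList<T> is a hypothetical stand-in for the failed subclass, with the recursive getters replaced by throwing stubs so that nothing overflows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical subclass shaped like the failed attempt. The getters are stubs
// here; in the failing version they delegated to this.AsQueryable().
public class SelfQueryingList<T> : List<T>, IQueryable<T>
{
    public Expression Expression { get { throw new NotSupportedException(); } }
    public Type ElementType { get { return typeof(T); } }
    public IQueryProvider Provider { get { throw new NotSupportedException(); } }
}

public static class AsQueryableDemo
{
    // The source already is an IQueryable<int>, so AsQueryable() hands it back as-is.
    public static bool ReturnsSameInstance()
    {
        var list = new SelfQueryingList<int>();
        IQueryable<int> q = list.AsQueryable();
        return object.ReferenceEquals(q, list); // true
    }

    // A plain List<int> is not an IQueryable<int>, so it gets wrapped in a new object.
    public static bool WrapsPlainList()
    {
        var list = new List<int>();
        IQueryable<int> q = list.AsQueryable();
        return !object.ReferenceEquals(q, list); // true
    }
}
```

This would also explain why the wrapper approach works: the private List<T> inside QueryableList<T> is not itself queryable, so AsQueryable() really does produce a separate object to delegate to.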

Another concern to air is: Does the persistence mechanism really change often enough to justify this abstraction and added complexity? Not always. In this particular app, yes. One of the requirements is smooth operation online as well as offline. I can achieve that easily using my QueryableCollection interface. When running offline, my repositories use Xml as storage. When online and when synchronizing, they use NHibernate with a Sql Server database behind.

Another way of achieving offline functionality would be to only let the app talk to a SqlCe 3.5 database via Linq2Sql or Linq2NHibernate and then let ADO.NET Synchronization Services sync it with the master Sql Server database. Then you wouldn’t need the abstraction I made, but the complexity would only be relocated to configuring Synchronization Services.

Anyway, this solution gives me maximum flexibility in persistence ignorance. The price is a new interface and an adapter class for each storage mechanism. It’s not feasible in all solutions, but can be if you need the ability to manage offline/online synchronization manually or store data in several places using the same repository abstraction.

2 Responses to 'Using the repository pattern to achieve persistence ignorance in practice'
  1. Morten Lyhr said,

    ON SEPTEMBER 9TH, 2008 AT 11:40 PM

    Great post Søren!

    But its not persistence ignorence you have achieved, its ORM ignorence.

    Actually I was wondering how to make a “POCO LINQ” repository that was not tied to any specific ORM. I guess you beat me to it :-)

  2. Rasmus Kromann-Larsen said,

    ON OCTOBER 10TH, 2008 AT 11:50 PM

    Nice post.

    I’m about to play around with LINQ2NHibernate myself, in a LINQ-less solution that was recently kicked up to 3.5. I think your post might be the inspiration for my repositories.

    - Rasmus.
