.NET Junkie - Faking your LINQ Provider part 2: Optimizing performance with fetching strategies

19 June 11

This sequel explains how to write O/RM specific performance optimizations in such a way that the core business logic won’t be affected. This article builds on top of the foundation described in part 1 and uses the Fetching Strategy design pattern to achieve this goal.

In part 1 of this series I described how to hide our LINQ enabled O/RM tool behind an abstraction, with the main goal of being able to unit test the application while keeping the ability to write LINQ (over IQueryable) queries in the business layer.

One of the shortcomings noted at the end of that article is that optimizing the code becomes harder. In fact this is only partly true, because once you have mastered the art of writing good LINQ queries, you’ll notice that many performance problems can be fixed with LINQ itself (including neat little tricks such as Damien’s Include extension method). On the other hand, because LINQ statements tend to be written at a higher level of abstraction, our O/RM tool sometimes translates them poorly to SQL. I had to deal with this a lot over the last couple of months. Sometimes a single let clause was the difference between a tunable SQL query and a hideous performance hog.

However, because we are hiding our O/RM tool behind an abstraction, it obviously gets harder to do O/RM specific optimizations. Or at least, we can no longer do them in the place we were used to. Think, for instance, about how LINQ to SQL contains the DataLoadOptions class, which allows us to instruct LINQ to SQL to fetch related entities in advance. Most modern O/RM tools have similar options to configure loading behavior.

The chosen design forces us to write these optimizations in a different way, which gives us the opportunity to come up with a flexible design that follows the SOLID design principles, for instance by applying Udi Dahan’s Fetching Strategies.

The idea behind Udi’s Fetching Strategies is to define specific classes that describe O/RM specific optimizations for a single service. The class is therefore very focused (Single Responsibility Principle), and new optimizations can be plugged in easily (Open-Closed Principle), without the need to change any other part of the system. Because these optimizations are O/RM specific, you want to separate them from the services that they optimize. Those services contain the core logic of the application, working against an abstraction, while the strategies themselves are an implementation detail. These strategies could easily be located in a different assembly or even in the composition root of the application.

While the goal of the IUnitOfWorkFactory was to hide the O/RM tool, the interface for applying optimizations does not try to hide the O/RM at all. On the contrary: the interface will be defined for a specific O/RM. For LINQ to SQL the interface might look like this:

public interface IFetchingStrategy<TService>
{
    void Prepare(DataLoadOptions loadOptions);
}

Say for instance that our system contains a MoveCustomerCommand that gets processed by a MoveCustomerCommandHandler (see an example of a command and a handler in my previous article). In that case we could define a MoveCustomerHandlerFetchingStrategy that changes the default loading behavior of the DataContext class when the MoveCustomerCommandHandler requests a new context:

public sealed class MoveCustomerHandlerFetchingStrategy
    : IFetchingStrategy<MoveCustomerCommandHandler>
{
    public void Prepare(DataLoadOptions loadOptions)
    {
        loadOptions.LoadWith<Customer>(c => c.Orders);
    }
}

This fetching strategy just tells the LINQ to SQL DataLoadOptions instance that for each customer that it loads from the database, it should load all its orders in the same SQL query.

It might seem that this approach adds a lot of overhead, but don’t forget that we would probably end up with just a dozen or so fetching strategy classes, because we would only add a new strategy when we’ve got a performance problem.

Defining the interface and a concrete strategy is the easy part. Now we must ensure that fetching strategies get applied on the service they relate to. There are several ways to do this. A naïve approach would be to inject a fetching strategy directly into the particular service. This will unfortunately break the abstraction. Here is an example:

public class MoveCustomerCommandHandler
    : ICommandHandler<MoveCustomerCommand>
{
    private IFetchingStrategy<MoveCustomerCommandHandler> strategy;
    private IUnitOfWorkFactory<NorthwindUnitOfWork> factory;

    public MoveCustomerCommandHandler(
        IFetchingStrategy<MoveCustomerCommandHandler> strategy,
        IUnitOfWorkFactory<NorthwindUnitOfWork> factory)
    {
        this.strategy = strategy;
        this.factory = factory;
    }

    public void Handle(MoveCustomerCommand command)
    {
        using (var context = this.factory.CreateNew())
        {
            var options = new System.Data.Linq.DataLoadOptions();

            this.strategy.Prepare(options);

            // Set the options to the L2S DataContext here.

            // Business logic
        }
    }
}

There are a few problems with this approach. First of all, this is still too much code for something that should work transparently in the background. Besides this (or perhaps even because of this), it is easy to forget to prepare the context with a fetching strategy, in which case we would have to change the class later to add the strategy once we need the performance optimization. That would violate the Open-Closed Principle. More importantly, this couples our business logic to a specific O/RM framework (LINQ to SQL in this case), which is exactly what we tried to prevent in the first place. In other words, this is not the way to go.

It would be better to let the unit of work factory handle preparing the context itself. It is the specific unit of work factory implementation that knows which O/RM technology is used and has a reference to the DataContext class. The question now becomes: how do we let the factory know how it should prepare the DataContext?

The easiest way to do this would be to pass our current service on to the factory, so that the factory can use the service's type. This can be done as follows:

    public MoveCustomerCommandHandler(
        IUnitOfWorkFactory<NorthwindUnitOfWork> factory)
    {
        this.factory = factory;
    }

    public void Handle(MoveCustomerCommand command)
    {
        // Here we pass ‘this’ to CreateNew.
        using (var context = this.factory.CreateNew(this))
        {
            // Business logic
        }
    }

For this to work we need to change the definition of the IUnitOfWorkFactory<TUnitOfWork> interface to the following:

public interface IUnitOfWorkFactory<TUnitOfWork>
{
    TUnitOfWork CreateNew<TService>(TService service);
}

A particular implementation could then look like this:

// This is a LINQ to SQL specific implementation.
public sealed class NorthwindUnitOfWorkFactory
    : IUnitOfWorkFactory<NorthwindUnitOfWork>
{
    // The mapping source is shared by all contexts and never reassigned.
    private static readonly MappingSource Source =
        new AttributeMappingSource();

    // A reference to the application’s DI container.
    private readonly Container container;
    private readonly string conStr;

    public NorthwindUnitOfWorkFactory(Container container, 
        string conStr)
    {
        this.container = container;
        this.conStr = conStr;
    }

    public NorthwindUnitOfWork CreateNew<TService>(TService service)
    {
        var db = new DataContext(this.conStr, Source);

        this.Prepare<TService>(db);

        var mapper = new LinqToSqlDataMapper(db);

        return new NorthwindUnitOfWork(mapper);
    }

    private void Prepare<TService>(DataContext db)
    {
        // Get a strategy for the correct type from the container.
        var fetchingStrategy = 
            this.container.GetInstance<IFetchingStrategy<TService>>();

        var loadOptions = new DataLoadOptions();

        fetchingStrategy.Prepare(loadOptions);

        // Register the load options to the DataContext.
        db.LoadOptions = loadOptions;
    }
}

The nice thing about this approach is that it is easy to implement and easy to grasp. The downside is that there is now a wart in our code that instantly makes it seem less clean. Why should we pass the service itself on to the factory? I would rather inject a factory into the service that already knows how to do the proper preparation. For this to work, however, we need context based injection.

Context based dependency injection is the ability to base the decision about what to inject on the context in which the dependency is injected. In our case it would be useful to make the decision based on the type the dependency is injected into. In other words, we base the actual factory type on the type of the service into which we inject that factory. Defining a generic factory allows us to specify the service type we need:

public sealed class FetchingNorthwindUnitOfWorkFactory<TService>
    : IUnitOfWorkFactory<NorthwindUnitOfWork>
{
    private IFetchingStrategy<TService> fetchingStrategy;
    private IUnitOfWorkFactory<NorthwindUnitOfWork> realFactory;

    public FetchingNorthwindUnitOfWorkFactory(
        IFetchingStrategy<TService> fetchingStrategy,
        IUnitOfWorkFactory<NorthwindUnitOfWork> realFactory)
    {
        this.fetchingStrategy = fetchingStrategy;
        this.realFactory = realFactory;
    }

    public NorthwindUnitOfWork CreateNew()
    {
        var unitOfWork = this.realFactory.CreateNew();

        this.Prepare(unitOfWork.Mapper);

        return unitOfWork;
    }

    private void Prepare(IDataMapper mapper)
    {
        var loadOptions = new DataLoadOptions();

        this.fetchingStrategy.Prepare(loadOptions);

        var dataMapper = (LinqToSqlDataMapper)mapper;

        dataMapper.Context.LoadOptions = loadOptions;
    }
}

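This factory is a generic type and depends on an IFetchingStrategy<TService>, where the generic type argument of that strategy is the same as that of the factory. When we request a factory from the DI container for a specific service, the factory is supplied with the fetching strategy that is specific to that service. Note that the factory implements the parameterless CreateNew() method again: because the service type is now carried by the generic type argument, the CreateNew<TService>(TService service) signature from the previous approach is no longer needed.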

Also note how this factory wraps another factory: the factory that actually creates the context (see part 1 for the code for that factory). Besides adhering to the Open-Closed Principle (OCP), this allows us to inject the ‘normal’ factory for groups of types that never have any strategy defined for them, while injecting this factory for groups of types that do. Also note how the factory itself is free of any reference to the container, which makes it easier to test.

Warning: I’m trying to keep things simple. Although the previous class adheres to the OCP, it breaks the Liskov Substitution Principle.

Now we need context based injection to inject the correct instance into a service that depends on an IUnitOfWorkFactory<NorthwindUnitOfWork>. Context based injection, however, is an advanced feature that not all DI containers natively support. Containers such as Autofac and my own Simple Injector lack this feature. This doesn’t mean, however, that it is impossible to do such things with these containers. On the contrary: the way to achieve this is by breaking the dependency chain, moving from constructor injection to property injection, and moving that property to a non-generic base class. The downside is that it forces us to add a base class, but on the plus side it keeps our registration fairly clean and works with most DI containers (and is therefore a good example for this blog post).

We can for instance define a non-generic base class that all command handlers should inherit from.

public abstract class CommandHandlerBase
{
    public IUnitOfWorkFactory<NorthwindUnitOfWork> Factory { get; set; }
}

The MoveCustomerCommandHandler that we’ve seen earlier will now have to inherit from CommandHandlerBase:

public class MoveCustomerCommandHandler : CommandHandlerBase,
    ICommandHandler<MoveCustomerCommand>
{
    public void Handle(MoveCustomerCommand command)
    {
        using (var context = this.Factory.CreateNew())
        {
            // Business logic
        }
    }
}

What’s left is the registration in the DI container. Here is an example of how to do this with the Simple Injector:

container.RegisterInitializer<CommandHandlerBase>(handler =>
{
    var type = typeof(FetchingNorthwindUnitOfWorkFactory<>)
        .MakeGenericType(handler.GetType());

    handler.Factory = (IUnitOfWorkFactory<NorthwindUnitOfWork>)
        container.GetInstance(type);
});

This code registers a delegate that will be called every time an instance that derives from CommandHandlerBase is created by the container. Based on the actual type of that instance, we determine the closed type of FetchingNorthwindUnitOfWorkFactory<TService>, request it from the container, and inject it into the property of the base type.
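To make the reflection step a bit more concrete, here is a small self-contained sketch of how an open generic type gets closed over the runtime type of an instance. The names Wrapper<TService>, SomeHandler and GenericTypeDemo are hypothetical stand-ins of my own (they are not part of Simple Injector or of the code above); only typeof, GetType and Type.MakeGenericType are real framework calls.

```csharp
using System;

// Hypothetical stand-in for FetchingNorthwindUnitOfWorkFactory<TService>.
public sealed class Wrapper<TService> { }

// Hypothetical stand-in for a concrete command handler.
public sealed class SomeHandler { }

public static class GenericTypeDemo
{
    // Closes the open generic Wrapper<> over the runtime type of 'instance',
    // just like the initializer does with handler.GetType().
    public static Type CloseOver(object instance)
    {
        return typeof(Wrapper<>).MakeGenericType(instance.GetType());
    }
}
```

Calling GenericTypeDemo.CloseOver(new SomeHandler()) gives back the same Type as typeof(Wrapper<SomeHandler>), which is why the container can be asked for it like any other type. Because GetType() returns the runtime type, the lookup is always based on the concrete handler class, not on the base class.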

This type of registration works with almost all modern DI containers. As far as I can see, the only container that does not support this is Microsoft Unity. The only way I can think of to solve this with Unity is by calling a static instance of the container from within the CommandHandlerBase class; yuck! If anyone knows a better way, please let me know. I will update the article.

The Simple Injector, however, does contain some interesting extension points that allow you to add features such as context based injection. The Simple Injector documentation contains a section about context based injection and describes a RegisterWithContext extension method. When using this method you can remove the CommandHandlerBase class and return to injecting an IUnitOfWorkFactory<NorthwindUnitOfWork> into the constructor of your services. Instead of registering an initializer for the CommandHandlerBase, we can simply register the IUnitOfWorkFactory<NorthwindUnitOfWork> using the RegisterWithContext extension method:

container.RegisterWithContext<IUnitOfWorkFactory<NorthwindUnitOfWork>>(
    context =>
    {
        var type = typeof(FetchingNorthwindUnitOfWorkFactory<>)
            .MakeGenericType(context.ImplementationType);

        return (IUnitOfWorkFactory<NorthwindUnitOfWork>)
            container.GetInstance(type);
    });

The code looks much like the previous RegisterInitializer registration, but now uses the supplied context to get the ImplementationType.

Please note that this code is specific to the Simple Injector, and it might be difficult to translate it to another container.

The FetchingNorthwindUnitOfWorkFactory<TService> class depends on an IFetchingStrategy<TService>, so of course we must also register the fetching strategies:

container.RegisterManyForOpenGeneric(typeof(IFetchingStrategy<>),
    AppDomain.CurrentDomain.GetAssemblies());

This searches through all assemblies in the current AppDomain looking for concrete types that implement a closed generic version of the IFetchingStrategy<TService> interface and registers them by that interface. This type of registration is called batch registration. However, because most services will not have a fetching strategy class defined for them, we must register a fallback implementation for those:

container.RegisterOpenGeneric(typeof(IFetchingStrategy<>), 
    typeof(NullFetchingStrategy<>));

The NullFetchingStrategy<TService> is an implementation of the Null Object Pattern, which is of course trivial. I won’t bore you with that. This registration will return a new instance of NullFetchingStrategy<TService> when an IFetchingStrategy<TService> is requested for which no concrete strategy exists. Note that the batch registrations done with RegisterManyForOpenGeneric will always take precedence over this fallback registration.
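For completeness, a minimal sketch of what such a null strategy might look like. This is my reading of the Null Object Pattern applied to the interface shown earlier, not necessarily the exact class used in a real project; it assumes a reference to System.Data.Linq (LINQ to SQL), and the interface is repeated only so the snippet compiles on its own.

```csharp
using System.Data.Linq;

// Repeated from earlier in this article so the sketch is self-contained.
public interface IFetchingStrategy<TService>
{
    void Prepare(DataLoadOptions loadOptions);
}

// Null Object: a strategy for services that need no optimization.
public sealed class NullFetchingStrategy<TService>
    : IFetchingStrategy<TService>
{
    public void Prepare(DataLoadOptions loadOptions)
    {
        // Intentionally empty: the default loading behavior is kept.
    }
}
```

Because the class has no state and no dependencies, the container can create it for any TService without further configuration.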

If you are interested in applying this idea in your application, but don’t know how to do this with the DI container you currently use, take a look at the Simple Injector migration guide. It shows you how to rewrite all the above registration snippets in the container of your choice.

Shortcomings

While part one of this series described a model that made it possible to work with multiple data sources and even multiple O/RM technologies side-by-side, I left that out of this part. Adding that wouldn’t be that difficult, though.

One thing to note is that it gets much harder to use fetching strategies when you execute multiple commands within the same unit of work, for instance when a command has multiple sub commands. There are two solutions I can think of. Either we apply only the fetching strategy of the main command to the DataContext (the easiest thing to do), or we apply all strategies to the context before the command executes.
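The second option could be sketched with a composite strategy that runs several preparations against the same DataLoadOptions instance. Note that CompositeFetchingStrategy<TService> is a hypothetical class of my own and not something from part 1 or from Simple Injector. Because strategies for different services share no common non-generic interface, the sketch takes plain Action<DataLoadOptions> delegates. It assumes a reference to System.Data.Linq, and the interface is repeated so the snippet compiles on its own.

```csharp
using System;
using System.Data.Linq;

// Repeated from earlier in this article so the sketch is self-contained.
public interface IFetchingStrategy<TService>
{
    void Prepare(DataLoadOptions loadOptions);
}

// Hypothetical composite: applies the preparations of the main command
// and all of its sub commands to one DataLoadOptions instance.
public sealed class CompositeFetchingStrategy<TService>
    : IFetchingStrategy<TService>
{
    private readonly Action<DataLoadOptions>[] preparations;

    public CompositeFetchingStrategy(
        params Action<DataLoadOptions>[] preparations)
    {
        this.preparations = preparations;
    }

    public void Prepare(DataLoadOptions loadOptions)
    {
        // Runs every preparation in the order in which it was supplied.
        foreach (var prepare in this.preparations)
        {
            prepare(loadOptions);
        }
    }
}
```

The Prepare methods of existing strategies can be passed in as method groups, for instance new CompositeFetchingStrategy<MainHandler>(first.Prepare, second.Prepare), where MainHandler, first and second are placeholders for your own handler and strategies.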

Conclusion

Fetching strategies are a valuable pattern that allows you to fix O/RM specific performance problems without coupling your main application logic to the chosen O/RM technology.

Happy injecting!
