
A Fast Dynamic Factory Using Reflection.Emit
In this post I'll show my implementation of the Factory Pattern, built with C# using Generics, Delegates and Reflection.Emit.
This post was in the pipeline for a long time, but a post on Eber Irigoyen's blog made me speed things up a bit. A few weeks ago I commented on his post about Arrays of methods in C# by noting he could use a dictionary of delegate objects. My comment inspired him and he wrote a follow-up about Dictionaries of methods in C#. He finished this post by saying:
on my next article I'll show you the implementation of this technique applied to the simple factory pattern just to close the loop.
For Cutting Edge I created an implementation of the Factory Pattern over a year ago, and some rough sketches for this post have been on disk for quite a while. But now, of course, I have to beat Eber on this one ;-).
A Simple Factory Implementation
A normal factory implementation could use an enum and a switch case statement like this:
// Needed for InvalidEnumArgumentException
using System.ComponentModel;
// We define an abstract base class
public abstract class Car { }
// We define our implementations of Car
public class Spyker : Car { }
public class Lada : Car { }
public class Ferrari : Car { }
// We define an Enumeration for all concrete
// car implementations
public enum CarType
{
    SpykerType, LadaType, FerrariType
}
// And we have our Factory class for the cars
public static class CarFactory
{
    public static Car CreateInstance(CarType car)
    {
        switch (car)
        {
            case CarType.SpykerType:
                return new Spyker();
            case CarType.LadaType:
                return new Lada();
            case CarType.FerrariType:
                return new Ferrari();
            default:
                // Report the argument name, the invalid value and the enum type
                throw new InvalidEnumArgumentException(
                    "car", (int)car, typeof(CarType));
        }
    }
}
Basically this is one of the fastest implementations (maybe even THE fastest) of the Factory pattern. The big problem, however, is that it isn't flexible. For each new Car descendant, you will need to extend the CarType enumeration AND the CreateInstance method. Besides that, all types must be known at compile time (that is, at the time you compile the CarFactory class). This could be a problem when you want to add new Car implementations using a plug-in model (e.g. by adding a new assembly with your type).
So a more flexible implementation uses Reflection. This idea is not new: in 2003 Romi Kovacs wrote in MSDN Magazine about Creating Dynamic Factories in .NET Using Reflection. The problem with Romi's implementation, however, is performance: it is terribly slow. So let us take another perspective on this subject.
A Factory Using Delegates And Reflection.Emit
My implementation uses the Reflection.Emit namespace to compile a new method on the fly and add it to a generic Dictionary of delegate objects. This way we only have to do a dictionary look-up (which is pretty fast) and then call the delegate, which returns the desired new instance.
The factory looks like this:
public static class Factory<TKey, TBaseType>
    where TBaseType : class
{
    public static void Add(TKey key, Type type) { }
    public static TBaseType CreateInstance(TKey key) { }
}
The factory is a generic type with two type parameters. TKey is the type that will be used as key in the internal dictionary object. This way new objects can be retrieved by their key. You can use any type as key, but I had database storage in mind while writing this class, so then an integer makes the most sense, because this will normally be your primary key.
The second type parameter is TBaseType. As the name says, it defines a base type. A given closed version of this generic factory can only create objects that derive from TBaseType (or are TBaseType objects themselves). If you don't want this limitation, you can simply use a Factory<TKey, object> (everything derives from object).
The factory has two public methods: CreateInstance(TKey), which returns a new object for the type linked to the given key, and Add(TKey, Type), which adds a new type to the internal dictionary. Note the Add method! This factory class can't generate a type for you before you've added it (once). (So you may argue whether this class in fact is a factory or not.)
The factory class is static. This way a user doesn't have to recreate or store the factory, yet each closed factory type has its own storage that stays alive until the AppDomain dies (Factory<int, Control> and Factory<string, Control>, for example, are two different types).
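The per-type storage can be illustrated with a minimal sketch. The names Registry and RegistryDemo are hypothetical and not part of the factory; the sketch only shows that each closed generic type gets its own copy of a static field.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the factory's internal storage.
public static class Registry<TKey, TBaseType>
{
    // One dictionary instance exists per closed generic type.
    public static readonly Dictionary<TKey, string> Names =
        new Dictionary<TKey, string>();
}

public static class RegistryDemo
{
    // Returns the entry counts of two differently closed Registry types.
    public static int[] Run()
    {
        Registry<int, object>.Names[1] = "first";
        Registry<int, object>.Names[2] = "second";
        Registry<string, object>.Names["a"] = "only";

        // The two closed types do not share their static field.
        return new[]
        {
            Registry<int, object>.Names.Count,
            Registry<string, object>.Names.Count
        };
    }
}
```

Calling RegistryDemo.Run() yields the counts 2 and 1, because the int-keyed and string-keyed registries are separate types with separate static state.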
The Internals
So let's look inside the Factory<TKey, TBaseType>. Let's start with the fields:
public static class Factory<TKey, TBaseType>
    where TBaseType : class
{
    // Declare the delegate
    private delegate TBaseType BaseTypeInvoker();
    // The dictionary that caches the delegates
    private static Dictionary<TKey, BaseTypeInvoker> _delegates =
        new Dictionary<TKey, BaseTypeInvoker>();
    // The object that will be used for ensuring thread-safety
    private static object _locker = new object();
    [...]
}
For each type that will be created by the factory, a BaseTypeInvoker delegate will be stored in the _delegates Dictionary. The delegates are created using the System.Reflection.Emit.ILGenerator by the private CreateInvoker(Type t) method:
// Create a new delegate that returns a new object
// of Type t.
private static BaseTypeInvoker CreateInvoker(Type t)
{
    // Get the Default constructor.
    ConstructorInfo ctor =
        t.GetConstructor(new Type[0]);
    // Check if the constructor exists.
    if (ctor == null) throw new ArgumentException(
        String.Format(CultureInfo.InvariantCulture,
            "{0} doesn't have a public default constructor.",
            t.FullName));
    // Create a new method.
    DynamicMethod dm =
        new DynamicMethod(t.Name + "Ctor",
            t, new Type[0], typeof(TBaseType).Module);
    // Generate the intermediate language.
    ILGenerator lgen = dm.GetILGenerator();
    lgen.Emit(OpCodes.Newobj, ctor);
    lgen.Emit(OpCodes.Ret);
    // Finish the method and create new delegate
    // pointing at it.
    return (BaseTypeInvoker)dm.CreateDelegate(
        typeof(BaseTypeInvoker));
}
The CreateInvoker method tries to find the public default constructor and throws an exception when there is none. When it succeeds, it creates a new DynamicMethod (which uses the new .NET 2.0 lightweight code generation (LCG) possibilities) and emits the simplest intermediate language (IL) you can imagine. The first IL instruction calls our constructor and the second returns the newly created object. After this we ask the DynamicMethod to create a BaseTypeInvoker delegate for that IL code.
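The same two-instruction technique can be tried outside the factory. The following is a minimal sketch, not the factory's own code: the EmitDemo class is hypothetical, and it assumes Func<object> (available from .NET 3.5) in place of the private BaseTypeInvoker delegate. It is only meant for reference types, since the emitted IL does not box value types.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical helper, shown only to illustrate the emitted IL.
public static class EmitDemo
{
    // Builds a delegate that behaves like "() => new SomeType()"
    // for a reference type that is only known at run time.
    public static Func<object> CreateConstructorDelegate(Type type)
    {
        ConstructorInfo ctor = type.GetConstructor(Type.EmptyTypes);
        if (ctor == null)
        {
            throw new ArgumentException(
                type.FullName + " doesn't have a public default constructor.",
                "type");
        }

        // A parameterless dynamic method that returns object.
        DynamicMethod method = new DynamicMethod(
            type.Name + "Ctor", typeof(object), Type.EmptyTypes,
            typeof(EmitDemo).Module);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Newobj, ctor); // call the constructor
        il.Emit(OpCodes.Ret);          // return the new instance

        return (Func<object>)method.CreateDelegate(typeof(Func<object>));
    }
}
```

A caller would create the delegate once, for example for typeof(System.Text.StringBuilder), cache it, and invoke it whenever a new instance is needed.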
Next method to explore is the Add(TKey, Type) method:
public static void Add(TKey key, Type type)
{
    // Some checks on the type argument
    if (type == null)
        throw new ArgumentNullException("type");
    // Check if object is not a class
    if (type.IsClass == false)
        throw new ArgumentException(
            String.Format(CultureInfo.InvariantCulture,
                "{0} is not a reference type.",
                type.FullName), "type");
    // Check if object is abstract
    if (type.IsAbstract == true)
        throw new ArgumentException(
            String.Format(CultureInfo.InvariantCulture,
                "{0} is an abstract class, which cannot be instantiated.",
                type.FullName), "type");
    // Check whether the given type is assignable to TBaseType
    if (typeof(TBaseType).IsAssignableFrom(type) == false)
        throw new ArgumentException(
            String.Format(CultureInfo.InvariantCulture,
                "The given type {0} should be derivable from {1}.",
                type.FullName, typeof(TBaseType).FullName), "type");
    // Lock for thread safety
    lock (_locker)
    {
        // Extra check if delegate not already added.
        if (_delegates.ContainsKey(key) == false)
        {
            try
            {
                // Create the delegate for the type
                BaseTypeInvoker invoke = CreateInvoker(type);
                // Try to invoke function (extra error check,
                // so the delegate is not added on error)
                invoke();
                // The invoker executed correctly (no exceptions)
                // so let's add it to the dictionary
                _delegates.Add(key, invoke);
            }
            catch (InvalidCastException ex)
            {
                // Keep the original exception as inner exception
                throw new InvalidCastException(
                    String.Format(CultureInfo.InvariantCulture,
                        "{0} couldn't be cast to {1}.",
                        type.FullName, typeof(TBaseType).FullName),
                    ex);
            }
        }
    }
}
After some basic checks on whether the Type is actually valid, we enter a Monitor (using the C# lock statement) to ensure thread safety. We create a new delegate, invoke it once as an extra error check, and add it to the _delegates dictionary under the given key. If the key is already present, the call does nothing.
Now let's look at the public CreateInstance(TKey) method:
public static TBaseType CreateInstance(TKey key)
{
    BaseTypeInvoker invoke = null;
    lock (_locker)
    {
        _delegates.TryGetValue(key, out invoke);
    }
    return invoke != null ? invoke() : null;
}
This is about as simple as it can get. We try to get a delegate from the dictionary by its key and then invoke it so it returns a new object, or we return null when the key isn't present in the dictionary.
The Domain-Specific Factory
While the Factory<TKey, TBaseType> class works pretty well, you still need some code to get it working. This is because you always have to check whether the factory returns null and if so, you'll have to add a new type to the factory. The way you want to create new types for the dictionary is totally domain specific. Maybe you have a plug-in based system with assemblies referenced from a configuration file, or you could define the type names in the database and add new assemblies to the bin folder of your web app.
Below I'll show a possible implementation of such code: the WebControlFactory class. (Again, you may argue with me whether the Factory<TKey, TBaseType> is in fact a factory or not. You could see the domain-specific implementation as the real factory and the Factory<TKey, TBaseType> as its internal helper.)
// Domain specific Factory
public static class WebControlFactory
{
    private static string GetTypeNameFromDatabase(int primaryKey)
    {
        // In real life we could have a database call here.
        string sql = "SELECT * FROM DynamicObjects WHERE id = @ID;";
        // returns the FQN of TextBox for demonstration
        return "System.Web.UI.WebControls.TextBox";
    }
    public static WebControl CreateInstance(int id)
    {
        WebControl instance =
            Factory<int, WebControl>.CreateInstance(id);
        if (instance == null)
        {
            string typeName = GetTypeNameFromDatabase(id);
            // Create a type
            Type type = Type.GetType(typeName);
            // You'd use System.Web.Compilation.BuildManager.GetType
            // within Web Applications instead of Type.GetType.
            // Add the type to the generic Factory, so it can be
            // cached and returned next time.
            Factory<int, WebControl>.Add(id, type);
            instance = Factory<int, WebControl>.CreateInstance(id);
        }
        return instance;
    }
}
This WebControlFactory uses the specific Factory<int, WebControl> type so we can expect the keys to be integers. The WebControlFactory has one method, the CreateInstance(int). This method directly calls the CreateInstance method of the Factory<int, WebControl> and when that returns null the WebControlFactory will create the type and add it to the dictionary. Again this code is pretty domain specific, so you probably don't want to insert this code into the generic factory class itself (otherwise it won't be that generic anymore ;-)).
Nice thing to note is that for web applications you can use the BuildManager.GetType method, which will search the type through all the top level assemblies and assemblies defined in the configuration. When you generate your assemblies on the fly, you should probably find another way to create your types, because both Type.GetType and BuildManager.GetType won't work on those.
The WebControlFactory calls the private method GetTypeNameFromDatabase, but only when the type was not found (which will only happen once per type). The method returns the fully qualified type name by its id. What "by its id" means is up to you, but I can imagine you creating a method that connects to the database to fetch a type name by its primary key, or that opens the Web.Config or an XML file.
The code that uses the WebControlFactory could look like this:
public partial class Default : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        int controlId = 14;
        WebControl control =
            WebControlFactory.CreateInstance(controlId);
        this.Form.Controls.Add(control);
    }
}
Performance
The performance of this generic factory is quite good, but still it's about 3 times as slow as the simple factory pattern (which may become worse when many threads use the factory at the same time). So what is slowing this factory down? Basically three things:
- The locking mechanism. Entering a Monitor simply costs time. (Don't use a ReaderWriterLock; that only makes things worse.)
- Delegate calls. Delegate calls are more expensive than normal method calls. (The CLR team has done some serious optimizations on delegate invocation, and in the .NET 2.0 framework it's now only about two or three times as slow as a virtual method call.) See more details about what things cost here.
- The dictionary look-up. (This takes about 475 ns)
Three times as slow seems like a lot, but I tested this by creating only System.Object objects, which occupy 12 bytes of memory on the heap. (A nice article on CLR internals can be found here). Normally the objects you create will be bigger, and once you actually work with those objects, the time spent in the factory becomes insignificant.
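A comparison like this could be timed with a small harness along the following lines. This is a rough sketch under stated assumptions, not the test code behind the figures above: the FactoryTiming class and its members are hypothetical, and because both paths are called through a Func<int, object>, each measurement includes one extra delegate call.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical timing harness.
public static class FactoryTiming
{
    private static readonly object Locker = new object();
    private static readonly Dictionary<int, Func<object>> Creators =
        new Dictionary<int, Func<object>> { { 1, () => new object() } };

    // Modeled on the simple factory: a switch and a direct constructor call.
    public static object CreateDirect(int key)
    {
        switch (key)
        {
            case 1: return new object();
            default: throw new ArgumentOutOfRangeException("key");
        }
    }

    // Modeled on the generic factory: lock, dictionary look-up, delegate call.
    public static object CreateCached(int key)
    {
        Func<object> creator;
        lock (Locker)
        {
            Creators.TryGetValue(key, out creator);
        }
        return creator != null ? creator() : null;
    }

    // Returns the elapsed milliseconds for the given number of calls with key 1.
    public static long Measure(Func<int, object> create, int iterations)
    {
        create(1); // warm-up call so JIT compilation is not measured
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            create(1);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

Running Measure(FactoryTiming.CreateDirect, n) and Measure(FactoryTiming.CreateCached, n) with a large n, in a release build and several times in a row, gives a rough ratio between the two paths. The absolute numbers depend on the machine and the runtime.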
Other implementations
Of course you can think of other implementations for such a factory class. You can, for instance, use System.Activator together with a Dictionary of types instead of the ILGenerator and a Dictionary of delegates. Here is an example:
public static class Activator<TKey, TBaseType>
    where TBaseType : class
{
    private static Dictionary<TKey, Type> _types =
        new Dictionary<TKey, Type>();
    private static object _locker = new object();
    public static void Add(TKey key, Type type)
    {
        // Some checks on the type argument
        [...]
        lock (_locker)
        {
            if (_types.ContainsKey(key) == false)
            {
                // Create one instance as an extra error check.
                // System.Activator is fully qualified to avoid
                // confusion with this generic class.
                TBaseType t = (TBaseType)
                    System.Activator.CreateInstance(type);
                _types.Add(key, type);
            }
        }
    }
    public static TBaseType CreateInstance(TKey key)
    {
        Type type = null;
        lock (_locker)
        {
            _types.TryGetValue(key, out type);
        }
        return type != null ?
            (TBaseType)System.Activator.CreateInstance(type) : null;
    }
}
This Activator<TKey, TBaseType> implementation is about two times as slow as the Factory<TKey, TBaseType> implementation. To create the fastest possible implementation of the factory pattern, however, you would probably have to generate the CreateInstance method dynamically, together with its switch case statement. It would then generate the code in (perhaps) C# and let the compiler do the work of optimizing the switch case statement.
That's about it for now. I hope I inspired some of you.
UPDATE 2008-01-12: Recently I found a design flaw in the current implementation. When you use the Factory<TKey, TBaseType> class as a reusable type in a bigger framework, it might be better to make the type non-static and hold an instance of it in a private member variable of your domain-specific implementation. The problem is that, with a static class, everybody can add types under certain keys to the shared dictionary, and several implementations could use the same key for different types. So beware of this caveat when using this code.
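A non-static variant could look roughly like the sketch below. The names InstanceFactory and ReportFactory are hypothetical, and to keep the example self-contained the creator delegates are passed in directly (as Func<TBaseType>, assuming .NET 3.5 or later) instead of being emitted with Reflection.Emit.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical non-static variant: every instance has its own key space.
public sealed class InstanceFactory<TKey, TBaseType>
    where TBaseType : class
{
    private readonly Dictionary<TKey, Func<TBaseType>> _creators =
        new Dictionary<TKey, Func<TBaseType>>();
    private readonly object _locker = new object();

    public void Add(TKey key, Func<TBaseType> creator)
    {
        if (creator == null) throw new ArgumentNullException("creator");
        lock (_locker)
        {
            // As in the static factory, an existing key is left untouched.
            if (!_creators.ContainsKey(key)) _creators.Add(key, creator);
        }
    }

    public TBaseType CreateInstance(TKey key)
    {
        Func<TBaseType> creator;
        lock (_locker)
        {
            _creators.TryGetValue(key, out creator);
        }
        return creator != null ? creator() : null;
    }
}

// Hypothetical domain-specific factory that owns a private instance,
// so no other code can register types under its keys.
public static class ReportFactory
{
    private static readonly InstanceFactory<int, object> Inner = CreateInner();

    private static InstanceFactory<int, object> CreateInner()
    {
        InstanceFactory<int, object> factory = new InstanceFactory<int, object>();
        factory.Add(1, () => new System.Text.StringBuilder());
        return factory;
    }

    public static object CreateInstance(int key)
    {
        return Inner.CreateInstance(key);
    }
}
```

Another component can now create its own InstanceFactory<int, object> and reuse key 1 for a different type without affecting ReportFactory.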
UPDATE 2009-07-22: I recently did some performance tests with a factory that generated types based on an interface. This research was based on the idea that interface calls are faster than delegate calls. Therefore a factory that cached interfaces instead of delegates should be faster.
Instead of using an internal Dictionary<TKey, BaseTypeInvoker> I used a Dictionary<TKey, IFactory<TBaseType>> where the IFactory<TBaseType> defined a TBaseType CreateInstance() method. On the fly I generated for each key a new type that implemented the IFactory<TBaseType> interface.
The results, however, were disappointing. The new approach needed much more code, while both implementations had the same performance characteristics (the interface approach was even slightly slower).
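The shape of the interface-based variant can be sketched as follows. Only IFactory<TBaseType> and its CreateInstance method come from the description above. DefaultConstructorFactory and InterfaceFactoryDemo are hypothetical, and the implementation is written by hand here, whereas the tested version generated one implementing type per key at run time. Note also that the C# compiler translates new T() into a call to Activator.CreateInstance<T>(), so this hand-written stand-in will not have the performance of generated code.

```csharp
using System.Collections.Generic;
using System.Text;

// The interface that is cached instead of a delegate.
public interface IFactory<TBaseType>
{
    TBaseType CreateInstance();
}

// Hand-written stand-in for a type that would be generated on the fly.
public sealed class DefaultConstructorFactory<T, TBaseType> : IFactory<TBaseType>
    where T : TBaseType, new()
{
    public TBaseType CreateInstance()
    {
        return new T();
    }
}

public static class InterfaceFactoryDemo
{
    private static readonly Dictionary<int, IFactory<object>> Factories =
        new Dictionary<int, IFactory<object>>
        {
            { 1, new DefaultConstructorFactory<StringBuilder, object>() }
        };

    // Looks up the cached IFactory and makes an interface call on it.
    public static object CreateInstance(int key)
    {
        IFactory<object> factory;
        return Factories.TryGetValue(key, out factory)
            ? factory.CreateInstance()
            : null;
    }
}
```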
I also want to say that over the last few years I have gotten used to the concept of immutability. A type that cannot change after creation is thread-safe and doesn't need any internal locking. This gives a performance improvement of almost 100% for the implementation given in this article. This is something you should keep in mind when trying to optimize performance.
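An immutable factory could be sketched like this. ImmutableFactory is a hypothetical name, and the creator delegates are again supplied by the caller as Func<TBaseType> rather than emitted. The sketch relies on the documented guarantee that a Dictionary<TKey, TValue> supports multiple concurrent readers as long as it is never modified.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical immutable variant: all registrations happen in the
// constructor, so CreateInstance needs no lock.
public sealed class ImmutableFactory<TKey, TBaseType>
    where TBaseType : class
{
    private readonly Dictionary<TKey, Func<TBaseType>> _creators;

    public ImmutableFactory(IDictionary<TKey, Func<TBaseType>> creators)
    {
        if (creators == null) throw new ArgumentNullException("creators");
        // Copy the input so later changes to the caller's dictionary
        // cannot affect this instance.
        _creators = new Dictionary<TKey, Func<TBaseType>>(creators);
    }

    public TBaseType CreateInstance(TKey key)
    {
        Func<TBaseType> creator;
        // Read-only access: safe from multiple threads without locking.
        return _creators.TryGetValue(key, out creator) && creator != null
            ? creator()
            : null;
    }
}
```

The trade-off is that types can no longer be added lazily on a cache miss. Every key has to be known when the factory is constructed, or a new factory instance has to be built and swapped in.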
.NET General, C# - eleven comments / one trackback
pretty slick, I hope you keep posting this kind of material; very interesting indeed.
Using your implementation I came up with another way of doing the same thing without using dynamic code generation (I'll blog about it), nevertheless your code is very didactic, good learning experience
Eber Irigoyen (URL) - 24 09 06 - 00:45
If the app guarantees that the dictionary of types in the factory is only setup at the start of the app, before any calls are made, I assume that one could skip the lock, correct?
mawi (URL) - 08 02 07 - 14:55
Mawi,
This is correct. Adding types to the factory from within a static constructor or during app start, will allow you to remove the lock and this can increase performance on multi-threaded applications.
[Steven] (URL) - 11 02 07 - 14:59
The new Orcas release of .NET (v3.5) will have a ReaderWriterLockSlim that can speed up the generic factory. It is a ReaderWriterLock with the performance of the Monitor. Joe Duffy wrote about this here: http://www.bluebytesoftware.com/blog/Per..
[Steven] (URL) - 10 03 07 - 00:44
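For readers who want to try this, a dictionary guarded by a ReaderWriterLockSlim could look like the minimal sketch below. SlimLockedRegistry is a hypothetical name. The lock calls are the standard System.Threading.ReaderWriterLockSlim members from .NET 3.5. Whether this is actually faster than a plain Monitor for a given workload is something to measure rather than assume.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical registry: many concurrent readers, exclusive writers.
public sealed class SlimLockedRegistry<TKey, TValue>
    where TValue : class
{
    private readonly Dictionary<TKey, TValue> _items =
        new Dictionary<TKey, TValue>();
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    // Adds the value unless the key is already present.
    public bool TryAdd(TKey key, TValue value)
    {
        _lock.EnterWriteLock();
        try
        {
            if (_items.ContainsKey(key)) return false;
            _items.Add(key, value);
            return true;
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }

    // Returns the stored value, or null when the key is unknown.
    public TValue Find(TKey key)
    {
        _lock.EnterReadLock();
        try
        {
            TValue value;
            return _items.TryGetValue(key, out value) ? value : null;
        }
        finally
        {
            _lock.ExitReadLock();
        }
    }
}
```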
This is a lot more than good. This is didactic and serves as inspiration for developers in the 2.0 to 3.x learning curve.
Just one issue: What about the implementation of BaseTypeInvoker and DynamicMethod. I couldn't get it to work without those.
Thanks!
Miguel
Miguel (URL) - 05 07 07 - 16:19
Miguel,
Why are you trying to get rid of the BaseTypeInvoker delegate and the DynamicMethod? If you explain what you are trying to accomplish, I can possibly help you with it. You can send me an e-mail if you like.
Steven (URL) - 05 07 07 - 22:34
Steve,
Actually my problem was a lot simpler.
I was not able to successfully run your code.
But it is ok now... I figured out the problem.
Thanks a lot and kind regards
Miguel (URL) - 08 07 07 - 00:48
I found a white paper (http://portal.acm.org/citation.cfm?id=1753207) about the Dynamic Factory Pattern that references my article.
Steven (URL) - 28 06 10 - 17:19
Great article!
I was using something like this in my code, but I was storing Type in a dictionary and extracting its constructor in the CreateInstance method of a factory.
Your idea to use a delegate gave me a great insight! Even more, you can use a generic constraint on Add method and everything becomes very secure and simple.
public class FactoryBase<TKey, TInterface>
{
    delegate TInterface Creator();
    private Dictionary<TKey, Creator> ctors = new Dictionary<TKey, Creator>();
    public void Add<T>(TKey key)
        where T : TInterface, new() // static checking for correct type
    {
        if (!ctors.ContainsKey(key))
        {
            // storing anonymous method that calls constructor
            ctors.Add(key, () => new T());
        }
    }
    public TInterface CreateInstance(TKey key)
    {
        try {
            Creator ctor = ctors[key];
            TInterface instance = ctor();
            // do stuff with instance
            return instance;
        } catch (KeyNotFoundException) {
            // handle errors; here an unknown key yields the
            // default value (null for reference types)
            return default(TInterface);
        }
    }
}
Nikita B. Zuev - 15 09 10 - 13:41
Hi Nikita.
To be able to use your example, you need the type T (of Add<T>) to be known at compile time. However, you can also create an Add(TKey key, Type type) overload that calls into the generic Add method using reflection. This way you can completely skip using that ugly Reflection.Emit, but still have the same performance characteristics.
Steven (URL) - 18 09 10 - 15:29
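The overload suggested in this reply could be sketched as follows. ReflectiveFactory is a hypothetical name, and Func<TInterface> replaces the private Creator delegate. The generic Add method is located by scanning GetMethods, because GetMethod("Add") would be ambiguous between the two overloads. One caveat: the C# compiler translates new T() into Activator.CreateInstance<T>(), so it is worth measuring before assuming the same speed as the Reflection.Emit version.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical factory with a compile-time and a run-time Add.
public class ReflectiveFactory<TKey, TInterface>
{
    private readonly Dictionary<TKey, Func<TInterface>> _ctors =
        new Dictionary<TKey, Func<TInterface>>();

    // Compile-time checked registration.
    public void Add<T>(TKey key)
        where T : TInterface, new()
    {
        if (!_ctors.ContainsKey(key))
        {
            _ctors.Add(key, () => new T());
        }
    }

    // Run-time registration that forwards to Add<T> using reflection.
    public void Add(TKey key, Type type)
    {
        if (type == null) throw new ArgumentNullException("type");

        MethodInfo genericAdd = null;
        foreach (MethodInfo method in
            typeof(ReflectiveFactory<TKey, TInterface>).GetMethods())
        {
            if (method.Name == "Add" && method.IsGenericMethodDefinition)
            {
                genericAdd = method;
                break;
            }
        }

        // MakeGenericMethod throws an ArgumentException when 'type'
        // violates the constraints on T.
        genericAdd.MakeGenericMethod(type).Invoke(this, new object[] { key });
    }

    public TInterface CreateInstance(TKey key)
    {
        return _ctors[key]();
    }
}
```

The reflection cost is paid once per Add call. CreateInstance itself only performs a dictionary look-up and a delegate call.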
Nice! For now I'm satisfied with compile-time checkable `add'. But if a runtime `add' is needed, an overload would be very useful.
Nikita B. Zuev - 20 09 10 - 07:12