I just finished a Guerrilla.NET class in Boston with Michael Kennedy and Mark Smith. Here are the topics we covered.

  • Introduction to WPF and Silverlight
  • ASP.NET MVC 3.0: Beyond the Basics
  • LINQ to Objects and LINQ to XML
  • Entity Framework
  • Model-View-ViewModel for WPF and Silverlight
  • PFx: Task and The Parallel Class
  • Thread Safety and Concurrent Data Structures
  • Building WCF REST Services
  • C# 3.0, 4.0, and 5.0
  • Entity Framework and the Repository Pattern
  • jQuery
  • Cloud Computing for the .NET Developer: IaaS, PaaS, and Patterns
  • The NoSQL Movement, LINQ, and MongoDB
  • iOS Programming with .NET and MonoTouch
  • Design Patterns for Testability (DI, IoC, and unit testing)
  • Reactive Framework for .NET (Rx)
  • WCF Data Services
  • Power Debugging with WinDBG

I had a great time, and as an added bonus I learned some things. I was monkeying for Mark while he was presenting “Entity Framework and the Repository Pattern”. He did two things that I thought were better than what I had done in the past. But first some background…

I used to structure my repositories using an interface similar to this:

public interface IRepository<T, TKey>
{
    IEnumerable<T> GetAll();  // or sometimes returning an IList
    T GetById(TKey id);       // sometimes just taking an object
    void Add(T t);
    void Delete(TKey id);     // sometimes taking a T
    void Save();
}

I may have other interfaces that specialize a particular IRepository to contain additional query methods.
Sometimes when I am implementing the repository pattern I combine it with a session object. This is especially nice when using NHibernate or doing Web work (where the controller method creates the session and closes it before returning). The session hangs on to the context so that different repositories can be called together in one transaction.

Mark pointed out two things that work differently in Entity Framework and LINQ than in other ORMs.
1) Save() doesn’t need to go on the individual repositories anymore; it can be moved up into the Session object. This is because SaveChanges() commits all changes across anything using the same context.
2) If you change GetAll to return an IQueryable<T> (exposed as All), you can remove all of the other Get methods, because they can be built using LINQ.

So the new IRepository looks like:

public interface IRepository<T>
{
    IQueryable<T> All { get; }
    void Add(T t);
    void Delete(T t);
}
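
To make the second point concrete, here is a minimal sketch of how the old Get methods can be rebuilt on top of All. It is not Mark's code: ISession, InMemoryRepository, Customer, and the CustomerQueries extension methods are hypothetical names, and the in-memory list only stands in for an Entity Framework context so the example runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IRepository<T>
{
    IQueryable<T> All { get; }
    void Add(T t);
    void Delete(T t);
}

// Hypothetical session: Save() lives here instead of on each repository.
// With Entity Framework it would call SaveChanges() on the shared context.
public interface ISession
{
    void Save();
}

// In-memory stand-in (an assumption for this sketch, not an EF-backed class).
public class InMemoryRepository<T> : IRepository<T>
{
    private readonly List<T> _items = new List<T>();

    public IQueryable<T> All { get { return _items.AsQueryable(); } }
    public void Add(T t) { _items.Add(t); }
    public void Delete(T t) { _items.Remove(t); }
}

// Hypothetical entity used only for illustration.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class CustomerQueries
{
    // Takes the place of GetById on the old interface.
    public static Customer ById(this IRepository<Customer> repo, int id)
    {
        return repo.All.SingleOrDefault(c => c.Id == id);
    }

    // Takes the place of a specialized query method on a derived interface.
    public static IQueryable<Customer> NameStartsWith(
        this IRepository<Customer> repo, string prefix)
    {
        return repo.All.Where(c => c.Name.StartsWith(prefix));
    }
}
```

Because All returns an IQueryable<T>, callers can keep composing (Where, OrderBy, Take) before anything is enumerated, so the repository interface itself never has to grow.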

That is a good simplification. Thanks, Mark.

[Update: this refers to the Entity Framework 1.0 version that shipped with .NET 3.5 SP1]

Sure, I had read the disclaimer here: http://efvote.wufoo.com/forms/ado-net-entity-framework-vote-of-no-confidence/

But these were the requirements:

  • Needed to support multiple databases
    They had already switched databases once (from Oracle to SQL Server) and they still had some other databases in Oracle
  • Could not use strings (had to be type-safe)
    This was due to the amount of change (we were developing a fairly complicated v1.0 product)

When we initially started the project back in early 2008 (a year or so ago) we had a meeting with four people to try to choose an ORM. The first decision was that NHibernate was out, because it uses strings in its query syntax and its support for generation isn’t on par with the others. We discussed many aspects of how the remaining choices would be used. One of the most important points was whether they had a good online community, but all of them met that bar. Here is a summary of the differences:

Consideration                              LINQ to SQL  Entity Spaces  Entity Framework
Released product                           Yes          Yes            No
Supports LINQ                              Yes          No             Yes
Clean business objects                     Yes          No             Yes
Easy query navigability for many-to-manys  No           Yes            Yes
Easy to modify many-to-manys               No           No             Yes
Supports all databases                     No           Yes            Yes

Ultimately it was decided that, as long as we were careful, it should be easy to convert between the DALs. That way we could start off in one and move to another later. We would start off with Entity Spaces. One additional action item was to check out LLBLGen Pro and MSBuild support for MyGeneration.

Once .NET 3.5 SP1 had finally shipped, we decided to completely replace our data access layer with Entity Framework. One thing that we didn’t like going in was the fact that we still wouldn’t be able to replace our existing two-tiered entity objects. In our initial model we had one set of objects tied to Entity Spaces and another set of DTOs representing basically the same objects.

What we liked:

  1. We could use LINQ
    I know that it doesn’t seem like this is such a big deal, but check this out:

    Before:

    public static ContainerInfo[] GetContainerInfosByExternalId(
    	string externalSampleId)
    {
        var containerInfos = new List<ContainerInfo>();
        if (String.IsNullOrEmpty(externalSampleId))
            return new ContainerInfo[0];
        var sampleSlotQuery = new SampleSlotQuery("s");
        Dal.InitConnectionName(sampleSlotQuery.es);
        var containers = new ContainerCollection();
        containers.Query.es.JoinAlias = "c";
        Dal.InitConnectionName(containers.es);
    
        containers.Query.InnerJoin(sampleSlotQuery)
           .On(containers.Query.ContainerId == sampleSlotQuery.ContainerId);
        containers.Query.Where(sampleSlotQuery.ExternalSampleId == externalSampleId);
        containers.Query.OrderBy(containers.Query.ContainerId.Ascending);
        if (containers.Query.Load())
        {
            foreach (Container container in containers)
            {
                containerInfos.Add(ConvertToContainerInfo(container));
            }
        }
        return containerInfos.ToArray();
    }
    

    After:

    public static ContainerInfo[] GetContainerInfosByExternalId(
    	WorkflowStateEntities db, string externalId)
    {
        return
            (from c in db.Containers.Include("Samples")
             from ss in c.SampleSlots
             where ss.ExternalId == externalId
             select ConvertToContainerInfo(c)).ToArray();
    }
    

What we didn’t like were relatively minor things, such as:

  • No lazy loading
  • Non type-safe eager loading
  • (See here)
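
The eager-loading complaint comes from Include taking a string path, as in db.Containers.Include("Samples") above. One common workaround, sketched here under assumptions and not something we shipped, is to derive that string from a lambda so the compiler checks the property name. IncludePath and the Container class below are hypothetical stand-ins; the returned string would then be passed to the real Include call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public static class IncludePath
{
    // Returns the member name from an expression such as c => c.SampleSlots.
    public static string Of<T>(Expression<Func<T, object>> selector)
    {
        Expression body = selector.Body;

        // Value-typed members are wrapped in a Convert node when boxed to object.
        var unary = body as UnaryExpression;
        if (unary != null && unary.NodeType == ExpressionType.Convert)
            body = unary.Operand;

        var member = body as MemberExpression;
        if (member == null)
            throw new ArgumentException("Expected a simple property access.", "selector");

        return member.Member.Name;
    }
}

// Hypothetical stand-in for a generated entity class.
public class Container
{
    public int ContainerId { get; set; }
    public List<string> SampleSlots { get; set; }
}
```

With this in place, IncludePath.Of<Container>(c => c.SampleSlots) yields the string "SampleSlots", and renaming the property breaks the build instead of failing at run time. This sketch only handles a single property, not multi-level paths.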

It took 5 of us about 4 days to convert the entire thing over. After it was done I tried to figure out how many lines we had been able to remove. However, there was a problem: even though I knew we had reduced the code, the Code Metrics in Visual Studio showed that the number of lines had increased. After some frustration, I finally used a version of wc (the Unix utility), which showed a 10-15% reduction across the board.

Another problem we had was the varchar / nvarchar problem. We actually reported this problem to the Entity Framework team through one of our Help support tickets. From one of our developers:
“In EF, as it turns out, the queries are ALWAYS cast to nvarchar (Unicode) regardless of the schema column type being a varchar (or bit). Before running my last set of traces (those shown above), I made sure that ALL columns in the schema were preset to be nvarchar types before running the EF code. That actually improved the query speed, cutting the time in half from about 4000 microseconds to 2200 microseconds.”

But the biggest problem turned out to be performance. We had some pretty big load operations. After the conversion, one of these queries went from 21 seconds in Entity Spaces to 2 minutes in Entity Framework, basically a 6x slowdown, and we saw similar factors across the board. This performance was not acceptable for our product. So after doing all of the conversion we had to roll everything back. Ugh!