Fixing Duplicate Records in Entity Framework

Tuesday, December 02, 2014 @ 02:00

By Corey Adler

I am a very active participant in the developer community on StackOverflow, particularly when questions come up in any of the Entity Framework tags. I’ve personally found that answering questions there has helped me grow a lot as a software engineer, and it has also given me the opportunity to help out people who are new to the technologies I love, like Entity Framework. Lately, though, I’ve begun noticing that many of the questions I’ve answered boil down to the same problem: people having trouble with Entity Framework duplicating some records. So for this post, I’d like to share a quick fix that tends to be the correct answer in these situations.

Oftentimes, the question I see starts with models similar to this:

Object Models
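
A minimal sketch of such models, assuming Entity Framework 6 Code First. MainEntity, OtherEntity, NavProperty, and NavPropertyID are the names used in this post; the Name property and the SampleContext class are hypothetical and exist only to make the example complete.

```csharp
using System.Data.Entity;

// The record that already exists in the database.
public class OtherEntity
{
    public int ID { get; set; }
    public string Name { get; set; }   // hypothetical payload column
}

// The record being created; it points at an existing OtherEntity.
public class MainEntity
{
    public int ID { get; set; }

    // Foreign key column; this is what the MainEntities table actually stores.
    public int NavPropertyID { get; set; }

    // Navigation property; not a column, just a reference to the related object.
    public virtual OtherEntity NavProperty { get; set; }
}

// Hypothetical context name, assumed for illustration.
public class SampleContext : DbContext
{
    public DbSet<MainEntity> MainEntities { get; set; }
    public DbSet<OtherEntity> OtherEntities { get; set; }
}
```

With the default Code First conventions, NavPropertyID should be picked up as the foreign key for NavProperty because its name is the navigation property name followed by the principal key name.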

Then, they’ll show the code that they’re using to generate the objects in question, like so:

Object Generation
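
The kind of code that triggers the problem might look like the sketch below. It assumes EF6 Code First and that the OtherEntity instance was loaded elsewhere (a different context, a web request, a cache), so the context doing the save is not tracking it. SampleContext, the Name property, and the Repro class are hypothetical names.

```csharp
using System.Data.Entity;

public class OtherEntity
{
    public int ID { get; set; }
    public string Name { get; set; }   // hypothetical payload column
}

public class MainEntity
{
    public int ID { get; set; }
    public int NavPropertyID { get; set; }
    public virtual OtherEntity NavProperty { get; set; }
}

public class SampleContext : DbContext   // hypothetical context name
{
    public DbSet<MainEntity> MainEntities { get; set; }
    public DbSet<OtherEntity> OtherEntities { get; set; }
}

public static class Repro
{
    // 'existing' already has a row in the database and a populated ID,
    // but this context instance has never seen it.
    public static void SaveEntity(OtherEntity existing)
    {
        using (var context = new SampleContext())
        {
            var main = new MainEntity
            {
                NavProperty = existing   // assigning the object, not the key
            };

            // Add() marks 'main' and the untracked 'existing' as Added.
            context.MainEntities.Add(main);

            // Inserts a MainEntity row AND a second OtherEntity row.
            context.SaveChanges();
        }
    }
}
```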

Calling this SaveEntity() method not only creates the MainEntity object in the database, but also creates a brand new OtherEntity record, one that matches the record the user tried to attach to the MainEntity in every single way (except for the primary key value). So why does Entity Framework create this duplicate record?

The explanation turns out to be that Entity Framework is paying more attention to the fact that you’ve declared the MainEntity record as needing to be added. Calling Add() marks not just that object, but every entity reachable from it that the context isn’t already tracking, as EntityState.Added. Even though the entity on the navigation property has its own ID field, and that field is populated (which you’d think would tell EF that the record already exists), Entity Framework does not look at it at all, and will create the duplicate record. The fix for this is quite simple: just populate the NavPropertyID field instead (and leave NavProperty null). When you do this, Entity Framework will see that NavProperty is null and skip over it, leaving just NavPropertyID to be saved into the database. Considering that the MainEntities table itself only has the NavPropertyID column anyway, this is exactly the result that you’d want.
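
The fix described above can be sketched as follows, assuming EF6 Code First models shaped like the ones in this post. SampleContext, the Name property, and the Fixed class are hypothetical names. Only the foreign key is set, so the only entity that ends up in the Added state is the new MainEntity.

```csharp
using System.Data.Entity;

public class OtherEntity
{
    public int ID { get; set; }
    public string Name { get; set; }   // hypothetical payload column
}

public class MainEntity
{
    public int ID { get; set; }
    public int NavPropertyID { get; set; }
    public virtual OtherEntity NavProperty { get; set; }
}

public class SampleContext : DbContext   // hypothetical context name
{
    public DbSet<MainEntity> MainEntities { get; set; }
    public DbSet<OtherEntity> OtherEntities { get; set; }
}

public static class Fixed
{
    // Takes the key of the existing OtherEntity rather than the object itself.
    public static void SaveEntity(int existingOtherEntityId)
    {
        using (var context = new SampleContext())
        {
            var main = new MainEntity
            {
                // Set only the foreign key; NavProperty stays null.
                NavPropertyID = existingOtherEntityId
            };

            // Only 'main' is in the graph, so only 'main' is marked Added.
            context.MainEntities.Add(main);

            // Inserts one MainEntity row that references the existing OtherEntity.
            context.SaveChanges();
        }
    }
}
```

If you are already holding the OtherEntity object, pass its key instead of the object, for example Fixed.SaveEntity(other.ID).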

So why does this behavior happen in the first place? It turns out that the designers of Entity Framework considered this possibility all the way back in 2010 (with version 4.0; see http://blogs.msdn.com/b/diego/archive/2010/10/06/self-tracking-entities-applychanges-and-duplicate-entities.aspx), but they specifically designed it not to handle duplications. Originally they had wanted to throw an exception when a duplicate record was detected, but they determined that there were cases where that duplication would be a good thing to have (see the example in the above post), and so decided to leave it as is without throwing an exception.

So, in conclusion, always be sure to assign your related records by their ID field instead of the navigation property (or face the consequences)! For more information on this problem, and ways to avoid it, please look at a wonderful blog post by Julie Lerman (Microsoft’s Entity Framework MVP) at http://msdn.microsoft.com/en-us/magazine/dn166926.aspx.

Until next time, I wish you good coding.