Entity Framework: Joining in memory data with DbSet

Reposted from: https://ilmatte.wordpress.com/2013/01/06/entity-framework-joining-in-memory-data-with-dbset/

This post is for people who use Entity Framework and need to filter data coming from a database against a list of in-memory values.

In this article I summarize what Michael Hompus explains well in his own article, adapt his example to Entity Framework Code First, and add a second issue that can catch distracted programmers.
If you want his clear explanation, I suggest you read his post:

http://blog.hompus.nl/2010/08/26/joining-an-iqueryable-with-an-ienumerable/

I will start from Michael’s article, with the difference that my example uses Entity Framework Code First.
I will try to highlight the two issues involved in this topic.

You may want to filter some data based on a list of values, and to apply the filter while querying, so that you avoid loading unnecessary data into memory.

In my example I assume that you already know Entity Framework Code First.
I explicitly invoke a database initializer to make sure a fresh database is created.
I previously created a DbContext with a single DbSet of Customer entities:

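    using System.Data.Entity;

    public class CustomerContext : DbContext
    {
        // The generic argument tells EF which entity type this set holds.
        public DbSet<Customer> Customers { get; set; }
    }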
and I created the Customer entity:

    public class Customer
    {
        public int Id { get; set; }

        public string Name { get; set; }

        public string Surname { get; set; }
    }
I created a Console application to test the Linq queries I want to analyze.
I want to filter Customers based on their Ids. I want only three of them:

    private static void MainMethod()
    {
        try
        {
            // The in-memory list of Ids to filter on.
            var customerIds = new List<int> { 1, 5, 7 };
            using (var context = new CustomerContext())
            {
                // Drop and recreate the database on every run.
                var initializer = new DropCreateDatabaseAlways<CustomerContext>();
                initializer.InitializeDatabase(context);

                var customers = from customer in context.Customers
                                join customerId in customerIds
                                on customer.Id equals customerId
                                select customer;
                var result = customers.ToList();
            }
        }
        catch (Exception exception)
        {
            Console.WriteLine(exception.ToString());
        }
    }
and this is the resulting query:

    SELECT
        [Extent1].[Id] AS [Id],
        [Extent1].[Name] AS [Name],
        [Extent1].[Surname] AS [Surname]
    FROM [dbo].[Customers] AS [Extent1]
    INNER JOIN (SELECT
            [UnionAll1].[C1] AS [C1]
        FROM (SELECT
                cast(1 as bigint) AS [C1]
            FROM ( SELECT 1 AS X ) AS [SingleRowTable1]
            UNION ALL
            SELECT
                cast(5 as bigint) AS [C1]
            FROM ( SELECT 1 AS X ) AS [SingleRowTable2]) AS [UnionAll1]
        UNION ALL
        SELECT
            cast(7 as bigint) AS [C1]
        FROM ( SELECT 1 AS X ) AS [SingleRowTable3]) AS [UnionAll2] ON [Extent1].[Id] = [UnionAll2].[C1]
As you can see, a UNION ALL branch is emitted for each item in the collection, and each additional item adds another level of nesting.
It’s time for the first issue:

1) SQL Server limits the nesting depth of subqueries to 32 levels (http://msdn.microsoft.com/en-us/library/ms189575%28v=sql.105%29.aspx).
So if your in-memory collection gets too big, you will get the following exception:

System.Data.EntityCommandExecutionException: An error occurred while executing the command definition. See the inner exception for details. ---> System.Data.SqlClient.SqlException: Some part of your SQL statement is nested too deeply. Rewrite the query or break it up into smaller queries.

Like Michael, I will use Enumerable.Range to create a list of the desired length, modifying MainMethod as in the following snippet:

    private static void MainMethod()
    {
        try
        {
            // 50 Ids: enough to exceed the 32-level nesting limit.
            var customerIds = Enumerable.Range(1, 50);
            using (var context = new CustomerContext())
            {
                var initializer = new DropCreateDatabaseAlways<CustomerContext>();
                initializer.InitializeDatabase(context);

                var customers = from customer in context.Customers
                                join customerId in customerIds
                                on customer.Id equals customerId
                                select customer;
                var result = customers.ToList();
            }
        }
        catch (Exception exception)
        {
            Console.WriteLine(exception.ToString());
        }
    }
If you run your console application now, you will get the exception.

If you had to write the desired SQL on your own, you would probably have opted
for a simple WHERE … IN (…).

This would keep us from hitting the maximum nesting depth.
To get such SQL generated, modify your Linq
query to use the Contains method, as in the following version of MainMethod:

    private static void MainMethod()
    {
        try
        {
            var customerIds = Enumerable.Range(1, 50);
            using (var context = new CustomerContext())
            {
                var initializer = new DropCreateDatabaseAlways<CustomerContext>();
                initializer.InitializeDatabase(context);

                // Contains on the in-memory collection is translated to WHERE ... IN (...).
                var customers = from customer in context.Customers
                                where customerIds.Contains(customer.Id)
                                select customer;
                var result = customers.ToList();
            }
        }
        catch (Exception exception)
        {
            Console.WriteLine(exception.ToString());
        }
    }
Now the resulting query, easier to read than the previous one, is the following:

    SELECT
        [Extent1].[Id] AS [Id],
        [Extent1].[Name] AS [Name],
        [Extent1].[Surname] AS [Surname]
    FROM [dbo].[Customers] AS [Extent1]
    WHERE [Extent1].[Id] IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50)
Obviously there’s something strange if you’re forced to filter with a very big in-memory collection, but that’s how it is.
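
One behavioural difference between the two forms is worth keeping in mind: a join returns one row per matching pair, while Contains only tests membership. The following is a minimal Linq To Objects sketch, with no database involved; the sample customers and the names JoinVersusContains, SampleCustomers, ViaJoin and ViaContains are made up for illustration. It shows that a duplicated id in the list duplicates the customer in the join result but not in the Contains result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class JoinVersusContains
{
    // Hypothetical in-memory stand-in for context.Customers (Ids 1..10).
    public static List<Customer> SampleCustomers()
    {
        return Enumerable.Range(1, 10)
            .Select(i => new Customer { Id = i, Name = "Customer" + i })
            .ToList();
    }

    // Join form: one result per (customer, id) matching pair.
    public static List<int> ViaJoin(IEnumerable<Customer> customers, IEnumerable<int> ids)
    {
        return (from customer in customers
                join id in ids on customer.Id equals id
                select customer.Id).ToList();
    }

    // Contains form: each customer appears at most once.
    public static List<int> ViaContains(IEnumerable<Customer> customers, IEnumerable<int> ids)
    {
        return (from customer in customers
                where ids.Contains(customer.Id)
                select customer.Id).ToList();
    }

    public static void Main()
    {
        var customers = SampleCustomers();
        var ids = new List<int> { 1, 5, 5 }; // note the duplicated 5

        Console.WriteLine(string.Join(",", ViaJoin(customers, ids)));     // 1,5,5
        Console.WriteLine(string.Join(",", ViaContains(customers, ids))); // 1,5
    }
}
```

If the id list may contain duplicates, the Contains form is the one that behaves like a plain filter.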

2) It’s now time for the second issue. Go back for a moment to the original version of our method. I will show it again here:

    private static void MainMethod()
    {
        try
        {
            var customerIds = new List<int> { 1, 5, 7 };
            using (var context = new CustomerContext())
            {
                var initializer = new DropCreateDatabaseAlways<CustomerContext>();
                initializer.InitializeDatabase(context);

                var customers = from customer in context.Customers
                                join customerId in customerIds
                                on customer.Id equals customerId
                                select customer;
                var result = customers.ToList();
            }
        }
        catch (Exception exception)
        {
            Console.WriteLine(exception.ToString());
        }
    }
The query looks obvious to people used to SQL, but we must always be aware of what kind of collection we are using.
Let’s rewrite the previous query with method chaining syntax like in the following snippet:

    private static void MainMethod()
    {
        try
        {
            var customerIds = new List<int> { 1, 5, 7 };
            using (var context = new CustomerContext())
            {
                var initializer = new DropCreateDatabaseAlways<CustomerContext>();
                initializer.InitializeDatabase(context);

                // Same join as before, written with method chaining.
                var customers = context.Customers
                    .Join(customerIds,
                          customer => customer.Id,
                          customerId => customerId,
                          (customer, customerId) => customer)
                    .Select(customer => customer);
                var result = customers.ToList();
            }
        }
        catch (Exception exception)
        {
            Console.WriteLine(exception.ToString());
        }
    }
If we run our console application we will obtain the same query with both versions of our method.
The same query we saw at the beginning:

    SELECT
        [Extent1].[Id] AS [Id],
        [Extent1].[Name] AS [Name],
        [Extent1].[Surname] AS [Surname]
    FROM [dbo].[Customers] AS [Extent1]
    INNER JOIN (SELECT
            [UnionAll1].[C1] AS [C1]
        FROM (SELECT
                cast(1 as bigint) AS [C1]
            FROM ( SELECT 1 AS X ) AS [SingleRowTable1]
            UNION ALL
            SELECT
                cast(5 as bigint) AS [C1]
            FROM ( SELECT 1 AS X ) AS [SingleRowTable2]) AS [UnionAll1]
        UNION ALL
        SELECT
            cast(7 as bigint) AS [C1]
        FROM ( SELECT 1 AS X ) AS [SingleRowTable3]) AS [UnionAll2] ON [Extent1].[Id] = [UnionAll2].[C1]
The method syntax is more explicit about what’s happening: the Queryable.Join extension method is being invoked, because context.Customers is an IQueryable.
This means that the Linq To Entities IQueryable provider plays its role in generating the resulting SQL, converting the Linq join into a SQL inner join.
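
To see what the provider actually receives, it can help to look at the expression tree that Queryable.Join builds. The sketch below uses AsQueryable() over in-memory numbers as a stand-in for context.Customers, an assumption made only so that the example runs without a database; the names ExpressionTreeDemo, OutermostCall and BuildQuery are hypothetical. With a real DbSet the outermost node should likewise be a call to Queryable.Join, and it is that tree which Linq To Entities translates into SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // Returns "DeclaringType.MethodName" for the outermost call in the query's expression tree.
    public static string OutermostCall(IQueryable query)
    {
        var call = query.Expression as MethodCallExpression;
        return call == null
            ? "(no method call)"
            : call.Method.DeclaringType.Name + "." + call.Method.Name;
    }

    // Builds the same kind of join as in the article, with the IQueryable on the left.
    public static IQueryable<int> BuildQuery()
    {
        var customerIds = new List<int> { 1, 5, 7 };
        // Stand-in for context.Customers: an in-memory sequence exposed as IQueryable.
        IQueryable<int> ids = Enumerable.Range(1, 10).AsQueryable();
        return ids.Join(customerIds,
                        id => id,
                        customerId => customerId,
                        (id, customerId) => id);
    }

    public static void Main()
    {
        // The join was recorded as an expression, not executed as a delegate.
        Console.WriteLine(OutermostCall(BuildQuery())); // Queryable.Join
    }
}
```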

The query syntax assigns very specific roles to the joined collections depending on their position in the query: the left one is called the outer collection and the right one the inner collection.
If we inadvertently reverse the order of our two lists, putting the in-memory list on the left side as in the following snippet:

    private static void MainMethod()
    {
        try
        {
            var customerIds = new List<int> { 1, 5, 7 };
            using (var context = new CustomerContext())
            {
                var initializer = new DropCreateDatabaseAlways<CustomerContext>();
                initializer.InitializeDatabase(context);

                // The in-memory list is now the outer (left) collection.
                var customers = from customerId in customerIds
                                join customer in context.Customers
                                on customerId equals customer.Id
                                select customer;
                var result = customers.ToList();
            }
        }
        catch (Exception exception)
        {
            Console.WriteLine(exception.ToString());
        }
    }
the method invoked will be Enumerable.Join and the SQL sent to the database will be the following (you can see it with SQL Server Profiler):

    SELECT
        [Extent1].[Id] AS [Id],
        [Extent1].[Name] AS [Name],
        [Extent1].[Surname] AS [Surname]
    FROM [dbo].[Customers] AS [Extent1]
As you can see, our filter condition has simply disappeared: there is neither a join nor a where…in condition, yet the ‘result’ variable will still contain only the desired results.

If the left operand in a join is of type IEnumerable<T> rather than IQueryable<T>, the Enumerable.Join extension method will be chosen during overload resolution.
This means that the whole Customers table will be loaded in memory and then filtered via Linq To Objects…and this is not what we want.

So we definitely need to pay attention when joining in-memory collections with IQueryable collections and remember to always put the IQueryable to the left side.
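
The overload selection can be checked without a database. In this sketch, which again substitutes AsQueryable() over in-memory numbers for context.Customers (an assumption for illustration, as are the names JoinOrderDemo, Kind, QueryableOnLeft and ListOnLeft), the same join is written in both orders and the run-time type of the result reveals which Join was bound.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinOrderDemo
{
    private static readonly List<int> CustomerIds = new List<int> { 1, 5, 7 };

    // Stand-in for context.Customers: an in-memory sequence exposed as IQueryable.
    private static readonly IQueryable<int> Source = Enumerable.Range(1, 10).AsQueryable();

    // Reports whether a query object is still an IQueryable (handled by a provider)
    // or a plain sequence produced by Linq To Objects.
    public static string Kind(object query)
    {
        return query is IQueryable ? "IQueryable" : "IEnumerable";
    }

    // IQueryable on the left: binds to Queryable.Join.
    public static object QueryableOnLeft()
    {
        return from n in Source
               join customerId in CustomerIds on n equals customerId
               select n;
    }

    // In-memory list on the left: binds to Enumerable.Join.
    public static object ListOnLeft()
    {
        return from customerId in CustomerIds
               join n in Source on customerId equals n
               select n;
    }

    public static void Main()
    {
        Console.WriteLine(Kind(QueryableOnLeft())); // IQueryable
        Console.WriteLine(Kind(ListOnLeft()));      // IEnumerable
    }
}
```

Both queries return the same three values; only the first one stays in a form that a provider such as Linq To Entities could translate into SQL.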

Time: 2024-07-31 18:50:04
