PL/SQL: the difference between EXISTS and IN


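I noticed that my colleagues are fond of using exists and in for subquery correlation, so I thought it worthwhile to study the difference between the two, for reference and as a note to myself.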

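/* (Quoted from the web, begin) Using in is equivalent to running a subquery with distinct against the inner table and then joining the resulting set to the outer table (the source says "outer join", but as the rewrite below shows it is an ordinary inner join); the join method and index usage are the same as for any ordinary two-table join. (Quoted from the web, end) */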

I tested this description against two tables, liomuser.staff and liomuser.department. Both are small, with roughly 10,000 rows each.

-- For example:

select *

from liomuser.staff

where department_id in ( select department_id from liomuser.department);

-- can be rewritten as

select a.*

from liomuser.staff a,

( select distinct department_id from liomuser.department) b

where a.department_id = b.department_id;
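The equivalence claimed by the rewrite can be checked on a toy data set. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for Oracle; the table names follow the article, but the `staff_id` column and all rows are made up for illustration. It also shows why the `distinct` in the inline view matters: without it, a duplicated `department_id` duplicates the staff row.

```python
import sqlite3

# In-memory SQLite stands in for Oracle here; the rows are invented
# purely to illustrate the semantics, not the performance.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (staff_id INTEGER, department_id INTEGER);
    CREATE TABLE department (department_id INTEGER);
    INSERT INTO staff VALUES (1, 10), (2, 20), (3, 30);
    -- department_id 10 appears twice on purpose
    INSERT INTO department VALUES (10), (10), (20);
""")

# Form 1: IN subquery
in_rows = conn.execute("""
    SELECT staff_id FROM staff
    WHERE department_id IN (SELECT department_id FROM department)
    ORDER BY staff_id
""").fetchall()

# Form 2: join against a DISTINCT inline view
distinct_join_rows = conn.execute("""
    SELECT a.staff_id
    FROM staff a,
         (SELECT DISTINCT department_id FROM department) b
    WHERE a.department_id = b.department_id
    ORDER BY a.staff_id
""").fetchall()

# Same join without DISTINCT: no longer equivalent to IN
plain_join_rows = conn.execute("""
    SELECT a.staff_id
    FROM staff a, department b
    WHERE a.department_id = b.department_id
    ORDER BY a.staff_id
""").fetchall()

print(in_rows)             # each matching staff row once
print(distinct_join_rows)  # same rows as the IN form
print(plain_join_rows)     # staff 1 shows up twice
```

This only demonstrates that the two forms return the same rows; it says nothing about which plan Oracle will pick for either of them.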

The execution plans are, respectively:

(1) select *

from liomuser.staff

where department_id in ( select department_id from liomuser.department);

(2) select a.*

from liomuser.staff a,

( select distinct department_id from liomuser.department) b

where a.department_id = b.department_id;

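Both tables here are small. Judging from the figures, apart from slightly fewer consistent gets for the join form, the two execution plans and their statistics are almost identical.

So for small tables, the test shows that the description from the web holds.

In my experience, though, in should perform much worse than the join. In the test above the two seem to follow the same execution path. Is that because the tables hold so little data?

I decided to repeat the test with two large tables, cust_order and order_detail, each with more than ten million rows.

First the in test, with the following statement: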

select a.*

from liomuser.cust_order a

where order_id in ( select order_id from liomuser.order_detail b);

The execution plan is as follows:

Test 2, the join, with the following statement:

select a.*

from liomuser.cust_order a,

( select distinct order_id from liomuser.order_detail) b

where a.order_id = b.order_id;

The execution plan is as follows:

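Comparing in and the join on these two large tables, the join's execution plan is clearly better: the in form joins the tables with nested loops, whereas the join form uses a HASH JOIN. The CPU cost of the in form is also about one third higher than that of the join. So for small tables, or more precisely for queries whose inner table is small, in and the join perform about the same; for large tables, and especially when the inner table is huge, the join is much better.

It follows that in is not exactly equivalent to joining against a distinct view of the inner table, and that the join is considerably more efficient than in.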
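To make the nested loop versus hash join contrast concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions: the nested loop scans the inner list linearly for every outer row (Oracle would normally probe an index instead), and the function names and key ranges are hypothetical. It illustrates why per-row probing gets expensive as the inner side grows, while a hash join pays for one build pass and then one cheap probe per outer row.

```python
def nested_loop_semi_join(outer, inner):
    """Return outer keys that have a match in inner, scanning inner per outer row."""
    comparisons = 0
    result = []
    for o in outer:
        for i in inner:
            comparisons += 1
            if o == i:
                result.append(o)
                break  # semi-join: the first match is enough
    return result, comparisons


def hash_semi_join(outer, inner):
    """Build a hash set over inner once, then probe it once per outer row."""
    build = set(inner)  # single pass over the inner input
    probes = 0
    result = []
    for o in outer:
        probes += 1
        if o in build:
            result.append(o)
    return result, probes


# Hypothetical data: 1000 order ids, and detail rows for even ids only,
# three detail rows per order.
outer_keys = list(range(1000))
inner_keys = [k for k in range(0, 2000, 2) for _ in range(3)]

nl_result, nl_work = nested_loop_semi_join(outer_keys, inner_keys)
hj_result, hj_work = hash_semi_join(outer_keys, inner_keys)

print(nl_result == hj_result)        # both strategies find the same orders
print(nl_work, "comparisons vs", hj_work, "hash probes")
```

The counters are not comparable to Oracle's cost figures; they only show the shape of the difference between the two join strategies.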

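Now let's look at EXISTS

In effect, exists is equivalent to scanning the outer table first and then, for each row retrieved, looping to find a match in the inner table. The execution plan is as follows:

Note: some material on the web claims that exists performs a full table scan on the outer table, but no full table scan appears in this execution plan; it still uses the index.

The exists form can be rewritten as: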

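declare
  cnt number(10);
begin
  for cur in (select a.* from liomuser.cust_order a) loop
    -- probe the inner table for the current outer row;
    -- rownum = 1 stops at the first match, as exists does
    select count(1) into cnt
      from liomuser.order_detail
     where order_id = cur.order_id and rownum = 1;
    if cnt <> 0 then
      -- stands in for "return this row" in the original pseudocode
      dbms_output.put_line(cur.order_id);
    end if;
  end loop;
end;
/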
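The loop reading of exists can be checked on a small example. This is a minimal sketch, assuming SQLite in place of Oracle and made-up rows (the `item` column is hypothetical); it runs the exists query and a hand-written row-by-row probe and compares the results.

```python
import sqlite3

# SQLite is used only as a runnable stand-in; the data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cust_order (order_id INTEGER);
    CREATE TABLE order_detail (order_id INTEGER, item TEXT);
    INSERT INTO cust_order VALUES (1), (2), (3);
    INSERT INTO order_detail VALUES (1, 'a'), (1, 'b'), (3, 'c');
""")

# The declarative form: correlated EXISTS subquery
exists_rows = [r[0] for r in conn.execute("""
    SELECT a.order_id FROM cust_order a
    WHERE EXISTS (SELECT 1 FROM order_detail b WHERE a.order_id = b.order_id)
    ORDER BY a.order_id
""")]

# The procedural reading: walk the outer table and probe the inner
# table once per row, stopping at the first match.
loop_rows = []
outer = conn.execute("SELECT order_id FROM cust_order ORDER BY order_id").fetchall()
for (order_id,) in outer:
    hit = conn.execute(
        "SELECT 1 FROM order_detail WHERE order_id = ? LIMIT 1", (order_id,)
    ).fetchone()
    if hit is not None:
        loop_rows.append(order_id)

print(exists_rows)
print(loop_rows)
```

Both lists contain the same order ids. The optimizer is free to execute exists differently (for example as a semi-join), so the loop describes the semantics rather than the actual plan.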

Comparing exists and in:

Statement 1, in

Statement 2, exists

The execution plans show no difference at all, so let's look at the execution statistics:

Statement 1, in

select a.*

from liomuser.cust_order a

where order_id in ( select order_id from liomuser.order_detail b)

Statement 2, exists

select a.*

from liomuser.cust_order a

where exists

( select 1 from liomuser.order_detail b where a.order_id = b.order_id)

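The statistics for the two forms show that exists needs fewer consistent gets than in, but its bytes sent figure is higher. This indirectly supports the earlier point that exists behaves like a loop.

With the explanation above it is easy to see why IN is a poor choice when the inner table is huge and poorly indexed (many duplicate values, for example): probing the inner table becomes very expensive. For queries whose inner table is too large, use exists or the join form instead.

Also: a NOT IN clause performs an internal sort and merge. NOT IN is generally regarded as the least efficient choice in either case, because it performs a full scan of the table in the subquery. To avoid NOT IN, rewrite it as an outer join (Outer Joins) or as NOT EXISTS.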
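The two rewrites of NOT IN can be sketched as follows, again assuming SQLite as a stand-in for Oracle and invented rows. One caveat worth keeping in mind when rewriting: the forms are only interchangeable if the subquery column cannot be NULL. With a NULL in the subquery result, standard SQL three-valued logic makes NOT IN return no rows, while NOT EXISTS and the outer-join form still return the unmatched rows. The helper `ids` is a hypothetical convenience function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cust_order (order_id INTEGER);
    CREATE TABLE order_detail (order_id INTEGER);
    INSERT INTO cust_order VALUES (1), (2), (3);
    -- one matching detail row and one row with a NULL order_id
    INSERT INTO order_detail VALUES (1), (NULL);
""")


def ids(sql):
    """Run a query and return the first column as a list (illustrative helper)."""
    return [r[0] for r in conn.execute(sql)]


not_in = ids("""
    SELECT order_id FROM cust_order
    WHERE order_id NOT IN (SELECT order_id FROM order_detail)
    ORDER BY order_id
""")

not_exists = ids("""
    SELECT a.order_id FROM cust_order a
    WHERE NOT EXISTS (SELECT 1 FROM order_detail b WHERE b.order_id = a.order_id)
    ORDER BY a.order_id
""")

# Outer-join rewrite: keep outer rows for which no inner row was found
anti_join = ids("""
    SELECT a.order_id FROM cust_order a
    LEFT OUTER JOIN order_detail b ON b.order_id = a.order_id
    WHERE b.order_id IS NULL
    ORDER BY a.order_id
""")

print(not_in)      # empty: the NULL in the subquery makes every NOT IN test unknown
print(not_exists)  # orders without detail rows
print(anti_join)   # same rows as NOT EXISTS
```

If the subquery column is declared NOT NULL (or the NULLs are filtered out), all three forms return the same rows, and the choice between them becomes purely a question of plan quality.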

Time: 2024-10-22 16:01:36
