如何获得正确的基数估计值

原文：http://www.oracle.com/technetwork/issue-archive/2013/13-jan/o13asktom-1886639.html

I have a question about joining with collections and cardinality estimation. The optimizer is always estimating that 8,168 rows are coming back from my collection, and because of that, it is using inefficient plans. The estimate of 8,168 is more than two orders
of magnitude more than the real cardinality. How can I solve this problem?

我有一个关于集合和基数估计。优化器针对我的集合基数估计总是8168，就因为这，它老师使用低效率计划。8168是实际基数的2个数量级！

我该如何解决这个问题？

This is a long-running issue with pipelined functions and collections during optimization. The optimizer in general doesn’t have any information about the cardinality (the number of rows) being returned by the collection. It has to guess—and that guess is based
on the block size (default statistics are block-size-driven). So, for a database with an 8 K block size, the guess is about 8,000. And because it is unlikely that your collection has about 8,000 elements (probably more like 8 or 80 in most cases), you can
definitely arrive at a suboptimal plan.

这的确是个长期提出的问题：关于管道函数和集合的优化。优化器通常对于集合的基数信息毫无所知。它必须去猜！怎么猜？是基于块大小。

所以，对于8k大小数据块的数据库，猜出来的基数就是8000左右！但是通常集合的基数并不是8000，其更可能是8或80。

下面，我提供4种方法来获得正确的集合基数：

1）The cardinality hint (undocumented) 基数提示（无文件证明）

2）The OPT_ESTIMATE hint (undocumented) OPT_ESTIMATE提示（无文件证明）

3）Dynamic sampling (Oracle Database 11g Release 1 and later) 动态采样（Oracle 11g第一版或更高）

4）Oracle Database’s Cardinality Feedback feature (Oracle Database 11g Release 2 and later) Oracle基数反馈特性（Oracle 11g第2版或更高）

--1）用于字符串分割的管道函数

SQL> create or replace type str2tblType as table of varchar2(30)

Type created.

SQL> create or replace

function str2tbl( p_str in varchar2, p_delim in varchar2 default ‘,‘ )

return str2tblType

PIPELINED

l_str long default p_str || p_delim;

l_n number;

begin

loop

l_n := instr( l_str, p_delim );

exit when (nvl(l_n,0) = 0);

pipe row( ltrim(rtrim(substr(l_str,1,l_n-1))) );

l_str := substr( l_str, l_n+1 );

end loop;

end;

Function created.

SQL> variable x varchar2(15)

SQL> exec :x := ‘1,2,3,a,b,c‘

PL/SQL procedure successfully completed.

SQL> select * from table(str2tbl(:x));

COLUMN_VALUE

——————————————————————————————————————

6 rows selected.

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT

————————————————————————————————————————————————————————

SQL_ID ddk1tv9s5pzq5, child number 0

————————————————————————————————————————————————————————

select * from table(str2tbl(:x))

Plan hash value: 2407808827

———————————————————————————————————————————————————————————————————————————

| 0|SELECT STATEMENT | | | | 29 (100)| |

| 1| COLLECTION ITERATOR PICKLER...|STR2TBL|8168|16336| 29 (0)|00:00:01|

--使用CARDINALITY 提示

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT

————————————————————————————————————————————————————————

SQL_ID bd2f8rh30z3ww, child number 0

————————————————————————————————————————————————————————

select /*+ cardinality(sq 10) */ * from table(str2tbl(:x)) sq

Plan hash value: 2407808827

| 0|SELECT STATEMENT | | | | 29 (100)| |

| 1| COLLECTION ITERATOR PICKLER...|STR2TBL| 10| 20| 29 (0)|00:00:01|

--这地方，我们给优化器提示一个10使其认为管道函数的结果基数为10

--2）使用OPT_ESTIMATE提示

--提供3个参数：对象类型；对象名；一个缩放因素，优化器将拿它认为的基数乘以这个因素

--所以，先算出这个大概的因素

SQL> select 10/8168 from dual;

10/8168

————————————————

.00122429

select /*+ opt_estimate(table, sq, scale_rows=0.00122429) */ *

from table(str2tbl(:x)) sq

Plan hash value: 2407808827

| 0|SELECT STATEMENT | | | | 29 (100)| |

| 1| OLLECTION ITERATOR PICKLER...|STR2TBL| 10| 20| 29 (0)|00:00:01|

--3）使用DYNAMIC SAMPLING 提示

select /*+ dynamic_sampling( sq, 2 ) */ * from table( str2tbl(:x,‘,‘) ) sq

Plan hash value: 2407808827

| 0|SELECT STATEMENT | | | | 11 (100)| |

| 1| COLLECTION ITERATOR PICKLER...|STR2TBL| 6| 12| 11 (0)|00:00:01|

Note

———————

dynamic sampling used for this statement (level=2)

--4）使用Cardinality Feedback

--这地方我们要将查询变一下形：

with sq

as (

select /*+ materialize */ *

from table( str2tbl( :x ) )

)

select *

from sq

Plan hash value: 630596523

—————————————————————————————————————————————————————————————————————————————

| 0|SELECT STATEMENT | | | | 32 (100)| |

| 1| TEMP TABLE TRANSFORMATION | | | | | |

| 2| LOAD AS SELECT | | | | | |

| 3| COLLECTION ITERATOR PICKLER...|STR2TBL|8168|16336| 29 (0)|00:00:01|

| 4| VIEW | |8168| 135K| 3 (0)|00:00:01|

| 5| TABLE ACCESS FULL |SYS_...|8168|16336| 3 (0)|00:00:01|

18 rows selected.

--whoop，第一次还没生效呢，没关系， one more try

with sq as (select /*+ materialize */ * from table( str2tbl( :x ) )

) select * from sq

Plan hash value: 630596523

| 0|SELECT STATEMENT | | | 32 (100)| |

| 1| TEMP TABLE TRANSFORMATION | | | | |

| 2| LOAD AS SELECT | | | | |

| 3| COLLECTION ITERATOR PICKLER...|STR2TBL|8168|16336| 29 (0)|00:00:01|

| 4| VIEW | 6| 102 | 3 (0)|00:00:01|

| 5| TABLE ACCESS FULL |SYS_...| 6| 12| 3 (0)|00:00:01|

Note

———————

- cardinality feedback used for this statement

22 rows selected.

--这一次OK了

--揭晓一下基数反馈的原理，下面直接引用TOM的话：

Cardinality Feedback works by having the optimizer change its cardinality estimates after executing a query for the first time and observing that the actual cardinalities were very far off from the estimated cardinalities. That is, the optimizer starts to learn
from its mistakes. If it executes a query and discovers that the real row counts are far off from the estimated counts, it will reoptimize the query, using the newly discovered values.

关于基数反馈请参考：http://www.oracle.com/technetwork/issue-archive/2010/10-sep/o50asktom-165477.html

--------------------------------

Dylan Presents.

时间： 2024-08-19 12:15:52

如何获得正确的基数估计值

如何获得正确的基数估计值的相关文章

KV型内存数据库Redis

大数据计算：如何仅用1.5KB内存为十亿对象计数

EF提高性能

EntityFrame Work 5 性能注意事项（转自MSDN）

动态采样---DYNAMIC_SAMPLING 基于tom文章的翻译

我如何调优SQL Server查询

实体框架 5 性能注意事项

Hyper LogLog介绍

Redis学习一(基础入门).