数据类型不准确的一个隐患,下面来构造一张表存日期字段,一个存varchar2,一个存date,做一个测试。之前也写过两篇blog:
SQL> drop table test purge;
SQL> create table test as select
to_char(to_date(‘2014-01-01‘,‘yyyy-MM-dd‘)+rownum,‘yyyymmdd‘) s_date,
to_date(‘2014-01-01‘,‘yyyy-MM-dd‘)+rownum d_date
from all_objects;
SQL> create index ind_t_sdate on test(s_date) nologging;
SQL> create index ind_t_ddate on test(d_date) nologging;
SQL> exec dbms_stats.gather_table_stats(user,‘test‘,cascade => true);
SQL> set timing on
SQL> set autotrace traceonly
SQL> select * from test where s_date between ‘20140201‘ and ‘20140222‘;
已选择22行。
已用时间: 00: 00: 00.00
执行计划
----------------------------------------------------------
Plan hash value: 953148778
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 51 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 3 | 51 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IND_T_SDATE | 3 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
--可以看到CBO评估出来的行数是3,明明返回的是22
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("S_DATE">=‘20140201‘ AND "S_DATE"<=‘20140222‘)
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
944 bytes sent via SQL*Net to client
349 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
22 rows processed
SQL> select * from test
where d_date between to_date(‘20140201‘, ‘yyyymmdd‘) and
to_date(‘20140222‘, ‘yyyymmdd‘);
已选择22行。
已用时间: 00: 00: 00.00
执行计划
----------------------------------------------------------
Plan hash value: 112387541
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 23 | 391 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 23 | 391 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IND_T_DDATE | 23 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
--可以看到CBO评估出来基本是准确的。
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("D_DATE">=TO_DATE(‘ 2014-02-01 00:00:00‘, ‘syyyy-mm-dd hh24:mi:ss‘)
AND "D_DATE"<=TO_DATE(‘ 2014-02-22 00:00:00‘, ‘syyyy-mm-dd hh24:mi:ss‘))
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
944 bytes sent via SQL*Net to client
349 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
22 rows processed
总结:虽然这两条SQL消耗的资源是一样的,但SQL1的评估结果不对,在多表关联的时候,这个绝对是个隐患,非常容易导致执行计划走错。除了以上的几个原因之外,还存在的问题是用varchar2存date会造成N多的存储格式,曾经看到过一个情况,日期格式五花八门(有年月日,年月日 小时,年月日 小时,分钟),有中英文的:,有全角、半角,有null,甚至undefine(大概是从js传过来的)。且造成索引建了用不上,不得已改数据类型,光写转换的脚本就花了一天多的时间。