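Creating Indexes
Creating indexes is one way to optimize queries. We usually index the columns that appear in WHERE clauses, using either single-column indexes or composite indexes.
Points to note when adding indexes
Consider adding indexes on the columns used in WHERE and ORDER BY;
If the WHERE column is already served by an index, a separate index on the ORDER BY column (if one exists) will generally not be used for sorting. Using filesort in the EXPLAIN output means the ORDER BY did not use an index. A composite index on the WHERE column followed by the ORDER BY column solves this; alternatively, sort in the application layer;
When several AND-ed columns in the WHERE clause each have their own index, MySQL typically uses only the most valuable (most selective) one. Consider a composite index in this case;
Composite indexes follow the leftmost-prefix rule;
For an index on a string column, LIKE 'abc%' can use the index, while LIKE '%abc' cannot;
When indexing string columns, prefer short (prefix) indexes. For example, for a CHAR(255) column, if the first 10 characters are already unique (or highly selective, nearly unique), index only the first 10 characters;
Avoid negative conditions such as NOT IN, IS NOT, NOT LIKE, != and <> where possible (negated conditions generally do not use an index);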
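The rules above can be tried out with a small sketch. The table orders, its columns, and the index names are hypothetical and exist only for illustration. The actual EXPLAIN output depends on the MySQL version, the data, and the table statistics, so treat the comments as what one would typically expect rather than guaranteed results.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE orders (
  id         INT PRIMARY KEY AUTO_INCREMENT,
  user_id    INT NOT NULL,
  status     TINYINT NOT NULL,
  created_at DATETIME NOT NULL,
  tracking   CHAR(255) NOT NULL,
  -- Composite index: WHERE column first, ORDER BY column second.
  KEY idx_user_created (user_id, created_at),
  -- Prefix index: only the first 10 characters of tracking are indexed.
  KEY idx_tracking_prefix (tracking(10))
);

-- WHERE + ORDER BY served by one composite index; the Extra column
-- should typically not show "Using filesort" here.
EXPLAIN SELECT * FROM orders WHERE user_id = 42 ORDER BY created_at;

-- Leftmost-prefix rule: a condition on user_id alone can use idx_user_created ...
EXPLAIN SELECT * FROM orders WHERE user_id = 42;
-- ... but a condition on created_at alone skips the leading column,
-- so the index is typically not usable for the lookup.
EXPLAIN SELECT * FROM orders WHERE created_at >= '2024-01-01';

-- A prefix pattern can use the index; a leading wildcard cannot.
EXPLAIN SELECT * FROM orders WHERE tracking LIKE 'abc%';
EXPLAIN SELECT * FROM orders WHERE tracking LIKE '%abc';

-- Rough check of how selective a 10-character prefix is
-- (a value close to 1 means the prefix is nearly unique).
SELECT COUNT(DISTINCT LEFT(tracking, 10)) / COUNT(*) AS prefix_selectivity
FROM orders;
```

Running the selectivity query with a few different prefix lengths is one way to pick the shortest prefix that still distinguishes rows well.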
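These rules are not hard to understand. For example, multiple AND conditions typically use only one index, while multiple OR conditions may use several. With AND, once one index has located the candidate rows, the remaining conditions only need to be applied as filters on those rows, so there is no need to go through another index. With OR, the rows located through one index do not include the rows that match only the other conditions, so additional index lookups are needed (though not necessarily, since MySQL decides for itself whether the indexes are worth using).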
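A minimal sketch of the AND versus OR behavior follows. The table people and its indexes are hypothetical. Which plan the optimizer actually picks depends on version and statistics; for instance, it may also combine indexes for an AND query, so compare the type and key columns of the EXPLAIN output on real data.

```sql
-- Hypothetical table with two independent single-column indexes.
CREATE TABLE people (
  id   INT PRIMARY KEY AUTO_INCREMENT,
  city VARCHAR(50) NOT NULL,
  age  INT NOT NULL,
  KEY idx_city (city),
  KEY idx_age (age)
);

-- AND: one index narrows the rows; the other condition is usually
-- applied as a filter on the rows already found.
EXPLAIN SELECT * FROM people WHERE city = 'Berlin' AND age = 30;

-- OR: rows matching either side are needed, so the optimizer may
-- combine both indexes (type = index_merge) or fall back to a full scan.
EXPLAIN SELECT * FROM people WHERE city = 'Berlin' OR age = 30;

-- A composite index lets the AND query be answered by a single index lookup.
ALTER TABLE people ADD INDEX idx_city_age (city, age);
EXPLAIN SELECT * FROM people WHERE city = 'Berlin' AND age = 30;
```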
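Indexes and NULL values
Opinions online differ on whether a query such as where name is null uses an index. Some say it does not and recommend declaring every column NOT NULL, replacing NULL with values such as 0 or an empty string. However, I found the relevant statement in the official documentation and ran my own tests: a nullable column can indeed use an index.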
MySQL can perform the same optimization on col_name IS NULL that it can use for col_name = constant_value. For example, MySQL can use indexes and ranges to search for NULL with IS NULL.
Examples
SELECT * FROM tbl_name WHERE key_col IS NULL;

SELECT * FROM tbl_name WHERE key_col <=> NULL;

SELECT * FROM tbl_name
  WHERE key_col=const1 OR key_col=const2 OR key_col IS NULL;
Note: use explain to check whether a query uses an index.
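The IS NULL claim can be checked with a quick experiment. This is a sketch under assumptions: the table members and index idx_name are hypothetical, and on a table this small the optimizer may still prefer a full scan, so a larger data set gives a more meaningful result.

```sql
-- Hypothetical table with a nullable, indexed column.
CREATE TABLE members (
  id   INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(50) NULL,
  KEY idx_name (name)
);

INSERT INTO members (name)
VALUES ('alice'), ('bob'), (NULL), ('carol'), (NULL);

-- Check the `key` column of the output: idx_name there means the
-- index was chosen for the IS NULL lookup.
EXPLAIN SELECT * FROM members WHERE name IS NULL;

-- The NULL-safe equality operator can be optimized the same way.
EXPLAIN SELECT * FROM members WHERE name <=> NULL;
```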
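show profiles lists how long recent queries took, and show profile for query n shows the detailed execution stages of query n. Profiling is disabled by default. profiling is a session variable (not a user variable), so it has to be enabled in each session with set profiling=1; and can be disabled again with set profiling=0;.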
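A typical profiling session might look like the sketch below. The profiled query is only a placeholder, and the query number passed to SHOW PROFILE should be taken from the Query_ID column that SHOW PROFILES prints. Note that the MySQL manual marks SHOW PROFILE and SHOW PROFILES as deprecated in favor of the Performance Schema.

```sql
-- Enable profiling for the current session only.
SET profiling = 1;

-- Placeholder query to profile; substitute the query under investigation.
SELECT COUNT(*) FROM information_schema.tables;

-- List recent statements with their Query_ID and duration.
SHOW PROFILES;

-- Show the per-stage breakdown; replace 1 with the Query_ID from SHOW PROFILES.
SHOW PROFILE FOR QUERY 1;

-- Turn profiling off again.
SET profiling = 0;
```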
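=, IS, and LIKE
IS is normally used only with NULL (IS NULL, IS NOT NULL). = is an equality test; note that comparing NULL with any value using =, including NULL itself, yields NULL rather than true. LIKE is used for string pattern matching, as in 'a_b%', where _ matches exactly one character and % matches any number of characters (including none).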
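The differences between the three operators are easy to see with literal values. The statements below need no tables; the result comments assume MySQL's default case-insensitive collation.

```sql
SELECT NULL = NULL;            -- NULL: = never yields true for NULL
SELECT NULL IS NULL;           -- 1
SELECT NULL <=> NULL;          -- 1: NULL-safe equality
SELECT 'aXb123' LIKE 'a_b%';   -- 1: _ matches 'X', % matches '123'
SELECT 'ab123'  LIKE 'a_b%';   -- 0: _ must consume exactly one character
```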
IN, JOIN, and EXISTS
The execution logic of select * from t1 where exists (select null from t2 where t2.x=t1.x); is roughly equivalent to
items = select * from t1
for x in items
loop
    if ( exists ( select null from t2 where t2.x = x.x ) )
    then
        OUTPUT THE RECORD
    end if
end loop
The execution logic of select * from t1 where t1.x in (select distinct x from t2); is roughly equivalent to
select * from t1, ( select distinct x from t2 ) t2 where t1.x = t2.x;
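From the execution logic above we can infer that IN suits the case where the outer table is the large one: MySQL builds a temporary table from the inner query's result and then joins it with the outer table using the outer table's index, so the temporary table should not be too large. JOIN behaves like IN but skips the temporary table, so it performs somewhat better. EXISTS suits the case where the inner table is the large one: the query iterates over the rows of the outer table, so the larger the outer table the more iterations there are, while the inner table can be probed through its index.
NOT EXISTS still uses the inner table's index, whereas NOT IN no longer uses the outer table's index but still uses the temporary table. NOT EXISTS is therefore not always better than NOT IN: if the outer table is much larger than the inner table, NOT IN has the advantage; otherwise NOT EXISTS should be chosen. The optimizer in newer MySQL versions may rewrite these forms, so it is worth confirming the plan with explain.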
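To compare the forms discussed above on a concrete schema, one can write each variant and inspect its plan. This is only a sketch: t1 and t2 follow the names used in the text, but the column definitions and indexes are assumptions, and the plans will vary with MySQL version and table sizes.

```sql
-- t1 is the outer table, t2 the inner table; both have an index on x.
CREATE TABLE t1 (id INT PRIMARY KEY, x INT, KEY idx_x (x));
CREATE TABLE t2 (id INT PRIMARY KEY, x INT, KEY idx_x (x));

-- IN form.
EXPLAIN SELECT * FROM t1 WHERE t1.x IN (SELECT x FROM t2);

-- EXISTS form.
EXPLAIN SELECT * FROM t1
WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.x = t1.x);

-- JOIN form; the DISTINCT derived table keeps t1 rows from being duplicated.
EXPLAIN SELECT t1.*
FROM t1 JOIN (SELECT DISTINCT x FROM t2) AS d ON t1.x = d.x;

-- Negated forms. Caution: the two treat NULL values of x differently,
-- so they are not always interchangeable.
EXPLAIN SELECT * FROM t1
WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE t2.x = t1.x);

EXPLAIN SELECT * FROM t1 WHERE t1.x NOT IN (SELECT x FROM t2);
```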
Implicit Conversion
- If one or both arguments are NULL, the result of the comparison is NULL, except for the NULL-safe <=> equality comparison operator. For NULL <=> NULL, the result is true. No conversion is needed.
- If both arguments in a comparison operation are strings, they are compared as strings.
- If both arguments are integers, they are compared as integers.
- Hexadecimal values are treated as binary strings if not compared to a number.
- If one of the arguments is a TIMESTAMP or DATETIME column and the other argument is a constant, the constant is converted to a timestamp before the comparison is performed. This is done to be more ODBC-friendly. Note that this is not done for the arguments to IN()! To be safe, always use complete datetime, date, or time strings when doing comparisons. For example, to achieve best results when using BETWEEN with date or time values, use CAST() to explicitly convert the values to the desired data type.
- A single-row subquery from a table or tables is not considered a constant. For example, if a subquery returns an integer to be compared to a DATETIME value, the comparison is done as two integers. The integer is not converted to a temporal value. To compare the operands as DATETIME values, use CAST() to explicitly convert the subquery value to DATETIME.
- If one of the arguments is a decimal value, comparison depends on the other argument. The arguments are compared as decimal values if the other argument is a decimal or integer value, or as floating-point values if the other argument is a floating-point value.
- In all other cases, the arguments are compared as floating-point (real) numbers.
Examples
mysql> select 1+1;
+-----+
| 1+1 |
+-----+
|   2 |
+-----+

mysql> select 'a' + '55';
+------------+
| 'a' + '55' |
+------------+
|         55 |
+------------+

mysql> select 55 = 55;
+--------------+
| 55 = 55      |
+--------------+
|            1 |
+--------------+

mysql> select '55aaa' = 55;
+--------------+
| '55aaa' = 55 |
+--------------+
|            1 |
+--------------+

mysql> select 'aaa55' = 55;
+--------------+
| 'aaa55' = 55 |
+--------------+
|            0 |
+--------------+
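Implicit type conversion can cause both security and performance problems
The security problem is that comparing a string with the number 0 is very likely to be true. For example, select * from user where username='zyong' and password=0; lets anyone who knows the username log in, as long as the stored password does not begin with a nonzero number (a string that does not start with digits is converted to 0).
The performance problem is that when a string column is compared with a number, the string is implicitly converted to a floating-point value. This is equivalent to applying a function to the column, so the index on that column can no longer be used.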
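Both problems can be reproduced with a few statements. The table contacts and its index are hypothetical; the literal comparisons need no table. The EXPLAIN comments describe the typical outcome and should be verified on the MySQL version in use.

```sql
-- Security: a string without leading digits converts to 0.
SELECT 'secret'  = 0;   -- 1
SELECT '9secret' = 0;   -- 0: converts to 9

-- Performance: hypothetical table with an indexed string column.
CREATE TABLE contacts (
  id    INT PRIMARY KEY AUTO_INCREMENT,
  phone VARCHAR(20) NOT NULL,
  KEY idx_phone (phone)
);

-- Number literal against a string column: every phone value has to be
-- converted before comparing, so idx_phone is not used for the lookup.
EXPLAIN SELECT * FROM contacts WHERE phone = 13800000000;

-- String literal: the types match and idx_phone can be used.
EXPLAIN SELECT * FROM contacts WHERE phone = '13800000000';
```

Quoting the literal (or binding parameters with the column's own type) avoids both the unintended match and the lost index.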
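LIKE converts a numeric argument to a string, which avoids the problems above. For example, select * from user where username='zyong' and password like 0; treats the 0 as the string '0'. However, with select * from user where id like 123456; (or select * from user where id like '123%';), the id column (an INT) is converted to a string before being compared with the pattern, so the index is not used. LIKE is therefore recommended only for string columns.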
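The LIKE behavior described above can be checked as follows. The table accounts is hypothetical, and the EXPLAIN comments describe what is typically observed.

```sql
-- LIKE compares as strings: the number 0 becomes the string '0'.
SELECT 'secret' LIKE 0;   -- 0
SELECT '0' LIKE 0;        -- 1

-- Hypothetical table with an integer primary key.
CREATE TABLE accounts (
  id   INT PRIMARY KEY,
  name VARCHAR(50) NOT NULL
);

-- id is converted to a string for the pattern match, so the primary
-- key is typically not used for the lookup.
EXPLAIN SELECT * FROM accounts WHERE id LIKE '123%';

-- A plain integer comparison can use the primary key.
EXPLAIN SELECT * FROM accounts WHERE id = 123456;
```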
References: https://stackoverflow.com/questions/229179/null-in-mysql-performance-storage
http://muyue123.blog.sohu.com/146930118.html
http://www.cnblogs.com/rollenholt/p/5442825.html
https://dev.mysql.com/doc/refman/5.7/en/string-comparison-functions.html