数据库优化(十二)-tpch中Q3-Q5仅仅在sql优化方面的分析

--Q3;

1.1执行计划

EXPLAIN EXTENDED SELECT

L_ORDERKEY,

SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT)) AS REVENUE,

O_ORDERDATE,

O_SHIPPRIORITY

FROM

CUSTOMER,

ORDERS,

LINEITEM

WHERE

C_MKTSEGMENT = ‘BUILDING‘

AND C_CUSTKEY = O_CUSTKEY

AND L_ORDERKEY = O_ORDERKEY

AND O_ORDERDATE < DATE ‘1995-03-15‘

AND L_SHIPDATE > DATE ‘1995-03-15‘

GROUP BY

L_ORDERKEY,

O_ORDERDATE,

O_SHIPPRIORITY

ORDER BY

REVENUE DESC,

O_ORDERDATE;

+----+-------------+----------+--------+--------------------------------------------+---------+---------+------------------------+---------+----------+----------------------------------------------+

| id | select_type | table    | type   | possible_keys                              | key     | key_len | ref                    | rows    | filtered | Extra                                        |

+----+-------------+----------+--------+--------------------------------------------+---------+---------+------------------------+---------+----------+----------------------------------------------+

|  1 | SIMPLE      | ORDERS   | ALL    | PRIMARY,i_o_orderdate,i_o_custkey          | NULL    | NULL    | NULL                   | 1522266 |    50.00 | Using where; Using temporary; Using filesort |

|  1 | SIMPLE      | CUSTOMER | eq_ref | PRIMARY                                    | PRIMARY | 4       | tpch.ORDERS.o_custkey  |       1 |   100.00 | Using where                                  |

|  1 | SIMPLE      | LINEITEM | ref    | PRIMARY,i_l_orderkey,i_l_orderkey_quantity | PRIMARY | 4       | tpch.ORDERS.o_orderkey |       1 |   100.00 | Using where                                  |

+----+-------------+----------+--------+--------------------------------------------+---------+---------+------------------------+---------+----------+----------------------------------------------+

3 rows in set, 1 warning (0.00 sec)

1.2分析

该语句带有分组、排序、聚集操作并存的三表查询操作。对于orders表执行了全表扫描,使用了临时表和文件排序。

1.2.1查看ordrs表的索引情况

mysql> show indexes from orders;

+--------+------------+--------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| Table  | Non_unique | Key_name           | Seq_in_index | Column_name    | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |

+--------+------------+--------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| orders |          0 | PRIMARY            |            1 | o_orderkey     | A         |     1522266 |     NULL | NULL   |      | BTREE      |         |               |

| orders |          1 | i_o_orderdate      |            1 | o_orderDATE    | A         |        5722 |     NULL | NULL   | YES  | BTREE      |         |               |

| orders |          1 | i_o_custkey        |            1 | o_custkey      | A         |      217466 |     NULL | NULL   | YES  | BTREE      |         |               |

|+--------+------------+--------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

3 rows in set (0.00 sec)

1.3groupby的优化

为字段列o_shippriority添加索引

MySQL对于GROUPBY的处理,通常采用的方式是扫描整个表、创建一个临时表用以执行分组操作。查询执行计划中出现“Using temporary”字样表示MySQL采用了常规的处理方式。现在为列o_shippriority添加索引,看添加索引后的查询执行情况。

mysql> alter table orders drop index idx_O_SHIPPRIORITY,add index idx_O_SHIPPRIORITY(O_SHIPPRIORITY);

Query OK, 0 rows affected (3.02 sec)

Records: 0  Duplicates: 0  Warnings: 0

再次查看执行计划

+----+-------------+----------+--------+--------------------------------------------+---------+---------+------------------------+---------+----------+----------------------------------------------+

| id | select_type | table    | type   | possible_keys                              | key     | key_len | ref                    | rows    | filtered | Extra                                        |

+----+-------------+----------+--------+--------------------------------------------+---------+---------+------------------------+---------+----------+----------------------------------------------+

|  1 | SIMPLE      | ORDERS   | ALL    | PRIMARY,i_o_orderdate,i_o_custkey          | NULL    | NULL    | NULL                   | 1522266 |    50.00 | Using where; Using temporary; Using filesort |

|  1 | SIMPLE      | CUSTOMER | eq_ref | PRIMARY                                    | PRIMARY | 4       | tpch.ORDERS.o_custkey  |       1 |   100.00 | Using where                                  |

|  1 | SIMPLE      | LINEITEM | ref    | PRIMARY,i_l_orderkey,i_l_orderkey_quantity | PRIMARY | 4       | tpch.ORDERS.o_orderkey |       1 |   100.00 | Using where                                  |

+----+-------------+----------+--------+--------------------------------------------+---------+---------+------------------------+---------+----------+----------------------------------------------+

3 rows in set, 1 warning (0.00 sec)

查看表orders的索引情况:

mysql> show indexes from orders;

+--------+------------+--------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| Table  | Non_unique | Key_name           | Seq_in_index | Column_name    | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |

+--------+------------+--------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| orders |          0 | PRIMARY            |            1 | o_orderkey     | A         |     1522266 |     NULL | NULL   |      | BTREE      |         |               |

| orders |          1 | i_o_orderdate      |            1 | o_orderDATE    | A         |        5722 |     NULL | NULL   | YES  | BTREE      |         |               |

| orders |          1 | i_o_custkey        |            1 | o_custkey      | A         |      217466 |     NULL | NULL   | YES  | BTREE      |         |               |

| orders |          1 | idx_O_SHIPPRIORITY |            1 | o_shippriority | A         |          18 |     NULL | NULL   | YES  | BTREE      |         |               |

+--------+------------+--------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

字段列o_shippriority中添加索引idx_O_SHIPPRIORITY后,Cardinality列的值最小,也就是数据的唯一值的数应该是最少的,查询优化器认为没有必要使用该索引,或许这就是不能使用该索引的原因?

对于group by 的优化的可能性应该是没有了。

1.4order by的优化

查询语句中O_ORDERDATE列上已经存在索引,对于聚集函数sum的优化可以转化为对sum列的优化。查看lineitem表中的是索引。

mysql> show indexes from lineitem;

+----------+------------+-----------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| Table    | Non_unique | Key_name              | Seq_in_index | Column_name     | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |

+----------+------------+-----------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| lineitem |          0 | PRIMARY               |            1 | l_orderkey      | A         |     5976465 |     NULL | NULL   |      | BTREE      |         |               |

| lineitem |          0 | PRIMARY               |            2 | l_linenumber    | A         |     5976465 |     NULL | NULL   |      | BTREE      |         |               |

| lineitem |          1 | i_l_partkey           |            1 | l_partkey       | A         |      398431 |     NULL | NULL   | YES  | BTREE      |         |               |

| lineitem |          1 | i_l_orderkey          |            1 | l_orderkey      | A         |     5976465 |     NULL | NULL   |      | BTREE      |         |               |

| lineitem |          1 | i_l_commitdate        |            1 | l_commitDATE    | A         |          18 |     NULL | NULL   | YES  | BTREE      |         |               |

| lineitem |          1 | i_l_suppkey_partkey   |            1 | l_partkey       | A         |      351556 |     NULL | NULL   | YES  | BTREE      |         |               |

| lineitem |          1 | i_l_suppkey_partkey   |            2 | l_suppkey       | A         |     1494116 |     NULL | NULL   | YES  | BTREE      |         |               |

| lineitem |          1 | i_l_suppkey           |            1 | l_suppkey       | A         |       21972 |     NULL | NULL   | YES  | BTREE      |         |               |

| lineitem |          1 | i_l_receiptdate       |            1 | l_receiptDATE   | A         |        2584 |     NULL | NULL   | YES  | BTREE      |         |               |

| lineitem |          1 | i_l_orderkey_quantity |            1 | l_orderkey      | A         |     5976465 |     NULL | NULL   |      | BTREE      |         |               |

| lineitem |          1 | i_l_orderkey_quantity |            2 | l_quantity      | A         |     5976465 |     NULL | NULL   | YES  | BTREE      |         |               |

| lineitem |          1 | i_l_extendedprice     |            1 | l_extendedprice | A         |     1992155 |     NULL | NULL   | YES  | BTREE      |         |               |

| lineitem |          1 | idx_l_discount        |            1 | l_discount      | A         |          18 |     NULL | NULL   | YES  | BTREE      |         |               |

+----------+------------+-----------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

13 rows in set (0.00 sec)

可以看到列l_extendedprice和l_discount(新添加)都有索引,并把sum函数的表达式改为SUM(L_EXTENDEDPRICE * 1 - L_EXTENDEDPRICE * L_DISCOUNT) AS REVENUE

+----+-------------+----------+--------+--------------------------------------------+---------+---------+------------------------+---------+----------+----------------------------------------------+

| id | select_type | table    | type   | possible_keys                              | key     | key_len | ref                    | rows    | filtered | Extra                                        |

+----+-------------+----------+--------+--------------------------------------------+---------+---------+------------------------+---------+----------+----------------------------------------------+

|  1 | SIMPLE      | ORDERS   | ALL    | PRIMARY,i_o_orderdate,i_o_custkey          | NULL    | NULL    | NULL                   | 1522266 |    50.00 | Using where; Using temporary; Using filesort |

|  1 | SIMPLE      | CUSTOMER | eq_ref | PRIMARY                                    | PRIMARY | 4       | tpch.ORDERS.o_custkey  |       1 |   100.00 | Using where                                  |

|  1 | SIMPLE      | LINEITEM | ref    | PRIMARY,i_l_orderkey,i_l_orderkey_quantity | PRIMARY | 4       | tpch.ORDERS.o_orderkey |       1 |   100.00 | Using where                                  |

+----+-------------+----------+--------+--------------------------------------------+---------+---------+------------------------+---------+----------+----------------------------------------------+

3 rows in set, 1 warning (0.00 sec)

mysql> show warnings\G;

*************************** 1. row ***************************

Level: Note

Code: 1003

Message: /* select#1 */ select `tpch`.`lineitem`.`l_orderkey` AS `L_ORDERKEY`,sum((`tpch`.`lineitem`.`l_extendedprice` - (`tpch`.`lineitem`.`l_extendedprice` * `tpch`.`lineitem`.`l_discount`))) AS `REVENUE`,`tpch`.`orders`.`o_orderDATE` AS `O_ORDERDATE`,`tpch`.`orders`.`o_shippriority` AS `O_SHIPPRIORITY` from `tpch`.`customer` join `tpch`.`orders` join `tpch`.`lineitem` where ((`tpch`.`lineitem`.`l_orderkey` = `tpch`.`orders`.`o_orderkey`) and (`tpch`.`customer`.`c_custkey` = `tpch`.`orders`.`o_custkey`) and (`tpch`.`customer`.`c_mktsegment` = ‘BUILDING‘) and (`tpch`.`orders`.`o_orderDATE` < DATE‘1995-03-15‘) and (`tpch`.`lineitem`.`l_shipDATE` > DATE‘1995-03-15‘)) group by `tpch`.`lineitem`.`l_orderkey`,`tpch`.`orders`.`o_orderDATE`,`tpch`.`orders`.`o_shippriority` order by `REVENUE` desc,`tpch`.`orders`.`o_orderDATE`

1 row in set (0.00 sec)

1.5综述

综上,通过该语句的分析,仅仅从sql语句优化方面考虑,暂时没有优化的方法。

--Q4;

2.1执行计划

EXPLAIN EXTENDED SELECT

O_ORDERPRIORITY,

COUNT(*) AS ORDER_COUNT

FROM

ORDERS

WHERE

O_ORDERDATE >= DATE ‘1993-07-01‘

AND O_ORDERDATE < DATE ‘1993-07-01‘ + INTERVAL ‘3‘ MONTH

AND EXISTS (

SELECT

*

FROM

LINEITEM

WHERE

L_ORDERKEY = O_ORDERKEY

AND L_COMMITDATE < L_RECEIPTDATE

)

GROUP BY

O_ORDERPRIORITY

ORDER BY

O_ORDERPRIORITY;

+----+--------------------+----------+-------+--------------------------------------------+---------------+---------+------------------------+--------+----------+---------------------------------------------------------------------+

| id | select_type        | table    | type  | possible_keys                              | key           | key_len | ref                    | rows   | filtered | Extra                                                               |

+----+--------------------+----------+-------+--------------------------------------------+---------------+---------+------------------------+--------+----------+---------------------------------------------------------------------+

|  1 | PRIMARY            | ORDERS   | range | i_o_orderdate                              | i_o_orderdate | 4       | NULL                   | 117768 |   100.00 | Using index condition; Using where; Using temporary; Using filesort |

|  2 | DEPENDENT SUBQUERY | LINEITEM | ref   | PRIMARY,i_l_orderkey,i_l_orderkey_quantity | PRIMARY       | 4       | tpch.ORDERS.o_orderkey |      1 |   100.00 | Using where                                                         |

+----+--------------------+----------+-------+--------------------------------------------+---------------+---------+------------------------+--------+----------+---------------------------------------------------------------------+

2 rows in set, 2 warnings (0.00 sec)

mysql> show warnings\G;

*************************** 1. row ***************************

Level: Note

Code: 1276

Message: Field or reference ‘tpch.ORDERS.o_orderkey‘ of SELECT #2 was resolved in SELECT #1

*************************** 2. row ***************************

Level: Note

Code: 1003

Message: /* select#1 */ select `tpch`.`orders`.`o_orderpriority` AS `O_ORDERPRIORITY`,count(0) AS `ORDER_COUNT` from `tpch`.`orders` where ((`tpch`.`orders`.`o_orderDATE` >= DATE‘1993-07-01‘) and (`tpch`.`orders`.`o_orderDATE` < (DATE‘1993-07-01‘ + interval ‘3‘ month)) and exists(/* select#2 */ select 1 from `tpch`.`lineitem` where ((`tpch`.`lineitem`.`l_orderkey` = `tpch`.`orders`.`o_orderkey`) and (`tpch`.`lineitem`.`l_commitDATE` < `tpch`.`lineitem`.`l_receiptDATE`)))) group by `tpch`.`orders`.`o_orderpriority` order by `tpch`.`orders`.`o_orderpriority`

2 rows in set (0.00 sec)

2.2分析

该语句带有分组、排序、聚集操作、子查询并存的单表查询操作。子查询是相关子查询。

2.2.1对于分组和排序的优化

对于分组和排序的优化,首先想到的是给涉及的列添加索引。

mysql> alter table orders add index idx_o_orderpriority(o_orderpriority);

Query OK, 0 rows affected (1 min 9.11 sec)

Records: 0  Duplicates: 0  Warnings: 0

+----+--------------------+----------+-------+--------------------------------------------+---------------+---------+------------------------+--------+----------+---------------------------------------------------------------------+

| id | select_type        | table    | type  | possible_keys                              | key           | key_len | ref                    | rows   | filtered | Extra                                                               |

+----+--------------------+----------+-------+--------------------------------------------+---------------+---------+------------------------+--------+----------+---------------------------------------------------------------------+

|  1 | PRIMARY            | ORDERS   | range | i_o_orderdate,idx_o_orderpriority          | i_o_orderdate | 4       | NULL                   | 117768 |   100.00 | Using index condition; Using where; Using temporary; Using filesort |

|  2 | DEPENDENT SUBQUERY | LINEITEM | ref   | PRIMARY,i_l_orderkey,i_l_orderkey_quantity | PRIMARY       | 4       | tpch.ORDERS.o_orderkey |      1 |   100.00 | Using where                                                         |

+----+--------------------+----------+-------+--------------------------------------------+---------------+---------+------------------------+--------+----------+---------------------------------------------------------------------+

2 rows in set, 2 warnings (0.00 sec)

mysql> show warnings\G;

*************************** 1. row ***************************

Level: Note

Code: 1276

Message: Field or reference ‘tpch.ORDERS.o_orderkey‘ of SELECT #2 was resolved in SELECT #1

*************************** 2. row ***************************

Level: Note

Code: 1003

Message: /* select#1 */ select `tpch`.`orders`.`o_orderpriority` AS `O_ORDERPRIORITY`,count(0) AS `ORDER_COUNT` from `tpch`.`orders` where ((`tpch`.`orders`.`o_orderDATE` >= DATE‘1993-07-01‘) and (`tpch`.`orders`.`o_orderDATE` < (DATE‘1993-07-01‘ + interval ‘3‘ month)) and exists(/* select#2 */ select 1 from `tpch`.`lineitem` where ((`tpch`.`lineitem`.`l_orderkey` = `tpch`.`orders`.`o_orderkey`) and (`tpch`.`lineitem`.`l_commitDATE` < `tpch`.`lineitem`.`l_receiptDATE`)))) group by `tpch`.`orders`.`o_orderpriority` order by `tpch`.`orders`.`o_orderpriority`

2 rows in set (0.00 sec)

从执行计划可以看出对于查询没有任何改观。再查看一下该表的索引情况:

mysql> show indexes from orders;

+--------+------------+---------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| Table  | Non_unique | Key_name            | Seq_in_index | Column_name     | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |

+--------+------------+---------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| orders |          0 | PRIMARY             |            1 | o_orderkey      | A         |     1445997 |     NULL | NULL   |      | BTREE      |         |               |

| orders |          1 | i_o_orderdate       |            1 | o_orderDATE     | A         |        5738 |     NULL | NULL   | YES  | BTREE      |         |               |

| orders |          1 | i_o_custkey         |            1 | o_custkey       | A         |      206571 |     NULL | NULL   | YES  | BTREE      |         |               |

| orders |          1 | idx_O_SHIPPRIORITY  |            1 | o_shippriority  | A         |          18 |     NULL | NULL   | YES  | BTREE      |         |               |

| orders |          1 | idx_o_orderpriority |            1 | o_orderpriority | A         |          18 |     NULL | NULL   | YES  | BTREE      |         |               |

+--------+------------+---------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

5 rows in set (0.00 sec)

字段列o_orderpriority中添加索引idx_o_orderpriority后,Cardinality列的值依然是最小,也就是数据的唯一值的数应该是最少的,查询优化器依旧认为没有必要使用该索引。

如果强制性使用idx_o_orderpriority,看看执行计划的情况:

EXPLAIN EXTENDED SELECT

O_ORDERPRIORITY,

COUNT(*) AS ORDER_COUNT

FROM

#ORDERS

ORDERS force index(idx_o_orderpriority)

WHERE

O_ORDERDATE >= DATE ‘1993-07-01‘

AND O_ORDERDATE < DATE ‘1993-07-01‘ + INTERVAL ‘3‘ MONTH

AND EXISTS (

SELECT

*

FROM

LINEITEM

WHERE

L_ORDERKEY = O_ORDERKEY

AND L_COMMITDATE < L_RECEIPTDATE

)

GROUP BY

O_ORDERPRIORITY

ORDER BY

O_ORDERPRIORITY;

+----+--------------------+----------+-------+--------------------------------------------+---------------------+---------+------------------------+---------+----------+-------------+

| id | select_type        | table    | type  | possible_keys                              | key                 | key_len | ref                    | rows    | filtered | Extra       |

+----+--------------------+----------+-------+--------------------------------------------+---------------------+---------+------------------------+---------+----------+-------------+

|  1 | PRIMARY            | ORDERS   | index | idx_o_orderpriority                        | idx_o_orderpriority | 16      | NULL                   | 1445997 |   100.00 | Using where |

|  2 | DEPENDENT SUBQUERY | LINEITEM | ref   | PRIMARY,i_l_orderkey,i_l_orderkey_quantity | PRIMARY             | 4       | tpch.ORDERS.o_orderkey |       1 |   100.00 | Using where |

+----+--------------------+----------+-------+--------------------------------------------+---------------------+---------+------------------------+---------+----------+-------------+

2 rows in set, 2 warnings (0.01 sec)

可以看到,扫描的行数(1445997)却比不适用该索引多扫描(117768 )了100多万行。

因此对于分组和排序的优化余地应该不存在。

2.2.2尝试使用mrr算法的优化

mysql> show variables like ‘optimizer_switch‘\G;

*************************** 1. row ***************************

Variable_name: optimizer_switch

Value: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=off,mrr_cost_based=on,block_nested_loop=on,batched_key_access=on,materialization=on,semijoin=on,loosescan=on,firstmatch=on,subquery_materialization_cost_based=on,use_index_extensions=on

mysql> set optimizer_switch=‘mrr=on,mrr_cost_based=off‘;

Query OK, 0 rows affected (0.00 sec)

再次查看执行计划:

+----+--------------------+----------+-------+--------------------------------------------+---------------+---------+------------------------+--------+--------------------------------------------------------------------------------+

| id | select_type        | table    | type  | possible_keys                              | key           | key_len | ref                    | rows   | Extra                                                                          |

+----+--------------------+----------+-------+--------------------------------------------+---------------+---------+------------------------+--------+--------------------------------------------------------------------------------+

|  1 | PRIMARY            | ORDERS   | range | i_o_orderdate,idx_o_orderpriority          | i_o_orderdate | 4       | NULL                   | 117768 | Using index condition; Using where; Using MRR; Using temporary; Using filesort |

|  2 | DEPENDENT SUBQUERY | LINEITEM | ref   | PRIMARY,i_l_orderkey,i_l_orderkey_quantity | PRIMARY       | 4       | tpch.ORDERS.o_orderkey |      1 | Using where                                                                    |

+----+--------------------+----------+-------+--------------------------------------------+---------------+---------+------------------------+--------+--------------------------------------------------------------------------------+

2 rows in set (0.00 sec)

优化目的是减少磁盘的随机访问,其前提是使用索引(i_o_orderdate),找到所需的行,并按照主键进行排序,并将随机访问转换为较为顺序的数据访问,其优化使用于range、ref和eq_ref类型的查询,也就是说非这些类型的查询,优化也就失去了作用:

2.3综述

通过该语句的分析,仅仅从sql语句优化方面考虑,增加不必要的索引,除了维护上不必要的开销外,对优化没有任何意义;

可以通过mrr算法进行优化,以减少对IO的访问,提高查询效率。

--Q5;

3.1执行计划

EXPLAIN EXTENDED SELECT

N_NAME,

SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT)) AS REVENUE

FROM

CUSTOMER,

ORDERS,

LINEITEM,

SUPPLIER,

NATION,

REGION

WHERE

C_CUSTKEY = O_CUSTKEY

AND L_ORDERKEY = O_ORDERKEY

AND L_SUPPKEY = S_SUPPKEY

AND C_NATIONKEY = S_NATIONKEY

AND S_NATIONKEY = N_NATIONKEY

AND N_REGIONKEY = R_REGIONKEY

AND R_NAME = ‘ASIA‘

AND O_ORDERDATE >= DATE ‘1994-01-01‘

AND O_ORDERDATE < DATE ‘1994-01-01‘ + INTERVAL ‘1‘ YEAR

GROUP BY

N_NAME

ORDER BY

REVENUE DESC;

+----+-------------+----------+--------+--------------------------------------------------------+---------------+---------+--------------------------+------+----------+-----------------------------------------------------+

| id | select_type | table    | type   | possible_keys                                          | key           | key_len | ref                      | rows | filtered | Extra                                               |

+----+-------------+----------+--------+--------------------------------------------------------+---------------+---------+--------------------------+------+----------+-----------------------------------------------------+

|  1 | SIMPLE      | REGION   | ALL    | PRIMARY                                                | NULL          | NULL    | NULL                     |    5 |   100.00 | Using where; Using temporary; Using filesort        |

|  1 | SIMPLE      | NATION   | ref    | PRIMARY,i_n_regionkey,idx_n_name                       | i_n_regionkey | 5       | tpch.REGION.r_regionkey  |    2 |   100.00 | Using join buffer (Batched Key Access)              |

|  1 | SIMPLE      | SUPPLIER | ref    | i_s_nationkey                                          | i_s_nationkey | 5       | tpch.NATION.n_nationkey  |  221 |   100.00 | Using where; Using join buffer (Batched Key Access) |

|  1 | SIMPLE      | LINEITEM | ref    | PRIMARY,i_l_orderkey,i_l_suppkey,i_l_orderkey_quantity | i_l_suppkey   | 5       | tpch.SUPPLIER.s_suppkey  |  272 |   100.00 | Using join buffer (Batched Key Access)              |

|  1 | SIMPLE      | ORDERS   | eq_ref | PRIMARY,i_o_orderdate,i_o_custkey                      | PRIMARY       | 4       | tpch.LINEITEM.l_orderkey |    1 |   100.00 | Using where                                         |

|  1 | SIMPLE      | CUSTOMER | eq_ref | PRIMARY,i_c_nationkey                                  | PRIMARY       | 4       | tpch.ORDERS.o_custkey    |    1 |   100.00 | Using where                                         |

+----+-------------+----------+--------+--------------------------------------------------------+---------------+---------+--------------------------+------+----------+-----------------------------------------------------+

6 rows in set, 1 warning (0.01 sec)

mysql> show warnings\G;

*************************** 1. row ***************************

Level: Note

Code: 1003

Message: /* select#1 */ select `tpch`.`nation`.`n_name` AS `N_NAME`,sum((`tpch`.`lineitem`.`l_extendedprice` * (1 - `tpch`.`lineitem`.`l_discount`))) AS `REVENUE` from `tpch`.`customer` join `tpch`.`orders` join `tpch`.`lineitem` join `tpch`.`supplier` join `tpch`.`nation` join `tpch`.`region` where ((`tpch`.`region`.`r_name` = ‘ASIA‘) and (`tpch`.`nation`.`n_regionkey` = `tpch`.`region`.`r_regionkey`) and (`tpch`.`supplier`.`s_nationkey` = `tpch`.`nation`.`n_nationkey`) and (`tpch`.`customer`.`c_nationkey` = `tpch`.`nation`.`n_nationkey`) and (`tpch`.`lineitem`.`l_suppkey` = `tpch`.`supplier`.`s_suppkey`) and (`tpch`.`orders`.`o_orderkey` = `tpch`.`lineitem`.`l_orderkey`) and (`tpch`.`customer`.`c_custkey` = `tpch`.`orders`.`o_custkey`) and (`tpch`.`orders`.`o_orderDATE` >= DATE‘1994-01-01‘) and (`tpch`.`orders`.`o_orderDATE` < <cache>((DATE‘1994-01-01‘ + interval ‘1‘ year)))) group by `tpch`.`nation`.`n_name` order by `REVENUE` desc

1 row in set (0.00 sec)

3.2分析

优化使用了Batched Key Access算法和<cache>方法,另外,通过对表region的全表扫描,使用了临时表和文件排序。通过对前面Q3、Q4的分析,order by无优化的可能,而对于n_name列,由于其数据量很少,添加索引完全没有必要。

尝试去掉表连接:

where条件:

AND C_NATIONKEY = S_NATIONKEY

AND S_NATIONKEY = N_NATIONKEY

更改为AND C_NATIONKEY = N_NATIONKEY

查看执行计划:

+----+-------------+----------+--------+--------------------------------------------------------+-------------+---------+---------------------------+-------+----------+----------------------------------------------------+

| id | select_type | table    | type   | possible_keys                                          | key         | key_len | ref                       | rows  | filtered | Extra                                              |

+----+-------------+----------+--------+--------------------------------------------------------+-------------+---------+---------------------------+-------+----------+----------------------------------------------------+

|  1 | SIMPLE      | SUPPLIER | ALL    | NULL                                                   | NULL        | NULL    | NULL                      | 10176 |   100.00 | Using where; Using temporary; Using filesort       |

|  1 | SIMPLE      | LINEITEM | ref    | PRIMARY,i_l_orderkey,i_l_suppkey,i_l_orderkey_quantity | i_l_suppkey | 5       | tpch.SUPPLIER.s_suppkey   |   272 |   100.00 | Using join buffer (Batched Key Access)             |

|  1 | SIMPLE      | ORDERS   | eq_ref | PRIMARY,i_o_orderdate,i_o_custkey                      | PRIMARY     | 4       | tpch.LINEITEM.l_orderkey  |     1 |   100.00 | Using where                                        |

|  1 | SIMPLE      | CUSTOMER | eq_ref | PRIMARY,i_c_nationkey                                  | PRIMARY     | 4       | tpch.ORDERS.o_custkey     |     1 |   100.00 | Using where                                        |

|  1 | SIMPLE      | NATION   | eq_ref | PRIMARY,i_n_regionkey,idx_n_name                       | PRIMARY     | 4       | tpch.CUSTOMER.c_nationkey |     1 |   100.00 | NULL                                               |

|  1 | SIMPLE      | REGION   | ALL    | PRIMARY                                                | NULL        | NULL    | NULL                      |     5 |    80.00 | Using where; Using join buffer (Block Nested Loop) |

+----+-------------+----------+--------+--------------------------------------------------------+-------------+---------+---------------------------+-------+----------+----------------------------------------------------+

6 rows in set, 1 warning (0.11 sec)

mysql> show warnings\G;

*************************** 1. row ***************************

Level: Note

Code: 1003

Message: /* select#1 */ select `tpch`.`nation`.`n_name` AS `N_NAME`,sum((`tpch`.`lineitem`.`l_extendedprice` * (1 - `tpch`.`lineitem`.`l_discount`))) AS `REVENUE` from `tpch`.`customer` join `tpch`.`orders` join `tpch`.`lineitem` join `tpch`.`supplier` join `tpch`.`nation` join `tpch`.`region` where ((`tpch`.`region`.`r_name` = ‘ASIA‘) and (`tpch`.`region`.`r_regionkey` = `tpch`.`nation`.`n_regionkey`) and (`tpch`.`nation`.`n_nationkey` = `tpch`.`customer`.`c_nationkey`) and (`tpch`.`lineitem`.`l_suppkey` = `tpch`.`supplier`.`s_suppkey`) and (`tpch`.`orders`.`o_orderkey` = `tpch`.`lineitem`.`l_orderkey`) and (`tpch`.`customer`.`c_custkey` = `tpch`.`orders`.`o_custkey`) and (`tpch`.`orders`.`o_orderDATE` >= DATE‘1994-01-01‘) and (`tpch`.`orders`.`o_orderDATE` < <cache>((DATE‘1994-01-01‘ + interval ‘1‘ year)))) group by `tpch`.`nation`.`n_name` order by `REVENUE` desc

1 row in set (0.00 sec)

源语句对于表REGION全表扫描,影响的行数为5,而改造后的语句对于表supplier的全表扫描,影响的行数为10176。

3.3综述

对于SQL语句来说,仅仅从语句方面来说,优化的空间很小,怎么与物理设备和数据库的参数结合起来共同考虑,确实值得探讨。

时间: 2024-10-08 14:57:41

数据库优化(十二)-tpch中Q3-Q5仅仅在sql优化方面的分析的相关文章

SpringMVC学习(十二)——SpringMVC中的拦截器

SpringMVC学习(十二)--SpringMVC中的拦截器 SpringMVC的处理器拦截器类似于Servlet开发中的过滤器Filter,用于对处理器进行预处理和后处理.本文主要总结一下SpringMVC中拦截器是如何定义的,以及测试拦截器的执行情况和使用方法. SpringMVC中拦截器的定义和配置 SpringMVC中拦截器的定义 在SpringMVC中,定义拦截器要实现HandlerInterceptor接口,并实现该接口中提供的三个方法,如下: public class Inter

Docker学习(十二)中遇到的一些问题汇总

Docker学习(十二)中遇到的一些问题汇总 标签(空格分隔): docker docker: Error response from daemon: Conflict. The container name "/myubuntu" is already in use docker container ls docker container rm 容器名 docker rm $(docker ps -a -q) 删除所有容器 centos无法上网 ifup ens33 // 启用网卡

Oracle 学习之 性能优化(十二) 分区表

一.分区概念 Oracle允许将表.索引.索引组织表细分成更小的片,每个片我们称之为分区.分区有其自己的名字和存储参数. 如下图: 每行数据只能属于一个分区,分区键决定数据行属于哪个分区.分区键由一个或多个列组成.Oracle自动的将数据的DML操作映射到相应的分区中. 何时使用分区表: Tables greater than 2 GB should always be considered as candidates for partitioning. Tables containing hi

Node.js(十二)——NodeJs中的Promise

爬虫基于回调和事件的方式去实现,回调也是被诟病已久的问题尤其是callback这种,无论是阅读还是调试都很费劲,甚至我们连代码的堆栈都看不到,这是一种反人类的写法,Promise来拜托这种痛苦的方式 传统方式实现动画效果: <!DOCTYPE html> <html> <head> <meta charset="UTF-8"> <title>Promise animation</title> <style&g

Java面试题-数据库篇十二

111.把"sc"表中"王五"所教课的成绩都更改为此课程的平均成绩 update sc set score = (select avg(sc_2.score) from sc sc_2 wheresc_2.cno=sc.cno) from course,teacher where course.cno=sc.cno and course.tno=teacher.tno andteacher.tname='王五' 112.查询和编号为2的同学学习的课程完全相同的其他同

(十二) WebGIS中矢量图层的设计

1.前言 在前几章中我们已经了解了什么是矢量查询.屏幕坐标与地理坐标之前的转换以及要素的设计,有了这些知识,这一章里我们终于可以进行矢量图层的设计了. 在Catalog里创建一个FeatureClass时,我们能够选择是创建点图层或者线图层或者面图层等.所谓点图层,即该图层中的要素只能是点,其他几个图层顾名思义,也是如此.那么在WebGIS中,我们同样也可以按照ArcGIS里的这个设计思路来设计我们的矢量图层,虽然这个方法是肯定可行的,但是在实际的业务需求中,如果按照这种点图层等来严格控制的话,

MYSQL数据库学习十二 使用MySQL运算符

12.1 算术运算符 + - * /(DIV) %(MOD) 12.2 比较运算符 > < = <=> != <> >= <= BETWEEN AND 存在于指定范围 IS NULL 为空 IN 存在于指定集合 LIKE 通配符匹配 REGEXP 正则表达式匹配 所谓正则表达式,就是通过模式去匹配一类字符串. 模式字符 含义 ^ 匹配字符串的开始部分 $ 匹配字符串的结束部分 . 匹配字符串中的任意一个字符 [字符集合] 匹配字符集合中的任意一个字符 [^字

Java Servlet(十二):Servlet、Listener、Filter之间的执行流程分析

时隔几年后,看到本系列文章讲解的内容缺少了不少内容:周末无事分析了Spring Security是如何被集成到Web Servlet(SpringMVC)时,需要重新理清Filter.Listener.Servlet(SpringMVC#DispatcherServlet)之间的执行顺序,于是就有了本篇文章.这个话题是Web Servlet学习中的一个重点,弄清它们之间的执行流程,有助于理解SpringMVC.Spring Security这些框架是否如何与Web Servlet集成到一起. 原

Netty in Action (二十三) 第十二章节 WebSocket

第三部分:网络协议 WebSocket是一个先进的网络协议,被开发用来用来提高网络的性能和web应用的响应率,我们将介绍Netty对WebSocket这两个特性的支持,同时我们也会举一个简单的实例来说明讲解这两个WebSocket的特性 在第十二章节中,你将学会如何使用WebSocket实现数据双向传输的功能,我们会写一个聊天室的方式讲解这个数据双向传输的问题,我们这个聊天室的实例是这样的:多个浏览器客户端可是实时的相互通信,你也会学会如何将普通的HTTP协议切换升级成WebSocket协议,当