月度利息统计sql优化
原因:写的sql语句复杂,理解起来有难度,另一方面,查询性能比较低
原来的语句如下:
SELECT
tp.year,
tp.month,
tp.bid_id,
b.`title`,
DATE(b.`time`) `time`,
tp.receive_date,
u.`name`,
u.`reality_name`,
b.`amount`,
b.`apr`,
b.`period_unit`,
b.`period`,
tp.receive_interest,
tp.bid_invest_count,
IFNULL((SELECT SUM(s.`amount`) FROM t_invests s WHERE s.`bid_id`=tp.bid_id AND s.user_id IN (SELECT r.user_id FROM `t_user_repair` r)), 0) AS bid_amount1,
IFNULL((SELECT SUM(s.`receive_corpus`) FROM t_bill_invests s WHERE s.`bid_id`=tp.bid_id AND DATE(IFNULL(s.`real_receive_time`, s.`receive_time`))=tp.`receive_date` AND s.user_id IN (SELECT r.user_id FROM `t_user_repair` r)), 0) AS bid_corpus1,
IFNULL((SELECT SUM(s.`receive_interest`) FROM t_bill_invests s WHERE s.`bid_id`=tp.bid_id AND DATE(IFNULL(s.`real_receive_time`, s.`receive_time`))=tp.`receive_date` AND s.user_id IN (SELECT r.user_id FROM `t_user_repair` r)), 0) AS bid_interest1,
IFNULL((SELECT COUNT(1) FROM t_bill_invests s WHERE s.`bid_id`=tp.bid_id AND DATE(IFNULL(s.`real_receive_time`, s.`receive_time`))=tp.`receive_date` AND s.user_id IN (SELECT r.user_id FROM `t_user_repair` r)), 0) AS bid_invest_count1,
IFNULL((SELECT SUM(s.`receive_corpus`) FROM t_bill_invests s WHERE s.`bid_id`=tp.bid_id AND DATE(IFNULL(s.`real_receive_time`, s.`receive_time`))=tp.`receive_date` AND s.user_id NOT IN (SELECT r.user_id FROM `t_user_repair` r)), 0) AS bid_corpus2
FROM
(
SELECT
t.`bid_id`,
DATE(IFNULL(t.`real_receive_time`, t.`receive_time`)) AS `receive_date`,
YEAR(IFNULL(t.`real_receive_time`, t.`receive_time`)) AS `year`,
MONTH(IFNULL(t.`real_receive_time`, t.`receive_time`)) AS `month`,
IFNULL(SUM(t.`receive_interest`),0) receive_interest,
COUNT(1) AS bid_invest_count
FROM t_bill_invests t
WHERE 1=1 AND DATE(IFNULL(t.`real_receive_time`, t.`receive_time`)) >= ‘2015-09-01‘ AND DATE(IFNULL(t.`real_receive_time`, t.`receive_time`)) <= ‘2015-10-31‘
GROUP BY `year`, `month`, t.`bid_id`, DATE(IFNULL(t.`real_receive_time`, t.`receive_time`))
)tp
LEFT JOIN t_bids b ON tp.bid_id=b.`id`
LEFT JOIN t_users u ON b.`user_id`=u.`id`;
执行结果所需时间信息如下:
/* Affected rows: 0 已找到记录: 129 警告: 0 持续时间 1 query: 28.704 sec. (+ 10.031 sec. network) */
最终优化语句:
SELECT
YEAR(IFNULL(tp.`real_receive_time`, tp.`receive_time`)) AS `year`,
MONTH(IFNULL(tp.`real_receive_time`, tp.`receive_time`)) AS `month`,
tp.bid_id,
b.`title`,
DATE(b.`time`) `time`,
DATE(IFNULL(tp.`real_receive_time`, tp.`receive_time`)) AS `receive_date`,
u.`name`,
u.`reality_name`,
b.`amount`,
b.`apr`,
b.`period_unit`,
b.`period`,
sum(if(tp.real_receive_time,tp.`receive_interest`,0)) receive_interest,
COUNT(1) AS bid_invest_count ,
sum(if(tp.user_id in (SELECT r.user_id FROM `t_user_repair` r) and tp.real_receive_time is not null,ti.amount,0 )) bid_amount1,
sum(if(tp.user_id in (SELECT r.user_id FROM `t_user_repair` r) and tp.real_receive_time is not null,tp.receive_corpus,0 )) bid_corpus1,
sum(if(tp.user_id in (SELECT r.user_id FROM `t_user_repair` r) and tp.real_receive_time is not null,tp.receive_interest,0 )) bid_interest1,
sum(if(tp.user_id in (SELECT r.user_id FROM `t_user_repair` r),1,0 )) bid_invest_count1,
sum(if(tp.user_id not in (SELECT r.user_id FROM `t_user_repair` r) and tp.real_receive_time is not null,tp.receive_corpus,0 )) bid_corpus2
FROM t_bill_invests tp
LEFT JOIN t_bids b ON tp.bid_id=b.`id`
LEFT JOIN t_users u ON b.`user_id`=u.`id`
left join (select id,amount from t_invests) ti on ti.id=tp.invest_id
WHERE 1=1 AND DATE(IFNULL(tp.`real_receive_time`, tp.`receive_time`)) >= ‘2015-09-01‘ AND DATE(IFNULL(tp.`real_receive_time`, tp.`receive_time`)) <= ‘2015-10-31‘
GROUP BY tp.`bid_id`, DATE(tp.`receive_time`);
执行时间:
/* Affected rows: 0 已找到记录: 129 警告: 0 持续时间 1 query: 0.671 sec. */
结果一样,但最终结果只需0.67s
优化思路,原来语句explain中:
在select_type中有大量的dependent_subquery,此种类型的查询极耗性能,在sql的编写中应该极力避免。
- 结合业务需求,查看group by写法,发现没有必要写那么多的group by,只需要抓住核心的tp.`bid_id`, DATE(tp.`receive_time`)便可确定一组帐单信息。
2. 尽量改写查询列,让其为subquery,如下:
SELECT
YEAR(IFNULL(tp.`real_receive_time`, tp.`receive_time`)) AS `year`,
MONTH(IFNULL(tp.`real_receive_time`, tp.`receive_time`)) AS `month`,
tp.bid_id,
b.`title`,
DATE(b.`time`) `time`,
DATE(IFNULL(tp.`real_receive_time`, tp.`receive_time`)) AS `receive_date`,
u.`name`,
u.`reality_name`,
b.`amount`,
b.`apr`,
b.`period_unit`,
b.`period`,
sum(if(tp.real_receive_time,tp.`receive_interest`,0)) receive_interest,
COUNT(1) AS bid_invest_count ,
IFNULL((SELECT SUM(s.`amount`) FROM t_invests s WHERE s.`bid_id`=tp.bid_id and s.user_id IN (SELECT r.user_id FROM `t_user_repair` r)), 0) as bid_amount1,
sum(if(tp.user_id in (SELECT r.user_id FROM `t_user_repair` r) and tp.real_receive_time is not null,tp.receive_corpus,0 )) bid_corpus1,
sum(if(tp.user_id in (SELECT r.user_id FROM `t_user_repair` r) and tp.real_receive_time is not null,tp.receive_interest,0 )) bid_interest1,
sum(if(tp.user_id in (SELECT r.user_id FROM `t_user_repair` r),1,0 )) bid_invest_count1,
sum(if(tp.user_id not in (SELECT r.user_id FROM `t_user_repair` r) and tp.real_receive_time is not null,tp.receive_corpus,0 )) bid_corpus2
FROM t_bill_invests tp
LEFT JOIN t_bids b ON tp.bid_id=b.`id`
LEFT JOIN t_users u ON b.`user_id`=u.`id`
WHERE 1=1 AND DATE(IFNULL(tp.`real_receive_time`, tp.`receive_time`)) >= ‘2015-04-01‘ AND DATE(IFNULL(tp.`real_receive_time`, tp.`receive_time`)) <= ‘2015-10-31‘
GROUP BY tp.`bid_id`, DATE(tp.`receive_time`);
在其explain中还有残留的依赖外查询:
主要是由框中的语句引起,于是再改写,让其改成临时表连接,就是最后的sql语句:
只剩临时表与了查询,性能有了大幅度的提升,优化结束。
总结:统计类查询经常不小心就会写成depended_subquery的形式,可以结合sum count case if条件判断来改写,让其可以达到subquery的形式,另一方面也可以改成临时表的方式并通过连接来提高性能。