leetcode-184-Department Highest Salary: Optimization Notes

Problem

The Employee table holds all employees. Every employee has an Id, a salary, and there is also a column for the department Id.

+----+-------+--------+--------------+
| Id | Name  | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1  | Joe   | 70000  | 1            |
| 2  | Henry | 80000  | 2            |
| 3  | Sam   | 60000  | 2            |
| 4  | Max   | 90000  | 1            |
+----+-------+--------+--------------+

The Department table holds all departments of the company.

+----+----------+
| Id | Name     |
+----+----------+
| 1  | IT       |
| 2  | Sales    |
+----+----------+

Write a SQL query to find employees who have the highest salary in each of the departments. For the above tables, Max has the highest salary in the IT department and Henry has the highest salary in the Sales department.

+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT         | Max      | 90000  |
| Sales      | Henry    | 80000  |
+------------+----------+--------+
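
I wrote five or six versions in succession, each with a different runtime. Here are five representative ones, used to analyze how the SQL statement can be optimized.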

1.Runtime: 1539 ms

select Department.Name as Department,
       Employee.Name as Employee,
       Employee.Salary as Salary
from Department
join Employee on Department.Id = Employee.DepartmentId
where (Department.Id, Employee.Salary) in
      (select DepartmentId, max(Salary) from Employee group by DepartmentId);
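
Version 1 can be checked locally before submitting. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for the MySQL server that LeetCode uses (row-value IN needs SQLite 3.15 or later). The helper names make_db and run_version_1 are hypothetical; the tables and rows are the ones from the problem statement. It only checks correctness and says nothing about MySQL runtimes.

```python
import sqlite3

# Schema and rows copied from the problem statement.
SETUP = """
    CREATE TABLE Department (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (
        Id INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER, DepartmentId INTEGER
    );
    INSERT INTO Department VALUES (1, 'IT'), (2, 'Sales');
    INSERT INTO Employee VALUES
        (1, 'Joe', 70000, 1), (2, 'Henry', 80000, 2),
        (3, 'Sam', 60000, 2), (4, 'Max', 90000, 1);
"""

# Version 1 from the article: row-value IN against the per-department maximum.
VERSION_1 = """
    SELECT Department.Name AS Department,
           Employee.Name AS Employee,
           Employee.Salary AS Salary
    FROM Department
    JOIN Employee ON Department.Id = Employee.DepartmentId
    WHERE (Department.Id, Employee.Salary) IN
          (SELECT DepartmentId, MAX(Salary) FROM Employee GROUP BY DepartmentId)
"""


def make_db():
    """Create an in-memory database holding the sample tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    return conn


def run_version_1():
    """Run version 1 and return its rows sorted for stable comparison."""
    conn = make_db()
    rows = sorted(conn.execute(VERSION_1).fetchall())
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_version_1())
```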

2.Runtime: 1204 ms

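-- Note: selecting the non-aggregated Salary relies on MySQL returning the first
-- row of each group. MySQL does not guarantee this, and ONLY_FULL_GROUP_BY
-- (enabled by default since MySQL 5.7.5) rejects the query.
select Department.Name as Department,
       Employee.Name as Employee,
       Employee.Salary as Salary
from Department
join Employee on Department.Id = Employee.DepartmentId
where (Department.Id, Employee.Salary) in
      (select DepartmentId, Salary
       from (select * from Employee order by Salary desc) q
       group by DepartmentId);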

3.Runtime: 1399 ms

select a.Name as Department,
       b.Name as Employee,
       b.Salary as Salary
from Department a
join Employee b on a.Id = b.DepartmentId
where exists (select 1
              from (select * from Employee order by Salary desc) c
              group by DepartmentId
              having a.Id = c.DepartmentId and b.Salary = max(c.Salary));

4.Runtime: 980 ms

select a.Name as Department,
       b.Name as Employee,
       b.Salary as Salary
from (Department a join Employee b on a.Id = b.DepartmentId)
join (select c.DepartmentId, max(c.Salary) as Salary
      from (select * from Employee order by Salary desc) c
      group by DepartmentId) d
  on a.Id = d.DepartmentId and b.Salary = d.Salary;

5.Runtime: 957 ms

select a.Name as Department,
       b.Name as Employee,
       b.Salary as Salary
from (Department a straight_join Employee b on a.Id = b.DepartmentId)
straight_join (select c.DepartmentId, max(c.Salary) as Salary
               from (select * from Employee order by Salary desc) c
               group by c.DepartmentId) d
  on a.Id = d.DepartmentId and b.Salary = d.Salary;

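 Summary

  • Comparing 1 and 2: in these runs, the aggregate function max() was slower than the nested subquery.
  • Comparing 2 and 3: in and exists performed about the same. What I found online at the time was:

 1. in and not in should be used with caution, otherwise they can cause a full table scan.

 2. In many cases, replacing in with exists is a good choice.

  However, the later optimizations show that in really is rather slow.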

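  • Comparing 3 and 4: 4 replaces the where condition with join on, and the runtime improves a lot. Later, an expert who had read the MySQL source code said:

In MySQL's SELECT queries, the core algorithm is the JOIN algorithm. Other query statements are moved toward JOIN accordingly: a single-table query is treated as a special case of JOIN, and subqueries are converted to JOIN queries wherever possible.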

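  • Comparing 4 and 5: 5 replaces join with straight_join. Again, from the same source-code expert:

For multi-table queries, if you can be sure that processing the tables in a fixed order gives better efficiency, it is advisable to add the STRAIGHT_JOIN clause to cut down the optimizer's work of reordering the tables.

This clause can be used for SQL statements where the optimizer cannot find the best ordering. It is equally applicable where the optimizer can find the best ordering, because working out that ordering is itself a fairly long process in MySQL.

In the latter case, you can choose the table order based on the EXPLAIN output and add the STRAIGHT_JOIN clause to fix that order. The precondition is that the relative data volumes of the tables keep the same ranking; otherwise, once the table sizes shift relative to each other, the fixed order will backfire.

  This approach works well for frequently executed SQL statements, and the more tables a statement touches, the greater the benefit.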
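
Before comparing the runtimes of the in, exists, and join on forms, it helps to confirm that they return the same rows. The sketch below is one way to do that, again assuming Python's built-in sqlite3 as a stand-in for MySQL, so it checks equivalence only and says nothing about speed. The names QUERIES and run_all are hypothetical, the exists query is a simplified formulation rather than version 3 verbatim, and the extra employee Jim is an invented row added to exercise a salary tie.

```python
import sqlite3

# Sample data from the problem statement, plus one hypothetical row (Jim)
# that ties with Max for the top salary in IT.
SETUP = """
    CREATE TABLE Department (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (
        Id INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER, DepartmentId INTEGER
    );
    INSERT INTO Department VALUES (1, 'IT'), (2, 'Sales');
    INSERT INTO Employee VALUES
        (1, 'Joe', 70000, 1), (2, 'Henry', 80000, 2),
        (3, 'Sam', 60000, 2), (4, 'Max', 90000, 1),
        (5, 'Jim', 90000, 1);
"""

# Shared pieces: the per-department maximum and the base join.
MAX_PER_DEPT = (
    "SELECT DepartmentId, MAX(Salary) AS Salary FROM Employee GROUP BY DepartmentId"
)
BASE = (
    "SELECT a.Name AS Department, b.Name AS Employee, b.Salary AS Salary "
    "FROM Department a JOIN Employee b ON a.Id = b.DepartmentId"
)

# Three shapes of the same filter: row-value IN, EXISTS, and JOIN ... ON.
QUERIES = {
    "in": f"{BASE} WHERE (a.Id, b.Salary) IN ({MAX_PER_DEPT})",
    "exists": (
        f"{BASE} WHERE EXISTS (SELECT 1 FROM ({MAX_PER_DEPT}) m "
        "WHERE m.DepartmentId = a.Id AND m.Salary = b.Salary)"
    ),
    "join": (
        f"{BASE} JOIN ({MAX_PER_DEPT}) d "
        "ON a.Id = d.DepartmentId AND b.Salary = d.Salary"
    ),
}


def run_all():
    """Run every query shape on the same data; return sorted rows per shape."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    results = {
        name: sorted(conn.execute(sql).fetchall()) for name, sql in QUERIES.items()
    }
    conn.close()
    return results


if __name__ == "__main__":
    for name, rows in run_all().items():
        print(name, rows)
```

All three shapes return both tied IT employees along with Henry, so any runtime difference between them comes from how the engine executes the query and not from a difference in results.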

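Postscript

  The optimization is not finished at this point: the fastest submission for this problem on leetcode takes 813 ms, but its code was not shared. To close, here are two solutions written by other people:

  Join twice, 890ms accepted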

SELECT Name, Employee, Salary
FROM Department JOIN (SELECT Employee.Name AS Employee, Employee.Salary, Employee.DepartmentId
    FROM Employee JOIN (SELECT `DepartmentId`, MAX(`Salary`) AS Salary
        FROM `Employee`
        GROUP BY `DepartmentId`
        ) t1 ON t1.DepartmentId = Employee.DepartmentId
    AND t1.Salary = Employee.Salary
    ) t2 ON Department.Id = t2.DepartmentId

  Easy Solution. No joins. GROUP BY is enough. 916ms

select
d.Name, e.Name, e.Salary
from
Department d,
Employee e,
(select MAX(Salary) as Salary,  DepartmentId as DepartmentId from Employee GROUP BY DepartmentId) h
where
e.Salary = h.Salary and
e.DepartmentId = h.DepartmentId and
e.DepartmentId = d.Id;

Time: 2024-11-08 07:12:04
