over不能单独使用,要和分析函数:rank(),dense_rank(),row_number()等一起使用。
其参数:over(partition by columnname1 order by columnname2)
含义:按columname1指定的字段进行分组排序,或者说按字段columnname1的值进行分组排序。
例如:employees表中,有两个部门的记录:department_id =10和20
select department_id,rank() over(partition by department_id order by salary) from employees就是指在部门10中进行薪水的排名,在部门20中进行薪水排名。如果是partition by org_id,则是在整个公司内进行排名。
以下是个人见解:
sql中的over函数和row_numbert()函数配合使用,可生成行号。可对某一列的值进行排序,对于相同值的数据行进行分组排序。
执行语句:select row_number() over(order by AID DESC) as rowid,* from bb
SELECT
House.HouseId,
House.HouseName,
House.iconFlag,
House.orderId,
House.HouseJingYingFW,
House.HouseTel,
House.HouseCelPhone,
dbo.fnGetDistance(
118.328213, 35.081728, House.longitude,
House.latitude
) as jl,
Goods.originalPrice as levelCount,
Goods.presentPrice as levelAmount
FROM
House
LEFT JOIN (
select shopId,originalPrice,presentPrice
from (select no =row_number() over (partition by shopId order by createTime desc), * from Goods_info WHERE IsClear = 1)t
where no=1
) Goods on shopId = House.HouseId
where
(
House.isdel is null
or House.isdel = 0
)
and House.status = 1
and House.houseStatus <> 0
and House.housetype like ‘%1%‘
order by
House.orderId desc,
jl ASC
IF OBJECT_ID (‘dbo.fnGetDistance‘) IS NOT NULL
DROP FUNCTION dbo.fnGetDistance
GO
--计算地球上两个坐标点(经度,纬度)之间距离sql函数
CREATE FUNCTION [dbo].[fnGetDistance](@LatBegin REAL, @LngBegin REAL, @LatEnd REAL, @LngEnd REAL) RETURNS FLOAT
AS
BEGIN
--距离(千米)
DECLARE @Distance REAL
DECLARE @EARTH_RADIUS REAL
SET @EARTH_RADIUS = 6378.137
DECLARE @RadLatBegin REAL,@RadLatEnd REAL,@RadLatDiff REAL,@RadLngDiff REAL
SET @RadLatBegin = @LatBegin *PI()/180.0
SET @RadLatEnd = @LatEnd *PI()/180.0
SET @RadLatDiff = @RadLatBegin - @RadLatEnd
SET @RadLngDiff = @LngBegin *PI()/180.0 - @LngEnd *PI()/180.0
SET @Distance = 2 *ASIN(SQRT(POWER(SIN(@RadLatDiff/2), 2)+COS(@RadLatBegin)*COS(@RadLatEnd)*POWER(SIN(@RadLngDiff/2), 2)))
SET @Distance = @Distance * @EARTH_RADIUS
--SET @Distance = Round(@Distance * 10000) / 10000
RETURN @Distance
END
GO
SQL中Group分组获取Top N方法实现
有产品表,包含id,name,city,addtime四个字段,因报表需要按城市分组,统计每个城市的最新10个产品,便向该表中插入了100万数据,做了如下系列测试:
www.2cto.com
CREATE TABLE [dbo].[products](
[id] [int] IDENTITY(1,1) NOT NULL,
[name] [nvarchar](50) NULL,
[addtime] [datetime] NULL,
[city] [nvarchar](10) NULL,
CONSTRAINT [PK_products] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
1、采用row_number方法,执行5次,平均下来8秒左右,速度最快。
www.2cto.com
select no, id,name,city
from (select no =row_number() over (partition by city order by addtime desc), * from products)t
where no< 11 order by city asc,addtime desc
2、采用cross apply方法,执行了3次,基本都在3分5秒以上,已经很慢了。
select distinct b.id,b.name,b.city from products a
cross apply (select top 10 * from products where city = a.city order by addtime desc) b
3、采用Count查询,只执行了两次,第一次执行到5分钟时,取消任务执行了;第二次执行到13分钟时,没有hold住又直接停止了,实在无法忍受。
select id,name,city from products a
where ( select count(city) from products where a.city = city and addtime>a.addtime) < 10
order by city asc,addtime desc
4、采用游标方法,这个最后测试的,执行了5次,每次都是10秒完成,感觉还不错。
declare @city nvarchar(10)
create table #Top(id int,name nvarchar(50),city nvarchar(10),addtime datetime)
declare mycursor cursor for
select distinct city from products order by city asc
open mycursor
fetch next from mycursor into @city
while @@fetch_status =0
begin
insert into #Top
select top 10 id,name,city,addtime from products where city = @city
fetch next from mycursor into @city
end
close mycursor
deallocate mycursor
Select * from #Top order by city asc,addtime desc
drop table #Top
通过上述对比不难发现,在面临Group获取Top N场景时,可以首选row_number,游标cursor其次,另外两个就基本不考虑了,数据量大的时候根本没法使用。