数据库产生阻塞(Blocking)的本质原因
:SQL语句连续持有锁的时间过长 ,数目过多, 粒度过大。阻塞是事务隔离带来的副作用,它是不可避免的,而且是一个数据库系统常见的现象。
但是阻塞的时间和出现频率要控制在一定的范围内,阻塞持续的时间过长或阻塞出现过多(过于频繁),就会对数据库性能产生严重的影响。
很多时候,DBA需要知道数据库在出现性能问题时,有没有发生阻塞? 什么时候开始的?发生在那个数据库上? 阻塞发生在那些SQL语句之间? 阻塞的时间有多长?
阻塞发生的频率? 阻塞有关的连接是从那些客户端应用发送来的?.......
如果我们能够知道这些具体信息,我们就能迅速定位问题,分析阻塞产生的原因,
从而找出出现性能问题的根本原因,并根据具体原因给出相应的解决方案(索引调整、优化SQL语句等)。
查看阻塞的方法比较多,
我在这篇博客MS SQL 日常维护管理常用脚本(二)里面提到查看阻塞的一些方法:
方法1:查看那个引起阻塞,查看blk不为0的记录,如果存在阻塞进程,则是该阻塞进程的会话 ID。否则该列为零。
EXEC sp_who
active
方法2:查看那个引起阻塞,查看字段BlkBy,这个能够得到比sp_who更多的信息。
EXEC sp_who2
active
方法3:sp_lock
系统存储过程,报告有关锁的信息,但是不方便定位问题
方法4:sp_who_lock存储过程
方法5:右键服务器-选择“活动和监视器”,查看进程选项。注意“任务状态”字段。
方法6:右键服务名称-选择报表-标准报表-活动-所有正在阻塞的事务。
但是上面方法,例如像sp_who、
sp_who2,sp_who_lock等,都有或多或少的缺点:例如不能查看阻塞和被阻塞的SQL语句。不能从查看一段时间内阻塞发生的情况等;没有显示阻塞的时间.......
我们要实现下面功能:
1:
查看那个会话阻塞了那个会话
2:阻塞会话和被阻塞会话正在执行的SQL语句
3:被阻塞了多长时间
4:像客户端IP、Proagram_Name之类信息
5:阻塞发生的时间点
6:阻塞发生的频率
7:如果需要,应该通知相关开发人员,DBA不能啥事情都包揽是吧,那不还得累死,总得让开发人员员参与进来优化(有些问题就该他们解决),多了解一些系统运行的具体情况,有利于他们认识问题、解决问题。
8:需要的时候开启这项功能,不需要关闭这项功能
于是为了满足上述功能,有了下面SQL 语句
SELECT wt.blocking_session_id AS BlockingSessesionId
,sp.program_name AS ProgramName
,COALESCE(sp.LOGINAME, sp.nt_username) AS HostName
,ec1.client_net_address AS ClientIpAddress
,db.name AS DatabaseName
,wt.wait_type AS WaitType
,ec1.connect_time AS BlockingStartTime
,wt.WAIT_DURATION_MS/1000 AS WaitDuration
,ec1.session_id AS BlockedSessionId
,h1.TEXT AS BlockedSQLText
,h2.TEXT AS BlockingSQLText
FROM sys.dm_tran_locks AS tl
INNER JOIN sys.databases db
ON db.database_id = tl.resource_database_id
INNER JOIN sys.dm_os_waiting_tasks AS wt
ON tl.lock_owner_address = wt.resource_address
INNER JOIN sys.dm_exec_connections ec1
ON ec1.session_id = tl.request_session_id
INNER JOIN sys.dm_exec_connections ec2
ON ec2.session_id = wt.blocking_session_id
LEFT OUTER JOIN master.dbo.sysprocesses sp
ON SP.spid = wt.blocking_session_id
CROSS APPLY sys.dm_exec_sql_text(ec1.most_recent_sql_handle) AS h1
CROSS APPLY sys.dm_exec_sql_text(ec2.most_recent_sql_handle) AS h2
我们做一个测试例子来验证一下
1:打开第一会话窗口1,执行下面语句
USE DBMonitor;
GO
BEGIN TRANSACTION
SELECT * FROM dbo.TEST(TABLOCKX);
--COMMIT TRANSACTION;
2:打开第二个会话窗口2,执行下面语句
USE DBMonitor;
GO
SELECT * FROM dbo.TEST
3:打开第三个会话窗口3,执行下面语句
SELECT wt.blocking_session_id AS BlockingSessesionId
,sp.program_name AS ProgramName
,COALESCE(sp.LOGINAME, sp.nt_username) AS HostName
,ec1.client_net_address AS ClientIpAddress
,db.name AS DatabaseName
,wt.wait_type AS WaitType
,ec1.connect_time AS BlockingStartTime
,wt.WAIT_DURATION_MS/1000 AS WaitDuration
,ec1.session_id AS BlockedSessionId
,h1.TEXT AS BlockedSQLText
,h2.TEXT AS BlockingSQLText
FROM sys.dm_tran_locks AS tl
INNER JOIN sys.databases db
ON db.database_id = tl.resource_database_id
INNER JOIN sys.dm_os_waiting_tasks AS wt
ON tl.lock_owner_address = wt.resource_address
INNER JOIN sys.dm_exec_connections ec1
ON ec1.session_id = tl.request_session_id
INNER JOIN sys.dm_exec_connections ec2
ON ec2.session_id = wt.blocking_session_id
LEFT OUTER JOIN master.dbo.sysprocesses sp
ON SP.spid = wt.blocking_session_id
CROSS APPLY sys.dm_exec_sql_text(ec1.most_recent_sql_handle) AS h1
CROSS APPLY sys.dm_exec_sql_text(ec2.most_recent_sql_handle) AS h2
如下图所,我们可以看到阻塞其它会话以及被阻塞会话的信息,如下所示
现在上面SQL已经基本实现了查看阻塞具体信息的功能,但是现在又有几个问题:
1:上面SQL脚本只适合已经出现阻塞情况下查看阻塞信息,如果没有出现阻塞情况,我总不能傻傻的一直在哪里点击执行吧,因为阻塞这种情况有可能在那段时间都不会出现,只会在特定的时间段出现。
2:我想了解一段时间内数据库出现的阻塞情况,那么需要将阻塞信息保留下来。
3:有时候忙不过来,我想将这些具体阻塞信息发送给相关开发人员,让他们了解具体情况。
于是我想通过一个存储过程来实现这方面功能,通过设置参数@OutType,默认为输出阻塞会话信息,当参数为"Table"
时,将阻塞信息写入数据库表,如果参数为 "Email"表示将阻塞信息通过邮件发送开发人员。
正好这段时间,我在YourSQLDba上扩展一些功能,于是我将这个存储过程放置在YouSQLDba数据库中。
USE [YourSQLDba]
GO
IF NOT EXISTS(SELECT * FROM sys.objects WHERE object_id=OBJECT_ID(N‘[Maint].[BlockingSQLHistory]‘) AND type=‘U‘)
BEGIN
CREATE TABLE Maint.BlockingSQLHistory
(
RecordTime DATETIME ,
DatabaseName SYSNAME ,
BlockingSessesionId SMALLINT ,
ProgramName NCHAR(128) ,
UserName NCHAR(256) ,
ClientIpAddress VARCHAR(48) ,
WaitType NCHAR(60) ,
BlockingStartTime DATETIME ,
WaitDuration BIGINT ,
BlockedSessionId INT ,
BlockedSQLText NVARCHAR(MAX) ,
BlockingSQLText NVARCHAR(MAX) ,
CONSTRAINT PK_BlockingSQLHistory PRIMARY KEY(RecordTime)
)
END
GO
存储过程如下所示:
USE [YourSQLDba]
GO
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N‘[Maint].[sp_who_blocking]‘) AND type in (N‘P‘, N‘PC‘))
DROP PROCEDURE [Maint].[sp_who_blocking]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
--==================================================================================================================
-- ProcedureName : [Maint].[sp_who_blocking]
-- Author : Kerry http://www.cnblogs.com/kerrycode/
-- CreateDate : 2014-04-23
-- Description : 监控数据库阻塞情况,显示阻塞会话信息或收集阻塞会话信息或发送告警邮件
/******************************************************************************************************************
Parameters : 参数说明
********************************************************************************************************************
@OutType : 默认为输出阻塞会话信息,"Table", "Email"分别表示将阻塞信息写入表或邮件发送
@EmailSubject : 邮件主题.默认为Sql Blocking Alert,一般指定,例如“ServerName Sql Blocking Alert"
@ProfileName : @profile_name 默认值为YourSQLDba_EmailProfile
@RecipientsLst : 收件人列表
********************************************************************************************************************
Modified Date Modified User Version Modified Reason
********************************************************************************************************************
2014-04-23 Kerry V01.00.00 新建存储过程[Maint].[sp_who_blocking]
*******************************************************************************************************************/
--==================================================================================================================
CREATE PROCEDURE [Maint].[sp_who_blocking]
(
@OutType
VARCHAR(8) =‘Default‘ ,
@EmailSubject
VARCHAR(120)=‘Sql Blocking Alert‘ ,
@ProfileName
sysname=‘YourSQLDba_EmailProfile‘ ,
@RecipientsLst
VARCHAR(MAX) = NULL
)
AS
BEGIN
SET NOCOUNT ON;
DECLARE @HtmlContent NVARCHAR(MAX) ;
IF @OutType NOT IN (‘Default‘, ‘Table‘,‘Email‘)
BEGIN
PRINT ‘The parameter @OutType is not correct,please check it‘;
return;
END
IF @OutType =‘Default‘
BEGIN
SELECT db.name AS DatabaseName
,wt.blocking_session_id AS BlockingSessesionId
,sp.program_name AS ProgramName
,COALESCE(sp.LOGINAME, sp.nt_username) AS UserName
,ec1.client_net_address AS ClientIpAddress
,wt.wait_type AS WaitType
,ec1.connect_time AS BlockingStartTime
,wt.WAIT_DURATION_MS/1000 AS WaitDuration
,ec1.session_id AS BlockedSessionId
,h1.TEXT AS BlockedSQLText
,h2.TEXT AS BlockingSQLText
FROM sys.dm_tran_locks AS tl
INNER JOIN sys.databases db
ON db.database_id = tl.resource_database_id
INNER JOIN sys.dm_os_waiting_tasks AS wt
ON tl.lock_owner_address = wt.resource_address
INNER JOIN sys.dm_exec_connections ec1
ON ec1.session_id = tl.request_session_id
INNER JOIN sys.dm_exec_connections ec2
ON ec2.session_id = wt.blocking_session_id
LEFT OUTER JOIN master.dbo.sysprocesses sp
ON SP.spid = wt.blocking_session_id
CROSS APPLY sys.dm_exec_sql_text(ec1.most_recent_sql_handle) AS h1
CROSS APPLY sys.dm_exec_sql_text(ec2.most_recent_sql_handle) AS h2;
END
ELSE IF @OutType=‘Table‘
BEGIN
INSERT INTO [Maint].[BlockingSQLHistory]
SELECT GETDATE() AS RecordTime
,db.name AS DatabaseName
,wt.blocking_session_id AS BlockingSessesionId
,sp.program_name AS ProgramName
,COALESCE(sp.LOGINAME, sp.nt_username) AS UserName
,ec1.client_net_address AS ClientIpAddress
,wt.wait_type AS WaitType
,ec1.connect_time AS BlockingStartTime
,wt.WAIT_DURATION_MS/1000 AS WaitDuration
,ec1.session_id AS BlockedSessionId
,h1.TEXT AS BlockedSQLText
,h2.TEXT AS BlockingSQLText
FROM sys.dm_tran_locks AS tl
INNER JOIN sys.databases db
ON db.database_id = tl.resource_database_id
INNER JOIN sys.dm_os_waiting_tasks AS wt
ON tl.lock_owner_address = wt.resource_address
INNER JOIN sys.dm_exec_connections ec1
ON ec1.session_id = tl.request_session_id
INNER JOIN sys.dm_exec_connections ec2
ON ec2.session_id = wt.blocking_session_id
LEFT OUTER JOIN master.dbo.sysprocesses sp
ON SP.spid = wt.blocking_session_id
CROSS APPLY sys.dm_exec_sql_text(ec1.most_recent_sql_handle) AS h1
CROSS APPLY sys.dm_exec_sql_text(ec2.most_recent_sql_handle) AS h2;
END
ELSE IF @OutType=‘Email‘
BEGIN
SET @HtmlContent =
N‘<head>‘
+ N‘<style type="text/css">h2, body {font-family: Arial, verdana;} table{font-size:11px; border-collapse:collapse;} td{background-color:#F1F1F1; border:1px solid black; padding:3px;} th{background-color:#99CCFF;}</style>‘
+ N‘<table border="1">‘
+ N‘<tr>
<th>DatabaseName</th>
<th>BlockingSessesionId</th>
<th>ProgramName</th>
<th>UserName</th>
<th>ClientIpAddress</th>
<th>WaitType</th>
<th>BlockingStartTime</th>
<th>WaitDuration</th>
<th>BlockedSessionId</th>
<th>BlockedSQLText</th>
<th>BlockingSQLText</th>
</tr>‘ +
CAST (
(SELECT db.name AS TD, ‘‘
,wt.blocking_session_id AS TD, ‘‘
,sp.program_name AS TD, ‘‘
,COALESCE(sp.LOGINAME, sp.nt_username) AS TD, ‘‘
,ec1.client_net_address AS TD, ‘‘
,wt.wait_type AS TD, ‘‘
,ec1.connect_time AS TD, ‘‘
,wt.WAIT_DURATION_MS/1000 AS TD, ‘‘
,ec1.session_id AS TD, ‘‘
,h1.TEXT AS TD, ‘‘
,h2.TEXT AS TD, ‘‘
FROM sys.dm_tran_locks AS tl
INNER JOIN sys.databases db
ON db.database_id = tl.resource_database_id
INNER JOIN sys.dm_os_waiting_tasks AS wt
ON tl.lock_owner_address = wt.resource_address
INNER JOIN sys.dm_exec_connections ec1
ON ec1.session_id = tl.request_session_id
INNER JOIN sys.dm_exec_connections ec2
ON ec2.session_id = wt.blocking_session_id
LEFT OUTER JOIN master.dbo.sysprocesses sp
ON SP.spid = wt.blocking_session_id
CROSS APPLY sys.dm_exec_sql_text(ec1.most_recent_sql_handle) AS h1
CROSS APPLY sys.dm_exec_sql_text(ec2.most_recent_sql_handle) AS h2
FOR XML PATH(‘tr‘), TYPE
) AS NVARCHAR(MAX) ) +
N‘</table>‘
IF @HtmlContent IS NOT NULL
BEGIN
EXEC msdb.dbo.sp_send_dbmail
@profile_name = @ProfileName ,
@recipients = @RecipientsLst ,
@subject = @EmailSubject ,
@body = @HtmlContent ,
@body_format = ‘HTML‘ ;
END
END
END
GO
最后在数据库新建一个作业,调用该存储过程,然后在某段时间启用作业监控数据库的阻塞情况,作业的执行频率是个比较难以定夺的头痛问题,具体要根据系统情况来决定,我习惯2分钟执行一次。
最后,这个脚本还有一个问题,如果阻塞或被阻塞的SQL语句是某个存储过程里面的一段脚本,显示的SQL是整个存储过程,而不是正在执行的SQL语句,目前还没有想到好的方法解决这个问题。我目前手工去查看阻塞情况,如果非要查看存储过程里面被阻塞的正在执行的SQL,一般结合下面SQL语句查看(输入阻塞或被阻塞会话ID替代@sessionid)
SELECT [Spid] = er.session_id
,[ecid]
,[Database] = DB_NAME(sp.dbid)
,[Start_Time]
,[SessionRunTime] = datediff(SECOND, start_time,getdate())
,[SqlRunTime]= RIGHT(convert(varchar,
dateadd(ms, datediff(ms, sp.last_batch, getdate()), ‘1900-01-01‘),
121), 12)
,[HostName]
,[Users]=COALESCE(sp.LOGINAME, sp.nt_username)
,[Status] = er.status
,[WaitType] = er.wait_type
,[Waitime] = er.wait_time/1000
,[Individual Query] = SUBSTRING(qt.text, er.statement_start_offset / 2,
( CASE WHEN er.statement_end_offset = -1
THEN LEN(CONVERT(NVARCHAR(MAX), qt.text))
* 2
ELSE er.statement_end_offset
END - er.statement_start_offset ) / 2)
,[Parent Query] = qt.text
,[PROGRAM_NAME] = program_name
FROM sys.dm_exec_requests er
INNER JOIN sys.sysprocesses sp ON er.session_id = sp.spid
CROSS APPLY sys.dm_exec_sql_text(er.sql_handle) AS qt
WHERE session_Id = @sessionid;