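The Halloween Problem (Halloween Protection)

The Halloween effect refers to rows in a result set changing position and, as a result, being modified more than once. It differs from a double read because it is driven by a data modification rather than by a read query. To perform an update, the data must first be read. This is done with two cursors, one for reading and one for writing. If the write cursor updates rows before all of the data has been read, a row can move, be read again, and therefore be updated again. In theory, this could go on forever. Reading data through an index whose key values are changed by the query is one example of the Halloween effect. The effect is clearly undesirable; fortunately, the SQL Server storage engine protects against it. To make sure that all of the data to be written has been read first, SQL Server injects a blocking operator, such as a spool, into the plan.

Here is an article on how the Halloween effect is prevented: http://blogs.msdn.com/b/craigfr/archive/2008/02/27/halloween-protection.aspx

And this one: https://www.simple-talk.com/sql/learn-sql-server/operator-of-the-week---spools,-eager-spool/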

In a prior post, I introduced the notion that update plans consist of two parts: a read cursor that identifies the rows to be updated and a write cursor that actually performs the updates.  Logically speaking, SQL Server must execute the read cursor and write cursor of an update plan in two separate steps or phases.  To put it another way, the actual update of rows must not affect the selection of which rows to update.  This problem of ensuring that the write cursor of an update plan does not affect the read cursor is known as "Halloween protection" as it was discovered by IBM researchers more than 30 years ago on Halloween.

One simple solution to the Halloween problem is to physically separate the read and write cursors of an update plan using a blocking operator such as an eager spool or sort.  Inserting a blocking operator between the two halves of an update plan ensures that the read cursor runs in its entirety and generates all rows that it will generate before the write cursor begins executing or modifying any rows.   Unfortunately, inserting a blocking operator such as an eager spool into the update plan requires copying all rows output by the read cursor.  This copying can be quite expensive.  Fortunately, in many cases, SQL Server can determine that the write cursor will not affect the read cursor and does not need to add a blocking operator at all.
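To make the two-phase requirement concrete, here is a minimal Python sketch (not T-SQL, and not a description of SQL Server internals) that models an index as a sorted list of distinct keys. The names `update_without_spool`, `update_with_spool`, and `max_steps` are invented for this illustration; `max_steps` exists only to stop a scan that would otherwise never finish.

```python
import bisect


def update_without_spool(values, delta, max_steps):
    """Scan a sorted 'index' and update each key as soon as it is read.

    Assumes distinct keys. The read cursor always continues with the first
    key greater than the last key it returned, so a key that was moved
    forward by the write cursor is read (and updated) again.
    """
    index = sorted(values)
    updates = 0
    last_key = None
    while updates < max_steps:
        # Read cursor: position just past the last key that was returned.
        pos = 0 if last_key is None else bisect.bisect_right(index, last_key)
        if pos >= len(index):
            break  # scan reached the end of the index
        key = index[pos]
        last_key = key
        # Write cursor: remove the old entry and insert the new key in order.
        index.pop(pos)
        bisect.insort(index, key + delta)
        updates += 1
    return index, updates


def update_with_spool(values, delta):
    """Copy the whole scan output first (the 'eager spool'), then update."""
    index = sorted(values)
    spooled = list(index)  # read phase finishes before any write happens
    updates = 0
    for key in spooled:
        index.remove(key)
        bisect.insort(index, key + delta)
        updates += 1
    return index, updates
```

With the keys 1, 2, 3 and an increment of 10, the spooled version performs exactly three updates. The unprotected version keeps re-reading keys that moved ahead of the scan and only stops when it hits the step cap.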

Let's look at an example:

CREATE TABLE T (PK INT, A INT)
CREATE UNIQUE CLUSTERED INDEX TPK ON T(PK)
CREATE INDEX TA ON T(A)

INSERT T VALUES (1, 1)
INSERT T VALUES (2, 2)
INSERT T VALUES (3, 3)

UPDATE T SET A = A + 10

Here is the plan for the update statement:

|--Clustered Index Update(OBJECT:([T].[TPK]), OBJECT:([T].[TA]), SET:([T].[A] = [Expr1003]))
       |--Compute Scalar(DEFINE:([Expr1016]=[Expr1016]))
            |--Compute Scalar(DEFINE:([Expr1016]=CASE WHEN [Expr1004] THEN (1) ELSE (0) END))
                 |--Compute Scalar(DEFINE:([Expr1003]=[T].[A]+(10), [Expr1004]=CASE WHEN [T].[A] = ([T].[A]+(10)) THEN (1) ELSE (0) END))
                      |--Top(ROWCOUNT est 0)
                           |--Clustered Index Scan(OBJECT:([T].[TPK]))

In this plan, the clustered index scan is the read cursor while the clustered index update is the write cursor.  Notice that this plan has no blocking operators.  Because we are modifying column A which is not part of the clustered index key, SQL Server knows that rows in the clustered index will not move due to this update and, thus, knows that there is no need to separate the scan and update operators.  Now let's use a hint to force SQL Server to scan the non-clustered index on column A:

UPDATE T SET A = A + 10 FROM T WITH (INDEX(TA))

Here is the new update plan:

|--Clustered Index Update(OBJECT:([T].[TPK]), OBJECT:([T].[TA]), SET:([T].[A] = [Expr1003]))
       |--Compute Scalar(DEFINE:([Expr1016]=[Expr1016]))
            |--Compute Scalar(DEFINE:([Expr1016]=CASE WHEN [Expr1004] THEN (1) ELSE (0) END))
                 |--Top(ROWCOUNT est 0)
                      |--Compute Scalar(DEFINE:([Expr1003]=[T].[A]+(10), [Expr1004]=CASE WHEN [T].[A] = ([T].[A]+(10)) THEN (1) ELSE (0) END))
                           |--Table Spool
                                |--Index Scan(OBJECT:([T].[TA]), ORDERED FORWARD)

Notice that this plan does include a blocking operator - specifically an eager table spool.  (The text plan as generated by SHOWPLAN_TEXT does not show that the spool is eager, but SHOWPLAN_ALL as well as the graphical and XML plans do indicate that the spool is eager.)  This time SQL Server recognizes that updating column A could cause rows to move within the index on column A which could cause the scan to return these rows more than once which would in turn lead to the plan updating the same rows more than once.  The spool ensures that we get the correct result by saving a copy of the scan output before updating any rows.
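The reasoning behind the two plans above can be reduced to a rough rule of thumb: rows can only move within the index being scanned if the update changes one of that index's key columns. The Python sketch below encodes just that simplified rule. The function name `needs_halloween_protection` and the `INDEX_KEYS` dictionary are assumptions made for illustration, and the real optimizer considers far more than this.

```python
def needs_halloween_protection(scan_index_keys, updated_columns):
    """Return True when the update changes a key column of the scanned index.

    Simplified rule only: it compares the declared key columns of the index
    read by the scan with the columns assigned by the UPDATE statement.
    """
    return bool(set(scan_index_keys) & set(updated_columns))


# Declared key columns of the two indexes on the example table T.
INDEX_KEYS = {"TPK": ["PK"], "TA": ["A"]}

# UPDATE T SET A = A + 10, scanning the clustered index TPK.
scan_tpk = needs_halloween_protection(INDEX_KEYS["TPK"], ["A"])

# UPDATE T SET A = A + 10 FROM T WITH (INDEX(TA)), scanning index TA.
scan_ta = needs_halloween_protection(INDEX_KEYS["TA"], ["A"])
```

Under this rule the first statement needs no blocking operator and the second one does, which matches the absence and presence of the spool in the two plans.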

There is no way to get SQL Server to omit the spool from the above plan as this would lead to incorrect results.  However, we can simulate what would happen by using dynamic cursors.  The following batch creates a dynamic cursor to scan index TA and then updates each row before fetching the next row.  Because updates are immediately visible to dynamic cursors, this batch yields the same result as the above plan would yield if we could remove the spool.  Note that I am not suggesting that anyone should implement an update this way and, if anything, the following example nicely illustrates one of the pitfalls of dynamic cursors.

DECLARE @PK INT
DECLARE C CURSOR DYNAMIC SCROLL_LOCKS FOR SELECT PK FROM T WITH (INDEX(TA))
OPEN C
WHILE 0=0
    BEGIN
        FETCH NEXT FROM C INTO @PK
        IF @@FETCH_STATUS <> 0
            BREAK
        UPDATE T SET A = A + 10 WHERE PK = @PK
    END
CLOSE C
DEALLOCATE C

If you execute this batch, it will enter an infinite loop as it repeatedly scans, updates, and then again scans the same three rows.  On the other hand, the batch terminates properly if we change the index hint to use the clustered index (where no spool is required) or if we use a static cursor (which makes a copy of the table just like the spool):

DECLARE C CURSOR DYNAMIC SCROLL_LOCKS FOR SELECT PK FROM T WITH (INDEX(TPK))
DECLARE C CURSOR STATIC FOR SELECT PK FROM T WITH (INDEX(TA))

SQL Server can use any blocking operator, not just a spool, to provide Halloween protection.  Normally, if an update plan requires Halloween protection, SQL Server adds a spool because the spool is the cheapest blocking operator.  However, if an update plan already includes another blocking operator, SQL Server will not also add a spool.  For example, if we update column PK which has a unique clustered index, we get a sort as a by-product of maintaining the unique index.  Since this plan already has a sort, we do not also get a spool operator.

UPDATE T SET PK = PK + 10 FROM T WITH (INDEX(TPK))

|--Index Update(OBJECT:([T].[TA]), SET:([PK1015] = [T].[PK],[A1016] = [T].[A]))
       |--Split
            |--Clustered Index Update(OBJECT:([T].[TPK]), SET:([T].[PK] = [T].[PK],[T].[A] = [T].[A]))
                 |--Collapse(GROUP BY:([T].[PK]))
                      |--Sort(ORDER BY:([T].[PK] ASC, [Act1014] ASC))
                           |--Split
                                |--Top(ROWCOUNT est 0)
                                     |--Compute Scalar(DEFINE:([Expr1003]=[T].[PK]+(10)))
                                          |--Clustered Index Scan(OBJECT:([T].[TPK]))
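The "only add a spool if nothing else already blocks" behavior can be pictured with a toy model in Python. This is a sketch under loud assumptions: the plan is reduced to a list of operator names, the set `BLOCKING_OPERATORS` and the function `protect` are hypothetical, and none of this reflects how the optimizer actually represents or rewrites plans.

```python
# Operators treated as blocking in this toy model (the post names spools and sorts).
BLOCKING_OPERATORS = {"Eager Spool", "Sort"}


def protect(read_side_operators):
    """Return the read side of a plan that needs Halloween protection.

    `read_side_operators` is a non-empty list ordered from the scan upward
    to the operator just below the update. An eager spool is added directly
    above the scan only when no blocking operator is already present.
    """
    if any(op in BLOCKING_OPERATORS for op in read_side_operators):
        return list(read_side_operators)  # existing operator already blocks
    return [read_side_operators[0], "Eager Spool", *read_side_operators[1:]]


# Read side of the PK update plan above: the Sort already blocks.
pk_plan = ["Clustered Index Scan", "Compute Scalar", "Top", "Split", "Sort", "Collapse"]
protected_pk_plan = protect(pk_plan)

# Read side of a plan with no blocking operator: a spool is added.
protected_ta_plan = protect(["Index Scan", "Compute Scalar", "Top"])
```

In this model the PK update keeps its operator list unchanged because the sort already separates the read and write phases, while the plan without a blocking operator gains a spool directly above the scan.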

As I mentioned above, adding a spool is not free and does increase the cost of an update plan.  If we insert enough rows into the table, we can measure this effect.  For example, I loaded the table with 100,000 rows as follows.

TRUNCATE TABLE T
SET NOCOUNT ON
DECLARE @I INT
SET @I = 0
WHILE @I < 100000
    BEGIN
        INSERT T VALUES (@I, @I)
        SET @I = @I + 1
    END
SET NOCOUNT OFF

I then used SET STATISTICS TIME ON to measure the time to run the first two update statements above.  I ran each statement twice and reported the second time to ensure that the buffer pool was warmed up.

UPDATE T SET A = A + 10

SQL Server Execution Times:
   CPU time = 3046 ms,  elapsed time = 3517 ms.

UPDATE T SET A = A + 10 FROM T WITH (INDEX(TA))

SQL Server Execution Times:
   CPU time = 4391 ms,  elapsed time = 4666 ms.

As you can see, the same update took over 30% longer in elapsed time and over 40% longer in CPU time.  I ran this experiment on a Pentium Xeon 2.2 GHz workstation with 2 GB of RAM, Windows Server 2003 SP2, and SQL Server 2005 SP2.  Other system configurations may yield different results.
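For readers who want to check the percentages, the short Python snippet below recomputes them from the four timings reported above. The helper name `percent_longer` is made up for this illustration.

```python
def percent_longer(baseline_ms, measured_ms):
    """Relative increase of measured_ms over baseline_ms, in percent."""
    return (measured_ms - baseline_ms) / baseline_ms * 100.0


# Elapsed time: 3517 ms without the spool, 4666 ms with it.
elapsed_overhead = percent_longer(3517, 4666)

# CPU time: 3046 ms without the spool, 4391 ms with it.
cpu_overhead = percent_longer(3046, 4391)
```

The results are about 32.7% for elapsed time and about 44.2% for CPU time, consistent with the "over 30%" and "over 40%" figures quoted in the text.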

Finally, although I've used update statements for all of the examples in this post, some insert and delete statements also require Halloween protection, but I'll save that topic for a future post.

Time: 2024-08-04 14:30:54
