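Developers who have used the MERGE statement in SQL Server know how convenient it is for inserting and updating table data. One caveat deserves attention, though: the source of a MERGE statement must not contain duplicate rows, that is, several rows that match the same target row under the ON condition. An example shows the problem.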
Suppose we have a table named T_Class_A, created with the following statement:
CREATE TABLE [dbo].[T_Class_A](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [ClassName] [nvarchar](50) NULL,
    [StudentTotalCount] [int] NULL,
    [Owner] [nvarchar](50) NULL,
    CONSTRAINT [PK_T_Class_A] PRIMARY KEY CLUSTERED ([ID] ASC)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
The script that inserts the data is:
SET IDENTITY_INSERT [dbo].[T_Class_A] ON
GO
INSERT [dbo].[T_Class_A] ([ID], [ClassName], [StudentTotalCount], [Owner]) VALUES (1, N'Class 1', 35, N'Jim')
GO
INSERT [dbo].[T_Class_A] ([ID], [ClassName], [StudentTotalCount], [Owner]) VALUES (2, N'Class 2', 36, N'Bob')
GO
INSERT [dbo].[T_Class_A] ([ID], [ClassName], [StudentTotalCount], [Owner]) VALUES (3, N'Class 3', 51, N'James')
GO
INSERT [dbo].[T_Class_A] ([ID], [ClassName], [StudentTotalCount], [Owner]) VALUES (4, N'Class 4', 45, N'Rose')
GO
INSERT [dbo].[T_Class_A] ([ID], [ClassName], [StudentTotalCount], [Owner]) VALUES (5, N'Class 5', 43, N'Tom')
GO
INSERT [dbo].[T_Class_A] ([ID], [ClassName], [StudentTotalCount], [Owner]) VALUES (6, N'Class 6', 30, N'Clark')
GO
SET IDENTITY_INSERT [dbo].[T_Class_A] OFF
GO
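After running the two scripts above, T_Class_A contains the following data:
ID  ClassName  StudentTotalCount  Owner
1   Class 1    35                 Jim
2   Class 2    36                 Bob
3   Class 3    51                 James
4   Class 4    45                 Rose
5   Class 5    43                 Tom
6   Class 6    30                 Clark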
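We also have a second table, T_Class_B, whose structure is identical to that of T_Class_A. We want to use a MERGE statement to build T_Class_B from the data in T_Class_A: a row whose ClassName already exists in T_Class_B is updated, otherwise it is inserted. T_Class_B is created as follows: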
CREATE TABLE [dbo].[T_Class_B](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [ClassName] [nvarchar](50) NULL,
    [StudentTotalCount] [int] NULL,
    [Owner] [nvarchar](50) NULL,
    CONSTRAINT [PK_T_Class_B] PRIMARY KEY CLUSTERED ([ID] ASC)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Next, we run the following MERGE statement to copy the data from T_Class_A into T_Class_B:
MERGE INTO [dbo].[T_Class_B]
USING [dbo].[T_Class_A] -- [dbo].[T_Class_A] here could also be a subquery
ON [T_Class_A].[ClassName] = [T_Class_B].[ClassName]
WHEN MATCHED THEN
    UPDATE SET [T_Class_B].[StudentTotalCount] = [T_Class_A].[StudentTotalCount],
               [T_Class_B].[Owner] = [T_Class_A].[Owner]
WHEN NOT MATCHED THEN
    INSERT ([ClassName], [StudentTotalCount], [Owner])
    VALUES ([T_Class_A].[ClassName], [T_Class_A].[StudentTotalCount], [T_Class_A].[Owner]);
Afterwards, T_Class_B contains exactly the same data as T_Class_A.
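One way to check that the two tables really hold the same data is to compare them with EXCEPT in both directions. This is only a sketch, not part of the original walkthrough; it leaves out the ID column on the assumption that IDs in T_Class_B are generated by its own identity and are not meant to be compared.

```sql
-- Rows of T_Class_A with no identical counterpart in T_Class_B
SELECT [ClassName], [StudentTotalCount], [Owner] FROM [dbo].[T_Class_A]
EXCEPT
SELECT [ClassName], [StudentTotalCount], [Owner] FROM [dbo].[T_Class_B];

-- Rows of T_Class_B with no identical counterpart in T_Class_A
SELECT [ClassName], [StudentTotalCount], [Owner] FROM [dbo].[T_Class_B]
EXCEPT
SELECT [ClassName], [StudentTotalCount], [Owner] FROM [dbo].[T_Class_A];
```

If both queries return no rows, the compared columns of the two tables hold the same set of values. EXCEPT works on distinct rows, so it does not detect differences in how many times a row is repeated.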
Now we change the data in T_Class_A, setting Owner to Unknown in every row:
UPDATE T_Class_A SET [Owner] = N'Unknown'
Then we run the same MERGE statement again:
MERGE INTO [dbo].[T_Class_B]
USING [dbo].[T_Class_A] -- [dbo].[T_Class_A] here could also be a subquery
ON [T_Class_A].[ClassName] = [T_Class_B].[ClassName]
WHEN MATCHED THEN
    UPDATE SET [T_Class_B].[StudentTotalCount] = [T_Class_A].[StudentTotalCount],
               [T_Class_B].[Owner] = [T_Class_A].[Owner]
WHEN NOT MATCHED THEN
    INSERT ([ClassName], [StudentTotalCount], [Owner])
    VALUES ([T_Class_A].[ClassName], [T_Class_A].[StudentTotalCount], [T_Class_A].[Owner]);
Looking at T_Class_B afterwards, we can see that the MERGE statement has updated the Owner column of every row to "Unknown".
So far the MERGE statement has worked without any problem. Next, we insert one more row into T_Class_A:
INSERT [dbo].[T_Class_A] ([ClassName], [StudentTotalCount], [Owner]) VALUES (N'Class 6', 38, N'Terry')
At this point we take another look at the data in T_Class_A.
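A grouping query is one way to surface rows that would collide under the ON condition used by the MERGE statement. The following is a minimal sketch; the column alias DuplicateCount is a name chosen here for illustration.

```sql
-- List every ClassName that occurs more than once in the source table
SELECT [ClassName], COUNT(*) AS [DuplicateCount]
FROM [dbo].[T_Class_A]
GROUP BY [ClassName]
HAVING COUNT(*) > 1;
```

With the data inserted so far, this should report "Class 6" with a count of 2.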
T_Class_A now has two rows whose ClassName is "Class 6". We run the same MERGE statement once more:
MERGE INTO [dbo].[T_Class_B]
USING [dbo].[T_Class_A] -- [dbo].[T_Class_A] here could also be a subquery
ON [T_Class_A].[ClassName] = [T_Class_B].[ClassName]
WHEN MATCHED THEN
    UPDATE SET [T_Class_B].[StudentTotalCount] = [T_Class_A].[StudentTotalCount],
               [T_Class_B].[Owner] = [T_Class_A].[Owner]
WHEN NOT MATCHED THEN
    INSERT ([ClassName], [StudentTotalCount], [Owner])
    VALUES ([T_Class_A].[ClassName], [T_Class_A].[StudentTotalCount], [T_Class_A].[Owner]);
This time SQL Server raises an error while executing the MERGE statement:
Msg 8672, Level 16, State 1, Line 1
The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.
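The error message states the cause clearly. The source table T_Class_A now has two rows whose ClassName is "Class 6", so the row in the target table T_Class_B whose ClassName is "Class 6" matches two source rows. MERGE does not allow this: each row of the target table may be matched by at most one row of the source. That is why the MERGE statement fails here.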
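Following the hint in the error message, one possible workaround is to reduce the source to a single row per ClassName before merging. The sketch below does this with ROW_NUMBER() in a subquery. It assumes that, among duplicates, the row with the highest ID should win; that rule is an assumption made for illustration, not something the original scenario specifies. The aliases tgt, src, ranked and rn are likewise names introduced here.

```sql
MERGE INTO [dbo].[T_Class_B] AS tgt
USING (
    -- Keep one row per ClassName: the one with the highest ID (assumed rule)
    SELECT [ClassName], [StudentTotalCount], [Owner]
    FROM (
        SELECT [ClassName], [StudentTotalCount], [Owner],
               ROW_NUMBER() OVER (PARTITION BY [ClassName] ORDER BY [ID] DESC) AS rn
        FROM [dbo].[T_Class_A]
    ) AS ranked
    WHERE rn = 1
) AS src
ON src.[ClassName] = tgt.[ClassName]
WHEN MATCHED THEN
    UPDATE SET tgt.[StudentTotalCount] = src.[StudentTotalCount],
               tgt.[Owner] = src.[Owner]
WHEN NOT MATCHED THEN
    INSERT ([ClassName], [StudentTotalCount], [Owner])
    VALUES (src.[ClassName], src.[StudentTotalCount], src.[Owner]);
```

Because the subquery returns at most one row per ClassName, each target row can match at most one source row, and error 8672 should no longer be raised. With the example data, the "Class 6" row in T_Class_B would end up with the values of the most recently inserted source row (38, Terry). Whether that is the right row to keep depends on the business rule; aggregating with GROUP BY, as the error message suggests, is another option.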