Delete Duplicate records

2015.8.31

Query goal

Delete Duplicate Emails

Write a SQL query to delete all duplicate email entries in a table named Person, keeping only unique emails based on its smallest Id.

This problem is from LeetCode.

The Person table looks like this:

SQL> select * from person;

ID    EMAIL
----- --------------------------------------------------
1     [email protected]
2     [email protected]
3     [email protected]

Create the table and insert test data

drop table person purge;
create table person(id int, email char(50));
insert into person(id, email) values(1, '[email protected]');
insert into person(id, email) values(2, '[email protected]');
insert into person(id, email) values(3, '[email protected]');

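The resulting Person table contains 3 rows. According to the problem statement, the third row should be deleted.

The SQL statements are as follows (run in Oracle):

Method 1: using MIN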

SQL> delete from person a where a.id > (
       select min(b.id) from person b where a.email = b.email
     );
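The MIN approach can be tried without an Oracle instance. The following is only a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Oracle; the function name `delete_duplicate_emails` and the sample addresses are hypothetical, and the outer table is referenced by name rather than by alias to stay within SQLite syntax.

```python
import sqlite3

def delete_duplicate_emails(rows):
    """Keep, for each email, only the row with the smallest id."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Oracle
    conn.execute("create table person (id int, email char(50))")
    conn.executemany("insert into person (id, email) values (?, ?)", rows)
    # Same shape as Method 1: delete every row whose id is greater than
    # the minimum id found for the same email.
    conn.execute(
        "delete from person where id > ("
        "  select min(b.id) from person b where b.email = person.email"
        ")"
    )
    remaining = conn.execute("select id, email from person order by id").fetchall()
    conn.close()
    return remaining

# Hypothetical sample addresses, chosen only for illustration.
sample = [(1, "a@example.com"), (2, "b@example.com"), (3, "a@example.com")]
print(delete_duplicate_emails(sample))
```

With this sample, the row with id 3 is removed and the rows with ids 1 and 2 remain.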

Method 2: using ANY

SQL> delete from person a where a.id > any (
       select b.id from person b where a.email = b.email
     );

Method 3: using GROUP BY and ROWID

SQL> delete from person where rowid not in (select min(rowid) from person group by email);
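One caveat with the ROWID-based methods: they keep the row with the smallest ROWID, which is the row with the smallest id only if the two orderings happen to agree. The sketch below illustrates this with Python's built-in sqlite3 module as a stand-in; SQLite's rowid follows insertion order here, whereas Oracle's ROWID is a physical row address, so the analogy is an assumption. The function name and sample addresses are hypothetical.

```python
import sqlite3

def survivors_by_min_rowid(rows):
    """Return the ids that survive a min(rowid)-per-email delete."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Oracle
    conn.execute("create table person (id int, email char(50))")
    conn.executemany("insert into person (id, email) values (?, ?)", rows)
    # Same shape as Method 3: keep one row per email, chosen by rowid.
    conn.execute(
        "delete from person where rowid not in ("
        "  select min(rowid) from person group by email"
        ")"
    )
    ids = [r[0] for r in conn.execute("select id from person order by id")]
    conn.close()
    return ids

# Inserted in id order: the smallest id per email is kept.
print(survivors_by_min_rowid([(1, "a@example.com"), (2, "b@example.com"), (3, "a@example.com")]))
# Inserted out of id order: id 3 is stored first, so it is kept instead of id 1.
print(survivors_by_min_rowid([(3, "a@example.com"), (1, "a@example.com"), (2, "b@example.com")]))
```

If keeping the smallest id is a hard requirement, the id-based methods (MIN or ANY) state that requirement directly.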

Method 4: self-join

SQL> delete from person p1 where rowid not in (select min(rowid) from person p2 where p1.email = p2.email);

Method 5: using ROW_NUMBER()

SQL> delete from person where rowid in (select rid from (select rowid rid, row_number() over(partition by email order by id) row_num from person) where row_num > 1);

Method 6: using DENSE_RANK()

SQL> delete from person where rowid in (select rid from (select rowid rid, dense_rank() over(partition by email order by rowid) rnk from person) where rnk > 1);

References:

http://www.dba-oracle.com/t_delete_duplicate_table_rows.htm

http://sqlandplsql.com/2013/01/29/5-ways-to-delete-duplicate-records-oracle/

The main content is as follows:

In Oracle there are many ways to delete duplicate records. Note that the examples below are meant only to illustrate the different possibilities.

Consider the EMP table with the rows below:

create table emp(
  EMPNO   integer,
  EMPNAME varchar2(20),
  SALARY  number);

EMPNO  EMPNAME  SALARY
10     Bill     2000
11     Bill     2000
12     Mark     3000
12     Mark     3000
12     Mark     3000
13     Tom      4000
14     Tom      5000
15     Susan    5000

1. Using rowid

SQL> delete from emp
     where rowid not in
     (select max(rowid) from emp group by empno);

This technique can be applied to almost all scenarios. The GROUP BY should be on the columns that identify the duplicates.
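To illustrate how the choice of GROUP BY columns decides what counts as a duplicate, here is a small sketch using Python's built-in sqlite3 module as a stand-in for Oracle (SQLite also exposes a rowid, but it is not the same thing as Oracle's ROWID). The helper name `dedupe_emp` is hypothetical, and the column list is interpolated into the SQL only because it is a trusted constant in this example.

```python
import sqlite3

EMP_ROWS = [
    (10, "Bill", 2000), (11, "Bill", 2000),
    (12, "Mark", 3000), (12, "Mark", 3000), (12, "Mark", 3000),
    (13, "Tom", 4000), (14, "Tom", 5000), (15, "Susan", 5000),
]

def dedupe_emp(group_cols):
    """Keep the max(rowid) row of each group defined by group_cols."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Oracle
    conn.execute("create table emp (empno integer, empname text, salary integer)")
    conn.executemany("insert into emp values (?, ?, ?)", EMP_ROWS)
    conn.execute(
        "delete from emp where rowid not in "
        f"(select max(rowid) from emp group by {group_cols})"
    )
    rows = conn.execute(
        "select empno, empname, salary from emp order by empno"
    ).fetchall()
    conn.close()
    return rows

# Whole-row duplicates only: the two extra Mark rows are removed.
print(len(dedupe_emp("empno, empname, salary")))
# Duplicates by name only: one row per distinct empname survives.
print(dedupe_emp("empname"))
```

Grouping by all three columns leaves six rows, while grouping by empname alone also removes one Bill row and one Tom row, which may or may not be what is wanted.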

2. Using self-join

SQL> delete from emp e1
     where rowid not in
     (select max(rowid) from emp e2
      where e1.empno = e2.empno);

3. Using row_number()

SQL> delete from emp where rowid in
     (
       select rid from
       (
         select rowid rid,
                row_number() over(partition by empno order by empno) rn
         from emp
       )
       where rn > 1
     );

This is another efficient way to delete duplicates.

4. Using dense_rank()

SQL> delete from emp where rowid in
     (
       select rid from
       (
         select rowid rid,
                dense_rank() over(partition by empno order by rowid) rn
         from emp
       )
       where rn > 1
     );

Here you can use either rank() or dense_rank(), since both assign a unique number to each row within a partition when ordered by rowid.
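The equivalence of rank() and dense_rank() under a unique ordering can be checked with a short sketch. It uses Python's built-in sqlite3 module as a stand-in for Oracle and assumes the bundled SQLite is version 3.25 or later, which is when window functions were added; the function name `dedupe_with` is hypothetical.

```python
import sqlite3

EMP_ROWS = [
    (10, "Bill", 2000), (11, "Bill", 2000),
    (12, "Mark", 3000), (12, "Mark", 3000), (12, "Mark", 3000),
    (13, "Tom", 4000), (14, "Tom", 5000), (15, "Susan", 5000),
]

def dedupe_with(func_name):
    """Delete duplicates by empno using the given ranking function."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Oracle
    conn.execute("create table emp (empno integer, empname text, salary integer)")
    conn.executemany("insert into emp values (?, ?, ?)", EMP_ROWS)
    # rowid is unique, so there are no ties inside a partition and
    # rank() and dense_rank() number the rows identically.
    conn.execute(
        "delete from emp where rowid in ("
        "  select rid from ("
        f"    select rowid as rid, {func_name}() over "
        "      (partition by empno order by rowid) as rn from emp"
        "  ) where rn > 1"
        ")"
    )
    rows = conn.execute(
        "select empno, empname, salary from emp order by empno"
    ).fetchall()
    conn.close()
    return rows

print(dedupe_with("rank") == dedupe_with("dense_rank"))
```

If the ordering column could contain ties, both functions would assign the same number to tied rows, and some duplicates would then survive the `rn > 1` filter.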

5. Using group by

Consider the EMP table with the rows below:

EMPNO  EMPNAME  SALARY
10     Bill     2000
11     Bill     2000
12     Mark     3000
13     Mark     3000

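SQL> delete from emp where
     (empno, empname, salary) in
     (
       select max(empno), empname, salary from emp
       group by empname, salary
       having count(*) > 1
     );

The HAVING clause restricts the delete to groups that actually contain duplicates; without it, rows that have no duplicate would also be deleted. Each run removes one row per duplicated group, so the statement must be repeated if a group has more than two rows.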
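The group by variant removes at most one row per group on each run, so groups with more than two rows need repeated runs. The sketch below is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for Oracle (row-value IN requires SQLite 3.15 or later), it assumes a `having count(*) > 1` filter so that rows without duplicates are never touched, and the extra sample rows (14 and 15) and the function name are hypothetical.

```python
import sqlite3

def dedupe_by_group(rows):
    """Repeat the group-by delete until no duplicated group remains."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Oracle
    conn.execute("create table emp (empno integer, empname text, salary integer)")
    conn.executemany("insert into emp values (?, ?, ?)", rows)
    passes = 0
    while True:
        cur = conn.execute(
            "delete from emp where (empno, empname, salary) in ("
            "  select max(empno), empname, salary from emp"
            "  group by empname, salary having count(*) > 1"
            ")"
        )
        if cur.rowcount == 0:  # nothing deleted: no duplicates left
            break
        passes += 1
    remaining = conn.execute(
        "select empno, empname, salary from emp order by empno"
    ).fetchall()
    conn.close()
    return passes, remaining

rows = [
    (10, "Bill", 2000), (11, "Bill", 2000),
    (12, "Mark", 3000), (13, "Mark", 3000), (14, "Mark", 3000),
    (15, "Susan", 5000),
]
print(dedupe_by_group(rows))
```

With this sample, two passes delete rows, and the lowest empno of each (empname, salary) group remains, along with the Susan row.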

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
