重复的数据可能有这样两种情况,第一种: 表中只有某些字段一样,第二种是两行记录完全一样。
一、对于部分字段重复数据的删除
1.查询重复的数据
select 字段1,字段2, count(*) from 表名 group by 字段1,字段2 having count(*) > 1
例:Select owner from dba_tables group by owner having count(*)>1;
Select owner from dba_tables group by owner having count(*)=1; //查询出没有重复的数据
2.删除重复的数据
delete from 表名 a where 字段1,字段2 in (select 字段1,字段2,count(*) from 表名 group by 字段1,字段2 having count(*) > 1)
这种删除执行的效率非常低,对于大数据量来说,可能会将数据库吊死。
另一种高效率的方法是先将查询到的重复的数据插入到一个临时表中,然后再进行删除。
CREATE TABLE 临时表 AS
(
select 字段1,字段2, count(*) as row_num
from 表名
group by 字段1,字段2
having count(*) > 1
);
上面这句话就是建立了临时表,并将查询到的数据插入其中。
下面就可以进行这样的删除操作了:
delete from 表名 a
where 字段1,字段2 in (select 字段1,字段2 from 临时表);
3.保留重复数据中最新的一条记录
在Oracle中,rowid是隐藏字段,用来唯一标识每条记录。所以,只要保留重复数据中rowid最大的一条记录就可以了。
1、查找表中多余的重复记录,重复记录是根据单个字段(Id)来判断
select * from 表 where Id in (select Id from 表 group byId having count(Id) > 1)
2、删除表中多余的重复记录,重复记录是根据单个字段(Id)来判断,只留有rowid最小的记录
DELETE from 表 WHERE (id) IN ( SELECT id FROM 表 GROUP BY id HAVING COUNT(id) > 1) AND ROWID NOT IN (SELECT MIN(ROWID) FROM 表 GROUP BY id HAVING COUNT(*) > 1);
3、查找表中多余的重复记录(多个字段)
select * from 表 a where (a.Id,a.seq) in(select Id,seq from 表 group by Id,seq having count(*) > 1)
4、删除表中多余的重复记录(多个字段),只留有rowid最小的记录
delete from 表 a where (a.Id,a.seq) in (select Id,seq from 表 group by Id,seq having count(*) > 1) and rowid not in (select min(rowid) from 表 group by Id,seq having count(*)>1)
5、查找表中多余的重复记录(多个字段),不包含rowid最小的记录
select * from 表 a where (a.Id,a.seq) in (select Id,seq from 表 group by Id,seq having count(*) > 1) and rowid not in (select min(rowid) from 表 group by Id,seq having count(*)>1)
查询重复数据:
select a.rowid,a.* from 表名 a
where a.rowid != (
select max(b.rowid) from 表名 b
where a.字段1 = b.字段1 and a.字段2 = b.字段2 );
例:selete from dba_tables a
where a.rowid!=(
select max(rowid) from test b
where a.owner=b.owner);
delete from 表名 a
where a.rowid != (
select max(b.rowid) from 表名 b
where a.字段1 = b.字段1 and a.字段2 = b.字段2 )
使用临时表实现高效查询
create table 临时表 as
(select a.字段1, a.字段2, MAX(a.ROWID) as dataid from 正式表 a
GROUP BY a.字段1,a.字段2);
delete from 表名 a
where a.rowid !=
( select b.dataid from 临时表 b
where a.字段1 = b.字段1 and
a.字段2 = b.字段2 );
commit;