下面先来看看例子:
代码如下 | 复制代码 |
table id name 1 a 2 b 3 c 4 c 5 b |
库结构大概这样,这只是一个简单的例子,实际情况会复杂得多。
比如我想用一条语句查询得到name不重复的所有数据,那就必须使用distinct去掉多余的重复记录。
代码如下 | 复制代码 |
select distinct name from table 得到的结果是: name a b c |
好像达到效果了,可是,我想要得到的是id值呢?改一下查询语句吧:
代码如下 | 复制代码 |
select distinct name, id from table 结果会是: id name 1 a 2 b 3 c 4 c 5 b |
distinct怎么没起作用?作用是起了的,不过他同时作用了两个字段,也就是必须得id与name都相同的才会被排除。。。。。。。
我们再改改查询语句:
select id, distinct name from table
现在将完整语句放出:
代码如下 | 复制代码 |
select *, count(distinct name) from table group by name 结果: id name count(distinct name) 1 a 1 2 b 1 3 c 1 |
上面简单但有些地方是不能完成我们的需要的,下面记录了些常用的重复记录操作语句
查询及删除重复记录的方法
(一)
1、查找表中多余的重复记录,重复记录是根据单个字段(peopleId)来判断
代码如下 | 复制代码 |
select * from people where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1) |
2、删除表中多余的重复记录,重复记录是根据单个字段(peopleId)来判断,只留有rowid最小的记录
代码如下 | 复制代码 |
delete from people where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1) and rowid not in (select min(rowid) from people group by peopleId having count(peopleId )>1) |
3、查找表中多余的重复记录(多个字段)
代码如下 | 复制代码 |
select * from vitae a where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1) |
4、删除表中多余的重复记录(多个字段),只留有rowid最小的记录
代码如下 | 复制代码 |
delete from vitae a where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1) and rowid not in (select min(rowid) from vitae group by peopleId,seq having count(*)>1) |
5、查找表中多余的重复记录(多个字段),不包含rowid最小的记录
代码如下 | 复制代码 |
select * from vitae a where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1) and rowid not in (select min(rowid) from vitae group by peopleId,seq having count(*)>1) |