SQL statements for deleting duplicate data in PostgreSQL

Author: 袖梨 2022-06-29

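How do you remove duplicate rows from a single table in a PostgreSQL database? One way is to use ctid, the system column that holds the physical location of each row version within its table. The experiment below walks through the process.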

1. Create the test table

david=# create table emp (
david(# id int,
david(# name varchar);
CREATE TABLE
david=#

2. Insert the test data

david=# insert into emp values (1, 'david');
INSERT 0 1
david=# insert into emp values (1, 'david');
INSERT 0 1
david=# insert into emp values (1, 'david');
INSERT 0 1
david=# insert into emp values (2, 'sandy');
INSERT 0 1
david=# insert into emp values (2, 'sandy');
INSERT 0 1
david=# insert into emp values (3, 'renee');
INSERT 0 1
david=# insert into emp values (4, 'jack');
INSERT 0 1
david=# insert into emp values (5, 'rose');
INSERT 0 1
david=#

3. Query the initial data

david=# select ctid, * from emp;
 ctid  | id | name
-------+----+-------
 (0,1) |  1 | david
 (0,2) |  1 | david
 (0,3) |  1 | david
 (0,4) |  2 | sandy
 (0,5) |  2 | sandy
 (0,6) |  3 | renee
 (0,7) |  4 | jack
 (0,8) |  5 | rose
(8 rows)

david=#

Count the duplicate rows

david=# select distinct id, count(*) from emp group by id having count(*) > 1;
 id | count
----+-------
  1 |     3
  2 |     2
(2 rows)

david=#

The query shows that id 1 has 3 rows and id 2 has 2 rows.
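
The count above says how many copies each id has, but not which physical rows are the extras. As a rough sketch (assuming the emp table from this article and a PostgreSQL version with window functions), the rows can be numbered inside each id group so that everything after the first is flagged; the alias names numbered and rn are made up for illustration:

```sql
-- Sketch: number the rows inside each id group, ordered by physical location.
-- Rows with rn > 1 are the extra copies beyond the first one in each group.
select ctid, id, name, rn
from (
    select ctid, id, name,
           row_number() over (partition by id order by ctid) as rn
    from emp
) as numbered
where rn > 1
order by id, ctid;
```

With the sample data this should list the second and third 'david' rows and the second 'sandy' row.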

4. Query the rows to keep

Keep one row per id, chosen by either min(ctid) or max(ctid).

david=# select ctid, * from emp where ctid in (select min(ctid) from emp group by id);
 ctid  | id | name
-------+----+-------
 (0,1) |  1 | david
 (0,4) |  2 | sandy
 (0,6) |  3 | renee
 (0,7) |  4 | jack
 (0,8) |  5 | rose
(5 rows)

david=#
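
The session above uses min(ctid), which keeps the first physical copy in each group. For comparison, a minimal sketch of the max(ctid) variant, assuming the same emp table in its initial state, keeps the last physical copy instead:

```sql
-- Sketch: keep the row with the highest ctid in each id group.
-- For ids that occur only once, min(ctid) and max(ctid) pick the same row.
select ctid, *
from emp
where ctid in (select max(ctid) from emp group by id)
order by id;
```

Because the duplicates in this example are identical in every column, either choice leaves the same logical data; the difference only matters when the copies differ in columns outside the group by.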

5. Delete the duplicate rows

david=# delete from emp where ctid not in (select min(ctid) from emp group by id);
DELETE 3
david=#
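
Since a delete like this is destructive, one cautious way to run it is inside an explicit transaction, so the result can be checked before it becomes permanent. The following is only a sketch under the assumption that the emp table is as shown in this article; the verification query is an addition, not part of the original session:

```sql
begin;

-- Same statement as in the article: keep the min(ctid) row of each id group.
delete from emp
where ctid not in (select min(ctid) from emp group by id);

-- Verification: this should return no rows if every id is now unique.
select id, count(*) from emp group by id having count(*) > 1;

-- Finish with exactly one of the following:
-- commit;    -- keep the change
rollback;     -- undo it if the result looks wrong
```

Note that ctid is not a stable identifier: a row's ctid changes when the row is updated or the table is rewritten by VACUUM FULL, so the lookup and the delete belong in a single statement, as they are here.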

6. Check the final result

david=# select ctid, * from emp;
 ctid  | id | name
-------+----+-------
 (0,1) |  1 | david
 (0,4) |  2 | sandy
 (0,6) |  3 | renee
 (0,7) |  4 | jack
 (0,8) |  5 | rose
(5 rows)

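david=#

Note: if the table already has a unique, sequence-generated primary key, you can substitute that column for ctid in the statements above and delete the duplicates directly.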
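
To make the primary key variant concrete, here is a minimal sketch. The table emp_pk and its column pk are hypothetical names invented for this example; they do not appear in the original experiment:

```sql
-- Hypothetical table: same columns as emp, plus a surrogate key column pk.
create table emp_pk (
    pk   serial primary key,
    id   int,
    name varchar
);

insert into emp_pk (id, name) values
    (1, 'david'), (1, 'david'), (2, 'sandy'), (3, 'renee');

-- Keep the lowest pk in each id group and delete the rest.
delete from emp_pk
where pk not in (select min(pk) from emp_pk group by id);
```

Unlike ctid, a primary key value stays the same across updates and table rewrites, which is why it is usually the safer handle when one is available.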
