是,MySQL 5.7及更早版本禁止DELETE语句中子查询直接引用目标表,否则报错1093;解决方法是加一层派生表(如SELECT id FROM (SELECT ... FROM t) AS tmp),使内层结果物化,解除对原表的直接引用。
MySQL 5.7 及更早版本会报错 ERROR 1093 (HY000): You can't specify target table 't' for update in FROM clause。这不是语法写错了,而是 MySQL 的限制:DELETE 时子查询若直接 SELECT 同一表,会被视为“修改过程中读取正在修改的表”,引擎主动拒绝执行。
绕过方法是加一层派生表(即子查询套子查询),让内层先固化结果集:
DELETE FROM users WHERE id NOT IN ( SELECT id FROM ( SELECT MIN(id) AS id FROM users GROUP BY email ) AS tmp);
SELECT id FROM (...) AS tmp 是关键——AS tmp 强制 MySQL 把内层结果物化为临时表,解除对原表的直接引用WHERE id NOT IN (SELECT MIN(id) ...)
email 字段含 NULL,NOT IN 会整体失效(因为 NULL = anything 永远为 UNKNOWN),应改用 NOT EXISTS 或先过滤 IS NOT NULL
窗口函数能明确按字段分组排序后标序号,比聚合 + NOT IN 更直观、更少陷阱。
例如保留每组 email 中 id 最小的记录:
DELETE t1 FROM users t1INNER JOIN users t2 ON t1.email = t2.email AND t1.id > t2.id;
但若想用 ROW_NUMBER(),需借助 CTE(MySQL 8.0+ 支持):
WITH ranked AS ( SELECT id, ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn FROM users)DELETE FROM users WHERE id IN (SELECT id FROM ranked WHERE rn > 1);
PARTITION BY email 定义重复组,ORDER BY id 决定哪条“活下来”DELETE ... WHERE id IN (SELECT ...) 形式,不能直接 DELETE FROM ranked——CTE 不是真实表IN (SELECT ...) 可能触发全表扫描,建议确保 email 和 id 上有联合索引永远不要在没确认的情况下直接跑 DELETE。先用等价的 SELECT 模拟:
对应第一种方案(子查询嵌套):
SELECT * FROM users WHERE id NOT IN ( SELECT id FROM ( SELECT MIN(id) AS id FROM users WHERE email IS NOT NULL GROUP BY email ) AS tmp);
对应窗口函数方案:
WITH ranked AS ( SELECT id, email, ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn FROM users)SELECT * FROM ranked WHERE rn > 1;
SELECT 结果是否符合预期:是否多删了(比如误把不同用户的同名邮箱当重复)、是否漏删(比如有空字符串或前后空格干扰分组)SELECT COUNT(*) FROM orders WHERE user_id IN (上述待删id列表),避免级联破坏数据一致性mysqldump --where="id IN (...)" 备份待删记录HAVING 只能筛出“存在重复的分组”,但无法定位到具体哪几行该删。例如:
SELECT email FROM users GROUP BY email HAVING COUNT(*) > 1;
它只返回重复的 email 值,不告诉你这些 email 对应的 id 是多少。想删,还得再 JOIN 回原表,反而更绕,且容易误删整组。
DELETE FROM users WHERE email IN (SELECT email FROM users GROUP BY email HAVING COUNT(*) > 1) —— 这会把所有重复邮箱的记录全干掉,不留一条ALTER TABLE users ADD UNIQUE KEY uk_email (email),从源头阻止重复插入实际删之前,最易被忽略的是字符比较行为:MySQL 默认不区分大小写和尾部空格,'[email protected] ' 和 '[email protected]' 可能被当成同一组。真要严格去重,得用 BINARY email 或先 TRIM() 清洗。