oracle中not exists对外层查询的影响

作者：袖梨 2022-06-29

又一个类似『12c比10g索引回表消耗增多的问题』的案例，同事在12c中跑的buffer get很高，但是在10g中跑的buffer很低。怀疑是不是12c的优化器有问题。

这个10g的环境和12c的环境，数据量大致一样，只是有很少部分的不同，但是就是这个很少部分不同，造成了not exists中的子查询返回不同的值，进而对外层查询产生不同的影响。

我们来用如下的代码模拟一下。

初始化数据：
--10g
drop table t1;
drop table t2;

create table t1 (id number,name varchar2(20),dep_id varchar2(10));
create table t2 (id number,name varchar2(20),dep_id varchar2(10));

insert into t1 select rownum,'a','kk' from dual connect by level <=3000000;
insert into t2 select rownum,'a','kk' from dual connect by level <=1000000;
insert into t2 select rownum,'a','mm' from dual;

commit;

--12c
drop table t1;
drop table t2;

create table t1 (id number,name varchar2(20),dep_id varchar2(10));
create table t2 (id number,name varchar2(20),dep_id varchar2(10));

insert into t1 select rownum,'a','kk' from dual connect by level <=3000000;
insert into t2 select rownum,'a','kk' from dual connect by level <=1000000;

commit;
我们看到，12c的数据和10g只是有很少的差别，t1表12c和10g都一样，t2表在12c只是少了一行数据。

--10g
SQL> select dep_id,count(*) from t1 group by dep_id;

DEP_ID                 COUNT(*)
-------------------- ----------
kk                      3000000

SQL> select dep_id,count(*) from t2 group by dep_id;

DEP_ID                 COUNT(*)
-------------------- ----------
mm                            1
kk                      1000000

SQL>

--12c
SQL> select dep_id,count(*) from t1 group by dep_id;

DEP_ID                 COUNT(*)
-------------------- ----------
kk                      3000000

SQL> select dep_id,count(*) from t2 group by dep_id;

DEP_ID                 COUNT(*)
-------------------- ----------
kk                      1000000

SQL>

我们将要执行的sql语句是：

select count(*)
from t1, t2
where t1.id = t2.id
   and t1.dep_id = 'kk'
   and not exists (select 1
          from t1, t2
         where t1.id = t2.id
           and t2.dep_id = 'mm');
我们先来看执行情况的差距，10g的bufferget小，12c多：

--10g
SQL> select /*+ gather_plan_statistics */ count(*) from t1,t2 where t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='mm');

COUNT(*)
----------
0

SQL> select* from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 22t5mb43w55pr, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from t1,t2 where t1.id=t2.id and t1.dep_id='kk' and not
exists (select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='mm')

Plan hash value: 3404612428

------------------------------------------------------------------------------------------------------------------
| Id | Operation            | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |      1 |00:00:00.02 |    2086 |       |       |          |
|   1 | SORT AGGREGATE      |      |      1 |      1 |      1 |00:00:00.02 |    2086 |       |       |          |
|* 2 |   FILTER             |      |      1 |        |      0 |00:00:00.02 |    2086 |       |       |          |
|* 3 |    HASH JOIN         |      |      0 |    901K|      0 |00:00:00.01 |       0 |    39M| 5518K|          |
|   4 |     TABLE ACCESS FULL| T2   |      0 |    901K|      0 |00:00:00.01 |       0 |       |       |          |
|* 5 |     TABLE ACCESS FULL| T1   |      0 |   2555K|      0 |00:00:00.01 |       0 |       |       |          |
|* 6 |    HASH JOIN         |      |      1 |     23 |      1 |00:00:00.02 |    2086 | 1517K| 1517K| 612K (0)|
|* 7 |     TABLE ACCESS FULL| T2   |      1 |     23 |      1 |00:00:00.02 |    2082 |       |       |          |
|   8 |     TABLE ACCESS FULL| T1   |      1 |   2555K|      1 |00:00:00.01 |       4 |       |       |          |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter( IS NULL)
   3 - access("T1"."ID"="T2"."ID")
   5 - filter("T1"."DEP_ID"='kk')
   6 - access("T1"."ID"="T2"."ID")
   7 - filter("T2"."DEP_ID"='mm')

Note
-----
- dynamic sampling used for this statement

34 rows selected.

SQL>

--12c
SQL> select /*+ gather_plan_statistics */ count(*) from t1,t2 where t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='mm');

COUNT(*)
----------
1000000

SQL> select* from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 22t5mb43w55pr, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from t1,t2 where
t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2
where t1.id=t2.id and t2.dep_id='mm')

Plan hash value: 1692274438

--------------------------------------------------------------------------------------------------------------------
| Id | Operation              | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |      1 |        |      1 |00:00:00.79 |   10662 |       |    |     |
|   1 | SORT AGGREGATE        |      |      1 |      1 |      1 |00:00:00.79 |   10662 |       |    |     |
|* 2 |   FILTER               |      |      1 |        |   1000K|00:00:00.74 |   10662 |       |    |     |
|* 3 |    HASH JOIN           |      |      1 |   1215K|   1000K|00:00:00.52 |    8579 |    43M| 6111K|   42M (0)|
|   4 |     TABLE ACCESS FULL | T2   |      1 |   1215K|   1000K|00:00:00.01 |    2083 |       |    |     |
|* 5 |     TABLE ACCESS FULL | T1   |      1 |   2738K|   3000K|00:00:00.07 |    6496 |       |    |     |
|* 6 |    HASH JOIN RIGHT SEMI|      |      1 |     35 |      0 |00:00:00.02 |    2083 | 1245K| 1245K| 461K (0)|
|* 7 |     TABLE ACCESS FULL | T2   |      1 |     23 |      0 |00:00:00.02 |    2083 |       |    |     |
|   8 |     TABLE ACCESS FULL | T1   |      0 |   2738K|      0 |00:00:00.01 |       0 |       |    |     |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter( IS NULL)
   3 - access("T1"."ID"="T2"."ID")
   5 - filter("T1"."DEP_ID"='kk')
   6 - access("T1"."ID"="T2"."ID")
   7 - filter("T2"."DEP_ID"='mm')

Note
-----
- dynamic statistics used: dynamic sampling (level=2)

35 rows selected.

SQL>
SQL>

可以看到第23，24行，在10g中运行时，buffers是0，而在12c中，即78，79行，buffer是2083+6496。

也就是说在10g中，外层查询不进行t1和t2的扫描，直接返回结果了，而在12c中，外层查询还要进行t1表和t2表层扫描才返回结果。

这其实不是10g和12c的差别，而是not exists的返回数据对外层的影响。子查询要返回0行记录，才满足not exist的条件，从而返回外层查询结果。

在10g中，子查询返回了一行记录

--10g
SQL> select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='mm';

1
----------
1

SQL>

不满足not exists（即0行才满足），所以，也就不用在外层继续查询了。直接返回记录0行。

在12c中，子查询返回0行记录，满足not exist的条件，所以还需要在外层查询中继续查询。

--12c
SQL> select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='mm';

no rows selected

SQL>

正是这一行记录的差异，导致了not exists对外层查询的影响。进而导致整个sql的buffer get的差异。

反证这个结果，我只要在12c中，运行子查询结果返回大于0行的，不满足not exists，也应该不会去外层查询了。见下：

--12c
SQL> select count(*) from t1,t2 where t1.id=t2.id and t2.dep_id='kk';

COUNT(*)
----------
   1000000

SQL> set line 1000
SQL> set pages 1000
SQL> col PLAN_TABLE_OUTPUT for a250
SQL>
SQL>
SQL> select /*+ gather_plan_statistics */ count(*) from t1,t2 where t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='kk');

COUNT(*)
----------
         0

SQL> select* from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID c5hj2p2jt1fxf, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from t1,t2 where
t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2
where t1.id=t2.id and t2.dep_id='kk')

Plan hash value: 1692274438

--------------------------------------------------------------------------------------------------------------------
| Id | Operation              | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |      1 |        |      1 |00:00:00.28 |    2087 |       |    |     |
|   1 | SORT AGGREGATE        |      |      1 |      1 |      1 |00:00:00.28 |    2087 |       |    |     |
|* 2 |   FILTER               |      |      1 |        |      0 |00:00:00.28 |    2087 |       |    |     |
|* 3 |    HASH JOIN           |      |      0 |   1215K|      0 |00:00:00.01 |       0 |    69M| 7428K|          |
|   4 |     TABLE ACCESS FULL | T2   |      0 |   1215K|      0 |00:00:00.01 |       0 |       |    |     |
|* 5 |     TABLE ACCESS FULL | T1   |      0 |   2738K|      0 |00:00:00.01 |       0 |       |    |     |
|* 6 |    HASH JOIN RIGHT SEMI|      |      1 |   2738K|      1 |00:00:00.28 |    2087 |    43M| 6111K|   42M (0)|
|* 7 |     TABLE ACCESS FULL | T2   |      1 |   1215K|   1000K|00:00:00.12 |    2083 |       |    |     |
|   8 |     TABLE ACCESS FULL | T1   |      1 |   2738K|      1 |00:00:00.01 |       4 |       |    |     |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter( IS NULL)
   3 - access("T1"."ID"="T2"."ID")
   5 - filter("T1"."DEP_ID"='kk')
   6 - access("T1"."ID"="T2"."ID")
   7 - filter("T2"."DEP_ID"='kk')

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

35 rows selected.

SQL>

可以看到第38，39行的buffer为0.

相关文章

精彩推荐