Tuesday, September 06, 2005

Explain Plans and Scalar Subqueries

Explain plans can give a misleading impression for queries which include a scalar subquery.

Here is an example of some different approaches for a query, and their explain plans and actual runs.

drop table parent;
drop table child;
create table parent (id number primary key, description varchar2(80));
create table child (id number primary key,
parent_id number constraint par_fk references parent,
description varchar2(10));

insert into parent
select rownum, table_name||'.'||column_name
from all_tab_columns
where rownum <= 10000;
insert into child select rownum, p.id, m.column_value
from parent p, table(sys.DBMS_DEBUG_VC2COLL('Number 1','Number 2')) m;
commit;
exec dbms_stats.gather_table_stats(user,'PARENT',null,100);
exec dbms_stats.gather_table_stats(user,'CHILD',null,100);
commit;

Now, have a look at various forms of a query to find the best option
set autotrace off
set pages 0
set lines 120

Firstly, analytics plus a DISTINCT

EXPLAIN PLAN SET STATEMENT_ID = 'T1' INTO PLAN_TABLE FOR
select distinct p.id, count(c.id) over (partition by c.parent_id)
from parent p, child c
where p.id=c.parent_id;

select * from TABLE( dbms_xplan.display);

----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20000 | 253K| 140 (1)|
| 1 | SORT UNIQUE | | 20000 | 253K| 140 (1)|
| 2 | WINDOW SORT | | 20000 | 253K| 140 (1)|
| 3 | NESTED LOOPS | | 20000 | 253K| 8 (13)|
| 4 | TABLE ACCESS FULL| CHILD | 20000 | 175K| 6 (0)|
|* 5 | INDEX UNIQUE SCAN| SYS_C00109804 | 1 | 4 | |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

5 - access("P"."ID"="C"."PARENT_ID")

When run it gave

Statistics
-----------------------------------------------------
0 recursive calls
0 db block gets
20070 consistent gets
0 physical reads
0 redo size
124037 bytes sent via SQL*Net to client
4936 bytes received via SQL*Net from client
668 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
10000 rows processed


Now with an inline view.

EXPLAIN PLAN SET STATEMENT_ID = 'T1' INTO PLAN_TABLE FOR
select distinct p.id, c.cnt
from parent p,
(select parent_id, count(id) cnt
from child group by parent_id) c
where p.id=c.parent_id;

select * from TABLE( dbms_xplan.display);

-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10000 | 205K| 82 (2)|
| 1 | SORT UNIQUE | | 10000 | 205K| 82 (2)|
| 2 | NESTED LOOPS | | 10000 | 205K| 38 (3)|
| 3 | VIEW | | 10000 | 166K| |
| 4 | SORT GROUP BY | | 10000 | 40000 | 36 (0)|
| 5 | TABLE ACCESS FULL| CHILD | 20000 | 80000 | 6 (0)|
|* 6 | INDEX UNIQUE SCAN | SYS_C00109804 | 1 | 4 | |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

6 - access("P"."ID"="C"."PARENT_ID")

Statistics
----------------------------------------------------
5 recursive calls
0 db block gets
10072 consistent gets
0 physical reads
0 redo size
124001 bytes sent via SQL*Net to client
4936 bytes received via SQL*Net from client
668 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
10000 rows processed


Now with a Scalar subquery.

EXPLAIN PLAN SET STATEMENT_ID = 'T1' INTO PLAN_TABLE FOR
select p.id,
(select count(c.id) from child c where c.parent_id = p.id)
from parent p;

select * from TABLE( dbms_xplan.display);

-----------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10000 | 40000 | 3 |
| 1 | SORT AGGREGATE | | 1 | 4 | |
|* 2 | TABLE ACCESS FULL | CHILD | 2 | 8 | 6 |
| 3 | INDEX FAST FULL SCAN| SYS_C00109804 | 10000 | 40000 | 3 |
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("C"."PARENT_ID"=:B1)

Statistics
----------------------------------------------------
0 recursive calls
0 db block gets
680689 consistent gets
0 physical reads
0 redo size
124048 bytes sent via SQL*Net to client
4936 bytes received via SQL*Net from client
668 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
10000 rows processed
The 'inline view' ran best of all, mostly because it does exactly what we want. It summarises up the CHILD table by the parent id, and joins it up with the PARENT (or in this case the index on the parent as it doesn't need the base table).

The 'scalar subquery' version was the worst performing. Although the lookup table will be accessed ten thousand times, the explain plan only shows 2 rows and a total cost of 6. And furthermore, the cost isn't added to the '3' cost for the parent table, giving a total cost of 3. Even the 'Bytes' is based only on the parent table data. From the optimizers point of view, it can calculate the cost of running the scalar subquery once. But, given that the number of times it has to be run (and therefore its total cost) will not alter whatever plan the optimizer chooses, then there is no point in trying to factor that cost in anywhere.

The explain plan is relevant only to the query passed in, and explains HOW that query will be evaluated. The role of the developer is to interpret the plan and work out if that is an appropriate way to go about answering the question for which the query was developed.

PS. This was run in 9iR2. If there is any improvement in later Oracle versions, I'd love to know

1 comment:

Anonymous said...

The problem with scalar subqueries not showing up in explain plans is a known bug.