Wednesday, May 05, 2010

How many rows in a view ?

It depends on the order of the tables in the FROM clause....sometimes.

Okay, it's a trick question. the real one is
"How many rows does the CBO estimate when querying a view over a DB link ?"

Lets see it in action.

create table small as 
select level id, 'small' val from dual connect by level <= 10;
create table large as 

select level id, 'large' val from dual connect by level <= 10000;
 
create or replace view small_large as
select s.id small_id, l.id large_id, s.val small_val, l.val large_val 

from small s, large l where s.id = l.id;
create or replace view large_small as
select s.id small_id, l.id large_id, s.val small_val, l.val large_val 

from large l, small s where s.id = l.id;

begin
 dbms_stats.gather_table_stats (ownname => user, tabname => 'SMALL',

   estimate_percent=> 100, method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY',
   cascade => true, no_invalidate => false);
 dbms_stats.gather_table_stats (ownname => user, tabname => 'LARGE',

   estimate_percent=> 100, method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY',
   cascade => true, no_invalidate => false);
end;
/



Simple Select


explain plan for select * from small_large;
select * from table(dbms_xplan.display);


----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |    10 |   180 |    13   (8)| 00:00:01 |
|*  1 |  HASH JOIN         |       |    10 |   180 |    13   (8)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| SMALL |    10 |    90 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| LARGE | 10000 | 90000 |     9   (0)| 00:00:01 |
----------------------------------------------------------------------------

Gives an estimate of 10 rows as we'd expect (and we can see the source tables).

Select over a database link

Lets add in the database link (in this case a loopback link to the same account and database).

create database link lb connect to ****** identified by ****** using 'db11gr2';

explain plan for select * from small_large@lb;
select * from table(dbms_xplan.display);


--------------------------------------------------------
| Id  | Operation              | Name  | Rows  | Bytes |
--------------------------------------------------------
|   0 | SELECT STATEMENT REMOTE|       |    10 |   180 |
|*  1 |  HASH JOIN             |       |    10 |   180 |
|   2 |   TABLE ACCESS FULL    | SMALL |    10 |    90 |
|   3 |   TABLE ACCESS FULL    | LARGE | 10000 | 90000 |
--------------------------------------------------------

Still ten rows, but the query is fully resolved at the remote end, so there's no real challenge (and we still see the source tables).


Distributed select with local and remote objects

Now with a query that hits both sides of the fence.

explain plan for select * from small_large@lb, dual d;
select * from table(dbms_xplan.display);

--------------------------------------------------
| Id  | Operation          | Name        | Rows  |
--------------------------------------------------
|   0 | SELECT STATEMENT   |             | 10000 |
|   1 |  NESTED LOOPS      |             | 10000 |
|   2 |   TABLE ACCESS FULL| DUAL        |     1 |
|   3 |   REMOTE           | SMALL_LARGE | 10000 |
--------------------------------------------------


Now it says 10,000 rows.

explain plan for select * from large_small@lb, dual d;
select * from table(dbms_xplan.display);

----------------------------------------------------------
| Id  | Operation          | Name        | Rows  | Bytes |
----------------------------------------------------------
|   0 | SELECT STATEMENT   |             |    10 |   110 |
|   1 |  NESTED LOOPS      |             |    10 |   110 |
|   2 |   TABLE ACCESS FULL| DUAL        |     1 |     2 |
|   3 |   REMOTE           | LARGE_SMALL |    10 |    90 |
----------------------------------------------------------


Just 10 rows estimated there.

Distributed select with local and remote objects  
 
The only difference is the join order in the view definitions and that difference only comes into play with a distributed query over the DB link.

Anyone with a eidetic memory may realise I'm just reprising a post from my blog a few years back. [That was one of my first - I haven't been going quite as long as Tom or Doug.] Really I just wanted to see if anything had changed in the past couple of database releases. It hadn't, and 11gr2 does the same thing.

To take it one step further, I tried a trace. It isn't easy picking trhough the recursive SQL, but there was one query that looked for object dependencies for the view. It queried sys.dependency$ with an order by on a column order# (which isn't exposed in the dba_dependecies view). As long as the dependency is in the FROM clause (and I'm still working on how it determines that) it seems the entry with the highest order# is the one used to determine the cardinality used by the optimizer.




So, sometimes, the order of tables in the FROM clause is important. But not very often.

No comments: