Wednesday, October 06, 2010

Fun with ROWNUM

Conventional wisdom has it that the only predicate that is effective with ROWNUM is in the form of ROWNUM < (positive integer). Okay, a <= works too. Or a "BETWEEN 1 AND ...", but you get the idea.

In practical terms, that is true. But in totally impractical terms, you can get some interesting effects with a predicate like "rownum in (3,4)" if you use it in conjunction with an OR. Here is a simple demo. Firstly, load up a table with some data.


create table test
  (id number not null, val varchar2(30) not null);
  
create index text_id_ix on test (id, val);
create index text_val_ix on test (val, id);


insert into test (id, val) values (1,'Zebra');
insert into test (id, val) values (2,'Toucan');
insert into test (id, val) values (3,'Rhino');
insert into test (id, val) values (4,'Monkey');
insert into test (id, val) values (5,'Llama');
insert into test (id, val) values (6,'Frog');
insert into test (id, val) values (7,'Emu');
insert into test (id, val) values (8,'Dog');
insert into test (id, val) values (9,'Cat');


Next query it through a simple table scan.


select /*+ NO_INDEX (t) */               * from test t 
where rownum in (3,4) or val between 'Cat' and 'Emu';



        ID VAL
---------- -----
         7 Emu
         8 Dog
         9 Cat




Exactly what you'd expect without the ROWNUM predicate. But now tell it to use the index on the value column...


select /*+ INDEX_FFS (t text_val_ix)*/   * from test t 
where rownum in (3,4) or val between 'Cat' and 'Emu';


        ID VAL
---------- --------
         9 Cat
         8 Dog
         7 Emu
         6 Frog

So, CAT, DOG and EMU match the BETWEEN predicate. However, the scan continues onto the remaining rows in the index and, because CAT, DOG and EMU have been allocated ROWNUMs one, two and three, then the next candidate (FROG) can be assigned ROWNUM four which is passed by the ROWNUM predicate. Since it is an OR, then the row is passed into the eventual result set even though it doesn't satisfy the other criteria.

So why didn't we get four rows on the FULL SCAN ? Well, that is because the EMU, DOG and CAT rows were the last rows inserted into the table. Once the full scan has picked up those three, there aren't any more rows left in the table to be assigned ROWNUM four.

With ROWNUM, the physical ordering of rows in a table and the choice of query path can affect which rows are returned. However by using a really DAFT predicate, you can actually affect the NUMBER of rows returned too.

It's not the only way to have the same query on the same data set return different numbers of rows though.A predicate using DBMS_RANDOM can do it or the use of the SAMPLE clause. Both are intentionally random though. Function based indexes on non-deterministic functions can have similar effects, as I blogged previously.

No comments: