Wednesday, June 22, 2011

The exceptional case of the backwards null

Yes, Virginia, there is a unique NULL. The trick is to go backwards.

At my son's school, children in the younger years are buddied up with a child from an older year. Using this as example, I'm going to use the database to ensure that no child is buddied up to more than one other child. If we have an odd number of children, an unlucky one will have no buddy, but if our uniqueness constraint can ensure that there can only be one null, we can make sure we never have more than one 'un-buddied' child.

CREATE TABLE child
(child_id      NUMBER NOT NULL,
child_name     VARCHAR2(20) NOT NULL,
buddy_id       NUMBER,
CONSTRAINT child_pk PRIMARY KEY (child_id)
)
/
CREATE UNIQUE INDEX child_uk ON child (buddy_id);

insert into child  (child_id, child_name)
values  (1,'Adam');

insert into child  (child_id, child_name)
values  (2,'Bill');

insert into child  (child_id, child_name)
values  (3,'Chris');

update child set buddy_id = 3 where child_id != 3;
ERROR at line 1:
ORA-00001: unique constraint (PERFORM.CHILD_UK) violated


The UPDATE fails because of the uniqueness constraint. But it didn't object to multiple NULL values on the INSERTs.

Reading the documentation, we find out that DESC indexes will treat multiple NULLs as duplicates and so will not allow them.

Let's give it a try

TRUNCATE TABLE child;
DROP INDEX child_uk;
CREATE UNIQUE INDEX child_uk ON child (buddy_id DESC);
insert into child  (child_id, child_name)
values  (1,'Adam');

insert into child  (child_id, child_name)
values  (2,'Bill');
ERROR at line 1:
ORA-00001: unique constraint (PERFORM.CHILD_UK) violated

Yup, documented behavious matches actual behaviour. 

Why ? Well, the DESC index is actually a function-based index using SYS_OP_DESCEND. That takes every byte of the input and subtracts its ascii value from 255. Then it sticks an extra byte of value 255 on the end (so the descending order has 'ABC' coming after 'ABCD'). The NULL gets converted to a single byte with the value 0.

At least in 10g, the optimizer isn't smart enough to use the index for a 'WHERE buddy_id IS NULL' predicate, but will for a one with a predicate of SYS_OP_DESCEND(buddy_id) =SYS_OP_DESCEND(null)

PS. I'm not enforcing referential integrity on the buddy_id, so children may end up with imaginary friends. I'm also not enforcing that if Fred's buddy is Barney, then Barney's buddy is Fred. Those are left as an exercise for the reader.

No comments: