Thursday, January 10, 2013

Greatest, Least and NULLs

The functions greatest and least are amongst the special ones which can take lots of arguments. They are a bit similar to min and max for columns rather than rows. But they have one difference to the aggregate functions that meant I avoided them for a long time.

Unlike aggregate functions, which ignore null values,  greatest and least will return a null if any of the supplied columns (or expressions) are null. If that isn't what you want (and it often isn't) it means putting NVLs around everything and with substitution values that don't break the logic.

For example, 


create table state_sum
  (style varchar2(10), 
  nsw number, vic number, wa number, qld number, 
  sa number, tas number, nt number, act number);
  
begin
  insert into state_sum values
    ('Red',5,3,9,null,1,7,8,null);
  insert into state_sum values
    ('Blue',null,null,null,null,null,null,null,null);
  insert into state_sum values
    ('Green',10,9,8,7,6,5,4,3);
  commit;
end;
/

If I try a 
SELECT least(nsw,vic,wa,qld) 
FROM state_sum 
WHERE style  = 'Red';
It will return a null (because there's a null value for the QLD column);

I could try
SELECT least(nvl(nsw,0),nvl(vic,0),nvl(wa,0),nvl(qld,0)
FROM state_sum 
WHERE style  = 'Red';

But that returns a 0 because that has been substituted for the QLD null. To get the lowest 'real' value, I'd need to NVL with a value so high that it could never be the lowest. A similar problem occurs for greatest except you can probably get away with a zero....up until you get a set of negative numbers. And don't even get me started on character and date values.

In short, I never found these functions to be as useful as intended.

But once in a while this apparent drawback in their management of null values does make them useful.

Coalesce, greatest/least and null tests

I have become a fan of coalesce as NVL on steroids. 
I never liked NVL2. It might be useful, but if you can't actually give it a decent name, then you shouldn't have bothered. Or maybe just wait till you invent the concept of overloading and simply have the existing two-parameter NVL and an additional three-parameter NVL instead of NVL2.

If you haven't used it, it returns the first no-null value in the list. And, like greatest/least, it can take lots of arguments.

SQL> select style from state_sum
  2  where coalesce(nsw,vic,wa,qld,sa,tas,nt,act) is null;

STYLE
----------
Blue

SQL> select style from state_sum
  2  where coalesce(nsw,vic,wa,qld,sa,tas,nt,act) is not null;

STYLE
----------
Red
Green

But what if you want to know if any of the values are null. That is, you want Red and Blue, but not Green (or vice versa).
Yep, you guessed it. That's where you can use  greatest/least.


SQL> select style from state_sum
  2  where least(nsw,vic,wa,qld,sa,tas,nt,act) is null;

STYLE
----------
Red
Blue

SQL> select style from state_sum
  2  where least(nsw,vic,wa,qld,sa,tas,nt,act) is not null;

STYLE
----------
Green

And it is easy enough to combine them 

SQL> select style,
  2         case when least(nsw,vic,wa,qld,sa,tas,nt,act) is null 
                 then 'Y' else 'N' end nulls_present,
  3         case when coalesce(nsw,vic,wa,qld,sa,tas,nt,act) is null 
                 then 'Y' else 'N' end all_null
  4  from state_sum;

STYLE      N A
---------- - -
Red        Y N
Blue       Y Y
Green      N N


1 comment:

Anonymous said...

Gary,

in order to avoid the NULL-dilemma with least
( take your example

SELECT least(nsw,vic,wa,qld)
FROM state_sum
WHERE style = 'Red';

)
I always use the following technique

SELECT
least(
coalesce(nsw,vic,wa,qld),
coalesce(vic,wa,qld,nsw),
coalesce(wa,qld,nsw,vic),
coalesce(qld,nsw,vic,wa)
)
FROM state_sum
WHERE style = 'Red';


Matthias