Tuesday, January 16, 2007

Column level dependencies in 11g

One of the reported 11g enhancements is "No recompilation of dependent objects when columns added to tables"

This is going to be an interesting challenge for Oracle as it is not as easy as it first seems. Or at least if they take the simple route it's going to disguise a gotcha for some shoddy code.

Take this little procedure (a useless stub just to demonstrate the issue)

PROCEDURE prc_test (pi_location in VARCHAR2) IS
v_num number;
SELECT count(*) INTO v_num
FROM emp e, dept d
WHERE location = pi_location
and dept_mgr = emp_no;

A decent PL/SQL developer would immediately spot that we can't see which of emp or dept has the columns location, dept_mgr or emp_no. However, if location and dept_mgr are only in dept, and emp_no is only in emp, then the SQL will parse successfully and the procedure compiles fine.

So what will happen in 11g if 'location' or 'dept_mgr' is added to emp or 'emp_no' to dept ?
  • Option 1 is that Oracle will not only maintain the dependencies on columns that exist, but also on ones that don't exist. That way it can perform the necessary invalidation. This is no different to the rows with 'NON-EXISTENT' type that appear in the user_dependencies view when you refer to objects that rely on public synonyms.
  • Option 2 is that the procedure isn't invalidated but the SQL falls over if the SQL is executed.
  • Option 3 is that the procedure isn't invalidated and the SQL continues to run as always....until something else forces recompilation of the procedure somewhere down the line and THEN it fails.
Options 2 and 3 aren't pretty, but I guess Oracle could get away with those without much complaint.

But what happens if a column 'pi_location' was added to emp or dept ?

A decent Oracle shop has variable naming standards in place and a big reason for that is to ensure that the names of PL/SQL variables can't conflict with column names. If every Oracle shop had that in force and guaranteed, Oracle wouldn't have a problem. If....

To accommodate this, Option 1 has to extend to recording dependencies on the names of PL/SQL variables used in SQLs. With this, as in the current versions of Oracle, the procedure is marked as invalid when the column is added. It can recompile successfully, but testing should show up the new error (and the invalidation of the procedure should prompt for it to be tested).

If Oracle doesn't maintain that dependency, then the procedure continues to run as normal (using a pi_location bind variable) until at some point it is recompiled and switches to seeing pi_location as a column and produces unexpected results. If a procedure suddenly produces unexpected results, then you tend to look at what has changed recently. Who is going to look at causes that may have been happened months ago ?

I'm not an 11g Beta tester, so I don't know what approach Oracle has taken. It may be Option 1 and we may benefit from an insight into our PL/SQL variables. It may have a totally different solution. One option would be a tentative recompile if the table structure changes, and only invalidating the original if the tentative recompile comes up different. Another approach may be to change the precedence of PL/SQL variables and column names. Currently, if you create a database function with the same name as a PL/SQL variable used in an SQL, the PL/SQL variable is still used. It is only column names which take precedence.

Whatever is offered in 11g, it's just another reason to enforce some decent variable naming standards.

No comments: