Sydney Oracle Lab: March 2011

Thursday, March 31, 2011

How to un-confuse SQL Developer with ORA-00932

Sometimes SQL Developer will throw back a message like "ORA-00932: inconsistent datatypes: expected NUMBER got -". One way to get this is to try to execute

select collect(column_name) from user_tab_columns

The problem the tool is having is that it is getting back a data type which it doesn't recognize.

If you don't already have a defined collection type, then go create one as they are really useful. This is mine

CREATE OR REPLACE TYPE TAB_CHAR IS table of VARCHAR2(4000);

Then you can do an explicit CAST in the select statement and the error will go away.

select cast(collect(column_name) as tab_char) from user_tab_columns

Wednesday, March 30, 2011

Implicit rounding on INSERT

Last week I had another quiz on the PL/SQL Challenge. There was a twist in its tail which meant that only about a quarter of players got it right.

Consider the following table and procedure.

CREATE TABLE plch_products
(
   product_id NUMBER (2, 0)
, product_name VARCHAR2 (10)
, stage NUMBER (2, 0)
, CONSTRAINT plch_products_pk PRIMARY KEY (product_id)
)
/

BEGIN
   INSERT INTO plch_products
   VALUES (1, 'Mouse', 10);
   COMMIT;
END;
/

CREATE OR REPLACE PROCEDURE ins_upd_product
   (i_product_id IN NUMBER
   , i_product_name IN VARCHAR2)
IS
BEGIN
   INSERT INTO plch_products (product_id, product_name, stage)
   VALUES (i_product_id, i_product_name, 20);
EXCEPTION
   WHEN DUP_VAL_ON_INDEX
   THEN
   UPDATE plch_products
   SET product_name = i_product_name, stage = 30
   WHERE product_id = i_product_id;
   WHEN OTHERS
   THEN
   UPDATE plch_products
   SET product_name = i_product_name, stage = 40;
END;
/

If I pass in a product_id of 1.1, the value will get implicitly rounded during the INSERT to match the scale/precision of the table column. Because of this the product_id gets rounded to 1 and the insert fails due to a duplicate key. However in the exception handler, the UPDATE does NOT round the value for comparison so no rows match the criteria and none are updated..

So what can we do to prevent this problem.

Firstly, at least in 10g, the PL/SQL warnings don't cover scale/precision mismatches. They only cover data type mismatches (date to number, or even CHAR to VARCHAR2).

%TYPE anchoring can help, but it has to be in the declare section. Using %TYPE in the procedure signature doesn't help because, again, those just relate to data type.

CREATE OR REPLACE PROCEDURE ins_upd_product
   (i_product_id IN NUMBER
   , i_product_name IN VARCHAR2)
IS
  v_product_id plch_products.product_id%type;
  v_product_name plch_products.product_name%type;
BEGIN
   v_product_id := i_product_id;
   v_product_name := i_product_name;
   INSERT INTO plch_products (product_id, product_name, stage)
   VALUES (v_product_id, v_product_name, 20);
EXCEPTION
   WHEN DUP_VAL_ON_INDEX
   THEN
   UPDATE plch_products
   SET product_name = v_product_name, stage = 30
   WHERE product_id = v_product_id;
   WHEN OTHERS
   THEN
   UPDATE plch_products
   SET product_name = v_product_name, stage = 40;
END;
/

With this variation, the rounding happens within the PL/SQL assignment. The rounded value is used in both the insert and the update, so the latter will succeed. Moreover, if someone passes an excessively large product name to the procedure, this will get picked up on the relevant assignment line, rather than in the execution of the update statement.

Personally, I prefer the assignment statements to happen after the BEGIN, rather than assigning the value in the declaration section. This allows for any exceptions to be caught in the local exception handler. The disadvantage is that you can't use a CONSTANT.

Tuesday, March 29, 2011

Everything happened at once

Okay, it has taken around 14 billion years for everything to happen so far, and there's still a lot more to go. However I'm interested in the concept of 'time' in Oracle terms.

Redo log sequence

In a single Oracle instance, the redo log views everything as a sequence with one change after the other. Only in a RAC setup can you have multiple instances of the same database doing things simultaneously. Outside of recovery we are not too fussed with the redo log.

Concurrent Activity

Individual Oracle sessions can make changes to separate data blocks concurrently, though you are limited by the number of CPU cores (because only the CPU can actually process instructions to read or change a block). You can even configure multiple DBWR processes so that you can write multiple blocks to disk at the same time. Again, you might be throttled by the hardware of spindles and disk heads.

System Change Number

Bridging the gap between the redo log and the 'parallel' activity of the database is the SCN. Every commit increments the SCN, plus it gets incremented every three seconds even without a commit. The LGWR also writes at least every three seconds. Coincidence ?

Flashback Query

If your transaction performs three inserts then all three inserts may be under the same SCN or they may be under different SCNs depending on the activity in other sessions, and the time taken between those inserts.

Since you can't match an SCN to points in a transaction, Oracle sensibly decided to keep things simple and return the COMMITTED state of the data as it was at the SCN. That was the key for my Challenge question earlier this month.

In some ways, you can view a flashback query as a very lightweight autonomous transaction that only permits reads.

Undo sequencing

Within a transaction there is a sequence of changes, and you can 'navigate' that sequence using SAVEPOINT and ROLLBACK. This navigation walks back through UNDO to reverse the sequence of operations within a transaction. SAVEPOINT doesn't affect the SCN, so flashback can't be used to look at a table as it was when a SAVEPOINT was issued. All you can do is rollback to a savepoint, and once you've done that you can't rollforward again.

In theory you could walk back through the undo blocks to see everything the transaction did. If you could do that, you'd also be able to tell what records are locked by the transaction. But I haven't heard of any tools that support this in practice. There's no SQL to access undo directly, and the undo blocks are not necessarily written to disk as you'd find in a redo log, but nor are they guaranteed to still be in the SGA (though they would be in one or the other).

Saturday, March 26, 2011

An exceptional question for the optimizing compiler

I thought about submitting this as a PL/SQL Challenge quiz, but decided it was unfair, undocumented and unhelpful.

Consider the following procedure

create or replace procedure plch_test is
  v_num number(1);
  v_date date;
  e_no_day_for_month exception;
  pragma exception_init(e_no_day_for_month, -1839);
begin
  for i in reverse 1 .. 30 loop
   v_date := to_date(i||'-02-2011','dd-mm-yyyy');
   v_num := 10;
   dbms_output.put_line(v_num||':'||v_date);
  end loop;
exception
  when e_no_day_for_month then
   dbms_output.put_line('Date error');
  when value_error then
   dbms_output.put_line('Num error');
end;

Then I execute it twice with different optimization settings. What error is returned for the executions ?

alter session set plsql_optimize_level =0;

alter procedure plch_test compile;

exec plch_test;

alter session set plsql_optimize_level =2;

alter procedure plch_test compile;

exec plch_test;

In the first case, with no optimization, the "date error" path is taken as there isn't a 30th of February.

In the second case, you may wonder what effect the optimization might have. The assignment of the number 10 to v_num doesn't need to be in the loop. This isn't a cursor loop or a WHILE loop, so the code inside the loop is guaranteed to be executed. If it was executed once prior to entering the loop, it would result in a value_error.

In fact, a date error is still returned. Possibly there is no such optimization. Or possibly it is done in a way that means it is only executed once, in the correct sequence, but is not repeated. Tests shown in AskTom indicate that there is some optimization happening.

Ultimately, the PL/SQL optimizing compiler is a black box. It would be nice to assume that it wouldn't change the behaviour of the code. But I can't help looking at the procedure and wondering, would it be "wrong" (or more specifically a bug) if an optimization switched it to return a value error.

Thursday, March 17, 2011

Monitoring with Images and Slugs

After my Blogging Jeopardy! post, John Piwowar recommend I use FeedBurner to better keep count of my audience. My blind spot is aggregators. I'm very happy to have my content aggregated, and it goes out through OraDBPedia, OraNA, OraFAQ and probably some others too. Those aggregators pick up my feed, extract the records (content and metadata, such as author and date posted) and merge it into their feed. And people pick up that merged feed without ever touching the original feed, and that's where I lose track of them.

But images are magic. In an RSS feed, images are just a URL. The actual image data has to picked up by the end viewer. My images go through Picasa, as all my blog/website stuff is tied up in Google. Picasa tells me that an image in a recent post of mine was downloaded about 400 times, indicating that post had about that many viewers.

A similar tactic is exploited by mass mailers. When your email client talks about blocking remote content, it is probably talking about images, though Javascript libraries and CSS files are another, more dangerous, culprit. But outside of the security aspect, the email client warns you about 'privacy'. That is because there is a good chance that the image is being used to track whether you opened the email.

If I do a mass email to Tom, Dick and Harry, I can include a unique image URL in each one, in the same way as I address them individually with Dear Slug. I'll obfuscate it a bit to hide the fact, but essentially I'll have src="http://images.sydoracle.com/mailImage?id=Tom" If Tom wants to see the image, he has to get the content by calling the URL which tells me that Tom's message has been opened.

In my Oracle focussed world, I'd have one table of 'email templates' (with the HTML text and image data as CLOB and BLOB respectively), one 'subscriber' table with names and email addresses, and an 'email' table representing the many-to-many between the two. The email table would have the personalized URL. I'd use the Apex Listener to supply images in the same way as I did for PDFs, but this time I'd use a procedure. The listener calls the procedure, which picks the email via the personalized URL, marks it as read and then returns the actual image data from the parent email template.

If anyone wants an itty-bitty Apex app for that, let me know through a comment (or twitter, email or LinkedIn...) 11.2.0.2 (maybe we should call it 11gR2d2) added SSL for UTL_SMTP, making it a better option for sending out emails. Whether the XE version will have that remains to be seen. Anything I have would use the old, unencrypted technique.

PS.
If anyone doubts the approach of having the image data in the database, I can recommend Marcelle's ODTUG webinar on "Audio & Video Management Using PL/SQL". The ODTUG webinars are recorded and you get the URL if you are registered for the webinar. Not sure if I'm allowed to post the URL, but you could ask Marcelle.

Sunday, March 13, 2011

Blogging Jeopardy!

Please phrase your article in the form of a question.

I'm not quite sure how many people read my blog (see footnote). I can be sure that about a thousand people take the PL/SQL Challenge every working day. They take the time to visit the site, read the question, think about it and come up with an answer. I assume most will actually check the correct answer the next day (and maybe think about that too). So if I can phrase my article or concept in the form of a question for the Challenge, I can reach a much wider audience that I do through my blog. And they will think about what I wrote.

So far I've had one question published (25th of Jan 2010), taken by 1090 people with a typical time of 2 minutes and about two thirds got it all correct. And about 200 submitted a survey afterwards. That is a lot of eyeballs. I've got another one coming soon and I recently submitted a third.

So if you've got something you want people to know, whether you have an existing blog or not, if you can phrase it in the form of a PL/SQL question, go submit it. Alternatively, just go and play. The challenge awaits.

Footnote:

Google Analytics talks of about a thousand page views last month (though misses a week when I did something and lost my old stats). Blogger stats says about 2,700 which is more comforting. I expect some will pick me up through an aggregator and won't get counted and I'm not sure if a bunch of people who take my feed online, eg though Google Reader, are counted individually or not.

Saturday, March 12, 2011

Why I don't trust deferrable constraints

I don't like deferred constraints. I don't like deferrable constraints either, because there is the risk of someone making them deferred.

There's a Pythian blog on one potential corruption pitfall where you can end up with duplicate primary keys.

A little over a year ago, I commented on a blog by Chrisian Antognini to demonstrate where the optimizer gives incorrect results by invalid table elimination with a deferred constraint.

I've recently retested this in the 11.2.0.2 environment on apex.oracle.com and seen a change in behavior, though the bug is still there.

Firstly, create the parent and child tables, linked by a deferrable constraint that is INITIALLY IMMEDIATE (ie not deferred by default).

CREATE TABLE t1 (
   id NUMBER NOT NULL,
   val varchar2(5),
   CONSTRAINT t1_pk PRIMARY KEY(id)
);

CREATE TABLE t2 (
   id NUMBER NOT NULL,
   t1_id NUMBER NOT NULL,
   val varchar2(5),
   CONSTRAINT t2_pk PRIMARY KEY(id),
   CONSTRAINT t2_t1_fk FOREIGN KEY (t1_id) REFERENCES t1
   deferrable initially immediate
);

Then add a view to encapsulate the join

CREATE VIEW v12 AS
SELECT t1.id AS t1_id, t1.val AS t1_val,
   t2.id AS t2_id, t2.val AS t2_val, t2.t1_id AS t2_par_t1
FROM t1, t2
WHERE t1.id = t2.t1_id;

Because I was testing on the apex.oracle.com instance, I needed to do it as a single request (because the web-based SQL Worksheet treats a request as a transaction). So I used a stored procedure.

CREATE OR REPLACE PROCEDURE test_v AS
BEGIN

  EXECUTE IMMEDIATE 'alter session set constraints=deferred';
  INSERT INTO t2 (id, t1_id, val) VALUES (10,2,'Child');
  --
  FOR i IN (SELECT t2_id, t2_val, t2_par_t1 FROM v12) LOOP
   DBMS_OUTPUT.PUT_LINE('1:'||i.t2_id||'/'||i.t2_val||
   '/'||i.t2_par_t1);
  END LOOP;

  FOR j IN (SELECT t1_id FROM v12) LOOP
   DBMS_OUTPUT.PUT_LINE('2:'||j.t1_id);
  END LOOP;

  FOR k IN (SELECT t1_id, t1_val FROM v12) LOOP
   DBMS_OUTPUT.PUT_LINE('3:'||k.t1_id||'/'||k.t1_val);
  END LOOP;

ROLLBACK;

END;
/

The procedure sets the transaction to use deferred constraints, then inserts a row into the child table (without the corresponding parent row). Then there are three selects taking different columns from the view.

The first loop picks details that exist only in the child table, the second picks the common identifier and the third picks up details from the parent table.

In XE, only first loop returns records, but the second and third do not. The first loop "works" because the join to the parent table is eliminated on the assumption that the constraint requires that a parent row exists.

In 11.2.0.2, the first and second loops return records. The second loop can work if you assume that, because the ids are common, then it makes no logical difference if you take it from the child table or the parent table. If you assume the parent table will have at most one row for the key then the number of records returned is fully dependent on the child table. It is therefore more efficient to just hit the child table, and it is am improvement to eliminate the join to the parent.

In neither situation does the third query return a row. Because it requires the parent table to be hit, the optimizer cannot eliminate it and when it is hit, it is found to be empty.

Ultimately, the 'solution' would require every cursor to have additional dependencies on the state of every deferrable constraint for the query...or to simply assume that any constraint that is deferrable might be deferred and to optimize it as if it doesn't exist.

Friday, March 11, 2011

DBAs and Developers - complementary skill sets

This is a comment on one slide from Cary Millsap's NoCoug presentation

The slide showed the skill sets needed by Developers and DBAs. He suggested that there was a partial intersection in skills.

DBAs	Developers
Operations	Algorithms
Resource management	Languages
Troubleshooting	Frameworks
Physical data design	Design
System interfaces	User interfaces
Data modeling	Data modeling
SQL optimizer	Debugging, profiling, tracing
Oracle product features	Business-domain processes

Personally, I'd match up the developer's Debugging skill against the DBAs Troubleshooting. Both are forms of problem resolution, but the developer's focus will be on algorithm/process with the DBA responsible for the Operations.

I like the thought that both need to understand data modelling. I'm not convinced where the Developer's responsibility for the data model passes to the DBA, or whether there is an analyst/architect/designer column missing. It probably varies between organizations. DBAs are definitely responsible for tablespaces and file concepts. Constraints and indexes fall into a DMZ between the two. I recall getting into a heated discussion with HJR on the old Dizwell forums about whether the DBA should be able to make constraints deferrable by default. I still think constraints are a 'logical' concept arising from the business model.

System interfaces are about the only other area I'd be fuzzy on. I'd still put these as heavily 'business/process' constructs that sit with the developers.

Any major skill sets missed ? Anything in the wrong box ?

Thursday, March 10, 2011

Listening for my PDFs

After the last post I had some PDFs in my database. I therefore took advantage of the opportunity to user the Apex Listener to get them out. It can just as easily serve up images, video, audio and all manner of other files.

I used the standalone mode. That means I downloaded the listener, and just ran it, through java, from the command line. No tomcat, glassfish or any other application server involved. I used port 8083 as I was running on the same machine as the database and already had the embedded gateway listening on 8080.

java -Dapex.port=8083 -Dapex.images=D:\oraclexe\apex\images -Dapex.erase=true -jar d:\oraclexe\apexlistener\apex.war

'Install' is pretty much down to setting a couple of passwords to start it up.

Then go to the admin screen and point it at an Oracle install just as you would any other client. In the database I had to unlock APEX_PUBLIC_USER, and make sure it had grants to the table I wanted to use. I could have used another username/password though. I could have several listeners running on different ports attached to different schemas.

Then switch to the Resource Template tab, enter the URL and query and set the Type to Media Resource.

Downloading the file is just a matter of putting the filename in the URL. Of course in a real application, there would be a proper sequence generated primary key rather than a filename.

http://127.0.0.1:8083/apex/pdf?filename=NoCOUG_Journal_200602.pdf

I ran the browser on the same machine too, but that isn't necessary or normal. The database can be sitting listening on 1521 on machine "Blue", the apex listener on machine "Green" connected to the database on 1521 and listening on 8083, and the browser can run on machine "Red" and just talk to "Green" on 8083.

In an XE environment you might run the Apex Listener on the same machine as the database (as XE is limited to 1 CPU and 1 GB of memory, so even a basic machine still has resources spare). This is actually quite secure as you can use the firewall to shut off port 1521 from external machines, forcing all database activity through a more controlled and less privileged account, just RESTful APIs through the URL.

Hey, that could take off. Just need a cool name for accessing a database as a service with no SQL.

Tuesday, March 08, 2011

Using Oracle to convert PDF to HTML

I've used Oracle Text on a couple of assignments. One of the handy things it can do is convert a PDF into an HTML or plain text document. Actually, you can do this sort of thing through Google Docs too, by uploading a PDF and saving it back to your desktop as a different file type. But I'm going to demo the Oracle mechanism.

Firstly, you need an Oracle directory. You may already have one. If not, your DBA will need to create it as the CREATE ANY DIRECTORY privilege is needed and this is one of those very dangerous privileges that a cautious DBA will keep under lock and key. If, like me, you are using a personal XE environment, you can do what you want.

create or replace directory pdf_files
as 'WHATEVER YOUR SOURCE DIR IS';

grant all on directory pdf_files to public;

The next step is a table to store the PDF in (as a BLOB) and the appropriate Oracle Text index.

create table ctx_demo

  (filename varchar2(30),

   text blob)

/

create index search_idx

  on ctx_demo (text)

  indextype is ctxsys.context

  parameters ('sync (on commit)')

/

The third step is a few simple procedures. One to store the file in the blob, the second to get the converted HTML as a CLOB, and the final procedure to write the CLOB as a file. In practice, they'd live happily together in a package.

create or replace procedure load_file
(i_filename in varchar,

   i_dirname in varchar2 default 'PDF_FILES')
is
l_blob blob;
l_bfile bfile;
begin
insert into ctx_demo (filename, text)
    values ( i_filename, empty_blob() )
    returning text into l_blob;
l_bfile := bfilename( i_dirname, i_filename );
dbms_lob.fileopen( l_bfile );
dbms_lob.loadfromfile
    (l_blob, l_bfile, dbms_lob.getlength( l_bfile ) );
dbms_lob.fileclose( l_bfile );
end;
/

create or replace function get_html
(i_filename in varchar2) return clob
is
v_rowid rowid;
v_res clob;
begin
select rowid into v_rowid
from ctx_demo
where filename = i_filename;
--
CTX_DOC.FILTER(index_name => 'SEARCH_IDX',
    textkey => v_rowid,
    restab => v_res,
    plaintext => false);
    return v_res;
end;
/
create or replace procedure write_html

(i_filename in varchar2)
is
v_clob     clob;
v_out_file UTL_FILE.file_type;
v_buffer   VARCHAR2(16000);
v_amount   BINARY_INTEGER := 8000;
v_pos      INTEGER := 1;
v_clob_len INTEGER;
begin
v_clob := get_html(i_filename||'.pdf');
v_clob_len := DBMS_LOB.getlength(v_clob);
v_out_file := UTL_FILE.fopen('PDF_FILES',
                      i_filename||'.html', 'w', 32767);
WHILE v_pos <> v_clob_len LOOP
DBMS_LOB.read (v_clob, v_amount, v_pos, v_buffer);
    UTL_FILE.put(v_out_file, v_buffer);
    UTL_FILE.fflush(v_out_file);
    v_pos := v_pos + v_amount;
END LOOP;
UTL_FILE.fclose(v_out_file);
END;
/

Once all the preparation is done, it is a simple matter of loading the PDF and saving the html.

For this demo, I'm using an old NoCoug journal, as they recently loaded issues for the past ten years. If you are interested in the complete collection, I've got the links listed on my NoCoug Journal page. Using XE, this conversion only works up to the August 2006 issue. The later ones (except for February 2009) don't get any useful output. Possibly a later Oracle edition would cope better with these PDFs. I'm crossing my fingers that the Easter Bunny will bring me a chocolate covered XE 11g.

begin

  load_file('NoCOUG_Journal_200605.pdf');

  write_html('NoCOUG_Journal_200605');

  commit;

end;

/

Remember PDFs aren't designed to be reverse engineered to HTML documents, and Oracle Text is just interested in the text, so there aren't any pictures.

Saturday, March 05, 2011

How do I filter data when doing an import or export

When you need a subset of data for testing, it is very useful to be able to specify the filtering in the export or import. That's a lot easier than copying a full dataset and waiting while it deletes most if it.

This is very easy using the Datapump tools in 10g. Both expdb and impdb have the facility to accept a filter criteria. As a quick example:

Step 1 - Create the table. It has 16,000 rows (roughly)

>sqlplus gary/gary

SQL*Plus: Release 10.2.0.1.0 - Production
SQL> create table bunch_of_data
  2 as select owner, table_name, column_name, data_type
  3 from all_tab_columns;

SQL> select count(*) from bunch_of_data;

  COUNT(*)
----------
   16346

Step 2. Export a subset of the data

>expdp userid=gary/gary dumpfile=bunch.dmp directory=ext_tables
   tables=BUNCH_OF_DATA query='WHERE DATA_TYPE=''DATE'''

Export: Release 10.2.0.1.0 - Production on Friday, 04 March, 2011 17:45:17

Copyright (c) 2003, 2005, Oracle. All rights reserved.

Connected to: Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production
Starting "GARY"."SYS_EXPORT_TABLE_01":
userid=gary/******** dumpfile=bunch.dmp
directory=ext_tables tables=BUNCH_OF_DATA
query='WHERE DATA_TYPE=''DATE'''

Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 1024 KB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "GARY"."BUNCH_OF_DATA" 30.10 KB 513 rows
Master table "GARY"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for GARY.SYS_EXPORT_TABLE_01 is:
  D:\APPS\AR122007\BUNCH.DMP
Job "GARY"."SYS_EXPORT_TABLE_01" successfully completed at 17:46:00

Step 3. Import a subset of that export (into another schema in this test)

>impdp userid=gary_dba/gary dumpfile=bunch.dmp
   directory=ext_tables
   tables=BUNCH_OF_DATA
   query='WHERE OWNER = ''SYSTEM'''
   remap_schema=gary:gary_dba

Import: Release 10.2.0.1.0 - Production on Friday, 04 March, 2011 17:46:02

Copyright (c) 2003, 2005, Oracle. All rights reserved.

Connected to: Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production
Master table "GARY_DBA"."SYS_IMPORT_TABLE_01" successfully loaded/unloaded
Starting "GARY_DBA"."SYS_IMPORT_TABLE_01":
  userid=gary_dba/******** dumpfile=bunch.dmp directory=ext_tables
  tables=BUNCH_OF_DATA
  query='WHERE OWNER = ''SYSTEM''' remap_schema=gary:gary_dba
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "GARY_DBA"."BUNCH_OF_DATA" 30.10 KB 2 out of 513 rows
Job "GARY_DBA"."SYS_IMPORT_TABLE_01" successfully completed at 17:46:11

Partial export and partial import all complete.

Wednesday, March 02, 2011

FOR EACH ROW trigger fires at least once per row

I have a table with a 'BEFORE UPDATE FOR EACH ROW' trigger.
The table has just one row and I update that row a thousand times. How many times does the trigger fire ?

It depends...though I'm not sure on what. This was the basis for one of Steven Feuerstein's PL/SQL Challenge questions. Here's a demonstration.

DROP TABLE plch_employees purge;
CREATE TABLE plch_employees
( employee_id INTEGER
, salary NUMBER)
/

INSERT INTO plch_employees VALUES (100, 0);

CREATE OR REPLACE PACKAGE plch_emp_count
IS
   g_count NUMBER;
END plch_emp_count;
/

CREATE OR REPLACE TRIGGER plch_employee_changes
   BEFORE UPDATE
   ON plch_employees
   FOR EACH ROW
BEGIN
   plch_emp_count.g_count := plch_emp_count.g_count + 1;
END;
/

CREATE OR REPLACE FUNCTION f_trg_test
   (i_iterations IN NUMBER default 1000000)
RETURN VARCHAR2 IS
  v_ret varchar2(100);
  v_cnt number := 1;
BEGIN
  plch_emp_count.g_count := 0;
  WHILE v_cnt < i_iterations LOOP
   UPDATE plch_employees
   SET salary = salary + 1
   WHERE employee_id = 100;
   IF plch_emp_count.g_count > v_cnt THEN
   v_ret := 'MisMatch at :'||v_cnt;
   v_cnt := i_iterations;
   END IF;
   v_cnt := v_cnt + 1;
  END LOOP;
  v_ret := NVL(v_ret, 'Reached '||plch_emp_count.g_count);
  commit;
  return v_ret;
END;
/
set serveroutput on
exec dbms_output.put_line(f_trg_test);

When I run it on my XE instance, I get a mismatch, generally after 718 rows. I get another, smaller, peak at 11518. These indicate a restart which is discussed in a couple of Tom Kyte posts listed here.

In summary a restart is where the session has found a row that needs update, fires the row trigger, then tries to update the row...and doesn't. It then retries the update, finding the row again and succeeds.

Apparently there is some discussion about 'single-session' restarts being related to UNDO.

I'll leave the reasons open. But it is worth remembering that with a BEFORE UPDATE trigger, FOR EACH ROW means it will fire at least once for each row, not exactly once.

Tuesday, March 01, 2011

Keep the faith

A frequent query request is that, when you've used MAX or MIN to find the highest, latest or otherwise most favoured row(s), you want the values of the other columns in that row.

Often a suggested answer is the use of a sub-query. But this overlooks one of those less well known features in Oracle SQL, the KEEP clause.

Consider the test case:

create table test_max
  (id number,
   colour varchar2(20));

insert into test_max values (1,'Red');
insert into test_max values (2,'Orange');
insert into test_max values (3,'Yellow');
insert into test_max values (4,'Green');
insert into test_max values (5,'Blue');
insert into test_max values (6,'Purple');
insert into test_max values (7,'Black');
insert into test_max values (8,'White');
insert into test_max values (9,'Yellow');
insert into test_max values (10,'Yellow');

The maximum value of the 'colour' column is 'Yellow'. To make it a bit trickier, I've included three rows for the colour Yellow, with the IDs 3, 9 and 10. This allows us to see what happens if more than one row matches that maximum value.

select max(colour),

min(id) keep (dense_rank last order by colour asc) min_id,

max(id) keep (dense_rank last order by colour asc) max_id,

count(id) keep (dense_rank last

order by colour asc) cnt_col,

count(id) keep (dense_rank last

order by colour asc, id desc) cnt_id

from test_max;

MAX(COLOUR) MIN_ID MAX_ID CNT_COL CNT_ID

------------- ------ ------- ------- -------

Yellow 3 10 3 1

The query tells us that the highest colour is yellow, and of the 'yellow' rows, the lowest id is 3, the highest is 10 and that there are three in total. Whether you use DENSE_RANK LAST and ORDER BY ... ASC or DENSE_RANK FIRST and ORDER BY ... DESC is a readability issue, and should be judged on a case-by-case basis.

The final count demonstrates that, by adding further ordering criteria, it is possible to detail down to an individual result.

If you have Tom Kyte's STRAGG installed, you can use that too. [My own implementation is odd in that I expect the delimiter to be concatenated into the parameter value, rather than coding it into the aggregate function.]

I haven't found a way to pull in the new 11g LISTAGG function which has its own syntactic peculiarites.

select stragg(id||',') keep (dense_rank last