Thursday, December 29, 2005

Raptor has been set free

As a slightly late Christmas present, Raptor is now available for download (at about 60 MB for the Windows install).

On very first impressions, I'm not convinced by the PL/SQL editor (though maybe it just doesn't match my formatting style), but as a straight query tool it is looking good.

Wednesday, December 28, 2005

12 Selects of Christmas (part 1)

Since we are still in that festive spirit, here's a four-parter on 12 different things to do with a SELECT (a nice alternative to 12 things to do with leftover turkey).

1. Plain select with a side order of flashback

You can't really start anywhere other than a plain select. However, to add some flavour, I've included the FLASHBACK clause.

create table test_tab (id number, val varchar2(100));

begin
  dbms_lock.sleep(60);
  -- In 9i you may have to wait three minutes, not just one.
end;
.
/

insert into test_tab
select rownum, to_char(add_months(sysdate,rownum),'fmMonth')
  from all_tables
 where rownum <= 12;
commit;

SELECT * from test_tab AS OF TIMESTAMP SYSDATE-(1/(24*60));

Now, try this one.

delete from test_tab where rownum < 5;
select * from test_tab;
select * from test_tab as of
SCN dbms_flashback.get_system_change_number;

Using FLASHBACK to go to the current time/SCN lets you select the current COMMITTED state of the table, that is, without any of the changes your own session has made. An interesting feature, but I am unable to think of any practical use for it.
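To make the effect concrete, here's a sketch against the test_tab above, with the four-row delete still uncommitted (the column aliases are mine):

```sql
-- Compare this session's view of the table with the committed state.
select (select count(*) from test_tab) my_view,
       (select count(*)
          from test_tab
            as of scn dbms_flashback.get_system_change_number) committed_view
  from dual;
-- With the delete still uncommitted, my_view should show 8 rows
-- and committed_view 12; after a commit, both show 8.
```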

2. Inline View

Here I'll show the first and last days of the months in the current year.

Rather than repeat the TO_DATE(...rownum...) operation for each column, I'll select that in an inline view.

select to_char(val_date,'DD-Mon-YYYY') first_date,
       to_char(last_day(val_date),'DD-Mon-YYYY') last_date
  from (select to_date('01-'||rownum||'-2005','DD-MM-YYYY') val_date
          from all_objects where rownum <= 12)
 order by val_date;

3. Scalar subquery

Now, how many objects did I create in each month?

select to_char(val_date,'DD-Mon-YYYY') first_date,
       to_char(last_day(val_date),'DD-Mon-YYYY') last_date,
       (select count(*) from user_objects
         where created >= val_date
           and created < add_months(val_date,1)) num_obj
  from (select to_date('01-'||rownum||'-2005','DD-MM-YYYY') val_date
          from all_objects where rownum <= 12)
 order by val_date;

Friday, December 23, 2005

Heading towards Christmas down under.

I've been in Australia for about seven years now. I'm finally adjusting to Christmas actually being HOT, not cold. Sydney is expecting 38°C tomorrow (about 100°F for those in the US) and that takes a lot of adjusting. Apart from thirty years of associating Christmas with heavy coats, gloves and hoping my toes don't fall off, all the Christmassy songs (except for this one) have got that Northern Hemisphere bent. Snow, sleighs, Jack Frost nipping at your heels and winter wonderland all over the place. After watching a Winnie the Pooh Christmas cartoon, my three-year-old son asked whether it would snow soon. Luckily he didn't like his only contact with snow, eighteen months ago on a day trip to Lithgow in the Blue Mountains, so he wasn't too disappointed when I said that it wasn't likely.

But I'm still having trouble with the daylight. There's just so much of it. Every night, my son wants to see the "Christmas house", which is basically any of several houses that have LOTS of lights. The problem is, it doesn't get dark until after his bedtime. Still, we've allowed a few late nights, and one with his baby sister. He was pretty much asleep at the 'Carols by Candlelight' before we even lit our 'candle' (one of those glowing plastic sticks which are less likely to start bushfires).

While I'm talking about Australia and children, the ABC (Australian Broadcasting Corporation, not the American one) is our national broadcaster. It is funded out of general taxation, unlike back in England where you had to be licensed to own anything as dangerous as a television set. Its children's television segment is presented by a rabbit called Mixy.


Rabbits were introduced into Australia over a hundred years ago, partly as food and partly as something to hunt. The thing about rabbits is that they breed, well, like rabbits. Start with a couple and you soon have a whole continent full. So in the 1950s myxomatosis was introduced into Australia to kill off as many rabbits as possible. And yet the national broadcaster thinks it perfectly reasonable that its children's TV segment is presented by a feral, diseased pest. I'm unsure whether that is because it is how Australians view children, or whether it's what they want them to grow up to be.

And finally, on children's TV, there's also a disturbing similarity between the Teletubbies and the Australian Federal government. For those without small children, the Teletubbies are a group of four rather over-fed individuals who live in a grass-covered dome far from anywhere. Often, with overtones of Big Brother, trumpets will emerge from the ground, from which anonymous voices tell the Teletubbies what to do (though they may be naughty and ignore the advice).
The Australian Federal Government is based in a grass-covered dome in Canberra, built roughly equidistant between Sydney and Melbourne with the sole purpose of being a place to put the government. They often get advice from anonymous public servants, some of which is followed. And both Teletubbies and politicians speak in a babble from which you can understand the occasional word but which, on the whole, generally makes pretty little sense.
The Teletubbies at least have the Noo Noo who comes to clean up after them.....

Friday, December 16, 2005

Campaign for the abolition of commented out code

AskTom recently had a post on the ten commandments of Oracle development. The commandment that generated the most comment was on commenting code.

I do agree with useful comments in code, though I'm sure I don't do it as much as I should (New Year's resolution time). However, my personal pet hate is commented-out code. These days I generally use a pretty editor with syntax highlighting, so it's a lot easier to ignore all the 'green stuff'. In the days of black and white editors (and some people still LIKE vi), it was very easy to lose track of what was 'real' code and what was code three years ago but isn't any more. [I've read that some 'spawn of vi' editors do have syntax highlighting, but all I get is plain vanilla vi.]

IMHO (and that's all this blog is) dead code belongs in source control history and no place else. If you absolutely MUST comment out code, use the -- form of comments so that people don't have to page up and down to work out whether the code they are looking at is still being used or not.
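To illustrate with a made-up fragment (the emp updates are purely hypothetical): with the -- form, every dead line is visibly dead, while a /* */ block forces the reader to hunt up and down for the delimiters.

```sql
-- Dead code marked line by line: obviously dead at a glance.
-- update emp set sal = sal * 1.1 where deptno = 10;
-- update emp set comm = 0 where deptno = 10;

/* Dead code in a block comment: scroll away from the delimiters
   and the lines below look exactly like live code.
update emp set sal = sal * 1.1 where deptno = 10;
update emp set comm = 0 where deptno = 10;
*/
```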


Thursday, December 15, 2005

Shock - Wikipedia found to be accurate

Okay, not quite, but at least no less inaccurate than Britannica in a few places.

But how important is accuracy? A while ago I considered a posting about the various Oracle forums around, and then got side-tracked into how responses can be measured. When a question is posted in a forum, what the questioner is looking for in an answer is USEFULNESS rather than ACCURACY. There are other factors, such as PROMPTNESS, too, but let's not go there now.

There's a lot of overlap between 'usefulness' and 'accuracy'. You can be inaccurate by being incomplete (missing exceptions that may apply in different environments, Oracle versions and so forth), but if the answer applies to the questioner's system, they will still find it useful. You can be inaccurate by being wrong about the 'why', but if the 'what' is right, you may still get away with it. It's only when the suggested action does not bring about the desired result that the response is bad.

And accurate answers aren't necessarily useful. "You'll need to upgrade to version ... to do that" isn't normally helpful, and some questions simply don't have a useful answer. "I'm in noarchivelog mode, haven't backed up since the weekend, and just poured beer all over my disk drive at the Christmas party. What do I do now?" Update the résumé, perhaps?

Encyclopedias aren't forums, of course. For the most part, the information in an encyclopedia isn't at all useful. Are you likely to consult an encyclopedia before buying a car or camera, moving house, rewiring the electrics or trying a spot of open-heart surgery? Generally, you'd look for a specialist resource before attempting any of those tasks, or just look for a specialist.
Encyclopedias are great for when you have a casual interest in a topic, such as working out where the country your team drew in the World Cup actually is. Curiosity satisfied, you can put the book back on the shelf (or close the browser), job done. You don't actually act on the information.


The other main use for Encyclopedias is the School Project, which just about says it all as far as usefulness is concerned. And there the advantage is with Wikipedia. If the teacher sees a word-for-word copy of the article, you can simply claim that you wrote both. That doesn't quite work for the Britannica.

So when you are thinking about a response, remember to aim for usefulness, and hopefully you'll be accurate too. And remember,
AskTom rankings are based on 'Not Useful' ... 'Most Useful' too.

Wednesday, December 14, 2005

A little blog rework

A while back Andrew Clarke referred to an article about the Top Ten Blog Mistakes. In light of its content, I've made a couple of small changes. Also, Blogger now has a trackback option, so I've enabled that.

1. No Author Biographies
I've got an author biography, and I'm happy with it at the moment. If you think you want to know more about me (no, not my credit card number), add a comment.

2. No Author Photo
I've added one, though it's about four years old. I'll confess to a habit, when I'm burning CDs (and DVDs), of not wanting to waste space. So I tend to throw in any old digital camera images lying around, on the grounds that any sort of backup copy is better than none (which is my normal backup strategy). This photo was on a CD I've carried round containing a bunch of handy Oracle documentation, SQL scripts, emergency applications....

3. Nondescript Posting Titles, 4. Links Don't Say Where They Go and
9. Forgetting That You Write for Your Future Boss
Those are ones to deal with for individual posts. However, with regard to the last one, a blogger should never forget they are writing for their current and past bosses too. Confidentiality should be respected, and remember your workmates may be reading what you say about them....

5. Classic Hits are Buried and 6. The Calendar is the Only Navigation
Not sure I've got any 'Classic Hits' yet, but I've set up a link on my entry page to index my 'Oracle Tips and Tricks'. Now I've just got to remember to keep that page up to date. In future I may add one for 'Opinion'.

"Most weblog software provides a way to categorize postings so users can easily get a list of all postings on a certain topic."
Don't know of this in blogger/blogspot. If you do, I'd be interested in it.

7. Irregular Publishing Frequency
I disagree with this. Given the prevalence of RSS feeds, I think people can keep up to date with a blog irrespective of how long it is between posts. At least I hope so. I had a month-long gap when time was in short supply.

With blogger/blogspot, the main limit of RSS is that it doesn't cover comments. I do get email for all new comments and I'm going to make a new resolution to make more of an effort to acknowledge them.

8. Mixing Topics
"If you have the urge to speak out on, say, both American foreign policy and the business strategy of Internet telephony, establish two blogs. "
Not convinced on this. I subscribe to orablogs, and I can see that non-Oracle posts don't really belong there. I'll see if there's any real demand for a split. If so, I'll carve out a second blog (Igor's Rest Room ?). Also, it's nice to hear what is going on in other Oracle blogger's lives. Reminds us that, underneath it all, we are people first.

10. Having a Domain Name Owned by a Weblog Service
Not convinced on this one either. I can see a benefit in having your blog on your own hosted service, but there's a lot more admin hassle and cost there. The domain name is really a separate issue, and while domain names are cheap, I don't see the benefit in having a domain name that points to a service I don't have real control over.

An intro to my Oracle Tips and Tricks

I expect this entry to undergo regular updating as I add in more Oracle Tips and Tricks.

Variants on the STRAGG (String Aggregate) function

A new variant on stragg - the ordered aggregate
Analytics with ORDER BY and DISTINCT STRAGG

Performance Tuning

My query is slow.... What we need before we can help
Another Stats Extraction Routine
10053 - The Optimizer dump
Running DBMS_XPLAN against V$SQL_PLAN
Explain Plans and Scalar Subqueries
Views, DB Links and the CBO and Views, DB Links, the CBO and indexes

An Oracle Forms Ad-Hoc Query tool
Functions as datasources and NO_DATA_FOUND
Locating an error in a query run using Dynamic SQL
PL/SQL Garbage Collection - How memory used by variables is reclaimed
Recording the commit time on a record
When empty is not empty - Why a table with no rows can still be big

Monday, December 12, 2005

PL/SQL Garbage Collection

In another blog, a question was raised about PL/SQL garbage collection, and how memory used, for example, by a PL/SQL table is reused. I'm sure it's been covered elsewhere, but it never hurts to restate things, so....


set serveroutput on size 1000000

declare
  procedure show_mem (p_text in varchar2) is
    cursor c_mem is
      select a.name, b.value
        from v$statname a, v$mystat b
       where a.statistic# = b.statistic#
         and a.name like '%pga %';
  begin
    dbms_output.put_line('---'||p_text||'---');
    for i in c_mem loop
      dbms_output.put_line(rpad(i.name,35)||':'||lpad(i.value,20));
    end loop;
    dbms_output.put_line('------');
  end show_mem;
begin
  show_mem('before');
  declare
    type tab_char is table of varchar2(2000) index by pls_integer;
    t_char tab_char;
  begin
    for i in 1..2000 loop
      t_char(i) := rpad('a',2000);
    end loop;
    show_mem('in scope 1');
  end;
  show_mem('out scope 1');
  declare
    type tab_char is table of varchar2(2000) index by pls_integer;
    t_char tab_char;
  begin
    for i in 1..2000 loop
      t_char(i) := rpad('a',2000);
    end loop;
    show_mem('in scope 2');
  end;
  show_mem('out scope 2');
end;
.
/

---before---
session pga memory : 1556052
session pga memory max : 1556052
------
---in scope 1---
session pga memory : 7781972
session pga memory max : 7781972
------
---out scope 1---
session pga memory : 7781972
session pga memory max : 7913044
------
---in scope 2---
session pga memory : 14007892
session pga memory max : 14007892
------
---out scope 2---
session pga memory : 14007892
session pga memory max : 14138964
------

PL/SQL procedure successfully completed.

[email protected]> /
---before---
session pga memory : 1687124
session pga memory max : 14138964
------
---in scope 1---
session pga memory : 7781972
session pga memory max : 14138964
------
---out scope 1---
session pga memory : 7781972
session pga memory max : 14138964
------
---in scope 2---
session pga memory : 14007892
session pga memory max : 14138964
------
---out scope 2---
session pga memory : 14007892
session pga memory max : 14138964
------


At the original 'before' point, I had a current value of about 1.5 million.
It jumped up to 8 million when the PL/SQL table was loaded.
It stayed there even when the PL/SQL table went out of scope.
It jumped again to over 14 million when a new PL/SQL table was loaded, and again stayed there.
However, at the start of a rerun, it returned to under 2 million before jumping again as in the original run.

As a slight variation, I used an explicit, named procedure rather than an anonymous PL/SQL block.


set serveroutput on size 1000000

declare
  procedure show_mem (p_text in varchar2) is
    cursor c_mem is
      select a.name, b.value
        from v$statname a, v$mystat b
       where a.statistic# = b.statistic#
         and a.name like '%pga %';
  begin
    dbms_output.put_line('---'||p_text||'---');
    for i in c_mem loop
      dbms_output.put_line(rpad(i.name,35)||':'||lpad(i.value,20));
    end loop;
    dbms_output.put_line('------');
  end show_mem;
  procedure set_tab (p_text in varchar2) is
    type tab_char is table of varchar2(2000) index by pls_integer;
    t_char tab_char;
  begin
    show_mem('before:'||p_text);
    for i in 1..2000 loop
      t_char(i) := rpad('a',2000);
    end loop;
    show_mem('after:'||p_text);
  end;
begin
  set_tab('1');
  set_tab('2');
  show_mem('end');
end;
.
/

---before:1---
session pga memory : 1752660
session pga memory max : 14138964
------
---after:1---
session pga memory : 7847508
session pga memory max : 14138964
------
---before:2---
session pga memory : 7847508
session pga memory max : 14138964
------
---after:2---
session pga memory : 7847508
session pga memory max : 14138964
------
---end---
session pga memory : 7847508
session pga memory max : 14138964
------

This time, the memory used in the first execution of the named procedure is reused for the second, so there isn't that extra jump. This shows that there can be a benefit in putting a memory-hungry process into a named procedure (or function).

As a final test, I tried

declare
  procedure show_mem (p_text in varchar2) is
    cursor c_mem is
      select a.name, b.value
        from v$statname a, v$mystat b
       where a.statistic# = b.statistic#
         and a.name like '%pga %';
  begin
    dbms_output.put_line('---'||p_text||'---');
    for i in c_mem loop
      dbms_output.put_line(rpad(i.name,35)||':'||lpad(i.value,20));
    end loop;
    dbms_output.put_line('------');
  end show_mem;
begin
  show_mem('before');
  declare
    procedure set_tab (p_text in varchar2) is
      type tab_char is table of varchar2(2000) index by pls_integer;
      t_char tab_char;
    begin
      show_mem('before:'||p_text);
      for i in 1..2000 loop
        t_char(i) := rpad('a',2000);
      end loop;
      show_mem('after:'||p_text);
    end;
  begin
    show_mem('before scope 1');
    set_tab('point 1');
    show_mem('after scope 1');
  end;
  show_mem('out scope 1');
  declare
    type tab_char is table of varchar2(2000) index by pls_integer;
    t_char tab_char;
  begin
    for i in 1..2000 loop
      t_char(i) := rpad('a',2000);
    end loop;
    show_mem('in scope 2');
  end;
  show_mem('out scope 2');
end;
.
/

---before---
session pga memory : 1818196
session pga memory max : 14138964
------
---before scope 1---
session pga memory : 1818196
session pga memory max : 14138964
------
---before:point 1---
session pga memory : 1818196
session pga memory max : 14138964
------
---after:point 1---
session pga memory : 7913044
session pga memory max : 14138964
------
---after scope 1---
session pga memory : 7913044
session pga memory max : 14138964
------
---out scope 1---
session pga memory : 7913044
session pga memory max : 14138964
------
---in scope 2---
session pga memory : 7913044
session pga memory max : 14138964
------
---out scope 2---
session pga memory : 7913044
session pga memory max : 14138964
------


That demonstrates that you can get the benefits of the memory cleanup and retain a single program flow (i.e. not skipping to an out-of-line procedure) by defining a named PL/SQL procedure within the anonymous PL/SQL block.
These tests were done in XE.

Friday, December 09, 2005

A new variant on stragg - the ordered aggregate

In a previous post, I included an example of a variant of Tom Kyte's STRAGG (string aggregate) function that ordered the elements alphabetically. Here I'm showing one where the elements can be ordered differently: for every string to be aggregated, you pass in the value itself and its order position. It doesn't check whether that order position is already taken, so if you have two values in sort position 3, one of them will be lost. Also, the order position should be an integer.

Firstly, create the necessary types. The first two are pretty obvious, the third is the aggregate type:

create or replace type sort_element is object
(sort_seq number, sort_value varchar2(2000))
.
/

create type tab_sort is table of sort_element
.
/

create or replace type str_ord_agg_type as object
(
t_sort tab_sort,
static function
ODCIAggregateInitialize(sctx IN OUT str_ord_agg_type )
return number,
member function
ODCIAggregateIterate(self IN OUT str_ord_agg_type ,
value IN sort_element )
return number,
member function
ODCIAggregateTerminate(self IN str_ord_agg_type,
returnValue OUT varchar2,
flags IN number)
return number,
member function
ODCIAggregateMerge(self IN OUT str_ord_agg_type,
ctx2 IN str_ord_agg_type)
return number,
member function
ODCIAggregateDelete(self IN OUT str_ord_agg_type,
value in sort_element)
return number
)
.
/


Then the type body for the aggregation operation

create or replace type body str_ord_agg_type
is
--
static function ODCIAggregateInitialize
(sctx IN OUT str_ord_agg_type)
return number
is
begin
sctx := str_ord_agg_type( null );
--Initialize the storage part
sctx.t_sort := tab_sort();
return ODCIConst.Success;
end;
--
member function ODCIAggregateIterate
(self IN OUT str_ord_agg_type,
value IN sort_element )
return number
is
begin
--Extend the array and add the new element.
t_sort.extend;
t_sort(t_sort.last) := value;
return ODCIConst.Success;
end;
--
member function ODCIAggregateTerminate
(self IN str_ord_agg_type,
returnValue OUT varchar2,
flags IN number)
return number
is
type tab_char is table of varchar2(4000) index by pls_integer;
t_char tab_char;
v_ind number;
v_out_line varchar2(4000);
begin
--Store the elements in the array using the sort seq
v_ind := t_sort.first;
while v_ind is not null loop
t_char(t_sort(v_ind).sort_seq) := t_sort(v_ind).sort_value;
v_ind := t_sort.next(v_ind);
end loop;
--
--Go through in sort seq order, moving elements to the output string
v_ind := t_char.first;
while v_ind is not null loop
v_out_line := v_out_line||t_char(v_ind)||',';
v_ind := t_char.next(v_ind);
end loop;
--Remove last comma
v_ind := length(nvl(v_out_line,',')) -1;
returnValue := (substr(v_out_line,1,v_ind));
return ODCIConst.Success;
end;
--
member function ODCIAggregateDelete
(self IN OUT str_ord_agg_type,
value in sort_element)
return number
is
v_point number;
begin
v_point := t_sort.first;
while v_point is not null loop
if t_sort(v_point).sort_value = value.sort_value
and t_sort(v_point).sort_seq = value.sort_seq then
t_sort.delete(v_point);
return ODCIConst.Success;
end if;
v_point := t_sort.next(v_point);
end loop;
return ODCIConst.Success;
end;
--
member function ODCIAggregateMerge
(self IN OUT str_ord_agg_type,
ctx2 IN str_ord_agg_type)
return number
is
v_point number;
begin
v_point := ctx2.t_sort.first;
while v_point is not null loop
t_sort.extend;
t_sort(t_sort.last) := ctx2.t_sort(v_point);
v_point := ctx2.t_sort.next(v_point);
end loop;
return ODCIConst.Success;
end;
end;
.
/


And finally the function :

CREATE or replace
FUNCTION strordagg(input sort_element )
RETURN varchar2
PARALLEL_ENABLE AGGREGATE USING str_ord_agg_type;
.
/

grant execute on strordagg to public;

select table_name, index_name,
strordagg(sort_element(column_position, column_name)) sorted
from user_ind_columns
group by table_name, index_name
/


In 10G, we can use the COLLECT function with the same sort_element and tab_sort types, but with a much simpler function, though the select looks a little uglier.


create or replace function sorted_agg
  (p_tab_sort in tab_sort) return varchar2 is
  type tab_char is table of varchar2(4000) index by pls_integer;
  t_char     tab_char;
  v_ind      number;
  v_out_line varchar2(4000);
begin
  --Store the elements in the array using the sort seq
  v_ind := p_tab_sort.first;
  while v_ind is not null loop
    t_char(p_tab_sort(v_ind).sort_seq) := p_tab_sort(v_ind).sort_value;
    v_ind := p_tab_sort.next(v_ind);
  end loop;
  --
  --Go through in sort seq order, moving elements to the output string
  v_ind := t_char.first;
  while v_ind is not null loop
    v_out_line := v_out_line||t_char(v_ind)||',';
    v_ind := t_char.next(v_ind);
  end loop;
  --Remove the last comma
  v_ind := length(v_out_line) - 1;
  return(substr(v_out_line,1,v_ind));
end;
.
/

select index_name,
sorted_agg(cast(collect(sort_element(column_position, column_name)) as tab_sort)) sorted
from all_ind_columns
group by index_name
/

Wednesday, December 07, 2005

PL/SQL warnings in 10GR2

It's been over a month since my last post. I'm going to have to try to improve on that.
I've got several ideas, but not much free time to flesh them out.

I have been playing around with a sandbox install of XE, my first exposure to 10GR2.
The first thing I noticed was that GROUP BY often (mostly?) does NOT sort in the GROUPed order.

Yes, I know that without an ORDER BY it was never guaranteed, but it is a wake-up call, and I can start getting into better habits now.

Often, though, while the order isn't necessary, it can be useful. For example, when processing clients and using DBMS_APPLICATION_INFO to record the 'current' one, if they come through in order you get a feel for how far through the job it is.
Now I'll have to think about whether it is worth doing an explicit sort to keep that feeling.
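As a sketch of the habit I mean (some_tab and its val column are made up): if you want the groups in order, say so explicitly rather than relying on the old sort side-effect.

```sql
-- Pre-10GR2 habit: the sort used to fall out of the GROUP BY.
select val, count(*) from some_tab group by val;

-- 10GR2-safe: hash aggregation may return groups in any order,
-- so add the ORDER BY when the order actually matters.
select val, count(*) from some_tab group by val order by val;
```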

Now with XE, I've turned the PL/SQL compiler warnings on by default.
Hopefully something else to get me into good habits.

However, I've found one case where the warning is incorrect. Anyone know any others?


[email protected]> alter session set plsql_warnings='ENABLE:ALL';

Session altered.

[email protected]> create or replace FUNCTION tf_to_char (p_boolean IN BOOLEAN) RETURN VARCHAR2 IS
2 BEGIN
3 --Converts a boolean (true/false/null) to a string which can be used for output
4 IF p_boolean THEN
5 RETURN 'TRUE';
6 ELSIF NOT p_boolean THEN
7 RETURN 'FALSE';
8 END IF;
9 return 'NULL';
10 END tf_to_char;
11 .
[email protected]> /

SP2-0806: Function created with compilation warnings

[email protected]> show errors
Errors for FUNCTION TF_TO_CHAR:

LINE/COL ERROR
-------- -----------------------------------------------------------------
6/9 PLW-06002: Unreachable code
[email protected]>

[email protected]> declare
2 v_test boolean;
3 begin
4 dbms_output.put_line(tf_to_char(v_test));
5 v_test:=true;
6 dbms_output.put_line(tf_to_char(v_test));
7 v_test := false;
8 dbms_output.put_line(tf_to_char(v_test));
9 end;
10 .
[email protected]> /
NULL
TRUE
FALSE

PL/SQL procedure successfully completed.

Friday, November 04, 2005

An Oracle Forms Ad-Hoc Query tool

If you are a Forms user, have you ever thought that it would be handy to have an SQL session running within a Forms session to examine data or do some tweaks ? Ever wanted a Forms version of TOAD ? Okay, this doesn't go that far, but it gives a chunk of that functionality.

A few years ago I wrote a form which allowed the user to enter a SELECT statement, hit a button and get the results back in a multi-record block. At the time I needed to see a bunch of data posted by a Form which hadn't been committed (and was throwing an error when it tried to commit). TOAD and SQL*Plus can't help much trying to show uncommitted data from a separate Forms session.

In the years since I've added bits, such as fetching DBMS_OUTPUT messages, and the Form evolved into a pretty handy tool. I say evolved because, if you look at the code, you'll see little evidence of intelligent design ;) . I have tried to put a few comments in, and kept the 'customizable' bits in obvious places.

I've previously posted it in Quest pipelines but now I've set up a blog where I can add small files (thanks alfablog), so I've posted it there as well. There's a Forms 6i FMB file, and a 4 page Word document describing most of what it can do. I have run versions under WebForms (6i,9i and 10g), though the layout needs some tweaking for that.

The blog item with the links is at Tadpole Form
The form can be taken directly from here with the document here.

No plans to actually move my blog at this time, but I will try to monitor both blogs if you want to post comments (bugs, requests for explanations etc.)

Monday, October 24, 2005

Mangling my PC partitions

Saturday was a BAD day. I had a chunk of unused space on my hard drive which I wanted to be able to use, and a magazine with Partition Manager 2005 as a freebie on its disk. My baby daughter had dozed off in her cot and my wife had taken my son to the library. I decided to go for it. Now, given that this was a small window of opportunity, there wasn't time for a full backup. Somewhere, there is a mischievous spirit who jumps at opportunities like this, so with half an hour of partition mangling done and five minutes remaining on the clock, the power went off...

Once the power was back, the PC wanted to be booted from the Partition Manager recovery floppy. My PC simply wasn't interested, possibly because my floppies are at least five years old and have spent a lot of time in the laundry room. Given that a couple of weeks ago I had dumped Ubuntu Linux as a dual boot option and replaced it with Fedora 4, the Linux area was all fresh and, to my mind, expendable. So I re-installed Fedora in the hope that I'd be able to at least boot into it, see the Windows partitions and copy off a few of the more desirable files. Fedora went in fine, and I could see the main XP partition, but not the one with all my data on it (because, of course, that was the one I'd been trying to merge with Partition Manager).

So I did a recovery install of Windows XP, which sort of worked. When I originally installed XP I had an old hard drive connected on C:, plus a DVD-writer on D:, a CD-writer on E: and a USB Flash drive on F:. So the main XP OS partition on the new hard drive named itself G:. After install, I created a new partition for all my data, which got called F: since in the meantime I'd removed the flash drive.

When I did the recovery install, I didn't have the USB drive in. The XP OS partition renamed itself from G: to F:. I fired up Partition Manager and it managed to recover the partition I had been mangling, but it was now G: instead of F:. So half my PC thinks all the stuff on my G: drive is on F: and vice versa, while the other half is quite content.

I SHOULD bite the bullet, copy out all the data files, clear off the entire drive and start everything from scratch. But my time comes in the small chunks that toddler and baby permit, so I know I'll end up checking everything on the PC to see what breaks, why and trying to fix it bit by bit.

But as a consolation, I've bought a new 1GB flash drive for AUS$70 (I think about US$50). 1GB. That old C: drive in my PC is about six years old and is just 4GB, and I'd spent those six years filling it up. Now I'm just carrying around 1GB of storage. Amazing.

Friday, October 21, 2005

J2EE and ADF for Forms Developers workshop in Sydney

Yesterday I attended the "J2EE and ADF for Forms and Designer Developers workshop" by Sue Harper (http://www.groundside.com/blog/content/SueHarper/) in Sydney (or at least the talk, though not the hands-on component).

Firstly, I'll declare my bias. I like Forms Developer (mostly). I like PL/SQL. I hate Reports Developer, but that's neither here nor there. However, there seems to be a push away from Forms towards some form of Java-derived front end. Forms doesn't get the glamour press; the 'I had all these systems and solved my integration problems by running Java/XML/whatever over the lot' type press. That probably makes it harder to sell, and I don't think Oracle are really trying to sell it. I think they should, on the basis that if your application is developed by your database (PL/SQL) developers, you'll get an application that runs better against the database. And you'll tie yourself into that database.... Sorry, make that 'leverage your investment'.

That's what I don't like about the Java push. IT shops will think "we need Oracle developers for the database and Java developers for the front end". There aren't enough good Java/Oracle developers to cover both bases. There probably aren't even enough competent Oracle developers, and they are still 'expensive'. But Java developers are being churned out of colleges and universities, and they produce something pretty.

At a recent project, the PM wanted to show the end users the progress being made by development. It wasn't a matter of waving around a big bunch of code printouts or telling them how many batch routines had been written. They were shown the screens, and just the screens. When push comes to shove, pretty wins. And when the Java developers are saying the database is just a data dump and that everything can be written in Java, the database developers are going to get the pointy end of the stick.

And that's my beef with JDeveloper. I don't want to be told that, when you need a tricky bit of record validation, go and find a Java developer to write it and gradually you'll pick up enough so that you don't have to ask them as often. I've got a lot of PL/SQL experience, and some in C, Perl, COBOL and a couple of other languages. What I want is a PL/SQL to Java Rosetta Stone. This is how you translate PL/SQL collections to Java, this is how you call database procedures or SQL through Java, and so forth. That will give PL/SQL developers the knowledge to write Java.

Because my experience of Forms is that, while some Forms are simple base-table selects/insert/updates, the ones that aren't simple are the ones you spend the most time working on, and they contain lots of PL/SQL. And honestly, if I can't develop that in JDeveloper when I can in Forms, then I am not able to sell myself as a practical JDeveloper resource.

I wasn't able to stay around for the actual hands-on portion. That wasn't a great disappointment to me, as it isn't the way I'm comfortable learning. I'm distrusting of a rehearsed 'press here, drag this and its done' walkthrough. It reminds me of my first Oracle training for Forms 4.5 (or maybe 5.0), developing the normal DEPT/EMP style Forms. I want to try something tricky, with validations, changing defaults depending on this or that and so on. So I'll try JDeveloper on something, time permitting.

This has probably come across very negative, and is also probably unfair. I don't see that the J2EE approach is technically better than the Forms approach for the environments I've worked in or can see myself working in. But I feel it may well get used there and I can't afford to be shut out. I hope I'm wrong and can stay with Forms. [My last experience of a Developer Day was about four years back. It was WAP this, WAP that and them telling us that we'd be developing everything to run on a mobile phone. I heard more WAP on that day than I'd heard before or in the four years since. Though I'll admit that today, I can at least see that sometime in the next five to ten years I will probably work on something that can be used on some type of mobile device.]

We'll see in six months or a year's time whether I've changed my tune about J2EE and JDeveloper. When you're convincing your friends about an issue, they'll give you the benefit of the doubt. Convincing the enemy is always more of a challenge.

PS.

Reading back on this, I haven't given justice to the emphasis in the presentation on using the appropriate technology and the potential for integrating Forms and J2EE. That's an area which I need to look into more.

Wednesday, October 05, 2005

Recording the commit time on a record

It is often useful to record the time at which a record was inserted or updated, usually achieved with an appropriate column defaulting to SYSDATE or maintained by triggers. What is harder to capture is the time the record was committed. But rephrasing the question as "the time the record became visible to the rest of the database/application" suggests a solution: use a background job to identify and update records which have suddenly become 'visible'.

Here I create a test table with the normal 'created_on' column set to sysdate. Another column, 'create_commit_on', will be left null by the insert, and a batch process running in the background will pick up the null entries and set the column to sysdate. Strictly, using DBMS_JOB the job won't record the exact commit time, but it will normally be close enough.

An index is put on the create_commit_on column, using a DECODE so that ONLY the null values appear in the index. This allows the records requiring update to be identified quickly (i.e. without a full scan of the table). [Not that full scans are evil, but if a large proportion of the table would require this update at one time, this probably isn't a practical technique for you unless it is a very small table.]


To demonstrate, we create the table and a procedure to insert into it with a pause of a few minutes between insert and commit. We'll run this a few times.
Another procedure is created to set the commit time, and is run as a background job repeating every minute. For this you will need job processing enabled, and the demo procedures also require access to DBMS_LOCK.
drop table test;

create table test
(id number, col_a varchar2(100),
created_on date, create_commit_on date);

create index pend_ix on test (decode(create_commit_on,null,1,null));

drop sequence test_seq;

create sequence test_seq;

drop procedure set_commit_time;

create procedure set_commit_time is
begin
update test set create_commit_on=sysdate
where decode(create_commit_on,null,1,null) = 1;
end;
.
/

drop procedure add_test;

create procedure add_test is
v_pause number;
begin
v_pause := round(dbms_random.value(120,300));
insert into test
(id, col_a, created_on)
values (test_seq.nextval, 'Waiting '||round(v_pause/60,1)||' minutes', sysdate);
dbms_lock.sleep(v_pause);
end;
.
/

set serveroutput on size 100000

declare
v_num number;
begin
dbms_job.submit(v_num, 'begin set_commit_time; commit; end;',
sysdate, 'sysdate + (1/3600)');
commit;
for i in 1..10 loop
dbms_job.submit(v_num, 'begin add_test; commit; end;');
dbms_lock.sleep(10);
commit;
end loop;
commit;
dbms_output.put_line(v_num);
end;
.
/

column col_a format a30
alter session set nls_date_format ='DD/Mon/YY HH24:MI:SS';
Select * from test order by 1;

ID COL_A CREATED_ON CREATE_COMMIT_ON
---------- ------------------------------ ------------------ ------------------
1 Waiting 3.2 minutes 05/Oct/05 11:50:58 05/Oct/05 11:54:11
2 Waiting 2.3 minutes 05/Oct/05 11:51:08 05/Oct/05 11:54:11
3 Waiting 4.6 minutes 05/Oct/05 11:51:18 05/Oct/05 11:56:22
4 Waiting 3.4 minutes 05/Oct/05 11:51:29 05/Oct/05 11:55:05
5 Waiting 3 minutes 05/Oct/05 11:51:39 05/Oct/05 11:55:05
6 Waiting 3.6 minutes 05/Oct/05 11:51:50 05/Oct/05 11:55:56
7 Waiting 4 minutes 05/Oct/05 11:52:00 05/Oct/05 11:56:22
8 Waiting 2.6 minutes 05/Oct/05 11:52:10 05/Oct/05 11:55:05
9 Waiting 4.4 minutes 05/Oct/05 11:52:21 05/Oct/05 11:57:14
10 Waiting 2.5 minutes 05/Oct/05 11:53:26 05/Oct/05 11:56:22

In my example (the random waits mean yours will differ slightly) we can see that item 4 was created after item 3 but waited a minute less to commit, and so was actually committed earlier. Similarly, item 8 committed before items 6 and 7, and item 10 before item 9.

The same technique could be applied to updates, as long as the updates all set the CREATE_COMMIT_ON column back to null (possibly through a trigger).
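For illustration, here's a minimal sketch of such a trigger against the demo TEST table above. This is a hypothetical addition, not part of the original demo; any trigger that blanks the column on update will do.

```sql
-- Sketch: blank the commit-time column on every update so the
-- background job re-stamps it once the change becomes visible.
create or replace trigger test_commit_time_trg
before update on test
for each row
begin
  :new.create_commit_on := null;
end;
/
```

The decode-based index then picks up updated rows exactly as it does newly inserted ones.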

Normally, it is difficult to pick out records added or updated in a table since a particular time, as the delay between change and commit means that changes do not necessarily become visible in the order they are made. Even allowing, for example, a five minute window may miss long-running transactions, and requires processing to cater for duplicates.

By using this technique, we can use the time the record became visible. This is not a universal solution. The update would be onerous if a large number of rows had been changed and applying this technique to lots of tables would also have an impact and risk of deadlock unless each update is committed separately.
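As a sketch of how a consumer might then use the column (:last_run_time is a hypothetical bind variable holding the timestamp recorded at the previous run, and a conventional index on CREATE_COMMIT_ON would be needed to support the range predicate efficiently):

```sql
-- Sketch: pick up rows that became visible since the previous run.
select id, col_a, created_on, create_commit_on
from   test
where  create_commit_on > :last_run_time;
```

Because the comparison is against the visibility time rather than CREATED_ON, a long-running transaction can't slip through the window.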

Wednesday, September 28, 2005

J Lewis 'Cost Based Oracle' Apress link

Apress have put up some details on Jonathan Lewis's book here.
Priced at $50 (US, I guess), which is the same as Tom Kyte's.
400 pages. That's less than Tom's (750+ pages), so maybe I'll be able to carry it around for reading on the train without my bag strap leaving a mark :)

Apart from that, no formal release date, and no information that wasn't on Jonathan's site

Tuesday, September 27, 2005

AskTom changes on a (distant?) horizon

Looks like there'll be some changes to AskTom coming sometime.

"I'm rewriting the site :) will be locking threads after 4 weeks (will allow for emergency bug fixes to the content to be alerted to me of course - so they won't be locked totally, you'll still be able to comment on them but if the comment isn't relevant to the existing text - a question about how it worked or pointing out a mistake - it'll never see the light of day)"

I think it will be an improvement. At the moment there is the conundrum of not wanting to miss anything, but also not wanting to plough through text from several years ago. Plus topic drift makes searching more difficult: you can get lots of results returned where the titles don't seem to match what you are looking for, but you need to look at each article to be certain.

The trick for Tom will be managing late comments that are 'sort-of' relevant. Maybe some will make it as distinct questions in their own right. To avoid that becoming a route for bypassing the "I'm not accepting questions now", you could simply hold off posting those as new questions for a week or two so that people get the message that it isn't a 'fast track' for answers.

Tuesday, September 20, 2005

Locating an error in a query run using Dynamic SQL

Well not everyone has managed to get to San Francisco. This morning I was debugging some of my code which was falling over trying to run a complex query though a refcursor. Unfortunately, in 9iR2 at least, Oracle's error message was a terse "Invalid argument" and didn't tell me the location of the problem.

So I threw together a quick procedure that uses the LAST_ERROR_POSITION function in DBMS_SQL to help me out. As you can see, it's easy enough to plug into any routine: you just pass in the query and it will parse it, locate the position of the error and let you know. Obviously this example is trivial, but with a SQL statement of a couple of dozen lines it is pretty handy.

declare
v_query varchar2(2000);
rc_dummy sys_refcursor;
PROCEDURE p_dynamic_sql_err_pos (p_query IN VARCHAR2) IS
v_cursor NUMBER;
v_temp VARCHAR2(32000) := p_query;
v_tab DBMS_SQL.VARCHAR2S;
v_ind NUMBER := 1;
begin
-- Pass in a query and it will be parsed and the location
-- of any error will be reported
v_cursor := dbms_sql.open_cursor;
IF length(v_temp) > 2000 THEN
WHILE nvl(length(v_temp),0) > 0 LOOP
v_tab(v_ind) := substr(v_temp,1,250);
v_temp := substr(v_temp,251);
v_ind := v_ind + 1;
END LOOP;
dbms_sql.parse(v_cursor, v_tab, 1, v_ind - 1, FALSE, 1);
ELSE
dbms_sql.parse(v_cursor, v_temp, 1);
END IF;
dbms_sql.close_cursor(v_cursor);
exception
WHEN OTHERS THEN
v_ind := dbms_sql.last_error_position;
-- Tidy up the cursor before re-raising
IF dbms_sql.is_open(v_cursor) THEN
dbms_sql.close_cursor(v_cursor);
END IF;
v_temp := substr(sqlerrm,1,100)||' before '||
substr(p_query,v_ind,60)||' ('||v_ind||') ';
raise_application_error(-20001,v_temp);
end;
begin
v_query := 'select decode(1) from dual';
begin
open rc_dummy for v_query;
exception when others then
p_dynamic_sql_err_pos(v_query);
end;
close rc_dummy;
end;

Friday, September 16, 2005

Tracking my blog's "customers"

Peter Scott's blog recently looked at problems with identification of customers, and the first response remarked on how much harder it is with internet 'click' customers as opposed to 'real' people walking about. I'm going to talk about what I know, or think I know, about my blog's customers.

A blog's customers are its readers and despite (or because of) being new, I've kept an eye on my site stats to try to get a feel for my customers. Since I left my SiteMeter public, you can see them too, just by clicking on the link to SiteMeter at the bottom of the blog or here [go on, have a look at what I can tell about you :) ].

My item on Jonathan Lewis's book got triple the traffic of any other page, though I don't think he should get too excited since we are still talking double figures here. But for me, trying to understand my readers, it is interesting. Most of my visitors, where I can tell, have come from orablogs or a similar blog aggregator. Although there were links about the book on AskTom's front page and Jonathan's own page, neither featured in an RSS feed, so it was probably news to most readers. Of course, I've got no idea how many readers OraBlogs has, so I am probably judging by too small a sample.

Since my previous internet activities were in forums, my blog extended from there, being mostly technical and resembling answers to questions which hadn't actually been asked. While I'll still do that, the 'Jonathan Lewis' spike gave me a feel for my 'customers', suggesting that blogging, and being syndicated by OraBlogs, is more like having a chat in the pub (or coffee-shop) with your workmates.

The Oracle WTF [http://oracle-wtf.blogspot.com] definitely has that "You'll never believe what I saw today..." feel, and other blogs have newsy items such as what has been released, who is going where and so forth. I'm therefore going to try to throw some more non-technical talking points out and see what happens.


PS.
Judging by my SiteMeter data, especially with my relatively small traffic figures, it's feasible to track down details of readers, and especially those who leave comments. I wanted a counter at least to pick up hit numbers and was a bit surprised by the amount of information it recorded, especially the referrer details. I think I can turn the referrer details off and/or remove public access from my site traffic. However I don't want to get into some of the heated discussions that can go on, and I don't expect people to post anything that they wouldn't say 'aloud'. Since the information is recorded, I figured I might as well let people see what is collected about them. My own 'Freedom of Information' policy, if you like.

I'm open to persuasion though ironically you'd need to leave a comment here to voice your argument. Obviously you can use one of the anonymiser sites, and I think the referrer bit can be blocked by stopping Javascript and/or following the instructions here for Mozilla/Firefox.

Wednesday, September 14, 2005

Jonathan Lewis book Cost Based Oracle due Nov

This looks like a book worth having if you really need to understand how to deal with Oracle's CBO.
Nothing on Amazon or the Apress website yet, so I don't know the price. Unfortunately, there isn't an indication of size either. The optimizer is one of those subjects which could, if demonstrations and detailed analysis are shown, easily run to a thousand pages. While that information is valuable, it also puts up the cost and hence the price, which will limit its affordability. Hopefully any such detail has been consigned to the zip-file of scripts mentioned.

The technical reviewers were Wolfgang Breitling and Christian Antognini. While I hadn't previously encountered any of Christian's work, I followed the links and found one of his presentations and it's worth a read. I've referred to Wolfgang's work in my post on 10053 and judging by previous papers, I'd be confident that anything he had technically reviewed would be sound.

I do like the teaser in the preface - "The preface exists to tell you what’s not in Volume 1.".

Views, DB Links, the CBO and indexes

This builds on my previous post, and I'll use the same LOTS_OF_PEOPLE table for my demonstration.

Firstly, I'll create a view that does nothing but rearrange the columns of a table. A single-table view might be used where you want a group of users to see only some columns in a table, or where you want to include a pseudo-column derived from one or more of the other columns in the table.

create view lp_view as select first_name, surname, lp_id, gender from lots_of_people;
Again, I'll use a DB Link and the local DUAL table to ensure the query is split between the local and remote databases.


EXPLAIN PLAN SET STATEMENT_ID = 'TEST1' INTO PLAN_TABLE FOR
select lp_id from lots_of_people@loop, (select 1 from dual where rownum = 1) d
where lp_id = 5;

select level, lpad('_',1*(level-1),'_')||operation||'('||options||')' operation,
object_owner||'.'||object_name obj_name,
ltrim(to_char(cardinality,'999,999,999')) no_rows, object_node
link, id, parent_id, other, filter_predicates
from PLAN_TABLE
start with id=0 and statement_id = 'TEST1'
connect by prior id = parent_id and prior statement_id = statement_id
and prior timestamp <= timestamp order by id, position;

LEVEL OPERATION OBJ_NAME NO_ROWS LINK ID PARENT_ID
----- ---------------------- --------- ------- ---- ---- ----------
OTHER FILTER_PRE
---------------------------------------------------------------------- ----------
1 SELECT STATEMENT() . 1 0
2 _NESTED LOOPS() . 1 1 0
3 __REMOTE() . 1 LOOP 2 1
SELECT "LP_ID" FROM "LOTS_OF_PEOPLE" "LOTS_OF_PEOPLE" WHERE "LP_ID"=5
3 __VIEW() USERXX. 1 3 1
4 ___COUNT(STOPKEY) . 4 3
ROWNUM=1
5 ____TABLE ACCESS(FULL) SYS.DUAL 8,168 5 4

I've queried the plan table directly to show the OTHER column, containing the query sent to the remote database, which isn't shown by DBMS_XPLAN in 9.2.0.4.
You can see that the optimizer expects one row to be returned by the query on LP_ID. That indicates that it knows about the unique index on that column.

In contrast, we can see a similar query using the remote view

EXPLAIN PLAN SET STATEMENT_ID = 'TEST2' INTO PLAN_TABLE FOR
select lp_id from lp_view@loop, (select 1 from dual where rownum = 1) d
where lp_id = 5;

select level, lpad('_',1*(level-1),'_')||operation||'('||options||')' operation,
object_owner||'.'||object_name obj_name,
ltrim(to_char(cardinality,'999,999,999')) no_rows, object_node
link, id, parent_id, other, filter_predicates
from PLAN_TABLE
start with id=0 and statement_id = 'TEST2'
connect by prior id = parent_id and prior statement_id = statement_id
and prior timestamp <= timestamp order by id, position;

LEVEL OPERATION OBJ_NAME NO_ROWS LINK ID PARENT_ID
----- ------------------------- --------- ------- ---- ---- ----------
OTHER FILTER_PRE
------------------------------------------------------ ----------
1 SELECT STATEMENT() . 825 0
2 _MERGE JOIN(CARTESIAN) . 825 1 0
3 __VIEW() USERXX. 1 2 1
4 ___COUNT(STOPKEY) . 3 2
ROWNUM=1
5 ____TABLE ACCESS(FULL) SYS.DUAL 8,168 4 3
3 __BUFFER(SORT) . 825 5 1
4 ___REMOTE() . 825 LOOP 6 5
SELECT "LP_ID" FROM "LP_VIEW" "LP_VIEW" WHERE "LP_ID"=5
Here it expects 825 rows, or 1% of the table, to be returned by the remote query. The index on the table doesn't show up against the view, so its information is unavailable.
Both queries in the 'OTHER' column will be passed to the remote database for parsing, though, and there the view will be resolved against the table and the index can be used. However the fact that the local database doesn't know about the index can mean that the best plan is missed.
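One workaround, as demonstrated with the DRIVING_SITE hint in the previous post, is to push execution to the remote site, where the view is resolved against the table and the index statistics are visible. This is a sketch only; whether it is the right choice depends on where the bulk of the data lives and how much must travel over the link.

```sql
-- Sketch: ask the optimizer to execute at the remote site, where
-- LP_VIEW resolves to LOTS_OF_PEOPLE and its unique index is known.
select /*+ DRIVING_SITE(v) */ lp_id
from   lp_view@loop v,
       (select 1 from dual where rownum = 1) d
where  v.lp_id = 5;
```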

Finally, I'll include an extract of the 10053 dump file which, for the table, shows the LP_ID column with no nulls and the same number of distinct values as the table's cardinality, plus an index on the column. The view shows no such information.

***********************
Table stats Table: LOTS_OF_PEOPLE Alias: LOTS_OF_PEOPLE
TOTAL :: CDN: 82522 NBLKS: 496 AVG_ROW_LEN: 37
Column: LP_ID Col#: 1 Table: LOTS_OF_PEOPLE Alias: LOTS_OF_PEOPLE
NDV: 82522 NULLS: 0 DENS: 1.2118e-05 LO: 1 HI: 82522
NO HISTOGRAM: #BKT: 1 #VAL: 2
-- Index stats
INDEX NAME: 0 COL#: 2 3
TOTAL :: LVLS: 2 #LB: 485 #DK: 78007 LB/K: 1 DB/K: 1 CLUF: 14253
INDEX NAME: 0 COL#: 1
TOTAL :: LVLS: 1 #LB: 154 #DK: 82522 LB/K: 1 DB/K: 1 CLUF: 489
_OPTIMIZER_PERCENT_PARALLEL = 0
***************************************
SINGLE TABLE ACCESS PATH
TABLE: LOTS_OF_PEOPLE ORIG CDN: 82522 ROUNDED CDN: 1 CMPTD CDN: 1
Access path: tsc Resc: 32 Resp: 32
Access path: index (iff)
Index: 0
TABLE: LOTS_OF_PEOPLE
RSC_CPU: 0 RSC_IO: 11
IX_SEL: 0.0000e+00 TB_SEL: 1.0000e+00
Access path: iff Resc: 11 Resp: 11
Access path: index (unique)
Index: 0
TABLE: LOTS_OF_PEOPLE
RSC_CPU: 0 RSC_IO: 1
IX_SEL: 1.0000e+00 TB_SEL: 1.0000e+00
Access path: index (eq-unique)
Index: 0
TABLE: LOTS_OF_PEOPLE
RSC_CPU: 0 RSC_IO: 1
IX_SEL: 0.0000e+00 TB_SEL: 0.0000e+00
One row CDN: 1
BEST_CST: 2.00 PATH: 3 Degree: 1
BEST_CST: 8.00 PATH: 2 Degree: 1

***********************
Table stats Table: LP_VIEW Alias: LP_VIEW
TOTAL :: CDN: 82522 NBLKS: 496 AVG_ROW_LEN: 37
_OPTIMIZER_PERCENT_PARALLEL = 0
***************************************
SINGLE TABLE ACCESS PATH
Column: LP_ID Col#: 3 Table: LP_VIEW Alias: LP_VIEW
NO STATISTICS (using defaults)
NDV: 2579 NULLS: 0 DENS: 3.8778e-04
NO HISTOGRAM: #BKT: 0 #VAL: 0
TABLE: LP_VIEW ORIG CDN: 82522 ROUNDED CDN: 825 CMPTD CDN: 825
Access path: tsc Resc: 32 Resp: 32
BEST_CST: 32.00 PATH: 2 Degree: 1
BEST_CST: 8.00 PATH: 2 Degree: 1


Tuesday, September 13, 2005

Views, DB Links and the CBO

Two exceptions to some general conceptions about Oracle's CBO, both related to DB Links.
Firstly, that the order of the tables in the FROM clause is no longer relevant.
Secondly, that selecting from a view and from the underlying query are the same thing.

When a query involves objects held in a local database and on a remote database over the database link, Query Optimization has another level of complexity. Simply put, neither database has all the information, and so there is more guesswork involved.

Here's the setup :

drop table lots_of_people;
drop table lookup;

create table lookup (lookup_code varchar2(2) constraint l_pk primary key, lookup_desc varchar2(30));

create table lots_of_people
(lp_id number constraint p_pk primary key, surname varchar2(30), first_name varchar2(30),
gender varchar2(2) constraint lk_fk references lookup);

insert into lookup values ('M','MALE');
insert into lookup values ('F','FEMALE');

insert into lots_of_people (lp_id, surname, first_name, gender)
select rownum, table_name, column_name, decode(mod(rownum,2),1,'M','F')
from dba_tab_columns;

create index lp_ix on lots_of_people (surname, first_name);

BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname => USER,tabname => 'LOOKUP',estimate_percent => 100,
method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY', CASCADE=>TRUE);
DBMS_STATS.GATHER_TABLE_STATS(ownname => USER,tabname => 'LOTS_OF_PEOPLE',estimate_percent => 100,
method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY', CASCADE=>TRUE);
END;
.
/

select table_name, num_rows from user_tables
where table_name in ('LOOKUP','LOTS_OF_PEOPLE')
/

TABLE_NAME NUM_ROWS
------------------------------ ----------
LOOKUP 2
LOTS_OF_PEOPLE 82522

create view lots_of_people_vw1
as select lp.lp_id, lp.surname, lp.first_name, l.lookup_desc gender_desc
from lots_of_people lp, lookup l
where l.lookup_code = lp.gender;

create view lots_of_people_vw2
as select lp.lp_id, lp.surname, lp.first_name, l.lookup_desc gender_desc
from lookup l, lots_of_people lp
where l.lookup_code = lp.gender;

create database link loop connect to **USER** identified by **PASSWORD** using '**ORIGDB**';

Note: The two views differ only in the order of the tables in the FROM clause.
I used a database link back to the same user and database so I didn't need to involve either two users or two databases.

Now we'll try a query that can be fully resolved at the remote database. When no local objects are involved, the local database will throw the problem over to the remote database.

select * from lots_of_people_vw1@loop
where surname = 'PLAN_TABLE';

EXPLAIN PLAN SET STATEMENT_ID = 'TEST1' INTO PLAN_TABLE FOR
select * from lots_of_people_vw1@loop
where surname = 'PLAN_TABLE';

SELECT * FROM TABLE(dbms_xplan.display('PLAN_TABLE','TEST1') );

----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Inst |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT REMOTE | | 61 | 2745 | 6 (0)| |
|* 1 | HASH JOIN | | 61 | 2745 | 6 (0)| |
| 2 | TABLE ACCESS BY INDEX ROWID| LOTS_OF_PEOPLE | 61 | 2257 | 3 (0)| ORIGDB |
|* 3 | INDEX RANGE SCAN | LP_IX | 61 | | 3 (0)| ORIGDB |
| 4 | TABLE ACCESS FULL | LOOKUP | 2 | 16 | 2 (0)| ORIGDB |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("L"."LOOKUP_CODE"="LP"."GENDER")
3 - access("LP"."SURNAME"='PLAN_TABLE')

Note: fully remote operation
That gives a sensible result.
Now try it using both a local and a remote table which, without a DRIVING_SITE hint, will generally be resolved locally. I've used dual as the local table. From the point of view of the demonstration, the part of the plan(s) relating to dual can be ignored.

EXPLAIN PLAN SET STATEMENT_ID = 'TEST2' INTO PLAN_TABLE FOR
select * from lots_of_people_vw1@loop, dual d
where surname = 'PLAN_TABLE';

SELECT * FROM TABLE(dbms_xplan.display('PLAN_TABLE','TEST2') );

------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Inst |IN-OUT|
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 163 | 1630 | 10 | | |
| 1 | MERGE JOIN CARTESIAN| | 163 | 1630 | 10 | | |
| 2 | REMOTE | | 1 | 8 | 2 | LOOP | R->S |
| 3 | BUFFER SORT | | 8168 | 16336 | 8 | | |
| 4 | TABLE ACCESS FULL | DUAL | 8168 | 16336 | 8 | | |
------------------------------------------------------------------------------------

Note: cpu costing is off

Here it thinks only 1 row will typically be returned from the remote operation.
Now using the other view (which differed only in the table order in the FROM clause).

EXPLAIN PLAN SET STATEMENT_ID = 'TEST3' INTO PLAN_TABLE FOR
select * from lots_of_people_vw2@loop, dual d
where surname = 'PLAN_TABLE';

SELECT * FROM TABLE(dbms_xplan.display('PLAN_TABLE','TEST3') );

------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Inst |IN-OUT|
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6740K| 250M| 6632 | | |
| 1 | MERGE JOIN CARTESIAN| | 6740K| 250M| 6632 | | |
| 2 | REMOTE | | 825 | 30525 | 32 | LOOP | R->S |
| 3 | BUFFER SORT | | 8168 | 16336 | 6600 | | |
| 4 | TABLE ACCESS FULL | DUAL | 8168 | 16336 | 8 | | |
------------------------------------------------------------------------------------

Note: cpu costing is off

Now it has decided on 825 rows (or 1% of the total number of rows in the LOTS_OF_PEOPLE table).

Finally, with the driving site hint.
EXPLAIN PLAN SET STATEMENT_ID = 'TEST4' INTO PLAN_TABLE FOR
select /*+ DRIVING_SITE (l) */ * from lots_of_people_vw1@loop l, dual d
where l.surname = 'PLAN_TABLE';

SELECT * FROM TABLE(dbms_xplan.display('PLAN_TABLE','TEST4') );

-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Inst |IN-OUT|
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT REMOTE | | 500K| 22M| 22 (0)| |
|* 1 | HASH JOIN | | 500K| 22M| 22 (0)| |
| 2 | TABLE ACCESS BY INDEX ROWID| LOTS_OF_PEOPLE | 61 | 2257 | 3 (0)| ORIGDB |
|* 3 | INDEX RANGE SCAN | LP_IX | 61 | | 3 (0)| ORIGDB |
| 4 | MERGE JOIN CARTESIAN | | 16336 | 159K| 18 (0)| |
| 5 | TABLE ACCESS FULL | LOOKUP | 2 | 16 | 2 (0)| ORIGDB |
| 6 | BUFFER SORT | | 8168 | 16336 | 16 (0)| |
| 7 | REMOTE | | 8168 | 16336 | | ! |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("L"."LOOKUP_CODE"="LP"."GENDER")
3 - access("LP"."SURNAME"='PLAN_TABLE')

Note: fully remote operation
Back to a sensible plan.

Finally, using the query itself rather than a view :

EXPLAIN PLAN SET STATEMENT_ID = 'TEST5' INTO PLAN_TABLE FOR
select *
from (select lp.lp_id, lp.surname, lp.first_name, l.lookup_desc gender_desc
from lots_of_people@loop lp, lookup@loop l
where l.lookup_code = lp.gender) l, dual d
where l.surname = 'PLAN_TABLE';

SELECT * FROM TABLE(dbms_xplan.display('PLAN_TABLE','TEST5') );
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Inst |IN-OUT|
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 249K| 11M| 21 (0)| | |
|* 1 | HASH JOIN | | 249K| 11M| 21 (0)| | |
| 2 | REMOTE | | 15 | 555 | | LOOP | R->S |
| 3 | MERGE JOIN CARTESIAN| | 16336 | 159K| 18 (0)| | |
| 4 | REMOTE | | 2 | 16 | | LOOP | R->S |
| 5 | BUFFER SORT | | 8168 | 16336 | 16 (0)| | |
| 6 | TABLE ACCESS FULL | DUAL | 8168 | 16336 | 8 (0)| | |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("L"."LOOKUP_CODE"="LP"."GENDER")

[Looking at the 'OTHER' column in the plan table shows that ID 2 relates to the LOTS_OF_PEOPLE component and ID 4 is the LOOKUP component].

In summary, when presented with an object in a remote database, the local database doesn't have much idea about it. With a remote table, the remote database will supply some information on the number of rows and indexes. When it comes down to views, the remote database gives a 'guess', which appears to be derived from the number of rows in the last table mentioned in the FROM clause (1% of it, given the equality predicate here).

In my next post, I'll look at the 10053 plan for remote views and discuss the repercussions for index usage.

Ashes and my home PC woes. Nothing really Oracley

It's already been mentioned here, but being a Pom living in Australia, I have to mention the Ashes victory too. The urn holding the Ashes stays in England whatever happens. Ironically, it made one trip to Australia in 1988, after England last won the trophy in 1986/87, and now that England have won it again, it is expected to make another journey for 2006/07.

Meanwhile American readers are wondering how a game can go on for five days and STILL finish without a result. And anyone who has experienced an English summer wonders how the English could come up with a game that actually takes five days of fine weather to play anyway.
---------------------------------------------------------------------------

A few months ago, my home PC got an overhaul. The main bits were a new 80Gb hard drive, more memory and a switch from Windows 98 to XP. All in all it went pretty well (except for a new firewire card which won't talk to my video camera).


Then two weeks ago, the PC got stroppy and said it wouldn't talk to my new hard drive without an 80-wire cable. Apparently the 40-wire cable that I had for my old 4Gb hard drive wasn't good enough any more. A web search indicated that the 80-wire cable, while not mandatory, was faster, and since it was only $5 it seemed a cheap enough fix. Swapped the cable, and the PC seemed fine.


I wanted Linux on the machine for many reasons. One of which was that the install of Oracle 10g on XP didn't go well (possibly because I refused to let java.exe through the firewall, possibly because I try to rush these things in the hour or so between getting the toddler asleep and falling asleep myself). But hearing of someone who had his internet savings account hacked, and that Windows really needs twenty gazillion varieties of spyware checker to be safe, I moved up my plans.


I didn't think far enough ahead when installing XP, so my 80Gb hard drive had two partitions taking up the first 30Gb of the drive. Apparently Linux likes being installed, at least in part, within the first 8.5Gb. So I created a third partition, moved all the Windows stuff from the second partition to the third, and wiped the second one to give Linux a clean slot within the first 8.5Gb and another slot at the end of the drive. I also (finally) moved a load of old pictures and documents from the Windows 98-era 4Gb drive to this new Windows partition, and wiped that old drive.


About then, the PC started throwing another wobbler, this time reporting a Disk Boot Failure.

The XP disks said they couldn't find an XP installation to repair.


So I tried to see if I could trick it and get Linux installed (at least to get access to my files again). Ubuntu went in quite happily and recognised that I also had Windows XP installed. Shame Windows XP couldn't find it. And it also recognised the existing NTFS partitions and was quite happy to create ext3 partitions for linux in the spaces between.


It was still coming back with Disk Boot Failure. However I gave it a night's rest powered off. Last night I rebooted and the GRUB menu came up and both Ubuntu and Windows XP are running fine. Hopefully I'll be able to get Oracle installed and running sometime soon.


PS.

Before anyone asks why I want Windows XP, my three-year-old is already practising mouse and keyboard skills with Winnie the Pooh, Bob the Builder and some Matchbox FireTruck/Police Car game.


Thursday, September 08, 2005

Atom/RSS Feeds from blogger.com

I just noticed that, although BLOGGER.COM makes it pretty easy to set up an Atom feed for feed readers, it didn't automatically include a link to it in the template.
So I've added the link (and another that gets processed through Feedburner). If you are looking for a feed from a blogger here, just try adding /atom.xml to their main URL and see if one exists.

I've just started using FeedReader and it's a lot easier than running through a list of bookmarks to see if anything is new.

Analytics with ORDER BY and DISTINCT STRAGG

A recent request in a help forum required an SQL query that would return rows until 5 distinct values of a specific column had been returned.

Using analytics was proving difficult as the DISTINCT keyword wouldn't work with an ORDER BY in the analytic clause. While a non-analytic solution was found, I offered an analytic SQL that used a user-defined aggregate. This is similar to STRAGG but eliminates duplicates before returning the result string. [The elimination process uses 'associative arrays' - the old INDEX BY tables, but with a VARCHAR2 index. These were new in 9iR2, so in earlier versions you'd need to develop your own sort routine.]


create or replace type str_dist_agg_type as object
(
total varchar2(4000),
static function
ODCIAggregateInitialize(sctx IN OUT str_dist_agg_type )
return number,
member function
ODCIAggregateIterate(self IN OUT str_dist_agg_type ,
value IN varchar2 )
return number,
member function
ODCIAggregateTerminate(self IN str_dist_agg_type,
returnValue OUT varchar2,
flags IN number)
return number,
member function
ODCIAggregateMerge(self IN OUT str_dist_agg_type,
ctx2 IN str_dist_agg_type)
return number,
member function
ODCIAggregateDelete(self IN OUT str_dist_agg_type,
value in varchar2)
return number
)
/

create or replace type body str_dist_agg_type
is
--
static function ODCIAggregateInitialize(sctx IN OUT str_dist_agg_type)
return number
is
begin
sctx := str_dist_agg_type( null );
return ODCIConst.Success;
end;
--
member function ODCIAggregateIterate(self IN OUT str_dist_agg_type,
value IN varchar2 )
return number
is
begin
--Use chr(7) as an internal separator because, since
--it is a pretty useless control character, it is unlikely
--to be in the input data or cause any internal problems.
self.total := self.total || value || chr(7);
return ODCIConst.Success;
end;
--
member function ODCIAggregateTerminate(self IN str_dist_agg_type,
returnValue OUT varchar2,
flags IN number)
return number
is
type tab_dist is table of number index by varchar2(4000);
t_dist tab_dist;
v_point number := 0;
v_str varchar2(4000);
v_sep varchar2(1) := chr(7);
v_ret varchar2(4000);
begin
v_str := rtrim(ltrim(self.total,v_sep),v_sep);
v_point := instr(v_str,v_sep);
while nvl(v_point,0) != 0 loop
t_dist(substr(v_str,1,v_point-1)) := 1;
v_str := ltrim(substr(v_str,v_point),v_sep);
v_point := instr(v_str,v_sep);
end loop;
--Guard against a null total (e.g. no rows aggregated); a null
--key would raise VALUE_ERROR on an associative array.
if v_str is not null then
t_dist(v_str) := 1;
end if;
--
v_str := t_dist.first;
while v_str is not null loop
v_ret := v_ret || v_str;
v_str := t_dist.next(v_str);
end loop;
returnValue := v_ret;
return ODCIConst.Success;
end;
--
member function ODCIAggregateDelete(self IN OUT str_dist_agg_type,
value in varchar2)
return number
is
v_point number;
begin
v_point := instr(self.total,value||chr(7));
if v_point > 0 then
self.total := substr(self.total,1,v_point-1)||
substr(self.total, v_point+length(value)+1);
end if;
return ODCIConst.Success;
end;
--
member function ODCIAggregateMerge(self IN OUT str_dist_agg_type,
ctx2 IN str_dist_agg_type)
return number
is
begin
--Iterate leaves a trailing chr(7) on each half, so the extra
--separator here at worst doubles up; the LTRIM calls in
--Terminate strip any run of separators.
self.total := self.total || chr(7) || ctx2.total;
return ODCIConst.Success;
end;
end;
/

CREATE or replace
FUNCTION strdagg(input varchar2 )
RETURN varchar2
PARALLEL_ENABLE AGGREGATE USING str_dist_agg_type;

grant execute on strdagg to public;
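
Since it is declared as a normal aggregate, STRDAGG also works with a plain GROUP BY, not just analytically. A quick sketch against the data dictionary (the output naturally depends on your schema):

select table_name,
strdagg(data_type || ',') data_types
from user_tab_columns
where table_name in ('CHILD','PARENT')
group by table_name;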

Note : STRDAGG uses the CHR(7) character internally as a delimiter, so if that character appears in the data you are likely to see problems. No 'visible' delimiter is added between values, so if you want one you'll need to include it as part of the expression :

select table_name, column_name, strdagg(data_type||',') over
(partition by table_name order by table_name, column_id) data_types
from user_tab_columns
where table_name in ('CHILD','PARENT')

TABLE_NAME COLUMN_NAME DATA_TYPES
---------- --------------- --------------------
CHILD ID NUMBER,
CHILD PARENT_ID NUMBER,
CHILD DESCRIPTION NUMBER,VARCHAR2,
PARENT ID NUMBER,
PARENT DESCRIPTION NUMBER,VARCHAR2,

I've put all the 'tricky' processing in the TERMINATE function so it can be easily amended (e.g. if you wanted to return the number of duplicates for each value or have a case-insensitive sort).

(If you are interested in the original problem and the non-analytical solution, it is in the Quest Pipelines' PL/SQL forum entitled 'Tricky SQL Challenge' in the 'Writing and Tuning SQL' conference.)
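
To close the loop on the original problem ('return rows until 5 distinct values of a column have been returned'), here's a sketch of the analytic approach using STRDAGG. The table T and columns ID and VAL are hypothetical, and it assumes VAL contains no commas, since the commas in the running string are counted to get the running number of distinct values:

select id, val
from (select id, val,
strdagg(val || ',') over (order by id) running_vals
from t)
where length(running_vals)
- nvl(length(replace(running_vals, ',')), 0) <= 5
order by id;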