Wednesday, December 20, 2006

11g PL/SQL enhancements presentation

While a number of bloggers have reported on what they heard at conferences regarding 11g, in many cases the presentations by Oracle employees have not been made available. I believe this is because Oracle does not want to be seen to be committing to ANYTHING.
While doing an information hunt, I came across the NYOUG presentations here, which include the PDF of a Tom Kyte presentation on 11g PL/SQL enhancements.

If there is a policy of not releasing these presentations, there's a chance Oracle will ask for it to be removed from the site, so best "get it while it's hot".

My ratings are:
  • DML in triggers is faster - 2. We are putting that code into packages anyway, aren't we ?
  • Fine Grained Dependency Tracking - 8. Very useful for impact analysis.
  • Native compilation - ?. Depends on whether there is a significant performance benefit.
  • Intra-unit inlining - 5. Anything that allows us to write more readable code with no performance hit is bound to be good.
  • Function result cache - 7. Looks good, but we'll have to watch that RELIES clause (see the sketch after this list).
  • SQL result cache - ?. If it is a straight query with no variables, isn't it a materialized view ? Wonder how it works with variables and/or SYS_CONTEXT.
  • Compound trigger - ?. Sounds good, but I can see people trying to put validation in there that doesn't cater for concurrent table updates.
  • Dynamic SQL - 3. I don't see much use of it, but anything that makes it more practical to keep the SQL in the database and away from those Java developers must be good :)
  • Fine-grained access control for UTL_FILE/UTL_HTTP/... - 8. More security GOOD.
  • Regular expression enhancements - ?. Not something I've used.
  • Super - ?. Also not something I'm familiar with.
  • Read only table - 2. Check out the Dizwell article on this. Is it really Read Only (and could we apply it to individual partitions ?)
  • Disabled trigger - 1. Don't we test these things ?
  • Trigger firing order - 1. Do we need to make triggers MORE complicated ?
  • "When others without raise" warning - 7.
  • Direct use of sequence.nextval in PL/SQL - 7.
  • "Continue" command for loops - 2. Spaghetti code.
  • Named parameter referencing for functions called from SQL - 7. Readability improvement.
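
For the result caches, here's roughly what the syntax from the presentation looks like for a cached PL/SQL function. The table and column names are made up, and all of this could change before 11g actually ships:

-- Hypothetical lookup function. The result cache should mean repeat calls with
-- the same department id come straight from memory, and the RELIES_ON clause
-- tells Oracle to discard cached results when the DEPARTMENTS table changes.
create or replace function get_dept_name (p_dept_id number)
  return varchar2
  result_cache relies_on (departments)
is
  v_name varchar2(30);
begin
  select dname into v_name
  from   departments
  where  dept_id = p_dept_id;
  return v_name;
end;
/

-- The SQL result cache appears to be just a hint
select /*+ RESULT_CACHE */ dept_id, count(*)
from   employees
group by dept_id;
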
So fine-grained dependency tracking is the big ticket item as far as I'm concerned. It has the added advantage that, as soon as 11g is out, you can copy your database structure to a development copy of 11g and get that impact analysis benefit long before you upgrade your production environment.

Tuesday, December 19, 2006

A funny place, Australia

At the weekend, I had to cast my vote in a local election arising from the death of one of the councillors. I say "had to" because voting is compulsory here.

"VOTING AT THIS ELECTION IS COMPULSORY FOR ALL ELECTORS WHOSE PLACE OF LIVING IS WITHIN ..... Every elector who fails to vote without a sufficient reason will be liable to a penalty not exceeding $110.00."

Coming from the UK (where you get a little polling card to present at a polling booth), when I attended my first election as an Australian I was quite surprised that all I had to do was tell them my name and address. No ID check, no secret check of birth dates or mother's maiden name or anything else. Trust but don't verify.

The man just crosses your name off his list. That's so you don't try voting again. Except there were 11 polling places I could attend, each with their own lists. So it is quite possible to "vote early, vote often".

There is a cross check after the election between the polling place lists so they can fine people who DON'T vote. If you appear to have voted several times, you'll get a "PLEASE EXPLAIN" letter but since they don't check id, they can't really do anything about multiple votes.

Strange, but it all seems to work.

Monday, December 11, 2006

NULL is my middle name

Eddie Awad has linked to one of the perennial database discussions, the role of NULLs in the database.

My take on this is coloured by the fact that NULL is my middle name. Or to be more precise, my parents chose not to give me a middle name. So let me make it quite clear that NULL is NOT "Unknown". I know for a fact, backed up by my birth certificate, that I do not have a middle name. Any application that deals with my name needs to get used to that fact, because it is not going to change. I do not want to be addressed as Mr G. M. Yers, or Mr G. . Myers or Mr G. ?. Myers. The vast majority of computer applications don't have a problem with my lack of middle name. The world has not fallen in because of my parents' decision.

So if NULL isn't "Unknown", then what is it ? Simply speaking, NULL is as NULL does. It is ignored by aggregate functions (count, min, max etc), it is ignored by most predicates (=, IN, LIKE etc) and, in fact, ignored anywhere that doesn't explicitly cater for it. And that is the key to its usage. You allow a column to be null when you want null values to be processed in the way Oracle processes them.
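
A quick demonstration, with the table and names invented purely for illustration:

-- COUNT(middle_name) skips the NULL row, and neither the = nor the != predicate
-- returns it; only IS NULL finds it.
create table people (first_name varchar2(30), middle_name varchar2(30));
insert into people values ('Gary', null);
insert into people values ('John', 'Paul');

select count(*) total_rows, count(middle_name) with_middle from people;

select first_name from people where middle_name = 'Paul';   -- John only
select first_name from people where middle_name != 'Paul';  -- no rows, not even Gary
select first_name from people where middle_name is null;    -- the only way to find Gary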

Null-haters tend to list a bunch of cases where the NULL makes no sense and extend that to say that NULL should be abolished. That assumes NULL is a data problem. It isn't. It is a business issue, and one of several presented by 'extreme' data. A company decides to pay a bonus based on the percentage increase in sales value by individual salesmen between quarters. But Fred has only been there a month, so didn't have sales for the last quarter. Does he get a bonus, and if so, how much ? That is a business decision, not a data one. It doesn't matter whether we have recorded a null or a zero against Fred's sales for a quarter, or whether the row doesn't exist, or if we have an exotic datatype that holds either a number or a 'weird data flag'.

There are undoubtedly cases where a NULL makes no sense. We are allowed to define columns as NOT NULL (and even put in other check constraints to ensure they only have acceptable values), and these data integrity constraints should be used wherever possible. The question is not just whether we should allow SALES_VALUE to be null, but whether we should allow it to hold negative values, sub-cent (or penny) fractions, values over a billion....

And my last word is, if you abolish NULLs, then I'm going to get a middle name that is at least 2001 characters long (and includes some hieroglyphics not seen since the pyramids were finished), and that will really sod up your database !

Friday, November 10, 2006

FULL SCANs must be bad, SQL Developer Says so.

I've downloaded the evaluation edition of SQL Developer 1.1.
It's great that I don't have to set my NLS_DATE_FORMAT every time I connect.
BUT, one of the changes to the Explain Plan is that some operations are being coloured red, including Cartesian Joins and Full Table Scans.

1.0 was fine. Nice, simple, black and white.

Using Explain Plan, Oracle tells us the optimal plan that the Optimizer came up with for the query we gave it. It doesn't tell us the second best plan, and keep the best hidden away. It doesn't tell us the plans it rejected. It doesn't tell us "But you'd do better with a TO_CHAR on this to make use of an index" or "The stats indicate you don't have any nulls in this column, but it does allow nulls so I'm having to avoid this index". If you asked Oracle to join a table to itself on the primary key, the Explain Plan still wouldn't say "Are you just plain stupid ?" In short, the Explain Plan does NOT tell us what's wrong with an SQL.

10G has introduced features to make suggestions about how to improve SQLs, for example suggesting new or changed indexes. Those features are not part of the explain plan. The Explain Plan describes what the optimizer decided was the best access path, with the cardinality and costs giving a (vague) suggestion as to why it might have decided that. The 10053 event can be used to obtain very detailed trace information on why a particular plan was chosen ahead of other plans.
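
For reference, switching the 10053 trace on and off is just an ALTER SESSION. The query here is made up, and as always check the syntax before relying on it:

-- Trace the optimizer's decisions for statements parsed in this session
alter session set events '10053 trace name context forever, level 1';

explain plan for
select * from emp where deptno = 10;  -- any statement that gets optimized

alter session set events '10053 trace name context off';
-- The trace file turns up in user_dump_dest on the server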

A developer (or DBA) can look at an Explain Plan, and (if desperate) the 10053 trace, and use their knowledge of the business, the application, the data and the requirements to see if Oracle has made any incorrect assumptions based on insufficient data.

Quite simply, you cannot look at an Explain Plan and, without any additional information, say "This plan is BAD". Worse is the concept of saying "This operation in this plan is bad". The plan is the sum of the operations. You can't just excise one and pretend that there's a magic way of doing that process for free. Even worse than that is saying "This operation is pretty much always a bad idea."

Sticking a highlighter over the Explain Plan is trying to turn the very complex process of query optimization into a child's game of follow-the-leader, and it won't work.

"FULL SCANs must be bad, SQL Developer Says so."
I say NO, and I'll keep a SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY) in my snippets folder.
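
For anyone who hasn't used it, the plain black-and-white version goes something like this (the query is invented for the example):

-- Put the plan for a statement into PLAN_TABLE, then format it with DBMS_XPLAN
explain plan for
select e.ename, d.dname
from   emp e, dept d
where  e.deptno = d.deptno;

select * from table(dbms_xplan.display);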

Sunday, November 05, 2006

Melbourne AUSOUG Day 2

I finished Thursday with Connor's presentation, and started Friday with another, 'Being a successful Oracle developer'. As I mentioned, I've got Connor's Mastering PL/SQL book, as well as Tom Kyte's Expert One-on-one, Effective Oracle Design and Expert Oracle Database Architecture. I also follow AskTom as well as I can. So I knew 90% of what he presented, but it was well structured, and ANYONE taking on an Oracle project would benefit by making everyone on the team attend this presentation. As soon as the conference papers become available, this one is getting printed out and attacked with a highlighter for use as a reference.

Marcel Kratochvil's talk veered wildly between AJAX and the PL/SQL preprocessor (or conditional compilation, if you prefer the term) but worked well. I like the concept of AJAX (and anything else that might mean I don't have to learn Java) and referring to Application Server as a legacy technology sounds interesting.

After lunch, I tried Anjo Kolk's talk on cursors and the shared pool, but it was a bit over my head, probably because I'm a developer not a DBA. I think I grasped the issue (that cursor caching isn't always appropriate) but not what I'm supposed to do about it.

Then Tony Jambu woke up everyone who attended his 'Database Hacking - Is your database safe ?' talk. I keep an eye on Pete Finnigan's blog and forum, so I wasn't as shocked, but probably just as scared. Hey, these databases have OUR credit card numbers !

Finally another Connor talk on 'Odds and Ends'. I was tempted by more on 11g Availability by Mark Townsend, but with 11g the best part of a year away, I figured I'll find out more details closer to the time.

There were other sessions I wish I could have attended. However I'll have to make do with the conference papers (or winning the lottery and flying to Perth for the West Coast version at the end of this week).

Melbourne AUSOUG Day 1

I've been quiet for the past few months, mostly due to a lack of time. However to try to kick things back off again, I'll do a report on my attendance at the AUSOUG conference in Melbourne last week.
This isn't on the same scale as OpenWorld, with attendees measured in the hundreds rather than tens of thousands, but it is the closest I'll come on my budget. Being a contractor, no-one else will pick up the tab for events like these. However it is an opportunity to find out what's happening in and around Oracle, and the conference does attract a few Oracle 'names'.

Mark Townsend presented on the Oracle database's future, and I was interested in hearing about 11g. Like OpenWorld, there was the disclaimer about 'what you see and hear isn't guaranteed to actually ever turn up', which is fair enough. However the big ticket item does seem to be the 'editioning feature' for upgrading applications without downtime. It will be interesting to see how that pans out, but also how we might need to cater for it from a configuration/change/version management point of view.

The other nice feature is the SQL and workload 'replay' that will allow for some decent performance testing/impact assessment.

For the next session, I considered 'Understanding Oracle RAC' in the DBA thread, but the session prerequisites included an....'understanding of RAC', so rather than get caught in a recursive loop I opted for the 'Rough Guide to Data Warehousing'. I may still be a bit lost on what an interconnect is, but I now know that a slowly changing dimension isn't something requiring Captain Kirk and a pointy-eared science officer.

After lunch, I went for Penny Cookson's Apex hands-on workshop. As an ex-Forms developer trying to keep away from Java, I like the sound of Apex, and this was a good introduction. I've still got half the workshop guide to finish off, and then it's just a matter of thinking of the snazzy little application I can build to make me into the next Youtube.

I finished off the day with a Connor McDonald talk. While I've got his book, I'd guessed from his 'co.uk' domain name that he was based in the UK. Apparently he works out of Perth in Western Australia. His presentation 'Once around the block' was an informative talk on the contents of the Oracle database block. From a developer perspective, I may not USE anything I learned there, but I LIKE knowing, for example, that the data block points directly to the 'undo' block that reverses the last change (which will in turn point at the one that reverses the previous change) and how the transactions are actually listed on the block.

Wednesday, June 14, 2006

Oracle XE is APC's cover girl

Oracle's finally made the big time. XE edition is featured on the cover disc of Australia's APC magazine this month. This isn't a mag for programmers, IT professionals or business decision makers, just a general "What's happening in the PC world" sort of mag.

When they make XE the featured toy of a Macca's Happy Meal, then I'll know it's time to retire.

Tuesday, April 11, 2006

Nostalgia and MUMPS

There's been a bit of nostalgia recently with posts about an Oracle V4 Virtual Machine. Partially prompted by some of the items in Mark's article, I thought I'd reflect back a bit too.

My first employer after getting a degree was a Regional Health Authority in England. While I was (fortunately) assigned to a new Ingres-based project, their primary language was something called Mumps. This is one of those languages where you've either worked somewhere with it, or never heard of it. I didn't have much to do with it bar the occasional look at the interface modules that supplied us with flat files for loading into our database. Since that was a good few years ago, some of my recollections could be inaccurate.

Mumps isn't a relational database itself, though apparently you can run a relational database and even SQL on it. It stores data in B-tree structures called global arrays. There would be a file system layer under that, but that is invisible in practical terms, in the same way datafiles are hidden by tables in Oracle. My first problem with Mumps is that there is no internal structure enforced within a global. It would be similar to having a single column VARCHAR2 in a table, and storing all the column data in there. Without that data structure, you don't have 'columns', much less datatyping, check constraints, foreign keys and so forth. Data integrity is purely the responsibility of the application. Oh, and by the way, Mumps doesn't have an access control mechanism analogous to Grants or Roles. Add, drop, delete or update, you can do it all.

All in all, it seems to address many of the 'faults' identified in Mark's article.

My other problem with Mumps was that it is the only language I've worked with where whitespace within the code is important. While it did help to ensure all the code looked the same, it was only because it was all equally yuck. It didn't help that, for various reasons (some historic or for standards compatibility), there is a maximum length for variable names of eight characters. I haven't had to force myself to keep to that since I was coding BASIC on a ZX81 with 1KB of RAM. It makes code nearly unreadable.

You can, if you wish, read more about Mumps on Wikipedia, with its critics currently sidelined off in another article.




Wednesday, March 29, 2006

New laptop

I managed to blow up my computer at the weekend. I've had problems with the hard drive since a repartitioning exercise went wrong, so I'd picked up a new hard drive to replace it. That's very easy to type, and is a relatively easy hardware task (and would be very easy if it wasn't for those ribbon cables which seem to need lots of twisting to get the plug the right way up). Not sure whether it was static, dust, a loose connection or what, but when I plugged it in there was a flash from the power supply, a bit of smoke and one dead box.

So after some discussions with the holder of the budget (my wife), we went off to David Jones to get a snazzy new laptop. We went for an Acer Aspire 1642 configured with 1Gb of RAM and 60Gb of hard drive. David Jones is a department store that sells computers as an afterthought. Asking the salesman about firewire, he looked blank and pointed at the Ethernet port. To be honest, I didn't expect much technical nous from them (and I'll need to get a PC card with a firewire adapter at some point). Still, 18 months interest free and Microsoft Office Student+Teacher edition thrown in were the reasons I went there.

I've got the dubious drive (80Gb) and the new one (120Gb) which I want to get rigged up as external hard drives. Then I can have what I'd consider a REAL backup regimen rather than relying on two drives in a single box.

The only weird thing is that the drive is set up as two FAT32 partitions rather than NTFS. Haven't decided whether/when to convert it over. I have decided to give up on the Linux dual-boot thing. I tried Ubuntu once, but couldn't figure out how to get the dialup modem working. The HELP wasn't. Yeah, it's great to talk about community support and such, but I couldn't access that support without going back to Windows and switching back and forwards trying things. Instead I switched to Fedora 4, for which I could buy a physical book, and with the book I was able to get the modem working pretty quickly. But I've never really used it, and am happier with cygwin to drop down to a shell-like environment. Now I hear Fedora 5 is out, and I really can't be bothered (especially after Dizwell's review of FC5).

I did get a nice new install of the Production version of Oracle XE and was able to confirm that the DB18 fix is in the Production build. It will be interesting to see how future patches get incorporated into XE, and how we'll get to know what fixes are in which XE build.

To test whether the fix made it in, I had to learn how to do the initial exploit. It was surprisingly (frighteningly) simple once I found the right tidbits to install.

Monday, March 06, 2006

When is a schema a squema ?

Squema : (noun) a database schema that provokes nausea, disgust or fear in developers and/or DBAs. [from Latin 'schema' and Middle English 'squeamish']

Okay, it's not a real word, but it was prompted by a typo/spelling error in a post in the XE support forum.
What structures will put shivers down your spine when you first set eyes on a database schema ?

  1. There's only half a dozen tables, the largest of which has just three columns similar to 'Entity Id','Attribute Id' and 'Attribute Value'. Every query in this database will include forty-two joins (see the sketch after this list).
  2. Lots of columns defined as CHAR rather than VARCHAR, all constrained to NOT NULL. A sure sign of a generic product designed to be database-independent.
  3. Lots of column names like USERDATA..., USER_DEFINED..... or similar, so the column names give you no idea what the contents are. Normally the sign of a third-party product. You can guarantee that there won't be any practical constraints and that, at some point, the wrong data will find its way into the wrong column.
  4. Columns defined as VARCHAR2, but with names suggesting they contain dates/numbers or, in conjunction with 'generic names', just data values that are suggestive of dates and numbers. Look forward to a life of invalid date and value errors.
  5. Spelling mistakes in table, procedure or column names. I once worked with an application that included a table column "proccess_status". A giveaway that some steps are missing from the review process and bound to trip up programs written in languages that don't validate SQL syntax and semantics at compile time.
  6. select * from user_constraints where constraint_type = 'R' gives no rows selected. No foreign key enforcement, no data integrity, no data found....
  7. select * from user_constraints where constraint_type in ('P','U') gives no rows selected. Trust me, the application will enforce it. Yeah, right, with the help of the tooth fairy.
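
Here's a made-up taste of item 1. Even asking for three attributes of one entity means three joins back to the same table:

-- Hypothetical 'generic' design: one row per entity/attribute/value
select ord.entity_id,
       cust.attribute_value as customer,
       amt.attribute_value  as order_amount,
       stat.attribute_value as status
from   entities ord,
       entity_attributes cust,
       entity_attributes amt,
       entity_attributes stat
where  cust.entity_id = ord.entity_id and cust.attribute_id = 'CUSTOMER'
and    amt.entity_id  = ord.entity_id and amt.attribute_id  = 'ORDER_AMOUNT'
and    stat.entity_id = ord.entity_id and stat.attribute_id = 'STATUS';
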
Anything else which makes you squirm ?

Tuesday, February 14, 2006

STRAGG without PL/SQL - MODEL strikes again

Still playing around with MODEL. Last time I posted on using MODEL to break down a single column containing comma separated values into multiple rows. This time, I'll go the other way, and take multiple rows and 'stragg' them up into a single comma separated list.

I use dense_rank to count, and sequence, the individual values. The order by clause in the analytic function allows you to define how the 'fields' in the concatenated string will be ordered.

I'm using the 'return updated rows' clause to return fewer rows (ie just the one containing the concatenated value). I iterate up to 4000 times (maximum length of VARCHAR2) but quit when there isn't a row for that iteration (eg if there are five rows to be concatenated, there isn't a value for the sixth, so the iteration quits).

I'm not convinced this is the most efficient method as I think I'm iterating for each row selected. However at the moment, my involvement with 10G is purely 'play' (and all on XE), so I'm trying to experiment to see what is possible with MODEL. Once I've got more of its techniques under my belt, I'll have a better idea of how to improve my SQLs.

select data_type, seq, all_cols
from   (select data_type, cast(column_name as varchar2(4000)) cname,
               dense_rank() over (partition by data_type order by column_name) -1 seq
        from   user_tab_columns
        where  table_name = 'PLAN_TABLE')
model
  return updated rows
  partition by (data_type)
  dimension by (seq as seq)
  measures (cname all_cols)
  unique single reference
  RULES ITERATE (4000)
  UNTIL (PRESENTV(all_cols[ITERATION_NUMBER+1],1,0) = 0)
  (all_cols[0] = case when ITERATION_NUMBER = 0
                      then all_cols[0]
                      else all_cols[0]||','||all_cols[ITERATION_NUMBER]
                 end)
order by 1,3,2;


PS. I'm trying performancing to publish this entry.

Saturday, February 11, 2006

More on the MODEL clause

I have been playing with the MODEL clause, on the assumption that it can probably do lots of fun and exciting things if only I knew how to use it. I think the fun part is the ITERATE clause which seems a massive step forward for the SELECT statement. SELECT basically churns through row after row of a result set, the fundamental SEQUENCE component of data processing. The WHERE clause tells it which rows to process and which to ignore, and the CASE statement (and to a lesser extent, DECODE before that) offers a powerful mechanism for choosing how to process the row, representing the SELECTION component. The next step, which had been missing until the MODEL clause, was a way to repeatedly process the same selected row, the ITERATION component.

So, let's take a single row, from the famous dual table, and continually reprocess it to generate multiple entries in a new result set, a magical number generator.

SELECT x from dual
MODEL DIMENSION BY (1 AS z) MEASURES (1 x)
RULES ITERATE (7) (x[ITERATION_NUMBER]=ITERATION_NUMBER+1);
Extending that, let's do some REAL data processing, splitting a single row with a column containing a list of values into multiple rows.

Firstly, I used the iteration operation to loop around up to 4000 times (based on a maximum size of a VARCHAR2 in the database). But I did not want 4000 rows for each row selected, so I counted the number of delimiters in the list (the 'vcnt' made-up column) and quit the iteration when all list entries have been processed.

The tricky bit is pulling the Nth entry from the list. Firstly, I pick out the start position. For iteration zero, that will be the first character, and for the subsequent (1st, 2nd, 3rd etc) iterations, the character after the Nth delimiter. Then I derive the end position using INSTR to locate the Nth+1 delimiter. Where that returns zero, I use the length of the string. Finally, I take the characters from the start position to the end position.

The great thing about the model clause is that I can use the interim vstart and vend 'made-up columns' to come up with the var_list2 column I actually want, simplifying some of the more complex derivations.


drop table str_test;

create table str_test (var_list varchar2(40));

insert into str_test values ('a,b,c,d,e,f,g,h');
insert into str_test values ('AA,BB,CC,DEF,GG');
insert into str_test values ('1,2,,45,6');
insert into str_test values ('fred,is,here,today');
insert into str_test values ('a');
insert into str_test values ('');

commit;

column var_list2 format a5

SELECT var_list, var_list2
FROM   str_test
MODEL
  PARTITION BY (var_list)
  DIMENSION BY (0 AS z)
  MEASURES (var_list var_list2, 0 vcnt, 0 as vstart, 0 as vend)
  RULES ITERATE (4000) UNTIL (ITERATION_NUMBER+1 = vcnt[0])
  (vcnt[ITERATION_NUMBER] =
     nvl(length(cv(var_list)) - length(replace(cv(var_list),',')) + 1,1),
   vstart[ITERATION_NUMBER] =
     case when ITERATION_NUMBER = 0 then 1
          else instr(cv(var_list),',',1,ITERATION_NUMBER)+1 end,
   vend[ITERATION_NUMBER] =
     case when nvl(instr(cv(var_list),',',1,ITERATION_NUMBER+1),0) = 0
          then nvl(length(cv(var_list))+1,1)
          else instr(cv(var_list),',',1,ITERATION_NUMBER+1) end,
   var_list2[ITERATION_NUMBER] =
     substr(cv(var_list),vstart[ITERATION_NUMBER],
            vend[ITERATION_NUMBER] - vstart[ITERATION_NUMBER])
  )
order by var_list, vstart;

Tuesday, February 07, 2006

Oracle security interview - thoughts

Interesting article through Pete Finnigan's blog, from an interview with one of Oracle's security people.

A few quotes:
"We don’t hide our internally discovered vulnerabilities. When we discover something internally, we still mention it in our Critical Patch Updates."

"There are others [security researchers] who for their own good reasons choose to pressure us and put our customers at risk by a partial or early or zero-day disclosure of vulnerabilities in Oracle products."

So let me get this straight. If Oracle employees find a bug, then disclose it, it is a GOOD THING. If external parties practise partial disclosure, they are putting customers at risk. The exact meaning of 'zero day' disclosure is debatable. If it means disclosure prior to informing Oracle, then there is no excuse and it is BAD. They MAY mean disclosure before Oracle have patched it, but then what is "early disclosure" ?
Where they are referring to disclosure before a patch is available, there is room for debate. If there is a potential workaround to reduce risk, then publicising the issue, plus the workaround, could benefit some customers.

"If you look at all of the vulnerabilities that my security group handles, we discover about 75% of them. About 10% is reported to us by our customers. The remainder comes to us through external security researchers. "

Figures like these should mostly add up to 100%, but I don't think these do. Remember, Oracle don't publicise bugs before they patch them, so a vulnerability could be 'discovered' by customers, researchers AND internal Oracle staff, just like America being discovered by the Carthaginians, Vikings, Chinese, Columbus, and so on (not to mention people already actually living there).



HTMLDB rename poll results and blog colours

Firstly, the results of the poll are in, and half opted for APEX as a quickie name for Application Express/HTMLDB, which is fine by me.
Luckily it seems to be what Oracle are going towards, as 'apex' gets used here.

AndyC commented on the history of Oracle Bloggers
Not sure whether the comment about tweaks and garish colours was meant for me. If not, it probably should be. I was looking for some background images, and the bluish mazes one came with OpenOffice as a web-background. I've now toned it down a few shades, but left the original colours for the title box. I think it works, but what do I know ?

My main issue with the original templates is that having the sidebar on the left means some scrolling can be needed in 'narrow' windows. Having it on the right means that, when I'm lazy and use a PRE tag to highlight code, long lines of code can run over the edge of the content area into the sidebar, making it hard to read.

So I've shoved all that stuff down the bottom, with links up the top. And I had to hunt around for a workaround to IE's inability to respect the max-width CSS property.

Tuesday, January 31, 2006

New name for HTMLDB - Poll

So they've renamed HTMLDB. To be honest, I never liked the HTMLDB name. It sounds too much like a mumble. Old-time readers of AskTom will recall him referring to it as Project Marvel, which sounds much better.
Application Express is also good, but takes too long to type, so I've put a poll together (thanks Eddie for the pointer to dpolls) on an abbreviation for it. Feel free to add any other suggestions.



Monday, January 23, 2006

Wiki as reference

As a sort of follow-up to my previous blog entry on the wikipedia, yesterday I saw my first newspaper report citing Wikipedia as a source. It was in Sydney's Sunday Telegraph, in a report about Spielberg's new film, Munich. It had a side-bar article about the Black September Organisation with Wikipedia declared as the source.

Okay, the Sunday Telegraph doesn't rank amongst the elite of the world's newspapers. It's what UK readers would call a tabloid. Personally, I prefer 'zodiac press'. That is, you have to have some scepticism towards any newspaper that provides a horoscope for the benefit of its readers.

While my previous blog entry said that, generally, accuracy was of less importance than usefulness, when basing an article on that information accuracy is of prime importance. Personally, I don't think the Wikipedia entries themselves should ever be used as a sole resource. At a minimum the history and discussion pages should also be referred to, as there's always the chance that the latest edit was 'corrupt'. Better still, refer to several sources. Remember, writing an article based on one source of information is plagiarism, using many is research.

And on an Oracle note, thanks to Dizwell for the space for an Oracle wiki.

Thursday, January 19, 2006

Google CDOS ?

Seeing "Google CDOS" referred to here makes me feel OLD.
I remember using Usenet from CompuServe back in the 80's, before there was even the concept of an internet search engine.
I remember using DejaNews to search the old USENET archive (which Wikipedia tells me would have been in the 90's, before Google was around and when AltaVista was king of the search engines).
I wonder if, in another decade, we'll have given up referring to plain old email, and just be talking of gmail. Or even worse, we'll start calling it GoogleNet.

I would have posted this as a comment on the blog, but it was erroring out with
Invalid [] range "r-c" in regex; marked by <-- HERE in
.....lots of naughty words in a regex list....
at /var/www/www.orablogs.com/web/mt-bin/extlib/jayallen/Blacklist.pm line 3095.

Tuesday, January 17, 2006

Thoughts on Oracle forums

A new Oracle Question and Answer site has been created by Eddie Awad. Some commentary has been made about the value of this resource, and whether it will result in duplication or dilution.

There are a number of Oracle forums about, and most have their own strengths and weaknesses. It would be interesting to see how big a list we can come up with, so see if you can contribute here.

It also opens up some debate about 'fringe' sites, such as AskTom and Steven Feuerstein's PL/SQL Q&A, where the vast majority of questions are answered by one individual (although in Tom's case rumours persist of clones), and static Q&A sites such as the Co-operative FAQ.

In that light, I suggest that there are several 'customers' for these forums (fora ?).
Firstly there is the questioner, sometimes wanting discussion about the merits of various approaches, sometimes wanting an urgent response to an emergency situation.
Secondly, there are the respondents. I class them as a customer because their participation suggests they are looking for something too. Perhaps the feeling of helping others, or returning the assistance offered to them in the past.
Thirdly, there are searchers who, like questioners, are looking for an answer but use the forum's search facilities rather than repeat a previous question.
Finally, there are a group I think of as gatherers, who read a forum not for a specific answer but in the hope of picking up a new nugget of information here and there. Many in this group could be, or may become, respondents.

I don't believe there is a general shortage of questioners and searchers out in the Oracle community. There will always be newbies out there needing a hand, and when faced with new functionality or versions, everyone starts off as a newbie again.

I'm less sure whether there is a shortage of respondents. Thinking of people I've worked with, there's plenty of people capable of making a valuable contribution here. Some may be active in forums I don't know about, or under pseudonyms. Others will be too busy with other things, or simply spend their spare time 'oracle-free'.

In an ideal world, the Oracle resources would all be in one place, more questioners could be satisfied with being searchers, there would be less repetition in forums and more 'nuggets' for the gatherers and AskTom would be open for questions ALL the time.

I like the concept of wikis and oradot, where people can log into a common resource and maintain a single body of knowledge (or at least pointers to that information). So I'm going to try to use them more, and will (eventually) reformat a few of my previous blog entries into oradot contributions. That way, I'm not the only one responsible if they are wrong :).

PS. Yes, I'm toying with my blog template. Normal service will be resumed when I work out what is normal.

Monday, January 09, 2006

The 12th SELECT of Xmas - Data generation through the MODEL clause

Okay, I missed twelfth night, but here's my final SELECT of Christmas : Data generation using the MODEL clause.

SQL has this annoying habit of only dealing with data that actually exists. To generate 'missing' data Oracle developers have come up with various tricks for generating lots of rows from nothing. Sometimes these involve the standard Oracle views which the developer assumes (or hopes) will return enough rows to meet their needs. ALL_TAB_COLUMNS tends to be a good one as the standard Oracle views themselves will throw back several thousand rows, never mind your own application tables.


Unfortunately, these are real views, and returning those few thousand rows will actually involve a degree of IO activity which is especially irritating when you really have no interest in the results.


Other people have come up with creative uses of DUAL and hierarchical queries.
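
This is the sort of thing I mean (a throwaway example):

-- Ten rows conjured out of DUAL using a hierarchical query
select level as n
from   dual
connect by level <= 10;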

Personally, I'm not keen on them. To my mind, it isn't the intended use of the CONNECT BY clause and it is not documented that it would work that way. I just don't really TRUST it.

But in 10g, Oracle came up with the MODEL clause. It looks very complicated. It probably is very complicated. And I've avoided even looking at it for a long while.

But it is intended to generate data that doesn't actually exist.

So I've made a very simple example of a data generator using the MODEL clause.


SELECT days
FROM (select 1 days from dual) mnth
MODEL
DIMENSION BY (days)
MEASURES (days v)
RULES UPSERT
(v[FOR days FROM 1 TO 31 INCREMENT 1] = 1)
order by 1
/

Here is a slightly different technique which includes an UNTIL clause so that you can put in limits other than a simple count.

SELECT TO_CHAR(sysdate+days ,'DD-Mon-YYYY')
FROM (select 1 days from dual) mnth
MODEL
DIMENSION BY (days)
MEASURES (days v)
RULES ITERATE (32)
UNTIL (TO_CHAR(TRUNC(SYSDATE) +
(ITERATION_NUMBER+1) ,'MON') !=
to_char(sysdate,'MON'))
(v[ITERATION_NUMBER] = 1)
order by sysdate+days
/

Friday, January 06, 2006

12 Selects of christmas (the fourth instalment)

9. Unnesting a nested table

Okay, first I'll build a nested table using 10G's COLLECT operator.
Then I'll unnest it again..
A pointless exercise really, but then I haven't actually found a point to nested tables in the database anyway.
select m.table_name, c.column_value from
(select table_name,
cast ((select collect(column_name)
from user_tab_columns u
where u.table_name = t.table_name) as sys.dbms_debug_VC2coll) cols
from user_tables t) m, table(cols) c


10. ROLLUP
select data_type, nullable, count(*)
from user_tab_columns
group by rollup (data_type, nullable)

As well as the counts for the individual data_type/nullable entries, it also gives subtotals for each data type, plus a grand total.

11. CUBE
select data_type, nullable, count(*)
from user_tab_columns
group by cube (data_type, nullable)

In addition to the ROLLUP results, this will also give subtotals for nullable.

Wednesday, January 04, 2006

12 Selects of christmas (the third instalment)

A couple of things (other than tables and views) that you can select from :

7. Selecting from a collection

You can (and should) create your own type for this :

create type tab_varchar2_64 as table of varchar2(64);


If you can't, then you may be able to get by with using SYS.DBMS_DEBUG_VC2COLL.

This will turn a list of values into rows.


SELECT column_value
from table(tab_varchar2_64('1','Fred','Amber'));


8. Selecting from a (Pipelined) function

Again, you should use your own collection for the return type.


create or replace function f_generate_dummy_name
  return sys.DBMS_DEBUG_VC2COLL pipelined is
  type tab_char is varray(26) of varchar2(1);
  t_vowel tab_char := tab_char('A','E','I','O','U');
  t_other tab_char := tab_char('B','C','D','F','G','H','J','K','L','M','N',
                               'P','Q','R','S','T','V','W','X','Y','Z');
begin
  for one in 1..t_other.count loop
    for two in 1..t_vowel.count loop
      for three in 1..t_other.count loop
        pipe row (t_other(three)||t_vowel(two)||t_other(one));
      end loop;
    end loop;
  end loop;
  return;
end;
.
/
select * from table(f_generate_dummy_name);

Tuesday, January 03, 2006

The holidays are over again

Okay, I only had the actual public holidays off, but you know what I mean.

We took all our decorations down yesterday (not quite twelfth night, but close enough given that it took half a day). My three-year-old son was pretty good at helping. My daughter just looked confused as the tree came apart. Still, we've got her first birthday to look forward to next week.

The only disappointment was a Sesame Street PC game (Elmo's Workshop) for my son which wouldn't install. The AVG virus checker won't let it past a file (DSSAGENT.EXE) which the web tells me is a nasty piece of spyware and the little checkbox the install gives about this 'option' doesn't actually do anything. You expect this sort of **** from freeware downloaded from the web, not from something you've paid for under the name of a supposedly reputable organisation promoting children's education.

Fortunately, we borrowed a 'The Cat in the Hat' CD-ROM from the library which he is enjoying immensely.

12 Selects of christmas (the second instalment)

The follow-up to the first set of SQLs.


4. Connect by


Show all your roles and the route you have obtained them through.


select SYS_CONNECT_BY_PATH(granted_role, '/') path
from dba_role_privs
start with grantee = user
connect by prior granted_role = grantee


5. WITH clause

The data dictionary views where there is an ALL_ version and a USER_ version


column a_name format a32
column u_name format a32

with w_tab as
(select distinct object_name, substr(object_name,instr(object_name,'_')+1) suffix,
substr(object_name,1,instr(object_name,'_')-1) prefix
from dba_objects
where substr(object_name,1,instr(object_name,'_')-1) in ('ALL','USER'))
select a.object_name a_name, u.object_name u_name
from w_tab a, w_tab u
where a.prefix = 'ALL'
and u.prefix = 'USER'
and a.suffix = u.suffix;


6. ANSI style FULL Outer Join

I'm now building on the WITH select, but with a full outer join to find where there's a USER_ view with no corresponding ALL_ version or vice versa :


column suffix format a32
column a_name format a32
column u_name format a32

with w_tab as
(select distinct object_name name,
substr(object_name,instr(object_name,'_')+1) suffix,
substr(object_name,1,instr(object_name,'_')-1) prefix
from dba_objects
where substr(object_name,1,instr(object_name,'_')-1) in ('ALL','USER'))
select nvl(a.suffix,u.suffix) suffix,
a.name a_name, u.name u_name from
(select * from w_tab where prefix = 'ALL') a
full outer join
(select * from w_tab where prefix = 'USER') u
on (a.suffix = u.suffix)
where a.name is null or u.name is null
order by 1;