Friday, April 18, 2014

I Love Logs

It occurred to me a few days ago, as I was reading this article on DevOps, that I might actually be a DevOps.

I think of myself as a developer, but my current role is in a small team running a small system. And by running, I mean that we are 

  • 'root' and 'Administrator' on our Linux and Windows servers
  • 'oracle / sysdba' on the database side, 
  • the apex administrator account and the apex workspace administrators,
  • the developers and testers, 
  • the people who set up (and revoke) application users and 
  • the people on the receiving end of the support email
Flashbacked to Jeff Smith's article on Developers in Prod. But the truth is that there's a lot of people wearing multiple hats out there, and the job titles of old are getting a bit thin. 

The advantage of having all those hats, or at least all those passwords, is that when I'm looking at issues, I get to look pretty much EVERYWHERE. 

I look at the SSH, FTP and mailserver logs owned by root. The SSH logs generally tell me who logged on where and from where. Some of that is for file transfers (some are SFTP, some are still FTP), some of it is the other members of the team logging on to run jobs. The system sends out lots of mail notifications, and occasionally they don't arrive so I check that log to see that it was sent (and if it may have been too big, or rejected by the gateway).

Also on the server are the Apache logs. We've got these on daily rotate going back a couple of years because it is a small enough system that the logs sizes don't matter. But Apex stuffs most of those field values into the URL as a GET, so they all get logged by Apache. I can get a good idea of what IP address was inquiring about a particular location or order by grepping the logs for the period in question.

I haven't often had the need to look in the Oracle alert logs or dump directories, but they are there if I want to run a trace on some code. 

In contracts, I'm often looking at the V$ (and DBA_) views and tables. The database has some audit trail settings so we can track DDL and (some) logons. Most of the database access is via the Apex component, so there's only a connection pool there.

The SELECT ANY TABLE also gives us access to the underlying Apex tables that tell us the 'private' session state of variables, collections etc. (Scott Wesley blogged on this a while back). Oh, and it amazing how many people DON'T log out of an application, but just shut their browser (or computer) down. At least it amazed me. 

The apex workspace logs stick around for a couple of weeks too, so they can be handy to see who was looking at which pages (because sometimes email us a screenshot of an error message without telling us how or where it popped up). Luckily error messages are logged in that workspace log. 

We have internal application logs too. Emails sent, batch jobs run, people logging on, navigation menu items clicked. And some of our tables include columns with a DEFAULT from SYS_CONTEXT/USERENV (Module, Action, Client Identifier/Info) so we can automatically pick up details when a row is inserted.

All this metadata makes it a lot easier to find the cause of problems. It isn't voyeurism or spying. Honest. 

Saturday, April 12, 2014

Unique identifiers - but what do they identify

Most of the readers of this blog will be developers, or DBAs, who got the rules of Normalisation drummed into them during some phase of the education or training. But often we get to work with people who don't have that grounding. This post is for them. Feel free to point them at it.

Through normalisation, the tendency is to start with a data set, and by a methodical process extract candidate keys and their dependent attributes. In many cases there isn't a genuine or usable candidate key and artificial / surrogate keys need to be generated. While your bank can generally work out who you are based on your name and address, either of those could change and so they assign you a more permanent customer or account number.

The difficulty comes when those identifiers take on a life of their own. 

Consider the phone number. When I dial my wife's phone number, out of all the phones in Australia (or the world), it is her's alone that will ring. Why that one ? 

In the dark ages, the phone number would indicate a particular exchange and a copper wire leading out of that exchange hard wired to a receiver (or a set of receivers in the case of Party Lines).  Now all the routing is electronic, telephones can be mobile and the routing for calls to a particular number can be changed in an instant. A phone number no longer identifies a device, but a service, and a new collection of other identifiers have risen up to support the implementation of that service. An IMEI can identify a mobile handset and the IMSI indicates a SIM card from a network provider, and we can change the SIM card / IMSI that corresponds to a phone number, or swap SIM cards between handsets. Outside the cellular world, VOIP can shunt 'phone number' calls around innumerable devices using IP addresses. 

Time is another factor. While I may 'own' a given phone number at a particular time, I may give that up and someone else might take it over. That may get represented by adding dates, or date ranges to the key, or it can be looked at as a sequence. For example, Elizabeth Taylor's husband may indicate one of seven men depending on context. The "fourth husband" or "her husband on 1st Jan 1960" would be Eddie Fisher.

Those without a data modelling background that includes normalisation may flinch at the proliferation of entities and tables in a relational environment. As developers and architects look at newer technologies some of the discipline of the relational model will be passed over. Ephemeral transactions can cluster the attributes together in XML or JSON formats with no need for consistency of data definitions beyond the period of processing. Data warehousing quickly discarded relational formats in favour of 'facts' and 'dimensions'. 

The burden of managing a continuous and connected set of data extending over a long period of time, during which the identifiers and attributes morph, is an ongoing challenge in database design.

Sunday, March 09, 2014

Pre-digested authentication

A bit of a follow-up to my previous post on Digest authentication.

The fun thing about doing the hard yards to code up the algorithm is that you get a deeper level of understanding about what's going on. Take these lines:

    v_in_str := utl_raw.cast_to_raw(i_username||':'||i_realm||':'||i_password);
    v_ha1 := lower(DBMS_OBFUSCATION_TOOLKIT.md5(input => v_in_raw));

Every time we build the "who we are" component for this site, we start with exactly the same hash made up of the username, realm (site) and password. This is a batch routine, which means somewhere we would store the username and password for the site - whether that is a parameter in a scheduling tool, coded into a shell script or OS file, or somewhere in the database. If you've got the security option for Oracle, you can use the Wallet, with its own security layers.

But digest authentication gives us another option. Since we actually use the hashed value of the user/site/password, we can store that instead. The receiving site has no idea the code doesn't actually know the REAL password.

Now turn that over in your head. We can call the web service as this user WITHOUT knowing the password, just by knowing the hash. I don't know about you, but it makes me a little bit more worried when I hear of user details being leaked or hacked from sites. It's all very well reassuring us the passwords are hashed and can't be reverse engineered (assuming your own password can't be brute-forced). But depending on the security mechanism, a leak of those hashes can be dangerous. If a hacked provider advises people to change their passwords, take their advice. 

'Basic' authentication doesn't have the same weakness. In that environment the provider can store the password hash after applying their own 'secret sauce' mechanism (mostly a salt). When you authenticate, you send the password, they apply the secret sauce and compare the result. You can't get away without knowing the password, because all the work is done at their end.

There's no secret sauce for digest authentication, and there can't be. Even if the provider had the password in the clear, there's no way they can be sure the client has the password since all the client needs is the result of the hash. The provider must store, or be able to work out, the result of that hash because they need to replicate the final hash result using both the client and server nonces. They can store either that same user/realm/password hash as is, or they can encrypt it in a reversible manner, but a one-way hash wouldn't be usable.

In short, digest authentication means that our batch routine doesn't need to 'know' the actual password, just a hash. But it also makes those hashes a lot more dangerous.

I'm an amateur in this field. I checked around and it does seem this is a recognized limitation of digest authentication. EG: This Q&A and this comparison of Digest and Basic.

Saturday, March 08, 2014

PL/SQL, UTL_HTTP and Digest Authentication

For the first time in what seems like ages, I've actually put together a piece of code worth sharing. It's not that I haven't been working, but just that it has all been very 'in-house' specific.

However I had a recent requirement to use a web service that makes use of Digest Authentication. If you have look at the UTL_HTTP SET_AUTHENTICATION subprogram, it only addresses Basic authentication (and, apparently, Amazon S3 which looks intriguing).

In Basic authentication, the username and password get sent across as part of the request. Going through SSL, that doesn't seem too bad, as it is encrypted over the transfer and the certificates should ensure you are talking to the legitimate destination. However if that destination has been compromised, you've handed over your username and password. In an ideal world, the server shouldn't need to know your password, which is why database should only have hashed versions of passwords. 

Outside of SSL, you might as well just print the username and password on the back of a postcard.


In Digest authentication, you get a more complex interaction that keeps the password secret. You ask for a page, the server responds with an "Authentication Required" plus some bits of information including a nonce. You come up with a hashed value based on the server nonce, your own nonce and a hash of your username and password and send it back with the next request. The server has its own record of your username/password hash and can duplicate the calculations. If everyone is happy, the server can fulfill your request and nobody ever actually needs to know the password.

Our server used SSL, and thanks to Tim's article on SSL and UTL_HTTP, it was a simple set up. I've done it before, but that was in the days when it seemed a lot hard to get certificates OUT of a browser to put them in your Oracle Wallet.

The Interwebs were a lot less forthcoming on a PL/SQL implementation of Digest authentication though. The closest I got was this discussion, which can be summed up as "This may be complex, but I do not see these offhand as being impossible to do in PL/SQL....No Digest configured web server nearby or I would definitely have had a bash at this"

A read through the Wikipedia article, and I came up with the code below:

Firstly, after the initial request, go through the header to get the 'WWW-Authenticate' item. Take the value associated with that header, and pass it to the "auth_digest" procedure. 


    l_max := UTL_HTTP.GET_HEADER_COUNT(l_http_response);
    l_ind := 1;
    l_name := '-';
    while l_ind <= l_max AND l_name != 'WWW-Authenticate' LOOP
      UTL_HTTP.GET_HEADER(l_http_response, l_ind, l_name, l_value);
      IF  l_name = 'WWW-Authenticate'
      AND l_http_response.status_code = UTL_HTTP.HTTP_UNAUTHORIZED THEN
        --
        -- Unauthorized. Using the Authorization response header, we can come up with the
        -- required values to allow a re-request with the authentication/authorisation details
        --
        dbms_application_info.set_action('auth:'||$$PLSQL_LINE);
        UTL_HTTP.END_RESPONSE(l_http_response);
        --
        dbms_application_info.set_action('auth_req:'||$$PLSQL_LINE);
        l_http_request := UTL_HTTP.BEGIN_REQUEST(l_server||l_method);
        auth_digest (io_http_request => l_http_request, i_auth_value => l_value,
          i_username => nvl(i_username,'xxxx'), i_password => nvl(i_password,'xxxx'), 
          i_req_path => l_method, i_client_nonce => null);
        dbms_output.put_line($$PLSQL_LINE||':Get Response from authenticated request');
        dbms_application_info.set_action('auth_resp:'||$$PLSQL_LINE);
        l_http_response := UTL_HTTP.GET_RESPONSE(l_http_request);
        dump_resp (l_http_response);
        dump_hdr (l_http_response);
      END IF;
      l_ind := l_ind + 1;

    END LOOP;

The auth_digest starts with an extraction of the 'valuables' from that value string. I've used regular expressions here. I spent time working with grep, awk and perl, and regexes are habit forming.

  procedure extract_auth_items
    (i_text in varchar2,
    o_realm out varchar2, o_qop out varchar2, o_nonce out varchar2, o_opaque out varchar2) is
  begin
    o_realm   := substr(regexp_substr(i_text, 'realm="[^"]+' ),8);
    o_qop     := substr(regexp_substr(i_text, 'qop="[^"]+'   ),6);
    o_nonce   := substr(regexp_substr(i_text, 'nonce="[^"]+' ),8);
    o_opaque  := substr(regexp_substr(i_text, 'opaque="[^"]+'),9);

  end extract_auth_items;

Next is the 'meat' where the values are combined in the various hashes. Yes, there's a hard-coded default client nonce in there that, by a strange coincidence, matches on in the wikipedia article. That's how this stuff gets developed, by following through a worked example. Just like school.

  function digest_auth_md5_calcs
      (i_username     in varchar2, i_password     in varchar2, i_req_path      in varchar2,
      i_realm         in varchar2, i_server_nonce in varchar2,
      i_qop           in varchar2 default 'auth',
      i_client_nonce  in varchar2 default '0a4f113b',
      i_req_type      in varchar2 default 'GET',  i_req_cnt IN NUMBER default 1)
  return varchar2 is
    --
    v_in_str    varchar2(2000);
    v_in_raw    raw(2000);
    v_out       varchar2(60);
    --
    v_ha1       varchar2(40);
    v_ha2       varchar2(40);
    v_response  varchar2(40);
    --
  begin
    --
    v_in_str := i_username||':'||i_realm||':'||i_password;
    v_in_raw := utl_raw.cast_to_raw(v_in_str);
    v_out := DBMS_OBFUSCATION_TOOLKIT.md5(input => v_in_raw);
    v_ha1 := lower(v_out);
    --
    v_in_str := i_req_type||':'||i_req_path;
    v_in_raw := utl_raw.cast_to_raw(v_in_str);
    v_out := DBMS_OBFUSCATION_TOOLKIT.md5(input => v_in_raw);
    v_ha2 := lower(v_out);
    --
    v_in_str := v_ha1||':'||i_server_nonce||':'||lpad(i_req_cnt,8,0)||':'||
                   i_client_nonce||':'||i_qop||':'||v_ha2;
    v_in_raw := utl_raw.cast_to_raw(v_in_str);
    v_out := DBMS_OBFUSCATION_TOOLKIT.md5(input => v_in_raw);
    v_response := lower(v_out);
    --
    return v_response;
  end digest_auth_md5_calcs;

And this is the full auth_digest bit

  procedure auth_digest
    (io_http_request  in out UTL_HTTP.REQ,  i_auth_value    in varchar2,
    i_username        in varchar2,          i_password      in varchar2,
    i_req_path        in varchar2,          i_qop           in varchar2 default 'auth',
    i_req_cnt         in number default 1,  i_client_nonce  in varchar2 default null)
  is
    l_realm         varchar2(400);
    l_qop           varchar2(30);
    l_server_nonce  VARCHAR2(400);
    l_opaque        varchar2(100);
    --
    l_response      varchar2(40);
    l_value         VARCHAR2(1024);
    --
    l_client_nonce  varchar2(30);
    --
  begin
    --
    -- Apply the username / password for Digest authentication
    --
    extract_auth_items (i_auth_value,
                    l_realm, l_qop, l_server_nonce, l_opaque);
    --
    IF i_client_nonce is not null then
      l_client_nonce := i_client_nonce;
    ELSE
      l_client_nonce := lower(utl_raw.cast_to_raw(DBMS_OBFUSCATION_TOOLKIT.md5(
                            input_string=>dbms_random.value)));
    END IF;
    --
    l_response := digest_auth_md5_calcs
      (i_username => i_username, i_password    => i_password,     i_req_path => i_req_path,
      i_realm     => l_realm,    i_server_nonce => l_server_nonce,
      i_client_nonce => l_client_nonce);
    --i_qop default to auth, i_req_type default to GET and i_req_cnt default to 1
    --
    l_value := 'Digest username="' ||i_username          ||'",'||
               ' realm="'          ||l_realm             ||'",'||
               ' nonce="'          ||l_server_nonce      ||'",'||
               ' uri="'            ||i_req_path          ||'",'||
               ' response="'       ||l_response          ||'",'||
               ' qop='             ||i_qop               ||',' ||
               ' nc='              ||lpad(i_req_cnt,8,0) ||',' ||
               ' cnonce="'         ||i_client_nonce      ||'"'
               ;
    --
    IF l_opaque is not null then
      l_value := l_value||',opaque="'||l_opaque||'"';
    END IF;
    dbms_output.put_line(l_value);
    UTL_HTTP.SET_HEADER(io_http_request, 'Authorization', l_value);
    --

  end auth_digest;

A package with the code is available from my CodeSpace page, or directly here. There's a lot of debug 'stuff' in there. The code I'm using is still tailored to my single specific need, and I've stripped specific values from this published variant. You'll need to hard-code or parameterize it for any real use. I may be able to do a 'cleaned-up' version in the future, but don't hold your breath.

Sunday, February 09, 2014

Chrome's HOST-RULE flag

I'm currently working in an Apex environment, and we will soon be running some of our apex applications off a second domain name mapped to the same host.

For me this has mean a little playing around with VirtualHost in the httpd.conf which is something I don't get to do very often. But I am having to wait on others to actually set up the new DNS entry to map the domain name to the server.

If this was a brand new server with its own IP, then I could simply use the IP address. But since it is sharing a server and request using the IP goes straight to the old 'default' VirtualHost.

On the server itself I can touch up the hosts file to hard-code the entry. But the only browser I can use on the server is very dated and nearly unusable. And on my PC, I don't have admin rights to amend the PC hosts file.

Fortunately we do get chrome on our machines (tucked away as chrome-frame, but chrome nonetheless). And chrome has a handy little command line flag called "--host-rules". 

Using: --host-rules="MAP newdev.site.com 10.16.100.200" I can map the new hostname to its planned IP address and bypass the DNS lookup on the PC. Very handy when you are waiting for the DNS entry or if you want to override an existing name to point to a test or dev environment.



Wednesday, October 30, 2013

Oracle 11g XE installed on Windows 8.1

I count myself lucky if a post here gets a few hundred views. I have two posts that have hit 4000+ views, and a third edging towards 3000.

In third place is a reference to foreign keys referencing unique constraints rather than a primary key.

In second place, and heading towards its second birthday, was my desire for SQL Developer to have a "keep awake" function that stops sessions being killed. It still doesn't but will restore connections. 

In first place, was an almost throwaway article from when I migrated my netbook from XP to Windows 8. I installed Oracle 11g XE on it, and apparently a lot of people are (or were) interested in that. I guess people get Windows 8 on a machine and wonder if it is worth trying an XE install. The XE install on Windows is trivially simple:

  1. Download
  2. Unzip
  3. Run the setup program
  4. Choose a directory
  5. Enjoy a cup of coffee while it churns through. You have time for a biscuit too.

That article is nearing its first anniversary. In honour of that occasion, I upgraded to Windows 8.1. Oracle XE broke. Specifically I couldn't find the services that ran the listener or the database or anything Oracle related. A repair install of XE didn't fix things. Disclaimer: I don't use Oracle a lot on that machine, and it possible that something other than the 8.1 upgrade broke it.

I uninstalled it (that would have been the 'repaired' install), blew away all the database files as I had nothing I wanted to keep, and re-installed (instructions above - my biscuit was a Tim Tam, but people in the UK may substitute a p-p-p-penguin). It seems to work fine now.

So Oracle 11g XE works on Windows 8.1, as far as I can see. Bear in mind that XE doesn't have any real support anyway, so the difference between a 'supported' and 'unsupported' configuration is purely imaginary. I don't do anything like RMAN backups and restores, let alone between OS upgrades. I can't see why they wouldn't work, but I'm not a DBA.

Sunday, October 13, 2013

'Medalling' in Humour

Yesterday I competed, and came third, in the Humorous Speech contest of the Hawkesbury Division of Toastmasters. I'm pretty chuffed with that.

Partly, I'll blame Yuri for this. He recommended Toastmasters as a way of getting some extra public speaking skills. Luckily for me, there's a group that meets one lunchtime a week at work, which makes attendance relatively easy. Much easier than trying to fit in an evening on top of my kids' busy social and sporting calendars. I started going just before Christmas and became a paid up member in January this year.

Next, I'll blame Neil Sequeira who prompted me to enter the club contest a month ago on the basis that attendees regularly laughed at my speeches....in a nice way. I won that, and then it dawned on me that I'd be going into the Area level contest. Our club was hosting that event, so I had a slight 'home ground' advantage, but I was still in front of a bunch of people I didn't know, most of whom have been honing their speaking skills for YEARS.

I won that, which meant going up to the Division level contest last night. That was in a church hall, unfamiliar territory for me. We were speaking from an elevated stage, and with a headset microphone. Getting into the big leagues.

I was a bit ruffled because my trip there was delayed with my phone unaccountably deciingd it couldn't find any GPS signal, and refusing to tell me where I was or how to get where I was supposed to be. My destination was the other side of Middle Harbour so my regular tactic of pointing the car vaguely in the right direction and hoping for the best was foiled by its inability to fly across the water. Resorting to my trusty and dusty Gregory's Street Directory I made the 40 minute journey in a mere 80 minutes.

My speech starts with the other Gary Myers, multi-time champion of Summernats (which I've mentioned before in my blog ) and land speed record holder. Oh, and candidate for one of the NSW Senate spots for Federal parliament. He didn't win, but a candidate for the same party did get a spot for Victoria. I suspect this came as somewhat a surprise to him, as the Senate voting system is complex and highly unpredictable. An unemployed ex-sawmill worker who didn't own a suit and has kangaroo poo chucking as a hobby will be an interesting addition to the chamber.

I was more than  happy with my third place finish in the contest. The winner, who also took the Table Topics contest top prize, was excellent. And second place went to a Dave the Happy Singer who took the mick out of homoeopathy and similar rubbish, so I won't criticise that. I get a small trophy and a certificate. And an update to my LinkedIn profile.

And, for posterity, my certificates:




Sunday, September 22, 2013

Tracking email receipts through images

I pinched a technique from EMail marketeers last week. It goes by the cool name of a web beacon or web bug and is used to detect when someone reads (or at least opens) an email.

Our application is like a work flow system and it sends out a whole bunch of emails. I'm a little sceptical about who reads them, especially since I've got a bunch of rules that automatically ignore about 20-30 emails a day.

For a new email alert to a distribution list, I decided to code in a beacon. 

First for each email sent, it grabs a unique identifier. You can use a plain sequence, but I opted for a GUID. It is harder for anyone to guess other values for GUIDs and it is easy for me to tell which ones come from the test environment and which ones from production.

    declare
      v_guid_vc varchar2(32);
    begin
      v_guid_vc := sys_guid();

Then, the email HTML body included an image link

v_body := v_body || 
  'img src="http://'||v_host||'/PKG_MAIL.image?i_val='||v_guid_vc||''' ';

These mails go out to internal users, and Outlook is the primary email tool. The technique doesn't work for users reading their mail through Outlook webmail or the iOS email. The host is behind our firewall and will be unreachable from a home email account. But our company standard Outlook config will happily try to render that image when the user opens the email at their desk. And when that happens, it tries to get the image content by calling PKG_MAIL.image and telling us the GUID we associated with that email.

When we get that image request, we record the GUID, timestamp, IP address, user agent and cookie data. If the user happens to be logged into our system at that time, the cookie will tell us who the reader is. If not, the IP address will give us a clue. [When someone logs into the application, we record the userid and IP address, so we can see if someone logged in from that address recently.] Once we've recorded that data, we return a standard image content, irrespective of the GUID passed in.

No one has to click any link for the picture rendering / email read to be logged.

Rather than sending one email to ten recipients (with the same GUID), you will be better off sending ten variations with their own GUIDs. If you are sending the email to a specific individual, you may not need that excess logging (unless you want to track forwarded emails too). 

Personal email clients, such as Thunderbird and GMail, tend not to open remote images by default, offering a vague "privacy" warning. This is what they are warning you about. It means that Dominos probably know that I read their email vouchers about 5 minutes before ordering a pizza. And they will know whether I am using Thunderbird or Chome, and a bunch of other stuff about me. But, cheaper pizza !


Sunday, August 18, 2013

Session based sequences in 12c.

Catching up on some blogs, and I saw this 12c gem by Syed Jaffar Hussain.

Sequences are great. But sometimes they're just a little more unique than you actually need. Sort of like GUIDs. Start churning through them and they're quickly nine or ten digits. There's nothing wrong with long keys from a technical perspective as the extra byte or two of storage is rarely significant. But they can be greedy from a screen layout point of view (especially in tabular forms). And there's a greater chance of getting digits mixed up in email or conversations about them.

If you are doing a nightly or weekly load, it can be nice to load up your half-a-million rows with a 'batch id' and a 'sequence in batch'. Session based sequences are a nice concept that give you scope for keeping those values in a smaller range.

Another trick for reducing the size of IDs is to covert them into HEX. You'll probably only save one character there though, but the mix of alphabetic and numeric characters. TO_CHAR and TO_NUMBER are both happy to use 'XXXXXXXX' format masks to convert between decimal and hex.


Tuesday, July 30, 2013

WITH enhancements in 12c

There's been some mentions of this feature on Oracle-base and elsewhere, but here's an example of what excites me.

If you deal with nested arrays/tables in SQL, then you quickly bump into an impedance match. You can't readily get that embedded list into horizontal columns.

I'll use the example I'm most familiar with - 2-D geometry data types.

There's a concept called the MBR (minimum bounding rectangle). You can basically think of it as the most northerly, southerly, easterly and westerly points of an area.

select a.mbr from LGA_2012_AUST a  where lga_name12 = 'Darwin (C)';

MBR(SDO_GTYPE, SDO_SRID, SDO_POINT(X, Y, Z), SDO_ELEM_INFO, SDO_ORDINATES)
--------------------------------------------------------------------------------
SDO_GEOMETRY(2003, 8311, NULL, 
   SDO_ELEM_INFO_ARRAY(1, 1003, 3), 
   SDO_ORDINATE_ARRAY(130.815117, -12.469386, 130.938563, -12.33006))

The MBR is a geometry column with an ordinate array listing the four points.

We can do a Collection Unnesting operation, but then we end up with rows.

select b.*  from LGA_2012_AUST a, table(a.mbr.sdo_ordinates) b 
where lga_name12 = 'Darwin (C)';

COLUMN_VALUE
------------
  130.815117
  -12.469386
  130.938563
   -12.33006

To get back to columns, you'd need a PIVOT operation. But because we've lost any sense of the order of those values in the array, that has limitations. In Australia we can easily tell the difference between our latitudes and longitudes, but that isn't possible for many locations.

The WITH operator allows us to extract those array elements easily, and without recourse to a stored function. Though the PRAGMA UDF mention by Tim may mean I don't need to fret about the context switches of switching between SQL and PL/SQL as much.

SQL> l
  1  WITH
  2  FUNCTION ext_val (i_arr in MDSYS.SDO_ORDINATE_ARRAY, i_val in number) RETURN NUMBER IS
  3  BEGIN
  4    return i_arr(i_val);
  5  END;
  6  select ext_val(a.mbr.SDO_ORDINATES,1) v1, ext_val(a.mbr.SDO_ORDINATES,2) v2,
  7         ext_val(a.mbr.SDO_ORDINATES,3) v3, ext_val(a.mbr.SDO_ORDINATES,4) v4
  8  from LGA_2012_AUST a
  9* where lga_name12 = 'Darwin (C)'
SQL> /

        V1         V2         V3         V4
---------- ---------- ---------- ----------
130.815117 -12.469386 130.938563  -12.33006

My demo in SQL*Plus works fine in version 12c.
In the 11.2 Instant Client, I could get it to run using
set sqlterminator #

That stops it treating the ";" in line 4 as a terminator, and allows it to pull in the whole statement. The backslash will send it off to the DB for processing, and it works fine then

SQL Developer 4 also seems to choke on the syntax. I'll have to work up the strength to see if it is already logged as an issue. I assume it won't work in 3.2, and that a 12c rollout will require SQL*Plus and SQL Developer installs to be upgraded. Not sure about TOAD and any other clients.

Monday, May 27, 2013

The Adventures of the Trickster developer - Aliases

The Trickster is a mythological being who enjoys potentially dangerous counter-intuitive behaviour. You'll often find him deep within the source code of large systems.

The Alias Trap

Generally an alias in a query is there to make it easier to understand, either for the developer or the database. However the Trickster can reuse aliases within a query to make things more confusing.


desc scott.emp
       Name       Null?    Type
       ---------- -------- --------------------
1      EMPNO      NOT NULL NUMBER(4)
2      ENAME               VARCHAR2(10)
3      JOB                 VARCHAR2(9)
4      MGR                 NUMBER(4)
5      HIREDATE            DATE
6      SAL                 NUMBER(7,2)
7      COMM                NUMBER(7,2)
8      DEPTNO              NUMBER(2)

desc scott.salgrade
       Name      Null?    Type
       --------- -------- --------------------
1      GRADE              NUMBER
2      LOSAL              NUMBER
3      HISAL              NUMBER


desc scott.dept
       Name      Null?    Type
       --------- -------- --------------------
1      DEPTNO    NOT NULL NUMBER(2)
2      DNAME              VARCHAR2(14)
3      LOC                VARCHAR2(13)


The trickster can try queries such as

SELECT e.ename, e.grade
FROM scott.emp e 
       JOIN scott.salgrade e ON e.sal BETWEEN e.losal AND e.hisal;

and


SELECT x.ename, x.dname
from scott.emp x join scott.dept x using (deptno);



As long as any prefixed column names involved are unique to a table, the database can work out what to do. 

If you find this in a 'live' query, it is normally one with at least half a dozen tables where an extra join has been added without noticing that the alias is already in use. And you'll discover it when a column is added to one of those tables causing the database to throw up its hands in surrender. Sometimes you will find it in a dynamically constructed query, when it will fail seemingly at random. 

Once discovered, it isn't a mentally difficult bug to resolve. But first you have to get past the mental roadblock of "But all the columns already have an alias pointing to the table they come from".

Thursday, May 16, 2013

SCNs and Timestamps

The function ORA_ROWSCN returns an SCN from a row (or more commonly the block, unless ROWDEPENDENCIES has been used).

select distinct ora_rowscn from PLAN_TABLE;

But unless you're a database, that SCN doesn't mean much. You can put things in some sort of order, but not much more.

Much better is

select sys.scn_to_timestamp(ora_rowscn) from PLAN_TABLE;

unless it gives you

ORA-08181: specified number is not a valid system change number

which is database-speak for "I can't remember exactly".

That's when you might be able to fall back on this, plugging the SCN in place of the **** :

select * from 
  (select first_time, first_change# curr_change, 
          lag(first_change#) over (order by first_change#) prev_change,
          lead(first_change#) over (order by first_change#) next_change
  FROM v$log_history)
where **** between curr_change and next_change

It won't be exact, and it doesn't stretch back forever. But it is better than nothing.

PS. This isn't a perfect way to find when a row was really inserted/updated. It is probably at the block level, and there's 'stuff' that can happen which doesn't actually change the row but might still reset the SCN. If you're looking for perfection, you at the wrong blog :)



Friday, March 29, 2013

An Oracle April Fools trick

If anyone is looking for a trick for April Fools' Day, try

alter session set nls_date_format = 'fm';

The result will be an simple TO_CHAR on a date, or implicit conversion of a date to a string, will return NULL. You could try that with an ALTER SYSTEM too.

If no-one notices then, "Congratulations", no-one is relying on the default date format.

Tuesday, February 05, 2013

Out with the old, in with the new

The good news is that I've successfully got my picture being flagged up with my blog articles in Google Search. 

The bad news is that sometimes (mostly ?) it has the picture from my deprecated 'domain' GPlus account rather than my primary one. I suspect I need to take more drastic steps to eliminate the old profile. 


Saturday, February 02, 2013

Folks, we have an Image Problem

Firstly the Daily WTF
"The forums have strong feelings about Oracle. Not a single one of those feelings is positive." 
This was actually a post in their "I hate Oracle" forum so it may be a bit biased. But the fact that the Dailt WTF have opened a forum just for Oracle with that name, well I get the hint.


It doesn't help that the only time Oracle hits the big news is when there's another major hole in Java. It's the only installer that comes with a revolving door as standard. Oh, and the Ask.com toolbar being foisted on people


Then I saw the video linked on the highscalability blog . The whole relational / SQL database market has all the cool of, well, a Blackberry phone. It doesn't help that no-one can decide whether to pronounce it S-Q-L or see-quel.

While the video suggest a few fresh naming options, I think we need to be radical. ACIDbase recalls the core requirements of the relational model, suggests danger with a hint of rebellious substance abuse and coolly trips off the tongue. The only disadvantage is that the chemists seem to have wrapped up the best domain names. 

Now all we need is a fresh mascot. I suggest...Beaker




I'll have my people call his people....

Monday, January 28, 2013

Big BLOBs being greedy in TEMP tablespace


I work in a small team. There are just five of us at the 'coalface' for the application, responsible for both development and production. One of the five acts as the 'DBA/Sysadmin' (as well as doing programming and acting team as leader when the actual team leader is on leave and filling in a couple of other roles). And he was on leave on Friday when we got the plaintive email with the subject "The application is broken".

We could log in and out, so it wasn't 'catastrophic'. The application is mostly APEX, so my next step was to check the Apex activity log where I saw "ORA-01652: unable to extend temp segment by 128 in tablespace TEMP". The error was at least 'extreme'. 

Yes, a DBA probably would have gone to the alert.log first. I am not a DBA :)

Prompted by that message, I ran a query across V$TEMPSEG_USAGE to find several inactive sessions under APEX_PUBLIC_USER holding hundreds of MBs of temporary LOB segments. With a temp tablespace around 2GB, this level of usage isn't sustainable. The quick solution was just to shoot the sessions. 

APEX lies on top of HTTP, a stateless protocol and the session state is actually preserved in tables (check out Scott Wesley's post on viewing Apex session state). It generally isn't a problem if you kill one of the database sessions forming the connection pool and it isn't active.

On further investigation, we have identified one trouble spot, which is an application component that allows users to upload and download files. It seems that recently they have been doing large files (hundreds of MB). When the files are uploaded or download it allocates a temporary LOG segment and it holds onto the segment even after a commit or a DBMS_LOB.FREETEMPORARY or a reset package state. 

Those end users aren't getting the same database session each time oo if they do several file transfers, multiple sessions can be hanging onto these hundreds of MB of temporary segments. With around 20 sessions at peak, that was locking up our 2GB temp file.

As a stop-gap, we've introduced a scheduled job every three hours to kill off greedy sessions. It is an ugly solution though, and we'll be discussing options when we're all back at work after the long weekend. These include


  • A support note that points to event 60025 to free up the stray LOB segments
  • Tying the file upload/download component to one or two database sessions
  • File size limits on uploads and downloads
  • Is Apex the appropriate security gate-keeper to those fils
  • Can we avoid dealing with the 'entire' file, and just have a few MBs at a time

If you have any other suggestions, feel free to add a comment.

Sunday, January 27, 2013

Engaging with Google+

In the pub after the Sydney Oracle Meetup ('SQL Developer for DBAs, featuring a guest recorded appearance by the one and only +Jeff Smith ), Google+ got a mention. Okay, it was me who mentioned it, and it was closely followed by a remark by another attendee that he didn't know anyone else who used it.

It is getting some traction, apparently overtaking Twitter in active members ( I don't know how the figures are calculated. Twitter is more high profile still, possibly because practically every tweet is public).

Google+... grew in terms of active usage by 27% to 343m users to become the number 2 social platform. Interestingly for Google, YouTube (not previously tracked by us as a social platform) comes in at number 3....

But it is still a geeky place. The most popular communities are Space, Android and Photography. Photography is big in G+. Sport less so, with the Minecraft community being bigger than the 'F1 Racing' group (which is the biggest of any sport).

Google+ Tips and Suggestions

If you do want to get active in G+, finding appropriate communities is a good start.

Try Science Sunday or IT Professionals for active groups, or find something aligned to your non-work interests. There's no busy Oracle community yet. If you are an Oracle person, feel free to join the Sydney Oracle community (even if you don't live around here).

If you want a selected set of individuals to start with, I've shared a circle here. Some of them (Laurence Pegard and Artsaholic) are 'visual' posters with lots of imagery. Wired, The Economist and the Daily WTF will be known to anyone reading this. Frankly the comments on the Economist posts often fall into the drivel category.Tim Hall and Jeff Smith are included because they are the bigger Oracle 'identities'. David Brin is the sci-fi author, and the others represent good originators or curators of content.

I have one circle that is empty. I share articles to that if I want to read them later.

I have a circle for 'local' content (where it will be active during the daytime in Australia) and another for 'Following' a group of people who post stuff that I don't want to miss. I catch up on that group in the mornings (mostly) simply reading back to the last post I remember.

I have a 'Background' circle of stuff that I can read if I have time, but I can skip when bust. And I sometimes have 'potential' circles where I've imported a bunch of people and want to filter them into a better circle, or remove them altogether,

I generally remove (uncircle) people for posting lots of cat pictures or memes, or if they have a very high noise to signal ratio. I block people who irritate me. It makes life more pleasant.

If you comment on an article, you get notifications for subsequent comments (unless you mute the post). If you follow interesting people, you can often find other interesting people when they comment on interesting posts.

I don't expect people I follow to follow me back. I don't expect to find people I know in real-life.
Lots of people don't post publically. Mostly women, and mostly because of creeps. Because of this, you might need to wait until you are circled by them before you actually see any content from them.

Also, stick enough information in your profile so that people know 'what you are about'.

Thursday, January 10, 2013

Greatest, Least and NULLs

The functions greatest and least are amongst the special ones which can take lots of arguments. They are a bit similar to min and max for columns rather than rows. But they have one difference to the aggregate functions that meant I avoided them for a long time.

Unlike aggregate functions, which ignore null values,  greatest and least will return a null if any of the supplied columns (or expressions) are null. If that isn't what you want (and it often isn't) it means putting NVLs around everything and with substitution values that don't break the logic.

For example, 


create table state_sum
  (style varchar2(10), 
  nsw number, vic number, wa number, qld number, 
  sa number, tas number, nt number, act number);
  
begin
  insert into state_sum values
    ('Red',5,3,9,null,1,7,8,null);
  insert into state_sum values
    ('Blue',null,null,null,null,null,null,null,null);
  insert into state_sum values
    ('Green',10,9,8,7,6,5,4,3);
  commit;
end;
/

If I try a 
SELECT least(nsw,vic,wa,qld) 
FROM state_sum 
WHERE style  = 'Red';
It will return a null (because there's a null value for the QLD column);

I could try
SELECT least(nvl(nsw,0),nvl(vic,0),nvl(wa,0),nvl(qld,0)
FROM state_sum 
WHERE style  = 'Red';

But that returns a 0 because that has been substituted for the QLD null. To get the lowest 'real' value, I'd need to NVL with a value so high that it could never be the lowest. A similar problem occurs for greatest except you can probably get away with a zero....up until you get a set of negative numbers. And don't even get me started on character and date values.

In short, I never found these functions to be as useful as intended.

But once in a while this apparent drawback in their management of null values does make them useful.

Coalesce, greatest/least and null tests

I have become a fan of coalesce as NVL on steroids. 
I never liked NVL2. It might be useful, but if you can't actually give it a decent name, then you shouldn't have bothered. Or maybe just wait till you invent the concept of overloading and simply have the existing two-parameter NVL and an additional three-parameter NVL instead of NVL2.

If you haven't used it, it returns the first no-null value in the list. And, like greatest/least, it can take lots of arguments.

SQL> select style from state_sum
  2  where coalesce(nsw,vic,wa,qld,sa,tas,nt,act) is null;

STYLE
----------
Blue

SQL> select style from state_sum
  2  where coalesce(nsw,vic,wa,qld,sa,tas,nt,act) is not null;

STYLE
----------
Red
Green

But what if you want to know if any of the values are null. That is, you want Red and Blue, but not Green (or vice versa).
Yep, you guessed it. That's where you can use  greatest/least.


SQL> select style from state_sum
  2  where least(nsw,vic,wa,qld,sa,tas,nt,act) is null;

STYLE
----------
Red
Blue

SQL> select style from state_sum
  2  where least(nsw,vic,wa,qld,sa,tas,nt,act) is not null;

STYLE
----------
Green

And it is easy enough to combine them 

SQL> select style,
  2         case when least(nsw,vic,wa,qld,sa,tas,nt,act) is null 
                 then 'Y' else 'N' end nulls_present,
  3         case when coalesce(nsw,vic,wa,qld,sa,tas,nt,act) is null 
                 then 'Y' else 'N' end all_null
  4  from state_sum;

STYLE      N A
---------- - -
Red        Y N
Blue       Y Y
Green      N N