Sunday, October 12, 2014

Latest Oracle allows SELECT without SELECT...FOR UPDATE

Digging through a backlog of Oracle blogs, I came across an gem in a presentation from AMIS (on Slideshare). Got to bullet point 5 on slide 63 and boom !

You all know that when you grant SELECT on a table to a user, they can do a SELECT FOR UPDATE, locking records in the table and preventing other updates or deletes. [Some client tools may do that in the background. ]

Well finally Oracle have cottoned on to that too, and there's a lighter-weight "READ" privilege in 12.1.0.2 which won't allow SELECT FOR UPDATE.

This will make DBAs very happy. Actually it won't. The natural state of a DBA is grumpy, at least when in the vicinity of a developer or salesman.


PS. Why would SELECT FOR UPDATE ever be a good idea for a user with no UPDATE privilege ?
If I had to guess, I'd say it went back to a 'pre-read consistency' model when you might use a SELECT FOR UPDATE to try to select data that wasn't being updated.

Sunday, July 20, 2014

Putting my DB / Apex install through the wringer

I was mucking around trying to get APEX on one of my PCs to be visible on the internet.

This was just a proof-of-concept, not something I intend to actually leave running.

EPG on Port 8080

I do other testing on the home network too, so I already had my router configured to forward port 80 to another environment. That meant the router's web admin had been shifted to port 8080, and it wouldn't let me use that. Yes, I should find a open source firmware, but OpenWRT says it is unsupported and will "brick the router" and I can't see anything for Tomato.

So I figured I'd just use any incoming router port and forward it to the PC's 8080. I chose 6000. This was not a good choice. Looks like Chrome comes with a list of ports which it thinks shouldn't be talking http. 6000 is one of them, since it is supposed to be used for X11 traffic so Chrome told me it was unsafe and refused to co-operate.

Since it is a black-list of ports to avoid, I just happened to be unlucky (or stupid) in picking a bad one. Once I selected another, I got past that issue.

My task list was:

Server
  1. Install Oracle XE 11gR2 (Windows 64-bit)
  2. Configure the EPG for Apex. I ran apex_epg_config.sql as, I had switched straight from the pre-installed Apex 4.0 to 4.2.5 rather than upgrading a version I had actively used. 
  3. Unlocked the ANONYMOUS database account
  4. Checked DBMS_XDB.GETHTTPPORT returned 8080 
(At this point, you can test that you have connectivity to apex on the machine on which XE / Apex is installed, through 127.0.0.1 and localhost).

Local Network
  1. Enabled external access by setting DBMS_XDB.SETLISTENERLOCALACCESS(false); 
(Now you can test connectivity from another machine on the same local network through whatever hostname and/or IP address is assigned to that machine, such as 10.x.x.x or 192.168.x.x)

Remote Network
  • I got a handy Dynamic DNS via NoIP because my home IP can potentially change (though it is very rare). [Yes, there was a whole mess about Microsoft temporarily hijackinging some noip domains, but I'm not using this for anything important.] This was an option in my router setup.
  • The machine that runs XE / Apex should be assigned a specific 192.168.1.nnn IP address by the router (based on it's MAC address). This configuration is specific to the router hardware, so I won't go into my details here. But it is essential for the next step.
  • Configure the port forwarding on the router to push incoming traffic on the router's port 8088 off to port 8080 for the IP address of the machine running XE / Apex. This is also router specific. 
When everything is switched on, I can get to my Apex install from outside the local network based on the hostname set up with noip, and the port configured in the router. I used my phone's 3G internet connection to test this. 

Apex Listener

My next step was to use the Apex Listener rather than the EPG. Oracle have actually retagged the Apex Listener as RDS (Restful Data Services) so that search engines can confuse it with Amazon RDS (Relational Database Service).

This one is relatively easy to set up, especially since I stuck with "standalone" mode for this test. 

A colleague had pointed me to this OBE walkthrough on Apex PDF reports via RDS, so I took a spin through that and it all worked seamlessly.

My next step would be a regular web server/container for RDS rather than standalone. I'm tempted to give Jetty a try as the web server and container for the listener rather than Tomcat etc, but the Jetty documentation seems pretty sketchy. I'm used to the thoroughness of the documentation for Apache (as well as Oracle).


Saturday, June 21, 2014

Literally speaking

Reading Scott Wesley's blog from a days ago, and he made a remark about being unable to concatenate strings when using the ANSI date construct.

The construct date '1900-01-01' is an example of a literal, in the same way as '01-01' is string literal and 1900 is a numeric literal. We even have use some more exotic numeric literals such as 1e3 and 3d .

Oracle is pretty generous with implicit conversions from strings to numbers and vice versa, so it doesn't object when we assign a numeric literal to a CHAR or VARCHAR2 variable, or a string to a NUMBER variable (as long as the content is appropriate). We are allowed to assign the string literal '1e3' to a number since the content is numeric, albeit in scientific notation.

So there are no problems with executing the following:
declare
  v number := '1e3';
begin
  dbms_output.put_line(v);
end;
/

However while 3d and 4.5f can be used as numeric literals, Oracle will object to converting the strings '3d' or '4.5f' into a number because the 'f' and 'd' relate to the data type (Binary Float and Binary Double) and not to the content.

Similarly, we're not allowed to try to use string expressions (or varchar2/char variables) within a date literal, or the related timestamp literal. It must be the correct sequence of numbers and separators enclosed by single quotes. It doesn't complain if you use the alternative quoting mechanism, such as date q'[1902-05-01]' but I'd recommend against it as being undocumented and superfluous.

Going further, we have interval literals such as interval '15' minute .In these constructs we are not allowed to omit the quotes around the numeric component. And we're not allowed to use scientific notation for the 'number' either (but again the alternative quoting mechanism is permitted). 

I've built an affection for interval literals, which are well suited to flashback queries.

select versions_operation, a.* 
from test versions between timestamp sysdate - interval '1' minute and sysdate a;

Confusingly the TIMESTAMP keyword in the query above is part of the flashback syntax, and you have to repeat the word if you are using a timestamp literal in a flashback query. 

select versions_operation, a.*

from test versions between timestamp timestamp '2014-06-21 12:50:00' 
                   and sysdate a


Saturday, June 07, 2014

Apex theme fun


Sometimes you are working with an off-the-shelf product and find something odd, and you're not quite sure whether it is a bug, a feature or whether you've lost the plot.

I use Oracle's Application Express, and was digging into the included theme_18. The templates refer to classes "t18success" and "t18notification"



And then I go looking into the CSS and see hash / ID selectors.

#t18Success{margin:5px auto;font-size:12px;color:#333;background:#DAEED2;width:600px;background-repeat:no-repeat;padding:5px;border:1px #95C682 solid;border-right:none;border-left:none;}

#t18Notification{margin:5px auto;padding:5px;font-size:12px;color:#333;text-align:center;vertical-align:top;border:1px #ffd700 solid;border-right:none;border-left:none;background-color:#ffffcc;width:600px;}

For added confusion, HTML class names are case-sensitive, but CSS selectors are case-insensitive, so the case differences may or may not be relevant.

The application looks nicer if I change the CSS to class selectors, and then I get coloured, dismissable boxes rather than hard to read, unstyled messages. I could probably get the same effect by changing the id="MESSAGE" in the templates, but that seems riskier. At least with the CSS, I am confident that I am just changing the appearance and it shouldn't affect the logic.

Digging deeper, the CSS for more than a dozen of the built-in themes utilise the ID selector "#notification-message" in the CSS. About half a dozen have only a class selector, and another three have both (with the prefix of t followed by the theme number). Finally three just have the ID selector with the theme prefix.

My gut feel is that they switched from the ID to the class selectors stopping in various places on the way. And some of those places aren't very pretty.

I'm not sure I see the benefit in having the theme number embedded in templates and selectors. The template tells it which theme CSS file to get, and as long as the template and CSS are consistent, the use of the theme number just seems to add more place you have to edit when you want to customise a theme.

This was all checked on a fresh Apex 4.0 instance because I just installed the new Windows 64-bit version of Oracle Express Edition. I'll do an upgrade of that default to the latest 4.2 this weekend too.

Friday, April 18, 2014

I Love Logs

It occurred to me a few days ago, as I was reading this article on DevOps, that I might actually be a DevOps.

I think of myself as a developer, but my current role is in a small team running a small system. And by running, I mean that we are 

  • 'root' and 'Administrator' on our Linux and Windows servers
  • 'oracle / sysdba' on the database side, 
  • the apex administrator account and the apex workspace administrators,
  • the developers and testers, 
  • the people who set up (and revoke) application users and 
  • the people on the receiving end of the support email
Flashbacked to Jeff Smith's article on Developers in Prod. But the truth is that there's a lot of people wearing multiple hats out there, and the job titles of old are getting a bit thin. 

The advantage of having all those hats, or at least all those passwords, is that when I'm looking at issues, I get to look pretty much EVERYWHERE. 

I look at the SSH, FTP and mailserver logs owned by root. The SSH logs generally tell me who logged on where and from where. Some of that is for file transfers (some are SFTP, some are still FTP), some of it is the other members of the team logging on to run jobs. The system sends out lots of mail notifications, and occasionally they don't arrive so I check that log to see that it was sent (and if it may have been too big, or rejected by the gateway).

Also on the server are the Apache logs. We've got these on daily rotate going back a couple of years because it is a small enough system that the logs sizes don't matter. But Apex stuffs most of those field values into the URL as a GET, so they all get logged by Apache. I can get a good idea of what IP address was inquiring about a particular location or order by grepping the logs for the period in question.

I haven't often had the need to look in the Oracle alert logs or dump directories, but they are there if I want to run a trace on some code. 

In contracts, I'm often looking at the V$ (and DBA_) views and tables. The database has some audit trail settings so we can track DDL and (some) logons. Most of the database access is via the Apex component, so there's only a connection pool there.

The SELECT ANY TABLE also gives us access to the underlying Apex tables that tell us the 'private' session state of variables, collections etc. (Scott Wesley blogged on this a while back). Oh, and it amazing how many people DON'T log out of an application, but just shut their browser (or computer) down. At least it amazed me. 

The apex workspace logs stick around for a couple of weeks too, so they can be handy to see who was looking at which pages (because sometimes email us a screenshot of an error message without telling us how or where it popped up). Luckily error messages are logged in that workspace log. 

We have internal application logs too. Emails sent, batch jobs run, people logging on, navigation menu items clicked. And some of our tables include columns with a DEFAULT from SYS_CONTEXT/USERENV (Module, Action, Client Identifier/Info) so we can automatically pick up details when a row is inserted.

All this metadata makes it a lot easier to find the cause of problems. It isn't voyeurism or spying. Honest. 

Saturday, April 12, 2014

Unique identifiers - but what do they identify

Most of the readers of this blog will be developers, or DBAs, who got the rules of Normalisation drummed into them during some phase of the education or training. But often we get to work with people who don't have that grounding. This post is for them. Feel free to point them at it.

Through normalisation, the tendency is to start with a data set, and by a methodical process extract candidate keys and their dependent attributes. In many cases there isn't a genuine or usable candidate key and artificial / surrogate keys need to be generated. While your bank can generally work out who you are based on your name and address, either of those could change and so they assign you a more permanent customer or account number.

The difficulty comes when those identifiers take on a life of their own. 

Consider the phone number. When I dial my wife's phone number, out of all the phones in Australia (or the world), it is her's alone that will ring. Why that one ? 

In the dark ages, the phone number would indicate a particular exchange and a copper wire leading out of that exchange hard wired to a receiver (or a set of receivers in the case of Party Lines).  Now all the routing is electronic, telephones can be mobile and the routing for calls to a particular number can be changed in an instant. A phone number no longer identifies a device, but a service, and a new collection of other identifiers have risen up to support the implementation of that service. An IMEI can identify a mobile handset and the IMSI indicates a SIM card from a network provider, and we can change the SIM card / IMSI that corresponds to a phone number, or swap SIM cards between handsets. Outside the cellular world, VOIP can shunt 'phone number' calls around innumerable devices using IP addresses. 

Time is another factor. While I may 'own' a given phone number at a particular time, I may give that up and someone else might take it over. That may get represented by adding dates, or date ranges to the key, or it can be looked at as a sequence. For example, Elizabeth Taylor's husband may indicate one of seven men depending on context. The "fourth husband" or "her husband on 1st Jan 1960" would be Eddie Fisher.

Those without a data modelling background that includes normalisation may flinch at the proliferation of entities and tables in a relational environment. As developers and architects look at newer technologies some of the discipline of the relational model will be passed over. Ephemeral transactions can cluster the attributes together in XML or JSON formats with no need for consistency of data definitions beyond the period of processing. Data warehousing quickly discarded relational formats in favour of 'facts' and 'dimensions'. 

The burden of managing a continuous and connected set of data extending over a long period of time, during which the identifiers and attributes morph, is an ongoing challenge in database design.

Sunday, March 09, 2014

Pre-digested authentication

A bit of a follow-up to my previous post on Digest authentication.

The fun thing about doing the hard yards to code up the algorithm is that you get a deeper level of understanding about what's going on. Take these lines:

    v_in_str := utl_raw.cast_to_raw(i_username||':'||i_realm||':'||i_password);
    v_ha1 := lower(DBMS_OBFUSCATION_TOOLKIT.md5(input => v_in_raw));

Every time we build the "who we are" component for this site, we start with exactly the same hash made up of the username, realm (site) and password. This is a batch routine, which means somewhere we would store the username and password for the site - whether that is a parameter in a scheduling tool, coded into a shell script or OS file, or somewhere in the database. If you've got the security option for Oracle, you can use the Wallet, with its own security layers.

But digest authentication gives us another option. Since we actually use the hashed value of the user/site/password, we can store that instead. The receiving site has no idea the code doesn't actually know the REAL password.

Now turn that over in your head. We can call the web service as this user WITHOUT knowing the password, just by knowing the hash. I don't know about you, but it makes me a little bit more worried when I hear of user details being leaked or hacked from sites. It's all very well reassuring us the passwords are hashed and can't be reverse engineered (assuming your own password can't be brute-forced). But depending on the security mechanism, a leak of those hashes can be dangerous. If a hacked provider advises people to change their passwords, take their advice. 

'Basic' authentication doesn't have the same weakness. In that environment the provider can store the password hash after applying their own 'secret sauce' mechanism (mostly a salt). When you authenticate, you send the password, they apply the secret sauce and compare the result. You can't get away without knowing the password, because all the work is done at their end.

There's no secret sauce for digest authentication, and there can't be. Even if the provider had the password in the clear, there's no way they can be sure the client has the password since all the client needs is the result of the hash. The provider must store, or be able to work out, the result of that hash because they need to replicate the final hash result using both the client and server nonces. They can store either that same user/realm/password hash as is, or they can encrypt it in a reversible manner, but a one-way hash wouldn't be usable.

In short, digest authentication means that our batch routine doesn't need to 'know' the actual password, just a hash. But it also makes those hashes a lot more dangerous.

I'm an amateur in this field. I checked around and it does seem this is a recognized limitation of digest authentication. EG: This Q&A and this comparison of Digest and Basic.