Wednesday, December 22, 2010

Is it really 10 inches ?

Actually the measurements in question are 9.7 and 10.1 which are the respective screen sizes, in inches, of an iPad and my Asus EeePC . And it is screen size, or how screen sizes are measured, that I'm talking about.

If you remember the rules about squaws on hippopotamuses, we can work out the actual width and height of the screens, but crucially only if we know their ratio. The iPad has a 9.7 inch diagonal screen at a 1024 x 768 ratio (often stated as 4:3 but I'll go with the numerical value of 1.33). My EeePC has a 10.1 inch diagonal screen with an odd 1024 x 600 ratio. That equates to a 1.7 ratio (somewhere between a 16:10 and a 16:9).

This gives the EeePC a size of 8.72 x 5.1 and the iPad 7.75 x 5.83. The squarer size means the iPad is actually about 2% larger in area than the EeePC despite the iPad having the smaller diagonal measure.

Thanks to silisoftware for supporting those calculations.

So a couple of observations.

Firstly, as a media consumption device, the iPad has not been optimized for widescreen movies but more for 'page' sized content. Steve Jobs is a smart guy, especially on user experience. He didn't make that choice by flipping a coin. My bet is he wasn't working out what people would use the device for either. He decides what he is going to sell, and people decide whether to buy it or not, and he wanted a portable device not a encumberance.

Secondly the size of the iPad screen includes the 'keyboard'. Googling "ipad virtual keyboard" images, it looks like the keyboard takes up most of the bottom half of the screen, at least in landscape mode. Where an application requires text entry (name, email...) a 1024x768 screen on a netbook is very different to the same size screen on a tablet. And re-jigging things when it is switched between landscape and portrait is another can of worms.

The smaller the screen, the larger the impact of any virtual keyboard. You can read characters that are a lot smaller than a 'key' you press for typing. 

When Cary Millsap was presenting to the Sydney Oracle Meetup last week, one of the things he said that if the same value for a measure can result from two completely different user experiences, you are measuring the wrong thing. He was speaking about how an average may mask extreme variances, but the same applies to this situation. While the screen size of a netbook and an iPad are similar, the experiences can differ considerably.

So can people stop talking about 10 inches (and that includes those people spruiking herbal products).

Thursday, December 09, 2010

Globally and Locally Structured Databases

I've been reading another article on NoSQL.

This one focuses on the 'No Join' aspect which the author relates to schema-less data.

Normalisation and joins don't go hand-in-hand though. If you are old enough, you might have had an address book (or a filofax). In there, you would write addresses organized by name. Under "F" you'd write "FLINTSTONE, FRED and WILMA" followed by an address in Bedrock. That's pretty normalised. You haven't got one entry for Fred and another for Wilma both with the same address. You might write Fred's birthday next to his name, and Wilma's next to hers, and have another date for their wedding anniversary.

Implicitly you've recognised that some items belong the the 'family' entity and others to the 'family member' entity. But you've made them part of a single object that provides the joining mechanism.

SQL databases use tables to represent entities, requiring joins to pull back related entities (which leads to concepts like primary keys). I started out on an IDMS network database which didn't have tables, but rather explicit relationships between entities. You would start with a parent, follow a relationship down to a child and then to siblings (either all of them, or as many as you wanted to find). You might then follow a different relationship from that child to another kind of parent. Still normalised, but a different linking mechanism.

There is a difference between what I'd term 'globally structured data' and 'locally structured data'. SQL databases have a global structure in that a consistent structure is forced on the entire database. One the other hand Document databases allow each document to follow its own structure. Key-value stores don't have any explicit structure, leaving it up to the application to interpret.

Where you have orders consisting of order lines, a SQL database will have one Orders table and one Order Lines table, and all the Orders would have the same attributes (for example Customer Name), and all the Order Lines would have the same attributes. A document database may have some orders that have a "Customer Name", but others that have "Customer Surname" and "Customer First Name" and yet others that have "Customer Corporate Name". This puts more emphasis on the application to work with the potential varieties.

So what is the difference when it comes to the 'agility' of the implementations ?

In the run up to Christmas, your firm decides to change your business model. Rather than forcing people to make separate orders for each delivery address, you now allow each item to be delivered to a different location. In a normalised database you need to add the DELIVERY_LOCATION to the ORDER_LINE entity. You then update all existing orders so that the DELIVERY_LOCATION from the ORDER is copied down to the individual child ORDER_LINE records and finally drop the DELIVERY_LOCATION from the ORDER table.

In a document database you can change the application and start recording your new delivery location on your Order Lines. At the database layer, you don't have to worry that the structures are not consistent. The problem is you still have to write code that copes with delivering items to the right place, and that handles orders that just have a delivery location at the order level as well as orders that have it at the order item level. Your application logic now has additional complexity, plus you need test cases that cope with both sets of conditions. To make it easy, you may have template versions for the documents, so that 'v1.1' Orders have the single delivery location but the newer 'v2.0' Orders have it at the lower level. 

Worse yet, your application will have bugs in it. Maybe not today, maybe not tomorrow but one day, and you'll regret the consequences for the rest of your life. Because you'll be stuck with some 'v1.1' Orders that have a delivery location on the Order Lines, or perhaps a 'v2.0' with no delivery location at all, or something that claims to be a 'v1.3' style which isn't documented anywhere but is something Barney had to write at five o'clock on Christmas Eve.

It is perfectly possible for you to run a 'migration' on your document database so that all your data is at the same version. This basically enforces the same situation as a 'globally structured database' like SQL....except you need to get all the code right because you won't get enforcement support from the database layer itself.

Another solution might be to delete orders once they are delivered. After a few months, those 'v1.1' documents won't be an issue any more. At least in five years time you won't have code that is built to deal with 20 different versions.

Generally, the problems of being Agile in an RDBMS are not to do with SQL or relational concepts. They are a consequence of being unable to restructure existing data (perhaps you can't work out how, or don't have the data you need). But they are ultimately problems in migration from the old business model to the new model, not database problems.

To be even handed, schema changes are still not seamless, especially with more primitive database engines, or when dealing with high-availability / multi-server / replicated databases. But that's a 'state of the art' issue, not a fundamental conceptual issue.

Monday, December 06, 2010

Congratulations, Jeff (and Neils, Peter and Elic)

Kellyn previosuly confessed to her DBA crushes.
I'm going to confess to being a PL/SQL Stalker. It's okay, I've confessed before and he didn't object :)

This time, Jeff got one step beyond in the Q3 playoffs for the PL/SQL Challenge. Congratulations (but I'll get you next time, and your little dog too). I'll take comfort in the thought that the top ten only had one player from England, mitigating slightly Australia's performance at the cricket in Adelaide. Distracting the poms with the UKOUG may have helped. [Yes, I did migrate from England to Australia. I'm allowed to make cheap remarks either way.]

In this current quarter's rankings, he's at 20th, while I'm a bit lower down at 27th. Okay, there are four from the UK in the top 30, but I'll gloss over that fact.

One reason for my stalking is that Jeff is a blogger working with PL/SQL in Australia, albeit a couple of thousand miles away (about 4000 kilometers by road). We did meet at an AUSOUG conference a few years ago.

The other place we cross paths is StackOverflow. As shown here, we are both members of that select group of people with the PL/SQL 'bronze' badge, and the Oracle 'silver' badge. Actually Tony Andrews, Vincent Malgrat and Andrew Clarke (APC) have also got their Oracle 'gold' badge. I might start stalking them too, and maybe Justin Cave as well. Be warned, if you stalk users in StackOverflow and find yourself agreeing with them, you'll probably vote them even higher.

An interesting feature of StackOverflow is that they do regular data dumps of all the questions, answers and comments. They've also got that installed on SQL Server Azure so you can actually run your own queries against the data. I've written one to keep track of the exact scores (as of the dump date) on a specific tag.  There I can see Jeff has 74 upvotes for the tag 'oracle10g' while I've only got 47.

But I am beating him in the 'oracle' and 'plsql' tags. So there !