Saturday, May 07, 2011

From CRUD to AAAR - Pirates of Tahiti

When I wore a younger man's clothes, everything was cruddy. CREATE, READ, UPDATE and DELETE. Which equated to 'INSERT, SELECT, UPDATE and DELETE' in SQL terms. We didn't have MERGE in them days.

DELETE was the first to go. It was too hard to get back something you'd deleted, and something ALWAYS got deleted by mistake. We first moved to a 'soft delete', which meant setting a flag on the record indicating it should be (generally) ignored. After we ended up with tables that were 90% 'deleted' records, we went a step further. That's my first 'A' ; Archiving - moving the record into a separate table where it was easier to ignore.

UPDATE also gives problems. When your system was used by three men and a dog, you didn't run into too many concurrency conflicts. They generally worked on different records and only very occasionally had to wait on another user's locks. As you start to scale up, you run into more conflicts. Everyone would be trying to update the daily balance at the same time. You've all seen the effect at traffic lights. You reach the load point where more people are being added to the traffic queue than are being let through on each green light. It only sorts itself out when the load drops, and try telling sales that you need fewer customers.

So rather than UPDATE, you do INSERT or CREATE or, as I'm calling it to fit the pirate theme, APPEND. My second 'A'. Say you have 100 iPads in stock, and have them on a single stock record. Every sale you make, you need to update that record to reduce the available quantity. Only one person can update it at a time and everyone else has to wait. So rather than update the record, you append a transaction record saying how many you sold. I can insert a record and you can insert a record and neither of us has to wait for the other to finish. [Oracle can actually be smart so that when two people are inserting records at the same time, they actually go into different database blocks.]

Some of the NoSQL databases go further. Say I've got a profile record giving my account id, name and email address and I want to update the email address. Rather than do an update of the record, I actually append a new record with a later timestamp or version number and the new email address. This gives me a better audit history, and an easier replication mechanism between distributed copies of the data (because the timestamp settles arguments about the order in which things happened).

The third 'A' is for Aggregation. I take my set of sales transaction and total them up to see how many I sold. I have another set of data indicating how many I've bought. Eventually they'll balance out, even if I have to append some '-1' records to my sales and give refunds to a disappointed fanboi or two. Or with my profile, I take the details from the record with the highest version. Eventually the older ones may be archived.

At the end we will always have 'R' for reading. There's not much point in collecting data that no-one ever reads.

Anyway, that's why there are so many pirates involved in databases these days. I don't have any explanation for any cowboys in the industry.

PS. tahiti.oracle.com is Oracle's Document Search engine. I don't think it was very big on pirates, but if I had mentioned the Caribbean I probably would have ended up with Disney's lawyers.

4 comments:

hillbillyToad said...

Awesome insight as always. Delete is so yesterday, wow, that observation is very simple and enlightening.

Do you see Devs and testers generating CRUD docs anyway though? It's something I talk about, but not something I see often in the wild.

Gary Myers said...

I see very little use of DELETE in requirements, use cases or anywhere. UPDATE does still have a place, but mostly on tables that don't have concurrency or audit history issues. I do expect to see a lot less of them in future.

As for testing, I've been writing test docs the past week and will blog on that shortly.

Joel Garry said...

Must be pirate meme time, just had a pirate themed Dr. Who on BBC America. On this side of the Amy Pond, I guess you could say.

word: paria

Gary Myers said...

We get that on the weekend. I have suspicions that they've tied that episode to the coincide with the new Pirate of the Carribean film a few days later