Tuesday, May 25, 2010

Word for today is FAILSAFE.


Failsafe does NOT mean that the item cannot fail. What it does mean is that, in the event of a failure, it won't go BOOM. As far as databases go, it means that when a statement fails, any work done by that statement is undone. When a transaction fails, any work done by that transaction is undone. When a recovery is done, it leaves the database in a consistent state (i.e. no transactions left partly done).
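To make the statement-versus-transaction distinction concrete, here is a minimal sketch. It uses SQLite through Python's built-in sqlite3 module purely as a stand-in (it is not Oracle, and the accounts table and function name are made up for the illustration). A multi-row UPDATE fails partway through on a CHECK constraint; the rows it had already touched are put back, earlier work in the same transaction survives, and a ROLLBACK then undoes the lot.

```python
import sqlite3


def statement_vs_transaction_demo():
    # isolation_level=None puts the connection in autocommit mode,
    # so the BEGIN / ROLLBACK below are entirely under our control.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE accounts ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [(1, 100), (2, 50), (3, 10)],
    )

    def balances():
        return [row[0] for row in
                conn.execute("SELECT balance FROM accounts ORDER BY id")]

    conn.execute("BEGIN")
    # First statement in the transaction: succeeds.
    conn.execute("UPDATE accounts SET balance = balance + 5 WHERE id = 1")

    statement_failed = False
    try:
        # Second statement: account 3 would go negative, so the CHECK
        # constraint rejects it and the whole statement is backed out.
        conn.execute("UPDATE accounts SET balance = balance - 20")
    except sqlite3.IntegrityError:
        statement_failed = True

    # Only the failed statement's work is gone; the +5 is still there.
    after_failed_statement = balances()

    # Now fail the whole transaction: everything since BEGIN is undone.
    conn.execute("ROLLBACK")
    after_rollback = balances()

    conn.close()
    return statement_failed, after_failed_statement, after_rollback


if __name__ == "__main__":
    print(statement_vs_transaction_demo())
```

The same shape applies in Oracle, though the syntax and the error raised would differ; treat this as an illustration of the idea rather than of Oracle's behaviour in detail.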

Oracle rates pretty highly on the failsafe scale. When a SQL statement, or a top-level anonymous PL/SQL block, fails, it will roll back all the TRANSACTIONAL data changes it made. It won't reset non-transactional items, such as PL/SQL package level variables or any stuff done outside the database (OS files, web service calls, FTP transfers etc). It doesn't work miracles.
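The "no miracles" part is easy to see in a small sketch. Again this is SQLite via Python's sqlite3 module standing in for Oracle, with an ordinary Python counter playing the role of a package level variable and a temp file playing the role of an OS file; the orders table and the function name are hypothetical. The database change is rolled back when the block fails, but the counter and the file keep whatever was done to them.

```python
import os
import sqlite3
import tempfile


def run_failing_job():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)"
    )
    conn.commit()

    counter = {"calls": 0}  # in-memory state, not transactional
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "job.log")  # OS file, not transactional
        try:
            # "with conn" commits on success and rolls back on an exception.
            with conn:
                conn.execute("INSERT INTO orders (item) VALUES ('widget')")
                counter["calls"] += 1
                with open(log_path, "a") as log:
                    log.write("order placed\n")
                # NOT NULL violation: the block fails here.
                conn.execute("INSERT INTO orders (item) VALUES (NULL)")
        except sqlite3.IntegrityError:
            pass

        rows_in_table = conn.execute(
            "SELECT COUNT(*) FROM orders"
        ).fetchone()[0]
        with open(log_path) as log:
            log_lines = log.read().splitlines()

    conn.close()
    return rows_in_table, counter["calls"], log_lines


if __name__ == "__main__":
    rows, calls, lines = run_failing_job()
    print("rows in table:", rows)
    print("counter:", calls)
    print("log file:", lines)
```

The table ends up empty, yet the counter has moved and the log file says an order was placed. Anything outside the transaction needs its own cleanup, or needs to be done only after the commit has succeeded.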

Failures are a fact of life. The "not an option" declaration can only be valid at a system level, where multiple components are present to cover for the failure of individual components. The only way to guard against failure is to anticipate and understand that anything and everything can fail, and to put steps in place to remediate the situation when it does.

The power can fail and the server crash, so we have the redo logs. The media can fail, so we can back up the data files and multiplex the redo logs. Servers fail, so we have RAC and Data Guard.

Your software will fail. The question you've got to ask yourself is, will it fail safe?
And if you're not sure, well, punk, do you feel lucky?
