Sunday, August 21, 2011

Redo log buffer or redo log file ?

There's a very interesting post on Jonathan Lewis' blog under the unassuming title of REDO.

Oracle treats a transaction as committed when the change and the commit have been written to the redo log BUFFER and doesn't require them to be written to the redo log FILE.

This can cause issues if the instance fails before that log data is flushed to the file. If the disk write fails BECAUSE the instance fails, it is very unlikely that anything would have had a chance to look at the data in that tiny gap (but even rare risks WILL happen somewhere, sometime). If the disk write fails but the instance continues for a time (which is simulated in Jonathan's post) then the risk gets higher.
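
To make that gap concrete, here is a toy Python model of the scenario. It is only a sketch under my own assumptions and says nothing about how Oracle is actually implemented: the class ToyRedoLog, its methods and the disk_ok flag are all made up for illustration. The point is simply that a change can be visible to other sessions while only in the buffer, and vanish if the flush to the file never succeeds.

```python
class ToyRedoLog:
    """Toy model of a log buffer and a log file (hypothetical, not Oracle)."""

    def __init__(self, disk_ok=True):
        self.buffer = []        # redo entries held in memory only
        self.file = []          # redo entries that reached "disk"
        self.disk_ok = disk_ok  # False simulates a failing log write

    def commit(self, txn_id, change):
        # Assumption of this model: once the commit record is in the
        # buffer, other sessions can already see the change.
        self.buffer.append((txn_id, change))

    def flush(self):
        # Simulated log writer: copy buffered entries to the file.
        if not self.disk_ok:
            return False        # write failed, entries stay in memory
        self.file.extend(self.buffer)
        self.buffer.clear()
        return True

    def visible(self):
        # What other sessions see while the instance is still up.
        return [change for _, change in self.file + self.buffer]

    def after_crash(self):
        # Only entries in the file survive an instance failure.
        return [change for _, change in self.file]


def demo(disk_ok):
    log = ToyRedoLog(disk_ok=disk_ok)
    log.commit(1, "balance=100")
    log.flush()
    return log.visible(), log.after_crash()
```

Calling demo(False) models the case where the write fails but the instance carries on: the change shows up in visible() but not in after_crash(). With demo(True) both lists contain the change.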

I think RAC throws a few more variables in there. Conceivably, a transaction in one instance may have committed data (not written to redo log file) shunted to another instance where it can be amended again and written to that instance's redo log file. If that first instance fails without a persistent copy of that initial transaction, then it can never be re-applied in its entirety.

But then I'm a developer and maybe I'm missing something in the way instance recoveries are managed in RAC.

So is Oracle 'broken' ?

One problem is that the notion of 'durability' is vague. A committed action should last beyond 'system failure', but in a 'cheap' system the CPU and disks can be in the same rack (or the same server) and a failure, such as a fire, could destroy both. Does that mean a transaction shouldn't be viewed as 'committed' until the log file is archived and shipped elsewhere ? Conversely, an expensive 'Data Guard' architecture with a maximum protection mode/level might not be impacted by the failure of a redo log write on a single node.

I suspect this one will run on for a while, and it is worth keeping an eye on Jonathan's post as the experts weigh in.


Anonymous said...

If I am not mistaken, a transaction is committed only after it is written to the redo log file, not to the buffer.

SydOracle said...

There's a great discussion on this at Jonathan Lewis' blog.

The upshot is that, even if the log buffer hasn't been (or can't be) written to disk, the change is visible to other transactions. As such it is 'committed', though it is not 'durable' (so arguably should not be considered as committed).