Intro to Recovery

Recall that Recovery guarantees Atomicity and Durability.
Paper outlines issues and options for recovery.

Types of Failure

Overwriting Options

ATOMIC: a transaction’s updates become visible on disk all at once. System R’s "shadow paging" scheme did this. Pros/cons?
NOT ATOMIC: parts of transactions can be on disk without other parts. Pros/cons?
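
A toy sketch of the ATOMIC option may help with the pros/cons. It is not System R's actual shadow paging scheme: the class name ShadowPagedStore, the in-memory dict standing in for the disk, and the single-transaction usage are all assumptions made for illustration. The point is only that updates go to a private copy of the page table and become visible through one pointer swap.

```python
class ShadowPagedStore:
    """Toy store: a page table maps logical page ids to page contents."""

    def __init__(self, pages):
        # The current page table; it is replaced in one step at commit.
        self.current = dict(pages)

    def begin(self):
        # The transaction gets a private copy of the page table,
        # so the current (shadow) version stays intact.
        return dict(self.current)

    def write(self, table, page_id, data):
        # Updates touch only the private table.
        table[page_id] = data

    def commit(self, table):
        # One reference swap: all of the transaction's updates appear at once.
        self.current = table


store = ShadowPagedStore({1: "a", 2: "b"})
txn = store.begin()
store.write(txn, 1, "A")
store.write(txn, 2, "B")
before_commit = dict(store.current)   # still the old versions
store.commit(txn)
after_commit = dict(store.current)    # both updates visible together
```

Abandoning the private table is all it takes to abort, which is the main attraction; the copy made in begin() hints at one of the costs.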

Buffer Pool Eviction Options

STEAL: a transaction’s updates may be flushed from the buffer at arbitrary times, since another transaction is allowed to "steal" a buffer pool frame. Pros/cons?
NO STEAL: all of a transaction’s updates remain in the buffer pool until commit time. Pros/cons?
FORCE: at commit time, all modified pages are forced (flushed) to disk. Pros/cons?
NO FORCE: modified pages may remain in the buffer pool even after commit. Pros/cons?
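
One way to weigh these options is to ask what each demands of the log. The helper below is a hypothetical sketch (the function name and boolean flags are not from the paper). It deliberately ignores one complication: with FORCE, a crash in the middle of the commit-time flush still has to be handled somehow.

```python
def required_log_records(steal: bool, force: bool) -> set:
    """Return which kinds of log records the chosen policies call for."""
    needed = set()
    if steal:
        # Uncommitted updates may already be on disk, so they must be undoable.
        needed.add("UNDO")
    if not force:
        # Committed updates may exist only in memory at crash time,
        # so they must be redoable.
        needed.add("REDO")
    return needed


steal_no_force = required_log_records(steal=True, force=False)
no_steal_force = required_log_records(steal=False, force=True)
```

Here steal_no_force contains both kinds of record, while no_steal_force is empty under the simplification stated above.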

Log Data

Depending on the options chosen above, recovery needs REDO log records, UNDO log records, or both.
Log records can be logical (e.g. "inserted a tuple t into relation R"), or physical (e.g. "byte 74 of page 255 used to be ‘r’ and now is ‘s’"). Pros/cons of each?
Physical log records can be before/after images of pages (subpages), or diffs.
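
The physical example above (byte 74 of page 255) can be sketched as a log record carrying a before image and an after image. The record layout, the 128-byte page, and the function names are assumptions for illustration, not a format used by any particular system.

```python
from dataclasses import dataclass


@dataclass
class PhysicalLogRecord:
    page_id: int
    offset: int
    before: bytes  # before image, used by UNDO
    after: bytes   # after image, used by REDO


def redo(pages, rec):
    # Reinstall the after image at the recorded position.
    page = pages[rec.page_id]
    page[rec.offset:rec.offset + len(rec.after)] = rec.after


def undo(pages, rec):
    # Restore the before image at the recorded position.
    page = pages[rec.page_id]
    page[rec.offset:rec.offset + len(rec.before)] = rec.before


pages = {255: bytearray(b"r" * 128)}  # hypothetical 128-byte page
rec = PhysicalLogRecord(page_id=255, offset=74, before=b"r", after=b"s")

redo(pages, rec)
after_redo = bytes(pages[255][74:75])
undo(pages, rec)
after_undo = bytes(pages[255][74:75])
```

Both operations are idempotent here, since each simply overwrites bytes. A logical record such as "insert t into R" does not have that property for free.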

Checkpoints

In order to speed up recovery, it’s nice to have "checkpoint" records that limit the amount of log that needs to be processed during recovery. It can be tricky to do efficient checkpoints.
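
A minimal sketch of how a checkpoint bounds the work, under a strong assumption: the checkpoint is taken while no transaction is active and after all dirty pages have been flushed, so nothing before it matters to recovery. The tuple-based log format and the function name are hypothetical. Cheaper checkpoints that do not quiesce the system need more bookkeeping than this.

```python
def records_to_process(log):
    """Return the part of the log that recovery must scan."""
    start = 0
    for i, rec in enumerate(log):
        if rec[0] == "CHECKPOINT":
            # Remember the position just past the most recent checkpoint.
            start = i + 1
    return log[start:]


log = [
    ("UPDATE", "T1", 7),
    ("COMMIT", "T1"),
    ("CHECKPOINT",),
    ("UPDATE", "T2", 9),
    ("COMMIT", "T2"),
]
tail = records_to_process(log)
```

With this log, only T2's two records remain to be processed; T1's work is already covered by the checkpoint.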

State of the Art (as exemplified by ARIES)

Focus on speed and generality, rather than simplicity.
NOT ATOMIC, STEAL, NO FORCE. This allows the buffer manager to make intelligent (i.e. efficient) decisions about when and what to flush to disk.
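Log data is "physiological": each record identifies its page physically but may describe the change within that page logically. In practice some records are more physical (e.g. B-tree page splits), and some more logical (e.g. heap-tuple insertions).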

Many more details in ARIES paper!