Notes for ARIES

Paper: ARIES: a transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging

  • Redo: Comparing LSNs ensures that each redo is performed only once. No additional log records are written during the redo pass.
  • Undo: The LSN and the CLR (compensation log record) together ensure that each undo is performed only once. A CLR is written for every log record that is undone.
  • CLR: A redo-only log record that helps ensure an undo is applied only once. It is also used to implement nested top actions (letting recovery bypass certain undo actions).
  • Checkpoint: Reduces the amount of log that must be examined during recovery.
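
To make the CLR bullet concrete, here is a minimal sketch of a rollback that writes one CLR per undone update. The names (LogRecord, rollback, max_steps) are made up for illustration; undo_next_lsn mirrors the paper's UndoNxtLSN field. It assumes the data is already in its pre-rollback state (as it would be after redo), and max_steps only simulates a crash in the middle of the rollback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    lsn: int
    txn: str
    kind: str                            # "update" or "clr"
    prev_lsn: Optional[int]              # previous record of the same transaction
    key: str = ""
    delta: int = 0                       # operation: add delta to data[key]
    undo_next_lsn: Optional[int] = None  # CLRs only: next record still to undo


def rollback(log, data, txn, max_steps=None):
    """Undo txn's updates newest-first, appending one CLR per undone update."""
    by_lsn = {r.lsn: r for r in log}
    txn_recs = [r for r in log if r.txn == txn]
    next_lsn = txn_recs[-1].lsn if txn_recs else None
    last_lsn = next_lsn                  # used to chain prev_lsn of new CLRs
    steps = 0
    while next_lsn is not None:
        if max_steps is not None and steps >= max_steps:
            return                       # simulated crash mid-rollback
        rec = by_lsn[next_lsn]
        if rec.kind == "clr":
            # A CLR is never undone; jump over everything it already compensated.
            next_lsn = rec.undo_next_lsn
            continue
        data[rec.key] -= rec.delta       # undo the operation
        clr = LogRecord(log[-1].lsn + 1, txn, "clr", last_lsn,
                        rec.key, -rec.delta, rec.prev_lsn)
        log.append(clr)
        by_lsn[clr.lsn] = clr
        last_lsn = clr.lsn
        next_lsn = rec.prev_lsn
        steps += 1


log = [LogRecord(1, "T1", "update", None, "x", 5),
       LogRecord(2, "T1", "update", 1, "x", 7)]
data = {"x": 12}
rollback(log, data, "T1", max_steps=1)   # undoes LSN 2, then "crashes"
rollback(log, data, "T1")                # resumes: the CLR points past LSN 2
print(data, len(log))
```

The second call starts at the CLR, follows its undo_next_lsn to LSN 1, and never touches LSN 2 again, which is the "undo only once" property.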

Recovery Process:

  1. Analysis pass: Start from the LSN of the begin_checkpoint record of the latest complete checkpoint and traverse forward to the end of the log. Compute the RedoLSN, the dirty_pages table, and the active_transaction_table. The only log records that may be written by this pass are end records for transactions that had totally rolled back before the system failure but whose end records are missing.
  2. Redo pass: Start from the RedoLSN and traverse forward to the end of the log. Use dirty_pages to decide whether a logged update might need to be redone; if so, compare the log record's LSN with the LSN stored on the non-volatile copy of the page and redo the change only if the log record is newer. No log records are written. This pass repeats history up to the point of the crash, so it may redo changes that the subsequent undo pass will undo.
  3. Undo pass: Start from the end of the log and traverse backward. The active_transaction_table identifies the uncommitted transactions; undo their updates and write a CLR for each update undone.
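
The three passes above can be sketched as a toy in-memory model. This is only an illustration under several assumptions that are not in the paper: there is no checkpoint (analysis scans the whole log), each page holds a single integer, an update is "add delta", a commit record ends its transaction, and all names (Rec, analysis, redo, undo, recover) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rec:
    lsn: int
    txn: str
    kind: str                            # "update", "clr", or "commit"
    prev_lsn: Optional[int] = None
    page: str = ""
    delta: int = 0
    undo_next_lsn: Optional[int] = None  # CLRs only


def analysis(log):
    att, dpt = {}, {}                    # txn -> last LSN, page -> first dirtying LSN
    for r in log:
        if r.kind == "commit":
            att.pop(r.txn, None)
        else:
            att[r.txn] = r.lsn
            dpt.setdefault(r.page, r.lsn)
    return min(dpt.values(), default=None), dpt, att


def redo(log, pages, redo_lsn, dpt):
    for r in log:
        if redo_lsn is None or r.lsn < redo_lsn or r.kind == "commit":
            continue
        if r.page not in dpt or r.lsn < dpt[r.page]:
            continue
        page = pages.setdefault(r.page, {"lsn": 0, "value": 0})
        if page["lsn"] < r.lsn:          # page has not seen this change yet
            page["value"] += r.delta
            page["lsn"] = r.lsn


def undo(log, pages, att):
    by_lsn = {r.lsn: r for r in log}
    last = dict(att)                     # txn -> its most recent LSN
    to_undo = dict(att)                  # txn -> next LSN to examine
    while to_undo:
        txn, lsn = max(to_undo.items(), key=lambda kv: kv[1])
        rec = by_lsn[lsn]
        if rec.kind == "clr":
            nxt = rec.undo_next_lsn      # already compensated, skip ahead
        else:
            clr = Rec(log[-1].lsn + 1, txn, "clr", last[txn],
                      rec.page, -rec.delta, rec.prev_lsn)
            log.append(clr)
            by_lsn[clr.lsn] = clr
            last[txn] = clr.lsn
            pages[rec.page]["value"] += clr.delta
            pages[rec.page]["lsn"] = clr.lsn
            nxt = rec.prev_lsn
        if nxt is None:
            del to_undo[txn]
        else:
            to_undo[txn] = nxt


def recover(log, pages):
    redo_lsn, dpt, att = analysis(log)
    redo(log, pages, redo_lsn, dpt)
    undo(log, pages, att)
    return pages


log = [Rec(1, "T1", "update", None, "P1", 10),
       Rec(2, "T2", "update", None, "P2", 5),
       Rec(3, "T1", "commit", 1),
       Rec(4, "T2", "update", 2, "P2", 7)]
# Disk at crash time: P1 was never flushed, P2 was flushed after LSN 2.
disk = {"P2": {"lsn": 2, "value": 5}}
print(recover(log, disk))
```

In this run, redo restores T1's committed update to P1 and repeats T2's update at LSN 4; undo then removes both of T2's updates and appends two CLRs. Running recover again on the same log and pages changes nothing, because the page LSNs and the CLRs record what has already been done.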

With respect to ACID, ARIES takes care of Atomicity and Durability:

  1. Its only assumption is that WAL (write-ahead logging) is enforced.
  2. Because it logs both undo and redo information, it can use a steal policy: a page can be written to non-volatile storage before commit (since we have the undo log). It can also use a no-force policy: a page doesn’t need to be written to non-volatile storage at commit (since we have the redo log).
  3. Additionally, to achieve Consistency and Isolation, my understanding is that two-phase locking needs to be enforced. For example, suppose two transactions both log an update to the same data on a page, and then one transaction aborts while the other commits. If their modifications are interleaved, the aborting transaction can’t safely undo its change; otherwise, the undo process described by ARIES would also undo the change made by the committed transaction! Two-phase locking ensures that such interleaving never happens.
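
The interleaving problem in point 3 can be shown with a small sketch. It assumes value-based undo with before images and no locking at all; write and undo_with_before_image are names invented for this illustration, not anything from the paper.

```python
def write(db, undo_log, txn, key, value):
    """Record the before image, then overwrite the value (no locking)."""
    undo_log.append((txn, key, db[key]))
    db[key] = value


def undo_with_before_image(db, undo_log, txn):
    """Roll back txn by restoring its before images, newest first."""
    for t, key, before in reversed(undo_log):
        if t == txn:
            db[key] = before


db = {"x": 0}
undo_log = []
write(db, undo_log, "T1", "x", 1)
write(db, undo_log, "T2", "x", 2)   # with locks held until commit, T2 would wait here
# T2 commits, then T1 aborts:
undo_with_before_image(db, undo_log, "T1")
print(db)                           # T2's committed write to x has been overwritten
```

If T1 held its lock on x until it finished, T2's write could not slip in between T1's update and T1's rollback, so the rollback would only ever remove T1's own change.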

In summary, the innovation lies in how the LSN and CLR are used:

  1. LSN limits the amount of log to be checked during recovery.
  2. The LSN makes redo apply a change at most once, and the LSN and CLR together make undo apply a change at most once. These two properties make operation logging possible: undo and redo need not be idempotent, because each executes only once. Otherwise, we would need before/after images to achieve idempotence, since undo or redo might execute more than once.
  3. Since both undo and redo are well defined, buffer management is flexible: steal (with the help of undo), no-force (with the help of redo), or both can be used.
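
As a small illustration of point 2, the sketch below replays a log of non-idempotent "add delta" operations twice, once blindly and once guarded by a page LSN. The page layout and function names are assumptions made for this example only.

```python
def redo_blind(page, log):
    """Replay every operation without checking whether it was already applied."""
    for lsn, delta in log:
        page["value"] += delta


def redo_with_page_lsn(page, log):
    """Replay an operation only if the page has not yet seen its LSN."""
    for lsn, delta in log:
        if page["lsn"] < lsn:
            page["value"] += delta
            page["lsn"] = lsn


log = [(1, 10), (2, -3)]            # (LSN, delta) operation records

blind = {"lsn": 0, "value": 0}
redo_blind(blind, log)
redo_blind(blind, log)              # e.g. recovery restarted after a second crash

guarded = {"lsn": 0, "value": 0}
redo_with_page_lsn(guarded, log)
redo_with_page_lsn(guarded, log)

print(blind["value"], guarded["value"])   # 14 7
```

The blind replay double-counts the increments, while the LSN-guarded replay ends at the correct value no matter how often it is repeated. That is why the logged operations themselves don't need to be idempotent.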

