ResourcesDatabaseTransaction Management & ACID
DatabaseCollege

Transaction Management & ACID

A transaction is a logical unit of work comprising one or more database operations, treated as a single indivisible sequence of actions. Transaction management ensures the correct execution of these operations, particularly in environments with concurrent access and potential system failures.

This guide covers ACID properties, transaction states, isolation levels and concurrency anomalies, write-ahead logging, ARIES recovery, SQL walkthroughs with SAVEPOINT examples, and a 10-question practice quiz.

1Introduction

In database management systems, a transaction is defined as a logical unit of work that comprises one or more database operations. These operations are treated as a single, indivisible sequence -- either all completing successfully (committing) or none at all (aborting/rolling back).

Why do transactions matter? They enforce data integrity by preventing the database from entering an inconsistent state. They enable concurrency, allowing multiple users to access and modify data simultaneously. And they provide a mechanism for recovery after system crashes, ensuring committed changes are not lost and aborted changes are fully undone.

In Practice

Consider an online banking system. When a user transfers money from account A to account B, this involves deducting funds from A and adding them to B. These two operations must happen together; if one succeeds and the other fails, the bank's books would be inconsistent. A transaction ensures this “all-or-nothing” behavior, guaranteeing that either the full transfer completes, or it's entirely cancelled, maintaining financial integrity.

2Key Definitions

Essential terms for understanding transaction management at the university level.

Transaction

A logical unit of work: a sequence of read/write operations treated as a single, indivisible step

ACID Properties

Atomicity, Consistency, Isolation, Durability -- the four guarantees for reliable transaction processing

Commit

Signals successful completion; all changes are made permanent and visible

Rollback (Abort)

Signals unsuccessful completion; all changes are undone, restoring previous state

Transaction Log

Sequential, append-only file on stable storage recording all modifications and events

Checkpoint

A synchronization point where dirty pages and log records are flushed to stable storage

Recovery Manager

DBMS component that restores the database to a consistent state after a failure

Concurrency Control

Mechanism ensuring correct execution of concurrent transactions while preserving consistency

Serializable Schedule

A schedule equivalent to some serial schedule -- the gold standard for correctness

Isolation Level

Setting that defines the degree of isolation from other concurrent transactions

Deadlock

Two or more transactions indefinitely waiting for each other to release a resource

Savepoint

A named mark within a transaction to which a partial ROLLBACK can be performed

Write-Ahead Logging (WAL)

Protocol: log record must be written to stable storage before data page is written to disk

Dirty Page

A data page modified in the buffer pool but not yet written back to stable storage

3ACID Properties Deep Dive

The ACID properties are the cornerstone of reliable transaction processing, formally introduced by Andreas Reuter and Theo Härder in 1983. They ensure the correctness and robustness of database systems.

Atomicity (All-or-Nothing)

A transaction is treated as a single, indivisible unit. Either all operations complete successfully, or none are applied.

Example: In a fund transfer, if the debit succeeds but the credit fails, atomicity ensures the debit is reversed.

Consistency (Valid States)

A transaction transforms the database from one valid state to another, preserving all integrity constraints, domain constraints, and business rules.

Example: A withdrawal that would make a balance negative is rejected to preserve the non-negative balance constraint.

Isolation (No Interference)

Concurrent transactions execute as if each were the only one running. Intermediate effects are not visible to other transactions until commit.

Example: Two concurrent inventory updates produce the same correct result as if they ran one after the other.

Durability (Permanent Changes)

Once committed, changes persist even after system failures. Achieved via the transaction log and write-ahead logging (WAL).

Example: After a payment commits, the record persists even if the server immediately crashes.

ACID vs BASE

Modern distributed systems often embrace BASE (Basically Available, Soft state, Eventually consistent) as an alternative to strict ACID. BASE prioritizes high availability and partition tolerance over immediate strong consistency, and is common in NoSQL databases. The choice between ACID and BASE depends on your application's consistency requirements.

4Transaction States & State Diagram

A transaction progresses through five primary states from initiation to completion.

Transaction State Machine

Every transaction follows this lifecycle from BEGIN to TERMINATED.

BEGINActiveR/W operationslast op donePartiallyCommittedin buffer, not on disklog flushedCommitteddurable on diskerror / ROLLBACKcrash / constraint failAbortedchanges undoneendsystem cleanupTerminatedresources released

1. Active

The initial state. The transaction is executing its operations (READ, WRITE, UPDATE, INSERT, DELETE). Entered via BEGIN TRANSACTION.

2. Partially Committed

After the last operation executes. Changes are still in volatile memory (buffer cache), not yet on stable storage.

3. Committed

All changes are permanently recorded on stable storage. Changes are durable and visible to other transactions.

4. Aborted

The transaction failed. All changes are undone (rolled back). Can occur from Active or Partially Committed states due to errors, crashes, or explicit ROLLBACK.

5. Terminated

The final state reached after either Committed or Aborted. The transaction is no longer active in the system.

Savepoints

Savepoints provide partial rollback within a single transaction. A SAVEPOINT sets a named marker; ROLLBACK TO SAVEPOINT undoes only the changes made since that savepoint.

SAVEPOINT savepoint_name;
ROLLBACK TO SAVEPOINT savepoint_name;
RELEASE SAVEPOINT savepoint_name; -- Removes the savepoint

5Isolation Levels & Anomalies

Databases offer different isolation levels that let developers choose a trade-off between strict isolation and performance. Each level prevents a different set of concurrency anomalies.

Concurrency Anomalies

Dirty Read

Reading data written by another uncommitted transaction. If it aborts, you read “dirty” data that never existed.

Non-Repeatable Read

Reading the same row twice and getting different values because another committed transaction modified it between reads.

Phantom Read

Re-executing a query and getting a different set of rows because another transaction inserted or deleted matching rows.

Isolation Levels Table

Isolation LevelDirty ReadNon-RepeatablePhantom
Read UncommittedPossiblePossiblePossible
Read CommittedPreventedPossiblePossible
Repeatable ReadPreventedPreventedPossible
SerializablePreventedPreventedPrevented

Serial vs Serializable Schedules

T1: A = A + 50 | T2: A = A × 2 | Initial: A =

00

// Click Play to compare schedule types
Serial Schedule (T1 then T2)
T1READ(A) → 100
T1A := A + 50
T1WRITE(A) = 150
T1COMMIT
T2READ(A) → 150
T2A := A × 2
T2WRITE(A) = 300
T2COMMIT
A = $300
Interleaved Schedule
T1READ(A) → 100
T2READ(A) → 100
T1A := A + 50
T1WRITE(A) = 150
T1COMMIT
T2A := A × 2
T2WRITE(A) = 200
T2COMMIT
A = $200
LOST UPDATE!
00 ≠ $300
The final state shows a lost update anomaly. Click Play to step through how it happens.
Step 0 / 8
Conflict Serializability

Two operations conflict if they belong to different transactions, access the same data item, and at least one is a WRITE (R-W, W-R, W-W are conflicts; R-R is not). A schedule is conflict serializable if its precedence graph is acyclic. This is the practical criterion most DBMS use to determine correctness of concurrent execution.

6Write-Ahead Logging (WAL)

The WAL protocol is a fundamental principle for ensuring durability and atomicity. It dictates two crucial rules:

Rule 1: Log Record First

Before any data page is written to stable storage, the log record describing the modification must be written to stable storage first.

Rule 2: Commit Record First

Before a transaction can be declared committed, its commit log record must be written to stable storage first.

Write-Ahead Logging in Action

Bank transfer:

00 from Account A to Account B

// Click Play to see WAL protocol step by step
Log (on disk)
LSN 1: <START T1>
LSN 2: <T1, A.bal,
000, $800>
LSN 3: <T1, B.bal, $500, $700>
LSN 4: <COMMIT T1>
Buffer Pool
empty
Disk (Stable)
A.balance
$800
B.balance
$700
The final state after a successful bank transfer using WAL. Click Play to step through the protocol.
Step 0 / 8

Log Record Types

<START T_i> -- Transaction T_i has begun

<COMMIT T_i> -- Transaction T_i has committed

<ABORT T_i> -- Transaction T_i has aborted

<T_i, X, OldValue, NewValue> -- Update record for undo/redo

<CLR T_i, UndoNxtLSN> -- Compensation Log Record (during undo)

<CHECKPOINT L> -- Checkpoint with list L of active transactions

Deferred vs Immediate Update

Deferred Update (NO-UNDO/REDO)

Changes written to disk only after commit. No undo needed on abort, but requires keeping all modifications in memory until commit.

Immediate Update (UNDO/REDO)

Changes can be written to disk at any point during execution. Requires both undo and redo for recovery. More common in modern DBMS (STEAL policy).

7Recovery (ARIES Algorithm)

ARIES (Algorithm for Recovery and Isolation Exploiting Semantics) is a widely adopted recovery algorithm used in many commercial DBMS (IBM DB2, Microsoft SQL Server). It operates in three phases:

1. Analysis Phase

Scans the log from the last checkpoint forward to identify all transactions that were active at crash time and all dirty pages in the buffer pool. Constructs the RedoLSN and Transaction Table.

2. Redo Phase (Repeating History)

Scans the log forward from RedoLSN, reapplying every update to bring the database to the exact state it was in at crash time -- including the effects of uncommitted transactions.

3. Undo Phase

Scans the log backward, undoing all changes made by uncommitted transactions. Writes Compensation Log Records (CLRs) to prevent re-undoing if a crash occurs during recovery.

ARIES Recovery: Three Phases

After a crash, ARIES scans the log in three passes to restore the database to a consistent state.

LOG TIMELINECKPTT1 STARTT1 UPDATET2 STARTT2 UPDATET1 COMMITCRASH!Phase 1: AnalysisScan log from last checkpoint• Build Active Transaction Table• Build Dirty Page Table• Determine redo start pointPhase 2: RedoRepeat history from redo point• Reapply ALL logged updates• Both committed & uncommitted• Brings DB to crash-time statePhase 3: UndoReverse uncommitted changes• Scan log backward• Undo T2 (not committed)• Write CLR recordsAfter RecoveryT1: Committed ✓ (changes kept)T2: Uncommitted ✗ (changes undone)
Recovery Walkthrough

Scenario: T1 updates X (10 → 20) and commits. T2 updates Y (5 → 15) and Z (100 → 110) but crashes before commit. T1's changes were flushed to disk; T2's were not.

Analysis: T1 is committed, T2 is active/uncommitted.

Redo: Reapply all changes (T1 and T2) to restore crash-time state.

Undo: Rollback T2 -- restore Y to 5 and Z to 100. Write CLRs.

Final: X = 20 (T1 committed), Y = 5, Z = 100 (T2 undone). Consistent state achieved.

8SQL Walkthroughs

Introductory

Bank Transfer Transaction (Atomicity Demo)

-- Setup
CREATE TABLE Accounts (
    account_id INT PRIMARY KEY,
    balance DECIMAL(10, 2) NOT NULL
);
INSERT INTO Accounts VALUES (101, 1000.00), (102, 500.00);

-- Transaction: Transfer 
50 from account 101 to 102 BEGIN TRANSACTION; -- Step 1: Debit source account UPDATE Accounts SET balance = balance - 150.00 WHERE account_id = 101; -- Step 2: Credit destination account UPDATE Accounts SET balance = balance + 150.00 WHERE account_id = 102; -- If both succeed, make permanent COMMIT; -- If any error occurs: -- ROLLBACK;

Lines 10-11: BEGIN TRANSACTION marks the start of the atomic block.

Lines 13-19: Both UPDATEs must succeed. If one fails (e.g., system crash between them), ROLLBACK undoes the debit.

Line 22: COMMIT makes both changes permanent. If crash occurs before COMMIT record is written, both are undone during recovery.

Key insight: Atomicity ensures the database always remains consistent -- money is never lost or created due to partial updates.

Intermediate

Stock Trading with Isolation Level

BEGIN TRANSACTION;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Step 1: Check available shares
SELECT available_shares
FROM Stocks
WHERE symbol = 'ABC';

-- Step 2: If enough shares, update inventory
UPDATE Stocks
SET available_shares = available_shares - 10
WHERE symbol = 'ABC';

COMMIT;

Line 2: SERIALIZABLE ensures no other transaction can modify the rows this transaction reads until it commits.

Lines 4-7: The SELECT acquires a lock/snapshot preventing concurrent modification of 'ABC' shares.

Key insight: Isolation levels allow fine-tuning the balance between data consistency and concurrency. SERIALIZABLE provides the strongest guarantees.

Advanced

Savepoint and Partial Rollback

BEGIN TRANSACTION;

-- Step 1: Create order
INSERT INTO Orders (order_id, customer_id, order_status)
VALUES (1001, 201, 'Pending');

SAVEPOINT OrderCreated;

-- Step 2: Add first item and update inventory
INSERT INTO OrderItems (item_id, order_id, product_id, quantity)
VALUES (1, 1001, 501, 2);
UPDATE Inventory SET stock_level = stock_level - 2
WHERE product_id = 501;

SAVEPOINT Item1Added;

-- Step 3: Add second item (may fail)
INSERT INTO OrderItems (item_id, order_id, product_id, quantity)
VALUES (2, 1001, 502, 5);

-- If insufficient stock for product 502:
ROLLBACK TO SAVEPOINT Item1Added;
-- Only item 2 changes are undone; order and item 1 remain

COMMIT; -- Commits order with only item 1

Lines 7, 15: SAVEPOINT creates named markers for partial rollback.

Line 22: ROLLBACK TO SAVEPOINT Item1Added undoes only item 2 changes. The order and item 1 remain intact.

Key insight: Savepoints provide granular rollback control within complex transactions, allowing partial failure recovery without losing all work.

9Memory Aids

ACID Mnemonic

“A-C-I-D: All-or-nothing, Correct state always, Independent execution, Don't lose committed data.”

Transaction State Flow

“All People Care About Termination”

Active → Partially Committed → Committed → Terminated (or Active → Aborted → Terminated).

WAL Protocol Rule

“Log Before Data, Commit Log First.”

Log record written before data page to disk. Commit record written before transaction is declared committed.

Recovery Steps (ARIES)

“All Really Understand Recovery”

Analysis: figure out what happened. Redo: reapply all changes. Undo: rollback uncommitted. Restart: consistent.

Isolation Level Hierarchy

“Some Really Rich Readers”

Serializable → Repeatable Read → Read Committed → Read Uncommitted (strongest to weakest).

10Common Mistakes

Assuming COMMIT Is Instantaneous

Thinking data is immediately on disk after COMMIT

COMMIT ensures durability by guaranteeing that necessary log records are written to stable storage. However, the actual data pages may still reside in the buffer pool. The WAL protocol ensures recoverability, not immediate physical write of all data.

Forgetting ROLLBACK in Error Paths

Not handling ROLLBACK in exception handling

Every BEGIN TRANSACTION should have a corresponding COMMIT or ROLLBACK path. In application code, wrap transaction logic in try-catch blocks, ensuring ROLLBACK is called on exceptions. Unhandled errors can leave transactions open or in an inconsistent state.

Wrong Isolation Level

Using a low isolation level when a higher one is needed for correctness

Understand which concurrency anomalies your application must prevent, and choose the lowest isolation level that achieves this. Over-specifying (always using Serializable) unnecessarily cripples performance; under-specifying can cause data corruption.

Not Using Transactions for Related Operations

Running multiple related operations outside a transaction boundary

Any set of database operations that must all succeed or all fail together should be enclosed within a transaction. This is the core principle of atomicity. Running them individually risks partial updates and inconsistent state.

Designing Long-Running Transactions

Holding locks for extended periods in long transactions

Long transactions hold locks for extended periods, reducing concurrency, increasing contention, and raising deadlock likelihood. They also consume more log space and increase recovery time. Design transactions to be as short and concise as possible.

Checkpoint Frequency Tuning

Setting checkpoint frequency too high or too low

High frequency means shorter recovery time but more overhead during normal operation. Low frequency means less overhead but longer recovery time. Optimal frequency depends on system workload and recovery time objectives.

Relying Only on Application-Level Checks

Not leveraging database constraints for consistency

Database-level constraints (PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK, NOT NULL) provide a robust layer of consistency enforcement that cannot be bypassed by faulty application code. Always use them alongside application logic.

Frequently Asked Questions

What happens if a database crashes during a COMMIT operation?
If the crash occurs before the transaction's COMMIT log record is successfully written to stable storage (as per WAL rule 2), the transaction is considered uncommitted. During recovery, the recovery manager will UNDO all its changes. If the crash occurs after the COMMIT log record is written, the transaction is considered committed. During recovery, the recovery manager will REDO its changes, ensuring durability, even if the actual data pages had not been flushed to disk.
Can a transaction see its own uncommitted changes?
Yes, typically a transaction can see its own modifications immediately. This is necessary for the transaction to build upon its own work. Isolation rules primarily apply to the visibility of changes made by other concurrent transactions. For example, if a transaction updates a row and then reads it again, it expects to see its own updated value.
What is the difference between ROLLBACK and ROLLBACK TO SAVEPOINT?
ROLLBACK (without a savepoint) aborts the entire transaction, undoing all changes made since the BEGIN TRANSACTION. ROLLBACK TO SAVEPOINT performs a partial rollback, undoing only the changes made after the specified SAVEPOINT was set. Changes made before the savepoint remain intact within the current transaction.
Why is SERIALIZABLE isolation level rarely used in high-performance applications?
SERIALIZABLE provides the strongest isolation guarantee, preventing all concurrency anomalies. However, it achieves this by typically using strict two-phase locking or other mechanisms that significantly reduce concurrency. Transactions may acquire more exclusive locks, hold them longer, and lead to increased contention, deadlocks, and reduced throughput, making it unsuitable for applications requiring very high transaction rates.
How does a distributed transaction maintain ACID properties?
Distributed transactions often employ protocols like Two-Phase Commit (2PC). In the first phase ("prepare"), all participating databases vote on whether they can commit. If all vote "yes," the coordinator sends a "commit" message in the second phase. If any vote "no" or fail, the coordinator sends a "rollback" message. This ensures atomicity and durability across distributed systems, though it introduces complexity and potential for blocking.
What is the relationship between consistency and the application?
Database consistency (part of ACID) refers to the database adhering to its defined integrity constraints (e.g., primary keys, foreign keys, CHECK constraints). The DBMS enforces these. However, "application-level consistency" refers to business rules and logical correctness that the application itself must ensure. A transaction might be consistent with database rules but still violate a business rule if the application logic is flawed. The developer is responsible for designing transactions that transform the database from one logically correct state to another.

Practice Quiz

Test your understanding of transaction management and ACID properties -- select the correct answer for each question.

1.Which of the following ACID properties ensures that all operations within a transaction are completed successfully, or none of them are?

2.A transaction reads data that was written by another transaction that has not yet committed. This scenario is known as a:

3.The Write-Ahead Logging (WAL) protocol primarily ensures which two ACID properties?

4.In the transaction state diagram, which state immediately precedes the "Committed" state?

5.Which SQL command is used to define a point within a transaction to which a partial rollback can be performed?

6.A schedule is considered serializable if:

7.Which of the following is NOT a type of conflict between two operations from different transactions on the same data item X?

8.What is the primary purpose of a database checkpoint?

9.The ARIES recovery algorithm performs its operations in which order?

10.Which isolation level typically offers the highest concurrency but the lowest consistency guarantees (i.e., allows dirty reads)?

Study Tips

  • Trace through recovery scenarios: Given a log with START, UPDATE, COMMIT, and ABORT records, walk through the Analysis, Redo, and Undo phases by hand to solidify your understanding of ARIES.
  • Compare isolation levels: For each level, write out which anomalies are prevented and which are allowed. Use concrete examples with two concurrent transactions.
  • Practice SAVEPOINT usage: Write multi-step SQL transactions with savepoints and trace what happens on partial rollback vs. full rollback.
  • Draw the state diagram: Sketch the five transaction states and all transitions from memory. Label each edge with the triggering event (BEGIN, end of ops, COMMIT, error/ROLLBACK).

Related Topics