Luke a Pro

Luke Sun

Developer & Marketer

🇺🇦
EN||

Chapter 5: Concurrency Control (MVCC & Locks) — Magic of Parallel Universes

| , 4 minutes reading.

1. Definition

MVCC (Multi-Version Concurrency Control) is a concurrency control method used by databases to provide concurrent access to data.

The core philosophy is: Read/Write Decoupling. When a transaction is writing to a row, other reading transactions are not blocked. Instead, they read an older version (Snapshot) of the data. This allows each transaction to see the data as it existed in its own “parallel universe.”

2. Technical Depth: Undo Logs and Read Views

Taking MySQL InnoDB as an example:

  • Undo Log: When a transaction modifies data, InnoDB does not overwrite the old record. Instead, it copies the old data into an Undo Log and links it to the new record via a pointer, forming a Version Chain.
  • Read View: When a transaction starts, InnoDB generates a Read View—a snapshot of all “active” (uncommitted) transaction IDs at that moment.
  • Visibility Logic: When reading, the database compares the version ID of a row against the Read View. If the ID belongs to an active transaction, the version is invisible, and the engine traverses the Undo Log chain to find the most recent committed version.

3. Visualizing the Invisible: Snapshot Reads

sequenceDiagram
    participant TxA as Transaction A (Reader)
    participant TxB as Transaction B (Writer)
    participant DB as Row (ID=1)
    participant Undo as Undo Log

    TxB->>DB: 1. UPDATE ID=1 SET Age=30 (was 20)
    Note over DB: Row Locked (X-Lock)<br/>Age = 30<br/>Undo Log Created: Age=20
    
    TxA->>DB: 2. SELECT * FROM users WHERE ID=1
    Note over DB: Lock Detected (TxB is active)
    
    DB->>Undo: 3. TxA reads old version from Undo Log
    Undo-->>TxA: 4. Returns Age=20 (Snapshot)
    
    TxB->>DB: 5. COMMIT
    Note over DB: Row Age=30 becomes permanent

4. Real-World Case: GitHub’s Primary Key Contention (2018)

Background: GitHub was performing a major MySQL database migration. Phenomenon: The site experienced brief outages and write failures.

The Internal Cause: Auto-increment Contention & Next-Key Locks. While MVCC eliminates read-write conflicts, write-write conflicts still require locking.

  1. Auto-inc Lock: During a massive data ingest (INSERT INTO ... SELECT), MySQL’s auto-increment lock became a serialized bottleneck under high concurrency.
  2. Next-Key Lock: Under the Repeatable Read (RR) isolation level, uniqueness checks (e.g., REPLACE INTO or INSERT ON DUPLICATE) may trigger Next-Key Locks (locking index gaps) depending on the index type and execution plan. In highly concurrent conflict scenarios, this led to severe lock wait chains.

Lesson: MVCC is not a magic wand. In extreme high-concurrency write scenarios, understanding lock granularity (Record vs. Gap vs. Next-Key) is critical.

5. Detailed Defense & Optimization

A. Isolation Level Trade-offs

  • Read Committed (RC): Generates a new Read View for every query. Suitable for high-concurrency apps where slight non-repeatable reads are acceptable.
  • Repeatable Read (RR): Generates a Read View once at the start of the transaction. MySQL default; mitigates phantom reads in locking read scenarios (via Next-Key locks), though standard snapshot reads may still observe range anomalies depending on implementation.

B. Avoid Long Transactions

  • Long-running transactions force the database to keep old Read Views active.
  • This prevents the Purge thread from cleaning up Undo Logs, causing the History List Length to skyrocket. This not only consumes disk space but slows down all queries as they must traverse longer version chains.

C. Optimistic Locking

For non-critical sections, use application-level optimistic locking via version numbers to avoid database row locks entirely.

UPDATE products SET stock = stock - 1 
WHERE id = 1 AND version = 5;

6. References