
Luke Sun

Developer & Marketer


Chapter 4: Query Execution & The Optimizer — The Database Brain

3 minutes reading.

1. Definition

The Query Optimizer is the most complex component of a database kernel. Its mission is to translate a user’s declarative SQL (telling the database what to get) into a physical execution plan (telling the database how to get it).

Modern databases primarily use CBO (Cost-Based Optimization), which calculates the estimated “cost” (in terms of I/O and CPU cycles) of all possible execution paths—index scans, sequential scans, nested loop joins, etc.—and selects the one with the lowest total cost.
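The comparison can be sketched with a toy cost model (illustrative only; the functions, constants, and "page read" units below are invented for this sketch and bear no relation to any real engine's formulas):

```python
# Toy cost model in abstract "page read" units (illustrative, not a real engine's math).

def seq_scan_cost(total_pages: int) -> float:
    # A sequential scan reads every page of the table exactly once.
    return float(total_pages)

def index_scan_cost(matching_rows: int, btree_depth: int = 3,
                    random_io_penalty: float = 4.0) -> float:
    # Descend the B-tree, then pay one random bookmark lookup per matching row.
    return btree_depth + matching_rows * random_io_penalty

# A 10M-row table spread over ~125k pages, with a predicate matching 5M rows:
seq = seq_scan_cost(125_000)
idx = index_scan_cost(5_000_000)
print(seq, idx)  # 125000.0 20000003.0 -> the sequential scan is ~160x cheaper

# Same table, but the predicate matches only 50 rows: now the index wins easily.
print(index_scan_cost(50))  # 203.0
```

The point is the crossover: which plan is cheapest depends entirely on how many rows the optimizer *estimates* will match.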

2. Technical Depth: Selectivity and Cardinality

How does the optimizer decide which index is more efficient?

  • Selectivity: The proportion of unique values relative to the total number of rows. High selectivity (many unique values) makes an index more efficient, while low selectivity (many duplicates) often leads to sequential scans.
  • Cardinality: The estimated number of unique values in a column.

If a status column has only two values, ‘active’ and ‘inactive’ (Cardinality=2), using an index to find ‘active’ rows in a 10-million-row table might require 5 million bookmark lookups. In such cases, the CBO will likely determine that a Sequential Scan is more efficient than an indexed lookup.
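The arithmetic behind that decision is easy to reproduce. Here is a small sketch using Python's built-in `sqlite3` as a stand-in (the `users` table and its contents are made up for illustration; SQLite's planner is not Postgres's, but selectivity is computed the same way):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, status TEXT)")
cur.executemany(
    "INSERT INTO users (status) VALUES (?)",
    [("active",) if i % 2 == 0 else ("inactive",) for i in range(10_000)],
)

total = cur.execute("SELECT COUNT(*) FROM users").fetchone()[0]
cardinality = cur.execute("SELECT COUNT(DISTINCT status) FROM users").fetchone()[0]
selectivity = cardinality / total  # unique values / total rows

print(f"cardinality={cardinality}, selectivity={selectivity:.4%}")
# cardinality=2, selectivity=0.0200% -> an index on status is a poor candidate
```

A selectivity this low tells the CBO that any index lookup on `status` would still touch half the table.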

3. Visualizing the Invisible: The SQL Execution Pipeline

```mermaid
flowchart TD
    SQL[SQL Statement] --> Parser[Parser]
    Parser --> AST[Abstract Syntax Tree]
    AST --> Rewriter["Rewriter (View expansion / Constant folding)"]
    Rewriter --> Optimizer["Optimizer (CBO)"]

    Optimizer -->|Statistics| PlanA["Plan A: Index Scan"]
    Optimizer -->|Cost Calculation| PlanB["Plan B: Seq Scan"]

    Optimizer --> Executor[Executor]
    Executor --> Engine["Storage Engine (e.g., InnoDB)"]
    Engine --> Data[Result Set]
```

4. Real-World Case: GitLab’s Production Slow Query Incident

Background: GitLab.com once experienced severe database performance degradation where simple queries took over 30 seconds to complete. Root Cause: Stale Statistics in PostgreSQL.

Deep Dive:

  1. A large table had undergone a massive DELETE operation, but the Auto-analyze process (which updates statistics) had not yet caught up.
  2. Postgres’s statistical histograms still believed the table contained millions of rows with a uniform distribution.
  3. Based on this outdated data, the optimizer chose a Nested Loop Join instead of a Hash Join.
  4. This resulted in hundreds of millions of internal loop iterations, instantly saturating the server’s CPU.

Lesson: SQL performance depends not just on the SQL syntax, but on the database’s “perception” of the data. Regular ANALYZE operations are the lifeblood of database maintenance.
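The stale-statistics failure mode can be poked at in miniature with SQLite (a hedged stand-in: SQLite's `ANALYZE` writes row and per-key estimates into `sqlite_stat1`, roughly as Postgres's `ANALYZE` feeds `pg_statistic`; the `events` table below is invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT)")
cur.execute("CREATE INDEX idx_kind ON events(kind)")
cur.executemany("INSERT INTO events (kind) VALUES (?)",
                [(f"kind_{i % 100}",) for i in range(10_000)])

# Until ANALYZE runs, the planner has no per-index statistics at all --
# exactly the window in which GitLab's optimizer chose the wrong join.
cur.execute("ANALYZE")
stats = cur.execute(
    "SELECT idx, stat FROM sqlite_stat1 WHERE tbl = 'events'").fetchall()
print(stats)  # e.g. [('idx_kind', '10000 100')] -> 10000 rows, ~100 per key value
```

If you now deleted 90% of the rows without re-running `ANALYZE`, the planner would keep costing plans against those stale numbers.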

5. Detailed Defense & Optimization

A. Mastering the Execution Plan (EXPLAIN)

Never guess. Use EXPLAIN (ANALYZE) to inspect the actual execution path.

  • Access Type: Watch the type column in MySQL’s EXPLAIN output (or the node type in Postgres). ALL (a full table scan) is the worst; const / eq_ref are the best.
  • Rows Examined: Compare scanned rows vs. returned rows. If you scan 1 million rows to return 1, your index strategy has failed.
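The habit transfers to any engine. Here it is with SQLite's EXPLAIN QUERY PLAN (the syntax and output format differ from MySQL and Postgres, but the scan-versus-search distinction is identical; table and index names are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
cur.execute("CREATE INDEX idx_email ON users(email)")

def plan(sql: str) -> str:
    # EXPLAIN QUERY PLAN rows are (id, parent, notused, detail); we want 'detail'.
    rows = cur.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

print(plan("SELECT * FROM users WHERE email = 'a@b.c'"))
# expect something like: SEARCH users USING INDEX idx_email (email=?)
print(plan("SELECT * FROM users WHERE name = 'Luke'"))
# expect something like: SCAN users   (full table scan: no usable index)
```

Reading the plan before and after every index change is the single cheapest optimization habit there is.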

B. Index Hints

Use as a last resort (with caution).

  • In MySQL, FORCE INDEX (idx_name) tells the optimizer: “I trust my judgment over your cost calculation.”
  • Risk: As data distribution shifts over time, a forced index can become the worst possible choice.
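For a runnable taste of a hint, here is SQLite's counterpart to MySQL's FORCE INDEX, the INDEXED BY clause (a stand-in; MySQL's hint syntax differs, and the `orders` schema is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, total REAL)")
cur.execute("CREATE INDEX idx_status ON orders(status)")

# INDEXED BY pins the access path, overriding the planner's cost comparison --
# the same bet FORCE INDEX makes in MySQL.
detail = cur.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM orders INDEXED BY idx_status WHERE status = 'paid'"
).fetchall()[0][3]
print(detail)  # expect something like: SEARCH orders USING INDEX idx_status (status=?)
```

The plan is now frozen regardless of how the data distribution drifts, which is precisely the risk described above.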

C. The Leftmost Prefix Rule

For composite indexes, the optimizer can only utilize the index from left to right.

  • An index on (a, b, c) supports queries on a=?, on a=? AND b=?, and on a=? AND b=? AND c=?.
  • It cannot optimize b=? or c=? alone. Think of it like a telephone book: you cannot find someone by their first name efficiently if you don’t know their last name.
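The rule above can be demonstrated directly (again via SQLite as a stand-in; the behavior is the same in MySQL and Postgres, and the table `t` is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (a INT, b INT, c INT, payload TEXT)")
cur.execute("CREATE INDEX idx_abc ON t(a, b, c)")

def plan(sql: str) -> str:
    return cur.execute("EXPLAIN QUERY PLAN " + sql).fetchall()[0][3]

print(plan("SELECT * FROM t WHERE a = 1 AND b = 2"))
# expect something like: SEARCH t USING INDEX idx_abc (a=? AND b=?)
print(plan("SELECT * FROM t WHERE b = 2"))
# expect something like: SCAN t   -- leftmost column 'a' missing, index unusable
```

Skipping the leftmost column demotes the query to a full scan, exactly as the telephone-book analogy predicts.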

6. References