Chapter 4: Query Execution & The Optimizer — The Database Brain
1. Definition
The Query Optimizer is the most complex component of a database kernel. Its mission is to translate a user’s declarative SQL (telling the database what to get) into a physical execution plan (telling the database how to get it).
Modern databases primarily use CBO (Cost-Based Optimization), which calculates the estimated “cost” (in terms of I/O and CPU cycles) of all possible execution paths—index scans, sequential scans, nested loop joins, etc.—and selects the one with the lowest total cost.
2. Technical Depth: Selectivity and Cardinality
How does the optimizer decide which index is more efficient?
- Selectivity: The proportion of unique values relative to the total number of rows. High selectivity (many unique values) makes an index more efficient, while low selectivity (many duplicates) often leads to sequential scans.
- Cardinality: The estimated number of unique values in a column.
If a status column has only two values, ‘active’ and ‘inactive’ (Cardinality=2), using an index to find ‘active’ rows in a 10-million-row table might require 5 million bookmark lookups. In such cases, the CBO will likely determine that a Sequential Scan is more efficient than an indexed lookup.
3. Visualizing the Invisible: The SQL Execution Pipeline
flowchart TD
SQL[SQL Statement] --> Parser[Parser]
Parser --> AST[Abstract Syntax Tree]
AST --> Rewriter[Rewriter (View expansion/Constant folding)]
Rewriter --> Optimizer[Optimizer (CBO)]
Optimizer --"Statistics"--> PlanA[Plan A: Index Scan]
Optimizer --"Cost Calculation"--> PlanB[Plan B: Seq Scan]
Optimizer --> Executor[Executor]
Executor --> Engine[Storage Engine (e.g., InnoDB)]
Engine --> Data[Result Set]4. Real-World Case: GitLab’s Production Slow Query Incident
Background: GitLab.com once experienced severe database performance degradation where simple queries took over 30 seconds to complete. Root Cause: Stale Statistics in PostgreSQL.
Deep Dive:
- A large table had undergone a massive
DELETEoperation, but the Auto-analyze process (which updates statistics) had not yet caught up. - Postgres’s statistical histograms still believed the table contained millions of rows with a uniform distribution.
- Based on this outdated data, the optimizer chose a Nested Loop Join instead of a Hash Join.
- This resulted in hundreds of millions of internal loop iterations, instantly saturating the server’s CPU.
Lesson: SQL performance depends not just on the SQL syntax, but on the database’s “perception” of the data. Regular ANALYZE operations are the lifeblood of database maintenance.
5. Detailed Defense & Optimization
A. Mastering the Execution Plan (EXPLAIN)
Never guess. Use EXPLAIN (ANALYZE) to inspect the actual execution path.
- Access Type: Monitor the
typefield in MySQL or the node type in Postgres.ALL(Sequential Scan) is the worst;const/eq_refis the ideal. - Rows Examined: Compare scanned rows vs. returned rows. If you scan 1 million rows to return 1, your index strategy has failed.
B. Index Hints
Use as a last resort (with caution).
- In MySQL,
FORCE INDEX (idx_name)tells the optimizer: “I trust my judgment over your cost calculation.” - Risk: As data distribution shifts over time, a forced index can become the worst possible choice.
C. The Leftmost Prefix Rule
For composite indexes, the optimizer can only utilize the index from left to right.
- An index on
(a, b, c)supports queries ona=?anda=? AND b=?. - It cannot optimize
b=?orc=?alone. Think of it like a telephone book: you cannot find someone by their first name efficiently if you don’t know their last name.
6. References
- PostgreSQL Documentation: Planner/Optimizer
- MySQL Explain Output Format
- GitLab Post-Mortem: Database Incident
