DatabaseCollege

Query Processing & Execution Internals

Query processing is the complete sequence of operations a DBMS performs to transform a high-level declarative SQL statement into an efficient, executable plan that retrieves or manipulates data. Understanding these internals is essential for writing fast queries, diagnosing performance problems, and designing database systems.

This guide covers the full query pipeline (lexing, parsing, semantic analysis, optimization, execution), all major join algorithms with I/O cost formulas, external sorting, the Volcano iterator model, vectorized execution, annotated SQL walkthroughs, and a 10-question practice quiz.

1Introduction

When you submit a SQL query, the database does not simply "run" it. Instead, a sophisticated pipeline of components parses, validates, rewrites, optimizes, and then executes your query. The difference between a naive execution and an optimized one can be the difference between milliseconds and hours on the same data.

Query processing matters across the full spectrum of database applications: in OLTP systems, efficient single-row lookups with index scans keep transaction latency low; in OLAP systems, vectorized hash joins and aggregate operators enable analytical queries over billions of rows in seconds. For database administrators, engineers, and architects alike, a working knowledge of these internals is indispensable for performance tuning.

In Practice

Modern analytical databases like Snowflake, Google BigQuery, and DuckDB leverage advanced query processing techniques — vectorized execution, JIT compilation, and cost-based join reordering — to achieve sub-second query latencies over petabytes of data. Understanding these internals lets you write queries that cooperate with the optimizer rather than fight it.

Three Main Phases

Parsing

Lexical analysis, syntax checking, and semantic validation. Produces a logical query plan expressed in relational algebra.

Optimization

Logical rewrites (predicate pushdown, join reordering) and physical operator selection (Hash Join vs. Sort-Merge). Produces a physical query plan.

Execution

The execution engine interprets the physical plan via the Volcano iterator model (or vectorized/JIT variants) and returns results.

2Key Definitions

Precise terminology is essential before diving into the mechanics of query processing.

Query Processor

The core DBMS component that interprets, optimizes, and executes user queries — orchestrating the entire pipeline from SQL to results.

Lexical Analysis (Scanning)

Breaks the SQL string into tokens (keywords like SELECT, identifiers, operators, literals). Performed by a lexer or scanner.

Parse Tree (Syntax Tree)

Hierarchical tree representing the grammatical structure of the query according to SQL grammar rules.

Semantic Analysis

Validates name resolution, type checking, authorization, and view expansion against the database schema.

Logical Query Plan

An operator tree in relational algebra (σ, π, ⋈, ∪) describing what data to retrieve, independent of physical implementation.

Physical Query Plan

A fully executable plan specifying how: concrete operators (HashJoin, IndexScan), access methods, and execution order.

Query Optimizer

Transforms a logical plan into an optimal physical plan by exploring join orders, algorithm choices, and access paths, guided by a cost model.

Cost Model

A function estimating resource consumption (I/O, CPU, network, memory) of a plan. Example: cost = C_CPU × |tuples| + C_IO × |blocks|.

Cardinality Estimation

Predicting the number of rows each operator will output. Uses histograms, distinct value counts, and statistical assumptions.

Iterator Model (Volcano)

Pull-based execution model where each operator implements open(), next(), close(). Parents pull tuples from children on demand.

Pipelining

Intermediate results flow directly between operators without being fully stored. Reduces I/O and latency.

Materialization

An intermediate result is fully computed and stored before the next operator reads it. Required for Sort, Hash Join build phase, etc.

Access Path

Method for reading data from a base table: Table Scan (read all blocks) or Index Scan (traverse index to locate matching rows).

Vectorized Execution

Operators process data in batches (vectors) of thousands of tuples, improving cache efficiency and enabling SIMD instructions.

External Sorting

Multi-pass merge-sort for data that exceeds available memory. Creates sorted runs, then merges them progressively.

Join Ordering

Determining the optimal sequence to join N relations. An NP-hard problem solved approximately with dynamic programming or heuristics.

3The Query Processing Pipeline

A SQL query passes through nine distinct stages before results are returned. Each stage transforms the query representation, progressively moving from high-level intent to low-level, executable operations.

Query Processing Pipeline

From raw SQL to result set in three phases

Phase 1Parsing

SQL Query

Declarative statement from user

SELECT name FROM Employees WHERE dept = 10

Lexer

Breaks query into tokens

Parser

Checks grammar, builds parse tree

SelectStmt → SelectList, FromClause, WhereClause

Semantic Analyzer

Validates names, types, permissions

Employees.dept exists? INT = INT? User has SELECT?

Phase 2Optimization

Logical Plan

Relational algebra tree

π_name( σ_dept=10( Employees ) )

Optimizer

Explores plans, estimates costs via statistics

Predicate pushdown, join reordering, index selection

Physical Plan

Concrete algorithms chosen

IndexScan(dept_idx) → Filter → Project

Phase 3Execution

Executor

Volcano iterator: open / next / close

Pull-based, one tuple at a time (or vectorized batches)

Result Set

Rows returned to client application

{ "Alice", "Bob", "Carol" }

The nine-stage query processing pipeline grouped into three phases.

3.1 Lexical Analysis & Parsing

The lexer reads the SQL string character by character and groups characters into tokens: keywords (SELECT, FROM, WHERE), identifiers (table and column names), operators (=, >), and literals (numbers, strings). If an unrecognized character sequence is encountered, a lexical error is reported immediately.

The parser takes the token stream and checks it against the formal SQL grammar. If the token sequence is syntactically valid, the parser constructs a parse tree (syntax tree), a hierarchical structure mirroring the grammar rules. For example, a SELECT statement produces child nodes for the select list, from clause, and where clause. A syntax error terminates processing here.

3.2 Semantic Analysis

Even a syntactically valid parse tree can be meaningless. The semantic analyzer performs deeper validation against the database schema:

Name resolution: All referenced tables, columns, and functions must exist and be in scope.
Type checking: Operations must be applied to compatible types (e.g., comparing a string to an integer is an error).
Authorization: The user must hold sufficient privileges (SELECT, UPDATE, etc.) on the referenced objects.
View expansion: References to views are replaced with their underlying definitions.

3.3 Intermediate Representation (Logical Plan)

After successful semantic analysis, the parse tree is translated into a logical query plan— an operator tree expressed in relational algebra. Relational algebra operators include:

σ (Selection) · π (Projection) · ⋈ (Natural Join) · ⊕ (Union) · ∩ (Intersection) · − (Difference) · G (Group By)

Example: SELECT emp_name FROM Employees WHERE salary > 50000 becomes: π _{emp_name}(σ _salary>50000(Employees))

This logical plan captures what to compute without specifying how. It is the input to the query optimizer.

4Query Optimization

Query optimization is the most challenging phase of query processing. The optimizer's goal is to find the physical execution plan with the lowest estimated resource cost among potentially billions of candidates.

4.1 Logical Rewrites

Before selecting physical operators, the optimizer applies algebraic equivalence rules to rewrite the logical plan into an equivalent but cheaper form:

Predicate Pushdown

Move selection operators (σ) as far down the tree as possible. Filtering data early reduces the size of all intermediate results flowing upward. This is the single most impactful logical rewrite.

Projection Pushdown

Move projection operators (π) down to eliminate unnecessary columns early. Narrower tuples use less memory and I/O bandwidth as they flow through the plan.

Join Commutativity & Associativity

R ⋈ S ≡ S ⋈ R (commutative) and (R ⋈ S) ⋈ T ≡ R ⋈ (S ⋈ T) (associative). These rules allow the optimizer to freely reorder joins.

Subquery Unnesting

Correlated subqueries (EXISTS, IN) are rewritten into joins where possible, replacing O(|outer| × |inner|) correlated execution with an efficient join.

4.2 Cost-Based Optimization

Modern optimizers are cost-based: they assign an estimated cost to each candidate physical plan and select the lowest-cost plan. The cost model accounts for:

I/O cost (disk reads/writes)

CPU cost (comparisons, hash computations)

Memory cost (buffer pool usage)

Network cost (distributed systems)

The System R optimizer (IBM, 1979) pioneered cost-based optimization with dynamic programming over left-deep join trees. It remains the conceptual template for most modern optimizers. The optimizer enumerates sub-expressions bottom-up, storing the cheapest plan for each sub-expression, then combines them.

4.3 Cardinality Estimation & Statistics

Cardinality estimation — predicting how many rows each operator will produce — is critical because operator costs are proportional to input/output sizes. The optimizer uses:

N_R: number of tuples in relation R
V(A, R): number of distinct values of attribute A in R
Histograms: frequency distributions of attribute values
Selectivity formulas: for σ_A=v(R), estimated rows = N_R / V(A, R); for σ_A>v(R), estimate uses histogram buckets
Join cardinality: for R ⋈ S on A=B, estimated = N_R × N_S / max(V(A, R), V(B, S))

Important Note

Inaccurate cardinality estimates are the leading cause of suboptimal query plans. Always keep statistics up-to-date with ANALYZE or UPDATE STATISTICS after large data changes.

4.4 Rule-Based vs. Cost-Based Optimization

Aspect	Rule-Based	Cost-Based
Basis	Fixed heuristics (always push selections down)	Statistical cost estimates
Data awareness	None — ignores data distribution	Yes — relies on statistics and histograms
Optimality	Can be suboptimal for unusual data	Near-optimal given accurate statistics
Use in practice	Logical rewrites still use rule-based steps	Dominant for physical plan selection

5Join Processing Algorithms

Joins are typically the most expensive operations in relational queries. Choosing the right join algorithm for each join in a query is one of the optimizer's most consequential decisions. Let B_R and B_S denote the number of disk blocks in relations R and S, |R| the number of tuples, and M available memory blocks.

Join Algorithms Comparison

How Nested Loop, Sort-Merge, and Hash Join process the same data

Hash JoinHJ complete: 4 matches found

R: Employees

name	dept_id
Alice	10
Bob	20
Carol	10
Dan	30

S: Departments

dept_name	dept_id
Engineering	10
Marketing	20
Sales	30

Matched Pairs (4)

Alice↔Engineering(10)Bob↔Marketing(20)Carol↔Engineering(10)Dan↔Sales(30)

I/O Cost:B_R + B_S (if fits memory)Fastest for equijoins

All 4 matches found. Hash Join costs B_R + B_S when the smaller table fits in memory. Click Play to watch each algorithm step by step.

Step 0 / 8

5.1 Nested Loop Join (NLJ)

The simplest join: for every tuple in R, scan all of S.

FOR each tuple r IN R DO

FOR each tuple s IN S DO

IF r and s satisfy join condition THEN emit (r, s)

I/O Cost: B_R + |R| × B_S (naive); improved to B_R + ⌈B_R / (M−1)⌉ × B_S with Block NLJ (reads M−1 blocks of R at a time). Use when: outer relation is very small, or no suitable index exists.

5.2 Index Nested Loop Join (INLJ)

A variant of NLJ where an index exists on the inner relation's join attribute. Instead of scanning all of S for each tuple in R, the optimizer uses the index to fetch only matching tuples. I/O Cost: B_R + |R| × C_index, where C_index is the index lookup cost (typically 2–4 I/Os for a non-clustered B-tree). Use when: the outer relation is small and the join is highly selective.

5.3 Sort-Merge Join (SMJ)

Sort both R and S on their join attributes, then merge the two sorted streams in a single pass. I/O Cost: Sort(R) + Sort(S) + B_R + B_S, where Sort(X) ≈ 2·B_X·(1 + ⌈log_(M−1)(B_X/M)⌉). If both relations are already sorted, cost collapses to B_R + B_S. Use when: inputs are pre-sorted, or the query requires sorted output on the join key anyway.

5.4 Hash Join (HJ)

Build an in-memory hash table on the smaller relation (R), then probe it with every tuple from S. I/O Cost: B_R + B_S if R fits in memory. If R does not fit, a partitioning phase writes both relations to disk partitioned by hash key: total cost ≈ 3·(B_R + B_S). Use when: equijoin, large unsorted relations, and the build side fits in memory. Generally the fastest join algorithm in practice.

5.5 Join Algorithm Comparison

Algorithm	Avg I/O Cost	Requires Sort?	Best For
Simple NLJ	B_R + \|R\|·B_S	No	Very small outer relation
Block NLJ	B_R + ⌈B_R/(M−1)⌉·B_S	No	No index available
INLJ	B_R + \|R\|·C_index	No	Small outer, selective index
Sort-Merge	Sort(R) + Sort(S) + B_R + B_S	Yes	Pre-sorted inputs, sorted output needed
Hash Join	B_R + B_S	No	Large unsorted equijoins (fastest avg)

Query Execution Plan Tree

Bottom-up construction of a physical query plan

SELECT E.name, D.dept_name FROM Employees E
JOIN Departments D ON E.dept_id = D.dept_id WHERE E.salary > 50000

// Complete plan tree. Click Play to see it built step by step.

Project

E.name, D.dept_name

rows: ~40,000

↓

Filter

E.salary > 50,000

rows: ~40,000

↓

Hash Join

E.dept_id = D.dept_id

rows: ~250,000

&swarr;

&searr;

Table Scan

Employees (full scan)

rows: 250,000

Index Scan

Departments (pk index)

rows: 500

Data flows bottom-up ↑

Complete query plan tree with all cardinality estimates. The optimizer chose Hash Join based on cost model analysis. Click Play to see the tree built step by step.

Step 0 / 6

6Sorting & Aggregation

ORDER BY, GROUP BY, DISTINCT, and set operations (UNION, INTERSECT, EXCEPT) all require sorting or hashing. When data exceeds available memory, the DBMS must use disk-based algorithms.

6.1 External Merge-Sort

When data is too large to sort in memory (B_R > M blocks), external merge-sort proceeds in two phases:

Phase 1: Run Formation

Read M blocks at a time into memory
Sort the chunk in-place (quicksort)
Write the sorted "run" to disk
Produces ⌈B_R / M⌉ sorted runs
I/O cost: 2 × B_R (read + write)

Phase 2: Merge Passes

Use M−1 input buffers to merge M−1 runs at once
Write merged result to disk
Repeat until 1 run remains
Passes needed: ⌈log_(M−1)(B_R/M)⌉
Total I/O: 2·B_R·(1 + ⌈log_(M−1)(B_R/M)⌉)

External Merge-Sort Animation

12 values · M = 4 memory blocks · 3 initial runs

Complete

Step 0 / 7

Complete Pipeline (Final State)

External merge-sort handles data too large for memory. This view shows the complete final sorted result. Step through to see how we got here.

Current Memory Chunk (reading from disk)

Sorted Runs (on disk)

Run 1

Run 2

Run 3

Merge Output (on disk)

Merged Run A

Run 3 (unchanged)

Final Sorted Output

Total I/O: 2 × B_R × (1 + ⌈log_(M−1)(B_R/M)⌉)

1 of 8

Figure 5: External merge-sort animation. Step 0 shows the complete final state; step through to trace the process.

6.2 Sort-Based vs. Hash-Based Aggregation

Sort-Based Aggregation

Sort on GROUP BY attributes (external sort if needed)
Scan once: adjacent equal keys form a group
Compute SUM, COUNT, AVG per group

Good when: data needs sorting anyway (ORDER BY same key)

Hash-Based Aggregation

Scan relation, build hash table keyed by GROUP BY attributes
Update aggregate accumulators (SUM += value, COUNT++)
Spill partitions to disk if hash table overflows

Good when: few distinct groups fit in hash table; generally faster

7Query Execution Models

Once the physical query plan is selected, the execution engine takes over. There are three dominant execution paradigms, each with different performance characteristics.

7.1 Iterator / Volcano Model

The Volcano model (Graefe & McKenna, 1993) is the classical execution paradigm. Every physical operator implements three methods:

open()

Initializes the operator and recursively calls open() on children. Allocates buffers, opens files.

next()

Returns the next qualifying tuple. Recursively calls next() on children. May loop internally if the child tuple fails a predicate.

close()

Releases resources and recursively calls close() on children. Closes files, frees buffers.

Volcano / Iterator Model

Pull-based execution: parent calls child.next() to get one tuple at a time

// Iterator model with skip behavior. Click Play to see the pull-based pipeline.

Client

↓next()↑tuple

Project

extract: E.name, D.dept_name

open() / next() / close()

↓next()↑tuple

Filter

predicate: salary > 50,000

open() / next() / close()

↓next()↑tuple

Table Scan

source: Employees (disk/buffer)

open() / next() / close()

↓ disk / buffer

The Volcano iterator model pulls tuples one-by-one through the operator pipeline. Filter can internally skip non-qualifying tuples. Click Play to see the full pull-based flow.

Step 0 / 7

7.2 Pipelining vs. Materialization

Pipelining

Tuples flow directly from one operator to the next without intermediate storage. The default behavior in the Volcano model.

Lower I/O, lower latency
Less temporary storage needed
All pipelined operators active concurrently in memory
Cannot work for blocking operators (Sort, Hash Join build)

Materialization

The complete output of an operator is written to a temporary buffer or disk before the next operator reads it.

Required for blocking operators (Sort, Hash Join build phase)
Allows reuse of intermediate results
Higher I/O and latency
Useful under memory pressure

7.3 Vectorized Execution

Traditional Volcano processes one tuple at a time. Vectorized execution processes data in batches (vectors) of 1,000–10,000 tuples. Each call to next() returns an entire vector instead of a single tuple:

Cache Efficiency

Processing a column of values sequentially improves spatial locality, reducing CPU cache misses significantly.

Reduced Overhead

One next() call per vector instead of per tuple eliminates the amortized cost of function dispatch per row.

SIMD Instructions

Modern CPUs apply a single instruction to 8–16 values simultaneously, enabling massive parallelism for bulk comparisons.

7.4 JIT Compilation

Some advanced systems (Apache Impala with LLVM, HyPer, CockroachDB) use Just-in-Time (JIT) compilation to translate the physical query plan into native machine code at runtime. Unlike interpreted execution, JIT-compiled code contains no virtual dispatch overhead, enables loop unrolling, and generates specialized code for specific data types and predicates. The compilation overhead is amortized over long-running or repeated queries.

8SQL Walkthroughs with EXPLAIN ANALYZE

EXPLAIN ANALYZE is the primary tool for observing how a DBMS actually executes a query. It shows the physical plan, estimated vs. actual row counts, and buffer/I/O statistics.

Example 1: Selection with Index Scan

Retrieve names and salaries of employees earning over $50,000. An index on salary exists.

EXPLAIN ANALYZE
SELECT emp_name, salary
FROM Employees
WHERE salary > 50000.00;

-- QUERY PLAN -----------------------------------------------
-- Project (emp_name, salary)
--   -> Index Scan using idx_employees_salary on Employees
--         Filter: (salary > 50000.00)
--         Rows Removed by Filter: 150000
--         Actual Rows: 100000
--         Buffers: shared hit=10000, read=5000

Index Scan: The optimizer chose an index scan because the selectivity is high (40% of rows qualify). The index locates matching row pointers efficiently.

Buffers: read=5000: 5,000 pages had to be fetched from disk (cache miss). Higher read counts indicate more I/O pressure.

I/O Cost: B_index_leaf + K × C_data_fetch ≈ 5,000 + 5,000 = 10,000 block accesses in this example.

Example 2: Hash Join with Predicate Pushdown & Hash Aggregation

Find total salary per department for "New York" departments. Demonstrates predicate pushdown, hash join, and hash-based aggregation.

EXPLAIN ANALYZE
SELECT D.dept_name, SUM(E.salary) AS total_salary
FROM Employees AS E
JOIN Departments AS D ON E.dept_id = D.dept_id
WHERE D.location = 'New York'
GROUP BY D.dept_name
ORDER BY total_salary DESC;

-- QUERY PLAN -----------------------------------------------
-- Sort (total_salary DESC)
--   -> HashAggregate (SUM(E.salary), Group Key: D.dept_name)
--         -> Hash Join (E.dept_id = D.dept_id)
--               -> Seq Scan on Employees E     [probe side]
--               -> Hash (D.dept_id)            [build side]
--                     -> Index Scan on Departments D
--                           Filter: (location = 'New York')
--                           Rows Removed: 900  Actual Rows: 100

Predicate Pushdown: The location = 'New York' filter is applied to Departments before the join, reducing the build-side hash table from 1,000 to 100 rows.

Hash Join strategy: Filtered Departments (100 rows) are the build side; the large Employees table is the probe side — exactly the right choice for minimal memory.

HashAggregate: Groups are accumulated into a hash table keyed on dept_name. Final Sort handles ORDER BY.

Example 3: EXISTS Subquery Rewritten to Join

Find departments with at least one employee earning over $70,000. The optimizer rewrites the correlated subquery into an efficient join.

EXPLAIN ANALYZE
SELECT D.dept_name
FROM Departments D
WHERE EXISTS (
    SELECT 1 FROM Employees E
    WHERE E.dept_id = D.dept_id AND E.salary > 70000.00
);

-- QUERY PLAN -----------------------------------------------
-- HashAggregate (DISTINCT D.dept_name)  -- EXISTS → join + distinct
--   -> Hash Join (D.dept_id = E.dept_id)
--         -> Seq Scan on Departments D
--         -> Hash (E.dept_id)
--               -> Index Scan using idx_salary on Employees E
--                     Filter: (salary > 70000.00)
--                     Rows Removed: 200000  Actual Rows: 50000

Subquery unnesting: The optimizer rewrites the correlated EXISTS into a Hash Join + DISTINCT, avoiding O(|D| × |E|) correlated execution.

Index on salary: Applied during the subquery scan, reducing 250,000 employees to 50,000 before the join. Overall complexity drops from O(|D|·|E|) to O(|D| + |E_filtered|).

9Common Mistakes

Confusing logical and physical query plans

A logical plan describes what the query computes in relational algebra (e.g., "a join"). A physical plan specifies how, using concrete algorithms and access paths (e.g., "Hash Join using an Index Scan on Employees.dept_id"). Optimizers generate logical plans first, then select physical operators.

Underestimating the impact of join ordering

The order of joins can change the size of intermediate results by orders of magnitude. Joining two large tables before a small table can turn a fast query into one that runs for hours. Always check that the optimizer is choosing a sensible join order via EXPLAIN.

Using outdated or missing statistics

The optimizer relies on accurate statistics (row counts, histograms, distinct values) to estimate cardinalities. After large bulk loads or deletes, run ANALYZE (PostgreSQL) or UPDATE STATISTICS (SQL Server) to refresh statistics. Stale stats are the #1 cause of suboptimal plans in production.

Assuming pipelining is always better than materialization

Pipelining generally reduces I/O, but some operators (Sort, Hash Join build phase, Aggregation with hash spills) inherently require materialization. The optimizer decides when to pipeline and when to materialize based on operator type and memory availability.

Assuming an index will always be used

An index may exist but the optimizer might not use it — for example, if the table is small enough that a sequential scan is cheaper, or if the predicate selectivity is too low (e.g., salary > 0 matches nearly every row). Always verify with EXPLAIN.

Using SELECT * when only a few columns are needed

SELECT * fetches and propagates all columns through the entire operator tree. This increases intermediate result sizes, consumes more buffer pool memory, can prevent covering index scans, and may force spilling to disk for join and aggregation operators that would otherwise fit in memory.

Frequently Asked Questions

What is the primary goal of the query optimizer, and why is it so complex?+

The primary goal of the query optimizer is to find the most efficient physical query plan for a given logical plan, minimizing resource consumption (I/O, CPU, memory, network). It is complex because the plan space grows exponentially with query complexity, accurate cost estimation depends on dynamic statistics, and heuristics are needed to prune an NP-hard search space within a reasonable time.

When would a Hash Join be preferred over a Sort-Merge Join, and vice versa?+

Hash Join is preferred when one relation is significantly smaller and fits in memory (optimal B_R + B_S I/O cost), relations are unsorted, and the output order does not matter. Sort-Merge Join is preferred when one or both relations are already sorted on the join attribute, the query requires sorted output (e.g., ORDER BY on join key), or memory constraints make hash-table spilling expensive.

Explain the difference between pipelining and materialization, and when materialization is necessary.+

Pipelining passes intermediate results directly between operators without writing to temporary storage, reducing I/O and latency. Materialization fully computes and stores the result of an intermediate operation before the next operator reads it. Materialization is necessary for operators that must see all input before producing output — a full sort must materialize all incoming tuples before it can emit any sorted output; the build phase of a hash join must materialize the entire build relation into an in-memory hash table.

How do database statistics influence query optimization?+

Database statistics (tuple counts, distinct value counts, histograms) are the foundation of cost-based optimization. They enable cardinality estimation — predicting how many rows each operator will produce — which determines join algorithm selection, access path selection (table scan vs. index scan), and join ordering. Stale or missing statistics are a leading cause of suboptimal query plans.

What is vectorized execution, and why is it beneficial?+

Vectorized execution processes data in batches (vectors) of hundreds or thousands of tuples rather than one at a time. Benefits include improved CPU cache utilization due to sequential column-at-a-time access, dramatically reduced function-call overhead (one next() per vector instead of per tuple), and the ability to leverage SIMD CPU instructions for bulk arithmetic and comparison operations. It is the dominant execution model in modern OLAP databases.

What is the Join Ordering Problem and why does it matter in practice?+

The Join Ordering Problem is determining the most efficient sequence in which to join N relations. For N relations there are exponentially many possible join orders, and the optimal solution is NP-hard. In practice, a poor join order can produce enormous intermediate results, consuming memory, CPU, and disk and turning a sub-second query into one that runs for hours. Optimizers use dynamic programming (e.g., the Selinger algorithm) and heuristics to find near-optimal orders within a time budget.

Practice Quiz

Test your understanding of query processing internals -- select the correct answer for each question.

1.Which phase of query processing is responsible for converting a logical query plan into an optimal physical query plan?

2.What is the primary purpose of a Cost Model in a query optimizer?

3.Which join algorithm is generally most efficient for equijoins when one relation is significantly smaller and can fit entirely in memory?

4.An operator that must process all its input before producing any output (e.g., a full external sort) typically requires which execution strategy?

5.What is the main advantage of Predicate Pushdown in query optimization?

6.The Iterator Model (Volcano-style) is characterized by which type of data flow between operators?

7.Which of the following is NOT a typical factor considered by a cost-based optimizer?

8.For external merge-sort of a relation with B_R blocks and M available memory blocks, approximately how many merge passes are required?

9.What is the primary benefit of vectorized execution compared to tuple-at-a-time processing?

10.In the context of join ordering, what does a "left-deep join tree" mean?

Study Tips & Memory Aids

Pipeline mnemonic — "PSLO-E"

Parsing (lex + syntax)
Semantic Analysis
Logical Plan
Optimization (physical plan)
Execution

Join algorithm I/O costs (R is smaller)

NLJ: B_R + |R|·B_S (worst)
SMJ: Sort(R) + Sort(S) + B_R + B_S
HJ: B_R + B_S (best, if R fits in mem)
INLJ: B_R + |R|·C_index

External sort formula

Total I/O = 2·B_R·(1 + ⌈log_(M−1)(B_R/M)⌉)

Volcano model interface

open() ↓ · next() ↑ · close() ↓

open/close go DOWN the tree; tuples flow UP via next()

1Introduction

Three Main Phases

2Key Definitions

3The Query Processing Pipeline

Query Processing Pipeline

3.1 Lexical Analysis & Parsing

3.2 Semantic Analysis

3.3 Intermediate Representation (Logical Plan)

4Query Optimization

4.1 Logical Rewrites

4.2 Cost-Based Optimization

4.3 Cardinality Estimation & Statistics

4.4 Rule-Based vs. Cost-Based Optimization

5Join Processing Algorithms

Join Algorithms Comparison

5.1 Nested Loop Join (NLJ)

5.2 Index Nested Loop Join (INLJ)

5.3 Sort-Merge Join (SMJ)

5.4 Hash Join (HJ)

5.5 Join Algorithm Comparison

Query Execution Plan Tree

6Sorting & Aggregation

6.1 External Merge-Sort

External Merge-Sort Animation

Complete Pipeline (Final State)

6.2 Sort-Based vs. Hash-Based Aggregation

7Query Execution Models

7.1 Iterator / Volcano Model

Volcano / Iterator Model

7.2 Pipelining vs. Materialization

7.3 Vectorized Execution

7.4 JIT Compilation

8SQL Walkthroughs with EXPLAIN ANALYZE

Example 1: Selection with Index Scan

Example 2: Hash Join with Predicate Pushdown & Hash Aggregation

Example 3: EXISTS Subquery Rewritten to Join

9Common Mistakes

Frequently Asked Questions

Practice Quiz

Study Tips & Memory Aids

Related Topics