Processing Model

Defines how the system executes a query plan and moves data from one operator to the next
Consists of 2 types of execution paths:
1. Control flow -> How DBMS invokes an operator
2. Data flow -> How operator sends its result
Output of an operator can either be whole tuples (NSM) or subsets of columns (DSM)
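
A minimal sketch of the two output shapes, assuming a made-up three-row table and a made-up predicate (age > 26). The names rows, nsm_output and dsm_output are illustrative only and do not come from any particular DBMS.

```python
# Hypothetical table of (id, name, age) rows, used only for illustration.
rows = [
    (1, "alice", 30),
    (2, "bob", 25),
    (3, "carol", 41),
]

# NSM-style output: the operator emits whole tuples.
nsm_output = [row for row in rows if row[2] > 26]

# DSM-style output: the operator emits only the columns the query needs,
# each kept as its own array.
ids = [row[0] for row in rows]
ages = [row[2] for row in rows]
dsm_output = {"id": [i for i, age in zip(ids, ages) if age > 26]}

print(nsm_output)
print(dsm_output)
```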

Iterator Model

AKA volcano or pipeline model
Each query plan operator implements a Next() function
- On each invocation, operator returns a single tuple or EOF marker
- Operator implements a loop that calls Next() on its children to retrieve their tuples and then processes them
Each operator also implements Open() and Close(), which are analogous to constructors and destructors
Used in almost every DBMS
- Easy to implement/debug
- Output control (e.g. LIMIT) works easily with this approach, because the parent simply stops calling Next()
Allows pipelining
- A pipeline breaker is an operator that cannot finish until all its children emit all their tuples (e.g. a sort for ORDER BY)
- A tuple will often remain in CPU cache while it moves through the operators of a pipeline
Downside
- Many function calls: one Next() call per tuple per operator
- Mixes control flow and data flow
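
The iterator model described above can be sketched roughly as follows. This is an illustrative toy, not how any specific DBMS implements it: the operator classes (SeqScan, Filter, Limit), the run() helper, the reads counter and the sample table are all assumptions, and None stands in for the EOF marker.

```python
class SeqScan:
    """Leaf operator: returns one tuple per next() call."""
    def __init__(self, table):
        self.table = table

    def open(self):
        self.pos = 0
        self.reads = 0  # hypothetical counter, only to show how much was scanned

    def next(self):
        if self.pos >= len(self.table):
            return None  # EOF marker
        row = self.table[self.pos]
        self.pos += 1
        self.reads += 1
        return row

    def close(self):
        pass


class Filter:
    """Pulls tuples from its child until one satisfies the predicate."""
    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def open(self):
        self.child.open()

    def next(self):
        while True:
            row = self.child.next()
            if row is None:
                return None
            if self.predicate(row):
                return row

    def close(self):
        self.child.close()


class Limit:
    """Stops asking its child for tuples after n have been emitted."""
    def __init__(self, child, n):
        self.child = child
        self.n = n

    def open(self):
        self.child.open()
        self.emitted = 0

    def next(self):
        if self.emitted >= self.n:
            return None
        row = self.child.next()
        if row is not None:
            self.emitted += 1
        return row

    def close(self):
        self.child.close()


def run(plan):
    """Drive the plan from the root, one tuple at a time."""
    plan.open()
    out = []
    while True:
        row = plan.next()
        if row is None:
            break
        out.append(row)
    plan.close()
    return out


table = [(1, 30), (2, 25), (3, 41), (4, 52)]
scan = SeqScan(table)
plan = Limit(Filter(scan, lambda r: r[1] > 26), 2)
result = run(plan)
print(result, scan.reads)
```

In this sketch the scan reads only three of the four rows, because Limit stops calling next() once it has emitted two tuples.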

Materialization Model

Each operator processes its input all at once, and then emits its output all at once
DBMS can push down hints (e.g. LIMIT) to avoid scanning too many tuples
Great for OLTP, because queries access only a small number of tuples at a time
Bad for OLAP with large intermediate results
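
A rough sketch of all-at-once processing, under stated assumptions: the output() method name, the limit hint parameter on SeqScan and the sample table are hypothetical and chosen only for illustration.

```python
class SeqScan:
    """Leaf operator: builds its entire output before returning it."""
    def __init__(self, table, limit=None):
        self.table = table
        self.limit = limit  # optional hint pushed down by the DBMS

    def output(self):
        out = []
        for row in self.table:
            if self.limit is not None and len(out) >= self.limit:
                break  # the hint lets the scan stop early
            out.append(row)
        return out


class Filter:
    """Consumes the child's full result, then emits its own full result."""
    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def output(self):
        out = []
        for row in self.child.output():
            if self.predicate(row):
                out.append(row)
        return out


table = [(1, 30), (2, 25), (3, 41), (4, 52)]

# One output() call per operator; each intermediate result is fully materialized.
result = Filter(SeqScan(table), lambda r: r[1] > 26).output()

# With a pushed-down hint the scan touches only the first two rows.
hinted = SeqScan(table, limit=2).output()

print(result)
print(hinted)
```

Each operator is invoked once, which keeps call overhead low, but every intermediate list has to be held in full. That is cheap for small OLTP results and expensive for large OLAP ones.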

Vectorization Model

Like the iterator model, but each Next() call emits a batch of tuples instead of a single tuple
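Best of both worlds: fewer Next() calls than tuple-at-a-time processing, without materializing entire intermediate results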
Operator's loop processes a vector of tuples
Size of batch can be set based on hardware or query
Each batch will contain a null bitmap as well.
Ideal for OLAP. Works well with out-of-order CPUs
- No data or control dependencies between loop iterations
- Operators perform work in tight for-loops
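
A minimal sketch of batch-at-a-time processing, with several assumptions: a single integer column, a batch size of 2, a list of booleans standing in for the null bitmap, and a hypothetical AddConstant operator. NULL slots hold a placeholder value so the inner loop needs no per-tuple branch.

```python
class SeqScan:
    """Leaf operator: returns a batch (values, nulls) per next() call."""
    def __init__(self, values, nulls, batch_size):
        self.values = values
        self.nulls = nulls  # list of booleans standing in for a null bitmap
        self.batch_size = batch_size

    def open(self):
        self.pos = 0

    def next(self):
        if self.pos >= len(self.values):
            return None  # EOF marker
        end = self.pos + self.batch_size
        batch = (self.values[self.pos:end], self.nulls[self.pos:end])
        self.pos += len(batch[0])
        return batch

    def close(self):
        pass


class AddConstant:
    """Adds k to every value in a batch using one tight loop."""
    def __init__(self, child, k):
        self.child = child
        self.k = k

    def open(self):
        self.child.open()

    def next(self):
        batch = self.child.next()
        if batch is None:
            return None
        values, nulls = batch
        # Tight loop over the batch: no branch per tuple. NULL slots are
        # computed too, and the bitmap tells consumers to ignore them.
        out = [v + self.k for v in values]
        return out, nulls

    def close(self):
        self.child.close()


def run(plan):
    """Drive the plan from the root, one batch at a time."""
    plan.open()
    batches = []
    while True:
        batch = plan.next()
        if batch is None:
            break
        batches.append(batch)
    plan.close()
    return batches


# The third value is NULL: its slot holds a placeholder 0.
values = [10, 20, 0, 40, 50]
nulls = [False, False, True, False, False]
plan = AddConstant(SeqScan(values, nulls, batch_size=2), k=5)
batches = run(plan)
print(batches)
```

Five tuples pass through the plan in three next() calls per operator instead of five, and the per-batch loop is the kind of code a compiler or CPU can optimize well.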