Parallel Execution
202406212050
Status: #idea
Tags: CMU Advanced Database Systems
Parallel Execution
- Run multiple tasks simultaneously
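- 2 main types: inter-query and intra-query
Inter-Query Parallelism
- Multiple queries execute simultaneously
Intra-Query Parallelism
- The operators of a single query execute in parallel
- 2 approaches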
- Intra-operator (Horizontal)
- Inter-operator (Vertical)
- Not mutually exclusive
Intra-Operator (Horizontal)
- Operators are decomposed into independent instances
- Perform the same function on different subsets of data
- The DBMS inserts an exchange operator into the query plan to coalesce results from child operators
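A minimal sketch of the idea above, assuming a toy in-memory "table" (a Python list) and a filter as the operator being parallelized. The names `scan_filter` and `parallel_filter` and the round-robin split are illustrative assumptions, not from the course. Threads are used only to show the structure; under CPython's GIL they do not give real CPU parallelism for this kind of work.

```python
from concurrent.futures import ThreadPoolExecutor

def scan_filter(partition, predicate):
    """One operator instance: the same filter logic applied to one subset of the data."""
    return [row for row in partition if predicate(row)]

def parallel_filter(rows, predicate, num_workers=3):
    # Split the input into disjoint subsets, one per operator instance
    # (round-robin split is an assumption made for this sketch).
    partitions = [rows[i::num_workers] for i in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_results = list(
            pool.map(lambda part: scan_filter(part, predicate), partitions)
        )
    # Exchange (gather): coalesce the per-instance outputs into a single stream.
    return [row for part in partial_results for row in part]

result = parallel_filter(list(range(20)), lambda x: x % 2 == 0)
print(sorted(result))
```

The gathered output is not in input order, which is why the usage line sorts it before printing.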
Exchange Operator
- Gather -> Combine results from multiple workers into a single output stream
- Distribute -> Split a single input stream into multiple output streams
- Repartition -> Shuffle multiple input streams across multiple output streams
- Some DBMSs always repartition after each pipeline (e.g., Dremel/BigQuery)
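The three exchange types can be sketched as plain functions over lists standing in for streams. This is only an illustration: the function names, the round-robin policy in `distribute`, and the key-modulo policy in `repartition` are assumptions made for the example, and a real DBMS would move data between workers rather than between lists.

```python
def gather(streams):
    """Combine several input streams into a single output stream."""
    return [row for stream in streams for row in stream]

def distribute(stream, num_outputs):
    """Split one input stream into several output streams (round-robin, by assumption)."""
    outputs = [[] for _ in range(num_outputs)]
    for i, row in enumerate(stream):
        outputs[i % num_outputs].append(row)
    return outputs

def repartition(streams, num_outputs, key_fn):
    """Shuffle rows from several input streams across several output streams.

    key_fn maps a row to an integer; rows with the same key land in the same output.
    """
    outputs = [[] for _ in range(num_outputs)]
    for stream in streams:
        for row in stream:
            outputs[key_fn(row) % num_outputs].append(row)
    return outputs

inputs = [[1, 2, 3], [4, 5], [6]]
print(gather(inputs))                               # [1, 2, 3, 4, 5, 6]
print(distribute(range(5), 2))                      # [[0, 2, 4], [1, 3]]
print(repartition(inputs, 2, key_fn=lambda r: r))   # [[2, 4, 6], [1, 3, 5]]
```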
Inter-Operator (Vertical)
- Operators are overlapped to pipeline data from one stage to the next
- Multiple pipelines running at the same time, where data flows from one pipeline to another
- AKA pipelined parallelism
- Often used in stream processing (beyond the scope of this course)
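A rough sketch of pipelined parallelism, assuming each operator runs in its own thread and hands rows to the next stage through a bounded queue. The stage names (`scan`, `filter_even`), the `DONE` sentinel, and the queue size are illustrative choices, not something the course prescribes.

```python
import queue
import threading

DONE = object()  # sentinel marking end of stream (a convention chosen for this sketch)

def scan(rows, out_q):
    """Stage 1: emit rows one at a time."""
    for row in rows:
        out_q.put(row)
    out_q.put(DONE)

def filter_even(in_q, out_q):
    """Stage 2: forward only even values, starting as soon as stage 1 emits rows."""
    while (row := in_q.get()) is not DONE:
        if row % 2 == 0:
            out_q.put(row)
    out_q.put(DONE)

def run_pipeline(rows):
    # Bounded queues let a fast producer block instead of buffering everything.
    q1, q2 = queue.Queue(maxsize=4), queue.Queue(maxsize=4)
    stages = [
        threading.Thread(target=scan, args=(rows, q1)),
        threading.Thread(target=filter_even, args=(q1, q2)),
    ]
    for t in stages:
        t.start()
    # Stage 3 (sum) runs in the calling thread and consumes rows as they arrive.
    total = 0
    while (row := q2.get()) is not DONE:
        total += row
    for t in stages:
        t.join()
    return total

print(run_pipeline(range(10)))  # 0 + 2 + 4 + 6 + 8 = 20
```

All three stages are active at once: the filter works on early rows while the scan is still producing later ones, which is the overlap the bullets describe.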