18 Query Execution & Scheduling

Process model

Query execution

A query plan is compirsed of operators.
An operator instance is an invocation of an operator on some segment of data.
- Parallelism
A task is the execution of a sequnce of one or more operator instances.

Process model

A DBMS’s process model defines how the system is architected to support concurrent reequset from a multi-user applicatoin
A worker is the DBMS component that is responsible for execution task on the behalf of client and returning the result
Approach #1 Process per worker (prior to existence of pthread)
- replies on OS scheduler
- use shared-memory for global data structures
- A process crash doesn’t take down the entire system.
- example : IBM DB2, Postgres, Oracle
Approache #2 Process Pool
- A worker uses any process that is free in a pool
- bad for cache locality
- Used by IBM DB2, Postgres
Approach #4 Thread per worker
- single process with mltiple worker threads
- DBMS has to manage its own scheduling
- May or may not use a dispatcher thread
- thread crash may kill the entire system
- IBM DB2, MSSQL, MySQL, Oracle
- no shared memory / less contex swithc overhead
- most recent DBMS use this unless ones built on postgres
  query parallelism
scheduling
- DBMS decides where, when, how to execute it. DBMS can do this better than OS.
  - how many task should it use?
  - how many CPU cores should it use?
  - What CPU core should the tasks execute on?
  - Where should a task store its output?
- inter query parallelism
  - run multiple queries concurrently
  - won’t save you from 2PL/MVCC/OCC
- intra query parallelism
  - improve performance of single query
  - Approach #1 intra-operator(horizonal)
    - different instance on different data
  - Approach #2 inter operator(vertical)
    - executation overlap
woker allocation
- Approach #1: one worker per core
  - the thread is pinned to that core by OS
  - sched_setaffinity
- Approach #2: Multiple worker per core (SAP HANA)
  - a pool of workers per core
  - improve cpu utilization
task assignment
- Approach #1: push
  - centralized dispather assigns tasks to wokers and monitors their process
  - woker notifies dispatcher upon it finish and get a new task
- Approach #2: pull
  - no dispatcher, work pulls task, runs it and the next task
scheduler has to be awared of underlying hardware

data placement

schedule operators to work on closest CPU cores to data
memory allocation
- question is where does the OS place the physical memory
- approach #1 interleaving : distribute uniformly across CPUs
- approach #2 first-touch: that the CPU execute the access memory thread
  convert logical plan to physical plan
one OLTP query to one task cause OLTP doesn’t have much parallelism
static scheduling
- number of threads don’t change
- morsel driven scheduling
  - morsel is horizontal data partitin(like a chunk of memory)
  - scheduls dynamically
  - pull worker
  - on worker per core
- hyper
  - no dispatcher thread
  - threads cooperate
  - one lock free task queue
  - worker tries to pull task local to its data
  - multipe tasks per query
numa-aware scheduler (SAP HANA)
- pull base scheduling
- each CPU has a working group
- one pool has multiple workers
- group has priority task queue
- seperate watchdog thread reassign task if saturated thread exists
- workers are allowed to steal tasks from other group’s soft task queue
- detect query CPU bound / memory bound and adjust thread pinning
- when socket number grows large (really large), work stealing may not be benifical
- thread groups allow cores to do not just query but other things like networking etc
  DBMS is ‘strong-willed independent’
database machines once beaten by quickly increasing commdity hardware performance
trend now is speeding up DBMS component by specialized hardware like GPU
specialized OS may not be worth the software engineer effort
OS helps but sometimes may not be that helpful

18 Query Execution & Scheduling

Process model

Query execution

Process model

query parallelism

data placement

convert logical plan to physical plan

DBMS is ‘strong-willed independent’