db execution

Sam Li

2022-03-17

computer systems › 15-445 database

sorting
aggregations
joins

Execution process: the operators are arranged in a tree.

sorting¶

supported feature: ORDER BY, DISTINCT, GROUP BY

Problem: can’t guarantee sorting data to fit in memory

Approach: external merge sort: splits the data set into separate runs, sorts them individually, and then combine into larger sorted runs.

run: a list of key/value pairs.
- Key: attributes to compare
- Value: record id/tuple

Procedure
- Sorting: sort in memory and then write back to a file on disk
- Merging: combine sorted sub-files into single file
2-way external merge sort
- data set = N pages, buffer pool = B pages
- Pass:read and write each page in file.
- Number of passes = 1 + log₂^N
  - 1 represents pass 0 which reads every B pages of the table into buffer pool(memory).
  - In this case, B >= 3(2I, 1O).
- totoal I/O cost: 2N*(# of passes)
- Problem: the worker blocks on disk I/O -> more buffer pages won’t help in this case
  - Approach:double buffering optimazation: prefetching the next run in the background and storing it in a second buffer while the system is processing the current run.
  - pro: reduce the wait time for I/O requests
  - Multi-thread env
General(k-way) merge sort

let k = B -1, #of runs = N/B * B

#of sorted runs = N/B, merge k runs

#of passes = 1 + log_B-1^(N/B) , total I/O cost = 2N*(# of passes)
optimize sorting by using b+ tree: Retrieve tuples in desired sort order by simply traversing the leaf pages of the tree.

The point is # of I/O access.

Case 1: clustered b+ tree: if the index is a clustered index, traverse the tree

case 2: unclustered b+ tree: else, don’t do it.

aggregations¶

Aggregations: Collapse values for a single attribute from multiple tuples into a single scalar value.
- two approaches for implementations: sorting, hashing
- -> GROUP BY, DISTINCT
Sorting -> ordering query

optimization: perform filter first: reduce the amount of data requiring to be sorted.
Hashing -> no ordering

Like sorting algorithms, DBMS spills data to disk when data doesn’t fit in memory. Two phases for external hashing aggregate: partition, rehash.

The point is the location of operations given context.
- problem 1: what to split?
  1. partition: one or more pages that contain the set of keys with the same hash value.
- problem 2 : how to split?
  1. hash function: used to split tuples into partitions on disk
  2. rehash: Build in-memory hash table for each partition and compute the aggregation
    
    Assume: each partition fits in memory.
- summarization: result form: (GroupKey -> RunningVal)
  
  scenario: Insert new tuples(G’, R’) into the hash table
  - like dictionary insertion: G’ existed, update R to R’; G’ not existed, insert (G’, R’)

joins¶

inner equijoin algorithm: joins tables where keys are equal.
goal: minimize repetitions
Join operators: choose what to join and smart joins
operator output: For a tuple r ∈ R and a tuple s ∈ S that match on join attributes, the join operator concatenates r and s together into a new output tuple.
- early materialization(data): Copy the values for the attributes in outer and inner tuples into a new output tuple. -> no need to go back in the query plan
- late materialization( record ids): Only copy the joins keys along with the record ids of the matching tuples -> suits for column store
cost analysis criteria: # of disk I/Os used to compute join

Only count input costs. Because outputs depend on the date computed afterwards.

The point is to find appropriate algorithms in certain scenario.
Variables used in this lecture:
- M pages in table R (Outer Table), m tuples total
- N pages in table S (Inner Table), n tuples total

Nested loop join¶

Nested loop join: two nested FOR loops that iterate over the tuples in both tables and compares each unique of them.
- locality:DBMS uses smaller table as the outer table and buffers it in memory.
- Index: find matches

Stupid/Simple NLJ:

1
2
3

foreach tuple r ∈ R:
	foreach tuple s ∈ S:
		emit, if r and s match

Block NLJ:

foreach block B_R ∈ R:
	foreach block B_S ∈ S:
		foreach tuple r ∈ B_R: 
			foreach tuple s ∈ B_s:
				emit, if r and s match

Index NLJ

Sort-merge join¶

Sort-merge join: sorts the two tables on their join key(s)

sort R,S on join keys
	cursorR ← Rsorted, cursorS ← Ssorted
while cursorR and cursorS:
	if cursorR > cursorS:
		increment cursorS
	if cursorR < cursorS:
		increment cursorR
	elif cursorR and cursorS match:
		emit
		increment cursorS

Cost analysis: sort + merge

Hash join¶

hash join: use a hash table to split up the tuples into smaller chunks based on their join attribute(s).
basic hash join
Grace hash join/Hybrid hash join