2. 2
Query Processing
▪ Overview
▪ Measures of Query Cost
▪ Selection Operation
▪ Sorting
▪ Join Operation
▪ Other Operations
▪ Evaluation of Expressions
3. 3
Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
4. 4
Basic Steps in Query Processing (Cont.)
▪ Parsing and translation
• Translate the query into its internal form, which is then
translated into relational algebra.
• The parser checks syntax and verifies relations.
▪ Evaluation
• The query-execution engine takes a query-evaluation
plan, executes that plan, and returns the answers to the
query.
5. 5
Basic Steps in Query Processing: Optimization
Equivalent expressions?
▪ Given schema
instructor (id, name, dept-name, salary)
Query: Write a relational algebra expression to find the id and salary of
all instructors whose id is greater than 50501.
SQL: SELECT id, salary FROM instructor WHERE id > 50501
Answer 1: σid>50501(∏id, salary(instructor))
is equivalent to
Answer 2: ∏id, salary(σid>50501(instructor))
Discussion 1: Suppose instructor is physically stored in sorted order of id.
Will the query-processing cost of Answer 1 and Answer 2 be the
same? Explain.
6. 6
Basic Steps in Query Processing: Optimization
▪ Each relational algebra operation can be evaluated using one
of several different algorithms
• Correspondingly, a relational-algebra expression can be
evaluated in many ways.
▪ An annotated expression specifying a detailed evaluation strategy
is called an evaluation plan. E.g.:
• Use an index on id to find instructors with id > 50501
(Answer 2),
• or perform a complete relation scan and discard instructors
with id <= 50501 (Answer 1).
7. 7
Basic Steps: Optimization (Cont.)
▪ Query Optimization: Amongst all equivalent evaluation plans
choose the one with lowest cost.
• Cost is estimated using statistical information from the
database catalog
▪ e.g., number of tuples in each relation, size of tuples,
etc.
▪ In this chapter we study
• How to measure query costs
• Algorithms for evaluating relational algebra operations
• How to combine algorithms for individual operations in
order to evaluate a complete expression
8. 8
Measures of Query Cost
▪ Many factors contribute to time cost
• disk access, CPU, and network communication
▪ Cost can be measured based on
• response time, i.e. total elapsed time for answering query, or
• total resource consumption
▪ We use total resource consumption as cost metric
• Response time harder to estimate, and minimizing resource
consumption is a good idea in a shared database
▪ We ignore CPU costs for simplicity
• Real systems do take CPU cost into account
• Network costs must be considered for parallel systems
▪ We describe how to estimate the cost of each operation
• We do not include the cost of writing output to disk
Discussion: Which costs will we consider for a query-evaluation
plan? Explain why or why not.
9. 9
Measures of Query Cost
▪ Disk cost can be estimated as the sum of:
• Number of seeks * average-seek-cost
• Number of blocks read * average-block-read-cost
• Number of blocks written * average-block-write-cost
▪ What is the number of seeks?
▪ Answer: The number of times the disk head must move
from one track to another.
▪ What is the number of block reads?
▪ Answer: The number of blocks that need to be transferred from
disk to memory to process the query.
▪ What is the number of blocks written?
▪ Answer: The number of blocks that need to be written from
memory to disk to process the query.
10. 10
Magnetic Hard Disk Mechanism
[Figure: schematic diagram and photo of a magnetic disk drive]
11. 11
Measures of Query Cost
▪ For simplicity we just use the number of block transfers
from disk and the number of seeks as the cost measures
• tT – time to transfer one block
▪ Assuming for simplicity that write cost is the same as read
cost
• tS – time for one seek
• Cost for b block transfers plus S seeks:
b * tT + S * tS
▪ tS and tT depend on where data is stored; with 4 KB blocks:
• High-end magnetic disk: tS = 4 msec and tT = 0.1 msec
• SSD: tS = 20-90 microsec and tT = 2-10 microsec
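The cost measure b * tT + S * tS can be sketched as a small helper. This is only an illustration: the function name and the 1000-block scan are assumptions, and the timings are the magnetic-disk figures quoted on this slide.

```python
def disk_cost_seconds(block_transfers, seeks, t_transfer, t_seek):
    """Estimated I/O time for b block transfers plus S seeks: b*tT + S*tS."""
    return block_transfers * t_transfer + seeks * t_seek


# Figures quoted on the slide for a high-end magnetic disk (4 KB blocks).
T_TRANSFER_DISK = 0.1e-3  # tT = 0.1 msec, expressed in seconds
T_SEEK_DISK = 4e-3        # tS = 4 msec, expressed in seconds

# Hypothetical example: a sequential scan of 1000 blocks needs
# 1000 block transfers and a single seek.
scan_cost = disk_cost_seconds(1000, 1, T_TRANSFER_DISK, T_SEEK_DISK)
print(f"estimated scan time: {scan_cost:.3f} s")
```

Because tS is much larger than tT on a magnetic disk, plans that avoid seeks tend to win even when they transfer somewhat more blocks.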
12. 12
Selection Operation
● File scan
● Algorithm A1 (linear search). Scan each file block and test all records to
see whether they satisfy the selection condition.
● Cost estimate = br block transfers + 1 seek
• br denotes the number of blocks containing records from relation r
● If the selection is on a key attribute, the scan can stop on finding the record
• Average cost = (br / 2) block transfers + 1 seek
● Linear search can be applied regardless of
• the selection condition,
• the ordering of records in the file, or
• the availability of indices
● Note: binary search generally does not make sense since data is not
stored consecutively
• except when there is an index available,
• and binary search requires more seeks than index search
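A minimal sketch of linear search (A1), assuming the file is modeled as a Python list of blocks, each block being a list of records. The function name, the `key_attribute` flag, and the sample records are illustrative assumptions, not part of the slides.

```python
def linear_search(blocks, predicate, key_attribute=False):
    """Scan every block and test every record (algorithm A1).

    Returns (matching records, number of blocks read). When the selection
    is on a key attribute, the scan stops at the first match.
    """
    matches = []
    blocks_read = 0
    for block in blocks:
        blocks_read += 1  # one block transfer per block scanned
        for record in block:
            if predicate(record):
                matches.append(record)
                if key_attribute:
                    return matches, blocks_read
    return matches, blocks_read


# Hypothetical file with two records per block.
file_blocks = [
    [(25001, "Fatema"), (25002, "Sajid")],
    [(25003, "Abid"), (25004, "Fatema")],
    [(25005, "Sajid"), (25006, "Shafique")],
]

by_key = linear_search(file_blocks, lambda rec: rec[0] == 25003, key_attribute=True)
by_name = linear_search(file_blocks, lambda rec: rec[1] == "Fatema")
```

The key lookup stops after two blocks, while the nonkey selection has to read all br blocks because further matches may appear anywhere in the file.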
13. 13
Selections Using Indices
● Index scan – search algorithms that use an
index
● Selection condition must be on search-
key of index.
● A2 (primary index, equality on key).
Retrieve a single record that satisfies the
corresponding equality condition
● Let hi denote the height of the index
tree
● Cost = (hi + 1) * (tT + tS)
● A5 (primary index, comparison).
(Relation is sorted on A)
• For σA ≥ v(r), use the index to find the
first tuple ≥ v and scan the
relation sequentially from
there
• For σA ≤ v(r), just scan the relation
sequentially until the first tuple > v;
do not use the index
ID     Name      Dept_name   tot_Cred
25001  Fatema    ECE         120
25002  Sajid     Math        100
25003  Abid      Accounting  120
25004  Fatema    ECE         50
25005  Sajid     Math        120
25006  Shafique  Accounting  100
25007  Abid      ECE         120
25008  Fatema    Math        50
25009  Sajid     ECE         120
25010  Shafique  Math        100
[Figure: index tree of height hi, from ROOT to LEAF]
14. 14
Selections Using Indices
● A3 (primary index, equality on nonkey).
Retrieve multiple records.
● Records will be on consecutive blocks
• Let b = number of blocks
containing matching records
● Cost = hi * (tT + tS) + tS + tT * b
ID     Name      Dept_name   tot_Cred
25003  Abid      Accounting  120
25007  Abid      ECE         120
25001  Fatema    ECE         120
25004  Fatema    ECE         50
25008  Fatema    Math        50
25002  Sajid     Math        100
25005  Sajid     Math        120
25009  Sajid     ECE         120
25006  Shafique  Accounting  100
25010  Shafique  Math        100
[Figure: index tree of height hi, from ROOT to LEAF]
15. 15
Selections Using Indices
● A4 (secondary index, equality on
nonkey).
● Retrieve a single record if the
search-key is a candidate key
• Cost = (hi + 1) * (tT + tS)
● Retrieve multiple records if the
search-key is not a candidate
key
• each of n matching records
may be on a different block
• Cost = (hi + n) * (tT + tS)
– Can be very expensive!
ID     Name      Dept_name   tot_Cred
25001  Fatema    ECE         120
25002  Sajid     Math        100
25003  Abid      Accounting  120
25004  Fatema    ECE         50
25005  Sajid     Math        120
25006  Shafique  Accounting  100
25007  Abid      ECE         120
25008  Fatema    Math        50
25009  Sajid     ECE         120
25010  Shafique  Math        100
[Figure: index tree of height hi, from ROOT to LEAF, with a bucket of record pointers]
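To compare the index-based selection costs, the formulas for A2, A3, and A4 can be written as small functions. This is a sketch only: the function names and the sample values (index height 3, magnetic-disk timings in milliseconds, 100 matching records) are assumptions chosen for illustration.

```python
def cost_a2(h, t_transfer, t_seek):
    """A2, primary index, equality on key: (hi + 1) * (tT + tS)."""
    return (h + 1) * (t_transfer + t_seek)


def cost_a3(h, b, t_transfer, t_seek):
    """A3, primary index, equality on nonkey: hi * (tT + tS) + tS + tT * b."""
    return h * (t_transfer + t_seek) + t_seek + t_transfer * b


def cost_a4_nonkey(h, n, t_transfer, t_seek):
    """A4, secondary index, equality on nonkey: (hi + n) * (tT + tS)."""
    return (h + n) * (t_transfer + t_seek)


# Assumed values: index height 3, tT = 0.1 msec, tS = 4 msec.
H, TT, TS = 3, 0.1, 4.0

single_record = cost_a2(H, TT, TS)                 # one record via the index
clustered_match = cost_a3(H, 2, TT, TS)            # matches fill 2 consecutive blocks
scattered_match = cost_a4_nonkey(H, 100, TT, TS)   # 100 matches, each on its own block
print(single_record, clustered_match, scattered_match)
```

With these numbers the secondary-index lookup costs roughly 25 times as much as the clustered one, which is why A4 on a nonkey is flagged as potentially very expensive.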
16. 16
Selections Using Indices
● A6 (secondary index,
comparison).
• For σA ≥ v(r), use the index to
find the first index entry ≥ v and
scan the index sequentially
from there to find pointers
to records.
• For σA ≤ v(r), just scan the leaf
pages of the index, finding
pointers to records, until the first
entry > v
• In either case, retrieve the
records that are pointed to
– requires an I/O for each
record
– a linear file scan may be
cheaper
ID     Name      Dept_name   tot_Cred
25001  Fatema    ECE         120
25002  Sajid     Math        100
25003  Abid      Accounting  120
25004  Fatema    ECE         50
25005  Sajid     Math        120
25006  Shafique  Accounting  100
25007  Abid      ECE         120
25008  Fatema    Math        50
25009  Sajid     ECE         120
25010  Shafique  Math        100
17. 17
Implementation of Complex Selections
● Conjunction: σθ1 ∧ θ2 ∧ . . . ∧ θn(r)
● A7 (conjunctive selection using one index).
● Select a combination of θi and algorithms A1 through A7 that
results in the least cost for σθi(r).
● Test the other conditions on each tuple after fetching it into the memory
buffer.
18. 18
Algorithms for Complex Selections
● Disjunction: σθ1 ∨ θ2 ∨ . . . ∨ θn(r).
● A10 (disjunctive selection by union of identifiers).
● Applicable if all conditions have available indices.
• Otherwise use linear scan.
● Use the corresponding index for each condition, and take the
union of all the obtained sets of record pointers.
● Then fetch the records from the file
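The union-of-identifiers idea can be sketched as follows, assuming equality conditions only and in-memory dictionaries standing in for indices. `build_index`, `disjunctive_select`, and the sample records are hypothetical names and data introduced for illustration.

```python
def build_index(records, attr):
    """Hypothetical in-memory index: attribute value -> set of record ids."""
    index = {}
    for rid, rec in records.items():
        index.setdefault(rec[attr], set()).add(rid)
    return index


def disjunctive_select(records, indices, conditions):
    """Evaluate (attr1 = v1) OR (attr2 = v2) OR ... in the style of A10."""
    # A10 applies only if every condition has an index; otherwise linear scan.
    if any(attr not in indices for attr, _ in conditions):
        return [rec for rec in records.values()
                if any(rec[attr] == value for attr, value in conditions)]
    rids = set()
    for attr, value in conditions:
        rids |= indices[attr].get(value, set())  # union of record pointers
    return [records[rid] for rid in sorted(rids)]  # then fetch the records


records = {
    25001: {"name": "Fatema", "dept": "ECE"},
    25002: {"name": "Sajid", "dept": "Math"},
    25003: {"name": "Abid", "dept": "Accounting"},
    25004: {"name": "Fatema", "dept": "ECE"},
}
indices = {"name": build_index(records, "name"),
           "dept": build_index(records, "dept")}
conditions = [("name", "Abid"), ("dept", "ECE")]

matches = disjunctive_select(records, indices, conditions)
fallback = disjunctive_select(records, {"name": indices["name"]}, conditions)
```

Taking the union of pointer sets before fetching means a record that satisfies several conditions is read only once.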
19. 19
Sorting
▪ We may build an index on the relation, and then use the
index to read the relation in sorted order. This may lead to one
disk block access for each tuple.
Discussion: In which case may it lead to one disk block access
for each tuple?
▪ For relations that fit in memory, techniques like quicksort
can be used.
▪ For relations that don’t fit in memory, quicksort is not
applicable. Why? External sort-merge is a good choice in
this case.
20. 20
Example: External Sorting Using Sort-Merge
[Figure: external sorting using sort-merge]
initial relation: g 24, a 19, d 31, c 33, b 14, e 16, r 16, d 21, m 3, p 2, d 7, a 14
create runs:      (a 19, d 31, g 24) (b 14, c 33, e 16) (d 21, m 3, r 16) (a 14, d 7, p 2)
merge pass–1:     (a 19, b 14, c 33, d 31, e 16, g 24) (a 14, d 7, d 21, m 3, p 2, r 16)
merge pass–2:     sorted output a 14, a 19, b 14, c 33, d 7, d 21, d 31, e 16, g 24, m 3, p 2, r 16
21. 21
External Sort-Merge
(Run Creation)
Let M denote the memory size (in pages).
Here M = 3.
1. Create sorted runs. Let i be 0 initially.
Repeatedly do the following till the end
of the relation:
(a) Read M blocks of the relation into
memory
(b) Sort the in-memory blocks
(c) Write the sorted data to run Ri;
increment i.
Let the final value of i be N.
Here N = 4 (runs R1 to R4 in the figure).
2. Merge the runs (next slide).
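The run-creation step can be sketched in a few lines, assuming the relation is a list of blocks held in a Python list and that one record fits in a block, as in the example figure. Records are compared as whole tuples here, an assumption made so that ties on the first attribute are ordered deterministically.

```python
def create_runs(relation_blocks, M):
    """Read M blocks at a time, sort them in memory, and emit one run each."""
    runs = []
    for start in range(0, len(relation_blocks), M):
        chunk = relation_blocks[start:start + M]        # (a) read M blocks
        records = sorted(rec for block in chunk for rec in block)  # (b) sort
        runs.append(records)                            # (c) write run Ri
    return runs


# The initial relation from the example, one record per block (assumed).
initial_relation = [("g", 24), ("a", 19), ("d", 31), ("c", 33), ("b", 14),
                    ("e", 16), ("r", 16), ("d", 21), ("m", 3), ("p", 2),
                    ("d", 7), ("a", 14)]
blocks = [[rec] for rec in initial_relation]

runs = create_runs(blocks, M=3)
print(len(runs), "runs created")
```

With 12 blocks and M = 3 this yields N = 4 runs, matching the value given on the slide.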
22. 22
External Sort-Merge
(Sort-Merge)
Merge pass 1
Merge the runs (2-way merge). Here
N ≥ M (M = 3, N = 4), so the runs cannot all be merged in one pass.
Use 2 blocks of memory to buffer input
runs, and 1 block to buffer output. Read
the first block of each run into its buffer
page.
repeat
  Select the first record (in sort order)
  among all buffer pages.
  Write the record to the output buffer.
  If the output buffer is full, write it to
  disk.
  Delete the record from its input buffer
  page.
  If the buffer page becomes empty,
  then read the next block (if any) of the run
  into the buffer.
until all input buffer pages are empty
[Figure: input buffers hold the first blocks of R1 (a 19) and R2 (b 14); a 19 is written to the output buffer]
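A minimal sketch of the merge phase, assuming runs are in-memory lists and using `heapq.merge` from the Python standard library in place of explicit buffer pages. The function names are illustrative; each pass merges contiguous groups of M - 1 runs, and whole tuples are compared (an assumption, as in the example data).

```python
import heapq


def merge_pass(runs, M):
    """Merge contiguous groups of M - 1 runs into longer runs."""
    fan_in = M - 1  # M - 1 input buffers, 1 output buffer
    return [list(heapq.merge(*runs[i:i + fan_in]))
            for i in range(0, len(runs), fan_in)]


def merge_all(runs, M):
    """Repeat merge passes until a single sorted run remains."""
    passes = 0
    while len(runs) > 1:
        runs = merge_pass(runs, M)
        passes += 1
    sorted_output = runs[0] if runs else []
    return sorted_output, passes


# The four initial runs of the example (M = 3).
initial_runs = [
    [("a", 19), ("d", 31), ("g", 24)],
    [("b", 14), ("c", 33), ("e", 16)],
    [("d", 21), ("m", 3), ("r", 16)],
    [("a", 14), ("d", 7), ("p", 2)],
]

after_pass_1 = merge_pass(initial_runs, M=3)
sorted_output, passes = merge_all(initial_runs, M=3)
```

`heapq.merge` consumes its inputs lazily, which loosely mirrors reading one block of each run at a time, although a real implementation would manage buffer pages and disk I/O explicitly.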
23. 23
External Sort-Merge
(Sort Merge)
Merge pass 1 (continued)
Merge the runs (2-way merge), continuing the loop from the previous
slide: 2 blocks of memory buffer the input runs and 1 block buffers
the output. The emptied input buffer is refilled with the next block
of its run, and the first record (in sort order) among the buffer
pages is again moved to the output buffer.
[Figure: input buffers now hold d 31 (next block of R1) and b 14 (R2); b 14 is written to the output buffer]
24. 24
External Sort-Merge
(Algorithm Analysis)
▪ If N ≥ M, several merge passes are
required.
▪ Here, 2 passes are needed.
▪ In each pass, contiguous groups of M - 1
runs are merged.
▪ A pass reduces the number of runs by a
factor of M - 1, and creates runs longer
by the same factor.
E.g., if M = 11 and there are 90 runs,
one pass reduces the number of
runs to 9, each 10 times the size of
the initial runs:
number of runs after the pass = 90 / (11 - 1) = 9
▪ Repeated passes are performed till all
runs have been merged into one.
25. 25
External Sort-Merge
(Algorithm Analysis)
Problem: Find the number of runs, number
of passes, number of block transfers, and
number of seeks for a relation with br blocks.
The memory size is M, the number of runs is N, and
N > M.
▪ Cost analysis (total number of runs)
▪ Total number of blocks of the relation is br
▪ Size of the memory is M
▪ Total number of runs = ⌈br/M⌉
▪ E.g., number of blocks = 100 and memory size = 11:
number of runs = ⌈100/11⌉ = 10
26. 26
External Sort-Merge
(Algorithm Analysis)
▪ Cost analysis (total number of passes)
▪ Total number of blocks of the relation is br
▪ Size of the memory is M
▪ Total number of runs = ⌈br/M⌉
▪ After merge pass 1, number of runs =
⌈br/M⌉ / (M-1)
▪ After merge pass 2, number of runs
= ⌈br/M⌉ / ((M-1) (M-1))
= ⌈br/M⌉ / (M-1)^2
▪ Merging ends after p passes, when ⌈br/M⌉ / (M-1)^p ≤ 1, so
total number of passes = ⌈log_(M–1)(br/M)⌉
▪ In the example: ⌈log_(3–1)(12/3)⌉ = 2
27. 27
External Sort-Merge
(Algorithm Analysis)
Cost analysis (total number of block
transfers)
Run creation and each merge pass read and write every block once:
br + br = 2 br block transfers each.
For the final pass, we don’t count the write cost,
since the output of an operation may be sent
to the parent operation.
Block transfers for run creation
= br for read + br for write = 2 br
Block transfers for merging
= 2 br * ⌈log_(M–1)(br/M)⌉
Block transfers for run creation and merging (B_RM)
= 2 br + 2 br * ⌈log_(M–1)(br/M)⌉
The final merge pass does not write, so
net block transfers = B_RM - br
= 2 br + 2 br * ⌈log_(M–1)(br/M)⌉ - br
= br * (2 * ⌈log_(M–1)(br/M)⌉ + 1)
28. 28
External Sort-Merge
(Algorithm Analysis)
Cost analysis (total number of seeks)
During run generation: one seek to read
each run and one seek to write each run:
2 ⌈br / M⌉
During the merge phase:
2 br seeks are needed for each merge pass (one block is read or written at a time),
except the final one, which does not require
a write.
Total number of seeks
= seeks for run creation + seeks for merging
= 2 ⌈br / M⌉ + 2 br ⌈log_(M–1)(br / M)⌉ - br
= 2 ⌈br / M⌉ + br (2 ⌈log_(M–1)(br / M)⌉ - 1)
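The run, pass, block-transfer, and seek formulas can be checked numerically with a short sketch. The function names are assumptions, and the pass count is obtained by repeated ceiling division instead of a floating-point logarithm, which gives the same value as ⌈log_(M–1)(br/M)⌉ without rounding issues. The sketch assumes br > M and M ≥ 3, so that at least one merge pass is needed.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b."""
    return -(-a // b)


def external_sort_cost(br, M):
    """Return (runs, merge passes, block transfers, seeks) for sort-merge."""
    if M < 3 or br <= M:
        raise ValueError("sketch assumes M >= 3 and br > M")
    runs = ceil_div(br, M)
    passes, remaining = 0, runs
    while remaining > 1:  # each pass merges groups of M - 1 runs
        remaining = ceil_div(remaining, M - 1)
        passes += 1
    transfers = br * (2 * passes + 1)         # final write not counted
    seeks = 2 * runs + br * (2 * passes - 1)  # one block per seek when merging
    return runs, passes, transfers, seeks


figure_example = external_sort_cost(br=12, M=3)   # the 12-block example
larger_example = external_sort_cost(br=200, M=5)  # 200 blocks, 5 memory blocks
print(figure_example, larger_example)
```

For the 12-block example this gives 4 runs and 2 passes, in line with the slide; the 200-block case needs 3 merge passes (40, then 10, then 3 runs, then 1).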
29. 29
Problem to Solve
Question: The size of a relation is 200 blocks and the memory size is 5 blocks.
a. Find the number of initial sorted runs and explain the run creation method.
30. 30
Problem to Solve
Question: The size of a relation is 200 blocks and the memory size is 5 blocks.
b. Find the number of runs created after the first merge pass and explain the first merge
pass.
Number of initial runs = 200/5 = 40
Input buffers = 4
Output buffer = 1
Number of runs after merge pass 1 = 40/4 = 10
31. 31
Problem to Solve
Question: The size of a relation is 200 blocks and the memory size is 5 blocks.
c. Find the number of block transfers in the final merge pass and explain why the final
sorted run is not saved to disk.
Number of initial runs = 200/5 = 40
Input buffers = 4
Output buffer = 1
Number of runs after merge pass 1 = 40/4 = 10
Merge pass 1: 40 runs to 10 runs
Merge pass 2: 10 runs to 3 runs
Merge pass 3: 3 runs to 1 run
Number of merge passes =
32. 32
Problem to Solve
Question: The size of a relation is 200 blocks and the memory size is 5 blocks.
c. Find the number of block transfers in the final merge pass and explain why the final
sorted run is not saved to disk.
Merge pass 1: 40 runs to 10 runs
Merge pass 2: 10 runs to 3 runs
Merge pass 3: 3 runs to 1 run
Number of merge passes = 3
Number of block transfers in the final merge pass = br = 200 blocks (reads only)
33. 33
Join Operation
▪ Several different algorithms to implement joins
• Nested-loop join
• Block nested-loop join
▪ Choice based on cost estimate
▪ Examples use the following information
• Number of records of student: 20,000; takes: 40,000
• Number of blocks of student: 2,000; takes: 400
34. 34
Nested-Loop Join
To compute the theta join
r ⨝θ s
for each tuple tr in r do begin
  for each tuple ts in s do begin
    test pair (tr, ts) to see if they satisfy the join condition θ
    if they do,
      add tr • ts to the result.
  end
end
r is the outer relation and s the inner relation of the join.
Expensive since it examines every pair of
tuples in the two relations.
Cost Analysis (block transfers)
Case 1 (worst case):
nr = total number of tuples in r
br = number of blocks in r
bs = number of blocks in s
If there is enough memory only to hold one
block of each relation,
the estimated number of block transfers is
nr * bs + br
Number of seeks = nr + br
[Figure: blocks B1(r) … Br(r) of r and B1(s) … Bs(s) of s; memory holds one block of each relation (M1, M2); iteration 1]
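The pseudocode translates almost line for line into Python. In this sketch relations are lists of tuples, `theta` is any two-argument predicate, and concatenating the tuples stands in for tr • ts; the sample relations are made up for illustration.

```python
def nested_loop_join(r, s, theta):
    """Theta join of r (outer) and s (inner) by examining every tuple pair."""
    result = []
    for tr in r:                # outer relation
        for ts in s:            # inner relation, rescanned for every tr
            if theta(tr, ts):   # test the join condition
                result.append(tr + ts)  # tr . ts (tuple concatenation)
    return result


# Hypothetical sample data: (id, name) and (id, course_id).
student = [(1, "Abid"), (2, "Sajid")]
takes = [(1, "CS101"), (2, "MA201"), (1, "CS102")]

joined = nested_loop_join(student, takes, lambda tr, ts: tr[0] == ts[0])
for row in joined:
    print(row)
```

The inner loop runs once per outer tuple, which is where the nr * bs term in the worst-case block-transfer estimate comes from.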
37. 37
Nested-Loop Join
Cost Analysis (block transfers)
Case 2 (best case): the smaller relation
fits entirely in memory; use that as the
inner relation
br = number of blocks in r
bs = number of blocks in s
The estimated number of block transfers is
bs + br
Number of seeks = 2
[Figure: blocks B1 … Br of r and B1 … Bs of s; memory blocks M1 … Mn hold the entire inner relation]
38. 38
Nested-Loop Join
for each tuple tr in student do begin
  for each tuple ts in takes do begin
    test pair (tr, ts) to see if they satisfy the join
    condition
    if they do,
      add tr • ts to the result.
  end
end
Problem: The relation schemas are
student (id, name, cgpa, tot-cred, street, city, NID)
takes (id, course-id, semester, year)
a. Write the SQL and relational algebra to find id, name, course-id,
and semester.
b. Find the worst-case number of block transfers and
number of seeks.
Relation   No. of records   No. of blocks
student    20000            2000
takes      40000            400
Cost Analysis (block transfers)
Case 1 (worst case):
nr = number of tuples in r
br = number of blocks in r
bs = number of blocks in s
If there is enough memory only to hold one block
of each relation,
the estimated number of block transfers is
nr * bs + br
Number of seeks = nr + br
[Figure: blocks B1 … B2000 of student and B1 … B400 of takes; memory holds one block of each relation (M1, M2); iteration 1]
39. 39
Nested-Loop Join
Solution
nr = 20000 (tuples in student, the outer relation)
br = 2000
bs = 400
Memory holds only one block of each relation, so
the estimated number of block transfers is
nr * bs + br = 20000 * 400 + 2000 = 8,002,000
Number of seeks = nr + br = 20000 + 2000 = 22,000
40. 40
Nested-Loop Join
Problem:
Find the best-case number of block transfers
and number of seeks.
Relation   No. of records   No. of blocks
student    20000            2000
takes      40000            400
Solution
br = 2000
bs = 400
The smaller relation (takes, 400 blocks) fits entirely in memory
and is used as the inner relation, so
the estimated number of block transfers is
bs + br = 400 + 2000 = 2400
Number of seeks = 2
[Figure: blocks B1 … B2000 of student and B1 … B400 of takes; memory blocks M1 … Mn hold the entire inner relation]
Discussion: Write a discussion of nested-loop
join, considering its cost.
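As a starting point for the discussion, the worst-case and best-case estimates for nested-loop join can be computed directly. The function name and the `inner_fits_in_memory` flag are illustrative assumptions; the numbers are those of the student and takes relations above, with student as the outer relation.

```python
def nested_loop_join_cost(nr, br, bs, inner_fits_in_memory=False):
    """Return (block transfers, seeks) for nested-loop join of r and s.

    nr: tuples in the outer relation r
    br, bs: blocks in r and in the inner relation s
    """
    if inner_fits_in_memory:
        # Best case: both relations are read exactly once.
        return br + bs, 2
    # Worst case: the inner relation is rescanned for every outer tuple.
    return nr * bs + br, nr + br


worst = nested_loop_join_cost(nr=20000, br=2000, bs=400)
best = nested_loop_join_cost(nr=20000, br=2000, bs=400, inner_fits_in_memory=True)
print("worst case:", worst)
print("best case:", best)
```

The gap between the two cases (about 8 million block transfers versus 2400) shows how strongly the cost depends on whether the inner relation can stay in memory.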
41. 41
Block Nested-Loop Join
Variant of nested-loop join in which every block
of the inner relation is paired with every block of the
outer relation.
for each block Br of r do begin
  for each block Bs of s do begin
    for each tuple tr in Br do begin
      for each tuple ts in Bs do begin
        check if (tr, ts) satisfy the join
        condition
        if they do, add tr • ts to the result.
      end
    end
  end
end
[Figure: blocks B1 … B2000 of the outer relation and B1 … B400 of the inner relation; memory blocks M1, M2; iteration 1]
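A direct Python rendering of the block nested-loop pseudocode, assuming each relation is a list of blocks and each block a list of tuples. The function name and sample blocks are hypothetical.

```python
def block_nested_loop_join(r_blocks, s_blocks, theta):
    """Join r (outer) and s (inner), pairing blocks before pairing tuples."""
    result = []
    for block_r in r_blocks:          # each outer block is read once
        for block_s in s_blocks:      # inner relation scanned once per outer block
            for tr in block_r:
                for ts in block_s:
                    if theta(tr, ts):
                        result.append(tr + ts)
    return result


# Hypothetical sample data, at most two tuples per block.
student_blocks = [[(1, "Abid"), (2, "Sajid")], [(3, "Fatema")]]
takes_blocks = [[(1, "CS101"), (3, "MA201")], [(1, "CS102")]]

joined = block_nested_loop_join(student_blocks, takes_blocks,
                                lambda tr, ts: tr[0] == ts[0])
```

Compared with the tuple-at-a-time version, the inner relation is scanned once per outer block instead of once per outer tuple, while the result is the same set of tuples.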
45. 45
Block Nested-Loop Join
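Cost Analysis (block transfers)
Case 1 (worst case): memory holds only one block of each relation.
The estimated number of block transfers is
br * bs + br
Number of seeks = 2 * br
(each scan of the inner relation needs one seek, and there are br scans, plus br seeks to read the blocks of the outer relation)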
46. 46
Block Nested-Loop Join
Cost Analysis (block transfers)
Case 2 (best case): the smaller
relation fits entirely in memory; use
that as the inner relation.
The estimated number of block transfers is
bs + br
Number of seeks = 2
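For comparison with the best case, the worst-case block-transfer estimate br * bs + br can be evaluated for both choices of outer relation. This sketch covers block transfers only; the function name is an assumption and the block counts are those of student (2000) and takes (400).

```python
def block_nested_loop_transfers(b_outer, b_inner):
    """Worst-case block transfers: b_outer * b_inner + b_outer."""
    return b_outer * b_inner + b_outer


student_as_outer = block_nested_loop_transfers(b_outer=2000, b_inner=400)
takes_as_outer = block_nested_loop_transfers(b_outer=400, b_inner=2000)
best_case = 2000 + 400  # smaller relation held entirely in memory

print("student as outer:", student_as_outer)
print("takes as outer:  ", takes_as_outer)
print("best case:       ", best_case)
```

When neither relation fits in memory, using the smaller relation as the outer one gives the lower estimate, since the extra term added to br * bs is the outer relation's block count.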
47. 47
Evaluation of Expressions
▪ So far, we have seen algorithms for individual
operations
▪ Alternatives for evaluating an entire expression
tree:
• Materialization: generate the results of an
expression whose inputs are relations or
are already computed, and materialize (store) them
on disk. Repeat.
• Pipelining: pass on tuples to parent
operations even as an operation is being
executed
▪ We study the above alternatives in more detail
∏name(σbuilding = ‘Watson’(department) ⨝ instructor)
48. 48
Materialization
Materialized evaluation: evaluate
one operation at a time, starting at
the lowest-level. Use intermediate
results materialized into temporary
relations to evaluate next-level
operations.
The department and instructor
schemas are given below:
instructor (id, name, dept-name,
salary)
department (dept-name, building,
budget)
i. Write a relational algebra expression to find the
names of all instructors of all
departments in the ‘Watson’ building.
ii. Show the query expression tree
(operation tree) for the query.
Relational algebra:
∏name(σbuilding = ‘Watson’(department) ⨝ instructor)
[Figure: query expression tree]
49. 49
Materialization
Problem 12: The student and takes
schemas are given below:
student (id, name, CGPA, year-admit)
Takes (id, course-id, semester, year)
i. Write relational algebra to find the
names and CGPA of all students
of spring 2019.
ii. Show the query expression tree
(operation tree) for the query.
50. 50
Materialization (Cont.)
▪ Materialized evaluation is
always applicable
▪ Cost of writing results to disk
and reading them back can be
quite high
▪ Our cost formulas for
operations ignore the cost of
writing results to disk, so
▪ Overall cost = Sum of costs of
individual operations + cost of
writing intermediate results to
disk
[Figure: expression tree with a write cost at each intermediate result]
51. 51
Materialization (Cont.)
Problem 13: The cost of σbuilding =
‘Watson’(department) is 1 seek and 20
block transfers.
The cost of (σbuilding = ‘Watson’
(department)) ⨝ instructor is 10
seeks and 100 block transfers.
The cost of ∏name(…….) is 1 seek and
50 block transfers.
The write cost for σbuilding =
‘Watson’(department) is 1 seek and 10
block transfers.
The write cost for ⨝ is 1 seek and 50
block transfers.
Find the overall cost of the query.
52. 52
Materialization (Cont.)
Overall cost = Sum of costs of individual
operations + cost of writing intermediate
results to disk
= ((1+10+1) seeks + (20+100+50) block transfers) +
((1+1) seeks + (10+50) block transfers)
= (12 seeks + 170 block transfers) + (2 seeks + 60 block transfers)
= 14 seeks + 230 block transfers
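The same bookkeeping can be done with a small helper, shown here as a sketch. Costs are given as (seeks, block transfers) pairs taken from Problem 13; the function name is an assumption.

```python
def materialized_cost(operation_costs, write_costs):
    """Sum operation costs and intermediate write costs.

    Each cost is a (seeks, block_transfers) pair.
    """
    all_costs = list(operation_costs) + list(write_costs)
    seeks = sum(s for s, _ in all_costs)
    transfers = sum(b for _, b in all_costs)
    return seeks, transfers


# Problem 13: selection, join, projection, then the two intermediate writes.
operation_costs = [(1, 20), (10, 100), (1, 50)]
write_costs = [(1, 10), (1, 50)]

total_seeks, total_transfers = materialized_cost(operation_costs, write_costs)
print(total_seeks, "seeks +", total_transfers, "block transfers")
```

Keeping the write costs in a separate list makes it easy to see what pipelining would save: dropping `write_costs` leaves only the cost of the operations themselves.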
53. 53
Materialization (Cont.)
▪ Double buffering: use two output buffers
for each operation; when one is full, write it
to disk while the other is getting filled
• Allows overlap of disk writes with
computation and reduces execution
time
[Figure: double buffering, with Buffer 1 and Buffer 2 written alternately to the DB]
55. 55
Pipelining
▪ Pipelined evaluation: evaluate
several operations
simultaneously, passing the
results of one operation on to the
next.
▪ E.g., in the previous expression tree,
don’t store the result of
σbuilding = "Watson"(department);
instead, pass tuples directly to the
join. Similarly, don’t store the result
of the join; pass tuples directly to the
projection.
▪ Much cheaper than
materialization: no need to store a
temporary relation to disk.
Discussion: Explain how the query
∏name(σbuilding = ‘Watson’(department) ⨝ instructor)
is executed using pipelining, and
compare the cost with materialization.
[Figure: expression tree with both write costs crossed out]
56. 56
Pipelining
▪ Pipelining may not always be
possible, e.g., for sort and hash-join.
▪ For pipelining to be effective,
use evaluation algorithms
that generate output tuples
even as tuples are received
for inputs to the operation.
▪ Pipelines can be executed in two
ways: demand driven and
producer driven
Discussion: Explain why pipelining may
not be possible in the case of
sorting.
To what extent is it possible?
57. 57
Example: External Sorting Using Sort-Merge
[Figure: external sort-merge example (initial relation, create runs, merge pass–1, merge pass–2, sorted output)]
58. 58
Pipelining (Cont.)
▪ In demand driven or lazy
evaluation
• system repeatedly requests
next tuple from top level
operation
• Each operation requests
next tuple from children
operations as required, in
order to output its next tuple
• In between calls, operation
has to maintain “state” so it
knows what to return next
• Alternative name: pull model
of pipelining
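Python generators give a compact way to sketch the pull model: each operator requests the next tuple from its child only when its own consumer asks for one, and the generator's suspended frame plays the role of the "state" kept between calls. The operator names and the sample rows below are assumptions made for illustration.

```python
def scan(relation):
    """Leaf operator: yields stored rows one at a time."""
    for row in relation:
        yield row


def select(child, predicate):
    """Pulls rows from its child and passes on those that qualify."""
    for row in child:
        if predicate(row):
            yield row


def join(outer, inner_rows, attr):
    """Nested-loop join; the inner relation is a re-scannable list."""
    for o in outer:
        for i in inner_rows:
            if o[attr] == i[attr]:
                yield {**o, **i}


def project(child, attrs):
    """Keeps only the requested attributes of each row."""
    for row in child:
        yield tuple(row[a] for a in attrs)


# Made-up sample data for the department/instructor query.
department = [{"dept_name": "Physics", "building": "Watson"},
              {"dept_name": "Music", "building": "Packard"}]
instructor = [{"name": "Einstein", "dept_name": "Physics"},
              {"name": "Mozart", "dept_name": "Music"},
              {"name": "Gold", "dept_name": "Physics"}]

plan = project(
    join(select(scan(department), lambda d: d["building"] == "Watson"),
         instructor, "dept_name"),
    ["name"])

names = list(plan)  # the top-level consumer repeatedly requests the next tuple
```

Nothing is computed when `plan` is built; work happens only as tuples are requested, and no intermediate relation is stored.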
59. 59
Pipelining (Cont.)
▪ In producer-driven or eager
pipelining
• Operators produce tuples eagerly
and pass them up to their
parents
• A buffer is maintained between
operators; the child puts tuples in the
buffer, and the parent removes tuples
from the buffer
• If the buffer is full, the child waits till
there is space in the buffer, and
then generates more tuples
• The system schedules operations
that have space in their output buffer
and can process more input
tuples
▪ Alternative name: push model of
pipelining
Discussion: Compare the pull and push
models of pipelining.
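For contrast with the pull model, here is a minimal sketch of producer-driven pipelining using threads and bounded queues from the Python standard library. The operator functions, the `DONE` sentinel, and the buffer size are assumptions; a real system would use its own scheduler instead of one thread per operator.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of a tuple stream


def producer(rows, out_q):
    """Child operator: eagerly pushes tuples into its output buffer."""
    for row in rows:
        out_q.put(row)  # blocks while the buffer is full
    out_q.put(DONE)


def select_op(in_q, out_q, predicate):
    """Parent operator: removes tuples from one buffer, pushes to the next."""
    while True:
        row = in_q.get()
        if row is DONE:
            out_q.put(DONE)
            return
        if predicate(row):
            out_q.put(row)


def run_pipeline(rows, predicate, buffer_size=2):
    q1 = queue.Queue(maxsize=buffer_size)  # buffer between scan and select
    q2 = queue.Queue(maxsize=buffer_size)  # buffer between select and consumer
    threads = [threading.Thread(target=producer, args=(rows, q1)),
               threading.Thread(target=select_op, args=(q1, q2, predicate))]
    for t in threads:
        t.start()
    results = []
    while True:
        row = q2.get()
        if row is DONE:
            break
        results.append(row)
    for t in threads:
        t.join()
    return results


even_rows = run_pipeline(list(range(10)), lambda x: x % 2 == 0)
```

The bounded queue gives the waiting behavior described above: `put` blocks when the buffer is full, so a fast child cannot run arbitrarily far ahead of its parent.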