Analysis of Algorithm
Disjoint Set Representation
Andres Mendez-Vazquez
November 8, 2015
1 / 114
Outline
1 Disjoint Set Representation
Definition of the Problem
Operations
2 Union-Find Problem
The Main Problem
Applications
3 Implementations
First Attempt: Circular List
Operations and Cost
Still we have a Problem
Weighted-Union Heuristic
Operations
Still a Problem
Heuristic Union by Rank
4 Balanced Union
Path compression
Time Complexity
Ackermann’s Function
Bounds
The Rank Observation
Proof of Complexity
Theorem for Union by Rank and Path Compression
2 / 114
Disjoint Set Representation
Problem
1 Items are drawn from the finite universe U = {1, 2, ..., n} for some fixed n.
2 We want to maintain a partition of U as a collection of disjoint sets.
3 In addition, we want to uniquely name each set by one of its items, called its representative item.
These disjoint sets are maintained under the following operations:
1 MakeSet(x)
2 Union(A,B)
3 Find(x)
4 / 114
Operations
MakeSet(x)
Given x ∈ U currently not belonging to any set in the collection, create a new singleton set {x} and name it x.
This is usually done at the start, once per item, to create the initial trivial partition.
Union(A,B)
It changes the current partition by replacing its sets A and B with A ∪ B. The new set is named either A or B.
The operation may choose either one of the two representatives as the new representative.
Find(x)
It returns the name of the set that currently contains item x.
6 / 114
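As a quick illustration of these three contracts, here is a minimal Python sketch. It is not the list or forest structure developed in these slides, just a naive dict-of-sets version (the class name and the linear-scan Find are ours, for illustration only):

```python
# Naive sketch of the disjoint-set interface: a dict mapping each set's
# name (its representative) to the set of its members.
class DisjointSets:
    def __init__(self):
        self.members = {}   # representative -> set of items

    def make_set(self, x):
        # Create the singleton {x}, named x.
        self.members[x] = {x}

    def find(self, x):
        # Return the name of the set currently containing x (linear scan).
        for name, items in self.members.items():
            if x in items:
                return name

    def union(self, a, b):
        # Replace the sets named a and b with their union; keep the name a.
        self.members[a] |= self.members.pop(b)

ds = DisjointSets()
for x in range(1, 10):
    ds.make_set(x)
ds.union(1, 2)
print(ds.find(2))   # 1: item 2 now lives in the set named 1
```

The efficient implementations below replace the O(n) Find and O(|B|) Union of this sketch with cheaper operations.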
Example
for x = 1 to 9 do MakeSet(x)
[Figure: the nine singleton sets {1}, {2}, ..., {9}]
Then, you do a Union(1, 2)
[Figure: the sets {1, 2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}]
Now, Union(3, 4); Union(5, 8); Union(6, 9)
[Figure: the sets {1, 2}, {3, 4}, {5, 8}, {6, 9}, {7}]
7 / 114
Example
Now, Union(1, 5); Union(7, 4)
[Figure: the sets {1, 2, 5, 8}, {3, 4, 7}, {6, 9}]
Then, if we do the following operations
Find(1) returns 5
Find(9) returns 9
Finally, Union(5, 9)
[Figure: the sets {1, 2, 5, 6, 8, 9}, {3, 4, 7}]
Then Find(9) returns 5
8 / 114
Union-Find Problem
Problem
Let S be a sequence of m = |S| MakeSet, Union and Find operations (intermixed in arbitrary order):
n of which are MakeSet.
At most n − 1 are Union.
The rest are Finds.
Cost(S) = total computational time to execute sequence S.
Goal: Find an implementation that, for every m and n, minimizes the amortized cost per operation:
Cost(S) / |S|   (1)
for any arbitrary sequence S.
10 / 114
Applications
Examples
1 Maintaining partitions and equivalence classes.
2 Graph connectivity under edge insertion.
3 Minimum spanning trees (e.g. Kruskal’s algorithm).
4 Random maze construction.
12 / 114
Circular lists
We use the following structures
Data structure: Two arrays Set[1..n] and next[1..n].
Set[x] returns the name of the set that contains item x.
A is a set if and only if Set[A] = A.
next[x] returns the next item on the list of the set that contains item x.
14 / 114
Circular lists
Example: n = 16.
Partition: {{1, 2, 8, 9}, {4, 3, 10, 13, 14, 15, 16}, {7, 6, 5, 11, 12}}
x:     1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
Set:   1  1  4  4  7  7  7  1  1  4  7  7  4  4  4  4
next:  2  8 10  3 12  5  6  9  1 13  7 11 14 15 16  4
Set Position 1
[Figure: circular list of set 1: 1 → 2 → 8 → 9 → 1]
15 / 114
Circular lists
Set Position 7
[Figure: circular list of set 7: 7 → 6 → 5 → 12 → 11 → 7]
Set Position 4
[Figure: circular list of set 4: 4 → 3 → 10 → 13 → 14 → 15 → 16 → 4]
16 / 114
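The two arrays of this example can be encoded directly, and each set recovered by walking its circular list until the walk returns to the starting item. A small Python sketch (index 0 is unused so items run 1..16; the helper name is ours):

```python
# The slide's example: Set[x] names x's set, next_[x] is x's successor
# on the circular list of its set.
Set   = [0, 1, 1, 4, 4, 7, 7, 7, 1, 1, 4, 7, 7, 4, 4, 4, 4]
next_ = [0, 2, 8, 10, 3, 12, 5, 6, 9, 1, 13, 7, 11, 14, 15, 16, 4]

def members(a):
    # Walk the circular list starting at set name a until we return to a.
    out, x = [a], next_[a]
    while x != a:
        out.append(x)
        x = next_[x]
    return out

print(members(1))   # [1, 2, 8, 9]
print(members(7))   # [7, 6, 5, 12, 11]
```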
Operations and Cost
MakeSet(x)
    Set[x] = x
    next[x] = x
Complexity
O(1) Time
Find(x)
    return Set[x]
Complexity
O(1) Time
17 / 114
Operations and Cost
For the union
We are assuming Set[A] = A ≠ B = Set[B]
Union1(A, B)
    Set[B] = A
    x = next[B]
    while (x ≠ B)
        Set[x] = A      /* Rename set B to A */
        x = next[x]
    x = next[B]         /* Splice lists A and B */
    next[B] = next[A]
    next[A] = x
18 / 114
Operations and Cost
Thus, we have in the Splice part
[Figure: the circular lists of A and B spliced together through x]
19 / 114
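The pseudocode above runs as written. A runnable Python sketch of the same structure (global arrays for brevity; index 0 unused): MakeSet and Find are implicit in the array initialization and in reading Set[x], the while loop does the renaming, and the last three lines do the O(1) splice.

```python
# Circular-list representation: Set[x] names x's set, nxt[x] is x's
# successor on its set's circular list.
n = 9
Set = list(range(n + 1))   # Set[x] = x: every item names its own set
nxt = list(range(n + 1))   # nxt[x] = x: every circular list is a singleton

def union1(a, b):
    # Merge set b into set a, assuming Set[a] == a != b == Set[b].
    Set[b] = a
    x = nxt[b]
    while x != b:          # rename every other member of b
        Set[x] = a
        x = nxt[x]
    x = nxt[b]             # splice list b into list a in O(1)
    nxt[b] = nxt[a]
    nxt[a] = x

union1(1, 2)
union1(1, 3)
print(sorted(x for x in range(1, n + 1) if Set[x] == 1))   # [1, 2, 3]
```

Only the renaming loop is expensive; the splice itself touches two pointers, which is why the cost of Union1 is O(|B|).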
We have a Problem
Complexity
O(|B|) Time
Not only that, if we have the following sequence of operations
    for x = 1 to n
        MakeSet(x)
    for x = 1 to n − 1
        Union1(x + 1, x)
20 / 114
Thus
Thus, we have the following number of aggregated steps:
n + Σ_{i=1}^{n−1} i = n + n(n − 1)/2 = n + (n² − n)/2 = n²/2 + n/2 = Θ(n²)
21 / 114
Aggregate Time
Thus, the aggregate time is as follows
Aggregate Time = Θ(n²)
Therefore
Amortized Time per operation = Θ(n)
22 / 114
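The Θ(n²) aggregate can be checked by counting: in the bad sequence, Union1(x + 1, x) renames every member of the current set x, which has grown to x items by that point. A small Python count (the function name is ours; it simulates only the sizes, not the lists):

```python
# Count the pointer renames done by: MakeSet(1..n); Union1(x+1, x) for x < n.
# The x-th union renames x items, so the total is n(n-1)/2, i.e. Theta(n^2).
def renames_for_bad_sequence(n):
    size = {x: 1 for x in range(1, n + 1)}   # size of each current set
    total = 0
    for x in range(1, n):
        total += size[x]                     # Union1(x+1, x) renames |set x| items
        size[x + 1] += size.pop(x)           # set x is absorbed into set x+1
    return total

print(renames_for_bad_sequence(100))   # 4950 == 100*99/2
```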
This is not exactly good
Thus, we need to have something better
We will now try the Weighted-Union Heuristic!!!
23 / 114
Implementation 2: Weighted-Union Heuristic Lists
We extend the previous data structure
Data structure: Three arrays Set[1..n], next[1..n], size[1..n].
size[A] returns the number of items in set A if A == Set[A] (otherwise, we do not care).
25 / 114
Operations
MakeSet(x)
    Set[x] = x
    next[x] = x
    size[x] = 1
Complexity
O(1) time
Find(x)
    return Set[x]
Complexity
O(1) time
26 / 114
Operations
Union2(A, B)
    if size[Set[A]] > size[Set[B]]
        size[Set[A]] = size[Set[A]] + size[Set[B]]
        Union1(A, B)
    else
        size[Set[B]] = size[Set[A]] + size[Set[B]]
        Union1(B, A)
Note: Weight-Balanced Union: merge the smaller set into the larger set.
Complexity
O(min {|A|, |B|}) time.
27 / 114
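A runnable sketch of the heuristic on top of the circular lists (for simplicity, our union2 takes the two set names directly rather than looking them up through Set[·] as the slide does):

```python
# Weighted-union heuristic: size[] tracks set sizes at the roots, and the
# smaller set is always the one renamed, so a union costs O(min(|A|, |B|)).
n = 16
Set = list(range(n + 1))
nxt = list(range(n + 1))
size = [1] * (n + 1)

def union1(a, b):
    # Circular-list union from before: rename b's members, then splice.
    Set[b] = a
    x = nxt[b]
    while x != b:
        Set[x] = a
        x = nxt[x]
    x = nxt[b]
    nxt[b] = nxt[a]
    nxt[a] = x

def union2(a, b):
    # a, b are set names; merge the smaller set into the larger one.
    if size[a] > size[b]:
        size[a] += size[b]
        union1(a, b)
    else:
        size[b] += size[a]
        union1(b, a)

union2(1, 2)   # {1, 2}, named 2 (tie goes to the second argument)
union2(3, 4)   # {3, 4}, named 4
union2(2, 4)   # sizes tie again, so the result is named 4
print(Set[1], size[4])   # 4 4
```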
What about the operations eliciting the worst behavior?
Remember
    for x = 1 to n
        MakeSet(x)
    for x = 1 to n − 1
        Union2(x + 1, x)
We have then
n + Σ_{i=1}^{n−1} 1 = n + n − 1 = 2n − 1 = Θ(n)
IMPORTANT: This is not the worst sequence!!!
28 / 114
For this, notice the following worst sequence
Worst Sequence S
MakeSet(x), for x = 1, ..., n. Then do n − 1 Unions in round-robin manner.
Within each round, the sets have roughly equal size:
Starting round: each set has size 1.
Next round: each set has size 2.
Next: ... size 4.
...
We claim the following
Aggregate time = Θ(n log n)
Amortized time per operation = Θ(log n)
29 / 114
For this, notice the following worst sequence
Example n = 16
Round 0: {1} {2} {3} {4} {5} {6} {7} {8} {9} {10} {11} {12} {13} {14} {15} {16}
Round 1: {1, 2} {3, 4} {5, 6} {7, 8} {9, 10} {11, 12} {13, 14} {15, 16}
Round 2: {1, 2, 3, 4} {5, 6, 7, 8} {9, 10, 11, 12} {13, 14, 15, 16}
Round 3: {1, 2, 3, 4, 5, 6, 7, 8} {9, 10, 11, 12, 13, 14, 15, 16}
Round 4: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}
30 / 114
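The Θ(n log n) claim for this round-robin sequence can be checked by counting moved items: each round doubles the set sizes and moves n/2 items in total, and there are log₂ n rounds. A Python count (the function name is ours; it simulates set sizes only, for n a power of two):

```python
# Count the items moved by the weighted union over the round-robin
# worst case: log2(n) rounds, each moving n/2 items, so (n/2)*log2(n)
# moves in total -- Theta(n log n) aggregate work for n - 1 unions.
def round_robin_moves(n):
    sets = [{x} for x in range(1, n + 1)]
    moves = 0
    while len(sets) > 1:
        merged = []
        for a, b in zip(sets[::2], sets[1::2]):
            moves += min(len(a), len(b))   # cost of one weighted union
            merged.append(a | b)
        sets = merged
    return moves

print(round_robin_moves(16))   # 32 == (16 / 2) * log2(16)
```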
Now
Given the previous worst case
What is the complexity of this implementation?
31 / 114
Now, the Amortized Costs of this implementation
Claim 1: Amortized time per operation is O(log n)
For this, we have the following theorem!!!
Theorem 1
Using the linked-list representation of disjoint sets and the Weighted-Union heuristic, a sequence of m MakeSet, Union, and FindSet operations, n of which are MakeSet operations, takes O(m + n log n) time.
32 / 114
Proof
Because each Union operation unites two disjoint sets
We perform at most n − 1 Union operations over all.
We now bound the total time taken by these Union operations
We start by determining, for each object, an upper bound on the number of times the object’s pointer back to its set object is updated.
Consider a particular object x.
We know that each time x’s pointer was updated, x must have started in the smaller set.
The first time x’s pointer was updated, therefore, the resulting set must have had at least 2 members.
Similarly, the next time x’s pointer was updated, the resulting set must have had at least 4 members.
33 / 114
Proof
Continuing on
We observe that, for any k ≤ n, after x’s pointer has been updated ⌈log k⌉ times, the resulting set must have at least k members.
Thus
Since the largest set has at most n members, each object’s pointer is updated at most ⌈log n⌉ times over all the Union operations.
Then
The total time spent updating object pointers over all Union operations is O(n log n).
34 / 114
Proof
We must also account for updating the tail pointers and the list lengths
It takes only O(1) time per Union operation.
Therefore
The total time spent in all Union operations is thus O(n log n).
The time for the entire sequence of m operations follows easily
Each MakeSet and FindSet operation takes O(1) time, and there are O(m) of them.
35 / 114
Proof
Therefore
The total time for the entire sequence is thus O(m + n log n).
36 / 114
Amortized Cost: Aggregate Analysis
Aggregate cost O(m + n log n). Amortized cost per operation O(log n):
O(m + n log n) / m = O(1 + log n) = O(log n)   (2)
37 / 114
There are other ways of analyzing the amortized cost
It is possible to use
1 Accounting Method.
2 Potential Method.
38 / 114
Amortized Costs: Accounting Method
Accounting method
MakeSet(x): Charge (1 + log n): 1 to do the operation, log n stored as credit with item x.
Find(x): Charge 1, and use it to do the operation.
Union(A, B): Charge 0 and use 1 stored credit from each item in the smaller set to move it.
39 / 114
Amortized Costs: Accounting Method
Credit invariant
Total stored credit is Σ_S |S| log(n/|S|), where the summation is taken over the collection S of all disjoint sets of the current partition.
40 / 114
Amortized Costs: Potential Method
Potential function method
Exercise:
Define a regular potential function and use it to do the amortized analysis.
Can you make the Union amortized cost O(log n), and the MakeSet and Find costs O(1)?
41 / 114
Improving over the heuristic using union by rank
Union by Rank
Instead of using the number of nodes in each tree to make a decision, we
maintain a rank, an upper bound on the height of the tree.
We have the following data structure to support this:
We maintain a parent array p[1..n].
A is a set if and only if A = p[A] (a tree root).
x ∈ A if and only if x is in the tree rooted at A.
43 / 114
[Figure: an example forest of up-trees over the items 1..20, where each tree root names its set]
43 / 114
Forest of Up-Trees: Operations without union by rank or
weight
MakeSet(x)
1 p[x] = x
Complexity
O (1) time
Union(A, B)
1 p[B] = A
Note: We are assuming that p[A] == A, p[B] == B, and A ≠ B. This is the
reason we need a Find operation!
44 / 114
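As a minimal sketch of these three operations, here is the naive up-tree forest in Python (the names `parent`, `make_set`, `find`, and `union` are ours, not the slides'):

```python
# Naive forest of up-trees: no union by rank/weight, no path compression.
parent = {}

def make_set(x):
    """MakeSet(x): x becomes its own parent, i.e., a one-node tree."""
    parent[x] = x

def find(x):
    """Find(x): follow parent pointers up to the root (the set's name)."""
    while x != parent[x]:
        x = parent[x]
    return x

def union(a, b):
    """Union(A, B): assumes a and b are distinct roots; hang b's tree under a."""
    parent[b] = a
```

Note that `union` blindly trusts its arguments are roots; that is exactly why a `find` operation is needed before each union.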
Example
Remember: we perform the unions without any precaution against the
worst case.
[Figure: tree B linked directly under root A]
45 / 114
Forest of Up-Trees: Operations without union by rank or
weight
Find(x)
1 if x == p[x]
2 return x
3 return Find(p[x])
Example
46 / 114
Forest of Up-Trees: Operations without union by rank or
weight
Still, we can exhibit a horrible case
Sequence of operations
1 for x = 1 to n
2 MakeSet(x)
3 for x = 1 to n − 1
4 Union(x + 1, x)
5 for x = 1 to n − 1
6 Find(1)
47 / 114
Forest of Up-Trees: Operations without union by rank or
weight
We finish with this data structure
[Figure: a chain 1 → 2 → · · · → n − 1 → n, with n at the root]
Thus the last part of the sequence gives us a total time of
Aggregate Time Θ(n²)
Amortized Analysis per operation Θ (n)
48 / 114
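The degenerate chain above is easy to reproduce and measure; a self-contained sketch (the function name and the array encoding are ours) that builds the chain with the naive `p[x] = x + 1` unions and counts the hops Find(1) must take:

```python
# Build the worst case for the naive forest: a chain of n nodes.
def worst_case_chain_depth(n):
    parent = list(range(n + 1))      # parent[x] = x  (MakeSet for 1..n)
    for x in range(1, n):            # Union(x + 1, x): p[x] = x + 1
        parent[x] = x + 1
    # Find(1) now walks the whole chain; count the parent hops.
    steps, x = 0, 1
    while parent[x] != x:
        x = parent[x]
        steps += 1
    return steps
```

Each Find(1) costs n − 1 hops, so the n − 1 repeated finds alone cost Θ(n²) in aggregate.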
Self-Adjusting forest of Up-Trees
How do we avoid this problem?
Use the following heuristics together!!!
1 Balanced Union.
By tree weight (i.e., size)
By tree rank (i.e., height)
2 Find with path compression
Observations
Each single improvement (1 or 2) by itself will result in logarithmic
amortized cost per operation.
The two improvements combined will result in amortized cost per
operation approaching very close to O(1).
49 / 114
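Combining the two heuristics gives the standard disjoint-set forest; a compact Python sketch (class and method names are ours) with balanced union by rank and find with path compression:

```python
# Self-adjusting forest of up-trees: union by rank + path compression.
class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))   # MakeSet for items 0..n-1
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            # Path compression: point x directly at its root.
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.rank[ra] > self.rank[rb]:    # balanced union by rank
            self.parent[rb] = ra
        else:
            self.parent[ra] = rb
            if self.rank[ra] == self.rank[rb]:
                self.rank[rb] += 1           # equal heights: rb grows by one
```

With both heuristics, any sequence of m operations on n items runs in O(m α(m, n)) time, which is effectively linear in practice.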
Balanced Union by Size
Using size for Balanced Union
We can use the size of each set to obtain what we want
50 / 114
We have then
MakeSet(x)
1 p[x] = x
2 size[x] = 1
Note: Complexity O (1) time
Union(A, B)
Input: assume that p[A] == A, p[B] == B, and A ≠ B
1 if size[A] > size[B]
2 size[A] = size[A] + size[B]
3 p[B] = A
4 else
5 size[B] = size[A] + size[B]
6 p[A] = B
Note: Complexity O (1) time
51 / 114
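The pseudocode above translates directly to Python; a self-contained sketch of balanced union by size (dictionary names `p` and `size` follow the slides, function names are ours):

```python
# Balanced union by size: the smaller tree always hangs under the larger.
p, size = {}, {}

def make_set(x):
    p[x] = x
    size[x] = 1

def union(a, b):
    """Assumes a and b are distinct roots."""
    if size[a] > size[b]:
        size[a] += size[b]
        p[b] = a          # b's smaller tree hangs under a
    else:
        size[b] += size[a]
        p[a] = b          # a's tree hangs under b (ties go to b)
```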
Example
Now, we use the size for the union
[Figure: since size[A] > size[B], tree B is linked under root A]
52 / 114
Nevertheless
Union by size can make the analysis too complex
People would rather use the rank
Rank
It is defined as an upper bound on the height of the tree
Because
The use of the rank simplifies the amortized analysis for the data structure!!!
53 / 114
Thus, we use the balanced union by rank
MakeSet(x)
1 p[x] = x
2 rank[x] = 0
Note: Complexity O (1) time
Union(A, B)
Input: assume that p[A] == A, p[B] == B, and A ≠ B
1 if rank[A] > rank[B]
2 p[B] = A
3 else
4 p[A] = B
5 if rank[A] == rank[B]
6 rank[B] = rank[B] + 1
Note: Complexity O (1) time
54 / 114
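The same pseudocode, sketched in Python (arrays `p` and `rank` as in the slides, function names ours); note that the rank only grows when two trees of equal rank are linked:

```python
# Balanced union by rank, following the slide's pseudocode.
p, rank = {}, {}

def make_set(x):
    p[x] = x
    rank[x] = 0

def union(a, b):
    """Assumes a and b are distinct roots."""
    if rank[a] > rank[b]:
        p[b] = a              # b hangs under a; no rank changes
    else:
        p[a] = b              # a hangs under b
        if rank[a] == rank[b]:
            rank[b] += 1      # heights were equal, so b's tree grew by one
```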
Example
Now
We use the rank for the union
Case I
The rank of A is larger than the rank of B
[Figure: since rank[A] > rank[B], tree B is linked under root A]
55 / 114
Example
Case II
The rank of B is larger than the rank of A
[Figure: since rank[B] > rank[A], tree A is linked under root B]
56 / 114
Here is the new heuristic to improve overall performance:
Path Compression
Find(x)
1 if x ≠ p[x]
2 p[x] = Find(p[x])
3 return p[x]
Complexity
O (depth (x)) time
58 / 114
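The recursive Find above compresses the path as the recursion unwinds. The same effect can be obtained iteratively with a common two-pass variant (a sketch; the function name and array encoding are ours):

```python
def find(p, x):
    """Two-pass iterative find with path compression on parent map p."""
    root = x
    while p[root] != root:        # pass 1: locate the root
        root = p[root]
    while p[x] != root:           # pass 2: point every path node at the root
        p[x], x = root, p[x]
    return root
```

Both versions leave every node on the find path as a direct child of the root, so later finds on those nodes take O(1) hops.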
Example
We have the following structure, on which we run the recursive Find(p [x]):
[Figures: the tree before the find, and the recursive calls walking up to the root]
59 / 114
Path compression
Find(x) must traverse the path from x up to its root.
It might as well create shortcuts along the way to improve the efficiency
of future operations.
Find(2)
[Figure: the tree before and after Find(2); every node on the path from 2
to the root becomes a direct child of the root]
62 / 114
Time complexity
Tight upper bound on time complexity
An amortized time of O(mα(m, n)) for m operations.
Where α(m, n) is the inverse of Ackermann’s function (almost a
constant).
This bound, for a slightly different definition of α than the one given
here, is shown in Cormen’s book.
64 / 114
Ackermann’s Function
Definition
A(1, j) = 2^j where j ≥ 1
A(i, 1) = A(i − 1, 2) where i ≥ 2
A(i, j) = A(i − 1, A(i, j − 1)) where i, j ≥ 2
Note:
This is one of several inequivalent but similar definitions
of Ackermann’s function found in the literature.
Cormen’s book authors give a different definition,
although they never really call theirs Ackermann’s
function.
Property
Ackermann’s function grows very fast, thus its inverse grows very slowly.
65 / 114
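The definition above can be evaluated directly for tiny arguments, which makes the explosive growth concrete (a sketch; the function name is ours, and anything beyond these inputs overflows any realistic budget):

```python
# The slide's variant of Ackermann's function; usable for tiny inputs only.
from functools import lru_cache

@lru_cache(maxsize=None)
def A(i, j):
    if i == 1:
        return 2 ** j                # A(1, j) = 2^j
    if j == 1:
        return A(i - 1, 2)           # A(i, 1) = A(i - 1, 2)
    return A(i - 1, A(i, j - 1))     # A(i, j) = A(i - 1, A(i, j - 1))
```

Already A(2, j) is a tower of j + 1 stacked 2s: A(2, 1) = 4, A(2, 2) = 16, A(2, 3) = 65536, and A(2, 4) = 2^65536.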
Ackermann’s Function
Example A(3, 4)
A(3, 4) = 2^2^···^2, a tower of 2s whose height is itself given by a tower
of 2s, bottoming out at a tower of 16 2s.
Notation: a tower 2^2^···^2 annotated with 10 means ten stacked 2s:
2^2^2^2^2^2^2^2^2^2
66 / 114
Inverse of Ackermann’s function
Definition
α(m, n) = min { i ≥ 1 | A (i, ⌊m/n⌋) > log n } (3)
Note: This is not a true mathematical inverse.
Intuition: Grows about as slowly as Ackermann’s function grows fast.
How slowly?
Let ⌊m/n⌋ = k; then m ≥ n → k ≥ 1
67 / 114
Thus
First
We can show that A(i, k) ≥ A(i, 1) for all i ≥ 1.
This is left to you...
For Example
Consider i = 4; then A(i, k) ≥ A(4, 1), itself an enormous tower of 2s,
far larger than 10^80.
Finally
If log n < 10^80, i.e., if n < 2^(10^80), then α(m, n) ≤ 4
68 / 114
Instead of Using the Ackermann Inverse
We define the following function
log* n = min { i ≥ 0 | log^(i) n ≤ 1 } (4)
The superscript (i) means log applied i times: log log · · · log n
Then
We will establish O (m log* n) as an upper bound.
69 / 114
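The iterated logarithm in (4) is tiny for any conceivable input; a direct Python sketch (the function name is ours; we take log to be base 2, matching the deck's conventions):

```python
import math

def log_star(n):
    """log* n = min { i >= 0 | log applied i times to n is <= 1 }."""
    i = 0
    while n > 1:
        n = math.log2(n)   # apply one more base-2 logarithm
        i += 1
    return i
```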
In particular
Something Notable
In particular, for an exponential tower of 2s of height k + 1 (a base 2
with k further 2s stacked above it), log* equals k + 1.
For Example
log* 2^65536 = log* 2^(2^(2^(2^2))) = 5 (5)
Therefore
We have that log* n ≤ 5 for all practical purposes.
70 / 114
The Rank Observation
Something Notable
Once a node becomes a child of another node, its rank never changes
under any subsequent operation.
71 / 114
For Example
The number to the right of each node is its rank
MakeSet(1),MakeSet(2),MakeSet(3), ..., MakeSet(10)
1/0 2/0 3/0 4/0 5/0
6/0 7/0 8/0 9/0 10/0
72 / 114
Example
Now, we do
Union(6, 1),Union(7, 2), ..., Union(10, 5)
1/1 2/1 3/1 4/1 5/1
6/0 7/0 8/0 9/0 10/0
73 / 114
Example
Next, assuming that you use FindSet to get the set representative
Union(1, 2)
3/1 4/1 5/1
8/0 9/0 10/0
2/2
7/01/1
6/0
74 / 114
Example
Next
Union(3, 4)
2/2
7/01/1
6/0
3/1
4/2
8/0
9/0
5/1
10/0
75 / 114
Example
Next
Union(2, 4)
2/2
1/1
6/0
7/0
3/1
4/3
8/0
9/0
5/1
10/0
76 / 114
Example
Now you call FindSet(8)
2/2
1/1
6/0
7/0
4/3
9/0
5/1
10/03/1 8/0
77 / 114
Example
Now you call Union(4, 5)
2/2
1/1
6/0
7/0
4/3
9/03/1 8/0 5/1
10/0
78 / 114
Properties of ranks
Lemma 1 (About the Rank Properties)
1 ∀x, rank[x] ≤ rank[p[x]].
2 ∀x with x ≠ p[x], rank[x] < rank[p[x]].
3 rank[x] is initially 0.
4 rank[x] does not decrease.
5 Once x ≠ p[x] holds, rank[x] does not change.
6 rank[p[x]] is a monotonically increasing function of time.
Proof
By induction on the number of operations...
79 / 114
For Example
Imagine a MakeSet(x)
Then, rank [x] ≤ rank [p [x]]
Thus, the claim is true after n operations.
Then we get the (n + 1)-th operation, which can be:
Case I - FindSet.
Case II - Union.
The rest is for you to prove
It is a good mental exercise!!!
80 / 114
The Number of Nodes in a Tree
Lemma 2
For all tree roots x, size(x) ≥ 2^rank[x]
Note: size (x) = number of nodes in the tree rooted at x
Proof
By induction on the number of link operations:
Basis Step
Before the first link, all ranks are 0 and each tree contains one node.
Inductive Step
Consider linking x and y (Link (x, y))
Assume the lemma holds before this operation; we show that it holds
after.
81 / 114
Case 1: rank[x] ≠ rank[y]
Assume rank [x] < rank [y]
Note (primes denote values after the link):
rank′[x] == rank [x] and rank′[y] == rank [y]
82 / 114
Therefore
We have that
size′(y) = size (x) + size (y)
≥ 2^rank[x] + 2^rank[y]
≥ 2^rank[y]
= 2^rank′[y]
83 / 114
Case 2: rank[x] == rank[y]
Assume rank [x] == rank [y]
Note (primes denote values after the link):
rank′[x] == rank [x] and rank′[y] == rank [y] + 1
84 / 114
Therefore
We have that
size′(y) = size (x) + size (y)
≥ 2^rank[x] + 2^rank[y]
= 2^(rank[y]+1)
= 2^rank′[y]
Note: In the worst case rank [x] == rank [y] == 0
85 / 114
The number of nodes at certain rank
Lemma 3
For any integer r ≥ 0, there are at most n/2^r nodes of rank r.
Proof
First fix r.
When rank r is assigned to some node x, imagine that you label
each node in the tree rooted at x by “x.”
By Lemma 2, 2^r or more nodes are labeled each time.
By Lemma 1, each node is labeled at most once, when its root is
first assigned rank r.
If there were more than n/2^r nodes of rank r,
then more than 2^r · (n/2^r) = n nodes would be
labeled by a node of rank r, a contradiction.
86 / 114
Corollary 1
Corollary 1
Every node has rank at most ⌊log n⌋.
Proof
If there were a rank r such that r > log n, there would be n/2^r < 1
nodes of rank r, a contradiction.
87 / 114
Providing the time bound
Lemma 4 (Lemma 21.7)
Suppose we convert a sequence S of m MakeSet, Union, and FindSet operations into a sequence S' of m' MakeSet, Link, and FindSet operations by turning each Union into two FindSet operations followed by a Link. Then, if sequence S' runs in O(m' log* n) time, sequence S runs in O(m log* n) time.
88 / 114
Proof:
The proof is quite easy
1 Since each Union operation in sequence S is converted into three operations in S':
m ≤ m' ≤ 3m (6)
2 We have that m' = O(m).
3 Then, if the new sequence S' runs in O(m' log* n), the old sequence S runs in O(m log* n).
89 / 114
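The operation count behind Lemma 4 can be sketched directly (illustrative helper, not from the slides): each Union becomes FindSet + FindSet + Link, so a sequence of m operations becomes one of m' operations with m ≤ m' ≤ 3m.

```python
# A counting sketch of the conversion in Lemma 4 (illustrative): each
# Union becomes FindSet + FindSet + Link, so a sequence of m operations
# becomes one of m' operations with m <= m' <= 3m.
def converted_length(ops):
    # ops: list of operation names, e.g. "MakeSet", "Union", "FindSet"
    return sum(3 if op == "Union" else 1 for op in ops)

seq = ["MakeSet"] * 4 + ["Union", "FindSet", "Union"]
m, m_prime = len(seq), converted_length(seq)
print(m <= m_prime <= 3 * m)  # True: here m = 7 and m' = 11
```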
Theorem for Union by Rank and Path Compression
Theorem
Any sequence of m MakeSet, Link, and FindSet operations, n of which are MakeSet operations, is performed in worst-case time O(m log* n).
Proof
First, MakeSet and Link take O(1) time.
The key of the analysis is to accurately charge FindSet.
90 / 114
For this, we have the following
We can do the following
Partition ranks into blocks.
Put each rank r into block log* r, for r = 0, 1, ..., ⌊log n⌋ (Corollary 1).
The highest-numbered block is log* (log n) = (log* n) − 1.
In addition, the cost of FindSet pays for the following situations
1 The FindSet pays for the cost of the root and its child.
2 A bill is given to every node whose parent changes during the path compression!!!
91 / 114
Now, define the Block function
Define the following Upper Bound Function

B(j) ≡  −1          if j = −1
        1           if j = 0
        2           if j = 1
        2^{B(j−1)}  if j ≥ 2

92 / 114
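The block bound B(j) and the iterated logarithm can be sketched as follows (illustrative helper names, not part of the slides):

```python
# A sketch of B(j) and the iterated logarithm log* n (illustrative
# helper names, not part of the slides).
import math

def B(j):
    # B(-1) = -1, B(0) = 1, and B(j) = 2^{B(j-1)} for j >= 1.
    if j == -1:
        return -1
    if j == 0:
        return 1
    return 2 ** B(j - 1)

def log_star(n):
    # Number of times log2 must be iterated before the value drops to <= 1.
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count

print([B(j) for j in range(5)])  # [1, 2, 4, 16, 65536]
print(log_star(65536))           # 4
```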
First
Something Notable
These are going to be the upper bounds for the blocks of ranks
Where
For j = 0, 1, ..., log* n − 1, block j consists of the set of ranks:
B(j − 1) + 1, B(j − 1) + 2, ..., B(j)   (7)
(these are the elements in Block j)
93 / 114
For Example
We have that
B(−1) = −1
B(0) = 1
B(1) = 2
B(2) = 2^2 = 4
B(3) = 2^{2^2} = 2^4 = 16
B(4) = 2^{2^{2^2}} = 2^16 = 65536
94 / 114
For Example
Thus, we have

Block j | Set of Ranks
0       | 0, 1
1       | 2
2       | 3, 4
3       | 5, ..., 16
4       | 17, ..., 65536
...     | ...

Note B(j) = 2^{B(j−1)} for j > 0.
95 / 114
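The table above can be checked mechanically: the block that holds rank r is exactly log*(r). The snippet below is illustrative, with a hypothetical log_star helper:

```python
# A quick check of the block table above (illustrative): the block that
# holds rank r is exactly log*(r), using a hypothetical log_star helper.
import math

def log_star(n):
    # Iterated logarithm: how many times log2 is applied before n <= 1.
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count

# rank -> expected block, read off the table above
table = {0: 0, 1: 0, 2: 1, 3: 2, 4: 2, 5: 3, 16: 3, 17: 4, 65536: 4}
print(all(log_star(r) == blk for r, blk in table.items()))  # True
```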
Example
Now you give a Union(4, 5)
[Figure: the resulting forest, nodes labeled value/rank — 2/2, 1/1, 3/1, 5/1, 4/3, 6/0, 7/0, 8/0, 9/0, 10/0 — with ranks grouped into Block 0, Block 1, and Block 2]
96 / 114
Finally
Given our Bound on the Ranks
Thus, the blocks 0, 1, ..., (log* n) − 1 suffice to store all the ranks.
97 / 114
Charging for FindSets
Two types of charges for FindSet(x0)
Block charges and Path charges.
Charge each node as either:
1) Block Charge
2) Path Charge
98 / 114
Charging for FindSets
Thus, for find sets
The find operation pays for the work done for the root and its immediate child.
It also pays for all the nodes which are not in the same block as their parents.
99 / 114
Then
First
1 All these nodes are children of some other nodes, so their ranks will not change, and they are bound to stay in the same block until the end of the computation.
2 If a node is in the same block as its parent, it will be charged for the work done in the FindSet operation!!!
100 / 114
Thus
We have the following charges
Block Charge:
For j = 0, 1, ..., log* n − 1, give one block charge to the last node with rank in block j on the path x0, x1, ..., xl.
Also give one block charge to the child of the root, i.e., xl−1, and to the root itself, i.e., xl.
Path Charge:
Give the remaining nodes in x0, ..., xl a path charge until they point to a node whose rank is in a different block from their own.
101 / 114
Charging for FindSets
Two types of charges for FindSet(x0)
Block charges and Path charges.
Charge each node as either:
1) Block Charge
2) Path Charge
102 / 114
Next
Something Notable
The number of nodes whose parents are in different blocks is limited by (log* n) − 1.
This bounds the block charges given to the last node with rank in each block j.
Plus 2 charges for the root and its child.
Thus
The cost of the Block Charges per FindSet operation is upper bounded by:
(log* n − 1) + 2 = log* n + 1. (8)
103 / 114
Claim
Claim
Once a node other than a root or its child is given a Block Charge (B.C.),
it will never be given a Path Charge (P.C.)
104 / 114
Proof
Proof
Given a node x, we know that:
rank[p[x]] − rank[x] is monotonically increasing ⇒ log* rank[p[x]] − log* rank[x] is monotonically increasing.
Thus, once x and p[x] are in different blocks, they will always be in different blocks because:
The rank of the parent can only increase.
And the child's rank stays the same.
Thus, the node x is billed in the first FindSet operation a path charge, and a block charge if necessary.
Thus, the node x will never be charged a path charge again, because it is already pointing to the set representative.
105 / 114
Remaining Goal
The Total Cost of the FindSet Operations
Total cost of FindSets = Total Block Charges + Total Path Charges.
We want to show
Total Block Charges + Total Path Charges = O(m log* n)
106 / 114
Bounding Block Charges
This part is easy
Block numbers range over 0, ..., log* n − 1.
The number of Block Charges per FindSet is ≤ log* n + 1.
The total number of FindSets is ≤ m.
The total number of Block Charges is ≤ m(log* n + 1).
107 / 114
Bounding Path Charges
Claim
Let N(j) be the number of nodes whose ranks are in block j. Then, for all j ≥ 0, N(j) ≤ 3n/(2B(j)).
Proof
By Lemma 3, summing over all possible ranks in block j:

N(j) ≤ Σ_{r=B(j−1)+1}^{B(j)} n/2^r

For j = 0:

N(0) ≤ n/2^0 + n/2^1 = 3n/2 = 3n/(2B(0))

108 / 114
Proof of claim
For j ≥ 1:

N(j) ≤ (n/2^{B(j−1)+1}) · Σ_{r=0}^{B(j)−(B(j−1)+1)} 1/2^r
     < (n/2^{B(j−1)+1}) · Σ_{r=0}^{∞} 1/2^r
     = n/2^{B(j−1)}
     = n/B(j)    (this is where the fact that B(j) = 2^{B(j−1)} is used)
     < 3n/(2B(j))

109 / 114
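The claim can also be checked numerically. The sketch below (illustrative, not from the slides) sums n/2^r over the ranks B(j−1)+1, ..., B(j) of each small block and confirms the 3n/(2B(j)) bound:

```python
# Numeric sanity check of the claim (illustrative): summing n/2^r over
# the ranks B(j-1)+1, ..., B(j) of block j stays within 3n / (2 B(j)).
def B(j):
    return -1 if j == -1 else 1 if j == 0 else 2 ** B(j - 1)

n = 1_000_000
ok = True
for j in range(4):  # blocks 0..3 (block 4 already spans ranks 17..65536)
    total = sum(n / 2 ** r for r in range(B(j - 1) + 1, B(j) + 1))
    ok = ok and total <= 3 * n / (2 * B(j))
print(ok)  # True
```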
Bounding Path Charges
We have the following
Let P(n) denote the overall number of path charges. Then:

P(n) ≤ Σ_{j=0}^{log* n − 1} α_j · β_j (9)

where α_j is the max number of nodes with ranks in Block j, and β_j is the max number of path charges per node of Block j.
110 / 114
Then, we have the following
Upper Bounds
By the claim, α_j is upper-bounded by 3n/(2B(j)).
In addition, we need to bound β_j, the maximum number of path charges for a node x in block j.
Note: Any node in Block j that is given a P.C. will remain in Block j for all m operations.
111 / 114
Now, we bound βj
So, every time x is assessed a Path Charge, it gets a new parent with increased rank.
Note: x's rank is not changed by path compression.
Suppose x has a rank in Block j
Repeated Path Charges to x will ultimately result in x's parent having a rank in a block higher than j.
From that point onward, x is given Block Charges, not Path Charges.
Therefore, the Worst Case
x has the lowest rank in Block j, i.e., B(j − 1) + 1, and x's parents' ranks successively take on the values
B(j − 1) + 2, B(j − 1) + 3, ..., B(j)
112 / 114
Finally
Hence, x can be given at most B(j) − B(j − 1) − 1 Path Charges. Therefore:

P(n) ≤ Σ_{j=0}^{log* n − 1} (3n/(2B(j))) · (B(j) − B(j − 1) − 1)
     ≤ Σ_{j=0}^{log* n − 1} (3n/(2B(j))) · B(j)
     = (3/2) n log* n

113 / 114
Thus
FindSet operations contribute
O(m(log* n + 1) + n log* n) = O(m log* n) (10)
MakeSet and Link contribute O(n).
The entire sequence takes O(m log* n).
114 / 114
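The data structure the whole analysis refers to can be sketched in a few lines. This is a minimal Python sketch with illustrative naming, assuming the standard MakeSet / FindSet / Link / Union operations of the slides:

```python
# A minimal union-find sketch with union by rank and path compression
# (illustrative naming, following the MakeSet / FindSet / Link / Union
# operations analyzed above).

class DisjointSets:
    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, x):
        # x becomes the root of its own single-node tree with rank 0.
        self.parent[x] = x
        self.rank[x] = 0

    def find_set(self, x):
        # Path compression: every node on the find path is re-pointed
        # directly at the root (the set's representative).
        if self.parent[x] != x:
            self.parent[x] = self.find_set(self.parent[x])
        return self.parent[x]

    def link(self, x, y):
        # Union by rank: the root of smaller rank becomes a child of the
        # root of larger rank; ranks only grow on ties.
        if self.rank[x] > self.rank[y]:
            x, y = y, x
        self.parent[x] = y
        if self.rank[x] == self.rank[y]:
            self.rank[y] += 1
        return y

    def union(self, x, y):
        # Union is exactly two FindSets followed by a Link (Lemma 4).
        rx, ry = self.find_set(x), self.find_set(y)
        if rx != ry:
            self.link(rx, ry)

ds = DisjointSets()
for i in range(8):
    ds.make_set(i)
for a, b in [(0, 1), (2, 3), (0, 2), (4, 5), (6, 7), (4, 6), (0, 4)]:
    ds.union(a, b)
print(len({ds.find_set(i) for i in range(8)}))  # 1: all items in one set
```

By the theorem above, any sequence of m such operations on n items runs in worst-case time O(m log* n).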
MikkiliSuresh
 
PPTX
MULTI LEVEL DATA TRACKING USING COOJA.pptx
dollysharma12ab
 
PDF
2025 Laurence Sigler - Advancing Decision Support. Content Management Ecommer...
Francisco Javier Mora Serrano
 
PDF
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
PDF
AI-Driven IoT-Enabled UAV Inspection Framework for Predictive Maintenance and...
ijcncjournal019
 
PPTX
MSME 4.0 Template idea hackathon pdf to understand
alaudeenaarish
 
PDF
Cryptography and Information :Security Fundamentals
Dr. Madhuri Jawale
 
PPTX
business incubation centre aaaaaaaaaaaaaa
hodeeesite4
 
PDF
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
PPTX
Module2 Data Base Design- ER and NF.pptx
gomathisankariv2
 
PPTX
Inventory management chapter in automation and robotics.
atisht0104
 
PDF
Chad Ayach - A Versatile Aerospace Professional
Chad Ayach
 
PPTX
Victory Precisions_Supplier Profile.pptx
victoryprecisions199
 
PPTX
MT Chapter 1.pptx- Magnetic particle testing
ABCAnyBodyCanRelax
 
PDF
Zero carbon Building Design Guidelines V4
BassemOsman1
 
PDF
Biodegradable Plastics: Innovations and Market Potential (www.kiu.ac.ug)
publication11
 
PDF
Zero Carbon Building Performance standard
BassemOsman1
 
PPTX
Tunnel Ventilation System in Kanpur Metro
220105053
 
PDF
LEAP-1B presedntation xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
hatem173148
 
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
FUNDAMENTALS OF ELECTRIC VEHICLES UNIT-1
MikkiliSuresh
 
MULTI LEVEL DATA TRACKING USING COOJA.pptx
dollysharma12ab
 
2025 Laurence Sigler - Advancing Decision Support. Content Management Ecommer...
Francisco Javier Mora Serrano
 
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
AI-Driven IoT-Enabled UAV Inspection Framework for Predictive Maintenance and...
ijcncjournal019
 
MSME 4.0 Template idea hackathon pdf to understand
alaudeenaarish
 
Cryptography and Information :Security Fundamentals
Dr. Madhuri Jawale
 
business incubation centre aaaaaaaaaaaaaa
hodeeesite4
 
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
Module2 Data Base Design- ER and NF.pptx
gomathisankariv2
 
Inventory management chapter in automation and robotics.
atisht0104
 
Chad Ayach - A Versatile Aerospace Professional
Chad Ayach
 
Victory Precisions_Supplier Profile.pptx
victoryprecisions199
 
MT Chapter 1.pptx- Magnetic particle testing
ABCAnyBodyCanRelax
 
Zero carbon Building Design Guidelines V4
BassemOsman1
 
Biodegradable Plastics: Innovations and Market Potential (www.kiu.ac.ug)
publication11
 
Zero Carbon Building Performance standard
BassemOsman1
 
Tunnel Ventilation System in Kanpur Metro
220105053
 
LEAP-1B presedntation xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
hatem173148
 

17 Disjoint Set Representation

  • 1. Analysis of Algorithm Disjoint Set Representation Andres Mendez-Vazquez November 8, 2015 1 / 114
  • 2. Outline 1 Disjoint Set Representation Definition of the Problem Operations 2 Union-Find Problem The Main Problem Applications 3 Implementations First Attempt: Circular List Operations and Cost Still we have a Problem Weighted-Union Heuristic Operations Still a Problem Heuristic Union by Rank 4 Balanced Union Path compression Time Complexity Ackermann’s Function Bounds The Rank Observation Proof of Complexity Theorem for Union by Rank and Path Compression) 2 / 114
  • 4. Disjoint Set Representation Problem 1 Items are drawn from the finite universe U = {1, 2, ..., n} for some fixed n. 2 We want to maintain a partition of U as a collection of disjoint sets. 3 In addition, we want to uniquely name each set by one of its items, called its representative item. These disjoint sets are maintained under the following operations 1 MakeSet(x) 2 Union(A,B) 3 Find(x) 4 / 114
  • 11. Operations MakeSet(x) Given x ∈ U currently not belonging to any set in the collection, create a new singleton set {x} and name it x. This is usually done at start, once per item, to create the initial trivial partition. Union(A,B) It changes the current partition by replacing its sets A and B with A ∪ B. Name the set A or B. The operation may choose either one of the two representatives as the new representative. Find(x) It returns the name of the set that currently contains item x. 6 / 114
  • 16. Example for x = 1 to 9 do MakeSet(x) (figure: nine singleton sets 1–9) Then, you do a Union(1, 2) (figure: {1, 2} and the remaining singletons) Now, Union(3, 4); Union(5, 8); Union(6, 9) (figure: the resulting sets {1, 2}, {3, 4}, {5, 8}, {6, 9}, {7}) 7 / 114
  • 19. Example Now, Union(1, 5); Union(7, 4) (figure: the sets {1, 2, 5, 8}, {3, 4, 7}, {6, 9}) Then, if we do the following operations Find(1) returns 5 Find(9) returns 9 Finally, Union(5, 9) (figure: the sets {1, 2, 5, 6, 8, 9} and {3, 4, 7}) Then Find(9) returns 5 8 / 114
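The slides allow Union to name the merged set after either representative. The sketch below (ours, not the slides') uses a plain Python dict and the convention that Union keeps the first argument's name, so the specific representatives differ from the slide's figures, but the resulting set memberships match. Union is always called on current representatives via Find.

```python
# Minimal dictionary-based sketch of MakeSet / Find / Union.
# Illustration only; the naming convention (keep first argument's
# name) is ours, not the slides'.
rep = {}  # rep[x] = name of the set currently containing x

def make_set(x):
    rep[x] = x  # x becomes a singleton set named after itself

def find(x):
    return rep[x]

def union(a, b):
    # Rename every member of set b to a.  This scans the whole
    # universe (O(n)); the circular-list version later avoids that.
    for x in rep:
        if rep[x] == b:
            rep[x] = a

# Replaying the slides' sequence of operations:
for x in range(1, 10):
    make_set(x)
union(1, 2); union(3, 4); union(5, 8); union(6, 9)
union(find(1), find(5)); union(find(7), find(4))
assert find(1) == find(2) == find(5) == find(8)  # {1,2,5,8} together
assert find(9) == find(6)                        # {6,9} together
assert find(3) == find(4) == find(7)             # {3,4,7} together
union(find(5), find(9))
assert find(9) == find(1)                        # 9 joined 1's set
```

With this convention Find(1) returns 1 rather than the 5 shown on the slide, which is why the assertions compare set memberships instead of fixed representative names.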
  • 21. Union-Find Problem Problem Let S be a sequence of m = |S| MakeSet, Union and Find operations (intermixed in arbitrary order): n of which are MakeSet. At most n − 1 are Union. The rest are Finds. Cost(S) = total computational time to execute sequence S. Goal: Find an implementation that, for every m and n, minimizes the amortized cost per operation: Cost (S) |S| (1) for any arbitrary sequence S. 10 / 114
  • 28. Applications Examples 1 Maintaining partitions and equivalence classes. 2 Graph connectivity under edge insertion. 3 Minimum spanning trees (e.g. Kruskal’s algorithm). 4 Random maze construction. 12 / 114
  • 33. Circular lists We use the following structures Data structure: Two arrays Set[1..n] and next[1..n]. Set[x] returns the name of the set that contains item x. A is a set if and only if Set[A] = A next[x] returns the next item on the list of the set that contains item x. 14 / 114
  • 38. Circular lists Example: n = 16, Partition: {{1, 2, 8, 9}, {4, 3, 10, 13, 14, 15, 16}, {7, 6, 5, 11, 12}}
    x:    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
    Set:  1  1  4  4  7  7  7  1  1  4  7  7  4  4  4  4
    next: 2  8 10  3 12  5  6  9  1 13  7 11 14 15 16  4
(figure: the circular list of set 1 is 1 → 2 → 8 → 9 → 1) 15 / 114
  • 40. Circular lists (figure: set 7’s list is 7 → 6 → 5 → 12 → 11 → 7; set 4’s list is 4 → 3 → 10 → 13 → 14 → 15 → 16 → 4) 16 / 114
  • 41. Operations and Cost MakeSet(x) 1 Set[x] = x 2 next[x] = x Complexity O (1) Time Find(x) 1 return Set[x] Complexity O (1) Time 17 / 114
  • 45. Operations and Cost For the union We are assuming Set[A] = A ≠ B = Set[B] Union1(A, B) 1 Set[B] = A 2 x = next[B] 3 while (x ≠ B) 4 Set[x] = A /* Rename Set B to A*/ 5 x = next[x] 6 x = next[B] /* Splice list A and B */ 7 next[B] = next[A] 8 next[A] = x 18 / 114
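The Union1 pseudocode transcribes almost line-for-line into Python. A self-contained sketch with 1-based arrays (index 0 unused), including the O(1) MakeSet and Find:

```python
# Circular-list disjoint sets: Set[x] names x's set, Next[x] is the
# next item on that set's circular list.
n = 16
Set = list(range(n + 1))   # MakeSet(x) for every x: Set[x] = x
Next = list(range(n + 1))  # ... and next[x] = x

def find(x):
    return Set[x]          # O(1)

def union1(a, b):
    # Assumes a and b are representatives of two distinct sets.
    Set[b] = a
    x = Next[b]
    while x != b:          # rename every member of B to A: O(|B|)
        Set[x] = a
        x = Next[x]
    # splice the two circular lists (swap Next[a] and Next[b])
    Next[b], Next[a] = Next[a], Next[b]

union1(1, 2)
union1(1, 3)
assert find(2) == 1 and find(3) == 1
# set 1's circular list is now 1 -> 3 -> 2 -> 1
```

Note that the splice on lines 6–8 of the pseudocode is just an exchange of next[A] and next[B], which the tuple assignment expresses directly.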
  • 48. Operations and Cost (figure: splicing the circular lists of A and B) 19 / 114
  • 49. We have a Problem Complexity O (|B|) Time Not only that, if we have the following sequence of operations 1 for x = 1 to n 2 MakeSet(x) 3 for x = 1 to n − 1 4 Union1(x + 1, x) 20 / 114
  • 51. Thus Thus, we have the following number of aggregated steps: n + Σ_{i=1}^{n−1} i = n + n(n − 1)/2 = n + (n² − n)/2 = n²/2 + n/2 = Θ(n²) 21 / 114
  • 56. Aggregate Time Thus, the aggregate time is as follows Aggregate Time = Θ(n²) Therefore Amortized Time per operation = Θ(n) 22 / 114
  • 58. This is not exactly good Thus, we need to have something better We will try now the Weighted-Union Heuristic!!! 23 / 114
  • 60. Implementation 2: Weighted-Union Heuristic Lists We extend the previous data structure Data structure: Three arrays Set[1..n], next[1..n], size[1..n]. size[A] returns the number of items in set A if A == Set[A] (Otherwise, we do not care). 25 / 114
  • 61. Operations MakeSet(x) 1 Set[x] = x 2 next[x] = x 3 size[x] = 1 Complexity O (1) time Find(x) 1 return Set[x] Complexity O (1) time 26 / 114
  • 65. Operations Union2(A, B) 1 if size[Set[A]] > size[Set[B]] 2 size[Set[A]] = size[Set[A]] + size[Set[B]] 3 Union1(A, B) 4 else 5 size[Set[B]] = size[Set[A]] + size[Set[B]] 6 Union1(B, A) Note: Weight Balanced Union: Merge smaller set into larger set Complexity O (min {|A| , |B|}) time. 27 / 114
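Union2 is a thin layer of size bookkeeping over Union1. A self-contained sketch (union1 and find repeated so the snippet runs on its own); note that on equal sizes the else branch fires, so the merged set takes the second argument's name:

```python
# Weighted-union heuristic on the circular-list structure: always
# rename the smaller set, in O(min(|A|, |B|)) time.
n = 16
Set = list(range(n + 1))
Next = list(range(n + 1))
size = [1] * (n + 1)       # size[a] is valid only when a == Set[a]

def find(x):
    return Set[x]

def union1(a, b):          # rename B's members to A and splice: O(|B|)
    Set[b] = a
    x = Next[b]
    while x != b:
        Set[x] = a
        x = Next[x]
    Next[b], Next[a] = Next[a], Next[b]

def union2(a, b):          # only the smaller set's pointers are rewritten
    if size[a] > size[b]:
        size[a] += size[b]
        union1(a, b)
    else:
        size[b] += size[a]
        union1(b, a)

union2(1, 2)               # tie: the merged set {1, 2} is named 2
assert find(1) == 2
union2(3, find(1))         # singleton {3} is smaller, merged in
assert find(3) == 2 and size[find(3)] == 3
```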
  • 67. What about the operations eliciting the worst behavior Remember 1 for x = 1 to n 2 MakeSet(x) 3 for x = 1 to n − 1 4 Union2(x + 1, x) We have then n + Σ_{i=1}^{n−1} 1 = n + n − 1 = 2n − 1 = Θ(n) IMPORTANT: This is not the worst sequence!!! 28 / 114
  • 69. For this, notice the following worst sequence Worst Sequence S MakeSet(x), for x = 1, .., n. Then do n − 1 Unions in round-robin manner. Within each round, the sets have roughly equal size. Starting round: each set has size 1. Next round: each set has size 2. Next: ... size 4. ... We claim the following Aggregate time = Θ(n log n) Amortized time per operation = Θ(log n) 29 / 114
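The round-robin schedule can be simulated by instrumenting Union1 to count pointer rewrites. This sketch (our instrumentation, not the slides') confirms that for n = 16 no pointer moves more than lg n = 4 times, and the total work is (n/2) · lg n = 32, matching the Θ(n log n) claim:

```python
import math

# Weighted union with an updates[] counter: updates[x] records how
# often x's Set-pointer is rewritten.
n = 16
Set = list(range(n + 1))     # MakeSet for items 1..n
Next = list(range(n + 1))
size = [1] * (n + 1)
updates = [0] * (n + 1)

def union1(a, b):            # rename B's members to A, counting rewrites
    Set[b] = a
    updates[b] += 1
    x = Next[b]
    while x != b:
        Set[x] = a
        updates[x] += 1
        x = Next[x]
    Next[b], Next[a] = Next[a], Next[b]

def union2(a, b):            # weighted union: rename the smaller set
    if size[a] > size[b]:
        size[a] += size[b]
        union1(a, b)
    else:
        size[b] += size[a]
        union1(b, a)

# n - 1 unions in round-robin order: equal-sized sets in every round
reps = list(range(1, n + 1))
while len(reps) > 1:
    merged = []
    for i in range(0, len(reps), 2):
        union2(reps[i], reps[i + 1])
        merged.append(Set[reps[i]])   # representative of the merged set
    reps = merged

# For this power-of-two n: every pointer moved at most lg n times,
# and each of the lg n rounds rewrites n/2 pointers in total.
assert max(updates[1:]) == int(math.log2(n))
assert sum(updates[1:]) == (n // 2) * int(math.log2(n))
```

Each of the lg n rounds rewrites n/2 pointers (the "losing" half of every merge), which is exactly where the n log n aggregate comes from.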
  • 77. For this, notice the following worst sequence Example n = 16 Round 0: {1} {2} {3} {4} {5} {6} {7} {8} {9} {10} {11} {12} {13} {14} {15} {16} Round 1: {1, 2} {3, 4} {5, 6} {7, 8} {9, 10} {11, 12} {13, 14} {15, 16} Round 2: {1, 2, 3, 4} {5, 6, 7, 8} {9, 10, 11, 12} {13, 14, 15, 16} Round 3: {1, 2, 3, 4, 5, 6, 7, 8} {9, 10, 11, 12, 13, 14, 15, 16} Round 4: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16} 30 / 114
  • 82. Now Given the previous worst case What is the complexity of this implementation? 31 / 114
  • 83. Now, the Amortized Costs of this implementation Claim 1: Amortized time per operation is O(log n) For this, we have the following theorem!!! Theorem 1 Using the linked-list representation of disjoint sets and the weighted-Union heuristic, a sequence of m MakeSet, Union, and FindSet operations, n of which are MakeSet operations, takes O (m + n log n) time. 32 / 114
  • 85. Proof Because each Union operation unites two disjoint sets We perform at most n − 1 Union operations over all. We now bound the total time taken by these Union operations We start by determining, for each object, an upper bound on the number of times the object’s pointer back to its set object is updated. Consider a particular object x. We know that each time x’s pointer was updated, x must have started in the smaller set. The first time x’s pointer was updated, therefore, the resulting set must have had at least 2 members. Similarly, the next time x’s pointer was updated, the resulting set must have had at least 4 members. 33 / 114
• 90. Proof Continuing on We observe that for any k ≤ n, after x’s pointer has been updated ⌈log k⌉ times, the resulting set must have at least k members. Thus Since the largest set has at most n members, each object’s pointer is updated at most ⌈log n⌉ times over all the Union operations. Then The total time spent updating object pointers over all Union operations is O (n log n). 34 / 114
• 94. Proof We must also account for updating the tail pointers and the list lengths, which takes only O (1) time per Union operation. Therefore The total time spent in all Union operations is thus O (n log n). The time for the entire sequence of m operations follows easily Each MakeSet and FindSet operation takes O (1) time, and there are O (m) of them. 35 / 114
• 97. Proof Therefore, the total time for the entire sequence is O (m + n log n). 36 / 114
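The pointer-update bound above can be checked empirically. Below is a minimal Python sketch (all names, such as `weighted_union_demo`, are ours, not from the slides) that simulates list-based sets under the weighted-union heuristic and counts how often each element's representative pointer changes; no element is re-pointed more than log2 n times.

```python
def weighted_union_demo(n):
    """Simulate list-based sets under the weighted-union heuristic and
    count how often each element's representative pointer is updated."""
    members = {x: [x] for x in range(n)}   # representative -> member list
    rep = {x: x for x in range(n)}         # element -> representative
    moves = {x: 0 for x in range(n)}       # pointer updates per element

    def union(a, b):
        a, b = rep[a], rep[b]
        if a == b:
            return
        if len(members[a]) < len(members[b]):
            a, b = b, a                    # always move the smaller list
        for x in members[b]:
            rep[x] = a                     # re-point every moved element
            moves[x] += 1
        members[a].extend(members[b])
        del members[b]

    step = 1                               # balanced "tournament" unions,
    while step < n:                        # the worst case for total moves
        for i in range(0, n, 2 * step):
            if i + step < n:
                union(i, i + step)
        step *= 2
    return max(moves.values())

print(weighted_union_demo(64))  # → 6, i.e. at most log2(64) moves per element
```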
• 98. Amortized Cost: Aggregate Analysis Aggregate cost O(m + n log n). Amortized cost per operation O(log n): O (m + n log n) / m = O (1 + log n) = O (log n) (2) 37 / 114
  • 99. There are other ways of analyzing the amortized cost It is possible to use 1 Accounting Method. 2 Potential Method. 38 / 114
  • 100. Amortized Costs: Accounting Method Accounting method MakeSet(x): Charge (1 + log n). 1 to do the operation, log n stored as credit with item x. Find(x): Charge 1, and use it to do the operation. Union(A, B): Charge 0 and use 1 stored credit from each item in the smaller set to move it. 39 / 114
• 103. Amortized Costs: Accounting Method Credit invariant Total stored credit is ∑ S∈S |S| log (n/|S|), where the summation is taken over the collection S of all disjoint sets of the current partition. 40 / 114
  • 104. Amortized Costs: Potential Method Potential function method Exercise: Define a regular potential function and use it to do the amortized analysis. Can you make the Union amortized cost O(log n), MakeSet and Find costs O(1)? 41 / 114
• 107. Outline 1 Disjoint Set Representation Definition of the Problem Operations 2 Union-Find Problem The Main Problem Applications 3 Implementations First Attempt: Circular List Operations and Cost Still we have a Problem Weighted-Union Heuristic Operations Still a Problem Heuristic Union by Rank 4 Balanced Union Path compression Time Complexity Ackermann’s Function Bounds The Rank Observation Proof of Complexity Theorem for Union by Rank and Path Compression 42 / 114
• 108. Improving over the heuristic using union by rank Union by Rank Instead of using the number of nodes in each tree to make a decision, we maintain a rank, an upper bound on the height of the tree. We have the following data structure to support this: We maintain a parent array p[1..n]. A is a set if and only if A = p[A] (a tree root). x ∈ A if and only if x is in the tree rooted at A. 43 / 114
• 113. Forest of Up-Trees: Operations without union by rank or weight MakeSet(x) 1 p[x] = x Complexity O (1) time Union(A, B) 1 p[B] = A Note: We are assuming that p[A] == A, p[B] == B, and A ≠ B. This is the reason we need a find operation!!! 44 / 114
• 116. Example Remember, we are doing the joins without caring about balance, which can produce the worst case. 45 / 114
  • 117. Forest of Up-Trees: Operations without union by rank or weight Find(x) 1 if x ==p[x] 2 return x 3 return Find(p [x]) Example 46 / 114
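The three operations above translate almost line for line into Python. A minimal sketch (the dict `p` plays the role of the parent array; the function names are ours):

```python
def make_set(p, x):
    p[x] = x                      # a fresh element is its own root

def union(p, a, b):
    p[b] = a                      # assumes a and b are distinct roots

def find(p, x):
    while x != p[x]:              # climb until the root names the set
        x = p[x]
    return x

p = {}
for x in range(1, 6):
    make_set(p, x)
union(p, 1, 2)                    # {1, 2}
union(p, 1, 3)                    # {1, 2, 3}
print(find(p, 3), find(p, 5))     # → 1 5
```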
• 119. Forest of Up-Trees: Operations without union by rank or weight Still, I can give you a horrible case Sequence of operations 1 for x = 1 to n 2 MakeSet(x) 3 for x = 1 to n − 1 4 Union(x + 1, x) 5 for x = 1 to n − 1 6 Find(1) 47 / 114
• 121. Forest of Up-Trees: Operations without union by rank or weight We finish with this data structure: a chain of parent pointers 1 → 2 → · · · → n − 1 → n. Thus the last part of the sequence gives us a total time of Aggregate Time Θ (n²) Amortized Analysis per operation Θ (n) 48 / 114
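The Θ(n) cost per Find on the resulting chain can be observed directly. A small sketch (names ours) that builds the chain by setting p[x] = x + 1, as Union(x + 1, x) does, and counts the links that Find(1) follows:

```python
def find_depth(p, x):
    """Iterative find that also reports how many links it followed."""
    hops = 0
    while x != p[x]:
        x = p[x]
        hops += 1
    return x, hops

n = 100
p = {x: x for x in range(1, n + 1)}      # n MakeSet operations
for x in range(1, n):                    # Union(x + 1, x) without balancing:
    p[x] = x + 1                         # the tree degenerates into a chain
root, hops = find_depth(p, 1)
print(root, hops)  # → 100 99
```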
• 123. Self-Adjusting forest of Up-Trees How do we avoid this problem? Use the following heuristics together!!! 1 Balanced Union. By tree weight (i.e., size) By tree rank (i.e., height) 2 Find with path compression Observations Each single improvement (1 or 2) by itself will result in logarithmic amortized cost per operation. The two improvements combined will result in amortized cost per operation approaching very close to O(1). 49 / 114
  • 130. Balanced Union by Size Using size for Balanced Union We can use the size of each set to obtain what we want 50 / 114
• 131. We have then MakeSet(x) 1 p[x] = x 2 size[x] = 1 Note: Complexity O (1) time Union(A, B) Input: assume that A and B are distinct roots: p[A] = A, p[B] = B, A ≠ B 1 if size[A] >size[B] 2 size[A] =size[A]+size[B] 3 p[B] = A 4 else 5 size[B] =size[A]+size[B] 6 p[A] = B Note: Complexity O (1) time 51 / 114
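The union-by-size pseudocode above, as a runnable Python sketch (function and variable names are ours, not from the slides):

```python
def make_set(p, size, x):
    p[x] = x
    size[x] = 1

def union_by_size(p, size, a, b):
    """Link roots a and b; the root of the larger set survives.
    Assumes a and b are distinct roots."""
    if size[a] > size[b]:
        size[a] += size[b]
        p[b] = a                  # smaller tree hangs under the larger
    else:
        size[b] += size[a]
        p[a] = b

p, size = {}, {}
for x in [1, 2, 3, 4]:
    make_set(p, size, x)
union_by_size(p, size, 1, 2)      # sizes tie: 1 hangs under 2
union_by_size(p, size, 2, 3)      # size[2] == 2 > 1: 3 hangs under 2
print(p[3], size[2])  # → 2 3
```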
• 134. Example Now, we use the size for the union: since size[A] > size[B], B is linked under A. 52 / 114
• 135. Nevertheless Union by size can make the analysis too complex, so people would rather use the rank. Rank It is defined as an upper bound on the height of the tree. Because The use of the rank simplifies the amortized analysis for the data structure!!! 53 / 114
• 138. Thus, we use the balanced union by rank MakeSet(x) 1 p[x] = x 2 rank[x] = 0 Note: Complexity O (1) time Union(A, B) Input: assume that A and B are distinct roots: p[A] = A, p[B] = B, A ≠ B 1 if rank[A] >rank[B] 2 p[B] = A 3 else 4 p[A] = B 5 if rank[A] ==rank[B] 6 rank[B]=rank[B]+1 Note: Complexity O (1) time 54 / 114
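The union-by-rank pseudocode translates directly. A minimal Python sketch (names ours); note that a rank only grows when two trees of equal rank are linked:

```python
def make_set(p, rank, x):
    p[x] = x
    rank[x] = 0

def union_by_rank(p, rank, a, b):
    """Link roots a and b, keeping the taller tree's root.
    Assumes a and b are distinct roots."""
    if rank[a] > rank[b]:
        p[b] = a
    else:
        p[a] = b
        if rank[a] == rank[b]:    # equal heights: the merged tree grows by one
            rank[b] += 1

p, rank = {}, {}
for x in [1, 2, 3]:
    make_set(p, rank, x)
union_by_rank(p, rank, 1, 2)      # equal ranks: 1 under 2, rank[2] becomes 1
union_by_rank(p, rank, 2, 3)      # rank[2] > rank[3]: 3 under 2, rank unchanged
print(p[1], p[3], rank[2])  # → 2 2 1
```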
• 141. Example Now We use the rank for the union Case I The rank of A is larger than the rank of B 55 / 114
• 143. Example Case II The rank of B is larger than the rank of A 56 / 114
• 145. Outline 1 Disjoint Set Representation Definition of the Problem Operations 2 Union-Find Problem The Main Problem Applications 3 Implementations First Attempt: Circular List Operations and Cost Still we have a Problem Weighted-Union Heuristic Operations Still a Problem Heuristic Union by Rank 4 Balanced Union Path compression Time Complexity Ackermann’s Function Bounds The Rank Observation Proof of Complexity Theorem for Union by Rank and Path Compression 57 / 114
• 146. Here is the new heuristic to improve overall performance: Path Compression Find(x) 1 if x ≠ p[x] 2 p[x] = Find(p[x]) 3 return p[x] Complexity O (depth (x)) time 58 / 114
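The compressing Find, sketched in Python (the chain layout and names are ours): after a single call, every node on the search path points directly at the root.

```python
def find(p, x):
    """Find with path compression: every node visited on the way up
    is re-pointed straight at the root."""
    if x != p[x]:
        p[x] = find(p, p[x])
    return p[x]

# build a chain 1 -> 2 -> 3 -> 4 of parent pointers, then compress it
p = {1: 2, 2: 3, 3: 4, 4: 4}
root = find(p, 1)
print(root, p[1], p[2], p[3])  # → 4 4 4 4
```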
  • 148. Example We have the following structure 59 / 114
• 151. Path compression Find(x) should traverse the path from x up to its root. This might as well create shortcuts along the way to improve the efficiency of the future operations. Find(2) 62 / 114
• 152. Outline 1 Disjoint Set Representation Definition of the Problem Operations 2 Union-Find Problem The Main Problem Applications 3 Implementations First Attempt: Circular List Operations and Cost Still we have a Problem Weighted-Union Heuristic Operations Still a Problem Heuristic Union by Rank 4 Balanced Union Path compression Time Complexity Ackermann’s Function Bounds The Rank Observation Proof of Complexity Theorem for Union by Rank and Path Compression 63 / 114
  • 153. Time complexity Tight upper bound on time complexity An amortized time of O(mα(m, n)) for m operations. Where α(m, n) is the inverse of the Ackermann’s function (almost a constant). This bound, for a slightly different definition of α than that given here is shown in Cormen’s book. 64 / 114
• 156. Ackermann’s Function Definition A(1, j) = 2^j where j ≥ 1 A(i, 1) = A(i − 1, 2) where i ≥ 2 A(i, j) = A(i − 1, A(i, j − 1)) where i, j ≥ 2 Note: This is one of several inequivalent but similar definitions of Ackermann’s function found in the literature. Cormen’s book authors give a different definition, although they never really call theirs Ackermann’s function. Property Ackermann’s function grows very fast, thus its inverse grows very slowly. 65 / 114
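The slide's recurrence can be typed in directly, but only the very smallest arguments are computable; already A(3, 2) is a tower of 2s far beyond machine range. A memoized sketch of this variant (the name `A` follows the slides; the code is ours):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def A(i, j):
    """The slides' variant of Ackermann's function.
    Safe only for tiny i and j: the values explode immediately."""
    if i == 1:
        return 2 ** j                 # A(1, j) = 2^j
    if j == 1:
        return A(i - 1, 2)            # A(i, 1) = A(i - 1, 2)
    return A(i - 1, A(i, j - 1))      # A(i, j) = A(i - 1, A(i, j - 1))

print(A(1, 3), A(2, 1), A(2, 2), A(2, 3), A(3, 1))  # → 8 4 16 65536 16
```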
• 162. Ackermann’s Function Example A(3, 4) is a tower of 2s whose height is itself a tower of 2s, iterated down to 16. Notation: 2^2^···^2 (10) means a tower of ten 2s. 66 / 114
• 163. Inverse of Ackermann’s function Definition α(m, n) = min { i ≥ 1 | A (i, ⌊m/n⌋) > log n } (3) Note: This is not a true mathematical inverse. Intuition: Grows about as slowly as Ackermann’s function does fast. How slowly? Let k = ⌊m/n⌋; then m ≥ n → k ≥ 1 67 / 114
• 167. Thus First We can show that A(i, k) ≥ A(i, 1) for all i ≥ 1. This is left to you... For Example Consider i = 4, then A(i, k) ≥ A(4, 1) = 2^2^···^2 (a tower of ten 2s) ≈ 10^80. Finally if log n < 10^80, i.e., if n < 2^(10^80) =⇒ α(m, n) ≤ 4 68 / 114
• 172. Instead of Using the Ackermann Inverse We define the following function log∗ n = min { i ≥ 0 | log(i) n ≤ 1 } (4) Here log(i) n means log applied i times: log(log(· · · log n)). Then We will establish O (m log∗ n) as an upper bound. 69 / 114
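The iterated logarithm is trivial to compute directly from its definition. A minimal sketch (the name `log_star` is ours):

```python
import math

def log_star(n):
    """Iterated logarithm: how many times log2 must be applied
    before the value drops to at most 1."""
    i = 0
    while n > 1:
        n = math.log2(n)
        i += 1
    return i

print(log_star(2), log_star(16), log_star(65536))  # → 1 3 4
```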
• 174. In particular Something Notable In particular, we have that log∗ of a tower 2^2^···^2 with k 2s in the exponent equals k + 1. For Example log∗ 2^65536 = log∗ 2^2^2^2^2 = 5 (5) Therefore We have that log∗ n ≤ 5 for all practical purposes. 70 / 114
• 177. The Rank Observation Something Notable Once a node becomes a child of another node, its rank does not change under any subsequent operation. 71 / 114
• 178. For Example The number to the right of each node is its rank MakeSet(1),MakeSet(2),MakeSet(3), ..., MakeSet(10) 1/0 2/0 3/0 4/0 5/0 6/0 7/0 8/0 9/0 10/0 72 / 114
• 179. Example Now, we do Union(6, 1),Union(7, 2), ..., Union(10, 5) 1/1 2/1 3/1 4/1 5/1 6/0 7/0 8/0 9/0 10/0 73 / 114
  • 180. Example Next - Assuming that you are using a FindSet to get the name set Union(1, 2) 3/1 4/1 5/1 8/0 9/0 10/0 2/2 7/01/1 6/0 74 / 114
  • 183. Example Now you give a FindSet(8) 2/2 1/1 6/0 7/0 4/3 9/0 5/1 10/03/1 8/0 77 / 114
  • 184. Example Now you give a Union(4, 5) 2/2 1/1 6/0 7/0 4/3 9/03/1 8/0 5/1 10/0 78 / 114
• 185. Properties of ranks Lemma 1 (About the Rank Properties) 1 ∀x, rank[x] ≤ rank[p[x]]. 2 ∀x with x ≠ p[x], rank[x] < rank[p[x]]. 3 rank[x] is initially 0. 4 rank[x] does not decrease. 5 Once x ≠ p[x] holds, rank[x] does not change. 6 rank[p[x]] is a monotonically increasing function of time. Proof By induction on the number of operations... 79 / 114
• 192. For Example Imagine a MakeSet(x) Then, rank [x] ≤ rank [p [x]], so the property is true after n operations. Then we consider the (n + 1)-th operation, which can be: Case I - FindSet. Case II - Union. The rest are for you to prove It is a good mental exercise!!! 80 / 114
• 197. The Number of Nodes in a Tree Lemma 2 For all tree roots x, size(x) ≥ 2^rank[x] Note size (x) = Number of nodes in the tree rooted at x Proof By induction on the number of link operations: Basis Step Before the first link, all ranks are 0 and each tree contains one node. Inductive Step Consider linking x and y (Link (x, y)) Assume the lemma holds before this operation; we show that it holds after. 81 / 114
  • 198. The Number of Nodes in a Tree Lemma 2 For all tree roots x, size(x) ≥ 2^rank[x]. Note: size(x) = number of nodes in the tree rooted at x. Proof By induction on the number of link operations. Basis Step: Before the first link, all ranks are 0 and each tree contains one node. Inductive Step: Consider linking x and y (Link(x, y)). Assume the lemma holds before this operation; we show that it still holds after. 81 / 114
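The induction above can be exercised directly. Below is a minimal union-by-rank sketch (the names `make_set`, `find_root`, `link` are ours, not from the slides) that tracks size(x) alongside rank[x] and asserts the Lemma 2 invariant, size(root) ≥ 2^rank[root], after every link:

```python
# Minimal union-by-rank sketch; checks size(root) >= 2**rank[root] (Lemma 2).
parent, rank, size = {}, {}, {}

def make_set(x):
    parent[x], rank[x], size[x] = x, 0, 1

def find_root(x):            # plain find; compression is not needed here
    while parent[x] != x:
        x = parent[x]
    return x

def link(x, y):              # x and y must be roots
    if rank[x] > rank[y]:
        x, y = y, x          # keep y as the root of larger (or equal) rank
    parent[x] = y
    size[y] += size[x]
    if rank[x] == rank[y]:   # equal ranks: the new root's rank grows by one
        rank[y] += 1
    assert size[y] >= 2 ** rank[y]   # Lemma 2 invariant
    return y

for i in range(8):
    make_set(i)
r = 0
for i in range(1, 8):        # fold elements 1..7 into the set of 0
    r = link(find_root(i), r)
```

Linking a chain of singletons like this keeps the root's rank at 1 while its size grows, so the inequality is loose; linking equal-rank roots is what pushes ranks up.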
  • 204. Case 1: rank[x] ≠ rank[y] Assume rank[x] < rank[y]. Note: after the link, rank'[x] == rank[x] and rank'[y] == rank[y]. 82 / 114
  • 205. Therefore We have that size'(y) = size(x) + size(y) ≥ 2^rank[x] + 2^rank[y] ≥ 2^rank[y] = 2^rank'[y]. 83 / 114
  • 209. Case 2: rank[x] == rank[y] Assume rank[x] == rank[y]. Note: after the link, rank'[x] == rank[x] and rank'[y] == rank[y] + 1. 84 / 114
  • 210. Therefore We have that size'(y) = size(x) + size(y) ≥ 2^rank[x] + 2^rank[y] = 2^(rank[y]+1) = 2^rank'[y]. Note: in the worst case rank[x] == rank[y] == 0. 85 / 114
  • 215. The number of nodes at a certain rank Lemma 3 For any integer r ≥ 0, there are at most n/2^r nodes of rank r. Proof First fix r. When rank r is assigned to some node x, imagine that you label each node in the tree rooted at x by “x.” By lemma 21.3, 2^r or more nodes are labeled each time. By lemma 21.2, each node is labeled at most once, when its root is first assigned rank r. If there were more than n/2^r nodes of rank r, then more than 2^r · n/2^r = n nodes would be labeled by a node of rank r, a contradiction. 86 / 114
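Lemma 3 can be sanity-checked on a concrete forest. This self-contained sketch (our own code, not the slides') performs pairwise union-by-rank merges on n = 64 nodes and verifies that no rank r is held by more than n/2^r nodes:

```python
# Numeric check of Lemma 3: after any sequence of union-by-rank operations
# on n nodes, at most n / 2**r nodes carry rank r. The union order below
# (binomial-style pairwise merging) is just one example sequence.
n = 64
parent = list(range(n))
rank = [0] * n

def find(x):
    while parent[x] != x:
        x = parent[x]
    return x

def union(a, b):
    x, y = find(a), find(b)
    if x == y:
        return
    if rank[x] > rank[y]:
        x, y = y, x
    parent[x] = y
    if rank[x] == rank[y]:
        rank[y] += 1

for step in (1, 2, 4, 8, 16, 32):        # repeatedly merge equal-rank roots
    for i in range(0, n, 2 * step):
        union(i, i + step)

for r in range(max(rank) + 1):           # Lemma 3 bound, rank by rank
    count = sum(1 for v in rank if v == r)
    assert count <= n // 2 ** r, (r, count)
```

This merge order is the adversarial one: it drives ranks up as fast as possible, yet the rank-r population still halves at every level (32 nodes of rank 0, 16 of rank 1, and so on), exactly as the lemma predicts.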
  • 222. Corollary 1 Corollary 1 Every node has rank at most log n. Proof If there were a rank r such that r > log n, then there would be n/2^r < 1 nodes of rank r, a contradiction. 87 / 114
  • 224. Providing the time bound Lemma 4 (Lemma 21.7) Suppose we convert a sequence S of m MakeSet, Union, and FindSet operations into a sequence S' of m' MakeSet, Link, and FindSet operations by turning each Union into two FindSet operations followed by a Link. Then, if sequence S' runs in O(m' log∗ n) time, sequence S runs in O(m log∗ n) time. 88 / 114
  • 225. Proof: The proof is quite easy 1 Since each Union operation in sequence S is converted into three operations in S', m ≤ m' ≤ 3m (6) 2 We have that m' = O(m) 3 Then, if the new sequence S' runs in O(m' log∗ n), this implies that the old sequence S runs in O(m log∗ n) 89 / 114
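The conversion in Lemma 4 is mechanical. A sketch of it (naming is ours) with Union expressed literally as two FindSets plus one Link:

```python
# Lemma 4's conversion: one Union = two FindSet calls + one Link,
# so m operations become at most 3m operations.
parent, rank = {}, {}

def make_set(x):
    parent[x], rank[x] = x, 0

def find_set(x):
    while parent[x] != x:
        x = parent[x]
    return x

def link(x, y):                  # x, y are roots
    if x == y:
        return x
    if rank[x] > rank[y]:
        x, y = y, x
    parent[x] = y
    if rank[x] == rank[y]:
        rank[y] += 1
    return y

def union(a, b):
    # exactly the rewriting used in the lemma
    return link(find_set(a), find_set(b))

for v in "abcd":
    make_set(v)
union("a", "b")
union("c", "d")
union("a", "d")
```

Since the rewritten sequence is at most three times longer, any bound of the form O(m' log* n) on it transfers directly to the original sequence.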
  • 228. Theorem for Union by Rank and Path Compression Theorem Any sequence of m MakeSet, Link, and FindSet operations, n of which are MakeSet operations, is performed in worst-case time O(m log∗ n). Proof First, MakeSet and Link take O(1) time. The key of the analysis is to accurately charge FindSet. 90 / 114
  • 231. For this, we have the following We can do the following Partition ranks into blocks. Put each rank r into block log∗ r, for r = 0, 1, ..., log n (Corollary 1). The highest-numbered block is log∗ (log n) = (log∗ n) − 1. In addition, the cost of FindSet pays for the following situations 1 The FindSet pays for the cost of the root and its child. 2 A bill is given to every node whose parent changes during the path compression!!! 91 / 114
  • 236. Now, define the Block function Define the following Upper Bound Function B(j) ≡ −1 if j = −1; 1 if j = 0; 2 if j = 1; 2^2^···^2 (a tower of j twos) if j ≥ 2. 92 / 114
  • 237. First Something Notable These are going to be the upper bounds for blocks in the ranks Where For j = 0, 1, ..., log∗ n − 1, block j consists of the set of ranks: B(j − 1) + 1, B(j − 1) + 2, ..., B(j) Elements in Block j (7) 93 / 114
  • 239. For Example We have that B(−1) = −1 B(0) = 1 B(1) = 2 B(2) = 2^2 = 4 B(3) = 2^2^2 = 2^4 = 16 B(4) = 2^2^2^2 = 2^16 = 65536 94 / 114
  • 245. For Example Thus, we have — Block j : Set of Ranks — 0 : 0, 1 — 1 : 2 — 2 : 3, 4 — 3 : 5, ..., 16 — 4 : 17, ..., 65536 — ... : ... Note B(j) = 2^B(j−1) for j > 0. 95 / 114
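The table above follows from iterating the recurrence B(j) = 2^B(j−1). A short sketch (the function name `B` matches the slides' notation; the code is ours) that reproduces the values:

```python
# Compute the block bounds B(j) from B(-1) = -1, B(0) = 1, B(j) = 2**B(j-1).
def B(j):
    if j == -1:
        return -1
    v = 1                     # B(0)
    for _ in range(j):        # apply the recurrence j times
        v = 2 ** v
    return v

table = [B(j) for j in range(5)]   # bounds for blocks 0..4
```

Block j then covers the ranks B(j−1)+1 through B(j), matching the table: block 0 holds ranks 0..1, block 3 holds ranks 5..16, and so on.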
  • 247. Example Now you give a Union(4, 5) [figure: a rank forest with nodes labeled node/rank, partitioned into Block 0, Block 1, and Block 2] 96 / 114
  • 248. Finally Given our bound on the ranks, all the blocks from 0 to (log∗ n) − 1 suffice to hold every rank. 97 / 114
  • 249. Charging for FindSets Two types of charges for FindSet(x0) Block charges and Path charges. Charge each node as either: 1) Block Charge 2) Path Charge 98 / 114
  • 250. Charging for FindSets Thus, for find sets The find operation pays for the work done for the root and its immediate child. It also pays for all the nodes which are not in the same block as their parents. 99 / 114
  • 252. Then First 1 All these nodes are children of some other nodes, so their ranks will not change and they are bound to stay in the same block until the end of the computation. 2 If a node is in the same block as its parent, it will be charged for the work done in the FindSet Operation!!! 100 / 114
  • 254. Thus We have the following charges Block Charge : For j = 0, 1, ..., log∗ n − 1, give one block charge to the last node with rank in block j on the path x_0, x_1, ..., x_l. Also give one block charge to the child of the root, i.e., x_{l−1}, and to the root itself, i.e., x_l. Path Charge : Give each node in x_0, ..., x_l a path charge until its parent's rank lies in a block different from its own. 101 / 114
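The operation being charged here is the standard two-pass FindSet with path compression; a minimal self-contained sketch (naming is ours, not the slides'):

```python
# Two-pass FindSet with path compression: after the call, every node on
# the find path x_0, ..., x_l points directly at the root x_l.
parent = {}

def make_set(x):
    parent[x] = x

def find_set(x):
    root = x
    while parent[root] != root:      # pass 1: walk up to the root x_l
        root = parent[root]
    while parent[x] != root:         # pass 2: repoint the whole path
        parent[x], x = root, parent[x]
    return root

for i in range(6):
    make_set(i)
for i in range(5):                   # build the chain 0 -> 1 -> ... -> 5
    parent[i] = i + 1
find_set(0)                          # compresses the entire chain
```

After the single find, every node on the path has the root as its parent, so subsequent finds on those nodes touch only the root and its child, which is exactly what the block/path charging scheme accounts for.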
  • 261. Next Something Notable The number of nodes whose parents are in different blocks is limited by (log∗ n) − 1, making it an upper bound for the charges to the last node with rank in block j. Plus 2 charges for the root and its child. Thus The cost of the block charges for a FindSet operation is upper-bounded by: log∗ n − 1 + 2 = log∗ n + 1. (8) 103 / 114
  • 266. Claim Claim Once a node other than a root or its child is given a Block Charge (B.C.), it will never be given a Path Charge (P.C.) 104 / 114
  • 267. Proof Proof Given a node x, we know that: rank [p [x]] − rank [x] is monotonically increasing ⇒ log∗ rank [p [x]] − log∗ rank [x] is monotonically increasing. Thus, once x and p[x] are in different blocks, they will always be in different blocks because: The rank of the parent can only increase. And the child’s rank stays the same. Thus, the node x will be billed in the first FindSet operation a path charge, and a block charge if necessary. After that, the node x will never be charged a path charge again, because it is already pointing at the representative of its set. 105 / 114
  • 274. Remaining Goal The Total cost of the FindSet’s Operations Total cost of FindSet’s = Total Block Charges + Total Path Charges. We want to show Total Block Charges + Total Path Charges = O(m log∗ n) 106 / 114
  • 276. Bounding Block Charges This part is easy Block numbers range over 0, ..., log∗ n − 1. The number of Block Charges per FindSet is ≤ log∗ n + 1. The total number of FindSet’s is ≤ m. The total number of Block Charges is ≤ m(log∗ n + 1). 107 / 114
  • 280. Bounding Path Charges Claim Let N(j) be the number of nodes whose ranks are in block j. Then, for all j ≥ 0, N(j) ≤ 3n/(2B(j)). Proof By Lemma 3, summing over all ranks in block j: N(j) ≤ Σ_{r=B(j−1)+1}^{B(j)} n/2^r. For j = 0: N(0) ≤ n/2^0 + n/2^1 = 3n/2 = 3n/(2B(0)). 108 / 114
  • 285. Proof of claim For j ≥ 1: N(j) ≤ (n/2^(B(j−1)+1)) Σ_{r=0}^{B(j)−(B(j−1)+1)} 1/2^r < (n/2^(B(j−1)+1)) Σ_{r=0}^{∞} 1/2^r = n/2^(B(j−1)) = n/B(j) — this is where the fact that B(j) = 2^(B(j−1)) is used — < 3n/(2B(j)). 109 / 114
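The geometric-series bound in the claim can be checked with exact arithmetic. This sketch (our own code) sums the Lemma 3 bound n/2^r over the ranks of each block and confirms it never exceeds 3n/(2B(j)):

```python
# Exact numeric check of the claim N(j) <= 3n / (2 * B(j)): sum the
# per-rank bound n / 2**r over the ranks in block j, i.e. over
# r = B(j-1)+1, ..., B(j). Fractions keep the comparison exact.
from fractions import Fraction

def B(j):                            # block bounds, as defined earlier
    if j == -1:
        return -1
    v = 1
    for _ in range(j):
        v = 2 ** v
    return v

n = 1 << 16
for j in range(4):                   # blocks 0..3 cover ranks up to 16
    total = sum(Fraction(n, 2 ** r)
                for r in range(B(j - 1) + 1, B(j) + 1))
    assert total <= Fraction(3 * n, 2 * B(j)), j
```

For j = 0 the bound is tight (n + n/2 = 3n/2); for larger j the geometric tail makes it loose, which is exactly the slack the proof exploits.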
  • 290. Bounding Path Charges We have the following Let P(n) denote the overall number of path charges. Then: P(n) ≤ Σ_{j=0}^{log∗ n − 1} α_j · β_j (9) where α_j is the max number of nodes with ranks in block j, and β_j is the max number of path charges per node of block j. 110 / 114
  • 293. Then, we have the following Upper Bounds By the claim, α_j is upper-bounded by 3n/(2B(j)). In addition, we need to bound β_j, which represents the maximum number of path charges for a node x in block j. Note: Any node in block j that is given a P.C. will still be in block j after all m operations. 111 / 114
  • 297. Now, we bound βj So, every time x is assessed a Path Charge, it gets a new parent with increased rank. Note: x’s rank is not changed by path compression. Suppose x has a rank in block j. Repeated Path Charges to x will ultimately result in x’s parent having a rank in a block higher than j. From that point onward, x is given Block Charges, not Path Charges. Therefore, in the worst case x has the lowest rank in block j, i.e., B(j − 1) + 1, and x’s parents’ ranks successively take on the values B(j − 1) + 2, B(j − 1) + 3, ..., B(j). 112 / 114
  • 302. Finally Hence, x can be given at most B(j) − B(j − 1) − 1 Path Charges. Therefore: P(n) ≤ Σ_{j=0}^{log∗ n − 1} (3n/(2B(j))) (B(j) − B(j − 1) − 1) ≤ Σ_{j=0}^{log∗ n − 1} (3n/(2B(j))) B(j) = (3/2) n log∗ n. 113 / 114
  • 306. Thus FindSet operations contribute O(m(log∗ n + 1) + n log∗ n) = O(m log∗ n) (10) MakeSet and Link contribute O(n). The entire sequence takes O(m log∗ n). 114 / 114
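To see how small the log* factor in this bound really is, here is a sketch of the iterated logarithm (our own helper, base 2): the number of times log2 must be applied before the value drops to 1 or below.

```python
# Iterated logarithm log*(n): how many times log2 must be applied to n
# before the result is <= 1. It grows absurdly slowly, which makes the
# O(m log* n) bound nearly linear for any practical n.
import math

def log_star(n):
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count
```

Even for n = 2^65536 (a number with tens of thousands of digits), log* n is only 5, so for every input that fits in a physical computer the amortized cost per operation is effectively constant.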