CH 21

Chapter 21 discusses parallel and distributed storage in database systems, focusing on data partitioning techniques such as horizontal, vertical, round-robin, hash, and range partitioning. It evaluates these techniques based on their efficiency for various data access types, including scanning, point queries, and range queries. The chapter also addresses issues of data distribution skew and introduces the concept of virtual node partitioning to improve load balancing across nodes.

Uploaded by

udemydummyacc

Chapter 21: Parallel and Distributed Storage

Database System Concepts, 7th Ed.

©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use
Parallel Storage

Data Partitioning (1)
▪ In its simplest form, I/O parallelism refers to reducing the time required to retrieve relations from disk by partitioning the relations on multiple disks, on multiple nodes (servers).
  • We focus on parallelism across nodes.
  • The same techniques can be used across disks on a node.
▪ Two main approaches:
  • Horizontal partitioning: the tuples of a relation are divided among many nodes such that some subset of the tuples resides on each node.
  • Vertical partitioning: e.g., r(A,B,C,D) with primary key A is split into r1(A,B) and r2(A,C,D) (discussed in Chapter 13).
  • By default, the word partitioning refers to horizontal partitioning.
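
As a rough illustration of the two approaches, the sketch below splits a small relation both ways. The sample rows and the function names (`horizontal_partition`, `vertical_partition`, `rejoin`) are assumptions made for this example, and the horizontal split uses round-robin placement only because it is the simplest choice.

```python
# Hypothetical relation r(A, B, C, D) with primary key A; the rows are made-up sample data.
r = [
    {"A": 1, "B": "x", "C": 10, "D": True},
    {"A": 2, "B": "y", "C": 20, "D": False},
    {"A": 3, "B": "z", "C": 30, "D": True},
    {"A": 4, "B": "w", "C": 40, "D": False},
]


def horizontal_partition(rows, n):
    """Divide whole tuples among n nodes (round-robin, purely for illustration)."""
    nodes = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        nodes[i % n].append(row)
    return nodes


def vertical_partition(rows):
    """Split r(A,B,C,D) into r1(A,B) and r2(A,C,D); both keep the key A."""
    r1 = [{"A": t["A"], "B": t["B"]} for t in rows]
    r2 = [{"A": t["A"], "C": t["C"], "D": t["D"]} for t in rows]
    return r1, r2


def rejoin(r1, r2):
    """Reconstruct r by joining r1 and r2 on the key A."""
    by_key = {t["A"]: t for t in r2}
    return [{**t, **by_key[t["A"]]} for t in r1]


nodes = horizontal_partition(r, 2)   # each node holds a subset of the tuples
r1, r2 = vertical_partition(r)       # each fragment holds a subset of the attributes
```

Because both vertical fragments keep the primary key A, the original relation can be rebuilt with a join on A, which is what `rejoin` does.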

Data Partitioning (2)
▪ Partitioning techniques (number of nodes = $n$):
Round-robin:
  • Send the $i$-th tuple inserted in the relation to node $i \bmod n$.
Hash partitioning:
  • Choose one or more attributes as the partitioning attributes.
  • Choose a hash function $h$ with range $0 \ldots n-1$.
  • Let $i$ denote the result of hash function $h$ applied to the partitioning attribute value of a tuple. Send the tuple to node $i$.

▪ Range partitioning:
  • Choose an attribute as the partitioning attribute.
  • A partitioning vector $[v_0, v_1, \ldots, v_{n-2}]$ is chosen.
  • If $i < j$, then $v_i < v_j$.
  • Consider a tuple $t$ such that $t[A] = x$, where $A$ is the partitioning attribute.
    ▪ If $x < v_0$, then $t$ goes to node $N_0$.
    ▪ If $x \geq v_{n-2}$, then $t$ goes to node $N_{n-1}$.
    ▪ If $v_i \leq x < v_{i+1}$, then $t$ goes to node $N_{i+1}$.
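
The three placement rules can be sketched as small functions that return a node number in 0 to n-1. This is a minimal sketch, not a definitive implementation: the function names are hypothetical, SHA-256 is only a stand-in for the unspecified hash function, and the partitioning vector is assumed to be a sorted Python list.

```python
import bisect
import hashlib


def round_robin_node(i, n):
    """Node for the i-th inserted tuple: i mod n."""
    return i % n


def hash_node(value, n):
    """Node for a partitioning-attribute value under hash partitioning.

    SHA-256 is an arbitrary stand-in for the hash function (an assumption);
    it is used because Python's built-in hash() of strings varies between runs.
    """
    digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n


def range_node(x, vector):
    """Node for value x given a sorted partitioning vector [v_0, ..., v_{n-2}].

    x < v_0 maps to node 0, v_i <= x < v_{i+1} maps to node i+1,
    and x >= v_{n-2} maps to node n-1.
    """
    # bisect_right counts the vector entries that are <= x,
    # which is exactly the node number defined by the three rules.
    return bisect.bisect_right(vector, x)


vector = [10, 20, 30]  # hypothetical vector for n = 4 nodes
placement = [range_node(x, vector) for x in (5, 10, 19, 20, 30, 100)]
```

With this vector, the values 5, 10, 19, 20, 30, and 100 land on nodes 0, 1, 1, 2, 3, and 3 respectively.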

Comparison of Partitioning Techniques (1)
▪ Evaluate how well the partitioning techniques (round-robin, hash partitioning, range partitioning) support the following types of data access:
1. Scanning the entire relation.
  ▪ SQL: select * from r
2. Locating a tuple associatively (point queries).
  ▪ E.g., $r.A = 25$.
  ▪ SQL: select * from r where r.A = 25
3. Locating all tuples such that the value of a given attribute lies within a specified range (range queries).
  ▪ E.g., $10 \leq r.A < 25$.
  ▪ SQL: select * from r where 10 <= r.A and r.A < 25

Comparison of Partitioning Techniques (2)
Round robin:
▪ Best suited for a sequential scan of the entire relation on each query.
  • All nodes have almost an equal number of tuples; retrieval work is thus well balanced between nodes.
▪ All queries must be processed at all nodes, because tuple placement does not depend on attribute values.
Hash partitioning:
▪ Good for sequential access.
  • Assuming the hash function is good and the partitioning attributes form a key, tuples will be equally distributed between nodes.
▪ Good for point queries on the partitioning attribute.
  • Can look up a single node, leaving the others available for answering other queries.
▪ Range queries are inefficient: they must be processed at all nodes.
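
The comparison above can be restated as a routing rule: given a scheme and a query, which nodes must take part? The sketch below is illustrative only. The function name `nodes_for_query`, the tuple encoding of queries, and the default hash function (the identity on integers, reduced mod n) are assumptions, and point queries are assumed to be on the partitioning attribute.

```python
def nodes_for_query(scheme, n, query, h=lambda v: v):
    """Return the set of nodes that must process a query.

    scheme: "round_robin" or "hash".
    query:  ("scan",), ("point", value) or ("range", lo, hi),
            where point and range queries are on the partitioning attribute.
    h:      stand-in hash function (an assumption of this sketch).
    """
    all_nodes = set(range(n))
    if scheme == "round_robin":
        # Placement ignores attribute values, so every node may hold a match.
        return all_nodes
    if scheme == "hash":
        if query[0] == "point":
            # Only the node the value hashes to can hold matching tuples.
            return {h(query[1]) % n}
        # Scans read everything; hashing scatters a value range over all nodes.
        return all_nodes
    raise ValueError(f"unknown scheme: {scheme}")


point = ("point", 25)          # select * from r where r.A = 25
rng = ("range", 10, 25)        # select * from r where 10 <= r.A and r.A < 25
```

With 4 nodes, the point query touches one node under hash partitioning and all four under round-robin, while the range query touches all four under both schemes.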

Comparison of Partitioning Techniques (3)
Range partitioning:
▪ Provides data clustering by partitioning attribute value.
  • Good for sequential access.
  • Good for point queries on the partitioning attribute: only one node needs to be accessed.
▪ For range queries on the partitioning attribute, one to a few nodes may need to be accessed.
  • The remaining nodes are available for other queries.
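
To see why a range query touches only one to a few nodes under range partitioning, the following sketch computes the nodes that can hold values in a half-open range lo <= x < hi. The function name `range_query_nodes` and the example vector are assumptions; nodes are numbered from 0 and the vector is assumed sorted.

```python
import bisect


def range_query_nodes(lo, hi, vector):
    """Nodes that may hold values x with lo <= x < hi.

    vector is a sorted partitioning vector; a value x is stored on node
    bisect_right(vector, x), i.e. the number of vector entries <= x.
    """
    if lo >= hi:
        return []  # empty range, no node needs to be accessed
    first = bisect.bisect_right(vector, lo)  # node holding lo
    # Values just below hi are on the node numbered by the count of
    # vector entries strictly less than hi.
    last = bisect.bisect_left(vector, hi)
    return list(range(first, last + 1))


vector = [10, 20, 30]                        # hypothetical vector for 4 nodes
hit = range_query_nodes(10, 25, vector)      # 10 <= r.A and r.A < 25
```

For this vector the query `10 <= r.A and r.A < 25` needs nodes 1 and 2 only, leaving nodes 0 and 3 free for other work.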

Types of Skew
▪ Data-distribution skew: some nodes have many tuples, while others have fewer. It can occur due to:
  • Attribute-value skew.
    ▪ Some partitioning-attribute values appear in many tuples.
    ▪ All the tuples with the same value for the partitioning attribute end up in the same partition.
    ▪ Can occur with range partitioning and hash partitioning.
  • Partition skew.
    ▪ Imbalance, even without attribute-value skew.
    ▪ A badly chosen range-partition vector may assign too many tuples to some partitions and too few to others.
    ▪ Less likely with hash partitioning.
▪ Execution skew can occur even without data-distribution skew.
  • E.g., a relation is range-partitioned on date, and most queries access tuples with recent dates.
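
Both kinds of data-distribution skew can be made concrete by counting tuples per node. The sketch below is a toy illustration: the data sets, the `v % n` hash function, the partitioning vectors, and the skew measure (maximum load divided by average load) are all assumptions chosen for this example rather than definitions from the chapter.

```python
import bisect


def tuples_per_node(values, assign, n):
    """Count how many tuples each of the n nodes receives."""
    counts = [0] * n
    for v in values:
        counts[assign(v)] += 1
    return counts


def skew_ratio(counts):
    """Maximum load divided by average load; 1.0 means perfectly balanced."""
    return max(counts) / (sum(counts) / len(counts))


n = 4

# Attribute-value skew: the value 7 appears in 90 of 100 tuples, so whichever
# node receives 7 is overloaded no matter how good the hash function is.
skewed_values = [7] * 90 + list(range(10))
attr_counts = tuples_per_node(skewed_values, lambda v: v % n, n)

# Partition skew: values are uniform, but the vector is badly chosen.
uniform_values = list(range(100))
bad_vector = [80, 90, 95]
good_vector = [25, 50, 75]
bad_counts = tuples_per_node(
    uniform_values, lambda v: bisect.bisect_right(bad_vector, v), n)
good_counts = tuples_per_node(
    uniform_values, lambda v: bisect.bisect_right(good_vector, v), n)
```

In this toy data the repeated value gives one node 92 of the 100 tuples, and the badly chosen vector gives one node 80 of them, while the evenly spaced vector yields 25 tuples per node.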

Virtual Node Partitioning
▪ Key idea: pretend there are several times (10x to 20x) as many virtual nodes as real nodes.
  • Virtual nodes are mapped to real nodes.
  • Tuples are partitioned across virtual nodes.
  • It can be used to support any of the partitioning techniques discussed before.
▪ Mapping of virtual nodes to real nodes:
  • Round-robin: virtual node $i$ is mapped to real node $(i \bmod n) + 1$, with real nodes numbered 1 to $n$.
  • Mapping table: a mapping table virtual_to_real_map[] tracks which virtual node is on which real node.
    ▪ Allows skew to be handled by moving virtual nodes from more loaded nodes to less loaded nodes.
    ▪ Both data-distribution skew and execution skew can be handled.
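
A mapping table makes rebalancing a matter of reassigning entries. The sketch below is one possible way to do it, not the method of any particular system: the helper names, the per-virtual-node load figures, and the greedy move rule are assumptions, and real nodes are numbered from 0 here for convenient list indexing.

```python
def build_round_robin_map(num_virtual, num_real):
    """Initial virtual_to_real_map: virtual node i goes to real node i mod n
    (0-based numbering, an assumption of this sketch)."""
    return [i % num_real for i in range(num_virtual)]


def real_node_loads(virtual_loads, virtual_to_real_map, num_real):
    """Sum the load (tuples stored or queries served) of each real node."""
    loads = [0] * num_real
    for v, load in enumerate(virtual_loads):
        loads[virtual_to_real_map[v]] += load
    return loads


def move_one_virtual_node(virtual_loads, virtual_to_real_map, num_real):
    """Move one virtual node from the most loaded real node to the least
    loaded one, if that narrows the gap. Returns True when a move was made."""
    loads = real_node_loads(virtual_loads, virtual_to_real_map, num_real)
    src = loads.index(max(loads))
    dst = loads.index(min(loads))
    gap = loads[src] - loads[dst]
    # Only moves smaller than the gap reduce the imbalance between src and dst.
    candidates = [v for v, real in enumerate(virtual_to_real_map)
                  if real == src and 0 < virtual_loads[v] < gap]
    if not candidates:
        return False
    # Prefer the virtual node whose load is closest to half the gap.
    best = min(candidates, key=lambda v: abs(virtual_loads[v] - gap / 2))
    virtual_to_real_map[best] = dst
    return True


virtual_loads = [50, 5, 40, 5, 30, 5, 20, 5]   # hypothetical per-virtual-node loads
virtual_to_real_map = build_round_robin_map(8, 2)
before = real_node_loads(virtual_loads, virtual_to_real_map, 2)
while move_one_virtual_node(virtual_loads, virtual_to_real_map, 2):
    pass
after = real_node_loads(virtual_loads, virtual_to_real_map, 2)
```

The load figures can stand for stored tuples (data-distribution skew) or for query traffic (execution skew); the same table update handles either case. In this example the real-node loads go from 140 and 20 to 90 and 70 after a single move.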
