Multistage Networks
• Multistage networks consist of multiple stages of
switch boxes, and should be able to connect any
input to any output.
• A multistage network is called blocking if the
simultaneous connections of some multiple
input-output pairs may result in conflicts in the
use of switches or communication links.
• A nonblocking (more precisely, rearrangeable)
multistage network can perform all possible
connections between inputs and outputs by
rearranging its connections.
Multistage Networks
The schematic of a typical multistage interconnection network.
Multistage Omega Network
• One of the most commonly used
multistage interconnects is the Omega
network.
• This network consists of log p stages (logarithm
to base 2), where p is the number of inputs/outputs.
• At each stage, input i is connected to
output j if:
    j = 2i            for 0 ≤ i ≤ p/2 − 1
    j = 2i + 1 − p    for p/2 ≤ i ≤ p − 1
Network Topologies:
Multistage Omega Network
Each stage of the Omega network implements a perfect
shuffle as follows:
A perfect shuffle interconnection for eight inputs and outputs.
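The shuffle wiring for eight inputs can be tabulated with a minimal sketch. It assumes the standard perfect-shuffle mapping (j = 2i for the first half of the inputs, j = 2i + 1 − p for the second half); the function names are illustrative and not taken from the slides. The second function checks the equivalent view of the shuffle as a one-bit left rotation of the binary input index.

```python
def perfect_shuffle(i, p):
    """Output index that input i is wired to in one perfect-shuffle stage."""
    if i < p // 2:
        return 2 * i
    return 2 * i + 1 - p


def rotate_left(i, n_bits):
    """Rotate the n_bits-wide binary representation of i left by one bit."""
    mask = (1 << n_bits) - 1
    return ((i << 1) | (i >> (n_bits - 1))) & mask


p = 8
wiring = [perfect_shuffle(i, p) for i in range(p)]
print(wiring)  # [0, 2, 4, 6, 1, 3, 5, 7]

# The shuffle is the same permutation as a left rotation of the 3-bit index.
same = all(perfect_shuffle(i, p) == rotate_left(i, 3) for i in range(p))
print(same)  # True
```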
Multistage Omega Network
• The perfect shuffle patterns are connected
using 2×2 switches.
• The switches operate in two modes: crossover
or pass-through.
Two switching configurations of the 2 × 2 switch:
(a) Pass-through; (b) Cross-over.
Multistage Omega Network
A complete Omega network with the perfect shuffle
interconnects and switches can now be illustrated:
A complete omega network connecting eight inputs and eight outputs.
An omega network has p/2 × log p switching nodes, and
the cost of such a network grows as Θ(p log p).
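The switch count p/2 × log p quoted on this slide can be checked with a few lines of arithmetic. This is only a sketch: it assumes p is a power of two and that every stage holds p/2 two-by-two switches; the function name is illustrative.

```python
def omega_switch_count(p):
    """Number of 2x2 switches in an Omega network with p inputs (p a power of two)."""
    stages = p.bit_length() - 1      # log2(p) for a power of two
    return (p // 2) * stages


counts = {p: omega_switch_count(p) for p in (8, 64, 1024)}
print(counts)  # {8: 12, 64: 192, 1024: 5120}
```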
Multistage Omega Network – Routing
• Let s be the binary representation of the source
and d be that of the destination processor.
• The data traverses the link to the first switching
node. If the most significant bits of s and d are
the same, the switch routes the data in pass-through
mode; otherwise, it switches to crossover.
• This process is repeated for each of the log p
switching stages, using the next most significant
bit at each stage.
• Note that the Omega network is a blocking network.
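The routing rule can be traced in a small simulation. This is a sketch under stated assumptions: each stage is modeled as a perfect shuffle (one-bit left rotation of the line number) followed by a 2×2 switch, and lines are numbered 0 to p − 1 from the top. The function names and the line numbering are illustrative and may not match the labels used in the slide figures.

```python
def rotate_left(x, n):
    """Perfect shuffle: rotate the n-bit line number x left by one bit."""
    mask = (1 << n) - 1
    return ((x << 1) | (x >> (n - 1))) & mask


def omega_route(s, d, n):
    """Route from source s to destination d through n = log2(p) stages.

    Returns one (switch_mode, output_line) pair per stage.
    """
    hops = []
    pos = s
    for stage in range(n):
        pos = rotate_left(pos, n)        # shuffle into this stage
        bit = n - 1 - stage              # most significant bit first
        same = ((s >> bit) & 1) == ((d >> bit) & 1)
        if not same:
            pos ^= 1                     # crossover: leave on the other port
        hops.append(("pass-through" if same else "crossover", pos))
    return hops


route_a = omega_route(0b010, 0b111, 3)
route_b = omega_route(0b110, 0b100, 3)
print(route_a)  # [('crossover', 5), ('pass-through', 3), ('crossover', 7)]
print(route_b)  # [('pass-through', 5), ('crossover', 2), ('pass-through', 4)]

# Two messages conflict if they need the same output line of the same stage.
conflicts = [(stage, a[1]) for stage, (a, b) in enumerate(zip(route_a, route_b))
             if a[1] == b[1]]
print(conflicts)  # [(0, 5)]
```

In this model both messages need output line 5 of the first stage, so one of them must wait, which illustrates why the network is blocking.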
Multistage Omega Network – Routing
An example of blocking in an omega network: one of the messages
(010 to 111 or 110 to 100) is blocked at link AB.
Hypercube Interconnection
• A hypercube, or binary n-cube, multiprocessor
structure is composed of N = 2^n processors
interconnected in an n-dimensional binary cube.
• It is used in loosely coupled multiprocessor systems.
• Each processor forms a node of the cube.
• Each processor has a direct communication path
to n neighboring processors.
• Each processor is assigned one of the 2^n
distinct n-bit binary addresses.
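The neighbor relation can be sketched directly from the addresses: flipping one of the n address bits gives one of the n neighbors. The function name below is illustrative, not from the slides.

```python
def neighbors(node, n):
    """Addresses of the n nodes that differ from `node` in exactly one bit."""
    return [node ^ (1 << k) for k in range(n)]


n = 3
for node in range(2 ** n):
    labels = [format(m, "03b") for m in neighbors(node, n)]
    print(format(node, "03b"), "->", labels)
```

For node 000 this lists 001, 010 and 100.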
Hypercubes and their Construction
Construction of hypercubes from hypercubes of lower dimension.
Properties of Hypercubes
• The distance between any two nodes is at most
log p, where p = 2^n is the number of nodes.
• Each node has log p neighbors.
• The distance between two nodes is given by
the number of bit positions at which the two
nodes differ.
• Routing messages through an n-cube structure
may take from one to n links from a source node
to a destination node.
• For example, in a three-cube structure, node 000
can communicate directly with node 001.
• It must cross at least two links to communicate
with 011 (from 000 to 001 to 011 or from 000 to
010 to 011).
• It is necessary to go through at least three links to
communicate from node 000 to node 111.
• A routing procedure can be developed by
computing the exclusive-OR of the source node
address with the destination node address. The
1 bits of the result mark the dimensions along
which the message must still travel.
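The exclusive-OR procedure can be sketched as follows. The choice to correct the differing bits from the least significant dimension upward is an assumption made for illustration; any order of the differing dimensions gives a path of the same length. Function names are illustrative.

```python
def distance(src, dst):
    """Number of links on a shortest path: the bit positions where src and dst differ."""
    return bin(src ^ dst).count("1")


def hypercube_route(src, dst, n):
    """Nodes visited from src to dst, fixing one differing bit per hop."""
    path = [src]
    node = src
    diff = src ^ dst                 # 1 bits mark dimensions still to cross
    for dim in range(n):
        if diff & (1 << dim):
            node ^= 1 << dim         # move along this dimension
            path.append(node)
    return path


path = hypercube_route(0b000, 0b011, 3)
print([format(v, "03b") for v in path])  # ['000', '001', '011']
print(distance(0b000, 0b111))            # 3
```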
Cache Coherence
• In a shared-memory multiprocessor system, the
processors share memory and each also has local
memory (part or all of which is cache).
• For the system to execute memory instructions
independently, multiple copies of the same data
must be kept identical; this property is
called cache coherence.
Conditions for Incoherence
• Incoherence arises when processors need
to share writable data.
• It can occur under both the write-back and
the write-through policy.
• With DMA as well, an I/O processor (IOP) may
modify data in main memory that also resides in
a cache, and the cached copy is not updated.
Cache Coherence
in Multiprocessor Systems
Cache coherence in multiprocessor systems: (a) Invalidate protocol; (b)
Update protocol for shared variables.
When the value of a variable changes, all its copies
must either be invalidated or updated.
Cache Coherence:
Update and Invalidate Protocols
• If a processor just reads a value once and does
not need it again, an update protocol may
generate significant overhead.
• If two processors make interleaved tests and
updates to a variable, an update protocol is
better.
• Both protocols suffer from false sharing
overheads (two words that are not shared
but lie on the same cache line).
• Most current machines use invalidate protocols.
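False sharing comes down to address arithmetic: two unrelated words fall on the same cache line when their addresses share a line index. The sketch below assumes a 64-byte line and uses made-up addresses; neither value comes from the slides.

```python
LINE_SIZE = 64  # bytes per cache line; an assumed, typical value


def cache_line(addr):
    """Index of the cache line that holds the byte at address addr."""
    return addr // LINE_SIZE


addr_a = 0x1000   # word updated only by processor 0 (hypothetical address)
addr_b = 0x1008   # word updated only by processor 1 (hypothetical address)
addr_c = 0x1040   # a word one full line further on

falsely_shared = cache_line(addr_a) == cache_line(addr_b)
separate = cache_line(addr_a) != cache_line(addr_c)
print(falsely_shared, separate)  # True True
```

A write to the word at addr_a therefore triggers coherence traffic for the copy holding addr_b, even though the two processors never touch each other's word.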
Maintaining Coherence
Using Invalidate Protocols
• Each copy of a data item is associated with a state.
• One example of such a set of states is shared, invalid,
and dirty.
• In shared state, there are multiple valid copies of the
data item (and therefore, an invalidate would have to
be generated on an update).
• In dirty state, only one copy exists and therefore, no
invalidates need to be generated.
• In invalid state, the data copy is invalid, therefore, a
read generates a data request (and associated state
changes).
Maintaining Coherence
Using Invalidate Protocols
State diagram of a simple three-state coherence protocol.
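Because the state diagram itself is not reproduced here, the transition table below is an assumption: it follows the common three-state (invalid, shared, dirty) invalidate protocol as seen from one cache. The event names ("read", "write", "remote_read", "remote_write") and the action strings are illustrative labels, not taken from the figure.

```python
INVALID, SHARED, DIRTY = "invalid", "shared", "dirty"

# (current state, event) -> (next state, coherence action or None)
TRANSITIONS = {
    (INVALID, "read"): (SHARED, "fetch copy"),
    (INVALID, "write"): (DIRTY, "fetch copy, invalidate others"),
    (SHARED, "read"): (SHARED, None),
    (SHARED, "write"): (DIRTY, "invalidate others"),
    (DIRTY, "read"): (DIRTY, None),
    (DIRTY, "write"): (DIRTY, None),
    (INVALID, "remote_read"): (INVALID, None),
    (INVALID, "remote_write"): (INVALID, None),
    (SHARED, "remote_read"): (SHARED, None),
    (SHARED, "remote_write"): (INVALID, None),
    (DIRTY, "remote_read"): (SHARED, "flush"),
    (DIRTY, "remote_write"): (INVALID, "flush"),
}


def step(state, event):
    """Next state and action for one copy of a data item."""
    return TRANSITIONS[(state, event)]


trace = []
state = INVALID
for event in ["read", "write", "write", "remote_read", "remote_write"]:
    state, action = step(state, event)
    trace.append((event, state, action))
    print(event, "->", state, action)
```

Note that the second write in the trace stays in the dirty state with no action, which is the "no invalidates need to be generated" case.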
Maintaining Coherence
Using Invalidate Protocols
Serial execution of two instructions under the simple
three-state coherence protocol, treating x = x + y as a
load instruction on x (figure not reproduced).
Maintaining Coherence
Using Invalidate Protocols
Serial execution of two instructions under the simple
three-state coherence protocol, treating x = x + y as a
read instruction on x (figure not reproduced).
Maintaining Coherence
Using Invalidate Protocols
Parallel execution of two instructions under the simple
three-state coherence protocol, treating x = x + y as a
load and a store (write) instruction (figure not reproduced).
Snoopy Cache Systems
How are invalidates sent to the right processors?
In snoopy cache systems, all processors are connected by a
broadcast medium (typically a bus). Each cache listens to all
invalidates and read requests and performs the appropriate
coherence operations locally.
A simple snoopy bus based cache coherence system.
Performance of Snoopy Caches
• Once copies of data are tagged dirty, all
subsequent operations can be performed locally
on the cache without generating external traffic.
• If a data item is read by a number of processors,
it transitions to the shared state in the cache and
all subsequent read operations become local.
• If processors read and update data at the same
time, they generate coherence requests on the
bus, which is ultimately bandwidth limited.
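The three cases above can be compared by counting bus transactions in a toy model. This is a sketch, not a model of any real machine: it assumes a three-state invalidate protocol, a single variable per access, and one bus transaction per miss or invalidate; the class and method names are hypothetical.

```python
class SnoopyBus:
    """Toy bus that counts coherence transactions seen by every cache."""

    def __init__(self, n_caches):
        self.states = [{} for _ in range(n_caches)]  # per cache: variable -> state
        self.transactions = 0

    def read(self, cache, var):
        if self.states[cache].get(var, "invalid") != "invalid":
            return                        # local hit, no bus traffic
        self.transactions += 1            # read request broadcast on the bus
        for other, st in enumerate(self.states):
            if other != cache and st.get(var) == "dirty":
                st[var] = "shared"        # owner flushes, keeps a shared copy
        self.states[cache][var] = "shared"

    def write(self, cache, var):
        if self.states[cache].get(var) == "dirty":
            return                        # sole owner, no bus traffic
        self.transactions += 1            # invalidate broadcast on the bus
        for other, st in enumerate(self.states):
            if other != cache:
                st[var] = "invalid"
        self.states[cache][var] = "dirty"


# Case 1: one processor updates a dirty item repeatedly.
bus = SnoopyBus(4)
for _ in range(100):
    bus.write(0, "x")
local_only = bus.transactions

# Case 2: four processors each read the item many times.
bus = SnoopyBus(4)
for _ in range(100):
    for cache in range(4):
        bus.read(cache, "x")
read_shared = bus.transactions

# Case 3: two processors alternately read and update the item.
bus = SnoopyBus(4)
for _ in range(50):
    for cache in (0, 1):
        bus.read(cache, "x")
        bus.write(cache, "x")
ping_pong = bus.transactions

print(local_only, read_shared, ping_pong)  # 1 4 200
```

In this model the first two cases cost a fixed number of transactions regardless of how many accesses follow, while the interleaved read-and-update case puts traffic on the bus for every access.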
Directory Based Systems
• In snoopy caches, each coherence operation is
sent to all processors. This is an inherent
limitation.
• Why not send coherence requests to only
those processors that need to be notified?
• This is done using a directory, which maintains
a presence vector for each data item (cache
line) along with its global state.
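A presence vector can be sketched as one bit per processor for each cache line. The class below is a hypothetical illustration with assumed state names ("shared", "dirty"); it only tracks who holds a copy and who must be notified on a write, and omits data movement entirely.

```python
class Directory:
    """Per-line presence bit vector plus a global state (illustrative sketch)."""

    def __init__(self, n_procs):
        self.n_procs = n_procs
        self.presence = {}   # line -> int used as a bit vector, bit p = processor p
        self.state = {}      # line -> "shared" or "dirty"

    def sharers(self, line):
        bits = self.presence.get(line, 0)
        return [p for p in range(self.n_procs) if bits & (1 << p)]

    def read(self, proc, line):
        """Record that proc now holds a copy of line."""
        self.presence[line] = self.presence.get(line, 0) | (1 << proc)
        self.state[line] = "shared"

    def write(self, proc, line):
        """Return only the processors that must be sent an invalidate."""
        targets = [p for p in self.sharers(line) if p != proc]
        self.presence[line] = 1 << proc
        self.state[line] = "dirty"
        return targets


directory = Directory(8)
directory.read(1, "x")
directory.read(5, "x")
invalidated = directory.write(1, "x")
print(invalidated)               # [5]
print(directory.sharers("x"))    # [1]
```

Only processor 5 is notified here; the other six processors see no coherence traffic, in contrast to a broadcast on a snoopy bus.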
Directory Based Systems
Architecture of typical directory based systems: (a) a centralized
directory; and (b) a distributed directory.
Performance of
Directory Based Schemes
• The need for a broadcast medium is replaced by
the directory.
• The additional bits to store the directory may
add significant overhead.
• The underlying network must be able to carry
all the coherence requests.
• A centralized directory is a point of contention;
therefore, distributed directory schemes must
be used.
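The storage overhead can be estimated with simple arithmetic. The sketch assumes a full bit-vector directory (one presence bit per processor for every cache line) plus a small number of state bits; the default of 2 state bits and the 64-byte line size are assumptions, not figures from the slides.

```python
def directory_overhead(n_procs, line_bytes, state_bits=2):
    """Directory bits per cache line divided by data bits per cache line."""
    return (n_procs + state_bits) / (line_bytes * 8)


small = directory_overhead(64, 64)      # 64 processors, 64-byte lines
large = directory_overhead(1024, 64)    # 1024 processors, 64-byte lines
print(f"{small:.1%} {large:.1%}")
```

Under these assumptions the directory adds roughly an eighth of the data storage for 64 processors, but about twice the data storage for 1024 processors, since the presence vector grows linearly with the processor count.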






Multiprocessor
