22/12/2005 Distributed Shared-Memory
Architectures by Seda Demirağ
Distributed Shared-Memory Architectures
by Seda Demirağ
According to Flynn (1972), the parallelism of computers can be categorized as follows:
• Single instruction stream, single data stream (SISD)
• Single instruction stream, multiple data streams (SIMD)
• Multiple instruction streams, single data stream (MISD)
• Multiple instruction streams, multiple data streams (MIMD)
We can further classify MIMD structures into two groups:
• Centralized (symmetric) shared-memory architectures
• Distributed shared-memory architectures
Centralized (symmetric) shared-memory architecture
• Caches can contain either private or shared data. Caching shared data causes the cache coherence problem.
• Access time to all memory is uniform from all processors.
Distributed memory architecture
• Memory is distributed among the nodes to support larger processor counts.
• Some processors may be connected by a single bus, but this is less scalable than a global interconnection network.
• It is a cost-effective way to scale memory bandwidth (if most accesses are to local memory).
• However, communicating data between processors becomes more complex and has higher latency.
Models for Communication Among Processors
There are two alternative architectural approaches that differ in the method used for communicating data among processors:
• Distributed shared-memory (DSM) architectures: communication occurs through a shared address space.
• Multicomputers (clusters): the address space consists of multiple private address spaces that are logically disjoint.
Message-passing multiprocessors: data is communicated by explicitly passing messages among the processors. To access or operate on remote data, a processor sends a message to the receiver; the receiver performs the operation and sends the result back.
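The request/reply exchange described for message-passing multiprocessors can be sketched roughly as follows. This is only an illustration under stated assumptions: two Python threads stand in for processors, `queue.Queue` objects stand in for the interconnect, and the names `owner_node`, `remote_read` and the `("read", addr)` message layout are hypothetical.

```python
import queue
import threading

# Hypothetical illustration: the owner keeps data in a private address
# space (a local dict); other processors reach it only through messages.

def owner_node(requests, replies, private_memory):
    """Receiver: performs the requested operation and sends the result back."""
    while True:
        msg = requests.get()
        if msg is None:              # shutdown signal (an assumption of this sketch)
            break
        op, addr = msg
        if op == "read":
            replies.put(private_memory[addr])

def remote_read(requests, replies, addr):
    """Sender: asks the owner for the data instead of loading it directly."""
    requests.put(("read", addr))
    return replies.get()

requests, replies = queue.Queue(), queue.Queue()
owner = threading.Thread(target=owner_node, args=(requests, replies, {"X": 42}))
owner.start()
value = remote_read(requests, replies, "X")
requests.put(None)                   # tell the owner thread to stop
owner.join()
print(value)
```

In a DSM machine the same access would be an ordinary load from the shared address space, with the hardware generating the remote request.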
Distributed shared-memory architecture
The first DSM architectures appeared in the late 1970s and continued through the early 1980s, embodied in three machines: the Carnegie Mellon Cm*, the IBM RP3, and the BBN Butterfly.
In uniprocessors, the long access time to memory is largely hidden through the use of caches. Unfortunately, adapting caches to work in a multiprocessor environment is difficult.
When used in a multiprocessor, caching introduces an additional problem: cache coherence, which arises when different processors cache and update values of the same memory location.
What is cache coherence?
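The following example, with two processors A and B and one memory location X, illustrates the problem:

Time  Event       Cache content in A  Cache content in B  Memory content for X
0                 *                   *                   1
1     A reads X   1                   *                   1
2     B reads X   1                   1                   1
3     A writes X  0                   1                   0

(* marks a cache that holds no copy of X.) After A's write at time 3, B's cache still holds the old value 1, so a read of X by B returns stale data.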
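The sequence in the table can be reproduced with a toy model. The sketch below assumes write-through caches with no coherence mechanism at all; the dictionaries and the `read`/`write` helpers are hypothetical names used only for illustration.

```python
# Toy model: two private caches over one memory, write-through,
# with no coherence protocol (an assumption matching the table above).
memory = {"X": 1}
cache = {"A": {}, "B": {}}

def read(cpu, addr):
    # On a miss, fetch from memory; on a hit, use the cached copy.
    if addr not in cache[cpu]:
        cache[cpu][addr] = memory[addr]
    return cache[cpu][addr]

def write(cpu, addr, value):
    # Write-through: update the writer's cache and memory,
    # but leave every other cache untouched.
    cache[cpu][addr] = value
    memory[addr] = value

read("A", "X")          # time 1
read("B", "X")          # time 2
write("A", "X", 0)      # time 3
stale = read("B", "X")  # B hits in its own cache and sees the old value
print(stale, memory["X"])
```

B's read hits in its own cache, so it never sees that memory and A's cache have moved on. This is the situation a coherence protocol has to prevent.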
A memory system is coherent when it satisfies the following conditions:
• A read by a processor that follows a write by the same processor to the same location, with no intervening write by another processor, always returns the written value.
• A read by P2 that follows a write by P1 to the same location returns the value written by P1, provided the read and write are sufficiently separated in time and no other write to the location occurs in between.
• Two writes to the same location by any two processors are seen in the same order by all processors.
Together, these conditions ensure that the caches never hold conflicting values for a shared location.
DSM architectures that exclude cache coherence:
• These systems have caches, but shared data are marked as uncacheable and only private data are kept in the caches.
• Software can still cache shared data by copying it from the shared portion of the address space to the local private portion of the address space, which is cached.
• Coherence is then controlled by software. The advantage is that little hardware support is needed.
Protocols for cache coherence:
• Snooping protocol
• Directory protocol
Snooping Protocol:
In a snooping system, all caches on the bus monitor (snoop) the bus to determine whether they have a copy of the block of data that is requested on the bus. Every cache keeps the sharing status of every block of physical memory that it holds.
There are two types of snooping protocol:
• Write-invalidate: the processor that is writing data causes the copies in the caches of all other processors to be invalidated before it changes its local copy. It does this by sending an invalidation signal over the bus, which causes all of the other caches to check whether they hold a copy of the invalidated block. Once the other copies have been invalidated, the local copy can be updated freely until another processor requests the block.
• Write-update: the processor that is writing the data broadcasts the new data over the bus (without issuing an invalidation signal). All caches that contain copies of the data are then updated. Unlike write-invalidate, this scheme does not leave the writer holding the only valid copy.
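The difference between the two snooping variants can be sketched with a small simulation. This is a simplified model under stated assumptions, not a real protocol implementation: the `Bus` and `SnoopingCache` classes are hypothetical, memory is updated write-through on every write, and bus arbitration and block states are ignored.

```python
class Bus:
    """Shared bus: every attached cache sees every broadcast."""
    def __init__(self):
        self.caches = []

    def broadcast(self, sender, kind, addr, value=None):
        for c in self.caches:
            if c is not sender:
                c.snoop(kind, addr, value)

class SnoopingCache:
    def __init__(self, bus, memory, policy):
        self.bus, self.memory, self.policy = bus, memory, policy
        self.lines = {}                      # addr -> cached value
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:           # miss: fetch from memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        kind = "invalidate" if self.policy == "write-invalidate" else "update"
        self.bus.broadcast(self, kind, addr, value)
        self.lines[addr] = value
        self.memory[addr] = value            # write-through (assumption)

    def snoop(self, kind, addr, value):
        if addr in self.lines:
            if kind == "invalidate":
                del self.lines[addr]         # drop the stale copy
            else:
                self.lines[addr] = value     # refresh the copy in place

def demo(policy):
    memory = {"X": 1}
    bus = Bus()
    a = SnoopingCache(bus, memory, policy)
    b = SnoopingCache(bus, memory, policy)
    a.read("X")
    b.read("X")
    a.write("X", 0)
    still_cached = "X" in b.lines            # does B keep a copy after A's write?
    return still_cached, b.read("X")

print(demo("write-invalidate"))
print(demo("write-update"))
```

In both cases B reads the new value. Under write-invalidate B has lost its copy and must miss to get it, whereas under write-update B keeps a refreshed copy at the cost of broadcasting the data on every write.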
Directory-Based Cache Coherence Protocols
Each directory is responsible for tracking the caches that share the memory addresses of the portion of memory in its node.
The directory must track the state of each cache block. The states are Shared, Uncached, and Exclusive.
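One way to picture the per-block bookkeeping is the data-structure sketch below. The three states come from the slide. Everything else is an assumption made for illustration: the `DirectoryEntry` layout, the node and block counts, and the mapping of contiguous block ranges to home nodes in `home_node`.

```python
from dataclasses import dataclass, field
from enum import Enum

class State(Enum):
    UNCACHED = "uncached"
    SHARED = "shared"
    EXCLUSIVE = "exclusive"

@dataclass
class DirectoryEntry:
    state: State = State.UNCACHED
    sharers: set = field(default_factory=set)   # processor numbers holding a copy

NUM_NODES = 4          # hypothetical machine size
BLOCKS_PER_NODE = 8    # hypothetical amount of memory per node, in blocks

def home_node(block):
    """Assumed mapping: each node is home to a contiguous range of blocks."""
    return block // BLOCKS_PER_NODE

# One directory per node, covering only the blocks that live in that node.
directories = [
    {b: DirectoryEntry()
     for b in range(n * BLOCKS_PER_NODE, (n + 1) * BLOCKS_PER_NODE)}
    for n in range(NUM_NODES)
]

entry = directories[home_node(19)][19]
print(home_node(19), entry.state, entry.sharers)
```

Because each directory covers only its own node's memory, the directory storage is distributed along with the memory instead of being a single central bottleneck.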
The possible messages sent among
nodes to maintain coherence,
along with source and destination
node. (P = requesting processor
number, A = requested address,
D = data contents.)
Example of Directory Protocol
State transition diagram for an
individual cache block in a
directory-based system:
Requests by the local processor
are shown in black and those from
home directory are shown in gray.
Example of Directory Protocol
The state transition diagram for the
directory: All actions are in gray
because they are all externally caused.
Bold indicates the action taken by
the directory in response to the request.
Bold italics indicate an action that
updates the sharing set, Sharers.
Example of Directory Protocol (cont’d)
When the block is in the uncached state:
• Read miss: The requesting processor is sent the requested data from memory, and the requester is made the only sharing node. The state of the block is made shared.
• Write miss: The requesting processor is sent the value and becomes the sharing node. The block is made exclusive to indicate that the only valid copy is cached. Sharers indicates the identity of the owner.
Example of Directory Protocol (cont’d)
When the block is in the shared state:
• Read miss: The requesting processor is sent the requested data from memory and is added to the sharing set.
• Write miss: The requesting processor is sent the value. All processors in the set Sharers are sent invalidate messages, and the Sharers set is changed to contain only the identity of the requesting processor. The state of the block is made exclusive.
Example of Directory Protocol (cont’d)
When the block is in the exclusive state:
• Read miss: The owner processor is sent a data fetch message. The identity of the requesting processor is added to the set Sharers, which still contains the identity of the processor that was the owner. The state of the block is made shared.
• Data write back: The owner processor is replacing the block and therefore must write it back. This write back makes the memory copy up to date; the block is now uncached and the Sharers set is empty.
• Write miss: The block has a new owner. A message is sent to the old owner, causing its cache to invalidate the block and send the value to the directory. Sharers is set to the identity of the new owner, and the state of the block remains exclusive.
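The directory transitions for the uncached, shared, and exclusive states can be collected into one small state machine. The sketch below is a simplified model for a single block, not a full protocol: the `Directory` class is hypothetical, the message names in `log` (in particular "fetch/invalidate" and "data value reply") are assumed labels, a requester that is already a sharer is skipped when invalidating, and races between requests are ignored.

```python
class Directory:
    """Directory entry for one block; processors are identified by number."""
    def __init__(self):
        self.state = "uncached"
        self.sharers = set()
        self.log = []                      # messages the directory would send

    def read_miss(self, p):
        if self.state == "exclusive":
            (owner,) = self.sharers
            self.log.append(("fetch", owner))      # owner supplies current data
        self.log.append(("data value reply", p))
        self.sharers.add(p)                # the old owner, if any, stays a sharer
        self.state = "shared"

    def write_miss(self, p):
        if self.state == "shared":
            for s in sorted(self.sharers - {p}):
                self.log.append(("invalidate", s))
        elif self.state == "exclusive":
            (owner,) = self.sharers
            self.log.append(("fetch/invalidate", owner))
        self.log.append(("data value reply", p))
        self.sharers = {p}                 # the requester becomes the owner
        self.state = "exclusive"

    def write_back(self, p):
        # Only the current owner can write the block back.
        assert self.state == "exclusive" and self.sharers == {p}
        self.sharers = set()
        self.state = "uncached"

d = Directory()
trace = []
for action, p in [("read_miss", 1), ("read_miss", 2), ("write_miss", 3),
                  ("read_miss", 1), ("write_miss", 2), ("write_back", 2)]:
    getattr(d, action)(p)
    trace.append((d.state, sorted(d.sharers)))

for step in trace:
    print(step)
```

The trace walks through every transition listed on the three slides: uncached to shared, shared to exclusive, exclusive back to shared on a read miss, and exclusive to uncached on a write back.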
Performance of DSM Multiprocessors
In DSM architectures, the distribution of memory requests between local and remote memory is key to performance, because it affects both the bandwidth consumed and the latency seen by requests.
In the performance example we will separate the cache misses into local and remote requests.
We will also compare how performance changes for the computational kernels FFT and LU and for the applications Barnes and Ocean.
Performance of DSM Multiprocessors (cont’d)
The miss rates with these cache sizes are not affected much by changes in processor count, with the exception of Ocean. The rise in Ocean's miss rate at 64 processors results from two factors:
• An increase in mapping conflicts in the cache, which occur when the grid becomes small and lead to a rise in local misses.
• An increase in the number of coherence misses, which are all remote.
Performance of DSM Multiprocessors (cont’d)
This figure shows how the miss rates change as the cache size is increased, assuming a 64-processor execution and 64-byte blocks. By the time we reach the largest cache size shown, 512 KB, the remote miss rate is equal to or greater than the local miss rate.
Performance of DSM Multiprocessors (cont’d)
We examine the effect of changing the block size in this example. Increases in block size reduce the miss rate, even for large blocks, although the performance benefits of going to the largest blocks are small. Most of the improvement in miss rate comes from a reduction in the local misses.
Performance of DSM Multiprocessors (cont’d)
The number of bytes per data reference
climbs steadily as block size is increased.
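Why traffic can rise even while the miss rate falls can be seen from a rough calculation. The sketch below assumes that the bytes moved per data reference are approximately the miss rate times the block size (ignoring protocol messages), and the miss rates used are made-up illustrative numbers, not measurements from the figure.

```python
def bytes_per_reference(miss_rate, block_size_bytes):
    """Assumed model: every miss transfers one whole block."""
    return miss_rate * block_size_bytes

# Hypothetical miss rates that fall as the block size (in bytes) grows.
hypothetical_miss_rates = {16: 0.08, 32: 0.05, 64: 0.035, 128: 0.03}

traffic = {size: bytes_per_reference(rate, size)
           for size, rate in hypothetical_miss_rates.items()}

for size, value in traffic.items():
    print(size, round(value, 2))
```

In this made-up data the miss rate falls more slowly than the block size grows, so each doubling of the block size increases the bytes transferred per reference.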
Performance of DSM Multiprocessors (cont’d)
The effective latency of memory references
in a DSM multiprocessor depends both on
the relative frequency of cache misses and
on the location of the memory where the
accesses are served.
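This dependence can be written as a simple average-latency calculation. The formula below is a common first-order model rather than anything taken from the slides, and all parameter values (cycle counts, miss rate, remote fraction) are hypothetical.

```python
def average_memory_latency(hit_time, miss_rate, remote_fraction,
                           local_latency, remote_latency):
    """First-order model: a miss is served locally or remotely,
    weighted by the fraction of misses that go to remote memory."""
    miss_penalty = ((1 - remote_fraction) * local_latency
                    + remote_fraction * remote_latency)
    return hit_time + miss_rate * miss_penalty

# Hypothetical numbers, in processor cycles.
mostly_local = average_memory_latency(hit_time=1, miss_rate=0.02,
                                      remote_fraction=0.1,
                                      local_latency=100, remote_latency=400)
mostly_remote = average_memory_latency(hit_time=1, miss_rate=0.02,
                                       remote_fraction=0.9,
                                       local_latency=100, remote_latency=400)
print(mostly_local, mostly_remote)
```

With the same miss rate, shifting misses from local to remote memory raises the average latency, which is why the local/remote split is tracked separately in the measurements.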
REFERENCES:
• Andrew S. T., Maarten V. S., Distributed Systems, 2002
• John L. H., David A. P., Computer Architecture: A Quantitative Approach, 2003
• Abraham S., Peter B. G., Greg G., Operating Systems Concepts, 2003
• Jinseok K., Gyungho L., Binding Time in Distributed Shared Memory
Architectures, 1998 International Conference on Parallel Processing.
• Bill N., Virginia L., Distributed Shared Memory: A Survey of Issues and
Algorithms, Volume 24, Issue 8, August 1991, IEEE Computer Society
Press
• S. Zhou, M. Stumm, D. Wortman, K. Li, Heterogeneous Distributed
Shared Memory, IEEE Transactions on Parallel and Distributed
Systems, v.3 n.5, p.540-554, September 1992.
Any Questions?

