Distributed Computing Seminar


Lecture 1: Introduction to Distributed
Computing & Systems Background


  Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet
                          Summer 2007
      Except where otherwise noted, the contents of this presentation are
      © Copyright 2007 University of Washington and are licensed under
      the Creative Commons Attribution 2.5 License.
Course Overview
   5 lectures
     1 Introduction
     2 Technical Side: MapReduce & GFS
     2 Theoretical: Algorithms for distributed computing
   Readings + Questions nightly
     Readings: http://code.google.com/edu/content/submissions/mapreduce-minilecture/listing.html
     Questions: http://code.google.com/edu/content/submissions/mapreduce-minilecture/MapReduceMiniSeriesReadingQuestions.doc
Outline
   Introduction to Distributed Computing
   Parallel vs. Distributed Computing
   History of Distributed Computing
   Parallelization and Synchronization
   Networking Basics
Computer Speedup




Moore’s Law: “The density of transistors on a chip doubles every 18
months, for the same cost” (1965)
                                   Image: Tom’s Hardware and not subject to the Creative
                                     Commons license applicable to the rest of this work.
Scope of problems
 What can you do with 1 computer?
 What can you do with 100 computers?
 What can you do with an entire data
  center?
Distributed problems
   Rendering multiple frames of high-quality
    animation




Image: DreamWorks Animation and not subject to the Creative Commons license applicable to the rest of this work.
Distributed problems
    Simulating several
     hundred or thousand
     characters




    Happy Feet © Kingdom Feature Productions; Lord of the Rings © New Line Cinema.
    Neither image is subject to the Creative Commons license applicable to the rest of this work.
Distributed problems
   Indexing the web (Google)
   Simulating an Internet-sized network for
    networking experiments (PlanetLab)
   Speeding up content delivery (Akamai)




What is the key attribute that all these examples have in common?
Parallel vs. Distributed
   Parallel computing can mean:
     Vector processing of data
     Multiple CPUs in a single computer
   Distributed computing is multiple CPUs
    across many computers over the network
A Brief History… 1975-85
   Parallel computing was favored in the early years
   Primarily vector-based at first
   Gradually more thread-based parallelism was introduced

Image: Computer Pictures Database and Cray Research Corp and is not subject to the Creative Commons license
applicable to the rest of this work.
A Brief History… 1985-95
 “Massively parallel architectures” start
  rising in prominence
 Message Passing Interface (MPI) and
  other libraries developed
 Bandwidth was a big problem
A Brief History… 1995-Today
 Cluster/grid architecture increasingly
  dominant
 Special node machines eschewed in favor
  of COTS technologies
 Web-wide cluster software
 Companies like Google take this to the
  extreme
Parallelization & Synchronization
Parallelization Idea
   Parallelization is “easy” if processing can be
    cleanly split into n units:

[Diagram: a unit of work is partitioned into units w1, w2, w3]
Parallelization Idea (2)


[Diagram: spawn one worker thread per work unit w1, w2, w3]

  In a parallel computation, we would like to have as many threads as we
  have processors. e.g., a four-processor computer would be able to run
  four threads at the same time.
Parallelization Idea (3)


Workers process data:

[Diagram: each worker thread processes its own work unit (thread i processes wi)]
Parallelization Idea (4)


[Diagram: the worker threads report their results, which are collected into a single result set]
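
Putting the four steps together, here is a minimal sketch in Java (illustrative only; the slides do not prescribe a language or API): partition the work, spawn one worker thread per unit, let each worker process its own slice, and collect the partial results at the end.

import java.util.ArrayList;
import java.util.List;

public class ParallelSum {
    public static void main(String[] args) throws InterruptedException {
        int[] work = new int[1000];                          // the whole problem
        for (int i = 0; i < work.length; i++) work[i] = i;

        int n = Runtime.getRuntime().availableProcessors();  // one thread per processor
        long[] partial = new long[n];                        // one result slot per worker
        List<Thread> workers = new ArrayList<>();

        for (int t = 0; t < n; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                // Each worker processes its own contiguous slice of the input.
                int chunk = (work.length + n - 1) / n;
                int start = id * chunk;
                int end = Math.min(start + chunk, work.length);
                long sum = 0;
                for (int i = start; i < end; i++) sum += work[i];
                partial[id] = sum;                           // no sharing: each worker writes only its own slot
            });
            workers.add(worker);
            worker.start();                                  // spawn worker threads
        }

        for (Thread w : workers) w.join();                   // wait until all workers have finished

        long total = 0;                                      // report results: aggregate the partial sums
        for (long p : partial) total += p;
        System.out.println("total = " + total);
    }
}

This only works so cleanly because the workers never touch each other's data; the pitfalls on the next slide appear as soon as they must share state.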
Parallelization Pitfalls
But this model is too simple!

   How do we assign work units to worker threads?
   What if we have more work units than threads?
   How do we aggregate the results at the end?
   How do we know all the workers have finished?
   What if the work cannot be divided into
    completely separate tasks?

    What is the common theme of all of these problems?
Parallelization Pitfalls (2)
   Each of these problems represents a point
    at which multiple threads must
    communicate with one another, or access
    a shared resource.

   Golden rule: Any memory that can be used
    by multiple threads must have an
    associated synchronization system!
What is Wrong With This?
    Thread 1:
    void foo() {
      x++;
      y = x;
    }

    Thread 2:
    void bar() {
      y++;
      x += 3;
    }

If the initial state is y = 0, x = 6, what happens
after these threads finish running?
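
A few possible outcomes, worked here for illustration (not in the original slides):
   Thread 1 runs to completion first: x becomes 7, y becomes 7; then Thread 2 makes y = 8 and x = 10. Final: x = 10, y = 8.
   Thread 2 runs to completion first: y = 1, x = 9; then Thread 1 makes x = 10 and y = 10. Final: x = 10, y = 10.
   The statements interleave (x++, y++, y = x, x += 3): final x = 10, y = 7.
Even treating each statement as atomic, y may end as 7, 8, or 10; the next slide shows that the statements themselves are not atomic, so whole updates can be lost and x can also end as 7 or 9.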
Multithreaded = Unpredictability
 Many things that look like “one step” operations
  actually take several steps under the hood:
       Thread 1:
       void foo() {
         eax = mem[x];
         inc eax;
         mem[x] = eax;
         ebx = mem[x];
         mem[y] = ebx;
       }

       Thread 2:
       void bar() {
         eax = mem[y];
         inc eax;
         mem[y] = eax;
         eax = mem[x];
         add eax, 3;
         mem[x] = eax;
       }


   When we run a multithreaded program, we don’t
    know what order threads run in, nor do we know
    when they will interrupt one another.
Multithreaded = Unpredictability
This applies to more than just integers:

 Pulling work units from a queue
 Reporting work back to master unit
 Telling another thread that it can begin the
  “next phase” of processing

… All require synchronization!
Synchronization Primitives
   A synchronization primitive is a special
    shared variable that guarantees that it can
    only be accessed atomically.

   Hardware support guarantees that
    operations on synchronization primitives
    only ever take one step
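
For illustration (not part of the original slides), Java exposes such hardware-backed atomic operations through java.util.concurrent.atomic; incrementing an AtomicInteger is a single indivisible step, unlike the x++ above:

import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounter {
    private static final AtomicInteger counter = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 100_000; j++) {
                    counter.incrementAndGet();   // atomic read-modify-write; no lost updates
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(counter.get());       // always 400000
    }
}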
Semaphores
   A semaphore is a flag that can be raised or lowered in one step
   Semaphores were flags that railroad engineers would use when
    entering a shared track

   [Images: a railroad semaphore in the "Set" and "Reset" positions]

    Only one side of the semaphore can ever be red! (Can both be green?)
Semaphores
 set() and reset() can be thought of as
  lock() and unlock()
 Calls to lock() when the semaphore is
  already locked cause the thread to block.

   Pitfalls: Must “bind” semaphores to
    particular objects; must remember to
    unlock correctly
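
A minimal sketch of such a binary semaphore in Java, using the built-in monitor primitives; this is an illustration of the idea, not an API the slides assume:

public class BinarySemaphore {
    private boolean locked = false;

    public synchronized void lock() throws InterruptedException {
        while (locked) {        // already locked: block until someone unlocks
            wait();
        }
        locked = true;
    }

    public synchronized void unlock() {
        locked = false;
        notify();               // wake one blocked thread, if any
    }
}

In practice you would reach for java.util.concurrent.Semaphore or ReentrantLock rather than rolling this by hand, but the lock()/unlock() behavior is the same.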
The “corrected” example
Thread 1:
void foo() {
  sem.lock();
  x++;
  y = x;
  sem.unlock();
}

Thread 2:
void bar() {
  sem.lock();
  y++;
  x += 3;
  sem.unlock();
}

Global var "Semaphore sem = new Semaphore();" guards access to x and y
Condition Variables
   A condition variable notifies threads that a
    particular condition has been met

   Inform another thread that a queue now
    contains elements to pull from (or that it’s
    empty – request more elements!)

   Pitfall: What if nobody’s listening?
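
As an illustration (assumed API, not from the slides): a tiny work queue where a consumer waits on a condition until a producer signals that elements are available. Re-checking the condition in a loop guards against the "nobody's listening" problem: a notification sent before anyone is waiting is not lost, because a late-arriving consumer sees the non-empty queue and never waits at all. The fooDone flag on the next slide plays the same role.

import java.util.ArrayDeque;
import java.util.Deque;

public class WorkQueue {
    private final Deque<String> items = new ArrayDeque<>();

    // Producer: add a work unit and notify any waiting consumer.
    public synchronized void put(String item) {
        items.addLast(item);
        notifyAll();
    }

    // Consumer: wait until the queue is non-empty, then take a work unit.
    public synchronized String take() throws InterruptedException {
        while (items.isEmpty()) {   // re-check the condition after every wakeup
            wait();
        }
        return items.removeFirst();
    }
}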
The final example
Thread 1:
void foo() {
  sem.lock();
  x++;
  y = x;
  fooDone = true;
  sem.unlock();
  fooFinishedCV.notify();
}

Thread 2:
void bar() {
  sem.lock();
  if (!fooDone)
    fooFinishedCV.wait(sem);
  y++;
  x += 3;
  sem.unlock();
}

 Global vars: Semaphore sem = new Semaphore(); ConditionVar
 fooFinishedCV = new ConditionVar(); boolean fooDone = false;
Too Much Synchronization? Deadlock

Synchronization becomes even more complicated when multiple locks
can be used.

Can cause the entire system to "get stuck":

Thread A:
semaphore1.lock();
semaphore2.lock();
/* use data guarded by semaphores */
semaphore1.unlock();
semaphore2.unlock();

Thread B:
semaphore2.lock();
semaphore1.lock();
/* use data guarded by semaphores */
semaphore1.unlock();
semaphore2.unlock();
(Image: RPI CSCI.4210 Operating Systems notes)
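
One standard remedy (not shown in the original slides, written in the same pseudocode) is to make every thread acquire the locks in the same global order:

Thread A:
semaphore1.lock();
semaphore2.lock();
/* use data guarded by semaphores */
semaphore2.unlock();
semaphore1.unlock();

Thread B:
semaphore1.lock();   /* same order as Thread A */
semaphore2.lock();
/* use data guarded by semaphores */
semaphore2.unlock();
semaphore1.unlock();

With a consistent order, no thread can hold semaphore2 while waiting for semaphore1, so the circular wait that causes deadlock cannot arise.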
The Moral: Be Careful!
   Synchronization is hard
      Need to consider all possible shared state
      Must keep locks organized and use them consistently and correctly
 Knowing there are bugs may be tricky;
  fixing them can be even worse!
 Keeping shared state to a minimum
  reduces total system complexity
Fundamentals of Networking
Sockets: The Internet = tubes?
 A socket is the basic network interface
 Provides a two-way “pipe” abstraction
  between two applications
 Client creates a socket and connects to the server, which receives a
  socket representing the other side
Ports
   Within an IP address, a port is a sub-address
    identifying a listening program
   Allows multiple clients to connect to a server at
    once
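
A minimal sketch of this in Java (illustrative; the slides do not specify a language): the server listens on a port, the client connects to that host and port, and each side gets a socket representing its end of the pipe.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class EchoExample {
    // Server: listen on port 8080 and answer one client.
    static void server() throws Exception {
        try (ServerSocket listener = new ServerSocket(8080);
             Socket conn = listener.accept();   // socket for the server's side of the pipe
             BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
             PrintWriter out = new PrintWriter(conn.getOutputStream(), true)) {
            out.println("echo: " + in.readLine());
        }
    }

    // Client: create a socket and connect to the server at host:port.
    static void client() throws Exception {
        try (Socket sock = new Socket("localhost", 8080);   // client's side of the pipe
             PrintWriter out = new PrintWriter(sock.getOutputStream(), true);
             BufferedReader in = new BufferedReader(new InputStreamReader(sock.getInputStream()))) {
            out.println("hello");
            System.out.println(in.readLine());
        }
    }

    public static void main(String[] args) throws Exception {
        Thread t = new Thread(() -> { try { server(); } catch (Exception e) { e.printStackTrace(); } });
        t.start();
        Thread.sleep(200);   // crude: give the server a moment to start listening
        client();
        t.join();
    }
}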
What makes this work?
   Underneath the socket layer are several more
    protocols
   Most important are TCP and IP (which are used
    hand-in-hand so often, they’re often spoken of as
    one protocol: TCP/IP)


[Diagram: packet layout - IP header | TCP header | your data]
    Even more low-level protocols handle how data is sent over
    Ethernet wires, or how bits are sent through the air using 802.11
    wireless…
Why is This Necessary?
   Not actually tube-like “underneath the hood”
   Unlike phone system (circuit switched), the
    packet switched Internet uses many routes at
    once


[Diagram: packets take many simultaneous routes between you and www.google.com]
Networking Issues
 If a party to a socket disconnects, how
  much data did they receive?
 … Did they crash? Or did a machine in the
  middle?
 Can someone in the middle
  intercept/modify our data?
 Traffic congestion makes switch/router
  topology important for efficient throughput
Conclusions
 Processing more data means using more
  machines at the same time
 Cooperation between processes requires
  synchronization
 Designing real distributed systems requires
  consideration of networking topology

   Next time: How MapReduce works
