Chapter 3 - Processes
Introduction
◼ Communication takes place between processes
◼ a process is a program in execution
◼ from the OS perspective, the management and scheduling of processes is important
◼ other important issues arise in distributed systems:
◼ multithreading to enhance performance by overlapping communication and
local processing
◼ how clients and servers are organized, and server design issues
◼ process or code migration to achieve scalability and to dynamically configure
clients and servers
3.1 Threads and their Implementation
• how are processes and threads related?
• Process tables or PCBs are used to keep track of processes
• there are usually many processes executing concurrently
• processes should not interfere with each other; sharing
resources by processes is transparent
• this concurrency transparency has a high price; allocating
resources for a new process and context switching take
time
• a thread also executes independently of other threads, but
does not need a high degree of concurrency transparency,
thereby resulting in better performance
• threads can be used in both distributed and non-distributed
systems
• Threads in Non-distributed Systems
• a process has an address space (containing program text and
data) and a single thread of control, as well as other resources
such as open files, child processes, accounting information, etc.
[Figure: three processes, each with one thread, versus one process with three threads]
• each thread has its own program counter,
registers, stack, and state; but all threads of a
process share address space, global variables and
other resources such as open files, etc.
• threads take turns running
• threads allow multiple executions to take place in the same process
environment, called multithreading
• Thread Usage – why do we need threads?
– e.g., a word processor has different parts for
– interacting with the user
– formatting the page as soon as changes are made
– timed saves (for auto-recovery)
– spelling and grammar checking, etc.
1. Simplifying the programming model: many activities are going on at
once, more or less independently
2. They are easier to create and destroy than processes since they do not have
any resources attached to them
3. Performance improves by overlapping activities when there is a lot of I/O,
i.e., to avoid blocking when waiting for input or doing calculations, say in a
spreadsheet
4. Real parallelism is possible in a multiprocessor system
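A minimal sketch (an assumption; the slides give no code) of reason 3 in Python: a background thread performs timed saves while the main thread keeps handling user input. The autosave function and the keystroke loop are purely illustrative.

# background thread overlaps "auto-saving" with interactive work in the main thread
import threading
import time

stop = threading.Event()

def autosave(interval=1.0):
    # background activity: runs until the main thread asks it to stop
    while not stop.is_set():
        print("autosave: document saved")
        stop.wait(interval)

saver = threading.Thread(target=autosave, daemon=True)
saver.start()

for keystroke in ["h", "e", "l", "l", "o"]:
    print(f"main: handling keystroke {keystroke!r}")
    time.sleep(0.4)   # simulate interactive work

stop.set()
saver.join()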
• in non-distributed systems, threads can be used
with shared data instead of processes to avoid the
context-switching overhead of interprocess
communication (IPC)
[Figure: context switching as the result of IPC]
• Threads in Distributed Systems
• Multithreaded Clients
– consider a Web browser; fetching the different parts of a
page can be implemented as separate threads, each
opening its own TCP connection to the server
– each can display the results as it gets its part of the page
– parallelism can also be achieved for replicated servers
since each thread request can be forwarded to separate
replicas
• Multithreaded Servers
– servers can be constructed in three ways
A. single-threaded process
• it gets a request, examines it, carries it out to completion
before getting the next request
B. threads
– threads are more important for implementing servers
• e.g., a file server
– the dispatcher thread reads incoming requests for a file operation
from clients and passes each one to an idle worker thread
C. finite-state machine
• if threads are not available
• it gets a request, examines it, tries to fulfill the request from
cache, else sends a request to the file system
Model                      Characteristics
Single-threaded process    No parallelism, blocking system calls
Threads                    Parallelism, blocking system calls (only the calling thread blocks)
Finite-state machine       Parallelism, nonblocking system calls
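The dispatcher/worker organization under B above can be sketched in a few lines of Python (hypothetical code, not from the slides): a dispatcher hands incoming requests to a pool of idle worker threads; the file-serving work is only simulated by a print.

import queue
import threading

requests = queue.Queue()

def worker(wid):
    while True:
        req = requests.get()
        if req is None:          # shutdown signal
            break
        print(f"worker {wid}: serving file request {req!r}")
        requests.task_done()

workers = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for w in workers:
    w.start()

def dispatcher(incoming):
    # the dispatcher only reads requests and passes them to idle workers
    for req in incoming:
        requests.put(req)

dispatcher(["read a.txt", "read b.txt", "stat c.txt"])
requests.join()
for _ in workers:
    requests.put(None)
for w in workers:
    w.join()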
read about virtualization (the illusion of having more resources than we actually
have): pages 79-82
3.2 Anatomy of Clients
Two issues: user interfaces and client-side software
for distribution transparency
A. User Interfaces
• to create a convenient environment for the
interaction of a human user and a remote server;
e.g. mobile phones with simple displays and a set
of keys
• GUIs are most commonly used
• The X Window System (or simply X) as an example
• it has the X kernel: the part of the OS that controls
the terminal (monitor, keyboard, pointing device
like a mouse) and is hardware dependent
B. Client-Side Software for Distribution Transparency
• in addition to the user interface, parts of the
processing and data level in a client-server
application are executed at the client side
• an example is embedded client software for ATMs,
cash registers, etc.
• moreover, client software can also include
components to achieve distribution transparency
• e.g., replication transparency
• assume a distributed system with replicated
servers; the client proxy can send requests to each
replica
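As a hedged illustration of such replication transparency (not the slides' code), a client-side proxy might fan each request out to all replicas and hand back the first reply; the ReplicatedProxy class and the fake replica callables below are assumptions made only for this sketch.

from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

class ReplicatedProxy:
    def __init__(self, replicas):
        # replicas: callables standing in for stubs to individual replica servers
        self.replicas = replicas

    def request(self, payload):
        # forward the request to every replica, return the first reply
        with ThreadPoolExecutor(max_workers=len(self.replicas)) as pool:
            futures = [pool.submit(r, payload) for r in self.replicas]
            done, _ = wait(futures, return_when=FIRST_COMPLETED)
            return next(iter(done)).result()

# usage with fake replicas: the application sees a single logical server
proxy = ReplicatedProxy([lambda m: f"replica-1: {m}", lambda m: f"replica-2: {m}"])
print(proxy.request("balance?"))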
3.3 Servers and Design Issues
3.3.1 General Design Issues
a. How to organize servers?
Iterative server
• the server itself handles the request and returns
the result
Concurrent server
• it passes a request to a separate process or
thread and waits for the next incoming request;
e.g., a multithreaded server;
b. Where do clients contact a server?
• using endpoints or ports at the machine where
the server is running where each server listens
to a specific endpoint
• how do clients know the endpoint of a service?
• globally assign endpoints for well-known
services; e.g. FTP is on TCP port 21, HTTP is on
TCP port 80
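A compact sketch of a concurrent server bound to a fixed endpoint, in Python (an assumption; port 9000 and the echo behaviour are illustrative, not a well-known service): the server listens on its endpoint and hands each connection to a fresh thread, as in the multithreaded organization above.

import socket
import threading

def handle(conn, addr):
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"echo: " + data)

def serve(port=9000):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("", port))          # the endpoint clients must know
        srv.listen()
        while True:
            conn, addr = srv.accept()
            threading.Thread(target=handle, args=(conn, addr), daemon=True).start()

# serve()  # run in its own process; clients connect to ("host", 9000)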
3.4 Code Migration
• so far, communication was concerned with
passing data
• we may pass programs, even while running
and in heterogeneous systems
• code migration also involves moving data as
well: when a program migrates while running,
its execution state, pending signals, and other
parts of its environment, such as the stack and
the program counter, also have to be moved
Reasons for Migrating Code
• to improve performance; move processes from
heavily-loaded to lightly-loaded machines (load
balancing)
• to reduce communication: move a client
application that performs many database
operations to a server if the database resides on
the server; then send only results to the client
• to exploit parallelism (for nonparallel programs):
e.g., copies of a mobile program (called a mobile
agent, as used by search engines) moving from
site to site searching the Web
• migration can be
– sender-initiated: initiated at the machine where the code resides
or is currently running; e.g., uploading programs to a
server (which may need authentication or require that the client is
a registered one), or sending crawlers to index Web pages
– receiver-initiated: initiated by the target machine; e.g., a browser
downloading code such as applets
• in a client-server model, receiver-initiated is
easier to implement since security issues are
minimized; if clients are allowed to send code
(sender-initiated), the server must know them
since they may access resources such as disk on
the server
Chapter 4 - Inter-Process Communication in Distributed Systems
Introduction
➢ Communication is the process of transferring
information from one place/object to another
place/object to share ideas and concepts among
objects.
➢ In a distributed system, communication is the
interaction among processes, users, and programs in a
networked environment; or
➢ it is the process of transferring objects (information
(text, audio, image), database objects (records, fields,
tables), web page objects, code, etc.) from one place
to another to share ideas and concepts among
objects.
Inter Process Communication (IPC)
➢ IPC is the interaction among processes.
➢ A distributed application requires the participation of two
or more independent entities (processes).
E.g. chatting, a DB application (UI, database, processing level)
➢ Each process can act as either the sender or the receiver.
E.g. inserting data on a remote machine, displaying the
content of a remote machine (DB, pages).
➢ When inter-process communication takes place between
processes running on different hosts (possibly behind
a firewall), a TCP socket-based connection is used,
dedicating a port to each process running on a
separate host.
– analogy: using eyes and ears to communicate with a remote object.
E.g. SMTP uses TCP port 25, FTP port 21, HTTP port 80
Introduction
➢ Inter process communication is at the heart of all distributed
systems
➢ Communication in distributed systems is based on message
passing as offered by the underlying network as opposed to
using shared memory
➢ Modern distributed systems consist of thousands of processes
scattered across an unreliable network such as the Internet
➢ unless the primitive communication facilities of the network
are replaced by more advanced ones, development of large
scale Distributed Systems becomes extremely difficult
Network Protocols and Standards
➢ why communication in distributed systems? because there is
no shared memory
➢ Two communicating processes must agree on the syntax and
semantics of messages
➢ A protocol is a set of rules that governs data communications
➢ A protocol defines what is communicated, how it is
communicated, and when it is communicated
➢ the key elements of a protocol are syntax, semantics, and
timing
✓ syntax: refers to the structure or format of the data
✓ semantics: refers to the meaning of each section of bits
✓ timing: refers to when data should be sent and how fast
they can be sent
➢ Two computers, possibly from different manufacturers, must be
able to talk to each other; for such a communication, there has
to be a standard.
➢ The ISO OSI (Open Systems Interconnection) Reference Model is
one such standard - 7 layers
➢ the TCP/IP protocol suite is another; it has 4 or 5 layers
➢ OSI
– Open – to connect open systems or systems that are open for
communication with other open systems using standard
rules that govern the format, contents, and meaning of the
messages sent and received
– these rules are called protocols
– two types of transport layer protocols: connection-oriented
and connectionless
Classification of communication
• Based on the nature of the middleware storage:
• A) Persistent Communication: a message that has been submitted
for transmission is stored by the communication middleware as long
as it takes to deliver it to the receiver.
• Example: Email System
– the email server is middleware which stores the message at one or
several storage facilities.
– it is not necessary for the sending application to continue
execution after submitting the message. Likewise, the receiving
application need not be executing when the message is
submitted.
• In contrast, the telephone system and some chat systems need
both sender and receiver to be live at the time of
communication.
B) Transient Communication: a message is stored by
the communication system only as long as the
sending and receiving applications are executing.
• Both the sender and receiver must be active at
the time of communication.
• Example: telephone system
Based on way of delivering of data
• A) Synchronous Communication:
• a message is only allowed to be sent if its destination is ready to
receive it.
• the sender and receiver wait for each other to transfer the message.
• since there is no memory sharing in message passing, there is no need for
mutual exclusion.
• no buffers are used in communication channels.
• If one process is ready to communicate and the other is not, the one
that is ready must be blocked (or wait).
– Examples:
– machines having different processing speeds.
– code sends a message, calls a function, etc., and is blocked until an
answer, return value, etc., arrives.
• sending and receiving of a message are dependent events.
• all parties involved in the communication are present at the same
time.
• Example: telephone conversation (not texting), video conferencing ,
a chat room event, etc.
B) Asynchronous Communication
• sending and receiving of a message are independent events.
• Asynchronous message passing systems deliver a message from
sender to receiver without waiting for the receiver to be ready.
• it's one-way communication.
– Example:
– process A continues executing after sending a message to / calling a
function in process B.
• Examples: the postal system, e-mail messages, text messaging over cell
phones, keyboard input, etc.
Communication Paradigms in Distributed System
❖Shared memory
❖Message Oriented
❖ Remote procedure call (RPC)
❖Remote Method Invocation (RMI)
❖Stream Oriented
❖Group Communication
However, communication among a number of processes within
one machine can be carried out using shared memory.
• A problem may be solved by multiple processes on
the same or different machines.
Real-world analogy: human communication
Direct communication: face-to-face communication
Indirect communication: using a phone, a messenger, or the postal
service to send a message, a notice board, etc.
Shared Memory
• Shared memory is an efficient means of passing data between
programs. One program will create a memory portion which
other processes (if permitted) can access.
– Example: using the output of one function in another function; the
two functions communicate by writing and reading the contents of memory.
• In computer programming, shared memory is a method by which
program processes can exchange data more quickly than by
reading and writing using the regular operating system services.
– The OS acts as an interface between hardware and other software.
• E.g. a client process may have data to pass to a server process
that the server process is to modify and return to the client.
• we need some kind of synchronization between processes that
read and write shared memory.
[Figure: processes 1-3 attached to a shared memory region]
• Shared memory models communicate by reading/writing
to shared memory blocks
• A globally physical memory equally accessible to all
processors.
• There is latency in accessing memory if there are a number
of processors.
• Processes can exchange information by reading and
writing data to the shared region.
• One process writes to a buffer and then another
process reads from it.
• With multiple threads for parallel programming, the threads use a
single shared memory.
• Analogy: an organization and its employees
communicate by using a notice board.
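A small sketch (assumed code, not from the slides) of shared memory plus the synchronization mentioned above, using Python's multiprocessing module: two processes update the same shared counter under a lock.

from multiprocessing import Process, Value, Lock

def deposit(balance, lock, amount):
    for _ in range(amount):
        with lock:                 # synchronize access to the shared region
            balance.value += 1

if __name__ == "__main__":
    balance = Value("i", 0)        # an int living in shared memory
    lock = Lock()
    workers = [Process(target=deposit, args=(balance, lock, 1000)) for _ in range(2)]
    for p in workers: p.start()
    for p in workers: p.join()
    print("final balance:", balance.value)   # 2000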
Remote Procedure Call
• the first distributed systems were based on explicit message exchange
between processes through the use of explicit send and receive
procedures; but do not allow access transparency
• in 1984, Birrell and Nelson introduced a different way of handling
communication: RPC
• it allows a program to call a procedure located on another machine
• simple and elegant, but there are implementation problems
– the calling and called procedures run in different address
spaces
– parameters and results have to be exchanged; what if the
machines are not identical?
– what happens if both machines crash?
▪ Conventional Procedure Call, i.e., on a single machine
▪ e.g. count = read(fd, buf, bytes); a C-like statement, where
fd is an integer indicating a file
buf is an array of characters into which data are read
bytes is the number of bytes to be read
▪ parameters can be call-by-value (fd and bytes) or call-by-reference (buf)
or, in some languages, call-by-copy/restore
[Figure: parameter passing in a local procedure call - the stack before the call to read and the stack while the called procedure is active, with the stack pointer marked]
[Figure: the principle of RPC between a client and server program]
▪ Client and Server Stubs
▪ RPC would like to make a remote procedure call look the
same as a local one; it should be transparent, i.e., the calling
procedure should not know that the called procedure is
executing on a different machine or vice versa
▪ when a program is compiled, it uses different versions of
library functions called client stubs
▪ a server stub is the server-side equivalent of a client stub
▪ Steps of a Remote Procedure Call
1. Client procedure calls client stub in the normal way
2. Client stub builds a message and calls the local OS
(packing parameters into a message is called parameter
marshaling)
3. Client's OS sends the message to the remote OS
4. Remote OS gives the message to the server stub
5. Server stub unpacks the parameters and calls the server
6. Server does the work and returns the result to the stub
7. Server stub packs it in a message and calls the local OS
8. Server's OS sends the message to the client's OS
9. Client's OS gives the message to the client stub
10. Stub unpacks the result and returns to client
▪ hence, for the client remote services are accessed by making
ordinary (local) procedure calls; not by calling send and
receive
 server machine vs server process; client machine vs client process
[Figure: the sequence of events during an RPC]
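To make the ten steps concrete, here is a hedged sketch using Python's standard xmlrpc modules (an assumption; the slides do not prescribe any library). The client invokes add() as if it were an ordinary local procedure; the stubs and the message exchange happen behind that call.

# --- server side (run first, in its own process) ---
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    return a + b            # step 6: the server does the work

def serve():
    with SimpleXMLRPCServer(("localhost", 8000), allow_none=True) as srv:
        srv.register_function(add, "add")
        srv.serve_forever()

# --- client side (run in another process) ---
from xmlrpc.client import ServerProxy

def client():
    stub = ServerProxy("http://localhost:8000")   # client stub / proxy
    print(stub.add(2, 3))    # steps 1-10 happen behind this ordinary-looking call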
Remote Object (Method) Invocation (RMI)
• resulted from object-based technology that has proven its value in
developing non distributed applications
• it is an expansion of the RPC mechanisms
• it enhances distribution transparency as a consequence of an object
hiding its internals from the outside world by means of a well-
defined interface
• Distributed Objects
– an object encapsulates data, called the state, and the
operations on those data, called methods
– methods are made available through interfaces
– the state of an object can be manipulated only by invoking
methods
– this allows an interface to be placed on one machine while
the object itself resides on another machine; such an
organization is referred to as a distributed object
▪ the state of an object is not distributed, only the interfaces are; such
objects are also referred to as remote objects
▪ the implementation of an object’s interface is called a proxy
(analogous to a client stub in RPC systems)
▪ it is loaded into the client’s address space when a client
binds to a distributed object
▪ tasks: a proxy marshals method invocation into messages
and unmarshals reply messages to return the result of the
method invocation to the client
▪ a server stub, called a skeleton, unmarshals messages and
marshals replies
[Figure: common organization of a remote object with a client-side proxy]
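A toy sketch of the proxy/skeleton idea (an assumption, with the transport reduced to a direct function call instead of a network): the Proxy marshals each method invocation into a message, the Skeleton unmarshals it, invokes the real object, and marshals the reply. The Account class and the JSON encoding are illustrative choices only.

import json

class Skeleton:
    def __init__(self, obj):
        self.obj = obj
    def handle(self, message):
        call = json.loads(message)                      # unmarshal the request
        result = getattr(self.obj, call["method"])(*call["args"])
        return json.dumps({"result": result})           # marshal the reply

class Proxy:
    def __init__(self, skeleton):
        self.skeleton = skeleton
    def __getattr__(self, name):
        def remote_call(*args):
            message = json.dumps({"method": name, "args": args})   # marshal
            reply = self.skeleton.handle(message)                  # "network"
            return json.loads(reply)["result"]                     # unmarshal
        return remote_call

class Account:                     # the remote object's implementation
    def __init__(self): self.balance = 0
    def deposit(self, amount):
        self.balance += amount
        return self.balance

account_proxy = Proxy(Skeleton(Account()))
print(account_proxy.deposit(50))   # looks like a local method call -> 50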
4.4 Message Oriented Communication
• RPCs and RMIs are not adequate for all distributed system
applications
• the provision of access transparency may be good but they have
semantics that is not adequate for all applications
• example problems
– they assume that the receiving side is running at the time
of communication
– a client is blocked until its request has been processed
Message Passing
• Message passing is a form of communication used in parallel
computing (sending messages to different service providers), object-
oriented programming, and interprocess communication.
• Message passing provides a mechanism for the exchange of data in
memory distributed across the nodes of a cluster.
• Messages are sent to intermediate storage or directly to the receiver.
– E.g. email; text and video chat applications (direct)
• In this model, processes or objects send messages to and receive messages
from other processes. Messages are sent from a sender to one or more
recipients.
– One-to-one communication. E.g. email, telephone, etc.
– One-to-many communication: one client process to many servers, email,
TV and radio broadcasting, etc.
– Many-to-one communication: many clients accessing a single server machine.
– Many-to-many communication.
• Communication in the message passing paradigm is performed
using the send and receive functions.
• Send primitive needs destination process and the
message data as parameters.
• The receive primitive needs the name of the sender
process and should provide a storage buffer for the
message.
• Message passing systems make workers communicate
through a messaging system. Messages keep everyone
separated, so that workers cannot modify each other's
data.
• Examples: radio and TV transmission from a source to the
user's node, video conferences, telephone
communication.
• it requires the application programmer to be able to identify the
destination process, the message, the source process and the
data types expected from these processes.
• Messages are packets of data and move from process to process.
• Message can be data, destination process ID, sending process
ID, message length, data type, etc.
• Syntax for the sending and receiving primitives:
– send(P, message): send a message to process P, port P,
mailbox, object X, etc.
– receive(Q, message): receive a message from process Q,
a port, a mailbox, etc.
[Figure: process A sends a text message via a mailbox/port to process B]
• Two generic message passing primitives for sending and receiving
messages.
send (destination, message)
receive (source, message)
source or dest={ process name, object name, link, mailbox, port}
• Example:
– send(A, message): send a message to mailbox A
– receive(A, message): receive a message from mailbox A
• The message passing system has to be told the following
information: sending process, source location, data type, data
length, receiving process, destination location.
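A bare-bones sketch (assumed code) of send/receive primitives on top of UDP sockets in Python: send(sock, dest, message) and receive(sock) map directly onto sendto() and recvfrom(); the port numbers are arbitrary.

import socket

def make_endpoint(port):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("localhost", port))
    return sock

def send(sock, dest, message):
    sock.sendto(message.encode(), dest)          # dest = (host, port)

def receive(sock):
    data, source = sock.recvfrom(1024)
    return source, data.decode()

# usage: two endpoints in one process, for illustration only
a = make_endpoint(5001)
b = make_endpoint(5002)
send(a, ("localhost", 5002), "hello from A")
print(receive(b))      # (('127.0.0.1', 5001), 'hello from A')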
Types of message passing
• A) Synchronous Message Passing
• Requires the sender and receiver to wait for each other to transfer the
message.
• The sender will not continue until the receiver has received the message
(this needs agreement between sender and receiver).
• If process B is not ready to receive at the moment of sending, process A is
suspended until process B receives the message.
• No buffers are used in communication channels.
• Synchronous communication can be blocked by channel being busy or in error
since only one message is allowed to be transmitted via a channel at time.
• B) Asynchronous Message Passing
• Delivers a message from sender to receiver without waiting for the receiver to
be ready. Examples: radio transmission, TV transmission, texting, etc.
• It is a store-and-forward message passing method.
– E.g. if process A wants to send data to process B, A puts the data in B's
postbox (queue); the next time process B looks into its postbox, it retrieves
the message.
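The postbox idea can be sketched with an in-process queue (an assumption; a real system would use persistent or networked storage): the sender deposits a message and continues immediately, and the receiver picks it up whenever it next checks its mailbox.

import queue

mailbox_B = queue.Queue()           # process B's postbox

def send_async(mailbox, message):
    mailbox.put(message)            # sender does not wait for B to be ready

send_async(mailbox_B, "report is ready")
print("A: continued with other work")

# ... later, B looks into its postbox
print("B received:", mailbox_B.get())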
▪ message-oriented communication; forms:
▪ transient: a message that has been submitted for
transmission is stored by the communication system only as
long as the sending and receiving applications are executing
▪ asynchronous: a sender continues immediately after it has
submitted its message for transmission
▪ synchronous: the sender is blocked until its message is
stored in a local buffer at the receiving host or delivered to the
receiver
▪ the different types of communication can be combined; in general
there are six possibilities
▪ persistent asynchronous: e.g., email
▪ transient asynchronous: e.g., UDP, asynchronous RPC

                 Persistent   Transient
Asynchronous         ✓            ✓
Synchronous          ✓            ✓
[Figure: persistent asynchronous communication and persistent synchronous communication]
[Figure: transient asynchronous communication and receipt-based transient synchronous communication]
▪ receipt-based is the weakest form; the sender is
blocked until the message is stored in a local buffer at the
receiving host
[Figure: delivery-based transient synchronous communication (at message delivery) and response-based transient synchronous communication]
▪ delivery-based: the sender is blocked until the
message is delivered to the receiver for further processing
▪ response-based is the strongest form; the sender is
blocked until it receives a reply message from the receiver
4.5 Stream Oriented Communication
• until now, we focused on exchanging independent and complete units of
information
• time has no effect on correctness; a system can be slow or fast
• however, there are communications where time has a critical role
• Multimedia
– media
• storage, transmission, interchange, presentation, representation and
perception of different data types:
• text, graphics, images, voice, audio, video, animation, ...
• movie: video + audio + …
– multimedia: handling of a variety of representation media
– end user pull
• information overload and starvation
– technology push
• emerging technology to integrate media
▪ The Challenge
▪ new applications
▪ multimedia will be pervasive in a few years (as graphics is now)
▪ storage and transmission
▪ e.g., 2 hours uncompressed HDTV (1920×1080) movie:
1.12 TB (1920×1080x3x25x60x60x2)
▪ videos are extremely large, even after compressed
(actually encoded)
▪ continuous delivery
▪ e.g., 30 frames/s (NTSC), 25 frames/s (PAL) for video
▪ guaranteed Quality of Service
▪ admission control
▪ search
▪ can we look at 100… videos to find the proper one?
▪ Types of Media
▪ two types
▪ discrete media: text, executable code, graphics, images;
temporal relationships between data items are not
fundamental to correctly interpret the data
▪ continuous media: video, audio, animation; temporal
relationships between data items are fundamental to
correctly interpret the data
▪ a data stream is a sequence of data units and can be applied
to discrete as well as continuous media
▪ stream-oriented communication provides facilities for the
exchange of time-dependent information (continuous media)
such as audio and video streams
• timing in transmission modes
– asynchronous transmission mode: data items are transmitted one after the
other, but no timing constraints; e.g. text transfer
– synchronous transmission mode: a maximum end-to-end delay defined for
each data unit; it is possible that data can be transmitted faster than the
maximum delay, but not slower
– isochronous transmission mode: maximum and minimum end-to-end delay
are defined; also called bounded delay jitter; applicable for distributed
multimedia systems
• a continuous data stream can be simple or complex
– simple stream: consists of a single sequence of data; e.g., mono audio, video
only (only visual frames)
– complex stream: consists of several related simple streams that must be
synchronized; e.g., stereo audio, video consisting of audio and video (may
also contain subtitles, translation to other languages, ...)
Communication in Group
• It is the process of broadcasting a message across the entire
network, rather than just to one other computer.
• The membership in process groups is dynamic: Processes
may join and leave groups.
– Example: in a chat application, the number of participating processes
changes from time to time.
• All communication between processes occurs within a
communication domain
• Intra-communication: representing communication within
the same process group,
• Inter-communication representing communication
between two distinct process groups.
• Group communication can be carried out using:
– Broadcasting: sending a message to the entire network
– Multicasting: sending a message to selected nodes or processes.
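A hedged sketch of multicasting with UDP in Python (the slides give no code; the group address 224.1.1.1:5007 is arbitrary): every process that joins the group receives each message sent to it.

import socket
import struct

GROUP, PORT = "224.1.1.1", 5007

def sender(message):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    sock.sendto(message.encode(), (GROUP, PORT))

def receiver():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    # join the multicast group
    mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    data, addr = sock.recvfrom(1024)
    print("received", data.decode(), "from", addr)

# run receiver() in one or more processes, then sender("hello group") in another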
Ordering of Messages on the recipient side
• First In First Out (FIFO) ordering: this is true in the case of
synchronous communication.
• Causal ordering
• Total ordering
Chapter 5 - Naming
Introduction
▪ names play an important role to:
▪ share resources
▪ uniquely identify entities
▪ refer to locations
▪ etc.
▪ an important issue is that a name can be resolved to the entity
it refers to
▪ to resolve names, it is necessary to implement a naming
system
▪ in a distributed system, the implementation of a naming
system is itself often distributed, unlike in nondistributed
systems
▪ efficiency and scalability of the naming system are the main
issues
Objectives of the Chapter
▪ we discuss how
▪ human friendly names are organized and implemented;
e.g., those for file systems and the WWW
▪ names are used to locate mobile entities
▪ to remove names that are no longer used, also called
garbage collection
5.1 Naming Entities
▪ Names, Identifiers, and Addresses
▪ a name in a distributed system is a string of bits or
characters that is used to refer to an entity
▪ an entity is anything; e.g., resources such as hosts, printers,
disks, files, objects, processes, users, ...
▪ entities can be operated on; e.g., a resource such as a printer
offers an interface containing operations for printing a
document, requesting the status of a job, ...
▪ to operate on an entity, it is necessary to access it through
its access point, itself an entity (special)
▪ access point
▪ the name of an access point is called an address (such as
IP address and port number as used by the transport layer)
▪ the address of the access point of an entity is also referred
to as the address of the entity
▪ an entity can have more than one access point (similar to
accessing an individual through different telephone
numbers)
▪ an entity may change its access point in the course of time
(e.g., a mobile computer getting a new IP address as it
moves)
▪ an address is a special kind of name
▪ it refers to at most one entity
▪ each entity is referred by at most one address; even when
replicated such as in Web pages
▪ an entity may change an access point, or an access point
may be reassigned to a different entity (like telephone
numbers in offices)
▪ separating the name of an entity and its address makes it
easier and more flexible; such a name is called location
independent
▪ there are also other types of names that uniquely identify an
entity; in any case an identifier is a name with the following
properties
▪ it refers to at most one entity
▪ each entity is referred by at most one identifier
▪ it always refers to the same entity (never reused)
▪ identifiers allow us to unambiguously refer to an entity
▪ examples
▪ name of an FTP server (entity)
▪ URL of the FTP server
▪ address of the FTP server
▪ IP number:port number
▪ the address of the FTP server may change
5.2 Name Spaces and Name Resolution
▪ names in a distributed system are organized into a name space
▪ a name space is generally organized as a labeled, directed
graph with two types of nodes
▪ leaf node: represents the named entity and stores
information such as its address or the state of that entity
▪ directory node: a special entity that has a number of
outgoing edges, each labeled with a name
[Figure: a general naming graph with a single root node]
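A tiny sketch (assumed code) of resolving a path name in such a naming graph: directory nodes map edge labels to other nodes, a leaf node stores the entity's address, and resolution follows the labels from the root. The node names n1-n5 and the addresses are illustrative only.

graph = {
    "root": {"type": "dir",  "edges": {"home": "n1", "keys": "n5"}},
    "n1":   {"type": "dir",  "edges": {"steen": "n2"}},
    "n2":   {"type": "dir",  "edges": {"mbox": "n3"}},
    "n3":   {"type": "leaf", "address": "host-a:/var/mail/steen"},
    "n5":   {"type": "leaf", "address": "host-b:/etc/keys"},
}

def resolve(path):
    # path like "home/steen/mbox", resolved label by label from the root
    node = graph["root"]
    for label in path.split("/"):
        node = graph[node["edges"][label]]
    return node["address"]

print(resolve("home/steen/mbox"))   # host-a:/var/mail/steen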