Chapter 3: Logical Time
Ajay Kshemkalyani and Mukesh Singhal
Distributed Computing: Principles, Algorithms, and Systems
Cambridge University Press
Introduction
The concept of causality between events is fundamental to the design and
analysis of parallel and distributed computing and operating systems.
Usually causality is tracked using physical time.
In distributed systems, it is not possible to have a global physical time.
As asynchronous distributed computations make progress in spurts, logical time is sufficient to capture the fundamental monotonicity property associated with causality in distributed systems.
This chapter discusses three ways to implement logical time: scalar time, vector time, and matrix time.
Causality among events in a distributed system is a powerful concept in
reasoning, analyzing, and drawing inferences about a computation.
The knowledge of the causal precedence relation among the events of
processes helps solve a variety of problems in distributed systems, such as
distributed algorithms design, tracking of dependent events, knowledge about
the progress of a computation, and concurrency measures.
A Framework for a System of Logical Clocks
Definition
A system of logical clocks consists of a time domain T and a logical clock C.
Elements of T form a partially ordered set over a relation <.
Relation < is called the happened before or causal precedence. Intuitively,
this relation is analogous to the earlier than relation provided by the physical
time.
The logical clock C is a function that maps an event e in a distributed
system to an element in the time domain T, denoted as C(e) and called the
timestamp of e, and is defined as follows:
C : H → T
such that the following property is satisfied:
for two events ei and ej , ei → ej =⇒ C(ei ) < C(ej ).
This monotonicity property is called the clock consistency condition.
When T and C satisfy the following condition,
for two events ei and ej , ei → ej ⇔ C(ei ) < C(ej )
the system of clocks is said to be strongly consistent.
Implementing Logical Clocks
Implementation of logical clocks requires addressing two issues: data
structures local to every process to represent logical time and a protocol to
update the data structures to ensure the consistency condition.
Each process pi maintains data structures that allow it the following two
capabilities:
◮ A local logical clock, denoted by lci , that helps process pi measure its own
progress.
◮ A logical global clock, denoted by gci , that is a representation of process pi ’s
local view of the logical global time. Typically, lci is a part of gci .
The protocol ensures that a process’s logical clock, and thus its view of the global
time, is managed consistently. The protocol consists of the following two rules:
R1: This rule governs how the local logical clock is updated by a process
when it executes an event.
R2: This rule governs how a process updates its global logical clock to
update its view of the global time and global progress.
Systems of logical clocks differ in their representation of logical time and also
in the protocol to update the logical clocks.
Scalar Time
Proposed by Lamport in 1978 as an attempt to totally order events in a
distributed system.
Time domain is the set of non-negative integers.
The logical local clock of a process pi and its local view of the global time
are squashed into one integer variable Ci .
Rules R1 and R2 to update the clocks are as follows:
R1: Before executing an event (send, receive, or internal), process pi executes
the following:
Ci := Ci + d (d > 0)
In general, every time R1 is executed, d can have a different value; however,
typically d is kept at 1.
R2: Each message piggybacks the clock value of its sender at sending time.
When a process pi receives a message with timestamp Cmsg , it executes the
following actions:
◮ Ci := max(Ci , Cmsg )
◮ Execute R1.
◮ Deliver the message.
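As an illustration, the two rules can be sketched in a few lines of Python (the class and method names are this sketch's own, and d is fixed at 1, the typical choice):

    class ScalarClock:
        """Lamport scalar clock: rules R1 and R2 with d = 1."""

        def __init__(self):
            self.c = 0  # Ci: the local clock, also the local view of global time

        def tick(self):
            # R1: executed before any send, receive, or internal event.
            self.c += 1
            return self.c

        def send(self):
            # R2, sender side: the returned value is piggybacked on the message.
            return self.tick()

        def receive(self, c_msg):
            # R2, receiver side: take the max, execute R1, then deliver.
            self.c = max(self.c, c_msg)
            return self.tick()

For instance, if p1 sends a message carrying timestamp 2 to p2 whose clock reads 1, p2 delivers the message with timestamp 3.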
Figure 3.1 shows evolution of scalar time.
Evolution of scalar time:
[Figure 3.1: The space-time diagram of a distributed execution, showing scalar clock values at processes p1, p2, and p3.]
Basic Properties
Consistency Property
Scalar clocks satisfy the monotonicity and hence the consistency property:
for two events ei and ej , ei → ej =⇒ C(ei ) < C(ej ).
Total Ordering
Scalar clocks can be used to totally order events in a distributed system.
The main problem in totally ordering events is that two or more events at different processes may have identical timestamps.
For example, in Figure 3.1, the third event of process P1 and the second event of process P2 have identical scalar timestamps.
A tie-breaking mechanism is needed to order such events. A tie is broken as
follows:
Process identifiers are linearly ordered, and a tie among events with identical scalar timestamps is broken on the basis of their process identifiers.
The lower the process identifier in the ranking, the higher the priority.
The timestamp of an event is denoted by a tuple (t, i) where t is its time of
occurrence and i is the identity of the process where it occurred.
The total order relation ≺ on two events x and y with timestamps (h,i) and
(k,j), respectively, is defined as follows:
x ≺ y ⇔ (h < k or (h = k and i < j))
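In Python, this comparison can be sketched directly (the function name is this sketch's own):

    def precedes(x, y):
        """x and y are (scalar timestamp, process id) pairs; returns x ≺ y."""
        (h, i), (k, j) = x, y
        return h < k or (h == k and i < j)

    # Events with identical scalar timestamp 3 at P1 and P2:
    # the tie is broken by the process identifier.
    assert precedes((3, 1), (3, 2))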
Event counting
If the increment value d is always 1, the scalar time has the following
interesting property: if event e has a timestamp h, then h-1 represents the
minimum logical duration, counted in units of events, required before
producing the event e; we call this the height of the event e.
In other words, h-1 events have been produced sequentially before the event e
regardless of the processes that produced these events.
For example, in Figure 3.1, five events precede event b on the longest causal
path ending at b.
No Strong Consistency
The system of scalar clocks is not strongly consistent; that is, for two events ei and ej, C(ei) < C(ej) ⇏ ei → ej.
For example, in Figure 3.1, the third event of process P1 has a smaller scalar timestamp than the third event of process P2. However, the former did not happen before the latter.
The reason that scalar clocks are not strongly consistent is that the logical local clock and the logical global clock of a process are squashed into one, resulting in the loss of causal dependency information among events at different processes.
For example, in Figure 3.1, when process P2 receives the first message from
process P1, it updates its clock to 3, forgetting that the timestamp of the
latest event at P1 on which it depends is 2.
Vector Time
The system of vector clocks was developed independently by Fidge, Mattern
and Schmuck.
In the system of vector clocks, the time domain is represented by a set of
n-dimensional non-negative integer vectors.
Each process pi maintains a vector vti [1..n], where vti [i] is the local logical
clock of pi and describes the logical time progress at process pi .
vti[j] represents process pi's latest knowledge of process pj's local time.
If vti[j] = x, then process pi knows that the local time at process pj has progressed up to x.
The entire vector vti constitutes pi ’s view of the global logical time and is
used to timestamp events.
Process pi uses the following two rules R1 and R2 to update its clock:
R1: Before executing an event, process pi updates its local logical time as
follows:
vti [i] := vti [i] + d (d > 0)
R2: Each message m is piggybacked with the vector clock vt of the sender
process at sending time. On the receipt of such a message (m,vt), process pi
executes the following sequence of actions:
◮ Update its global logical time as follows:
1 ≤ k ≤ n : vti [k] := max(vti [k], vt[k])
◮ Execute R1.
◮ Deliver the message m.
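A minimal Python sketch of rules R1 and R2 for vector clocks (the class layout is this sketch's own; d is fixed at 1):

    class VectorClock:
        """Vector clock: rules R1 and R2 with d = 1."""

        def __init__(self, n, i):
            self.i = i         # index of this process
            self.vt = [0] * n  # vti[1..n], initially all zeros

        def tick(self):
            # R1: increment own entry before executing any event.
            self.vt[self.i] += 1

        def send(self):
            # R2, sender side: piggyback a copy of the whole vector.
            self.tick()
            return list(self.vt)

        def receive(self, vt_msg):
            # R2, receiver side: component-wise max, then R1, then deliver.
            self.vt = [max(a, b) for a, b in zip(self.vt, vt_msg)]
            self.tick()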
The timestamp of an event is the value of the vector clock of its process
when the event is executed.
Figure 3.2 shows an example of vector clock progress with the increment value d = 1.
Initially, each vector clock is [0, 0, . . . , 0].
An Example of Vector Clocks
[Figure 3.2: Evolution of vector time (d = 1).]
Comparing Vector Timestamps
The following relations are defined to compare two vector timestamps, vh
and vk:
vh = vk ⇔ ∀x : vh[x] = vk[x]
vh ≤ vk ⇔ ∀x : vh[x] ≤ vk[x]
vh < vk ⇔ vh ≤ vk and ∃x : vh[x] < vk[x]
vh ∥ vk ⇔ ¬(vh < vk) ∧ ¬(vk < vh)
If the process at which an event occurred is known, the test to compare two
timestamps can be simplified as follows: If events x and y respectively
occurred at processes pi and pj and are assigned timestamps vh and vk,
respectively, then
x → y ⇔ vh[i] ≤ vk[i]
x ∥ y ⇔ vh[i] > vk[i] ∧ vh[j] < vk[j]
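These relations translate directly into Python (function names are this sketch's own; vh and vk are equal-length lists, and the simplified tests assume distinct events x at pi and y at pj):

    def vt_le(vh, vk):
        return all(a <= b for a, b in zip(vh, vk))

    def vt_lt(vh, vk):
        # vh < vk: less-or-equal everywhere and strictly less somewhere.
        return vt_le(vh, vk) and any(a < b for a, b in zip(vh, vk))

    def concurrent(vh, vk):
        return not vt_lt(vh, vk) and not vt_lt(vk, vh)

    def happened_before(vh, i, vk):
        # Simplified test when x is known to have occurred at process i.
        return vh[i] <= vk[i]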
Properties of Vector Time
Isomorphism
If events in a distributed system are timestamped using a system of vector
clocks, we have the following property.
If two events x and y have timestamps vh and vk, respectively, then
x → y ⇔ vh < vk
x ∥ y ⇔ vh ∥ vk.
Thus, there is an isomorphism between the set of partially ordered events
produced by a distributed computation and their vector timestamps.
Strong Consistency
The system of vector clocks is strongly consistent; thus, by examining the
vector timestamp of two events, we can determine if the events are causally
related.
However, Charron-Bost showed that the dimension of vector clocks cannot be
less than n, the total number of processes in the distributed computation, for
this property to hold.
Event Counting
If d = 1 (in rule R1), then the ith component of the vector clock at process pi, vti[i], denotes the number of events that have occurred at pi until that instant.
So, if an event e has timestamp vh, vh[j] denotes the number of events executed by process pj that causally precede e. Clearly, Σj vh[j] − 1 represents the total number of events that causally precede e in the distributed computation.
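For instance, with a hypothetical timestamp vh = [2, 3, 1] in a three-process system (d = 1):

    vh = [2, 3, 1]           # timestamp of some event e
    assert sum(vh) - 1 == 5  # five events causally precede e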
Efficient Implementations of Vector Clocks
If the number of processes in a distributed computation is large, then vector clocks will require piggybacking a huge amount of information on messages.
The message overhead grows linearly with the number of processes in the system; when there are thousands of processes, the message size becomes huge even if only a few events occur at a few processes.
We discuss an efficient way to maintain vector clocks.
Charron-Bost showed that if vector clocks have to satisfy the strong
consistency property, then in general vector timestamps must be at least of
size n, the total number of processes.
However, optimizations are possible; next, we discuss a technique to implement vector clocks efficiently.
Singhal-Kshemkalyani’s Differential Technique
Singhal-Kshemkalyani’s differential technique is based on the observation that
between successive message sends to the same process, only a few entries of
the vector clock at the sender process are likely to change.
When a process pi sends a message to a process pj , it piggybacks only those
entries of its vector clock that differ since the last message sent to pj .
If entries i1, i2, . . . , i_{n1} of the vector clock at pi have changed to v1, v2, . . . , v_{n1}, respectively, since the last message sent to pj, then process pi piggybacks a compressed timestamp of the form
{(i1, v1), (i2, v2), . . . , (i_{n1}, v_{n1})}
on the next message to pj.
When pj receives this message, it updates its vector clock as follows:
vtj[i_k] := max(vtj[i_k], v_k) for k = 1, 2, . . . , n1.
Thus this technique cuts down the message size, communication bandwidth
and buffer (to store messages) requirements.
In the worst case, every element of the vector clock has been updated at pi since the last message to process pj, and the next message from pi to pj will need to carry the entire vector timestamp of size n.
However, on average, the size of the timestamp on a message will be less than n.
Implementation of this technique requires each process to remember the
vector timestamp in the message last sent to every other process.
Direct implementation of this will result in O(n²) storage overhead at each process.
Singhal and Kshemkalyani developed a clever technique that cuts down this
storage overhead at each process to O(n). The technique works in the
following manner:
Process pi maintains the following two additional vectors:
◮ LSi [1..n] (‘Last Sent’):
LSi [j] indicates the value of vti [i] when process pi last sent a message to
process pj .
◮ LUi [1..n] (‘Last Update’):
LUi [j] indicates the value of vti [i] when process pi last updated the entry vti [j].
Clearly, LUi [i] = vti [i] at all times and LUi [j] needs to be updated only when
the receipt of a message causes pi to update entry vti [j]. Also, LSi [j] needs
to be updated only when pi sends a message to pj .
Since the last communication from pi to pj , only those elements of vector
clock vti [k] have changed for which LSi [j] < LUi [k] holds.
Hence, only these elements need to be sent in a message from pi to pj.
When pi sends a message to pj , it sends only a set of tuples
{(x, vti [x])|LSi [j] < LUi [x]}
as the vector timestamp to pj , instead of sending a vector of n entries in a
message.
Thus the entire vector of size n is not sent along with a message. Instead,
only the elements in the vector clock that have changed since the last
message send to that process are sent in the format
{(p1, latest value), (p2, latest value), . . .}, where pi indicates that the pi-th component of the vector clock has changed.
This technique requires that the communication channels follow FIFO
discipline for message delivery.
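A Python sketch of this bookkeeping, under the FIFO assumption above (the class layout is this sketch's own):

    class DiffVectorClock:
        """Vector clock with Singhal-Kshemkalyani compressed timestamps."""

        def __init__(self, n, i):
            self.i = i
            self.vt = [0] * n  # the vector clock itself
            self.LS = [0] * n  # LS[j]: value of vt[i] at the last send to pj
            self.LU = [0] * n  # LU[k]: value of vt[i] when vt[k] last changed

        def tick(self):
            self.vt[self.i] += 1
            self.LU[self.i] = self.vt[self.i]  # own entry just changed

        def send_to(self, j):
            self.tick()
            # Piggyback only the entries updated since the last send to pj.
            delta = {k: self.vt[k] for k in range(len(self.vt))
                     if self.LS[j] < self.LU[k]}
            self.LS[j] = self.vt[self.i]
            return delta

        def receive(self, delta):
            changed = [k for k, v in delta.items() if v > self.vt[k]]
            for k in changed:
                self.vt[k] = delta[k]
            self.tick()  # the receive itself is an event
            for k in changed:
                self.LU[k] = self.vt[self.i]  # record when vt[k] changed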
This method is illustrated in Figure 3.3. For instance, the second message
from p3 to p2 (which contains a timestamp {(3, 2)}) informs p2 that the third
component of the vector clock has been modified and the new value is 2.
This is because the process p3 (indicated by the third component of the
vector) has advanced its clock value from 1 to 2 since the last message sent
to p2.
This technique substantially reduces the cost of maintaining vector clocks in
large systems, especially if the process interactions exhibit temporal or spatial
localities.
[Figure 3.3: Vector clocks progress in the Singhal-Kshemkalyani technique at processes p1, p2, p3, and p4; the messages carry compressed timestamps {(1,1)}, {(3,1)}, {(3,2)}, {(3,4),(4,1)}, and {(4,1)}.]
Matrix Time
In a system of matrix clocks, the time is represented by a set of n × n matrices of
non-negative integers.
A process pi maintains a matrix mti [1..n, 1..n] where,
mti [i, i] denotes the local logical clock of pi and tracks the progress of the
computation at process pi .
mti [i, j] denotes the latest knowledge that process pi has about the local
logical clock, mtj [j, j], of process pj .
mti [j, k] represents the knowledge that process pi has about the latest
knowledge that pj has about the local logical clock, mtk [k, k], of pk .
The entire matrix mti denotes pi ’s local view of the global logical time.
Process pi uses the following rules R1 and R2 to update its clock:
R1 : Before executing an event, process pi updates its local logical time as
follows:
mti [i, i] := mti [i, i] + d (d > 0)
R2: Each message m is piggybacked with matrix time mt. When pi receives
such a message (m,mt) from a process pj , pi executes the following sequence
of actions:
◮ Update its global logical time as follows:
(a) 1 ≤ k ≤ n : mti [i, k] := max(mti [i, k], mt[j, k])
(That is, update its row mti[i, ∗] with pj's row in the received timestamp, mt.)
(b) 1 ≤ k, l ≤ n : mti [k, l] := max(mti [k, l], mt[k, l])
◮ Execute R1.
◮ Deliver message m.
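A Python sketch of these rules (the class layout is this sketch's own; d = 1):

    class MatrixClock:
        """Matrix clock: rules R1 and R2 with d = 1."""

        def __init__(self, n, i):
            self.n, self.i = n, i
            self.mt = [[0] * n for _ in range(n)]  # mti[1..n, 1..n]

        def tick(self):
            # R1: increment own local logical clock.
            self.mt[self.i][self.i] += 1

        def send(self):
            self.tick()
            return [row[:] for row in self.mt]  # piggyback a copy of mt

        def receive(self, j, mt_msg):
            # R2(a): merge the sender pj's row into own row.
            for k in range(self.n):
                self.mt[self.i][k] = max(self.mt[self.i][k], mt_msg[j][k])
            # R2(b): component-wise max over the entire matrix.
            for k in range(self.n):
                for l in range(self.n):
                    self.mt[k][l] = max(self.mt[k][l], mt_msg[k][l])
            self.tick()  # then execute R1 and deliver the message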
Figure 3.4 gives an example to illustrate how matrix clocks progress in a
distributed computation. We assume d=1.
Let us consider the following events: e, the xi-th event at process pi; e1_k and e2_k, the x1_k-th and x2_k-th events at process pk; and e1_j and e2_j, the x1_j-th and x2_j-th events at pj.
Let mte denote the matrix timestamp associated with event e. Due to message m4, e2_k is the last event of pk that causally precedes e; therefore, we have mte[i, k] = mte[k, k] = x2_k.
Likewise, mte[i, j] = mte[j, j] = x2_j. The last event of pk known by pj, to the knowledge of pi when it executed event e, is e1_k; therefore, mte[j, k] = x1_k.
Likewise, we have mte[k, j] = x1_j.
[Figure 3.4: Evolution of matrix time, showing event e at pi and the events of pj and pk that determine the entries of mte.]
Basic Properties
Vector mti [i, .] contains all the properties of vector clocks.
In addition, matrix clocks have the following property:
min_k(mti[k, l]) ≥ t ⇒ process pi knows that every other process pk knows that pl's local time has progressed up to t.
◮ If this is true, it is clear that process pi knows that all other processes know
that pl will never send information with a local time ≤ t.
◮ In many applications, this implies that processes will no longer require from pl
certain information and can use this fact to discard obsolete information.
If d is always 1 in rule R1, then mti[k, l] denotes the number of events that have occurred at pl and are known to pk, as far as pi's knowledge is concerned.
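As a sketch of how an application might exploit the min_k property above (a hypothetical helper, not part of the original text):

    def can_discard(mt, l, t):
        """True once every process, to pi's knowledge (matrix mt), knows that
        pl's local time has progressed up to t; information received from pl
        stamped with a local time <= t is then obsolete."""
        return min(row[l] for row in mt) >= t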
Virtual Time
A virtual time system is a paradigm for organizing and synchronizing distributed systems.
This section provides a description of virtual time and its implementation using the Time Warp mechanism.
The implementation of virtual time using the Time Warp mechanism works on the basis of an optimistic assumption.
Time Warp relies on the general lookahead-rollback mechanism where each
process executes without regard to other processes having synchronization
conflicts.
If a conflict is discovered, the offending processes are rolled back to the time
just before the conflict and executed forward along the revised path.
Detection of conflicts and rollbacks are transparent to users.
The implementation of virtual time using the Time Warp mechanism makes the following optimistic assumption: synchronization conflicts, and hence rollbacks, generally occur rarely.
Next, we discuss virtual time in detail and how the Time Warp mechanism is used to implement it.
Virtual Time Definition
“Virtual time is a global, one dimensional, temporal coordinate system on a
distributed computation to measure the computational progress and to define
synchronization.”
A virtual time system is a distributed system executing in coordination with
an imaginary virtual clock that uses virtual time.
Virtual times are real values that are totally ordered by the less than relation,
“<”.
Virtual time is implemented as a collection of several loosely synchronized local virtual clocks.
These local virtual clocks generally move forward to higher virtual times; however, occasionally they move backwards.
Processes run concurrently and communicate with each other by exchanging
messages.
Every message is characterized by four values:
a) Name of the sender
b) Virtual send time
c) Name of the receiver
d) Virtual receive time
Virtual send time is the virtual time at the sender when the message is sent,
whereas virtual receive time specifies the virtual time when the message must
be received (and processed) by the receiver.
A problem arises when a message arrives at a process late, that is, when the virtual receive time of the message is less than the local virtual time at the receiver process when the message arrives.
Virtual time systems are subject to two semantic rules similar to Lamport’s
clock conditions:
◮ Rule 1: Virtual send time of each message < virtual receive time of that
message.
◮ Rule 2: Virtual time of each event in a process < Virtual time of next event in
that process.
The above two rules imply that a process sends all messages in increasing
order of virtual send time and a process receives (and processes) all messages
in the increasing order of virtual receive time.
Causality of events is an important concept in distributed systems and is also
a major constraint in the implementation of virtual time.
It is important that an event that causes another be completely executed before the caused event is processed.
The constraint in the implementation of virtual time can be stated as follows:
“If an event A causes event B, then the execution of A and B must be
scheduled in real time so that A is completed before B starts”.
If event A has an earlier virtual time than event B, we need to execute A before B only if there is a causal chain from A to B.
If there is no such causal chain, better performance can be achieved by scheduling A concurrently with B, or even scheduling A after B.
If A and B have exactly the same virtual time coordinate, then there is no
restriction on the order of their scheduling.
If A and B are distinct events, they will have different virtual space
coordinates (since they occur at different processes) and neither will be a
cause for the other.
To sum it up, events with virtual time < ‘t’ complete before the starting of
events at time ‘t’ and events with virtual time > ‘t’ will start only after
events at time ‘t’ are complete.
Characteristics of Virtual Time
1 Virtual time systems are not all isomorphic; virtual time may be either discrete or continuous.
2 Virtual time may be only partially ordered.
3 Virtual time may be related to real time or may be independent of it.
4 Virtual time systems may be visible to programmers and manipulated
explicitly as values, or hidden and manipulated implicitly according to some
system-defined discipline.
5 Virtual times associated with events may be explicitly calculated by user
programs or they may be assigned by fixed rules.
Comparison with Lamport’s Logical Clocks
In Lamport's logical clocks, an artificial clock is created, one for each process, with unique labels drawn from a totally ordered set in a manner consistent with the partial order.
In virtual time, the reverse of the above is done by assuming that every event
is labeled with a clock value from a totally ordered virtual time scale
satisfying Lamport’s clock conditions.
Thus the Time Warp mechanism is an inverse of Lamport’s scheme.
In Lamport’s scheme, all clocks are conservatively maintained so that they
never violate causality.
A process advances its clock as soon as it learns of a new causal dependency.
In virtual time, clocks are optimistically advanced, and corrective actions are taken whenever a violation is detected.
Lamport’s initial idea brought about the concept of virtual time but the
model failed to preserve causal independence.
Time Warp Mechanism
In the implementation of virtual time using the Time Warp mechanism, the virtual receive time of a message is considered its timestamp.
The necessary and sufficient condition for the correct implementation of virtual time is that each process handle incoming messages in timestamp order.
This is highly undesirable and restrictive because process speeds and message delays are likely to be highly variable.
It is natural for some processes to get ahead of other processes in virtual time.
It is impossible for a process on the basis of local information alone to block
and wait for the message with the next timestamp.
It is always possible that a message with earlier timestamp arrives later.
So, when a process executes a message, it is very difficult for it to determine whether a message with an earlier timestamp will arrive later.
This is the central problem in virtual time that is solved by the Time Warp
mechanism.
The Time Warp mechanism assumes that message communication is reliable; however, messages may not be delivered in FIFO order.
The Time Warp mechanism consists of two major parts: the local control mechanism and the global control mechanism.
The local control mechanism ensures that events are executed and messages are processed in the correct order.
The global control mechanism takes care of global issues such as global
progress, termination detection, I/O error handling, flow control, etc.
The Local Control Mechanism
There is no global virtual clock variable in this implementation; each process
has a local virtual clock variable.
The local virtual clock of a process does not change during an event at that process; it changes only between events.
On processing the next message from the input queue, the process advances its local clock to the timestamp of that message.
At any instant, the value of virtual time may differ for each process but the
value is transparent to other processes in the system.
When a message is sent, the virtual send time is copied from the sender’s
virtual clock while the name of the receiver and virtual receive time are
assigned based on application-specific context.
All arriving messages at a process are stored in an input queue in the
increasing order of timestamps (receive times).
Processes will receive late messages due to factors such as different
computation rates of processes and network delays.
The semantics of virtual time demands that incoming messages be received
by each process strictly in the timestamp order.
This is accomplished as follows:
“On the reception of a late message, the receiver rolls back to an earlier
virtual time, cancelling all intermediate side effects and then executes forward
again by executing the late message in the proper sequence.”
If all the messages in the input queue of a process have been processed, the process is said to terminate and its clock is set to +∞.
However, the process is not destroyed, as a late message may arrive and cause it to roll back and execute again.
Thus, each process is doing a constant “lookahead”, processing future
messages from its input queue.
Over a lengthy computation, each process may roll back several times while generally progressing forward, with rollback completely transparent to other processes in the system.
Rollback in a distributed system is complicated: a process that wants to roll back might have sent many messages to other processes, which in turn might have sent many messages to other processes, and so on, leading to deep side effects.
For rollback, messages must be effectively “unsent” and their side effects
should be undone. This is achieved efficiently by using antimessages.
Antimessages and the Rollback Mechanism
Runtime representation of a process is composed of the following:
Process name: The virtual space coordinate, which is unique in the system.
Local virtual clock: Virtual time coordinate
State: Data space of the process including execution stack, program counter
and its own variables
State queue: Contains saved copies of the process's recent states, since rollback under the Time Warp mechanism requires that process states be saved.
Input queue: Contains all recently arrived messages in order of virtual receive time. Processed messages are not deleted from the input queue, as a future rollback may require them to be processed again.
Output queue: Contains negative copies (antimessages) of messages the process has recently sent, in virtual send time order. They are needed in case of a rollback.
For every message, there exists an antimessage that is the same in content but
opposite in sign.
Whenever a process sends a message, a copy of the message is transmitted to
receiver’s input queue and a negative copy (antimessage) is retained in the
sender’s output queue for use in sender rollback.
Whenever a message and its antimessage appear in the same queue, no matter in which order they arrived, they immediately annihilate each other, shortening the queue.
When a message arrives at the input queue of a process with a timestamp greater than the virtual clock time of its destination process, it is simply enqueued.
When the destination process's virtual time is greater than the virtual receive time of the message, the process must roll back.
Rollback Mechanism
Search the state queue for the last saved state with a timestamp less than the timestamp of the received message, and restore it.
Set the local virtual clock to the timestamp of the received message and discard from the state queue all states saved after this time. Then resume execution forward from this point.
Now all the messages that are sent between the current state and earlier
state must be “unsent”. This is taken care of by executing a simple rule:
“To unsend a message, simply transmit its antimessage.”
This results in antimessages following the positive ones to the destination. A negative message causes a rollback at its destination if its virtual receive time is less than the receiver's virtual time.
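A highly simplified Python sketch of these rollback steps (the attribute names and surrounding runtime objects are assumptions of this sketch; a real Time Warp implementation carries far more bookkeeping):

    def roll_back(process, msg_time, network):
        # Restore the last state saved with a timestamp less than the
        # straggler's, discarding states saved at or after that time.
        while process.state_queue[-1].time >= msg_time:
            process.state_queue.pop()
        process.state = process.state_queue[-1].snapshot
        process.clock = msg_time
        # "Unsend" messages sent after the restored time by transmitting the
        # antimessages kept in the output queue; each may trigger a rollback
        # at its destination.
        while process.output_queue and \
                process.output_queue[-1].send_time >= msg_time:
            network.transmit(process.output_queue.pop())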
Depending on the timing, there are several possibilities at the receiver’s end:
First, the original (positive) message has arrived but not yet been processed
at the receiver.
In this case, the negative message causes no rollback; it simply annihilates with the positive message, leaving the receiver with no record of that message.
Second, the original positive message has already been partially or completely
processed by the receiver.
In this case, the negative message causes the receiver to roll back to a virtual
time when the positive message was received.
It will also annihilate the positive message, leaving the receiver with no record that the message existed. When the receiver executes again, the execution will assume that this message never existed.
A rolled back process may send antimessages to other processes.
A negative message can also arrive at the destination before the positive one.
In this case, it is enqueued and will be annihilated when the positive message arrives.
If it is the negative message's turn to be executed at a process's input queue, the receiver may take any action, such as a no-op.
Any action taken will eventually be rolled back when the corresponding
positive message arrives.
An optimization would be to skip the antimessage in the input queue and treat it as a no-op; when the corresponding positive message arrives, it will annihilate the negative message and inhibit any rollback.
The antimessage protocol has several advantages:
It is extremely robust and works under all possible circumstances.
It is free from deadlocks as there is no blocking.
It is also free from domino effects.
In the worst case, all processes in the system roll back to the same virtual time as the original one did and then proceed forward again.
Physical Clock Synchronization: NTP
Motivation
In centralized systems, there is only a single clock. A process gets the time by
simply issuing a system call to the kernel.
In distributed systems, there is no global clock or common memory. Each
processor has its own internal clock and its own notion of time.
These clocks can easily drift seconds per day, accumulating significant errors
over time.
Also, because different clocks tick at different rates, they may not always remain synchronized even if they were synchronized when they started.
This clearly poses serious problems to applications that depend on a
synchronized notion of time.
For most applications and algorithms that run in a distributed system, we
need to know time in one or more of the following contexts:
◮ The time of the day at which an event happened on a specific machine in the
network.
◮ The time interval between two events that happened on different machines in
the network.
◮ The relative ordering of events that happened on different machines in the
network.
Unless the clocks in each machine have a common notion of time, time-based
queries cannot be answered.
Clock synchronization has a significant effect on many problems like secure
systems, fault diagnosis and recovery, scheduled operations, database
systems, and real-world clock values.
Clock synchronization is the process of ensuring that physically distributed
processors have a common notion of time.
Due to differing clock rates, the clocks at various sites may diverge with time, and periodically clock synchronization must be performed to correct this clock skew in distributed systems.
Clocks are synchronized to an accurate real-time standard like UTC
(Universal Coordinated Time).
Clocks that must not only be synchronized with each other but also have to
adhere to physical time are termed physical clocks.
Definitions and Terminology
Let Ca and Cb be any two clocks.
Time: The time of a clock in a machine p is given by the function Cp(t),
where Cp(t) = t for a perfect clock.
Frequency: Frequency is the rate at which a clock progresses. The frequency at time t of clock Ca is C′a(t).
Offset: Clock offset is the difference between the time reported by a clock and the real time. The offset of the clock Ca is given by Ca(t) − t. The offset of clock Ca relative to Cb at time t ≥ 0 is given by Ca(t) − Cb(t).
Skew: The skew of a clock is the difference in the frequencies of the clock and the perfect clock. The skew of a clock Ca relative to clock Cb at time t is C′a(t) − C′b(t). If the skew is bounded by ρ, then, as per Equation (1), clock values are allowed to diverge at a rate in the range 1 − ρ to 1 + ρ.
Drift (rate): The drift of clock Ca is the second derivative of the clock value with respect to time, namely, C′′a(t). The drift of clock Ca relative to clock Cb at time t is C′′a(t) − C′′b(t).
Clock Inaccuracies
Physical clocks are synchronized to an accurate real-time standard like UTC
(Universal Coordinated Time).
However, due to the clock inaccuracy discussed above, a timer (clock) is said to be working within its specification if

1 − ρ ≤ dC/dt ≤ 1 + ρ (1)

where the constant ρ is the maximum skew rate specified by the manufacturer.
Figure 3.5 illustrates the behavior of fast, slow, and perfect clocks with
respect to UTC.
[Figure 3.5: The behavior of fast (dC/dt > 1), slow (dC/dt < 1), and perfect (dC/dt = 1) clocks with respect to UTC.]
Offset delay estimation method
The Network Time Protocol (NTP), which is widely used for clock synchronization on the Internet, uses the offset delay estimation method.
The design of NTP involves a hierarchical tree of time servers.
◮ The primary server at the root synchronizes with the UTC.
◮ The next level contains secondary servers, which act as a backup to the
primary server.
◮ At the lowest level is the synchronization subnet which has the clients.
Clock offset and delay estimation:
In practice, a source node cannot accurately estimate the local time on the target
node due to varying message or network delays between the nodes.
This protocol employs a common practice of performing several trials and
chooses the trial with the minimum delay.
Figure 3.6 shows how NTP timestamps are numbered and exchanged
between peers A and B.
Let T1, T2, T3, T4 be the values of the four most recent timestamps as shown.
Assume clocks A and B are stable and running at the same speed.
[Figure 3.6: Offset and delay estimation between peers A and B, using timestamps T1, T2, T3, T4.]
Let a = T1 − T3 and b = T2 − T4.
If the network delay difference from A to B and from B to A, called the differential delay, is small, the clock offset θ and roundtrip delay δ of B relative to A at time T4 are approximately given by:

θ = (a + b)/2, δ = a − b (2)
Each NTP message includes the latest three timestamps T1, T2 and T3,
while T4 is determined upon arrival.
Thus, both peers A and B can independently calculate delay and offset using
a single bidirectional message stream as shown in Figure 3.7.
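A direct Python rendering of Equation (2) (illustrative only; t1 through t4 stand for the four timestamps of Figure 3.6):

    def offset_and_delay(t1, t2, t3, t4):
        """Clock offset theta and roundtrip delay delta of B relative to A
        at time T4, per Equation (2)."""
        a = t1 - t3
        b = t2 - t4
        return (a + b) / 2, a - b  # (theta, delta)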
[Figure 3.7: Timing diagram for the two servers A and B, with timestamps Ti−3, Ti−2, Ti−1, and Ti.]
The Network Time Protocol synchronization protocol
A pair of servers in symmetric mode exchange pairs of timing messages.
A store of data is then built up about the relationship between the two
servers (pairs of offset and delay).
Specifically, assume that each peer maintains pairs (Oi, Di), where Oi is a measure of the offset (θ) and Di is the transmission delay of the two messages (δ).
The offset corresponding to the minimum delay is chosen.
Specifically, the delay and offset are calculated as follows. Assume that message m takes time t to transfer and m′ takes time t′ to transfer.
The offset between A’s clock and B’s clock is O. If A’s local clock time is
A(t) and B’s local clock time is B(t), we have
A(t) = B(t) + O (3)
Then,
Ti−2 = Ti−3 + t + O (4)
Ti = Ti−1 − O + t′ (5)
Assuming t = t′, the offset Oi can be estimated as:
Oi = (Ti−2 − Ti−3 + Ti−1 − Ti)/2 (6)
The round-trip delay is estimated as:
Di = (Ti − Ti−3) − (Ti−1 − Ti−2) (7)
The eight most recent pairs of (Oi , Di ) are retained.
The value of Oi that corresponds to the minimum Di is chosen to estimate O.
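A Python sketch of this filtering (Equations (6) and (7) plus selection of the offset with the minimum delay; function names are this sketch's own):

    def offset_delay(t_i3, t_i2, t_i1, t_i):
        """O_i and D_i from timestamps T_{i-3}, T_{i-2}, T_{i-1}, T_i,
        per Equations (6) and (7)."""
        o_i = ((t_i2 - t_i3) + (t_i1 - t_i)) / 2
        d_i = (t_i - t_i3) - (t_i1 - t_i2)
        return o_i, d_i

    def estimate_offset(pairs):
        """pairs: recent (O_i, D_i) samples; keep the eight most recent and
        choose the offset corresponding to the minimum delay."""
        best_o, _ = min(pairs[-8:], key=lambda od: od[1])
        return best_o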
A. Kshemkalyani and M. Singhal (Distributed Computing) Logical Time CUP 2008 67 / 67

More Related Content

What's hot (20)

PPTX
Common Standards in Cloud Computing
mrzahidfaiz.blogspot.com
 
PPT
Types of Load distributing algorithm in Distributed System
DHIVYADEVAKI
 
PPTX
daa-unit-3-greedy method
hodcsencet
 
PPTX
AI Unification.pptx
AbhishekGupta413669
 
PPT
Introduction to MPI
Hanif Durad
 
PPTX
Operating Systems Chapter 6 silberschatz
GiulianoRanauro
 
PPTX
Structure of shared memory space
Coder Tech
 
PDF
Cs8493 unit 2
Kathirvel Ayyaswamy
 
PPT
Issues in cloud computing
ronak patel
 
PDF
Design issues of dos
vanamali_vanu
 
DOCX
Distributed system Tanenbaum chapter 1,2,3,4 notes
SAhammedShakil
 
PDF
Distributed Operating System_1
Dr Sandeep Kumar Poonia
 
PDF
Basic communication operations - One to all Broadcast
RashiJoshi11
 
PPTX
distributed Computing system model
Harshad Umredkar
 
PPT
Process Management-Process Migration
MNM Jain Engineering College
 
PPT
File replication
Klawal13
 
PPTX
2. Distributed Systems Hardware & Software concepts
Prajakta Rane
 
PPTX
CRYPTOGRAPHY & NETWORK SECURITY - unit 1
RAMESHBABU311293
 
PPT
3. distributed file system requirements
AbDul ThaYyal
 
PPTX
Cluster computing
Kajal Thakkar
 
Common Standards in Cloud Computing
mrzahidfaiz.blogspot.com
 
Types of Load distributing algorithm in Distributed System
DHIVYADEVAKI
 
daa-unit-3-greedy method
hodcsencet
 
AI Unification.pptx
AbhishekGupta413669
 
Introduction to MPI
Hanif Durad
 
Operating Systems Chapter 6 silberschatz
GiulianoRanauro
 
Structure of shared memory space
Coder Tech
 
Cs8493 unit 2
Kathirvel Ayyaswamy
 
Issues in cloud computing
ronak patel
 
Design issues of dos
vanamali_vanu
 
Distributed system Tanenbaum chapter 1,2,3,4 notes
SAhammedShakil
 
Distributed Operating System_1
Dr Sandeep Kumar Poonia
 
Basic communication operations - One to all Broadcast
RashiJoshi11
 
distributed Computing system model
Harshad Umredkar
 
Process Management-Process Migration
MNM Jain Engineering College
 
File replication
Klawal13
 
2. Distributed Systems Hardware & Software concepts
Prajakta Rane
 
CRYPTOGRAPHY & NETWORK SECURITY - unit 1
RAMESHBABU311293
 
3. distributed file system requirements
AbDul ThaYyal
 
Cluster computing
Kajal Thakkar
 

Similar to Distributed Computing (20)

PDF
Chapter 14 slides Distributed System Presentation
Nehal668249
 
PDF
Time in distributed systmes
mohammad amid abbasi
 
PPT
Time Global States -- Distributed System
sellyscrt
 
PPT
Clocks
guesta013ed8
 
PDF
Distributed computing time
Deepak John
 
PDF
Chapter14.pdfffasfdaddsdsvdsffdhhhahdfdfghhh
PRASAD BANOTH
 
PDF
6.Distributed Operating Systems
Dr Sandeep Kumar Poonia
 
PPT
dokumen.tips_synchronization-in-distributed-systems-chapter-6.ppt
samaghorab
 
PPTX
distributed systems all about the data science process, covering the steps pr...
palaniappancse
 
PPT
Chapter 6-Synchronozation2.ppt
MeymunaMohammed1
 
PPTX
Synchronization
Ameena Tijjani
 
PPTX
CST 402 Distributed Computing Module 2 Notes
sm8i4
 
PPT
Chap 5
suks_87
 
PPT
CS6601-Unit 4 Distributed Systems
Nandakumar P
 
PPTX
Synchronization in distributed computing
SVijaylakshmi
 
PPTX
Physical and Logical Clocks
Dilum Bandara
 
PPTX
slides.06.pptx
balewayalew
 
PDF
Synchonization in Distributed Systems.pdf
cAnhTrn53
 
PPTX
3. syncro. in distributed system
Gd Goenka University
 
Chapter 14 slides Distributed System Presentation
Nehal668249
 
Time in distributed systmes
mohammad amid abbasi
 
Time Global States -- Distributed System
sellyscrt
 
Clocks
guesta013ed8
 
Distributed computing time
Deepak John
 
Chapter14.pdfffasfdaddsdsvdsffdhhhahdfdfghhh
PRASAD BANOTH
 
6.Distributed Operating Systems
Dr Sandeep Kumar Poonia
 
dokumen.tips_synchronization-in-distributed-systems-chapter-6.ppt
samaghorab
 
distributed systems all about the data science process, covering the steps pr...
palaniappancse
 
Chapter 6-Synchronozation2.ppt
MeymunaMohammed1
 
Synchronization
Ameena Tijjani
 
CST 402 Distributed Computing Module 2 Notes
sm8i4
 
Chap 5
suks_87
 
CS6601-Unit 4 Distributed Systems
Nandakumar P
 
Synchronization in distributed computing
SVijaylakshmi
 
Physical and Logical Clocks
Dilum Bandara
 
slides.06.pptx
balewayalew
 
Synchonization in Distributed Systems.pdf
cAnhTrn53
 
3. syncro. in distributed system
Gd Goenka University
 
Ad

Recently uploaded (20)

PPTX
Heart Bleed Bug - A case study (Course: Cryptography and Network Security)
Adri Jovin
 
PPTX
GitOps_Repo_Structure for begeinner(Scaffolindg)
DanialHabibi2
 
PPTX
Solar Thermal Energy System Seminar.pptx
Gpc Purapuza
 
PPTX
Element 7. CHEMICAL AND BIOLOGICAL AGENT.pptx
merrandomohandas
 
PPTX
原版一样(Acadia毕业证书)加拿大阿卡迪亚大学毕业证办理方法
Taqyea
 
PPTX
MobileComputingMANET2023 MobileComputingMANET2023.pptx
masterfake98765
 
PPTX
artificial intelligence applications in Geomatics
NawrasShatnawi1
 
PPTX
The Role of Information Technology in Environmental Protectio....pptx
nallamillisriram
 
PDF
MAD Unit - 1 Introduction of Android IT Department
JappanMavani
 
PPTX
MPMC_Module-2 xxxxxxxxxxxxxxxxxxxxx.pptx
ShivanshVaidya5
 
PPTX
UNIT DAA PPT cover all topics 2021 regulation
archu26
 
PPTX
Introduction to Neural Networks and Perceptron Learning Algorithm.pptx
Kayalvizhi A
 
PPT
PPT2_Metal formingMECHANICALENGINEEIRNG .ppt
Praveen Kumar
 
PPTX
Green Building & Energy Conservation ppt
Sagar Sarangi
 
PDF
Ethics and Trustworthy AI in Healthcare – Governing Sensitive Data, Profiling...
AlqualsaDIResearchGr
 
DOCX
CS-802 (A) BDH Lab manual IPS Academy Indore
thegodhimself05
 
PDF
Water Design_Manual_2005. KENYA FOR WASTER SUPPLY AND SEWERAGE
DancanNgutuku
 
PDF
monopile foundation seminar topic for civil engineering students
Ahina5
 
PDF
Unified_Cloud_Comm_Presentation anil singh ppt
anilsingh298751
 
PDF
Basic_Concepts_in_Clinical_Biochemistry_2018كيمياء_عملي.pdf
AdelLoin
 
Heart Bleed Bug - A case study (Course: Cryptography and Network Security)
Adri Jovin
 
GitOps_Repo_Structure for begeinner(Scaffolindg)
DanialHabibi2
 
Solar Thermal Energy System Seminar.pptx
Gpc Purapuza
 
Element 7. CHEMICAL AND BIOLOGICAL AGENT.pptx
merrandomohandas
 
原版一样(Acadia毕业证书)加拿大阿卡迪亚大学毕业证办理方法
Taqyea
 
MobileComputingMANET2023 MobileComputingMANET2023.pptx
masterfake98765
 
artificial intelligence applications in Geomatics
NawrasShatnawi1
 
The Role of Information Technology in Environmental Protectio....pptx
nallamillisriram
 
MAD Unit - 1 Introduction of Android IT Department
JappanMavani
 
MPMC_Module-2 xxxxxxxxxxxxxxxxxxxxx.pptx
ShivanshVaidya5
 
UNIT DAA PPT cover all topics 2021 regulation
archu26
 
Introduction to Neural Networks and Perceptron Learning Algorithm.pptx
Kayalvizhi A
 
PPT2_Metal formingMECHANICALENGINEEIRNG .ppt
Praveen Kumar
 
Green Building & Energy Conservation ppt
Sagar Sarangi
 
Ethics and Trustworthy AI in Healthcare – Governing Sensitive Data, Profiling...
AlqualsaDIResearchGr
 
CS-802 (A) BDH Lab manual IPS Academy Indore
thegodhimself05
 
Water Design_Manual_2005. KENYA FOR WASTER SUPPLY AND SEWERAGE
DancanNgutuku
 
monopile foundation seminar topic for civil engineering students
Ahina5
 
Unified_Cloud_Comm_Presentation anil singh ppt
anilsingh298751
 
Basic_Concepts_in_Clinical_Biochemistry_2018كيمياء_عملي.pdf
AdelLoin
 
Ad

Distributed Computing

  • 1. Chapter 3: Logical Time Ajay Kshemkalyani and Mukesh Singhal Distributed Computing: Principles, Algorithms, and Systems Cambridge University Press A. Kshemkalyani and M. Singhal (Distributed Computing) Logical Time CUP 2008 1 / 67
  • 2. Distributed Computing: Principles, Algorithms, and Systems Introduction The concept of causality between events is fundamental to the design and analysis of parallel and distributed computing and operating systems. Usually causality is tracked using physical time. In distributed systems, it is not possible to have a global physical time. As asynchronous distributed computations make progress in spurts, the logical time is sufficient to capture the fundamental monotonicity property associated with causality in distributed systems. A. Kshemkalyani and M. Singhal (Distributed Computing) Logical Time CUP 2008 2 / 67
  • 3. Distributed Computing: Principles, Algorithms, and Systems Introduction This chapter discusses three ways to implement logical time - scalar time, vector time, and matrix time. Causality among events in a distributed system is a powerful concept in reasoning, analyzing, and drawing inferences about a computation. The knowledge of the causal precedence relation among the events of processes helps solve a variety of problems in distributed systems, such as distributed algorithms design, tracking of dependent events, knowledge about the progress of a computation, and concurrency measures. A. Kshemkalyani and M. Singhal (Distributed Computing) Logical Time CUP 2008 3 / 67
  • 4. Distributed Computing: Principles, Algorithms, and Systems A Framework for a System of Logical Clocks Definition A system of logical clocks consists of a time domain T and a logical clock C. Elements of T form a partially ordered set over a relation <. Relation < is called the happened before or causal precedence. Intuitively, this relation is analogous to the earlier than relation provided by the physical time. The logical clock C is a function that maps an event e in a distributed system to an element in the time domain T, denoted as C(e) and called the timestamp of e, and is defined as follows: C : H 7→ T such that the following property is satisfied: for two events ei and ej , ei → ej =⇒ C(ei ) < C(ej ). A. Kshemkalyani and M. Singhal (Distributed Computing) Logical Time CUP 2008 4 / 67
  • 5. Distributed Computing: Principles, Algorithms, and Systems A Framework for a System of Logical Clocks This monotonicity property is called the clock consistency condition. When T and C satisfy the following condition, for two events ei and ej , ei → ej ⇔ C(ei ) < C(ej ) the system of clocks is said to be strongly consistent. Implementing Logical Clocks Implementation of logical clocks requires addressing two issues: data structures local to every process to represent logical time and a protocol to update the data structures to ensure the consistency condition. Each process pi maintains data structures that allow it the following two capabilities: ◮ A local logical clock, denoted by lci , that helps process pi measure its own progress. A. Kshemkalyani and M. Singhal (Distributed Computing) Logical Time CUP 2008 5 / 67
  • 6. Implementing Logical Clocks R1: This rule governs how the local logical clock is updated by a process when it executes an event. R2: This rule governs how a process updates its global logical clock to update its view of the global time and global progress. Systems of logical clocks differ in their representation of logical time and also in the protocol used to update the logical clocks.
  • 7. Scalar Time Proposed by Lamport in 1978 as an attempt to totally order events in a distributed system. Time domain is the set of non-negative integers. The logical local clock of a process pi and its local view of the global time are squashed into one integer variable Ci . Rules R1 and R2 to update the clocks are as follows: R1: Before executing an event (send, receive, or internal), process pi executes the following: Ci := Ci + d (d > 0) In general, every time R1 is executed, d can have a different value; however, typically d is kept at 1.
  • 8. Scalar Time R2: Each message piggybacks the clock value of its sender at sending time. When a process pi receives a message with timestamp Cmsg , it executes the following actions: ◮ Ci := max(Ci , Cmsg ) ◮ Execute R1. ◮ Deliver the message. Figure 3.1 shows the evolution of scalar time.
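The two rules translate directly into a small amount of per-process state. The following Python sketch (illustrative only; the class and method names are our own, not from the text) implements R1 and R2 with d = 1:

    class ScalarClock:
        """Lamport scalar clock for one process (d = 1)."""

        def __init__(self):
            self.c = 0  # combined local clock and view of global time

        def tick(self):
            # R1: executed before any send, receive, or internal event.
            self.c += 1
            return self.c

        def send(self):
            # Piggyback the current clock value on the outgoing message.
            return self.tick()

        def receive(self, c_msg):
            # R2: take the max with the sender's timestamp, then apply R1.
            self.c = max(self.c, c_msg)
            return self.tick()

For example, if a process whose clock reads 4 receives a message stamped 9, receive(9) advances its clock to 10 before the message is delivered.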
  • 9. Scalar Time Evolution of scalar time: [Figure 3.1: The space-time diagram of a distributed execution; scalar timestamps are shown at the events of processes p1, p2, and p3, and one event is labeled b.]
  • 10. Basic Properties Consistency Property Scalar clocks satisfy the monotonicity and hence the consistency property: for two events ei and ej , ei → ej =⇒ C(ei ) < C(ej ). Total Ordering Scalar clocks can be used to totally order events in a distributed system. The main problem in totally ordering events is that two or more events at different processes may have identical timestamps. For example, in Figure 3.1, the third event of process P1 and the second event of process P2 have identical scalar timestamps.
  • 11. Total Ordering A tie-breaking mechanism is needed to order such events. A tie is broken as follows: Process identifiers are linearly ordered, and a tie among events with identical scalar timestamps is broken on the basis of their process identifiers. The lower the process identifier in the ranking, the higher the priority. The timestamp of an event is denoted by a tuple (t, i) where t is its time of occurrence and i is the identity of the process where it occurred. The total order relation ≺ on two events x and y with timestamps (h,i) and (k,j), respectively, is defined as follows: x ≺ y ⇔ (h < k or (h = k and i < j))
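The relation ≺ is simply a lexicographic comparison of (time, process-id) pairs, which a sketch can express in one line, since Python tuples already compare lexicographically:

    def precedes(x, y):
        # x and y are (timestamp, process_id) tuples; ≺ is lexicographic order.
        return x < y

    assert precedes((3, 1), (3, 2))  # equal times: lower process id wins
    assert precedes((2, 5), (3, 1))  # strictly smaller time wins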
  • 12. Properties. . . Event counting If the increment value d is always 1, the scalar time has the following interesting property: if event e has a timestamp h, then h−1 represents the minimum logical duration, counted in units of events, required before producing the event e; we call it the height of the event e. In other words, h−1 events have been produced sequentially before the event e, regardless of the processes that produced these events. For example, in Figure 3.1, five events precede event b on the longest causal path ending at b.
  • 13. Properties. . . No Strong Consistency The system of scalar clocks is not strongly consistent; that is, for two events ei and ej , C(ei ) < C(ej ) ⇏ ei → ej . For example, in Figure 3.1, the third event of process P1 has a smaller scalar timestamp than the third event of process P2. However, the former did not happen before the latter. The reason that scalar clocks are not strongly consistent is that the logical local clock and logical global clock of a process are squashed into one, resulting in the loss of causal dependency information among events at different processes. For example, in Figure 3.1, when process P2 receives the first message from process P1, it updates its clock to 3, forgetting that the timestamp of the latest event at P1 on which it depends is 2.
  • 14. Vector Time The system of vector clocks was developed independently by Fidge, Mattern, and Schmuck. In the system of vector clocks, the time domain is represented by a set of n-dimensional non-negative integer vectors. Each process pi maintains a vector vti [1..n], where vti [i] is the local logical clock of pi and describes the logical time progress at process pi . vti [j] represents process pi ’s latest knowledge of process pj ’s local time. If vti [j] = x, then process pi knows that local time at process pj has progressed till x. The entire vector vti constitutes pi ’s view of the global logical time and is used to timestamp events.
  • 15. Vector Time Process pi uses the following two rules R1 and R2 to update its clock: R1: Before executing an event, process pi updates its local logical time as follows: vti [i] := vti [i] + d (d > 0) R2: Each message m is piggybacked with the vector clock vt of the sender process at sending time. On the receipt of such a message (m,vt), process pi executes the following sequence of actions: ◮ Update its global logical time as follows: 1 ≤ k ≤ n : vti [k] := max(vti [k], vt[k]) ◮ Execute R1. ◮ Deliver the message m.
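As a sketch (our own naming, with d = 1), the two rules become an elementwise max plus a local increment:

    class VectorClock:
        """Vector clock for process i of n processes (d = 1)."""

        def __init__(self, i, n):
            self.i = i
            self.vt = [0] * n

        def tick(self):
            # R1: increment own component before any event.
            self.vt[self.i] += 1

        def send(self):
            # Timestamp the outgoing message with a copy of the vector.
            self.tick()
            return list(self.vt)

        def receive(self, vt_msg):
            # R2: componentwise max with the piggybacked vector, then R1.
            self.vt = [max(a, b) for a, b in zip(self.vt, vt_msg)]
            self.tick()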
  • 16. Vector Time The timestamp of an event is the value of the vector clock of its process when the event is executed. Figure 3.2 shows an example of vector clock progress with the increment value d = 1. Initially, a vector clock is [0, 0, 0, . . . , 0].
  • 17. Vector Time An Example of Vector Clocks [Figure 3.2: Evolution of vector time; a space-time diagram of three processes with the vector timestamp shown at each event.]
  • 18. Vector Time Comparing Vector Timestamps The following relations are defined to compare two vector timestamps, vh and vk: vh = vk ⇔ ∀x : vh[x] = vk[x] vh ≤ vk ⇔ ∀x : vh[x] ≤ vk[x] vh < vk ⇔ vh ≤ vk and ∃x : vh[x] < vk[x] vh ∥ vk ⇔ ¬(vh < vk) ∧ ¬(vk < vh) If the process at which an event occurred is known, the test to compare two timestamps can be simplified as follows: If events x and y respectively occurred at processes pi and pj and are assigned timestamps vh and vk, respectively, then x → y ⇔ vh[i] ≤ vk[i] x ∥ y ⇔ vh[i] > vk[i] ∧ vh[j] < vk[j]
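These tests are a few lines of code; the following sketch uses hypothetical helper names of our own:

    def vle(vh, vk):
        # vh ≤ vk: every component is ≤.
        return all(a <= b for a, b in zip(vh, vk))

    def vlt(vh, vk):
        # vh < vk: ≤ holds and at least one component is strictly smaller.
        return vle(vh, vk) and any(a < b for a, b in zip(vh, vk))

    def concurrent(vh, vk):
        # vh ∥ vk: neither timestamp dominates the other.
        return not vlt(vh, vk) and not vlt(vk, vh)

    def happened_before(vh, i, vk):
        # Simplified test when x occurred at process pi: x → y ⇔ vh[i] ≤ vk[i].
        return vh[i] <= vk[i]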
  • 19. Vector Time Properties of Vector Time Isomorphism If events in a distributed system are timestamped using a system of vector clocks, we have the following property. If two events x and y have timestamps vh and vk, respectively, then x → y ⇔ vh < vk x ∥ y ⇔ vh ∥ vk. Thus, there is an isomorphism between the set of partially ordered events produced by a distributed computation and their vector timestamps.
  • 20. Vector Time Strong Consistency The system of vector clocks is strongly consistent; thus, by examining the vector timestamps of two events, we can determine if the events are causally related. However, Charron-Bost showed that the dimension of vector clocks cannot be less than n, the total number of processes in the distributed computation, for this property to hold. Event Counting If d = 1 (in rule R1), then the ith component of the vector clock at process pi , vti [i], denotes the number of events that have occurred at pi until that instant. So, if an event e has timestamp vh, vh[j] denotes the number of events executed by process pj that causally precede e. Clearly, Σj vh[j] − 1 represents the total number of events that causally precede e in the distributed computation.
  • 21. Efficient Implementations of Vector Clocks If the number of processes in a distributed computation is large, then vector clocks will require piggybacking of a huge amount of information in messages. The message overhead grows linearly with the number of processes in the system, and when there are thousands of processes, the message size becomes huge even if only a few events occur in a few processes. We discuss an efficient way to maintain vector clocks. Charron-Bost showed that if vector clocks have to satisfy the strong consistency property, then in general vector timestamps must be at least of size n, the total number of processes. However, optimizations are possible; next, we discuss a technique to implement vector clocks efficiently.
  • 22. Singhal-Kshemkalyani’s Differential Technique Singhal-Kshemkalyani’s differential technique is based on the observation that between successive message sends to the same process, only a few entries of the vector clock at the sender process are likely to change. When a process pi sends a message to a process pj , it piggybacks only those entries of its vector clock that have changed since the last message sent to pj . If entries i1, i2, . . . , in1 of the vector clock at pi have changed to v1, v2, . . . , vn1 , respectively, since the last message sent to pj , then process pi piggybacks a compressed timestamp of the form {(i1, v1), (i2, v2), . . . , (in1 , vn1 )} on the next message to pj .
  • 23. Singhal-Kshemkalyani’s Differential Technique When pj receives this message, it updates its vector clock as follows: vtj [ik ] := max(vtj [ik ], vk ) for k = 1, 2, . . . , n1. Thus this technique cuts down the message size, communication bandwidth, and buffer (to store messages) requirements. In the worst case, every element of the vector clock has been updated at pi since the last message to process pj , and the next message from pi to pj will need to carry the entire vector timestamp of size n. However, on average the size of the timestamp on a message will be less than n.
  • 24. Singhal-Kshemkalyani’s Differential Technique Implementation of this technique requires each process to remember the vector timestamp in the message last sent to every other process. A direct implementation of this will result in O(n²) storage overhead at each process. Singhal and Kshemkalyani developed a clever technique that cuts down this storage overhead at each process to O(n). The technique works in the following manner: Process pi maintains the following two additional vectors: ◮ LSi [1..n] (‘Last Sent’): LSi [j] indicates the value of vti [i] when process pi last sent a message to process pj . ◮ LUi [1..n] (‘Last Update’): LUi [j] indicates the value of vti [i] when process pi last updated the entry vti [j]. Clearly, LUi [i] = vti [i] at all times, and LUi [j] needs to be updated only when the receipt of a message causes pi to update entry vti [j]. Also, LSi [j] needs to be updated only when pi sends a message to pj .
  • 25. Singhal-Kshemkalyani’s Differential Technique Since the last communication from pi to pj , only those elements vti [k] of the vector clock have changed for which LSi [j] < LUi [k] holds. Hence, only these elements need to be sent in a message from pi to pj . When pi sends a message to pj , it sends only a set of tuples {(x, vti [x]) | LSi [j] < LUi [x]} as the vector timestamp to pj , instead of sending a vector of n entries in a message. Thus the entire vector of size n is not sent along with a message. Instead, only the elements in the vector clock that have changed since the last message sent to that process are sent, in the format {(p1, latest value), (p2, latest value), . . .}, where an entry (pk , latest value) indicates that the pk-th component of the vector clock has changed to the given value. This technique requires that the communication channels follow a FIFO discipline for message delivery.
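A minimal sketch of the differential technique follows (the names LS, LU, and the dict-based message format are our own; FIFO channels are assumed):

    class DiffVectorClock:
        """Vector clock with Singhal-Kshemkalyani compressed timestamps."""

        def __init__(self, i, n):
            self.i = i
            self.vt = [0] * n
            self.LS = [0] * n  # LS[j]: vt[i] when we last sent to pj
            self.LU = [0] * n  # LU[k]: vt[i] when vt[k] was last updated

        def _tick(self):
            self.vt[self.i] += 1
            self.LU[self.i] = self.vt[self.i]

        def send(self, j):
            # Piggyback only the entries updated since the last send to pj.
            self._tick()
            diff = {k: self.vt[k] for k in range(len(self.vt))
                    if self.LS[j] < self.LU[k]}
            self.LS[j] = self.vt[self.i]
            return diff

        def receive(self, diff):
            # Merge the compressed timestamp, then apply R1 and record
            # the (new) local clock value as the update time of each entry.
            changed = [k for k, v in diff.items() if v > self.vt[k]]
            for k in changed:
                self.vt[k] = diff[k]
            self._tick()
            for k in changed:
                self.LU[k] = self.vt[self.i]

Between two sends to the same pj , only entries whose LU value exceeds LS[j] ride on the message, which is what produces compressed timestamps like {(3, 2)} in Figure 3.3 below.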
  • 26. Singhal-Kshemkalyani’s Differential Technique This method is illustrated in Figure 3.3. For instance, the second message from p3 to p2 (which contains a timestamp {(3, 2)}) informs p2 that the third component of the vector clock has been modified and the new value is 2. This is because process p3 (indicated by the third component of the vector) has advanced its clock value from 1 to 2 since the last message sent to p2. This technique substantially reduces the cost of maintaining vector clocks in large systems, especially if the process interactions exhibit temporal or spatial localities.
  • 27. Singhal-Kshemkalyani’s Differential Technique [Figure 3.3: Vector clocks progress in the Singhal-Kshemkalyani technique; messages among p1 . . . p4 carry the compressed timestamps {(1,1)}, {(3,1)}, {(3,2)}, {(3,4),(4,1)}, and {(4,1)}.]
  • 28. Matrix Time In a system of matrix clocks, the time is represented by a set of n × n matrices of non-negative integers. A process pi maintains a matrix mti [1..n, 1..n] where mti [i, i] denotes the local logical clock of pi and tracks the progress of the computation at process pi . mti [i, j] denotes the latest knowledge that process pi has about the local logical clock, mtj [j, j], of process pj . mti [j, k] represents the knowledge that process pi has about the latest knowledge that pj has about the local logical clock, mtk [k, k], of pk . The entire matrix mti denotes pi ’s local view of the global logical time.
  • 29. Matrix Time Process pi uses the following rules R1 and R2 to update its clock: R1: Before executing an event, process pi updates its local logical time as follows: mti [i, i] := mti [i, i] + d (d > 0) R2: Each message m is piggybacked with matrix time mt. When pi receives such a message (m,mt) from a process pj , pi executes the following sequence of actions: ◮ Update its global logical time as follows: (a) 1 ≤ k ≤ n : mti [i, k] := max(mti [i, k], mt[j, k]) (That is, update its row mti [i, ∗] with pj ’s row in the received timestamp, mt.) (b) 1 ≤ k, l ≤ n : mti [k, l] := max(mti [k, l], mt[k, l]) ◮ Execute R1. ◮ Deliver message m.
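A sketch of the matrix clock rules (our own naming; d = 1, and the sender's identity j is assumed to travel with the message):

    class MatrixClock:
        """Matrix clock for process i of n processes (d = 1)."""

        def __init__(self, i, n):
            self.i, self.n = i, n
            self.mt = [[0] * n for _ in range(n)]

        def tick(self):
            # R1: advance own local clock entry.
            self.mt[self.i][self.i] += 1

        def send(self):
            self.tick()
            return [row[:] for row in self.mt]  # copy piggybacked with m

        def receive(self, j, mt_msg):
            # R2(a): merge sender pj's row into own row.
            for k in range(self.n):
                self.mt[self.i][k] = max(self.mt[self.i][k], mt_msg[j][k])
            # R2(b): componentwise max over the whole matrix.
            for k in range(self.n):
                for l in range(self.n):
                    self.mt[k][l] = max(self.mt[k][l], mt_msg[k][l])
            self.tick()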
  • 30. Matrix Time Figure 3.4 gives an example to illustrate how matrix clocks progress in a distributed computation. We assume d = 1. Let us consider the following events: e, which is the xi-th event at process pi ; e_k^1 and e_k^2, which are the x_k^1-th and x_k^2-th events at process pk ; and e_j^1 and e_j^2, which are the x_j^1-th and x_j^2-th events at pj . Let mte denote the matrix timestamp associated with event e. Due to message m4, e_k^2 is the last event of pk that causally precedes e; therefore, we have mte[i, k] = mte[k, k] = x_k^2. Likewise, mte[i, j] = mte[j, j] = x_j^2. The last event of pk known by pj , to the knowledge of pi when it executed event e, is e_k^1 ; therefore, mte[j, k] = x_k^1. Likewise, we have mte[k, j] = x_j^1.
  • 31. Matrix Time [Figure 3.4: Evolution of matrix time; processes pi , pj , and pk exchange messages m1 . . . m4, and the matrix timestamp mte records entries such as mte[i, k], mte[k, j], and mte[j, j].]
  • 32. Matrix Time Basic Properties Vector mti [i, ·] contains all the properties of vector clocks. In addition, matrix clocks have the following property: mink (mti [k, l]) ≥ t ⇒ process pi knows that every other process pk knows that pl ’s local time has progressed till t. ◮ If this is true, it is clear that process pi knows that all other processes know that pl will never send information with a local time ≤ t. ◮ In many applications, this implies that processes will no longer require certain information from pl and can use this fact to discard obsolete information. If d is always 1 in rule R1, then mti [k, l] denotes the number of events that have occurred at pl and are known by pk , as far as pi ’s knowledge is concerned.
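This property is the basis for discarding obsolete information. A sketch of the test, using the MatrixClock class above (the helper name is hypothetical):

    def all_know_time_of(mc, l):
        # min over column l of pi's matrix: the latest local time of pl
        # that pi knows every process is aware of. Information from pl
        # stamped at or before this value can safely be discarded.
        return min(mc.mt[k][l] for k in range(mc.n))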
  • 33. Virtual Time Virtual time is a paradigm for organizing and synchronizing distributed systems. This section provides a description of virtual time and its implementation using the Time Warp mechanism. The implementation of virtual time using the Time Warp mechanism works on the basis of an optimistic assumption. Time Warp relies on the general lookahead-rollback mechanism, where each process executes without regard to other processes having synchronization conflicts.
  • 34. Virtual Time If a conflict is discovered, the offending processes are rolled back to the time just before the conflict and executed forward along the revised path. Detection of conflicts and rollbacks are transparent to users. The implementation of virtual time using the Time Warp mechanism makes the following optimistic assumption: synchronization conflicts, and thus rollbacks, generally occur rarely. Next, we discuss in detail virtual time and how the Time Warp mechanism is used to implement it.
  • 35. Virtual Time Definition “Virtual time is a global, one-dimensional, temporal coordinate system on a distributed computation to measure the computational progress and to define synchronization.” A virtual time system is a distributed system executing in coordination with an imaginary virtual clock that uses virtual time. Virtual times are real values that are totally ordered by the less than relation, “<”. Virtual time is implemented as a collection of several loosely synchronized local virtual clocks. These local virtual clocks move forward to higher virtual times; however, occasionally they move backwards.
  • 36. Virtual Time Definition Processes run concurrently and communicate with each other by exchanging messages. Every message is characterized by four values: a) Name of the sender b) Virtual send time c) Name of the receiver d) Virtual receive time Virtual send time is the virtual time at the sender when the message is sent, whereas virtual receive time specifies the virtual time when the message must be received (and processed) by the receiver.
  • 37. Virtual Time Definition A problem arises when a message arrives at a process late, that is, the virtual receive time of the message is less than the local virtual time at the receiver process when the message arrives. Virtual time systems are subject to two semantic rules similar to Lamport’s clock conditions: ◮ Rule 1: Virtual send time of each message < virtual receive time of that message. ◮ Rule 2: Virtual time of each event in a process < virtual time of the next event in that process. The above two rules imply that a process sends all messages in increasing order of virtual send time and a process receives (and processes) all messages in increasing order of virtual receive time.
  • 38. Virtual Time Definition Causality of events is an important concept in distributed systems and is also a major constraint in the implementation of virtual time. It is important that an event that causes another is completely executed before the caused event can be processed. The constraint in the implementation of virtual time can be stated as follows: “If an event A causes event B, then the execution of A and B must be scheduled in real time so that A is completed before B starts.”
  • 39. Virtual Time Definition If event A has an earlier virtual time than event B, we need not execute A before B provided there is no causal chain from A to B. Better performance can be achieved by scheduling A concurrently with B or scheduling A after B. If A and B have exactly the same virtual time coordinate, then there is no restriction on the order of their scheduling. If A and B are distinct events, they will have different virtual space coordinates (since they occur at different processes) and neither will be a cause for the other. To sum it up, events with virtual time < ‘t’ complete before the start of events at time ‘t’, and events with virtual time > ‘t’ start only after events at time ‘t’ are complete.
  • 40. Virtual Time Definition Characteristics of Virtual Time 1 Virtual time systems are not all isomorphic; virtual time may be either discrete or continuous. 2 Virtual time may be only partially ordered. 3 Virtual time may be related to real time or may be independent of it. 4 Virtual time systems may be visible to programmers and manipulated explicitly as values, or hidden and manipulated implicitly according to some system-defined discipline. 5 Virtual times associated with events may be explicitly calculated by user programs or they may be assigned by fixed rules.
  • 41. Comparison with Lamport’s Logical Clocks In Lamport’s logical clock, an artificial clock is created, one for each process, with unique labels from a totally ordered set in a manner consistent with the partial order. In virtual time, the reverse of the above is done by assuming that every event is labeled with a clock value from a totally ordered virtual time scale satisfying Lamport’s clock conditions. Thus the Time Warp mechanism is an inverse of Lamport’s scheme. In Lamport’s scheme, all clocks are conservatively maintained so that they never violate causality. A process advances its clock as soon as it learns of a new causal dependency. In virtual time, clocks are optimistically advanced and corrective actions are taken whenever a violation is detected. Lamport’s initial idea brought about the concept of virtual time, but the model failed to preserve causal independence.
  • 42. Virtual Time Definition Time Warp Mechanism In the implementation of virtual time using the Time Warp mechanism, the virtual receive time of a message is considered as its timestamp. The necessary and sufficient condition for the correct implementation of virtual time is that each process must handle incoming messages in timestamp order. This is highly undesirable and restrictive because process speeds and message delays are likely to be highly variable. It is natural for some processes to get ahead of other processes in virtual time.
  • 43. Virtual Time Definition Time Warp Mechanism It is impossible for a process, on the basis of local information alone, to block and wait for the message with the next timestamp. It is always possible that a message with an earlier timestamp arrives later. So, when a process executes a message, it is very difficult for it to determine whether a message with an earlier timestamp will arrive later. This is the central problem in virtual time that is solved by the Time Warp mechanism. The Time Warp mechanism assumes that message communication is reliable, but messages may not be delivered in FIFO order.
  • 44. Virtual Time Definition Time Warp Mechanism The Time Warp mechanism consists of two major parts: the local control mechanism and the global control mechanism. The local control mechanism ensures that events are executed and messages are processed in the correct order. The global control mechanism takes care of global issues such as global progress, termination detection, I/O error handling, flow control, etc.
  • 45. The Local Control Mechanism There is no global virtual clock variable in this implementation; each process has a local virtual clock variable. The local virtual clock of a process doesn’t change during an event at that process; it changes only between events. On processing the next message from the input queue, the process increases its local clock to the timestamp of the message. At any instant, the value of virtual time may differ for each process, but the value is transparent to other processes in the system.
  • 46. The Local Control Mechanism When a message is sent, the virtual send time is copied from the sender’s virtual clock, while the name of the receiver and the virtual receive time are assigned based on application-specific context. All arriving messages at a process are stored in an input queue in increasing order of timestamps (receive times). Processes will receive late messages due to factors such as different computation rates of processes and network delays. The semantics of virtual time demands that incoming messages be received by each process strictly in timestamp order.
  • 47. The Local Control Mechanism This is accomplished as follows: “On the reception of a late message, the receiver rolls back to an earlier virtual time, cancelling all intermediate side effects, and then executes forward again by executing the late message in the proper sequence.” If all the messages in the input queue of a process are processed, the state of the process is said to terminate and its clock is set to +∞. However, the process is not destroyed, as a late message may arrive, causing it to roll back and execute again. Thus, each process is doing a constant “lookahead”, processing future messages from its input queue.
  • 48. The Local Control Mechanism Over a lengthy computation, each process may roll back several times while generally progressing forward, with rollback completely transparent to other processes in the system. Rollback in a distributed system is complicated: a process that wants to roll back might have sent many messages to other processes, which in turn might have sent many messages to other processes, and so on, leading to deep side effects. For rollback, messages must be effectively “unsent” and their side effects should be undone. This is achieved efficiently by using antimessages.
  • 49. The Local Control Mechanism Antimessages and the Rollback Mechanism The runtime representation of a process is composed of the following: Process name: Virtual space coordinate, which is unique in the system. Local virtual clock: Virtual time coordinate. State: Data space of the process, including the execution stack, program counter, and its own variables. State queue: Contains saved copies of the process’s recent states, as rollback with the Time Warp mechanism requires the state of the process to be saved. Input queue: Contains all recently arrived messages in order of virtual receive time. Processed messages from the input queue are not deleted; they are saved in the output queue with a negative sign (antimessage) to facilitate future rollbacks. Output queue: Contains negative copies of messages the process has recently sent, in virtual send time order. They are needed in case of a rollback. For every message, there exists an antimessage that is the same in content but opposite in sign.
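A sketch of this runtime representation as data structures (all field names are our own, not from the text):

    from dataclasses import dataclass, field

    @dataclass(order=True)
    class Message:
        recv_time: float                  # virtual receive time (timestamp)
        send_time: float = field(compare=False)
        sender: str = field(compare=False)
        receiver: str = field(compare=False)
        sign: int = field(default=+1, compare=False)  # -1 for an antimessage
        body: object = field(default=None, compare=False)

    @dataclass
    class TWProcess:
        name: str                         # virtual space coordinate
        lvt: float = 0.0                  # local virtual clock
        state: dict = field(default_factory=dict)
        state_queue: list = field(default_factory=list)   # (lvt, saved state)
        input_queue: list = field(default_factory=list)   # sorted by recv_time
        output_queue: list = field(default_factory=list)  # retained antimessages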
  • 50. Antimessages and the Rollback Mechanism Whenever a process sends a message, a copy of the message is transmitted to the receiver’s input queue and a negative copy (antimessage) is retained in the sender’s output queue for use in sender rollback. Whenever a message and its antimessage appear in the same queue, no matter in which order they arrived, they immediately annihilate each other, shortening the queue by one message. When a message arrives at the input queue of a process with a timestamp greater than the virtual clock time of its destination process, it is simply enqueued. When the destination process’s virtual time is greater than the virtual receive time of the message received, the process must do a rollback.
  • 51. Antimessages and the Rollback Mechanism Rollback Mechanism Search the state queue for the last saved state with a timestamp that is less than the timestamp of the message received, and restore it. Make the timestamp of the received message the value of the local virtual clock and discard from the state queue all states saved after this time. Then resume execution forward from this point. Now all the messages that were sent between the current state and the earlier state must be “unsent”. This is taken care of by executing a simple rule: “To unsend a message, simply transmit its antimessage.” This results in antimessages following the positive ones to the destination. A negative message causes a rollback at its destination if its virtual receive time is less than the receiver’s virtual time.
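A minimal sketch of this rollback rule, building on the TWProcess fields above (illustrative only; the send callback and the ≥ cutoff for unsending are our assumptions):

    def rollback(p, msg, send):
        """Roll TWProcess p back to handle late message msg.

        send(m) is assumed to deliver message m to its destination.
        A saved state older than msg.recv_time is assumed to exist.
        """
        # 1. Restore the last state saved before the late message's timestamp.
        while p.state_queue and p.state_queue[-1][0] >= msg.recv_time:
            p.state_queue.pop()               # discard newer snapshots
        lvt, saved = p.state_queue[-1]
        p.state = dict(saved)
        p.lvt = msg.recv_time                 # the clock jumps back
        # 2. "Unsend" messages sent after the restored point by
        #    transmitting the retained antimessages.
        still_valid = []
        for anti in p.output_queue:
            if anti.send_time >= msg.recv_time:
                send(anti)                    # antimessage chases the original
            else:
                still_valid.append(anti)
        p.output_queue = still_valid
        # Execution then resumes forward, processing msg in proper order.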
  • 52. Antimessages and the Rollback Mechanism Depending on the timing, there are several possibilities at the receiver’s end: First, the original (positive) message has arrived but has not yet been processed at the receiver. In this case, the negative message causes no rollback; it annihilates with the positive message, leaving the receiver with no record of that message. Second, the original positive message has already been partially or completely processed by the receiver. In this case, the negative message causes the receiver to roll back to the virtual time when the positive message was received. It will also annihilate the positive message, leaving the receiver with no record that the message existed. When the receiver executes again, the execution will assume that these messages never existed. A rolled back process may send antimessages to other processes.
  • 53. Antimessages and the Rollback Mechanism A negative message can also arrive at the destination before the positive one. In this case, it is enqueued and will be annihilated when the positive message arrives. If it is the negative message’s turn to be executed at a process’s input queue, the receiver may take any action, such as a no-op. Any action taken will eventually be rolled back when the corresponding positive message arrives. An optimization is to skip the antimessage in the input queue and treat it as a no-op; when the corresponding positive message arrives, it annihilates the negative message and inhibits any rollback.
  • 54. Antimessages and the Rollback Mechanism The antimessage protocol has several advantages: It is extremely robust and works under all possible circumstances. It is free from deadlocks, as there is no blocking. It is also free from domino effects. In the worst case, all processes in the system roll back to the same virtual time as the original one did and then proceed forward again.
  • 55. Physical Clock Synchronization: NTP Motivation In centralized systems, there is only a single clock. A process gets the time by simply issuing a system call to the kernel. In distributed systems, there is no global clock or common memory. Each processor has its own internal clock and its own notion of time. These clocks can easily drift by seconds per day, accumulating significant errors over time. Also, because different clocks tick at different rates, they may not always remain synchronized even though they might be synchronized when they start. This clearly poses serious problems for applications that depend on a synchronized notion of time.
  • 56. Physical Clock Synchronization: NTP Motivation For most applications and algorithms that run in a distributed system, we need to know time in one or more of the following contexts: ◮ The time of day at which an event happened on a specific machine in the network. ◮ The time interval between two events that happened on different machines in the network. ◮ The relative ordering of events that happened on different machines in the network. Unless the clocks in each machine have a common notion of time, time-based queries cannot be answered. Clock synchronization has a significant effect on many problems like secure systems, fault diagnosis and recovery, scheduled operations, database systems, and real-world clock values.
  • 57. Physical Clock Synchronization: NTP Clock synchronization is the process of ensuring that physically distributed processors have a common notion of time. Due to different clock rates, the clocks at various sites may diverge with time, and periodically a clock synchronization must be performed to correct this clock skew in distributed systems. Clocks are synchronized to an accurate real-time standard like UTC (Universal Coordinated Time). Clocks that must not only be synchronized with each other but also have to adhere to physical time are termed physical clocks.
  • 58. Physical Clock Synchronization: NTP Definitions and Terminology Let Ca and Cb be any two clocks. Time: The time of a clock in a machine p is given by the function Cp(t), where Cp(t) = t for a perfect clock. Frequency: Frequency is the rate at which a clock progresses. The frequency at time t of clock Ca is C′a(t). Offset: Clock offset is the difference between the time reported by a clock and the real time. The offset of the clock Ca is given by Ca(t) − t. The offset of clock Ca relative to Cb at time t ≥ 0 is given by Ca(t) − Cb(t). Skew: The skew of a clock is the difference in the frequencies of the clock and the perfect clock. The skew of a clock Ca relative to clock Cb at time t is C′a(t) − C′b(t). If the skew is bounded by ρ, then, as per Equation (1), clock values are allowed to diverge at a rate in the range 1 − ρ to 1 + ρ. Drift (rate): The drift of clock Ca is the second derivative of the clock value with respect to time, namely, C′′a(t). The drift of clock Ca relative to clock Cb at time t is C′′a(t) − C′′b(t).
  • 59. Physical Clock Synchronization: NTP Clock Inaccuracies Physical clocks are synchronized to an accurate real-time standard like UTC (Universal Coordinated Time). However, due to the clock inaccuracies discussed above, a timer (clock) is said to be working within its specification if 1 − ρ ≤ dC/dt ≤ 1 + ρ (1) where the constant ρ is the maximum skew rate specified by the manufacturer. Figure 3.5 illustrates the behavior of fast, slow, and perfect clocks with respect to UTC.
  • 60. Physical Clock Synchronization: NTP [Figure 3.5: The behavior of fast (dC/dt > 1), slow (dC/dt < 1), and perfect (dC/dt = 1) clocks with respect to UTC; clock time C is plotted against UTC t.]
  • 61. Physical Clock Synchronization: NTP Offset delay estimation method The Network Time Protocol (NTP), which is widely used for clock synchronization on the Internet, uses the offset delay estimation method. The design of NTP involves a hierarchical tree of time servers. ◮ The primary server at the root synchronizes with the UTC. ◮ The next level contains secondary servers, which act as a backup to the primary server. ◮ At the lowest level is the synchronization subnet, which has the clients.
  • 62. Physical Clock Synchronization: NTP Clock offset and delay estimation: In practice, a source node cannot accurately estimate the local time on the target node due to varying message or network delays between the nodes. This protocol employs a common practice of performing several trials and chooses the trial with the minimum delay. Figure 3.6 shows how NTP timestamps are numbered and exchanged between peers A and B. Let T1, T2, T3, T4 be the values of the four most recent timestamps as shown. Assume clocks A and B are stable and running at the same speed.
  • 63. [Figure 3.6: Offset and delay estimation; peers A and B exchange a pair of messages carrying the timestamps T1, T2, T3, and T4.]
  • 64. Let a = T1 − T3 and b = T2 − T4. If the network delay difference from A to B and from B to A, called the differential delay, is small, the clock offset θ and roundtrip delay δ of B relative to A at time T4 are approximately given by the following: θ = (a + b)/2, δ = a − b (2) Each NTP message includes the latest three timestamps T1, T2, and T3, while T4 is determined upon arrival. Thus, both peers A and B can independently calculate delay and offset using a single bidirectional message stream, as shown in Figure 3.7.
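A sketch of Equation (2) as code (timestamps in seconds; the function and variable names are ours):

    def offset_and_delay(t1, t2, t3, t4):
        # a = T1 - T3, b = T2 - T4 (Equation 2).
        a = t1 - t3
        b = t2 - t4
        theta = (a + b) / 2   # clock offset of B relative to A
        delta = a - b         # roundtrip delay
        return theta, delta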
  • 65. [Figure 3.7: Timing diagram for the two servers, showing the timestamps Ti−3, Ti−2, Ti−1, and Ti exchanged between server A and server B.]
  • 66. The Network Time Protocol synchronization protocol A pair of servers in symmetric mode exchange pairs of timing messages. A store of data is then built up about the relationship between the two servers (pairs of offset and delay). Specifically, assume that each peer maintains pairs (Oi , Di ), where Oi is the measure of offset (θ) and Di is the transmission delay of the two messages (δ). The offset corresponding to the minimum delay is chosen. Specifically, the delay and offset are calculated as follows. Assume that message m takes time t to transfer and m′ takes t′ to transfer. (Continued on the next slide.)
  • 67. The Network Time Protocol synchronization protocol The offset between A’s clock and B’s clock is O. If A’s local clock time is A(t) and B’s local clock time is B(t), we have A(t) = B(t) + O (3) Then, Ti−2 = Ti−3 + t + O (4) Ti = Ti−1 − O + t′ (5) Assuming t = t′, the offset Oi can be estimated as Oi = (Ti−2 − Ti−3 + Ti−1 − Ti )/2 (6) and the round-trip delay is estimated as Di = (Ti − Ti−3) − (Ti−1 − Ti−2) (7) The eight most recent pairs of (Oi , Di ) are retained. The value of Oi that corresponds to the minimum Di is chosen to estimate O.
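A sketch of this filtering step, combining Equations (6) and (7) with the minimum-delay rule (function names are our own):

    def ntp_sample(t_im3, t_im2, t_im1, t_i):
        # Equations (6) and (7): offset and round-trip delay from the four
        # timestamps Ti-3, Ti-2, Ti-1, Ti of one message pair.
        o = ((t_im2 - t_im3) + (t_im1 - t_i)) / 2
        d = (t_i - t_im3) - (t_im1 - t_im2)
        return o, d

    def estimate_offset(samples):
        # Keep the eight most recent (Oi, Di) pairs and choose the offset
        # associated with the minimum delay.
        recent = samples[-8:]
        return min(recent, key=lambda od: od[1])[0]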