UCB CS162 Operating Systems Notes (9)

P24:Lecture 24: Networking and TCP IP (Con’t), RPC, Distributed File Systems - RubatoTheEmber - BV1L541117gr

All right, so let's get started.


So we did not intend to be remote again this semester, but here we are, remote once again, hopefully just for today. This is lecture 24 and we have a lot to go through. We're gonna continue talking about distributed consensus making, then we're gonna get into networking and TCP/IP, and, time permitting, we'll get into remote procedure calls and distributed file systems.

So if you remember from last time, we talked about distributed consensus making, where we have a consensus problem. Nodes propose a value, and we want, even in the presence of crashes and failures of other nodes, to be able to reach a decision across all of the nodes, and have that be the same decision at all of the nodes. Examples are choosing between true and false, or choosing between commit and abort.

Now it's very important that we make our results durable, so we're gonna use a log or other form of stable storage to ensure that the decision persists once it's made. Okay, and remember we also talked about

the two-phase commit protocol, where we have a persistent, stable log on each node, and that's where nodes track and record whether a commit has occurred or not. There are two phases, which is why it's called two-phase commit. It starts with the prepare phase, where the global coordinator requests that all of the participants promise to commit or decide to roll back the transaction. The participants each record their decision, their promise, in the log and acknowledge it to the coordinator. If anyone votes to abort, then the coordinator writes abort in its log, tells everyone to abort, and they record abort in their logs. During the commit phase, if everyone has agreed that they are prepared to commit, then the coordinator writes commit in its log. When it writes commit in its log, that's when the transaction is considered committed, and no matter what happens with machines going up and down, eventually everybody will commit their transactions. It then asks all the nodes to commit and respond with an acknowledgment; after it receives all of the acknowledgments, it writes "got commit" to the log. So the log here is used to guarantee that our decision persists, whether it's a decision to commit or a decision to abort the transaction.
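To make the two phases concrete, here is a minimal sketch in Python, assuming a single coordinator, reliable in-order messages, and an in-memory list standing in for each node's persistent log; all class and method names are illustrative, not part of the lecture.

```python
# Minimal two-phase commit sketch. A real system would write the logs to
# stable storage and retry phase 2 until every participant acknowledges.

class Participant:
    def __init__(self, name, will_commit=True):
        self.name = name
        self.will_commit = will_commit
        self.log = []                      # stand-in for stable storage

    def prepare(self, txn):
        # Phase 1: record the vote durably BEFORE replying.
        vote = "prepared" if self.will_commit else "abort"
        self.log.append((txn, vote))
        return vote

    def finish(self, txn, decision):
        # Phase 2: record the coordinator's decision.
        self.log.append((txn, decision))


class Coordinator:
    def __init__(self, participants):
        self.participants = participants
        self.log = []

    def run(self, txn):
        # Phase 1: collect promises; any "abort" vote aborts the whole txn.
        votes = [p.prepare(txn) for p in self.participants]
        decision = "commit" if all(v == "prepared" for v in votes) else "abort"
        self.log.append((txn, decision))   # the txn commits at THIS log write
        # Phase 2: tell everyone, then note that everyone acknowledged.
        for p in self.participants:
            p.finish(txn, decision)
        self.log.append((txn, "got " + decision))
        return decision


peers = [Participant("A"), Participant("B"), Participant("C", will_commit=False)]
print(Coordinator(peers).run("txn-1"))     # -> "abort", because C voted no
```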

So, some discussion. Why is distributed decision making important and desirable? Because of faults that can happen. We want fault tolerance; we want a group of machines to be able to come to a decision even if one or more of them happens to fail during that process. We're assuming here a simple failure mode that's called fail-stop: if a machine has problems, it stops, it crashes, it reboots, and then recovers. There are other modes that we'll talk about in just a moment, which are more Byzantine in terms of the way that they might fail. Now, after a decision is made, we record it in multiple places, which again ensures that it persists.

Now, some people have asked: why is two-phase commit not subject to the same paradox as the General's Paradox? Well, the difference here is that two-phase commit is about eventually coming to the same decision. It's not necessarily true that they all come to the same decision at the same time. It simply says that they will eventually all come to that same decision, because machines might fail and have to reboot. And so we want to make sure that, again, eventually we reach that decision.

Okay, the second thing I want to talk about is that there is a problem with two-phase commit, a reason why two-phase commit can be troublesome to actually use in practice, and that is blocking. One machine can be stalled, that is, prevented from doing any useful work, until another site recovers.

So site B writes "prepared to commit," records that in its log, and then sends a yes vote to the coordinator, which we'll say is site A, and then crashes. Well, eventually it'll come back up. In the meantime, let's say site A also crashes. So now site B comes back up, checks its log, and says: oh, I voted yes. And so it's going to send a message to site A saying, well, what happened? Did that transaction commit or abort? At this point, site B cannot unilaterally decide "I'm going to abort," right? Because it already said it was prepared to commit, and that update might have actually committed, in which case it has to go through with its promise to commit. So it's blocked until A, the coordinator, comes back up and responds with what happened to the transaction. Now, the reason this is a problem is that the blocked site B holds resources. It might be holding locks on updated items; it might have pages pinned in memory. And it can't release any of those resources until it knows the fate of the update. So if it takes the global coordinator a few hours to come back, well, those resources are held, and that could hold up progress on site B. Okay, so there are some alternatives to two-phase commit.

There's an alternative called three-phase commit, which adds another phase (that's why it's called three-phase commit) and allows nodes to fail or block while we can still make progress. But it's a much more complex algorithm, and it's not widely used in practice. Paxos is an algorithm for distributed commit that is used by Google, and it doesn't have the two-phase commit blocking problem. It was developed by Leslie Lamport. You may remember him from last time: he's the person who said that distributed computing is when a machine I have no idea exists crashes and prevents me from getting my work done. With Paxos, there's no fixed leader; you choose a leader on the fly, and that makes it much easier to deal with a failure occurring. But it's a very complex algorithm, so it's not as widely used as we'd like to see it be used. There's an alternative that was developed later, called Raft, developed by Professor John Ousterhout at Stanford. It's much simpler to describe the complete protocol, and so some people have started using Raft as an alternative to Paxos or two-phase commit.

Now, up until now we've assumed that failures occur because of normal things that happen, like cosmic rays, machine crashes, hardware failures, and so on. What happens when failures are caused by nodes acting maliciously? Malicious means that a node is attempting to compromise the decision-making process. So if everyone is saying yes, the malicious node says, for example, no. So we need to use a more hardened decision-making process, like Byzantine agreement, which we're going to look at in just a moment, or blockchains; unfortunately, we don't have time to look at blockchains today. The assumption here is that you have an adversary that's trying to do worst-case damage to your system. If you're just dealing with normal faults, this will also help you, because sometimes systems fail in truly bizarre ways: they start acting inconsistently. That, to a certain extent, could look like a malicious actor, but it's not; it's just a node that's misbehaving and performing incorrectly. The nice thing about the Byzantine algorithms is that they'll handle those misbehaving nodes just as well.

Okay, so we have a different problem now, which is the Byzantine Generals Problem. The Byzantine Generals Problem has n players in it: one general and n−1 lieutenants. In this case we have four players, one general and three lieutenants. Some number of these participants can be insane or malicious, and that includes both the general and the lieutenants. The commanding general sends an order to his or her n−1 lieutenants such that the following integrity constraints apply. The first is that all loyal lieutenants must obey the same order. The second is that if the commanding general is loyal, then all loyal lieutenants obey the order that he or she sends. So here our general, who is loyal, sends "attack" to everybody, and all the loyal lieutenants relay "attack" to everyone else. But this malicious lieutenant tells one lieutenant "retreat" and tells the other lieutenant "retreat." All right, so we want the decision in this case, because the general is loyal and says attack, to be attack. But again, the malicious lieutenant is trying to make the distributed decision be retreat. We can differentiate between the malicious actors and the loyal actors in this case, right? Because this lieutenant receives two attack messages and one retreat message, and likewise this loyal lieutenant receives one retreat message and two attack messages.

But it's not always possible; there are some impossibility results. In particular, if we only have three players, then we can't solve the Byzantine Generals Problem, because one player can mess everything up. Here we have a loyal general and a loyal lieutenant: the general says attack to everybody, and a malicious lieutenant says retreat. So from the point of view of this lieutenant, they get one order to attack, and they hear a hearsay order to retreat. On the right, we have a malicious general who tells one lieutenant, the one on the left, attack, and tells the one on the right retreat, and these two lieutenants are loyal. So the one on the left is going to accept that attack from the general, and it's also going to hear retreat relayed by the loyal lieutenant on the right. The challenge is that we can't determine, from the point of view of the lieutenant on the left, which of the commands to follow, because we can't tell whether it's the general that is loyal or the other lieutenant that is loyal. So in general, with F faults, we're gonna need N > 3F to solve the problem.
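Here is a toy Python simulation of the four-player example above (N = 4, F = 1), assuming the oral-messages scheme just described: the general sends an order to every lieutenant, each lieutenant relays what it heard to the others, and each loyal lieutenant takes the majority of the three values it has seen. The names and the liar's behavior are illustrative.

```python
from collections import Counter

def byzantine_round(general_order, liars):
    # Round 1: what each of the three lieutenants hears from the general.
    heard = {lt: general_order for lt in ("L1", "L2", "L3")}
    # Round 2: each lieutenant relays; a malicious one relays the opposite.
    relayed = {}
    for src in heard:
        for dst in heard:
            if src == dst:
                continue
            value = heard[src]
            if src in liars:
                value = "retreat" if value == "attack" else "attack"
            relayed.setdefault(dst, []).append(value)
    # Decision: majority of own order plus the two relayed values.
    return {lt: Counter([heard[lt]] + relayed[lt]).most_common(1)[0][0]
            for lt in heard if lt not in liars}

# Loyal general says "attack"; L3 is malicious. Both loyal lieutenants
# still decide "attack", because 2 of their 3 values agree.
print(byzantine_round("attack", liars={"L3"}))
```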

You can ask questions; if you have a question, please ask in the chat. That'll make it easiest, because then everyone can read it, and it'll be included in the transcript.

Okay. So there are lots of algorithms to solve the problem. The original algorithm has a number of messages that's exponential in N; newer algorithms have message complexity of order N squared. And there's an algorithm from MIT that people use, developed by Castro and Liskov back in 1999. So the question here is: if the general is malicious, what orders are we supposed to execute for the problem to work? In the case where the general is malicious, you'd basically take the opposite order, right? You'd be able to tell the general is malicious, because they're gonna be telling some people to attack and telling some people to retreat. We still want to reach a general consensus among all of the correct players in the game.

Okay. So these Byzantine fault tolerance algorithms allow multiple machines to make a coordinated decision even if some subset of them are malicious. We have a request, we want this group to make a decision, and we have these malicious participants who are going to take the information coming in on the request and try to make the decision go their way, polluting the process by confusing everyone else and sending mixed messages to the other participants. But as long as less than a third of the population is malicious, the Byzantine fault tolerance algorithms will still be able to come to agreement across all of the other participants. Okay, any other questions about BFT before we change gears? Again, this works well for malicious environments, and it also works well where you have nodes that are flaky and fail in bizarre, Byzantine kinds of ways.

Okay, so let's switch gears and talk about network protocols. We have networking protocols at many levels. At the physical level, there are protocols for the mechanical connectors and for the electrical network: how we represent zeros and ones in waveforms. At the link level, there are packet formats and error control. At the network level, we have routing and addressing. And at the transport level, we might have something like reliable message delivery. If we look at the protocols most commonly used in today's internet: at the physical layer, we have protocols like Ethernet. So I'm doing Zoom on a computer that's connected over Ethernet; that computer also has a WiFi interface and can communicate over that interface too. You can also have cellular protocols like LTE. In between, again, we have the hourglass: IP, the Internet Protocol, as the narrow waist. Then we have transport-layer protocols, like the User Datagram Protocol and the Transmission Control Protocol. On top of that, we can build our applications, and we can have libraries that implement protocols like remote procedure call and so on.

So let's start by looking at the physical link. What kinds of networks do we have? Well, the most common type of network is a broadcast network, where we have a shared communication medium. If you popped open your computer, inside it is a shared medium, a set of wires that interconnects all of the components: the processor, the I/O devices, memory, and so on. Inside the computer we call this a bus, and all devices are simultaneously connected to that bus and can communicate. Originally, Ethernet was a broadcast network, and all computers on a local subnet were connected to one another. We have other examples where the medium is air: when we're in the classroom and we're speaking, that's communication over a broadcast medium, the air. We also have Wi-Fi, and Wi-Fi is a broadcast medium. This is one of the reasons why, when you're at a cafe, people tell you to use a virtual private network, for example the campus VPN: because you're broadcasting the information from your computer, and anyone sitting there in the cafe could actually receive that information, potentially know what you're doing, and potentially get sensitive information. There are other broadcast networks, like cellular networks, where all devices communicate over a shared medium, which is radio frequencies.

Okay, so some details about broadcast networks. In most common networks based on protocols like Ethernet, there's what's called a media access control address, or MAC address. It's a 48-bit physical address for the hardware interface, and every device in the world has a sort-of unique address. I say "sort of" because of the way MAC addresses are assigned: vendors get a prefix. So 3Com gets a prefix, Netgear gets a prefix, and then they assign devices within their space of bits. Because manufacturers produce a lot of devices, they sometimes reuse MAC addresses. So they're supposed to be unique, but they're not always guaranteed to be unique if the manufacturer has recycled some of those addresses. Okay, now: delivery.

So when you broadcast a packet, how does a receiver know that it's supposed to receive that packet? Because that packet goes to everyone. It's literally like standing up in a room and starting to talk: how does someone in the room know that you're actually talking to them, and not just to the room as a whole? If we look at the nodes in a network, they each have a MAC address. Say this one has MAC address three, and it's sending a packet. The way we determine the destination is we take our packet and prepend a header, and that header has the destination MAC address. Now when everyone receives the packet, the one with ID one ignores it, the one with ID four ignores it, and this one says: oh, ID two, that matches the destination. It'll receive the packet, pass it on to the operating system, which passes it on to the application. Now, very important: in Ethernet this check is done in hardware, so the operating system isn't getting interrupted every time a packet is broadcast to look and see, oh, is this packet for me? That said, you can actually disable that hardware check, which puts your adapter into what's called promiscuous mode, because it'll now listen to and receive every single packet that's transmitted. Again, this is the reason why you wanna make sure you use a VPN when you're using public Wi-Fi: anybody can put their adapter into promiscuous mode, receive all of the packets being transmitted by people in that area, and decode them to see what people are doing. Unfortunately, not every application will encrypt its sensitive data, and so you could have passwords and usernames and other things get stolen.
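As a sketch of that destination check: a real Ethernet frame begins with a 6-byte destination MAC followed by a 6-byte source MAC, and the adapter compares the destination against its own address. The hardware does this; the Python below is only an illustration, with made-up addresses.

```python
# Sketch of the destination-MAC filter an Ethernet adapter does in hardware.

MY_MAC = bytes.fromhex("020000000002")
BROADCAST = b"\xff" * 6

def should_receive(frame: bytes, promiscuous: bool = False) -> bool:
    dst = frame[0:6]
    # Normally only frames addressed to us (or broadcast) are passed up;
    # promiscuous mode disables the filter and accepts everything.
    return promiscuous or dst == MY_MAC or dst == BROADCAST

frame = bytes.fromhex("020000000002") + bytes.fromhex("020000000001") + b"payload"
print(should_receive(frame))                                     # True: for us
other = bytes.fromhex("020000000003") + frame[6:]
print(should_receive(other))                                     # False: ignored
print(should_receive(other, promiscuous=True))                   # True anyway
```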

Okay, so an alternative to broadcast networks is point-to-point networks. Why have a shared bus? You could simplify things and increase the available bandwidth if you had point-to-point links and added in routers and switches. So why was this not done originally? Cost. Routers and switches used to be very expensive on a per-port basis; now they're dollars, tens of dollars, or less per switch, so the cost per port is very, very small. And so point-to-point networks are pretty much all you'll see in most environments, except for wireless environments. In a point-to-point network, every physical wire is connected to only two devices: a computer and a switch, a computer and a router, a switch and another switch, a switch and a router, or two routers.

So, switches and routers: what's the difference? I've been using both terms. Well, a switch is a bridge that transforms that shared-bus, broadcast configuration into a point-to-point network. It adaptively figures out what MAC addresses are available on each port, and only routes traffic to the port that has the particular MAC address matching what's in the header. So the switch is actually looking at the header: it sees who is sending this, what the MAC address of the sender is, records it for that port, and looks in a lookup table to find out where the recipient MAC address in the header is located. A router acts as a junction between physical networks. So between switches and, say, the internet, we have routers. And routers, instead of looking at MAC addresses, use IP addresses: they're routing at the layer above.

So they’re routing at the layer above。 So the question is, are switch and router。

both computers as well or just special ones? The answer is sort of yes。

So typically routers are built using, application specific integrated circuits or ASIX。

So this is custom logic。 And, but if you look at what that custom logic is。

a lot of times it’s ARM cores, or other sorts of computational units。

that are the same types of things, that we find in microprocessors。

It is also possible to actually use modern microprocessors, to do routing and switching。

And you can do this if you, for example, want to implement functionality inside the network。

So going against the end-to-end principle, you could push functionality into the network。

implement it on a general purpose computer。 But in general, for the kinds of performance。

that we’re trying to get to, you know, 100 gigabit, 400 gigabit links。

those are typically done using ASIX。 So special purpose hardware。 Okay。

Okay, so let's look at the Internet Protocol.


IP is the internet's networking layer; it's this red layer here in our stack. The service it provides is best-effort packet delivery: it's gonna try its best to deliver packets to their destination, but that means packets might get lost, they might get corrupted, and they might be delivered out of order. The way I like to think about packets is like postcards. You go on vacation, and when we go away we'll send postcards to a bunch of friends, maybe three or four postcards from different places while we're traveling around. And it's always humorous when you get back to find out: what order did those postcards arrive in? Did they arrive in the order we sent them? And did they arrive, period? A lot of times they don't all arrive, or one is missing, and they arrive out of order. Well, that's just like the internet: there's no guarantee that packets won't be lost, that they won't be corrupted, or that they won't be delivered out of order.

So there's a question asking to summarize the difference between switch and router. Again: switches operate on local area networks, and routers connect local area networks to wider networks. Switches operate on MAC addresses; routers operate on IP addresses. Okay, so you can think of IP as a datagram service. We're routing across many physical switching domains, or subnets, so local area networks. Those local area networks, the subnets, are switched, and the routers operate at the network layer to interconnect those subnets.

We divide IP addresses up into, you can think of it as, a 32-bit namespace, divided into four octets, or four bytes. We'll typically write them as those four bytes, dot-separated. So here we have 169.229.6.83; that happens to be the IP address of one of the computer science file servers, one of the departmental file servers.

An internet host is a computer that's connected to the internet. Now, what does that mean? Well, that could be anything: we have smart thermostats, and those thermostats have an IP address; they're connected to the internet. Your phone has an IP address; it's connected to the internet. Your tablet, your computer, all of these have one or more IP addresses, and they're used for routing. For example, on your phone you have an IP address assigned for the LTE or 5G network; when you're on WiFi, you also have an IP address for that WiFi network. So simultaneously your phone has two IP addresses. Now, some of these might be private, and those can't be used for routing. The IP address you get, say, when you're at home is a private network address: you can't route directly to it. And not every computer has a unique IP address. When you're at home, your cable modem or your fiber modem has an IP address, and your local subnet is a private subnet. Your cable modem uses network address translation to translate requests from hosts outside your network to hosts inside your network, and vice versa. So your devices at home all share the same public IP address, and each has an internal private IP address.

A subnet is a network that connects hosts with related IP addresses. It's identified by a 32-bit value and a prefix number of bits, written after a slash: the mask. So 128.32.131.0/24 says that all of the addresses that match in the first 24 bits are on the same subnet. That would be 128.32.131.x, where the last octet is anywhere between 0 and 255. The mask is simply the number of matching bits, so /24 represents the first three bytes, or the first three octets. But it doesn't have to be in units of eight; it could be anything, 22, 31, it all depends. Often, routing within the subnet is done by MAC address: within a subnet, the switch will see all of the hosts on that subnet and just use MAC addresses to route packets between those hosts.
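You can play with this subnet arithmetic using Python's standard ipaddress module; the addresses below are the 128.32.131.0/24 example from above.

```python
import ipaddress

subnet = ipaddress.ip_network("128.32.131.0/24")
print(ipaddress.ip_address("128.32.131.7") in subnet)   # True: first 24 bits match
print(ipaddress.ip_address("128.32.132.7") in subnet)   # False: differs in bit 24
print(subnet.netmask)                                   # 255.255.255.0
print(subnet.num_addresses)                             # 256 (0..255 in the last octet)
```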

All right. So there's a question: when someone from the outside sends a request to one machine inside a private network, how does the router know which computer to send the request to? It's all gonna depend on the computer that made the outbound request to that server. When that happens, network address translation will record that there was an outbound request, and then match that up with the inbound response. There are also protocols like the Universal Plug and Play protocol, or UPnP, where a host inside the network can say: hey, route connections arriving on this particular port on the public IP to my host inside the network. For example, devices like game consoles and telephone adapters use that kind of approach to tell the router to open ports for them. So that's how you do the reverse mapping.

So the question is: the Wi-Fi router in our home is really a combination of switch and router that is responsible for both public and private data transfer? Absolutely. That's why on the back of your router you'll typically see four ports: one is an upstream port, which might connect to another router, and the other ports are internal ports, which you can plug your devices into, and connections will be switched between those devices. And then it's doing a routing function between your internal subnet, like a 192.168 subnet, and the external subnet, which is your provider's wide-area-network subnet. Someone asks: I guess NAT keeps a table for translation which binds an ephemeral port to a destination port, is that correct? Yes, NAT does keep a table that maps between internal ports and external ports.
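A minimal sketch of such a NAT table, assuming TCP over IPv4 and ignoring timeouts and port collisions; the addresses and ports are illustrative (203.0.113.0/24 is a documentation range).

```python
PUBLIC_IP = "203.0.113.5"
nat_table = {}                            # external port -> (internal ip, port)
next_port = 50000

def outbound(internal_ip, internal_port):
    # Bind an ephemeral external port for this internal flow.
    global next_port
    nat_table[next_port] = (internal_ip, internal_port)
    mapping = (PUBLIC_IP, next_port)
    next_port += 1
    return mapping

def inbound(external_port):
    # Reverse-map a response arriving at the public address.
    return nat_table.get(external_port)

pub = outbound("192.168.1.23", 44321)     # private host opens a connection
print(pub)                                # ('203.0.113.5', 50000)
print(inbound(pub[1]))                    # ('192.168.1.23', 44321)
```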

We're gonna get into ports in just a moment, when we start talking about UDP.

Okay, so let's look at the format of an IPv4 packet.


I'm not gonna go through all the fields here, but some of the important ones are: the version (here version 4); the size, which is header plus data; the destination IP address, a 32-bit address; the source address, also a 32-bit address; a header checksum; the type of transport protocol; and a bunch of other options, flags, and features. Not all of the features are supported by all devices, but there's a core set that's required. The function of an IP datagram is, again, just like a postcard: it's an unreliable, unordered, possibly damaged packet sent from a source to a destination.
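As an illustration of that layout, here is a sketch that parses the fixed 20-byte part of an IPv4 header with Python's struct module, following the RFC 791 field order; the sample packet bytes are fabricated.

```python
import struct
import socket

def parse_ipv4_header(packet: bytes):
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return {
        "version": ver_ihl >> 4,             # should be 4
        "header_len": (ver_ihl & 0x0F) * 4,  # in bytes
        "total_len": total_len,              # header plus data
        "ttl": ttl,
        "protocol": proto,                   # 6 = TCP, 17 = UDP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# A fabricated header: version 4, 20-byte header, UDP, 169.229.6.83 -> 128.32.139.48
hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 28, 1, 0, 64, 17, 0,
                  socket.inet_aton("169.229.6.83"), socket.inet_aton("128.32.139.48"))
print(parse_ipv4_header(hdr))
```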


And the function of the network is simply to deliver datagrams. Now, we've already started talking about it, but between two hosts on the internet is a wide area network. You can think of a wide area network as a network that covers a broad area: it could be a city, a county, a state, a country, or an entire planet. The internet is a wide area network. Wide area networks connect multiple physical data-link networks: multiple subnets, or local area networks, get connected by a wide area network. The data-link networks are themselves connected by routers, and different local area networks could use completely different technologies. You might have one local area network using Wi-Fi, another using Ethernet, and another using LTE, or laser, or any other technology, and the routers interconnect all of these different local area networks.

OK, so the role of routers is to forward each packet received on an incoming link to the appropriate outgoing link, the idea being that the outgoing link is closer to the destination. That's going to be based on the packet's destination IP address. Again, in contrast to switches, which deliver packets based on MAC addresses (media access control addresses), routers operate at the network layer and route based on IP addresses. Routers operate store-and-forward: packets are buffered and then forwarded, and there's a forwarding table which maps between IP addresses and output links. So here we can see what it might look like inside a router. We have incoming links with packets coming in on them: a bunch of blue packets come in, get buffered in memory, and then get routed to the appropriate output link. You can see there's some demultiplexing between the green and the red packets: the green end up going out one link, the red end up going out another. And you can see there's multiplexing of the black packets coming in here and the red packets coming in there, both going out on the same outgoing link. We use buffers because we might have more traffic coming in for an output link than we have capacity on that link. These buffers are fixed size, so if too many of these black and red packets come in, consistently exceeding the capacity of the link, and we run out of memory to buffer them, we drop packets. Again, IP is best effort: if we run out of space in the routers, packets get dropped, and it's up to the upper levels to deal with the packets that get lost.

OK, so packet forwarding. Here's an example of an IP packet going from host A to host B. When a router receives a packet, it's going to read the IP destination address of the packet, look at its forwarding table to determine the output port, and then forward the packet to the corresponding output port. Now, we include a default route for subnets where we don't have an explicit entry. Basically, you can think of this as passing the problem off to someone else, hopefully a more authoritative router that knows how to get to that location. So your cable modem, all it knows is its default route, which is: send it to the cable company, and the cable company router will figure out where to send it. That way you don't need a cable modem that has the way to reach every possible network in the world; all it needs to know is that there's another, more authoritative router that does have that information.
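Here is a sketch of that lookup in Python, assuming a tiny forwarding table with a default route and the usual longest-prefix-match rule; the port names and prefixes are made up.

```python
import ipaddress

forwarding_table = [
    (ipaddress.ip_network("169.229.0.0/16"), "port-1"),    # e.g. all of Berkeley
    (ipaddress.ip_network("169.229.6.0/24"), "port-2"),    # a more specific subnet
    (ipaddress.ip_network("0.0.0.0/0"), "port-upstream"),  # default route
]

def forward(dst: str):
    addr = ipaddress.ip_address(dst)
    # Among matching entries, pick the longest prefix (the most specific).
    matches = [(net.prefixlen, port) for net, port in forwarding_table if addr in net]
    return max(matches)[1] if matches else None

print(forward("169.229.6.83"))    # port-2: most specific match wins
print(forward("169.229.99.1"))    # port-1
print(forward("8.8.8.8"))         # port-upstream: the default route
```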

OK, so I've been talking about IP addresses and I've been talking about MAC addresses. It can get very confusing, because we're talking about things that operate at different levels, so here's an example to help you think about the difference. Here we have John Doe. John Doe has a Social Security number, and John Doe lives in Washington, D.C. But John Doe gets an acceptance letter from Berkeley, comes to Berkeley to go to school, moves to Berkeley. So now John has a Berkeley address, but his Social Security number still remains the same. Now, why do we not use MAC addresses for routing? Because it wouldn't scale. How do we know where John, Social Security number 0-0-0-0-0-0-0, is located? We'd need a table that tells us where every single Social Security number is located. So the analogy here is that a MAC address is like a Social Security number, whereas an IP address is like a home address. The thing is that the MAC address is uniquely associated with the device for the lifetime of that device. The MAC address your laptop has for its wireless card is permanent: it's assigned at the factory, and it doesn't change. Whereas the IP address you have is going to change throughout the day. When you're at home, you have one IP address, the public IP address of your cable or fiber modem. When you're on campus, as you move from building to building and switch from eduroam network to eduroam network, your IP address may change. And again, the MAC address never changes.

OK, so why does packet forwarding using IP addresses scale? Because IP addresses are aggregated. You're not just randomly assigned an IP address; rather, addresses belong to organizations. At UC Berkeley, all IP addresses start with the first two bytes being A9 and E5 in hex, so any address of the form A9.E5.something.something belongs to Berkeley. A router in New York just needs one entry to route to all of the hosts at Berkeley. If instead we were using MAC addresses, then that router in New York would have to have table entries for the, whatever, 70,000 devices that we might have active at any given time on campus, so it wouldn't scale. So the question is: does every router in the world know that this address belongs to Berkeley, or only some top nodes? Again, it depends. Depending on the size of the ISP, they may not know how to reach Berkeley, in which case they'll forward the packet on to their internet service provider, who will forward it on to someone else, and eventually you reach a router that is authoritative and can route to it. On the other hand, if you're at a major provider, like, say, Comcast or AT&T, they would have all of the routing tables to reach all of the major organizations on the internet. So the analogy here is: "give this letter to the person with a given Social Security number" is a pretty intractable problem, tracking that person down. On the other hand, if I say "give it to John Smith at 123 First Street in Los Angeles, California, in the United States," it's very easy to route to that particular address.

OK, so some administrative stuff. We have a midterm next week from 7 to 9 PM. It'll cover all course material, with priority on the material from the last midterm to now. And we have a review session coming up next Monday, the 25th, from 1 to 3 PM.

OK, back to routing and naming. We need to go from human-readable names to IP addresses. I don't want to have to remember that www.berkeley.edu is the host with IP address 128.32.139.48; I'm not that good at remembering numbers, so I probably would not be able to remember that and type it in on my phone keyboard. It's also sometimes the case that one name maps to many different hosts: www.google.com maps to hundreds of thousands of different hosts, depending on where you are in the world and on the load on those different servers. So we need a way of mapping from human-readable names to IP addresses.

There's a question: is there a class about networks? Yes. The undergraduate class is CS 168, and the graduate class is CS 268. We're going to do a very surface-level discussion of networking in this class, just because we have a limited amount of time. But because networking is really important to distributed systems, and to systems in general, we're going to give you a basic understanding of how networking works. If you really want to understand the full details and build applications that use networking, take CS 168. OK. So again: IP addresses are hard to remember, and IP addresses change.

If a server crashes and gets replaced by another one, I don't want people remembering that www.berkeley.edu is 128.32.139.48. So the mechanism we use is called DNS, the Domain Name System.


The Domain Name System is a hierarchical mechanism that we use for naming. Names are divided into domains, read from right to left. At the top level, and what's missing here is that there's actually a dot, is the root. Then there's edu, then berkeley.edu, then eecs.berkeley.edu, and then the host we're trying to find. Each domain is owned by a particular organization. The top level is handled by ICANN, the Internet Corporation for Assigned Names and Numbers, and they're the ones that hand out top-level domains. A few years ago they created a ton of new domains, and those were assigned, handed out, by ICANN. Each of the subsequent levels is owned by a particular organization: there's an organization that runs .com, an organization that runs .edu, organizations that run .mil, .gov, .org, and so on. Each of these organizations then hands out names to lower-level organizations. Academic organizations in the US can get a .edu, so there's MIT.edu, and if you want to find out what www.mit.edu maps to, you go to the MIT.edu server, which is run by MIT. Similarly, in our case: say we want to find the EECS web server. We go to the Berkeley campus domain name server, and it tells us: go ask EECS. EECS happens to be one of the departments on campus that runs its own DNS. So you go to the eecs.berkeley.edu DNS server, and it tells you the IP address that the host www maps to.

Now, that takes time; it took a lot of time just to explain it. If every time you wanted to look up www.eecs.berkeley.edu you had to go all the way up to the top level and then all the way down this tree to find the host, that would take a lot of time. It would also, if you think of the billions of devices looking up host names every day, put a tremendous load on the top level. And so we use caching: clients cache the results, typically for anywhere from a couple of hours up to a couple of weeks. It's a trade-off. Cache for too long, and if a mapping changes, things break, because you'll think you can't reach that host. Cache for too short a period, and you'll end up putting more load on the DNS servers. So it's a trade-off in terms of fault tolerance versus load balancing.
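A sketch of what a caching stub resolver looks like, using the OS resolver (socket.gethostbyname) for the actual hierarchy walk; the one-hour TTL is an illustrative choice, not a value from the lecture.

```python
import socket
import time

cache = {}                           # name -> (ip, expiry time)
TTL_SECONDS = 3600                   # cache for an hour (illustrative)

def resolve(name: str) -> str:
    ip, expires = cache.get(name, (None, 0.0))
    if time.time() < expires:
        return ip                    # served from cache: no load on DNS
    ip = socket.gethostbyname(name)  # OS resolver walks the hierarchy for us
    cache[name] = (ip, time.time() + TTL_SECONDS)
    return ip

print(resolve("www.berkeley.edu"))   # first call queries DNS
print(resolve("www.berkeley.edu"))   # second call hits the cache
```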


Okay, so how important is correct resolution? Well, people have constantly been trying to launch attacks against DNS, because that way they could, say, get WellsFargo.com to resolve to their server, create a server that looks just like the real one, and now you connect, think you're connecting to your bank when you're not, and you give up your username and password. So a lot of phishing attacks will often try to work in conjunction with a DNS-manipulation type of attack. And it's because DNS really isn't secure: there are a lot of vulnerabilities and substitution attacks that can be launched against DNS. These have happened before. Back in 2008 there was a major hole discovered in DNS, and it was so critical that people had to widely announce that DNS was broken without actually saying what the vulnerability was, because they didn't want to leak too much information and allow attackers to start using the attack before people could patch their systems. Now, there are other protocols, like secure DNS, but it requires a lot of work to actually deploy them, and people have found vulnerabilities in those secure DNS protocols too. So it's a challenge, and it is definitely a weak link.

Okay. So layering is all about building complex services from simpler services. Each layer provides the services needed by the higher layers, and it does that by utilizing services provided by the lower layer. It adds a level of indirection. If we look down at the physical link layer, things are really very limited: packets have a very limited size, the maximum transmission unit (MTU), which could be anywhere from around 200 bytes if we're dealing with something like dial-up over a telephone network, up to around 1500 bytes for a network like Ethernet, and routing is limited to within a physical link, or maybe through a particular switch. Our goal here is to use abstraction, as we have throughout this class, to go from the messy physical reality to a more desirable abstraction. In reality, packets have limited sizes (MTUs); they're unordered; they're sometimes unreliable, if, say, it's a lossy wireless link; they're machine-to-machine, operating only on the local area network; they're asynchronous, so we never know when they're going to arrive or be sent; and they're insecure. The abstraction we want is secure, reliable messaging: arbitrary sizes, ordered, reliable, from process to process, so from application to application, not just from one machine to another, one operating system to another operating system; routed anywhere in the world; synchronously delivered; and secure, with integrity against manipulation, and perhaps even authentication of who sent it.


Okay, remember again our packet format here; we're going to look at different types of transport protocols.


So we're going to build a messaging service on top of IP; we want process-to-process communication. If you think about what IP gives us, it's like having a postcard that's just delivered to an address, with no name on it. Say you've got five or six people living at an address and a postcard arrives: who is it for? Well, there's a sending address and a destination address, but it's the transport layer that says which application is the one receiving it. We want routing from process to process, so what we're going to do is add ports: a 16-bit destination port and a 16-bit source port. This tells us what the communication channel is, and we'll define a communication channel between two applications by what's called a five-tuple, consisting of the source IP address, the source port, the destination IP address, the destination port, and the protocol. In this case the protocol is UDP, the User Datagram Protocol (sometimes jokingly called the unreliable datagram protocol), and that defines the five-tuple for this connection. With UDP we have datagrams: unreliable packets sent from a source to a destination. The important aspect here is that it's relatively low overhead: on top of the IP header, which was 20 bytes, we add just a few more bytes for our source and destination ports, a length, and a checksum.

Now, here's where things get a little complicated. If we go back to our IP picture, there's a 16-bit header checksum there, which covers only the IP header.


And now we've added a 16-bit UDP checksum, which covers our data as well.
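For reference, the checksum IP, UDP, and TCP all use is the one's-complement sum of 16-bit words (RFC 1071). A sketch in Python; this skips UDP's pseudo-header, which the real UDP checksum also covers.

```python
def internet_checksum(data: bytes) -> int:
    if len(data) % 2:
        data += b"\x00"                           # pad odd-length data
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # 16-bit big-endian word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF                        # one's complement

msg = b"hello, 162"
csum = internet_checksum(msg)
# A receiver sums the data plus the checksum; an intact packet yields 0.
print(hex(csum), hex(internet_checksum(msg + csum.to_bytes(2, "big"))))
```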

Okay, so this type of datagram is often used where we want unreliable delivery but it might be high bandwidth. For example, this video connection: my video would be sent over UDP, and my audio would also be sent over UDP, because it's more important that it get there in a timely manner than that it be perfectly reliable, so there may be some losses or some gaps. Screen sharing is less synchronous, so that might often be sent over TCP, because you want a nice clear picture so you can read all of the words on it; Zoom actually gives you the option of whether you want screen sharing done over UDP or over TCP. UDP allows you to just blast these high-bandwidth communications, and so it's kind of antisocial: it's not friendly to other applications that might simultaneously be trying to use the same network. UDP just sends at whatever rate the sender is sending; that's the rate we push through the network. So the sender controls how it uses the network and ignores other users of the network. TCP, as we'll see, on the other hand, tries to be more social and well-behaved when a network gets congested.


Okay, so the Internet architecture really has five layers: application, transport, network, data link, and physical. The lowest three layers are implemented everywhere: the physical layer connects hosts, routers, and other routers; the data link layer connects hosts, routers, switches, everything. The network layer is used at hosts and routers, but typically not at switches, so switches might only implement the lower two. The application and transport layers only get implemented at hosts, and this is where our sockets get implemented: when we have a connection, we go all the way down the stack, across to a peer, back up, back down, across to another peer, and all the way back up until we deliver. In most cases this is how things work. People do break the end-to-end principle and sometimes implement functionality in the routers; for example, caching content in the network. That would be applications effectively running inside the network instead of running end to end. It's a trade-off: the end-to-end principle says we should only do that at the ends, but we'll do it sometimes in the network because we get efficiency and bandwidth benefits, and latency benefits too.

Okay. So we can think about layering as putting packets in envelopes. An application has some data it wants to send to another application. It takes that data and puts it in a transport-layer envelope, which adds a transport header used by the transport layer. The network layer then puts that into another envelope, adding a network header; it's like you get a letter from someone and then put it into a bigger envelope addressed to someone else. Then the data link layer takes that set of envelopes, puts it into yet another envelope, adding the frame header we need at the data link layer, and then we encode it into the ones and zeros we need for the physical layer. So again, layering has huge benefits, but you can also see, and this is not to scale, that if this is a small amount of data we're trying to send, we're putting a lot of header information on a relatively small amount of data. The efficiency we get at the physical layer could be very limited if most of what we're sending is header and not data.


Okay. So the datagram service is basically a no-frills extension of best-effort IP: instead of raw packets, now we can send longer messages, bigger than the MTU we might have at the lowest data link layer, and it handles the multiplexing and demultiplexing of these packets to processes at the end hosts. So we're able to send these longer messages between two processes. TCP, the Transmission Control Protocol, gives us reliable, in-order delivery. It gives us connection setup and teardown; it deals with packets that might be corrupted or lost; and it gives us flow control, so we don't overload the receiver, and congestion control, so we don't overload the network. There's a question about the difference between the data link layer and the physical layer. The physical layer is the ones and zeros; the data link layer is the protocol we use to access the media. When we're speaking, our physical layer is our vocal cords making phonemes and our ears hearing and receiving them. The data link equivalent would be raising your hand because you have a question you want to ask, something that allows us to do media access control between people, and also saying someone's name, so that you know I'm talking to you in a group as opposed to just broadcasting. Okay, other examples of internet transport protocols, which have standards but which nobody actually uses, are the Datagram Congestion Control Protocol, the Reliable Data Protocol, and the Stream Control Transmission Protocol. But those aren't used.

Okay, so remember how sockets work as a concept. At the server, we create a server socket and bind it to an address, a host IP address and port. We listen for connections, and then use the accept call to accept a connection. The client, which also created a socket, connects it to the host and port we bound the server socket to, and that creates a connection socket. That allows us to write requests, read those requests, write a response, and read that response.
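As a reminder of that flow, here is a minimal echo server and client using Python's sockets API; the loopback address and port are arbitrary.

```python
import socket
import threading
import time

def server():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("127.0.0.1", 50162))       # bind to a host IP and port
        srv.listen(1)                        # listen for connections
        conn, addr = srv.accept()            # accept -> connection socket
        with conn:
            data = conn.recv(1024)           # read the request
            conn.sendall(b"echo: " + data)   # write the response

threading.Thread(target=server, daemon=True).start()
time.sleep(0.2)                              # crude wait so the server is listening

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
    cli.connect(("127.0.0.1", 50162))        # client connects to that host/port
    cli.sendall(b"hello")                    # write the request
    print(cli.recv(1024))                    # read the response: b'echo: hello'
```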

Okay, so when we're talking about reliable message delivery, we're talking about how we make that socket connection work properly. So, the problems. We just went through them: physical networks can garble, drop, or lose packets. At the physical layer, a packet might not be transmitted or received correctly. If we're transmitting at the maximum rate, we get more throughput, but we might have a higher error rate, so some packets might get corrupted or lost. And we transmit at sort of the best power per bit, so at a really low power level; error correction might be able to fix some of the errors, if the physical encoding includes error correction, but it might not be able to fix all of them, so again we'll have some garbling or loss of packets. Then congestion means we might not have any place to put an incoming packet. Think about your phone: it has a relatively weak processor in comparison to a web server that might be sending at a very high rate. If the server sends at too high a rate, you're not going to have enough buffer space in your phone's operating system, and it's just going to drop packets. So you might have insufficient queue space at switches and routers. On a broadcast link, you might have two hosts simultaneously trying to use the link, two hosts that speak at the same time, and receivers will just hear some superposition of the signals and won't be able to distinguish them. And on any network there can be insufficient buffer space if a sender is sending faster than the receiver can process.

So we want to build reliable message delivery on top of these unreliable IP packets. We need some way to make sure packets actually make it to the receiver, and we'd like exactly-once delivery: every packet is received at least once, and every packet is received at most once. We can combine this with ordering and say something even stronger: every packet is received by the process at the destination exactly once, and in order. And this is what the Transmission Control Protocol, TCP, provides. We have a stream of bytes that we send in; they go through routers, switches, more switches, more routers, and at the destination process we get a stream out. So it's a reliable byte stream between two processes on different machines over the internet, and we can do read, write, and flush, just as we could with a file or local IPC.

So, some details. TCP has to take this byte stream and fragment it into packets handed to IP, and IP may turn around and break those up into further fragments based on the maximum transmission unit. It uses a window-based acknowledgment protocol, which is a way of minimizing the state we need to maintain at the sender and also at the receiver. The window is partly a reflection of how much storage we have at the receiver, because we don't want the sender to overrun the receiver's buffer space. But the window also has to reflect the speed and capacity of the network: we don't want to overload the network either. Also, it takes time, the speed of light is finite, for the traffic we inject into the network to actually reach the receiver, and we want to make sure that traffic isn't going to overload the receiver either. TCP automatically retransmits any packets that are lost or garbled, and it tries to adjust its rate of transmission to be a good citizen. Unlike UDP, where if I just start blasting packets out I can blast at whatever rate I want, the network could easily become overloaded, and other clients could see their traffic impacted by my UDP streams, TCP tries to figure out who else is transmitting and what the available capacity is, and adjusts itself to fit within that available capacity. So it tries to be a good citizen.

Okay, so let's look at the problem of dropped packets. All physical networks can garble or lose packets; it could be hardware issues. And IP, because it's built on those physical links, can garble or drop packets, and it doesn't repair them. IP doesn't try: the only thing IP checks is its header checksum, which covers just the header. If the header gets corrupted, we throw away the packet, because we can't route it. But otherwise the packet could be damaged, and it's up to the application to figure out what to do, because that's the end-to-end principle. And again, that's important, because we want protocols like UDP: sometimes we're willing to accept damaged or corrupted packets and still try to decode them, using, for example, an audio codec or a video codec that can handle corruption. You still see something; it might be a garbled frame, or you hear something that's a little garbled, but you get it, instead of simply having silence and missing all of the content and information.

Okay, so we need reliable message delivery: make sure packets' integrity is preserved, and make sure they arrive exactly once. We can do this by using acknowledgments. We have checksums that detect whether a packet got garbled when we send it from A to B. If it's garbled, we simply discard it; if it's okay, then we send back an acknowledgment, which tells the sender that we correctly received the packet. Now, if we send the packet and it gets corrupted and rejected at B, or the network loses it, then there'll be a timeout: A starts a timer when it sends a packet, and if it doesn't get an acknowledgment within that timeout window, it simply retransmits the packet, starts the timer again, and waits for the acknowledgment. Some questions should pop into mind when I say something like this. The first is: if the sender doesn't get an acknowledgment, does that mean the receiver did not get the original message? It could be, but not necessarily, because the acknowledgment itself is a packet sent unreliably, and so it could get lost. It's sort of like I send you a postcard, and you send me back a postcard, and the post office treats postcards as the lowest priority, and it gets stuck in machines or lost in the carrier's bag; that acknowledgment, the datagram that comes back, could just get lost. Or it could be that it gets delayed: I do send back the acknowledgment, but it arrives after the timeout period, and so you've already retransmitted the packet. The sender doesn't get the acknowledgment in time, so it simply retransmits, and then the receiver is going to get the message twice and acknowledge each copy.

Okay, so let’s look at a protocol that we could use。 This protocol is called stop and wait。

we’ll look at this in the context, first of the case where we don’t have any packet loss。

So we’re going to send, and then wait for an acknowledgement。 Okay, and then repeat。

So we send packet one, we get an acknowledgement for packet one。

So this time between when we send the packet, it goes to the receiver。

and we get a response is called the round trip time。

And that is the time it takes the packet to travel from the sender to the receiver and back。 Right。

Now, the one way delay, call it D, is the delay from the sender to the receiver。 And if the links we have are symmetric in terms of latency, then the round trip time is going to be twice this one way delay。 Right。

So now we send our second packet。 Right, and we get our second acknowledgement back。 Okay。

and then we can send our third packet。 And we’ll get our third acknowledgement back。 Right。 So now。

how fast can we send data with this kind of approach。 Well, think about it。

What we can use is Little's Law, applied to the network。 It says that N, the number of packets in flight, equals the bandwidth B times the round trip time。 So for stop and wait, we have one packet in flight。

Right, there's one packet: we send a packet, wait a round trip time for the response, and then send the next packet。 So our bandwidth is going to be one packet per round trip time。

So this means that our bandwidth depends on latency, not the capacity of the network。 Right。

so imagine we have a 10 gigabit connection between Berkeley and Paris。 Right。

that’s very high latency。 That could be, you know, a couple hundred milliseconds。

And we’re not going to be able to send at anywhere near the capacity of that 10 gigabit link between Berkeley and Paris。

Okay。 So suppose as an example, our round trip time is 100 milliseconds。

and one packet is 1500 bytes in size。 Then our throughput, our bandwidth for the connection, is going to be 1500 bytes over 0。1 seconds, that is 15,000 bytes per second, which is 120 kilobits per second。

So again, if we have 100 megabit per second link, we’re only using 120 kilobits per second of that capacity。

So we’re using a tiny fraction of the capacity of that link。
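To make those numbers concrete, here's a quick back-of-the-envelope computation using the example figures from the lecture (the 10 gigabit, 200 ms Berkeley-to-Paris values are the illustrative ones from above):

```c
#include <stdio.h>

int main(void)
{
    /* Stop-and-wait: one 1500-byte packet per 100 ms round trip. */
    double rtt_s    = 0.1;
    double pkt_bits = 1500.0 * 8;
    printf("stop-and-wait throughput: %.0f kbps\n",
           pkt_bits / rtt_s / 1e3);            /* prints 120 kbps */

    /* Bandwidth-delay product: data that must be in flight to fill
     * a 10 Gbps link that has a ~200 ms round trip time. */
    double bdp_bits = 10e9 * 0.2;
    printf("window needed to fill the link: %.0f MB\n",
           bdp_bits / 8 / 1e6);                /* prints 250 MB */
    return 0;
}
```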

So loss recovery here relies on timeouts。 All right, so we have our round trip time。

and we're going to add something to our round trip time to calculate what our timeout value should be; maybe here it's 50%, so 1。5 times our round trip time。

And so if we don’t get our acknowledgement, either because the packet didn’t get delivered or the acknowledgement didn’t get delivered。

we’re going to time out and then retransmit the packet。 So how do we choose a good timeout value?

Well, if we make it too short, just a little bit over the round trip time, or maybe exactly the round trip time, there's going to be some variability, because there may be other congestion and traffic in the network, which will cause packets to exceed that average round trip time and that timeout。 And then we're going to be retransmitting a lot, which leads to a lot of duplication。

If we set it too long, say all the way down here, then we're going to have a lot of disruption when there's a packet loss, because it's going to take us a long time to realize that the packet has been lost and to actually retransmit it。

Now, how do we deal with the fact that, if we set our timeouts too short, we're going to resend packets before we get the acknowledgement for them, so we'll send packet one twice? Or, like in this case, the acknowledgement gets lost, but the packet was actually correctly received。 So retransmission means this can happen。

And so a particular approach we could use is simply to put a sequence number in each message to identify retransmitted packets。 The receiver is just going to check for duplicate sequence numbers, and if it gets a duplicate message,

It’ll just simply discard that message。 Right, so the requirements here are the sender keeps a copy of any messages that have not been acknowledged。

Right, and that’s easy, you just, you buffer those messages。

and just retain the buffer until you get an acknowledgement。

and the receiver tracks possible duplicate messages。 So this is a little bit hard。 Right。

because when is it okay to forget that I actually received a particular message?

Right, so here's an alternating bit protocol, where you send one message at a time, and you don't send the next message until you get an acknowledgement。 The sender just keeps track of the last message, and the receiver just needs to track the sequence number of the last message it received。 So: you send packet zero。

You acknowledge zero, you send packet one, you acknowledge one, you send packet zero。

you acknowledge zero。 The advantage is that this is simple and has low overhead: it's a single bit of sequence number that we need to keep track of。

The downside though is that if the network can arbitrarily delay or duplicate messages, then we might get a packet zero and not be able to tell whether it's a new packet zero or a duplicate of the previous packet zero; or we might miss packet zero entirely and only see packet one。 So there are a number of cases where we end up not being able to tell whether we're getting a new message or a duplicate of an old one。
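Here's a minimal sketch of the receiver side of such an alternating-bit protocol; deliver_to_app() and send_ack() are assumed placeholder helpers:

```c
#include <stddef.h>

/* Assumed helpers; placeholders, not a real API. */
void deliver_to_app(const void *data, size_t len);
void send_ack(int seq);

static int expected = 0;   /* the one-bit sequence number we expect next */

/* Alternating-bit receiver: always ACK, but only deliver a packet whose
 * sequence bit matches what we expect; anything else is a duplicate. */
void on_packet(int seq, const void *data, size_t len)
{
    send_ack(seq);                 /* ACK even duplicates, so the sender
                                      can stop retransmitting */
    if (seq == expected) {
        deliver_to_app(data, len); /* new packet: hand it up */
        expected ^= 1;             /* flip 0 -> 1 -> 0 -> ... */
    }
    /* else: duplicate of the last packet, silently discard */
}
```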

So there are lots of advantages to moving away from stop and wait。 If we have a larger sequence number space,

then we can pipeline。 Right, so we can send multiple messages。 Right。

while we’re waiting for those acknowledgments to come in。 Right。

so the acknowledgments serve dual purposes。 One is they confirm that a packet was received。

And they provide us with ordering, right, because packets could be reordered at the destination。

and so we might receive packet one, before we receive packet zero。

So we might receive acknowledgement one before we receive acknowledgement zero。

But if we have sequence numbers drawn from a large enough space, then we can differentiate between messages that are duplicates and messages that arrive out of order,

and we can reorder those at the receiver, and avoid having to retransmit them from the sender。

So now, how much data can we have in flight? Going back to Little's Law, our sending window is going to be the round trip time times our bytes per second。 So the sender's window size tells us how much data we can have in flight; the packets in flight would be the window size divided by the size of those packets。

So how long does the sender have to keep the packets around? That is, how much buffer space is the sender going to need? Another question we could ask is how long the receiver has to keep packet data around, so that it can tell the sender 'yes, I've already received that' if the sender tries sending it again。 And what if the sender is sending packets faster than the receiver can process the data? If that floods the receiver and the receiver runs out of buffer space, then all of those packets were sent in vain, because they're going to have to be repeated anyway: the receiver dropped them on the floor。


So, remember when we have communication between processes, we have this in-memory queue。 You write into the queue as a producer, and the consumer, process B here, is reading from that queue, and the queue has a fixed capacity。 So if process A exceeds the capacity of that queue, then we block it。 And similarly, if process B, the consumer, is trying to read from the queue and it's empty, it'll wait, it'll block。 POSIX provides this in the form of pipes。

So when we think about buffering in a TCP connection: we have a host here with process A, which has a queue of outgoing packets, and process B, which has a receive queue of packets that have been received but not yet consumed by the process。 So you can see we've taken that single queue that we had in a Unix environment and split it into two queues: a queue at the sender for outgoing packets, and a queue at the receiver for packets that have been received but not yet processed。

So process A puts packets in the send queue, the packets get sent across the network and received, and then they get delivered to process B。 And this is bidirectional, because the response goes back from host two in its own send queue, across the network into a receive queue, and then to process A。

So there's a separate pair of queues per TCP connection; every TCP connection has both a receive queue and a send queue associated with it。 So we need four in-memory queues, two at each of the hosts, to buffer sends in one direction and to buffer receives in the other direction, from each perspective of that connection。

Okay, so the window size is how much remaining space we have in that receive queue, and a host is going to advertise this window size in every TCP packet it sends。 So all those acknowledgments and any outgoing traffic are going to say: hey, this is how much space I have in my buffer, don't overrun it。

So the sender is never going to send more than the receiver's advertised window size, even if the network link would support sending much more data。 So we're going to bound the amount of data in flight by the receive queue。 All right,

so we use a sliding window protocol, and the sender knows that it should never exceed the receiver's advertised window size。 But packets that it previously sent might still arrive and fill that window。 So as a result, the sender needs to ensure that the number of sent-but-unacknowledged bytes is less than the advertised window size, and that includes any packets that are in flight to the receiver。 They're considered in flight because the receiver has not yet acknowledged that it received those bytes。 So we can send new packets as long as the sent-but-unacknowledged bytes haven't already filled the advertised window。

That's the way of getting around the fact that, because of the speed of light and round trip times and delay, we could have a lot of packets in flight, and we don't want the receiver saying 'hey, I'm full' only for the sender to have already overrun it by the time that message gets back。 So the sender keeps track of what it has already sent into the network。
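As a sketch of that sender-side bookkeeping, using the conventional names snd_una and snd_nxt for the window edges (sequence number wraparound subtleties are glossed over here):

```c
#include <stdint.h>

/* Sender-side sliding window state (simplified). */
struct tcp_sender {
    uint32_t snd_una;   /* oldest byte sent but not yet acknowledged */
    uint32_t snd_nxt;   /* next byte to be sent */
    uint32_t adv_wnd;   /* receiver's advertised window from its last ACK */
};

/* How many more bytes may we put in flight right now?
 * Invariant: (snd_nxt - snd_una), the bytes in flight, <= adv_wnd. */
uint32_t usable_window(const struct tcp_sender *s)
{
    uint32_t in_flight = s->snd_nxt - s->snd_una;
    return in_flight >= s->adv_wnd ? 0 : s->adv_wnd - in_flight;
}
```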

Okay, so here's an example with a window size of three packets。 The window size W needed to fill the link is given by the bandwidth, in packets per second, times the round trip time, so Little's Law comes into play once again。 Now for TCP, the window is in bytes, not in packets, because it's bytes in the buffer。

So we have unacknowledged packets that the sender has sent, and we have out-of-sequence packets that might be in the receiver's window, because again the network can reorder。 So a packet goes out, another packet goes out, another packet goes out; now we have filled the receiver's advertised window, so we wait until we get an acknowledgement。 When we get the acknowledgement for the first packet, we can send another one; when we get the acknowledgement for the second, we can send another; and so on: each acknowledgement slides the window forward and frees up space to send one more packet。

So, with TCP, again it's per byte。

We have three regions at the sender。 The first region is bytes that have been sent and acknowledged; those we can forget about。 The second region is bytes sent but not yet acknowledged。 These could be in flight; they could represent packets that have been lost, packets that have been garbled, packets that simply haven't been delivered yet, or packets whose acknowledgments just haven't made it back because of the round trip time。 And then we have bytes that have not yet been sent。 And it's this window, the colored region here in blue, that's adjusted by the sender based on the information we get from the receiver。

Similarly, think about the receiver。 The receiver has bytes it has received and delivered up to the application, bytes that have been received and buffered, and bytes that have not yet been received。 And you can see that this is a smaller region than at the sender, because these are packets that could still be in flight: they haven't been acknowledged yet, so the acknowledgments here are in flight, and those over there are packets in flight that haven't been received yet at the receiver。 Okay,

so here's an example of how this window-based acknowledgement works in TCP, with bytes。 We send sequence number 100, size 40, so a packet goes out at 100; then another packet goes out at 140, size 50, taking up buffer space from 140 up to 190。 And we get an acknowledgement covering everything up through 190。 Then we send out a packet for sequence number 230, size 30, and so here the receiver is going to say: hey, I've only heard everything up to 190, even though I've got the later data。 So we've got this gap from 190 to 230。 The sender, meanwhile, is still sending packets, and the receiver keeps receiving them。 So here we are at sequence 260, size 40, and the receiver is saying: I'm acknowledging, but I still have only heard everything up to 190。

So the question here is: if the receiver's ACK doesn't get through and the sender doesn't know the window size has increased,

will TCP stall? Yes, eventually, though that's not quite what's happening here; here the receiver is sending back an acknowledgement saying 'I've missed a packet'。 If the sender doesn't get these acknowledgments, eventually TCP will stop at the sender, because it will think it has filled the advertised window, and it will wait until it hears an acknowledgement saying: okay, it's all right to proceed。 It could be that the acknowledgments aren't coming back because they're getting lost; it could be that the outbound packets are getting lost; or, as in this case, things might be arriving out of order。 There could be a variety of reasons, but yes, eventually TCP will stop, from the sender's point of view, if it believes it has filled the window and has not gotten acknowledgments。

Okay, so here we've sent up through 340, and the receiver is still sending acknowledgments for 190。 Eventually one of those acknowledgments makes it back, and so now the sender will say: oh, I need to retransmit 190, and it fills in that gap。 It arrives at the receiver, and the receiver will then send back: okay, I got everything up to 340。 So it's acknowledging all those intermediate packets, and that avoids needing to retransmit all of them; we can keep going until we fill the window。

All right, so let me just summarize。


Okay so what TCP gives us is a reliable byte stream between two processes that are on different machines over the Internet。

It gives us the ability to have read, write and flush。

and uses the window-based acknowledgement protocol we just looked at。

And in the next lecture we’ll look at how it uses congestion avoidance to dynamically adapt the sender’s window to account for congestion in the network。

And so I'll see everybody on Thursday, on Tuesday rather, and hopefully we will be back in person then。 Thank you。


P25:Lecture 25: RPC, NFS and AFS - RubatoTheEmber - BV1L541117gr

Okay, let’s get started。


So this is lecture 25, and we're going to continue talking about TCP, then dive into remote procedure call, and then we're going to look at distributed file systems。 Okay, so remember, with the laser pointer up here, with the Transmission Control Protocol: it's all about delivering bytes reliably from one host to the other。

Now we looked at how we get that reliability, using a sliding window protocol where we set the size of the window to ensure that we don't overrun the receiver's buffer。

But remember that one of the other properties that TCP offers is that it tries to be a good。

Internet citizen。 That is it’s going to try and not overload the network between the sender and the receiver。

So that’s what we’re going to look at next, which is how we avoid congestion。

So let me talk about first what I mean by congestion。 Okay。

so congestion is when we have too much data that we’re trying to flow through some。

part of the system。 Now it could be the input link or it could be the output link。

So it could be the link from the server, the rest of the Internet or it could be the link。

between your computer at home and your ISP。 Or it could be any of the links in between, right?

Or it could be because there’s other traffic in the Internet, right?

You're not the only one using the links between here and whatever server you're connecting to, in most cases。 And so what does IP do? The Internet Protocol will just simply drop packets if there's too much traffic

for a given link。 Right? Because if there’s too much traffic consistently for a given link。

it's going to overflow the buffers, those output buffers in the router。

And that’s completely acceptable, right? We said IP is best effort。

And so if it runs into that situation, rather than doing something complex, the router is free to just simply start dropping packets。 Now if we think about this,

what’s going to happen to the TCP connection? If you remember from last time the sliding window protocol。

if we don’t get acknowledgments, we’re just going to simply retransmit。 Right?

And so then we end up with lots of retransmissions。

And those retransmissions are going to cause more congestion。

And that congestion is going to cause more packets to get dropped。 And more packets getting dropped means more of our acknowledgments or source packets are going to get dropped, which means we'll just keep retransmitting。 Right?

And so when bandwidth is kind of scarce, we’re going to just keep injecting more and more。

traffic to get our packets through。 Right? So that doesn’t really work。 So instead。

TCP tries to implement congestion avoidance。 So think about it: how long should we wait to do a retransmission? If we wait too long, then we're going to waste time when messages actually do get lost。 And if we wait too short, then we're going to retransmit when the acknowledgement was just slightly delayed,

And so we’re going to waste our precious network bandwidth。

So there's kind of a stability problem here。 The more congestion there is, the more acknowledgments get delayed; the more acknowledgments get delayed, the more we unnecessarily time out and retransmit; and the more traffic we inject, the more congestion we cause, delaying our acknowledgments even more, and so on。

So you can see how this is kind of a vicious cycle。

So it’s kind of closely related to the window size of the sender。 Right?

Because if the window size is too large, we’re trying to push too much traffic through the。

network and the network can’t handle it。 So think about how we choose the sender’s window size。

Right? Well, we choose it based on the available buffer space that the receiver tells us they have。

So the sender doesn't overflow the receiver's buffer。 But now we can think about it differently。 Okay: so far, the only purpose of this window has been to ensure we don't overflow the receiver。

But we can also say, well, it’s, it’s also to make sure we don’t overflow within the network。

We don’t push the network into congestion。 So we want to really match the rate at which we send packets with kind of the slowest link。

in the network, the bottleneck link。 Okay。 So the way the sender does this is it uses an adaptive algorithm to decide what N, the window size, should be。 All right。 So our goal is to fill the network between the sender and the receiver。

And the technique we use is: every time we get an acknowledgement, we increase the window a little bit, until some delay or loss occurs, and then we adapt。 TCP calls this slow start, because we start sending slowly。 With each acknowledgement we get back, we're going to open the window by one。 All right。 If we get a timeout, that's our signal that there's congestion in the network, and so we're going to slam the window down, cutting it multiplicatively。

So this is an additive increase, multiplicative decrease algorithm。

And so you can think about other alternatives。 Like, what if we just decrease the window by one? That would be additive increase, additive decrease。 Well, it turns out that wouldn't respond fast enough, and so we'd still have a congested network。 Our goal here is, when we push the network into congestion, to very quickly back off and pull ourselves out of congestion。 Okay。 So how does TCP do this?

So it's going to artificially restrict the window size if it sees packet loss。 And we have to design this control carefully, because it's really critical that we don't overrun or overwhelm the network; but at the same time, we want to make sure we're using as much of the network bandwidth as possible。 That might argue for multiplicative increase, multiplicative decrease。 But again, we need careful control: if we do a multiplicative increase, we're going to push the network way into congestion before we pull back。 All right。

So when they designed the TCP slow start algorithm, they looked at all of these different choices in the design space: what if we open the window really quickly and close it really quickly, or open the window quickly and close it slowly, and so on。 And what they found was that this gave the most stability。

Here's the graph on the left, with time on the x axis and the window size on the y axis, where the solid line is the actual available capacity and the other one is the actual window size。 What we get with AIMD is an algorithm that oscillates a bit, but it oscillates around what the true window size in the network is, what the network can truly support。 And you can see here, as we grow, we end up packing packets into the network until we see losses, and then there's a big drop in the window size; then it tries opening back up, then another big cut, then it opens a little bit, and so on, until we get to a relatively stable state。
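A simplified sketch of that AIMD adjustment; real TCP also tracks ssthresh, slow start, fast retransmit, and so on, all omitted here:

```c
/* AIMD congestion window adjustment (window in packets for simplicity). */
struct cc_state {
    double cwnd;   /* congestion window */
};

void on_ack(struct cc_state *s)      /* additive increase */
{
    s->cwnd += 1.0 / s->cwnd;        /* about +1 packet per round trip */
}

void on_timeout(struct cc_state *s)  /* multiplicative decrease */
{
    s->cwnd /= 2.0;                  /* cut the window in half */
    if (s->cwnd < 1.0)
        s->cwnd = 1.0;               /* always allow at least one packet */
}
```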

All right, questions? Yeah。 Very good question: how does congestion management interact with not overflowing the receiver's buffer? The cap on the window size is always the receiver's advertised space。 So we're never going to exceed the receiver's space, but at the same time, we're not going to exceed the network capacity。 Maybe the receiver is a server with a very large buffer space; we still don't want to overwhelm our weakest link, the lowest bandwidth or most capacity-constrained link in our network。 And vice versa: maybe the receiver is a client, say your mobile device, which doesn't have a lot of buffer space; even if you're connected over a gigabit network, we want to make sure we don't overwhelm that device。

Now, one thing I didn't mention is what happens when there's a lossy link。

So if you're operating at the edge of cell coverage, this algorithm is actually going to hurt you a lot, because those losses, packets getting dropped or acknowledgments getting dropped, are going to cause the window size to get closed down really small。 That's why, if you're trying to watch some video on the edge of coverage, it'll buffer, it'll stutter, it'll drop to a lower resolution: TCP is struggling to keep the window open and push packets through, because for every loss it sees, it thinks the link is congested, even if it's not actually congestion。 So TCP works well until the loss rate hits about 10%; around 10% loss, it starts to fall apart, treating all that loss as congestion。

Okay。 So remember how we set up a connection over TCP:

we identified our connections by this five-tuple of source IP address, source port, destination IP address, destination port, and protocol, which would be, say, UDP or TCP。 Now, the client port, of course, is randomly assigned, while the server port is often some well-known port, like 80 for web servers, 443 for web servers using SSL, or 22 for SSH。 All right。

So what is the process of establishing this connection over the network? All right。 At the application layer, we have sockets; at the transport layer, there are three phases。 There's connection establishment, which we do with the three-way handshake: that's how we open the connection。 Then you send bytes back and forth, the reliable byte stream transfer; that's what we looked at last week, and we looked at how we do congestion control today。 If the connection fails, we repeatedly try to retransmit; eventually a timeout occurs and you reset the connection。 So when your web browser reports 'the connection to the server was reset', that means TCP failed: it couldn't get packets through, whether from the server to the client or the client to the server。

Now, when we're done, we close the connection, and we need to tear down the connection state; both the client and the server have to agree: okay,

the connection is now terminated。 All right。 So let's start with that first part, where we're connecting from a client to a server socket: connecting to that web server on port 80。 What the client is going to do is a three-way handshake。 It starts with the client, the initiator, sending messages to the server; time goes from top to bottom。 On the server side, the server calls listen and waits for a new connection。 Now the client is going to call connect, providing the server's IP address, or a name that it resolves into an IP address, and a port。 All right。 So now it will send out a message: a SYN packet。 All right。

So a SYN gets sent from the client to the server, proposing an initial sequence number。 All right, and remember, you don't always start at zero; you pick some random number and start at that。 That way, if you have two hosts talking back and forth multiple times, you don't end up confusing packets from one session, one connection, with packets from another connection。 All right。

So it sends here a SYN with sequence number X, and each side is going to do that, because remember, these are bidirectional connections。 The client sends to the server using one sequence number space, and the server will respond back with messages in the other direction using sequence numbers from a different space。 So the server sends back a SYN-ACK: a SYN saying 'here's my proposed initial sequence number', and an acknowledgement saying 'I received your packet with sequence number X; here's the next one I'm expecting'。 And of course the client then does the same thing: it sends back an acknowledgement saying 'okay, I got your packet, and here's what I'm waiting for next'。

So now our connection has opened, and we're going to allocate buffer space at the server, and we're ready to use the connection。 Now we will schedule some thread, which can then go and accept that connection and proceed。 All right。 So then we do our byte exchange back and forth。
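From the application's point of view, all of this happens inside the sockets calls。 A minimal sketch, with error handling omitted and 10.0.0.1 as a made-up server address:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Client side: connect() performs the three-way handshake. */
int dial(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);             /* TCP socket */
    struct sockaddr_in srv = { .sin_family = AF_INET,
                               .sin_port   = htons(80) }; /* well-known port */
    inet_pton(AF_INET, "10.0.0.1", &srv.sin_addr);        /* made-up address */
    connect(fd, (struct sockaddr *)&srv, sizeof srv);     /* SYN, SYN-ACK, ACK */
    return fd;
}

/* Server side: listen() waits; accept() returns a new fd once some
 * client's handshake completes. */
int serve(int listen_fd)
{
    listen(listen_fd, 16);                 /* allow 16 pending connections */
    return accept(listen_fd, NULL, NULL);  /* fd for this new connection */
}
```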

We'll look next at what we might send back and forth。 But first, we're going to look at what happens when the two parties are done and want to close the connection。 We've sent all our data back and forth; what do we do?

There's a four-way teardown, because both the client and the server have state and buffers, in each direction, and we need to throw away all of that state; and both sides have to agree。 Okay,

so host one, the client say, is going to do a close。 It sends a FIN packet to the server。 That tells the server: okay, I'm done with this connection, I'm not going to send you any more data。 The server responds with an acknowledgement: okay, got it, no more data is coming in; I agree that this side of the connection is closed。 If that host sends subsequent data on the connection, we're going to ignore it; as far as we're concerned, we're done, the socket is closing。 All right。

Now the server will also call close, and it's going to send a FIN packet, and host one will send back a FIN-ACK saying: okay, I agree to close your side of the connection。 And then there's a timeout here on host one; after that timeout occurs, we throw away all state associated with this connection。 The timeout is there because maybe host two didn't get the FIN-ACK, and so it might retransmit: 'hey, I said I'm closing this connection'。 Okay。 So within that window, we keep enough state to be able to respond back with the FIN-ACK again; outside of that window, we don't。 All right。 So that gives us a window in which we can respond back to the second host and say: yes, this connection is closed。

Now, before we can actually tear down the connection, if there are multiple file descriptors on the host referring to the connection's socket, all of those have to close; there's some reference counting in that approach。 The same thing happens on the host two side: anyone holding a file descriptor has to

agree and close it。

Yes; so the question is, how does host two know whether host one received the FIN? Well, that's why host one sends the FIN-ACK back。 Right, so then how does host one know that host two received the FIN-ACK? If that doesn't get received, it doesn't matter; we're still going to throw away the state。 Sometimes that happens: the FIN or the FIN-ACK doesn't get through, one side thinks it's still connected, it tries to send some data, and it gets back 'I don't know who you are', and then it just resets the connection。 You see that happen。

There's a question in chat: why doesn't host two close the connection when it sees the first FIN? Because both sides have to agree。 On host one, all the processes using that file descriptor have to agree it's closed, and host two also has to agree that it's closed; that's why there's this four-way handshake。 One side can't unilaterally just say the connection is closed。 All right。

If I do close the connection, then once I receive that responding FIN, I'm going to throw away all of the state after the timeout occurs。

And the question is: how does host one differentiate data and FIN? FIN is a special kind of packet。 Data packets are of a data type, and SYNs, FINs, and ACKs are flags that can be put in other packets; so you can piggyback an ACK on a data packet, or even on a SYN packet。 The SYN-ACK that goes back, for example, is a single packet that contains both a SYN and an ACK。 Yeah?

What's the relation between sockets and ports? Ports are what you're listening on, or what you're sending from as a client; ports are at the protocol level。 The socket is at the application level, and it represents the connection between two machines。 So again, when we define a connection, it's defined as a source IP address, source port, destination IP address, destination port, and a protocol; and that five-tuple needs to be unique。 Okay。

Now remember, we build distributed applications with messages: I send a message to a mailbox, a server port, over a connection。

So the question is what’s in those messages? All right。

And so that’s what we’re going to talk about。 So it’s really a question of how do we represent data in a message?

Right? Because applications work with objects: I build a data structure in C or build an object in Java, and now I want to send some representation of that over the network。

All right。 So I want to go from a machine specific binary representation to bytes that I can send over。

the network。 And this is in contrast to when we think about a single process, right?

The threads running in that process can view data structures that contain pointers and other such things, and it's easy for us to follow those pointers: you just dereference the pointer and that gives you the next node in, say, a tree。 Right?

But we don’t have shared memory when we have two separate machines。

So that external representation is going to have to take something like a tree structure and turn it into some serial sequence of bytes。 That is called serialization, or marshalling: taking the binary representation of an object in a machine's virtual address space and turning it into a sequence of bytes。 The reverse of this is deserialization, or unmarshalling。

And that’s taking the bytes and kind of rehydrating it into what could potentially be a complex。

data structure or a video or an audio file or a database file。 Okay。 So, simple data types, like an integer。 The goal is: I want to write the integer x into a file。 Well, I could just open the file and write it, and I have two choices。 I could write it as an ASCII string, the ASCII representation of, say, 162, followed by a newline, say; or I could write it as four binary bytes。 Of course, in that case I open the file with the binary flag, so I don't end up with any translation issues。 So there are two different ways I can serialize it:

serialize it as the sequence of bytes that I have in memory, or serialize it as a string of ASCII characters。 Now which is the right way? Turns out it doesn't matter; both are mechanisms we could use。 One is a little less dense: if I want to represent a large integer as an ASCII string, it's going to take more than four bytes。 But I do have to make sure that if I write an ASCII string at the sender, the receiver parses an ASCII string, scanf's it, say。 Okay。
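A small illustration of the two choices (the file names are arbitrary):

```c
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int32_t x = 162;

    /* 1) As ASCII text: self-describing and portable, but less dense,
     *    and the receiver must parse it back (with fscanf, say). */
    FILE *t = fopen("x.txt", "w");
    fprintf(t, "%d\n", x);
    fclose(t);

    /* 2) As 4 raw binary bytes: dense, but now byte order and word size
     *    must be agreed upon by sender and receiver. */
    FILE *b = fopen("x.bin", "wb");   /* "b": no newline translation */
    fwrite(&x, sizeof x, 1, b);
    fclose(b);
    return 0;
}
```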

How do we know that the recipient represents x the same way? Think about pipes: a pipe is between two processes on the same machine, so we know that the machine's binary representation of the integer x is going to be the same on both ends。 But what about sockets? What if it's sent between my Mac and your Windows machine, or between my Mac and some MIPS machine? Well, then I have to worry about endianness when it comes to integers。

You probably remember this from 61C: for a byte-addressed machine, when I refer to a type like an integer, what does that byte address refer to? In the case of a big endian machine, it refers to the most significant byte; for a little endian machine, it refers to the least significant byte。 And here I have a little chart of processors and their endianness: the Motorola 68000 is big endian, PowerPC is big endian, Intel x86 is little endian, and some processors are bi-endian, able to run either little or big endian。 You can check your own machine by running a little program like this one。
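Here's a reconstruction of the kind of endianness-check program being described; the exact code on the slide may differ:

```c
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t x = 0x12345678;
    unsigned char *p = (unsigned char *)&x;  /* view the word byte by byte */

    printf("value: 0x%08x\n", x);
    for (int i = 0; i < 4; i++)
        printf("byte %d: 0x%02x\n", i, p[i]);
    /* Little endian (x86): 78 56 34 12, least significant byte first.
     * Big endian (e.g. 68000, PowerPC): 12 34 56 78. */
    return 0;
}
```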

All the program does is take a value, print out the string representation of that value, and then print out what each byte in the word contains。 Since this was run on a little endian machine, we see the least significant byte, the 0x78, come first, followed by bytes of increasing significance。 If I ran this on, say, a PowerPC, I'd see the opposite: 0x12 first, then 0x34, 0x56, 0x78。 Right。

So then I can ask the question: if I've got two machines communicating over the Internet,

what endianness is the Internet? What's the network byte order for an integer on the Internet, versus what might be the host byte order? And how do I write programs that are write once, run anywhere, say against the POSIX API?

Like the argument behind the POSIX APIs, I write my programs to the POSIX API, then I。

can run them on my Mac, I can run them on Windows with the Windows Subsystem for Linux

installed, I can run it on a power PC with the POSIX API and so on。

But if I’ve got applications that are talking to each other。

how do they know what the representation is? The answer is they don't have to think about it; they just call one of the POSIX functions that handles that question。 All right。

So we need to decide on an on-wire endianness; in the case of the Internet, the Internet is big endian。 Why is the Internet big endian? Because it's big endian: it happens that the machines first being connected to the Internet were big endian machines, so people said, well, we're big endian。 Now why is this important, actually?

Because if I've got a little endian machine talking to a big endian machine, then I have to convert from the little endian representation to big endian, put it on the network, send it across, and then at the big endian machine it can be used as is。 If I have two little endian machines talking to each other, they're each going to convert: from little endian to big endian, and then from big endian back to little endian。 So that's going to impose overhead。 And so if you actually had, say, a high-performance cluster of all little endian machines, you probably wouldn't want to use this approach, because you'd constantly be converting from little endian to big endian to little endian even though everyone is little endian。 But on the Internet, you don't know whether the server you're talking to is little endian or big endian, and so that's why we need a standard format。 Okay,

so we have functions that will convert from the native endianness to the on-wire endianness。 They work from host to network, hton, and there are long versions and short versions, depending on whether they're handling 32-bit or 16-bit values; and then vice versa, when we get to the target, ntoh converts from the network format to the host format。 And again, if I have two little endian machines talking, each of these functions is actually going to do something; if I have two big endian machines talking, these functions are going to be no-ops, and we just copy the value through。 Okay。
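A quick illustration of these POSIX conversion functions:

```c
#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htonl, htons, ntohl, ntohs */

int main(void)
{
    uint32_t host32 = 0x12345678;
    uint16_t host16 = 0x1234;

    uint32_t wire32 = htonl(host32);  /* host to network, 32-bit ("long")  */
    uint16_t wire16 = htons(host16);  /* host to network, 16-bit ("short") */

    /* On a big endian machine these are no-ops; on a little endian
     * machine they swap bytes. The receiver undoes it with ntohl/ntohs. */
    printf("0x%08x -> 0x%08x -> 0x%08x\n", host32, wire32, ntohl(wire32));
    printf("0x%04x -> 0x%04x -> 0x%04x\n", host16, wire16, ntohs(wire16));
    return 0;
}
```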

What if we have more complicated objects, like the structure from homework zero and one that has a pointer to a string, a count of integers, and then a pointer to another struct? Well, we can just write things out as a sequence of words into a stream, and at the other end read those words in and tokenize them。 So we could represent the string over the socket the same way we would represent the string if we saved it to a file and read it back from the file。 Right。 And what if we write something like a pointer?

We don't want to write the actual pointer, because that pointer value depends on the virtual address space。 Even on the same machine, if we wrote pointers out to a file and then tried to read them into another process, they would be meaningless。 So we have to take pointers and turn them into something like an index。 Instead of pointing to somewhere in memory, this points to the second structure that I wrote; this points to the third structure; this points to the tenth structure that I haven't written yet, but am going to write。 All right。 So you make everything relative, so that the byte stream is completely self-contained, and when we receive it at the other end, we can rehydrate it back into a list or a tree of objects。
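A sketch of the index idea; the node type here is made up for illustration:

```c
#include <stdio.h>

/* A made-up linked node type, standing in for the homework structure. */
struct node {
    int value;
    struct node *next;
};

/* Serialize a list as index records: each node's next pointer becomes the
 * index of the node it points to (-1 for NULL). Everything is relative to
 * the byte stream, so the receiver can rehydrate the links anywhere. */
void serialize_list(FILE *out, const struct node *head)
{
    int idx = 0;
    for (const struct node *n = head; n; n = n->next, idx++) {
        int next_idx = n->next ? idx + 1 : -1;  /* an index, not an address */
        fprintf(out, "%d %d\n", n->value, next_idx);
    }
}
```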

Now there are lots of data serialization formats that people use to do this kind of stuff, like JSON or the extensible markup language, XML; and there are lots and lots of ad hoc functions too。


And in fact, there are tons and tons of serialized data formats。 I'm not going to read through the columns here, but some of them are human readable, some are standardized binary formats, and so on。

So when you're designing a protocol for two applications to talk, say you go work for a company that's doing web applications or mobile applications and you have to design a client-server protocol, you look at these choices of data serialization formats and pick one that meets the criteria of your application。 The human readable ones are always kind of appealing。 Look at JSON: it's very human readable and very verbose, which is nice if you're trying to debug web traffic, but not so great when you're dealing with limited mobile links or something like that。 And it's a very common format that people use。


Okay。 Yes。


Yeah。 So the question is: what about objects that are represented differently on different machines, say integers that are 32-bit versus 64-bit? In that case, those are actually two different data types: 32 bits would be an int, 64 bits would be a long。 An example where this really does come up is your string encoding: your strings might be ASCII, they might be in a mainframe encoding like EBCDIC, or they could be double-byte encoded, for representing some of the Asian languages。 That's where you have to worry: I think I'm speaking ASCII, and your mainframe is expecting its own encoding。 We have different character sets in use, and so we'd have to agree on what a character string represents。 Is it UTF-8 encoded? Is it ASCII encoded? Is it double-byte encoded?


The key thing is both sides have to agree on the representation。 It either has to be implicit in the type (everybody knows a short is 16 bits, but what's the byte ordering?) or agreed upon explicitly, as with strings: what is the encoding? Okay。

So if there are no other questions, I want to shift gears and talk about remote procedure call。 All right。 With remote procedure call, we try to move up from the level of messages。 We don't want users to have to worry about constructing these messages, picking formats and serialization and marshalling functions, and having to sit there writing scanfs or other kinds of parsing functions。 So in order to do this,

we need something that's going to wrap up all of that information at the source, including all the typing information, and at the destination receive all the information we need to then deserialize, to unmarshal, that data。 And the procedure call might have to wait for a response from the server, and at the server we have to wait for incoming messages。 And we want to deal with all of these message representation issues that we just talked about。

So the option here is remote procedure call: we move up a layer, a level of indirection, and hide everything behind the RPC, so I don't have to worry about all that complexity。 Making a call to a procedure on another machine should look almost the same as a call to an ordinary function on the same machine。

And we automate everything associated with it。 So for example, the client makes a simple call, a remote file system read on a file, and that automatically gets translated into a call on the server for a file system read of that file。 The bytes come back, and when this call returns, I'll have the contents of the file。 Right。 It's very different from before,

where we had to think about opening the connection, allocating a socket, encoding the data, and sending it off。 In this case, the caller doesn't have to worry about any of that。 Similarly, at the server, it looks like it's getting a local procedure call for the file read, and so on。

So the concept here is that the caller calls some function f with some arguments。 And there's a client stub: the caller does a local procedure call to this client stub, which bundles up all the arguments, serializes them, and sends them to the server stub。 The operating system delivers that message to the server stub, which deserializes the arguments and does a local procedure call to the server function with those arguments。 Then back come the response data, the return values: those get bundled up, serialized, and sent back to the client stub, which receives them and unmarshals them。 And then an ordinary procedure return happens to the caller。 Right。 Now,

these could be on the same machine, or they could be on completely different machines。

across the network。 All right。 So we could do this with pipes or we could do this with sockets。

So how do we go from 'I want to make a local procedure call' to something that actually happens remotely on another machine, with the results coming back to me?


That's the implementation question。 And there was a question from earlier: essentially, yes, a pointer is represented as a relative address; whatever your encoding structure is, it's going to have to take something that was absolute, like a pointer, and turn it into something relative to the byte stream you're sending back and forth。

Okay。 So under the covers, it's just request-response message passing。

All we have is message passing between the two machines。 The stub gives us the glue: on the client side it's responsible for serializing the arguments and sending them out, and then unmarshalling, deserializing, the return values。 On the server side, it's responsible for receiving the message containing the arguments, deserializing them, calling the local procedure, then serializing up the results and sending them back to the client。

So what marshalling involves is going to depend on the system, but it's going to involve converting the values from whatever the binary machine representation is into something that we can

send on the wire。 Some of the values passed in through that RPC might be just ordinary ordinal values, an integer or a short, and some might be a pointer, say a pointer to an integer。 If it's a pointer to an integer, the client stub is going to have to dereference that pointer, get the integer, and send the integer to the other side; the server stub allocates memory, puts the value in that memory, and then passes a pointer to it into the server function。

It potentially gets a lot more complicated if you have things like two arguments that are pointers to the same object: how do you handle that? Do you create two copies, or one copy, at the destination? So there are lots of devils in the details, but at a high level we have to worry about things that get passed by value versus things that get passed by reference。 Okay。
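As a hand-written sketch of what a client stub boils down to, for a hypothetical remote add(a, b): rpc_roundtrip() here is an assumed transport helper, not a real library call, and a stub generator would emit code shaped roughly like this:

```c
#include <stdint.h>
#include <stddef.h>
#include <arpa/inet.h>

/* Assumed transport: send the request message to the server's mailbox
 * and block until the reply message comes back. Placeholder only. */
void rpc_roundtrip(const void *req, size_t req_len,
                   void *reply, size_t reply_len);

/* Client stub for a hypothetical remote int32 add(int32 a, int32 b). */
int32_t add(int32_t a, int32_t b)
{
    /* Marshal: a procedure id plus the arguments, in network byte order. */
    uint32_t req[3] = { htonl(1 /* made-up proc id for "add" */),
                        htonl((uint32_t)a), htonl((uint32_t)b) };

    /* Request-response message passing under the covers. */
    uint32_t reply;
    rpc_roundtrip(req, sizeof req, &reply, sizeof reply);

    /* Unmarshal the result and return it like an ordinary procedure. */
    return (int32_t)ntohl(reply);
}
```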

So, some more details。 If we look at the equivalence between a remote procedure call and a local procedure call: the parameters are what goes into the request message, and the result is what comes back in the reply message。 The name of the procedure is passed in the request message and then used to invoke it at the server。 The return address, what would be the return address in a local procedure call, is effectively a return mailbox that we provide so the server stub knows where to send the reply; in this case, that would be the client stub。 All right。

So, as a user, you don't have to write this yourself: there is a stub generator, where the compiler simply writes the stubs for you from the definitions you provide。 Question? Yes, the question is:

does this mean that the server procedure can only work with call-by-value arguments, instead of call-by-reference arguments? It could be either。

It would be the job of the server stub, if arguments are passed by reference, to basically create those references: it allocates the memory, deserializes the arguments into that memory, and then passes pointers to it into the server procedure。 And again, it gets complicated if what you called on the client side contains references that point to the same object, because then I have to figure out how to recreate that at the server。

Okay, so how does the compiler generate the stubs? Well, the input could be some interface definitions, written in some interface definition language, or IDL。 That'll tell us what the types of the arguments are and what the types of the returns are。 In some environments, this is automatically inferred: if you have typed structures and things like that, then the compiler can automatically infer what the types of the arguments and returns are。

For a language like Java, that's really easy, since everything is strongly typed。 The output from this compiler is those stubs: the stub that runs at the client, and the stub that runs at the server。 At the client, the generated code does all the packing up of the arguments, sends the request out, waits for the response, and unpacks the response; and vice versa at the server: it waits for requests, unpacks the arguments, invokes the local server procedure, waits for it to return, and then packs up the response values and sends them back。 Okay,

some more details。 What happens if the client and server are different applications written in different languages? Well, just as before: we have to convert everything to some canonical form, determine what our byte order and the like are, and then tag every object with some indication of how it's encoded, so we can avoid unnecessary conversions。 If everything's big endian, the sending code is big endian and so is the other side, then we don't need to do any conversion; we take the data in the buffer as it is。

Now, how does the client know where to send its messages?

Well, we need some way of converting that remote service request into a remote machine address and a remote port。 Okay, and that's the process of binding。 It could be done statically: we decide at compile time that this IP address and port is where you'll find the remote file server。 Or it can be done dynamically, so that we can discover where that server is。 Doing it dynamically has benefits, because machines can fail or we can substitute applications, and we don't need to recompile our code if the server's location changes。 So most systems do dynamic binding。 And with dynamic binding, we need a name service。

And so most systems will do dynamic binding。 So with dynamic binding, we need a name service。

That name service will provide that dynamic translation between a service and where we。

find the data。 So maybe it’s, you know, we need to look up something with the light, light。

the directory, access protocol, or LDAP, and so when I request a lookup, that lookable query。

some name service, to find out where do I find the LDAP server or the eks department。

where the active directory, server and so on。 And then they’re finding again the advantages here are like failover。

something failed, you, can use failover transparently, the clients, the different server。

but it also can use the, access control。 It’s a way, you know, maybe if you’re not on campus。

we don’t tell you what the IP address, is for the LDAP server, where the active directory。

so that way, draw campus, you don’t, notice, you can’t attack it。 So it’s a sort of a problem。

It’s a little bit of a severity-bob spill。 And then there could also be multiple servers。

And so again, we get flexibility in terms of, you know, maybe we can do load balancing。

failover, and so on。 Yeah, question? Yes, the question is: is this connected with DNS? DNS is an example of dynamic binding, but lots of RPC packages provide their own kind of name service。 We could use DNS for some things: for a web server, doing RPC to a web server, I can use DNS to find it, and since I know it's on the well-known port, DNS is all I need。 But if I'm connecting to something like an Active Directory service, then it's a different protocol that I use to find the name mapping。
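For the DNS case, POSIX does give us a call for exactly this kind of dynamic lookup, getaddrinfo()。 A minimal sketch:

```c
#include <stdio.h>
#include <string.h>
#include <netdb.h>        /* getaddrinfo, gai_strerror */
#include <sys/socket.h>

/* Resolve a service's location at runtime (dynamic binding via DNS),
 * rather than hard-coding an IP address at compile time. */
int lookup(const char *host)
{
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof hints);
    hints.ai_family   = AF_UNSPEC;    /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */

    int err = getaddrinfo(host, "http", &hints, &res);  /* port 80 by name */
    if (err) {
        fprintf(stderr, "lookup failed: %s\n", gai_strerror(err));
        return -1;
    }
    /* ... create a socket and connect() to res->ai_addr here ... */
    freeaddrinfo(res);
    return 0;
}
```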

You can also have router-level redirection。 A lot of load balancers, application load balancers or web server load balancers, violate the end-to-end principle: they operate in the network, look at the traffic coming in, and redirect it to whichever web server is lightly loaded。 It gets complicated if you're trying to maintain sessions。 These things will actually look into the packets flowing through and say: oh, here's the cookie, I'm going to statefully bind this flow to a particular server。 That way, for example, if you're logging into a bank, you'll get back a cookie, and while you might be connecting to one of a thousand different servers, every time you open a new connection you'll get redirected to the same server, because the load balancer looks into the packet to see the cookie。 It kind of violates end-to-end, but we do it for performance, for load balancing, and things like that。

Now, if we have multiple clients, then we need to make sure that we pass a pointer to the client-specific return mailbox。 Again, we get this with sockets, because those are tied to the connection, and so we have the client’s port number。 So that tells us where to send results back。 Any questions about that?
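Here is a minimal server-side sketch of that idea in C: with connection-oriented sockets, accept() hands back the client’s address and port, which is exactly the return mailbox the reply should go to。 The listening port 5000 is an arbitrary placeholder。

```c
/* Minimal sketch: with connection-oriented sockets, the client's
 * return "mailbox" comes for free -- accept() reports the peer's
 * address and port. Port 5000 is an arbitrary placeholder. */
#include <stdio.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(5000);
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 8) < 0)
        return 1;

    struct sockaddr_in peer;
    socklen_t plen = sizeof(peer);
    /* accept() fills in the client-specific return address for us */
    int cfd = accept(lfd, (struct sockaddr *)&peer, &plen);
    if (cfd >= 0) {
        char ip[INET_ADDRSTRLEN];
        inet_ntop(AF_INET, &peer.sin_addr, ip, sizeof(ip));
        printf("reply mailbox: %s:%d\n", ip, ntohs(peer.sin_port));
        /* any write(cfd, ...) now goes back to exactly that client */
        close(cfd);
    }
    close(lfd);
    return 0;
}
```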

So, some issues: there are different failure modes with RPC than we might have on a single machine。 Because now, instead of it being just one machine that might fail, the client could fail, the server could fail, or they both could。 And so we have to look at what the different kinds of failures could be。 You might have a user-level bug in the server code that causes it to crash。 A system call we’re making on the client causes it to fault。 There might be a kernel bug, or, you know, the power supply fails, the client crashes or turns off, or the server does。 Or either of those machines gets compromised by some attacker。 Now, before we had RPC, when we were doing everything on the same machine, we shared fate。 The power supply goes out on my laptop, everything’s gone; the power comes back, everything comes back。 But if I’m doing RPC, now the server could crash without the client crashing, or the client could crash without the server crashing。 One is going to keep working while the other stops。 And so now we can see some inconsistent view of the world。 If I wrote some data to the server, did it get there before the server crashed, or not? Did the server do what I requested or not?

So how do we solve this? Well, we just saw how: we use distributed transactions, or some kind of Byzantine agreement protocol, to guarantee that even if one or the other crashes, we’re still able to eventually arrive at something sensible。 But now we’ve added some complexity, because this was not a behavior we had to think about in our applications before。 So while we have transparency, in that it looks like we’re just making a local procedure call, and we invoke the RPC just as we invoke a local procedure call, weird things can happen now: failure modes that don’t happen with a local procedure call。

Another issue is performance。 If you look at the wall clock, you’ll see that RPC is slow。 A local procedure call is a microsecond or less。 A remote procedure call on the same machine, over pipes, is more expensive because there’s a kernel crossing associated with IPC。 And if I have to go to a remote server, now instead of taking microseconds, you could be talking about tens or hundreds of milliseconds of latency。 Plus, beyond just speed-of-light latency, we also have to account for the fact that we have to marshal the data: we have to serialize it and deserialize it。 Say we have a gigabyte buffer, right? If we’re acting on it between threads in the same address space, there’s no problem: I can pass it by reference。 But if I want to pass it by reference to a server, I’ve got to copy that gigabyte to the server; I have to actually pass it by value。 So programmers have to know that you’re using RPC, and that RPC isn’t free。 Functionally, it is mostly a drop-in replacement for local procedure calls。 But performance-wise, there’s a significant cost difference。
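To make the marshalling step concrete, here is a hypothetical sketch of what an RPC stub does for even a tiny argument record: copy every field by value into a contiguous wire buffer in a defined byte order, then copy it back out on the other side。 The struct and field names are made up for illustration; real systems usually generate this code from an interface description。 For a gigabyte buffer, the same copying applies byte for byte。

```c
/* Hypothetical sketch of stub-level marshalling: arguments must be
 * serialized by value into a wire buffer; nothing can be passed by
 * reference across address spaces or machines. */
#include <stdint.h>
#include <string.h>
#include <assert.h>
#include <arpa/inet.h>   /* htonl/ntohl for a defined byte order */

struct read_args {        /* example RPC argument record */
    uint32_t file_id;
    uint32_t offset;
    uint32_t count;
};

/* Client stub: serialize into buf; returns bytes written. */
size_t marshal_read_args(const struct read_args *a, uint8_t *buf) {
    uint32_t f = htonl(a->file_id), o = htonl(a->offset), c = htonl(a->count);
    memcpy(buf + 0, &f, 4);
    memcpy(buf + 4, &o, 4);
    memcpy(buf + 8, &c, 4);
    return 12;
}

/* Server stub: deserialize on the other side. */
void unmarshal_read_args(const uint8_t *buf, struct read_args *a) {
    uint32_t f, o, c;
    memcpy(&f, buf + 0, 4);
    memcpy(&o, buf + 4, 4);
    memcpy(&c, buf + 8, 4);
    a->file_id = ntohl(f);
    a->offset  = ntohl(o);
    a->count   = ntohl(c);
}

int main(void) {
    struct read_args in = { .file_id = 7, .offset = 4096, .count = 512 };
    struct read_args out;
    uint8_t wire[12];
    marshal_read_args(&in, wire);      /* always a copy, never a pointer */
    unmarshal_read_args(wire, &out);
    assert(out.file_id == 7 && out.offset == 4096 && out.count == 512);
    return 0;
}
```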

Now, there are huge benefits that we get in being able to perform work remotely and have it look like it’s just local, which is the reason why we still use RPC。 Almost all the familiar distributed applications are built on top of RPC。 But it’s very important to recognize that you can’t just drop-in replace local procedure calls with remote procedure calls without really thinking through what the performance effects are going to be。 Now, we can do caching; we’re going to see how caching is used in distributed file systems, but we’re also going to see how it introduces all sorts of complexity and consistency issues。

All right。 So if we look at how address spaces communicate with each other on a local machine, it’s through shared memory or through the file system。 We can do mmap to get some shared memory and manage it with monitors。 Or we can write through the file system, the way the phases of a C compiler communicate with one another, or use pipes, a unidirectional mechanism for communicating。 Or, now, we have this ability to use remote procedure calls, where the callee, again, could be on the same machine or on a different machine。

This means we can run services wherever it makes sense, right? So we can put services where we want real reliability and availability and durability, you know, in some cloud data center, and access them remotely from clients。 For example, my laptop could be stolen, so it’s constantly doing RPC to synchronize its contents with a Google data center, storing my data in the cloud。 That way, when I lose my laptop, or it’s stolen, I can get a new laptop and just RPC all of my data back。 So it makes sense to keep data like that someplace secure。 And it looks the same whether I’m accessing data on my local machine or on some other machine。

Now, there are lots of RPC systems out there。 CORBA, the Common Object Request Broker Architecture, is a really old one that people used。 DCOM, the Distributed Component Object Model, is another, and it is commonly used in Windows environments。 For people using Java, there’s RMI, Java Remote Method Invocation, which, as it sounds, is a way of remotely invoking Java methods。 Error handling looks like exceptions being passed back along with the return values。 Okay,

any questions about RPC? All right, so we can take this kind of thing to the extreme, right? And that’s what we get when we look at things like microkernels。 Up until now, when we’ve looked at how we build kernels, we’ve thought of them as something monolithic: in the kernel are the file systems, memory management, windowing, networking, thread support, and so on。 But a radical way to structure an operating system would be to break it up into components。 Take the file system and make it an application-level server; take the windowing system and make it an application-level server。 Now, the file system is still going to look the same on the same machine, right? But we’re going to do everything through RPC, instead of through shared memory

in the kernel。 Now, question: why would I want to do this? Why would I take my monolithic operating system, with all of my state and everything in the kernel, and break it up into these microkernel components? What benefit might I get? Yeah, so that’s a very good example: we’re reducing the size of the kernel。 So I’m reducing my trusted code base size。 If I’m trying to validate or verify that my kernel is correct and tamper-proof and trustworthy, I’ve now removed a lot of the code。 And think about how much code is in a file system, or how much code is in a window management system; I’ve taken all of that out。 In the extreme, I can squeeze almost everything out: virtual memory management and paging doesn’t need to be in the kernel, and can become an application-level process。 All right。 And so one of the other reasons is fault isolation。

File systems are incredibly complicated, and they have to deal with disks whose on-disk structures can get damaged。 And if you don’t have a proper sanity check, you might accidentally dereference something from a structure created by reading the file system’s metadata, and cause a segmentation fault。 On a monolithic kernel, you get the "oh, your Mac has crashed" message, or you get a blue screen or black screen in Windows, right? In a microkernel, it’s like: oops, the file system crashed, restart it, right? And then it just rebuilds all of its data structures, the same as if it had just booted up。 Same thing with the window system: it can crash and just come right back up, right? And rebuild all of its state。

So it gives you real power in terms of being able to firewall between faults。 It also, of course, gives you modularity, right? Look at a typical kernel: it’s spaghetti code。 People do things to optimize for all sorts of cases。 And so making changes to a file system that’s an integral part of the operating system is hard, and you really have the same thing with the window subsystem, the same thing with the paging system, right? Everything is dependent upon everything else。 And there’s so much shared state that it’s really hard to understand sometimes what the APIs are, or when the APIs have changed。 If you have the components in separate processes communicating via RPC, the API is very concrete。 You can make a lot of incremental or even major changes。 So I can completely swap out one file system implementation for another, a ground-up rewrite of my file system, as long as it supports the same remote procedure calls。 I can also now take components and put them somewhere else, right? So I can have my window manager run on a different machine, my frame buffer on a different machine, while the applications that are writing to the window subsystem exist on another machine。 So that’s really powerful from a distributed computing standpoint。

Now, there is a caveat here, and we’ve seen it over time: operating systems like Windows went to an almost completely microkernel design, where everything that could be pushed out of the kernel was。 And then, in the next version, they pulled a bunch of components back in。 Why? With all these benefits around trust, around fault isolation, around modularity and development velocity, why would you then want to undo all of that and put big chunks of code back into the monolithic kernel? In the back? Exactly: increased overhead。 Right? Remember, a remote procedure call, even on the same machine, is expensive。 There are ways to make it faster using shared memory and things like that, but there’s still a cost。 And what they found was that even though they got all the benefits around trust, around fault isolation, around modularity, it came at the cost of performance。 And so they then looked at where the trade-off is better as a separate process, versus where the trade-off is better if we put it back in the kernel, accepting that when it crashes, it brings down the whole system。 Okay, question? Yes。 So: why can’t I have two versions, kind of the monolithic and the microkernel, where, like, a file system lives both in the kernel and outside of the kernel? So that’s actually kind of, sort of, something you see today。 The virtual file system stuff lives in the kernel, and then file system implementations can live as modules, potentially outside the kernel。 But the trade-off, again, is going to be performance and consistency, right? If I just keep two copies of the same file system, then I have to keep all of the buffers and the open file descriptor tables and so on consistent, and so it becomes messy。 So typically I’ll choose one or the other: either I put it in the kernel, and risk it crashing and burning the kernel down, or I put it outside, in which case I pay some performance cost。 So you’ll see environments, and we’ll see one in a moment, where we have both。


Okay。 So now let’s look at network-attached storage。 With network-attached storage, we have lots of hosts。 So here are all these hosts, and here are these servers that have storage, and these servers are synchronizing the storage, so that a host can read from any server。 Now, my colleague, Professor Brewer, has a theorem about this: the CAP theorem。 Between these servers, we want consistency。 So changes that are coming in from, say, this client to this server, and changes that are coming in from that client to that server, since they’re all touching the same data, just replicated across the storage, should have some serial ordering, and everyone sees that same serial order: a occurred before b, which occurred before c。 You also want availability, so anybody here can get a result, right? If I want to read some object that’s stored in this replicated system, I can read it and get the value。 You also want partition tolerance。 So if I slice a line through the network, and this machine here down in the lower right is disconnected, I still want the system to continue to be able to work: these hosts here are connected to this server, and I want them to continue to be able to read and write。 So what the CAP theorem says is: if you want consistency, and you want availability, and you want partition tolerance, you can’t have all three。 So, for example,

you want availability, you want partition tolerance, you can’t have all great。 So for example。

if you want consistency, so everything appears in the same serial order。

So this host is writing in the upper left and this host in the lower right is upper left。

one is writing here, the root, this server down here, we’re writing this server, replicating。

the same modifying the same value。 I want them to see a, b, c, b, e as your, you know。

write your goal。 And I want them to see that same order。 So one doesn’t see a, c, b。

and the other sees a, b, c。 They both see a, b, c。 In fact。

everybody reads those same objects or same object sees the right a, then the right, c。

And then availability, I want everybody to be able to be reading and writing at the, same time。

Well, if I partition my networks and now this server is disconnected and these two hosts。

are writing to the server, then now I can’t have, I can’t have consistency because if these。

servers are writing that same object, they’re writing D E F, well, these machines over here。

aren’t going to see it。 Right。 So I can’t have consistency availability and then also tolerate position。

And the same thing, if I want availability and the ability to talk, how are partitioning。

then you could, everybody could read, but I can’t allow rights because then I can’t have。

consistency。 I don’t really have availability。 And so you do the mental exercise。

but if you try to see, you’ll see, you say, I want, consistency and partition tolerance。

but I can’t have availability。 If I want availability and consistency, I can’t have the control。

So this is also known as Brewer’s Theorem; it was proposed by Eric Brewer。 Yes? But couldn’t you dynamically switch? Well, if there’s a partition, I may not be able to see servers or hosts on the other side of that network partition, right? So imagine I cut the US in half: the East Coast machines wouldn’t be able to see the West Coast machines, but they might be able to see some of the central machines, and vice versa; from the West Coast, I can’t see the East Coast, but I could potentially see some of the ones in the middle。 So this is really fundamental, right?

Because this means, when we’re thinking about distributed systems and how we’re designing them, we have to recognize that we can’t have all three of these properties, even though, you know, if I go to build, say, a distributed financial system, I want all three of these properties。 I can’t have them。 Okay。 So, some administrative stuff。

Midterm three is coming up on Thursday, from seven to nine PM; it’s going to cover all the course material, with a focus on the material since the last midterm。 There was a review session yesterday, and we saved that video for you。 Okay。 So, distributed file systems。

Here we have a client that wants to read a file from the server。 We want transparent access to files that are stored remotely。 The way we can do this is we can mount remote file systems into our local file system。 And it’s transparent: it looks like I’m accessing a local file, but I’m really accessing a file that’s on another server。 We can do this by specifying host and port, or we can have some binding service that maps names automatically to a particular machine, or we can do it with globally unique names, or


all of the above。 What enables this, in Unix and POSIX-like systems, is the virtual file system。 So, the virtual file system: just think of it as a layer of indirection over file systems。


At the user level, we do a read; that gets translated into a call to the C library read function; that makes a system call, which traps into the kernel and then dispatches to the handler。 Instead of dispatching directly to the file system of interest, we’re going to dispatch to the virtual file system handler。 The virtual file system handler, again, you can think of as just a layer of indirection。 It provides the same kinds of functions that a real file system would provide: superblocks, inodes, files, directories, and so on。 And the virtual file system API was designed to be consistent with what underlying file systems can offer。 Okay。 So, four primary objects: the superblock, inodes, directory entries, and file objects。 I don’t have time to go into detail, but basically those map onto what you find in a typical file system。 Not every file system will have the same analogues, and so sometimes there has to be a little translation; the FAT file system, for example, doesn’t have a superblock。
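As a rough illustration of that layer of indirection, here is a simplified sketch in C, with made-up names rather than the actual kernel definitions: the VFS dispatches through a per-file-system table of function pointers, so any file system, local or remote, that fills in the table plugs in under the same read path。

```c
/* Simplified sketch of VFS-style indirection (not the real kernel API):
 * each file system registers a table of operations, and the generic
 * layer dispatches through it without knowing what is underneath. */
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

struct vfs_inode;  /* per-file-system inode state, opaque to the VFS */

struct vfs_file_ops {
    ssize_t (*read)(struct vfs_inode *ino, void *buf,
                    size_t count, off_t offset);
    ssize_t (*write)(struct vfs_inode *ino, const void *buf,
                     size_t count, off_t offset);
};

struct vfs_inode {
    const struct vfs_file_ops *ops;  /* set at mount/open time */
    void *fs_private;                /* ext2, FAT, NFS client state... */
};

/* The generic read path: same code whether the data is local or remote. */
ssize_t vfs_read(struct vfs_inode *ino, void *buf,
                 size_t count, off_t offset) {
    return ino->ops->read(ino, buf, count, offset);
}

/* A trivial in-memory "file system" that plugs into the table. */
static ssize_t ram_read(struct vfs_inode *ino, void *buf,
                        size_t count, off_t offset) {
    const char *data = ino->fs_private;
    size_t len = strlen(data);
    if ((size_t)offset >= len) return 0;
    if ((size_t)offset + count > len) count = len - (size_t)offset;
    memcpy(buf, data + offset, count);
    return (ssize_t)count;
}

static const struct vfs_file_ops ram_ops = { .read = ram_read,
                                             .write = NULL };
static char demo_data[] = "plugged in!";

int main(void) {
    struct vfs_inode ino = { .ops = &ram_ops, .fs_private = demo_data };
    char buf[32];
    ssize_t n = vfs_read(&ino, buf, sizeof(buf), 0);  /* generic path */
    printf("%.*s\n", (int)n, buf);
    return 0;
}
```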

Okay。 So, a simple distributed file system: clients issue remote procedure calls to open a file and to read the contents of the file。 So we simply translate those disk calls, read, open, lseek, write, flush, and everything, into remote procedure calls。 No caching。 The advantage here is that the server gives us a consistent view, right? Because every request goes from my laptop all the way to the server and then comes back, even when multiple clients are doing the same thing; the canonical copy is at the server。 But of course there are performance problems here, right? Because every request I’m making goes across the network, I’m incurring latency。 I’m going to be limited by the bandwidth of the network。 I’m going to be limited by how many requests per second, I/O operations per second, the server can do。 So the server is going to be a bottleneck。


So we can add caches, right? I’ll put a cache at the client and at the server。 So now some operations can be done locally, and that reduces the load on the server。 So now when I do a read, say I read the value of F1, I’ll cache it and return the value。 Multiple reads are then all served out of the local cache at local performance, right? Each one still looks like an RPC, but it’s all local to the virtual file system。 The disadvantage is: what happens if I do a write on a machine, and then crash and lose the data? Right? So maybe I use write-through as an alternative; in that case, if I crash after I wrote it and I got the acknowledgment, I know it’s okay。 Another problem is cache consistency, right? When this first host reads, it’s going to get the old value, not the new one: it’s getting V1 instead of V2, right? So the caches are inconsistent。


So what happens if the server crashes? Right? Does the client wait until it comes back up? Does it continue operating out of its cache? If there were changes in the server’s cache, were those in nonvolatile RAM, or had they not been committed? Do they get lost? Right? What if there’s shared state across the remote procedure calls? Typically in Unix, right, I open a file and then I seek through that file; there’s a file position, and we keep track of that file position in the kernel。 Well, if the server crashes, I’ve lost that seek position。 So if I’m doing read, read, read, read from a client, and the server crashes and comes back up, and the client tries to do another read, we’re going to read from the beginning of the file。 We need to think about that。 What happens if we delete a file at the server, and the server crashes before we get an acknowledgement?

So what we really want is a stateless protocol: one in which everything that we need for a request is in that request。 We also want operations to be idempotent, so I can repeat them any number of times and I get the same result。 If I write 100 to a given memory location, I can do that 10 times, I can do that 100 times; the end result is always going to be 100 in that memory location。 All right。 So in that case, when a client times out, it can just retry the operation, and because the operations are idempotent, the result is going to be the same。
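Here is a minimal sketch of that retry logic in C, with hypothetical names throughout: because the request carries everything it needs (object id, offset, data) and the operation is idempotent, the client can simply resend on timeout without fear of corrupting anything。

```c
/* Minimal sketch with hypothetical helpers: retrying an idempotent,
 * self-contained request is safe, since executing it twice leaves
 * the same state as executing it once. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct write_req {
    uint64_t object_id;  /* which object: no server-side "open" state */
    uint64_t offset;     /* absolute position, carried in every request */
    uint32_t len;
    uint8_t  data[512];
};

/* Stub transport for illustration: pretend the first attempt times out.
 * A real stub would marshal the request and wait on the network. */
static bool rpc_send_write(const struct write_req *req) {
    (void)req;
    static int calls = 0;
    return ++calls > 1;          /* time out once, then succeed */
}

bool write_with_retry(const struct write_req *req, int max_tries) {
    for (int i = 0; i < max_tries; i++) {
        if (rpc_send_write(req))
            return true;         /* acknowledged by the server */
        /* Timeout: maybe the request was lost, maybe only the ack was.
         * Either way, re-executing this write is harmless. */
    }
    return false;                /* give up and report an error */
}

int main(void) {
    struct write_req req = { .object_id = 7, .offset = 0, .len = 3,
                             .data = {'a', 'b', 'c'} };
    printf(write_with_retry(&req, 5) ? "ok\n" : "failed\n");
    return 0;
}
```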

There are other examples of stateless protocols, like HTTP, where we encode our session state into a cookie。 All right。 So I want to talk about two file systems really quickly。 First, NFS。 There are three layers to NFS。 First, the Unix file system interface: this is the standard libc operations, open, close, read, seek, and so on。 Then the VFS layer: that’s the layer of indirection that tells us which file system type we’re going to。 And then there’s the NFS service, which implements the protocol。 For the RPC encoding method, it uses the XDR representation, and there’s a whole library for doing that。 And it implements all of the functions that we need: being able to read and write directories, manipulate links, delete files, open files, close files, write, and so on。

NFS uses write-through caching。 So when clients do a write, that gets written all the way back to the server, and we wait for the acknowledgment。 So we lose some of the benefits of caching, but we know that when we do a write in NFS, it goes to the server。 All right。 So now, if we’re doing caching, we need some way of figuring out what’s going on in the cache, and we’ll come back to that in just a second。 The servers themselves are stateless。 So every request to the server contains, indeed, everything we need。 That means it has to contain a position; it can’t just simply say "read something from an open file"。 And in fact, we don’t have open and close, because there’s no state maintained at the server。 The request comes in, and it’s just: read from this inode, this i-number rather, at this location, this number of bytes。 It’s idempotent, so we can perform the same request multiple times。
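Here is an illustrative sketch of such a self-describing read request in C。 The field names and the toy in-memory "file" are made up for illustration, not the actual XDR-defined NFS wire format: the point is that the file, the position, and the length all travel in the request, so the server keeps no per-client state and repeating the request is harmless。

```c
/* Illustrative sketch (not the real NFS wire format): a stateless read
 * names the file, the position, and the length explicitly, so the
 * server keeps no open-file or seek-pointer state between requests. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct read_req {
    uint64_t inumber;  /* file handle: which file */
    uint64_t offset;   /* explicit position; no server-side seek pointer */
    uint32_t count;    /* bytes requested */
};

/* Toy server state: a single in-memory "file" for illustration. */
static const char file_data[] = "hello, stateless world";

/* Each request is handled from scratch: no open, no close, no per-client
 * state. Repeating a request returns exactly the same bytes. */
static uint32_t handle_read(const struct read_req *rq, char *out) {
    uint64_t size = sizeof(file_data) - 1;
    if (rq->offset >= size) return 0;              /* read past EOF */
    uint32_t n = rq->count;
    if (rq->offset + n > size) n = (uint32_t)(size - rq->offset);
    memcpy(out, file_data + rq->offset, n);
    return n;
}

int main(void) {
    struct read_req rq = { .inumber = 42, .offset = 7, .count = 9 };
    char buf[64];
    /* sending the identical request twice returns identical bytes */
    uint32_t n = handle_read(&rq, buf);
    printf("first: %.*s\n", (int)n, buf);
    n = handle_read(&rq, buf);
    printf("retry: %.*s\n", (int)n, buf);
    return 0;
}
```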

So reading and writing, that’s pretty simple。 Deleting a file we can actually do multiple times, too: you can say "rm" multiple times, and if the file doesn’t exist, the server is just going to say, "Hey, by the way, that file doesn’t exist。" That’s okay。 Failures are where NFS has a problem。 There are two options。

The server crashes: what does the client do? One option is that the client just waits。 But what if it takes a week for the server to come back up, because we have to get a part? The client is going to wait for the week。 The other alternative is that it returns an error。 Well, what if the client doesn’t know about errors, because the client was written before NFS existed, and so it expects to write a file or read a file and just get back the data? So that can be a problem。 That’s why they allow you to have the blocking option: for old clients that don’t know how to deal with errors, you have them block。 But modern clients get the error, and then they figure out how to convey that to the user。 Okay。

So here’s the architecture。 Again, we have this VFS layer: you make system calls, they go through the VFS layer to the NFS client, and they get RPCed over to the server。 Then the server calls through its own VFS interface to the actual file system, so I can have multiple different types of file systems at my NFS server。 Consistency: NFS provides weak consistency。 Clients poll every 3 to 30 seconds, asking the server whether a file has changed。 If the answer is no, the client keeps using its cached copy; if the answer is yes, the client switches to the new version。 So here the client asks, and V1 is still okay; later it asks again, and no, it’s now V2, and so it switches。
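Here is a sketch of that poll-based weak consistency in C, with hypothetical helper names: every few seconds the client asks the server for the file’s attributes, and if the modification time has advanced, it throws away its cached copy and re-fetches on the next read。

```c
/* Sketch of NFS-style weak consistency with hypothetical helpers: poll
 * the server every few seconds; if the file's modification time moved,
 * invalidate the locally cached copy. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define POLL_SECONDS 3   /* NFS clients poll roughly every 3-30 seconds */

/* Stub GETATTR RPC for illustration: the file "changes" before the
 * second poll. A real client would ask the server over the network. */
static uint64_t rpc_getattr_mtime(uint64_t inumber) {
    (void)inumber;
    static int calls = 0;
    return (++calls >= 2) ? 200 : 100;
}

struct cached_file {
    uint64_t inumber;
    uint64_t cached_mtime;  /* server mtime when we last fetched data */
    bool     valid;
};

/* One polling step: returns true if the cached copy is still usable. */
static bool revalidate(struct cached_file *cf) {
    uint64_t server_mtime = rpc_getattr_mtime(cf->inumber);
    if (server_mtime != cf->cached_mtime) {
        cf->valid = false;             /* someone wrote: drop the cache */
        cf->cached_mtime = server_mtime;
    }
    return cf->valid;
}

int main(void) {
    struct cached_file cf = { .inumber = 42, .cached_mtime = 100,
                              .valid = true };
    for (int i = 0; i < 3; i++) {
        sleep(POLL_SECONDS);
        if (!revalidate(&cf)) {
            printf("file changed: re-fetch from server\n");
            cf.valid = true;           /* pretend we re-fetched the data */
        }
    }
    return 0;
}
```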

If multiple clients are writing to the same file, NFS has a problem, because the writes are independent; there’s no notion of any kind of locking or anything like that。 And so, when 162 used to operate off of a common file server, project groups sometimes would use the same directory for everyone。 So you can imagine, on the night the project is due, everybody saving their C files, and files ending up with blocks interleaved from different clients’ writes。 So we told them to use separate directories。 Okay。

So if we think about the kind of ordering that we might want, we want some kind of sequential ordering constraint。 So what we’re going to say is: if you started a write and finished that write before a read starts on another client, the reader gets the new value。 If you start the read while the write is in progress, you could get either the old value or the new value; that’s what would happen on a local machine too, where you might get either the old or the new depending on the order。 So for NFS: if you start reading more than 30 seconds after the write, you’re going to get the new version; if you start before that, you’ll get maybe the old version, maybe the new version, maybe some partial version。 It’s ill-defined。 So NFS is super simple and highly portable。

The disadvantage is that it can sometimes be inconsistent, right, because of this whole polling business, and you have to keep checking。 It’s like your little brother or little sister in the car asking, "Are we there yet? Are we there yet?" It gets really annoying really quickly。 And if you have a lot of siblings, it overloads the server: the polling traffic can be a problem。 All right。 So I’m going to skip over AFS and go to the summary。

So TCP gives us a reliable byte stream between two processes。 We’ve seen how we can use a window-based protocol for acknowledgments, and how we dynamically adapt to congestion。 RPC lets us make remote procedure calls that look almost exactly like local procedure calls, with lots of issues under the covers about marshalling and unmarshalling。 And distributed file systems give us transparent access to files stored on other machines。 We can use caching for performance, and we enable all of this with a virtual file system layer that gives us a level of indirection, which allows us to have a pluggable file system。 This is why modern operating systems are able to support many different file systems simultaneously, and you don’t have to change your application every time someone adds a new file system。 And then, to keep caches consistent, NFS uses polling。 I didn’t have a chance to talk about Andrew, but the Andrew File System uses callbacks instead。

And so, with that, we’ve covered lots of topics in this class。 I hope you have enjoyed this class。 I hope you’ve enjoyed the homeworks and the projects, and that they haven’t been too challenging for you。 And I want to wish everybody good luck on midterm number three, and say thank you for taking 162。 Thanks, and have a great summer。 Bye。
