System Design Handbook
System Design Basics
1) Try to break the problem into simpler modules (top-down approach).
2) Talk about the trade-offs (no solution is perfect).
   Calculate the impact on the system based on all the constraints and the end test cases.
Focus on the interviewer's intentions.
Ask abstract questions about the problem (constraints & functionality requirements).
Look for bottlenecks.
System Design Basics (contd.)
1) Architectural pieces / resources available
2) How these resources work together
3) Utilization & trade-offs
- Consistent hashing
- CAP theorem
- Load balancing
- Queues
- Caching
- Replication
- SQL vs NoSQL
- Indexes
- Proxies
- Data partitioning
Load Balancing
(Distributed system)
Types of distribution:
- Random
- Round-robin
- Random (with weights for memory & CPU cycles)
To utilize full scalability & redundancy, add 3 LBs:
1) User ↔ web server
2) Web server ↔ app server / cache server (internal platform)
3) Internal platform ↔ DB
(diagram: Client → LB → web server → LB → app / cache server → LB → DB)
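The three types of distribution listed above can be sketched in a few lines of Python. This is only an illustration: the host names and the weights are made up, and a real load balancer would derive weights from measured memory and CPU headroom.

```python
import itertools
import random

servers = ["web-1", "web-2", "web-3"]  # hypothetical host names


def pick_random(pool, rng=random):
    """Random: every host is equally likely."""
    return rng.choice(pool)


def round_robin(pool):
    """Round-robin: cycle through the hosts in a fixed order."""
    return itertools.cycle(pool)


def pick_weighted(weights, rng=random):
    """Weighted random: hosts with a larger weight receive more traffic."""
    hosts = list(weights)
    return rng.choices(hosts, weights=[weights[h] for h in hosts], k=1)[0]


rr = round_robin(servers)
first_four = [next(rr) for _ in range(4)]  # wraps back to the first host
```

Round-robin is deterministic, which makes it easy to reason about; the weighted variant is the one that can account for uneven server capacity.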
Smart clients
Takes a pool of service hosts & balances load across them.
→ detects hosts that are not responsive
→ detects recovered hosts
→ handles addition of new hosts
Adds load balancing functionality to the DB / cache / service client.
* Attractive solution for developers (small-scale systems)
As the system grows → LBs (standalone servers)
Hardware load balancers:
Expensive but high performance.
e.g. Citrix NetScaler
Not trivial to configure.
Large companies tend to avoid this configuration,
or use it as the 1st point of contact to their system to serve user requests,
& the intra-network uses smart clients / a hybrid solution
(see Software Load Balancers) for load balancing traffic.
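A smart client, as described above, can be sketched as a small class that tracks which hosts are down and skips them. The class and method names are hypothetical, and the health checking itself (how a host is detected as unresponsive) is left out.

```python
class SmartClient:
    """Client-side balancer over a pool of service hosts (illustrative only)."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.down = set()
        self._next = 0

    def add_host(self, host):
        # Addition of a new host to the pool.
        self.hosts.append(host)

    def mark_down(self, host):
        # Called when a host is detected as not responsive.
        self.down.add(host)

    def mark_recovered(self, host):
        # Called when a previously failed host responds again.
        self.down.discard(host)

    def pick(self):
        """Return the next responsive host in round-robin order."""
        for _ in range(len(self.hosts)):
            host = self.hosts[self._next % len(self.hosts)]
            self._next += 1
            if host not in self.down:
                return host
        raise RuntimeError("no responsive hosts")
```

The pool logic lives inside every client, which is why the approach stops being attractive once many services need it.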
Software Load Balancers
- No pain of creating a smart client
- No cost of purchasing dedicated hardware
→ hybrid approach
HAProxy: open-source software (OSS) load balancer
1) Running on the client machine (locally bound port)
   e.g. localhost:9000
   The port is managed by HAProxy
   (with efficient management of requests on the port).
2) Running on an intermediate server: proxies
   running between different server-side components.
HAProxy:
- Manages health checks
- Removal & addition of machines
- Balances requests across pools
World of Databases
SQL vs. NoSQL
SQL                             NoSQL
1) Structured                   1) Unstructured
2) Predefined schema            2) Distributed
3) Data in rows & columns       3) Dynamic schema
   Row: one entity's info
   Column: separate data points
SQL DBs: MySQL, Oracle, Postgres, MariaDB
Reasons to use SQL DB
1) You need to ensure ACID compliance:
   ACID compliance
   - reduces anomalies
   - protects the integrity of the database
   For many e-commerce & financial applications
   → an ACID-compliant DB is the first choice.
2) Your data is structured & unchanging.
   If your business is not experiencing rapid growth or sudden changes
   → no requirement for more servers
   → data is consistent
   then there's no reason to use a system designed
   to support a variety of data & high traffic.
Reasons to use NoSQL DB
When all other components of the system are fast
→ querying & searching for data becomes the bottleneck.
NoSQL prevents data from being the bottleneck.
Big data: a large success for NoSQL.
1) To store large volumes of data (little / no structure)
   No limit on the type of data.
   Document DB: stores all data in one place
   (no need to define the type of data)
2) Using cloud & storage to the fullest.
   Excellent cost-saving solution.
   (Easy spread of data across multiple servers to scale up)
   OR commodity h/w on site (affordable, smaller)
   No headache of additional s/w
   & NoSQL DBs like Cassandra are designed to scale
   across multiple data centers out of the box.
3) Useful for rapid / agile development.
   If you're making quick iterations on the schema,
   SQL will slow you down.
CAP Theorem
Consistency: all nodes see the same data at the same time.
  Achieved by updating several nodes before allowing reads.
Availability: every request gets a response (success / failure).
  Achieved by replicating data across different servers.
Partition tolerance: the system continues to work despite message loss / partial failure.
  (Can sustain any amount of network failure without
  resulting in failure of the entire network.)
  Data is sufficiently replicated across combinations of nodes /
  networks to keep the system up.
Availability + partition tolerance: [Cassandra, CouchDB]
We cannot build a datastore which is:
1) continually available
2) sequentially consistent
3) partition-failure tolerant
Because:
To be consistent, all nodes should see the same
set of updates in the same order.
But if the network suffers a partition,
an update in one partition might not make it to
the other partitions
↳ a client reads data from an out-of-date partition
  after having read from an up-to-date partition.
Solution: stop serving requests from the out-of-date partition
↳ the service is no longer 100% available.
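The argument above can be acted out with a toy two-replica store. This is a sketch under strong assumptions: the `Store` and `Replica` classes and the "CP" / "AP" mode labels are invented for the example, and the store compares versions with global knowledge that a real partitioned node would not have.

```python
class Replica:
    def __init__(self):
        self.value = None
        self.version = 0


class Store:
    """Two replicas; mode is "CP" or "AP" (labels chosen for this sketch)."""

    def __init__(self, mode):
        self.mode = mode
        self.replicas = [Replica(), Replica()]
        self.partitioned = False

    def write(self, value):
        # A write always reaches replica 0; it reaches replica 1 only
        # while the network is not partitioned.
        targets = self.replicas[:1] if self.partitioned else self.replicas
        for r in targets:
            r.value, r.version = value, r.version + 1

    def read(self, replica_index):
        r = self.replicas[replica_index]
        # Simplification: the sketch can see every replica's version.
        latest = max(x.version for x in self.replicas)
        if r.version < latest and self.mode == "CP":
            # Consistency is kept by giving up availability.
            raise RuntimeError("replica is out of date; refusing to serve")
        return r.value  # in AP mode this may be stale
```

In "AP" mode the stale replica keeps answering with old data; in "CP" mode it refuses, which is exactly the loss of 100% availability described above.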
Redundancy & Replication
Duplication of critical data & services
↳ increases reliability of the system.
For critical services & data, ensure that multiple
copies / versions are running simultaneously on different
servers / databases.
Secures against single-node failures.
Provides backups if needed in a crisis.
# Data:
  Primary server (active data) → replication → Secondary server (mirrored data)
# Service redundancy: shared-nothing architecture.
  Every node is independent. No central service managing state.
  → More resilient to failures
  → New servers can be added without special conditions
  → Helps in scalability
Caching
Load balancing: scales horizontally.
Caching: locality of reference principle.
- Used in almost every layer of computing.
- Application server cache:
  Placing a cache directly on a request layer node
  ↳ local storage of response data
  (diagram: request → cache; on a hit the cached response is returned, on a miss the data is fetched and stored)
# The cache on one request layer node can be located in:
  - Memory (very fast)
  - Node's local disk (faster than going to network storage)
# Bottleneck:
  If the LB distributes requests randomly
  ↳ the same request goes to different nodes → more cache misses
  Overcome by:
  1) Global caches
  2) Distributed caches
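The bottleneck described above is easy to reproduce with a minimal per-node cache. In this sketch `NodeCache`, the `origin` function and the key name are all hypothetical; the two nodes stand in for request layer nodes behind a load balancer that alternates between them.

```python
class NodeCache:
    """In-memory cache on one request layer node (illustrative only)."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # hypothetical callable: key -> value
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Miss: go to the origin and keep a local copy of the response.
        self.misses += 1
        value = self._fetch(key)
        self._store[key] = value
        return value


origin_calls = []


def origin(key):
    origin_calls.append(key)
    return key.upper()


node_a, node_b = NodeCache(origin), NodeCache(origin)
# The load balancer sends the same request to alternating nodes.
for node in (node_a, node_b, node_a, node_b):
    node.get("profile:42")
```

Each node misses once for the same key, so the origin is read twice; a global or distributed cache would have served the second node from cache.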
Distributed cache
- The cache is divided across the nodes using a consistent hashing function.
# Easy to increase cache space by adding more nodes.
# Disadvantage:
  Resolving a missing node
  ↳ can be handled by storing multiple copies of data on different nodes
  ↳ makes it more complicated.
# Even if a node disappears,
  requests can pull data from the origin.
Global cache
# Single cache space for all the nodes.
  ↳ Adding a cache server / file store (faster than the original store)
# Difficult to manage if the number of clients / requests increases.
  Effective if:
  1) there is a fixed dataset that needs to be cached
  2) special h/w provides fast I/O
# Forms of global cache:
  1) On a miss, the cache itself fetches the data from the database.
  2) On a miss, the request node fetches the data from the database.
     Useful when the app logic understands the eviction strategy / hot spots
     better than the cache.
CDN: Content Distribution Network
↳ Cache store for sites that serve large amounts of static media.
Request → CDN
  - if available → served from the CDN's local storage (static media)
  - if not available → fetched from the back-end server
If the site isn't large enough to have its own CDN:
↳ for an easier future transition,
  serve static media from a separate subdomain
  (static.yourservice.com)
  using a lightweight Nginx server
↳ cut over the DNS from your servers
  to a CDN later
Cache Invalidation
# Cached data needs to be coherent with the database.
  If data in the DB is modified, invalidate the cached data.
# 3 schemes:
1) Write-through cache:
   Data is written to the cache & the DB at the same time.
   + Complete data consistency (cache = DB)
   + Fault tolerance in case of failure (less data loss)
   - High latency in writes (2 write operations)
2) Write-around cache:
   Data is written directly to the DB, bypassing the cache.
   + No cache flooding for writes
   - A read request for newly written data → cache miss → higher latency
3) Write-back cache:
   Data is written to the cache only, and the client is acknowledged.
   After some interval or under some specified conditions,
   data is written to the DB from the cache.
   + Low latency & high throughput for write-intensive applications
   - Higher risk of data loss (the only copy is in the cache)
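The three schemes differ only in where a write lands first. The sketch below models the cache and the DB as plain dictionaries; the class names are illustrative, and dropping a stale cached copy in the write-around case is an assumption of this sketch rather than something the notes state.

```python
class WriteThrough:
    def __init__(self):
        self.cache, self.db = {}, {}

    def write(self, key, value):
        self.cache[key] = value    # two writes per operation
        self.db[key] = value


class WriteAround:
    def __init__(self):
        self.cache, self.db = {}, {}

    def write(self, key, value):
        self.db[key] = value       # the cache is bypassed
        self.cache.pop(key, None)  # assumption: drop any stale cached copy


class WriteBack:
    def __init__(self):
        self.cache, self.db = {}, {}
        self.dirty = set()

    def write(self, key, value):
        self.cache[key] = value    # acknowledged after the cache write only
        self.dirty.add(key)

    def flush(self):
        """Run after some interval or under some specified condition."""
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()
```

Until `flush()` runs, a write-back entry exists only in the cache, which is the data-loss window the notes warn about.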
# Cache Eviction Policies
1) FIFO
2) LIFO or FILO
3) LRU
4) MRU
5) LFU
6) Random Replacement
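As one example of these policies, LRU (least recently used) can be sketched with `collections.OrderedDict` from the Python standard library. The class name and the capacity used are arbitrary choices for illustration.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes the most recently used entry
cache.put("c", 3)  # over capacity, so "b" is evicted
```

Swapping `popitem(last=False)` for `popitem(last=True)` would turn the same structure into an MRU policy.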
Sharding / Data Partitioning
# Data partitioning:
  splitting up a DB / table across multiple machines
  → improves manageability, performance, availability & LB
** After a certain scale point,
   it is cheaper and more feasible to scale
   horizontally by adding more machines instead of
   scaling vertically by adding beefier servers.
# Methods of partitioning:
1) Horizontal partitioning:
   Different rows go into different tables.
   Range-based sharding
   e.g. storing locations by ZIP code, with different ranges in different tables:
   Table 1: ZIPs with < 100000
   Table 2: ZIPs with > 100000
   and so on
   ** Cons: if the value of the range is not chosen carefully,
      it leads to unbalanced servers
      e.g. Table 1 can have more data than Table 2.
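Range-based sharding boils down to finding which range a key falls into. A minimal sketch using `bisect` is shown below; the boundary values beyond 100000, the table names, and the choice to put exactly 100000 into the second table are assumptions made for the example.

```python
import bisect

# Exclusive upper bounds of each range (illustrative values).
BOUNDS = [100000, 200000, 300000]
TABLES = ["table_1", "table_2", "table_3", "table_4"]


def table_for_zip(zip_code):
    """Return the table whose range contains the given ZIP code."""
    return TABLES[bisect.bisect_right(BOUNDS, zip_code)]
```

If most real ZIP codes fall below the first bound, `table_1` ends up holding most of the data, which is the unbalanced-server problem noted above.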
2) Vertical partitioning
   # Feature-wise distribution of data
     ↳ on different servers
   e.g. Instagram:
   DB server 1: user info
   DB server 2: followers
   DB server 3: photos
   ** Straightforward to implement
   ** Low impact on the application
   -- If the application experiences additional growth,
      it may need to partition a feature-specific DB across various servers
      (e.g. it would not be possible for a single server to handle
      all metadata queries for 10 billion photos by 140 million users)
3) Directory-based partitioning
   A loosely coupled approach to work around the issues
   mentioned in the above two partitioning methods.
   ** Create a lookup service that knows the current partitioning scheme
      & abstracts it away from the DB access code.
   Mapping (tuple key → DB server)
   Easy to add DB servers or change the partitioning scheme.
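The lookup service can be pictured as a mapping that the DB access code consults before every query. The sketch below keeps that mapping in a dictionary; `ShardDirectory`, the key format and the server names are hypothetical, and moving the actual data is out of scope.

```python
class ShardDirectory:
    """Lookup service: tuple key -> DB server (illustrative only)."""

    def __init__(self):
        self._location = {}

    def assign(self, key, server):
        self._location[key] = server

    def lookup(self, key):
        return self._location[key]

    def move(self, key, new_server):
        # Only the mapping changes here; copying the data is not modelled.
        self._location[key] = new_server


directory = ShardDirectory()
directory.assign("user:1", "db-1")
directory.assign("user:2", "db-1")
directory.move("user:2", "db-2")  # e.g. after adding a new DB server
```

Callers only ever use `lookup`, so the partitioning scheme can change without touching the DB access code.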
Partitioning Criteria
1) Key or hash-based partitioning:
   Key attribute of the data → hash function → partition number
   # Effectively fixes the total number of servers / partitions
   So if we add a new server / partition
   → change in hash function
   → downtime because of redistribution
   ↳ Solution: consistent hashing
2) List partitioning: each partition is assigned a list of values.
   New record → look up the key → store the record
   (partition based on the key)
3) Round-robin partitioning:
   uniform data distribution
   With n partitions, the i-th tuple is assigned to partition (i mod n).
4) Composite partitioning:
   combination of the above partitioning schemes
   Hashing + List → Consistent hashing
   ↳ Hash reduces the key space to a size that can be listed.
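The redistribution problem with hash-based partitioning can be measured directly. The sketch below uses MD5 only as a convenient deterministic hash (an assumption of this example), maps 1000 made-up keys to 4 partitions, and counts how many change partition when a fifth is added. Round-robin partitioning is included for comparison.

```python
import hashlib


def stable_hash(key):
    """Deterministic hash (Python's built-in hash() of str is salted per process)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


def hash_partition(key, n):
    """Key or hash-based partitioning: partition number = hash(key) mod n."""
    return stable_hash(key) % n


def round_robin_partition(i, n):
    """Round-robin partitioning: the i-th tuple goes to partition (i mod n)."""
    return i % n


keys = [f"key-{i}" for i in range(1000)]
# Count the keys whose partition changes when going from 4 to 5 partitions.
moved = sum(hash_partition(k, 4) != hash_partition(k, 5) for k in keys)
```

For a uniform hash, a key keeps its partition only when `h % 4 == h % 5`, which holds for 4 of every 20 residues, so about 80% of keys are expected to move.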
# Common Problems of Sharding
Sharded DB: extra constraints on the different operations
↳ operations across multiple tables or
  multiple rows in the same table are no longer running
  on a single server.
1) Joins & denormalization:
   Joins on tables on a single server are straightforward.
   * Not feasible to perform joins on sharded tables
     ↳ less efficient (data needs to be compiled from multiple servers)
   # Workaround: denormalize the DB
     so that the queries that previously required joins
     can be performed from a single table.
     Cons: perils of denormalization
     ↳ data inconsistency
2) Referential integrity:
   Foreign keys on a sharded DB
   ↳ difficult
   * Most RDBMSs do not support foreign keys on a sharded DB.
   # If the application demands referential integrity on a sharded DB
     ↳ enforce it in application code (SQL jobs to clean up dangling references)
3) Rebalancing:
   Reasons to change the sharding scheme:
   a) non-uniform distribution (data-wise)
   b) non-uniform load balancing (request-wise)
   Workaround: 1) add new DBs
               2) rebalance
   ↳ change in partitioning scheme
   ↳ data movement
   ↳ downtime
   We can use directory-based partitioning
   ↳ highly complex
   ↳ single point of failure (lookup service / table)
Indexes
Well known because of databases.
+ Improves speed of retrieval
- Increased storage overhead
- Slower writes
  ↳ write the data
  ↳ update the index
Can be created using one or more columns.
* Rapid random lookups
  & efficient access of ordered records.
# Data structure:
  column → pointer to the whole row
  → Create different views of the same data.
    ↳ very good for filtering / sorting of large data sets
    ↳ no need to create additional copies
# Used for large datasets (TB in size) with small payloads (KB)
  ↳ spread over several physical devices → we need some
    way to find the correct physical location, i.e. indexes
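The "column → pointer to whole row" structure can be sketched with a dictionary from column value to row positions. The rows, the column names and the `build_index` helper are made up for the illustration; real database indexes typically use B-trees or similar structures rather than a hash map.

```python
from collections import defaultdict

rows = [
    {"id": 1, "city": "Pune", "name": "A"},
    {"id": 2, "city": "Delhi", "name": "B"},
    {"id": 3, "city": "Pune", "name": "C"},
]


def build_index(rows, column):
    """Map each value of `column` to the positions of the matching rows."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[column]].append(position)
    return index


city_index = build_index(rows, "city")
# Lookup without scanning every row: follow the stored positions.
pune_rows = [rows[p] for p in city_index["Pune"]]
```

Every new row must also be appended to `city_index`, which is the extra write and storage cost listed above.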
Proxies
Useful under high-load situations,
or if we have limited caching
↳ batches several requests into one
(diagram: clients → proxy → back-end server)
A proxy can:
- filter requests
- log requests
- transform requests
  (add / remove headers, encryption / decryption, compression)
- cache frequently used resources
- coordinate requests (request traffic optimization)
Collapsed forwarding:
↳ collapse the same data access requests into one.
We can also use spatial locality:
↳ collapse requests for data that is spatially close together
↳ minimizes reads from the origin.
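Collapsed forwarding can be sketched as a proxy that reads each distinct key from the origin once per batch of concurrent requests. `CollapsingProxy` and the `fetch_from_origin` callable are hypothetical names, and real proxies collapse requests that are in flight at the same time rather than a pre-collected list.

```python
class CollapsingProxy:
    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # hypothetical callable: key -> value
        self.origin_reads = 0

    def handle_batch(self, requested_keys):
        """Serve a batch of requests with one origin read per distinct key."""
        fetched = {}
        for key in requested_keys:
            if key not in fetched:
                fetched[key] = self._fetch(key)
                self.origin_reads += 1
        # Every caller still gets its own response, in request order.
        return [fetched[key] for key in requested_keys]


proxy = CollapsingProxy(lambda key: key * 2)
responses = proxy.handle_batch(["a", "b", "a", "a"])
```

Four client requests result in only two origin reads here.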
Queues
Effectively manage requests in large-scale distributed systems.
→ In small systems → writes are fast.
→ In complex systems → high incoming load
  ↳ individual writes take more time
* To achieve high performance & availability
  ↳ the system needs to be asynchronous
  ↳ queue
# Synchronous behaviour → degrades performance
  Load balancing: difficult to get a fair & balanced distribution
  (diagram: clients C1 ... Cn send tasks T1, T2, T3 straight to the server; with a queue, the tasks wait in the queue)
# Queues: asynchronous communication protocol
  ↳ client sends a task
  ↳ gets an ACK from the queue (receipt)
    ↳ serves as a reference for the results in future
  ↳ client continues its work.
# Limit on the size of a request
  & the number of requests in the queue
# Queue: provides fault tolerance
  ↳ protection from service outage / failure
  ↳ highly robust
  ↳ retries failed service requests
  ↳ enforces Quality of Service guarantees
    (does NOT expose clients to outages)
# Queues: distributed communication
  ↳ Open-source implementations
  ↳ RabbitMQ, ZeroMQ, ActiveMQ, BeanstalkD
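The receipt-based flow described above can be sketched in a single process. `TaskQueue` and its methods are invented for the example; a real deployment would use a broker such as the ones listed, with the worker running in a separate process.

```python
from collections import deque


class TaskQueue:
    def __init__(self, max_size):
        self.max_size = max_size  # limit on the number of queued requests
        self._pending = deque()
        self._results = {}
        self._next_receipt = 1

    def submit(self, task, *args):
        """Enqueue a task and return a receipt immediately (the ACK)."""
        if len(self._pending) >= self.max_size:
            raise OverflowError("queue is full")
        receipt = self._next_receipt
        self._next_receipt += 1
        self._pending.append((receipt, task, args))
        return receipt

    def work(self, retries=1):
        """Worker side: run pending tasks, retrying a failed task up to `retries` times."""
        while self._pending:
            receipt, task, args = self._pending.popleft()
            for _ in range(retries + 1):
                try:
                    self._results[receipt] = task(*args)
                    break
                except Exception as exc:
                    self._results[receipt] = exc  # kept if every attempt fails

    def result(self, receipt):
        """Look up a result by receipt; None while the task is still pending."""
        return self._results.get(receipt)
```

The client holds only the receipt and keeps working; it asks for the result later instead of blocking on the server.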
Consistent Hashing
# Distributed Hash Table
  index = hash_function(key)
# Suppose we're designing a distributed caching system
  with n cache servers
  ↳ hash_function(key) % n
Drawbacks:
1) NOT horizontally scalable
   ↳ addition of a new server results in
   ↳ the need to change all existing mappings
     (downtime of the system)
2) NOT load balanced
   (because of non-uniform distribution of data)
   ↳ Some caches: hot & saturated
     Other caches: idle & empty
How to tackle the above problems?
Consistent hashing
What is consistent hashing?
→ Very useful strategy for distributed caching & DHTs.
→ Minimizes reorganization when scaling up / down.
→ Only k/n keys need to be remapped.
  k = total number of keys
  n = number of servers
How does it work?
Suppose a typical hash function outputs values in [0, 256).
In consistent hashing, imagine all of these integers are placed on a ring
(0, 1, 2, ..., 253, 254, 255, then back to 0)
& we have 3 servers: A, B & C.
1) Given a list of servers, hash them to integers in the range.
   (ring: A, B and C sit at their hash positions)
2) Map a key to a server:
   a) Hash it to a single integer.
   b) Move clockwise until you find a server.
   c) Map the key to that server.
   (ring: h(key-1) maps to A, h(key-2) maps to B)
Adding a new server 'D' will result in moving 'key-2' to 'D'.
Removing server 'A' will result in moving 'key-1' to 'D'.
Consider a real-world scenario:
data → randomly distributed
↳ unbalanced caches.
How to handle this issue?
Virtual replicas
Instead of mapping each node to a single point,
we map it to multiple points.
↳ (more replicas
  ↳ more equal distribution
  ↳ good load balancing)
(ring: A, B, C and D each appear at several points; h(key-1) and h(key-2) map to the next point clockwise)
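The ring and its virtual replicas can be sketched with a sorted list of hash points. Several choices here are assumptions of the example: MD5 as the hash, a much larger ring than the [0, 256) used in the notes, 100 points per server, and the `server#i` naming of the points.

```python
import bisect
import hashlib


def ring_hash(value):
    """Deterministic hash onto a large ring."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, servers=(), replicas=100):
        self.replicas = replicas  # points per server
        self._points = []         # sorted hash positions
        self._owner = {}          # hash position -> server
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.replicas):
            point = ring_hash(f"{server}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = server

    def remove(self, server):
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key):
        """Move clockwise from hash(key) to the first server point, wrapping around."""
        i = bisect.bisect_left(self._points, ring_hash(key))
        return self._owner[self._points[i % len(self._points)]]


ring = HashRing(["A", "B", "C"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.add("D")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Only keys that now land on one of D's points change owner; every other key stays where it was, which is the k/n remapping property stated above.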
Long-Polling vs WebSockets vs Server-Sent Events
↳ Client-server communication protocols
# HTTP protocol:
  Client → request → Server (prepares the response)
  Client ← response ← Server
# AJAX polling:
  The client repeatedly polls the server for data.
  Similar to the HTTP protocol
  ↳ requests are sent to the server at regular intervals (e.g. 0.5 sec)
  Drawbacks:
  - The client keeps asking the server for new data
    ↳ a lot of responses are 'empty'
    ↳ HTTP overhead
  (sequence: Request → Response, repeated at every interval)
# HTTP long polling:
  'Hanging GET'
  The server does NOT send an empty response.
  It pushes a response to clients only when new data is available.
  1) Client makes an HTTP request & waits for the response.
  2) Server delays the response until an update is available
     or until a timeout occurs.
  3) When an update is available → server sends a full response.
  4) Client sends a new long-poll request
     a) immediately after receiving a response, or
     b) after a pause to allow an acceptable latency period.
  5) Each request has a timeout.
     The client needs to reconnect periodically due to timeouts.
  (sequence: LP request → wait → full response, repeated)
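The long-polling steps above can be simulated in one process, with a `queue.Queue` standing in for the held-open HTTP request. `LongPollServer`, `poll_until_data` and the 200 / 204 status values are illustrative choices, not part of any specific framework.

```python
import queue


class LongPollServer:
    def __init__(self):
        self._updates = queue.Queue()

    def publish(self, data):
        self._updates.put(data)

    def handle_poll(self, timeout):
        """Hold the request until an update arrives or the timeout expires."""
        try:
            return {"status": 200, "data": self._updates.get(timeout=timeout)}
        except queue.Empty:
            # Timed out with nothing to send; the client should poll again.
            return {"status": 204, "data": None}


def poll_until_data(server, timeout, max_polls):
    """Client loop: re-issue the long-poll request after every timeout."""
    for _ in range(max_polls):
        response = server.handle_poll(timeout)
        if response["data"] is not None:
            return response["data"]
    return None
```

The blocking `get(timeout=...)` plays the role of the server delaying its response in step 2.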
WebSockets
→ Full-duplex communication channel over a
  single TCP connection.
→ Provides 'persistent communication'
  (client & server can send data at any time)
→ Bidirectional communication in an always-open channel.
  (sequence: client sends WebSocket handshake request → server sends handshake success response → bidirectional communication channel)
→ Lower overheads
→ Real-time data transfer
Server-Sent Events (SSE)
The client establishes a persistent & long-term connection with the server.
The server uses this connection to send data to the client.
** If the client wants to send data to the server
   ↳ requires another technology / protocol.
(sequence: client sends a data request using regular HTTP → always-open unidirectional communication channel → server sends responses whenever new data is available)
→ Best when we need real-time data from server to client,
  OR the server is generating data in a loop &
  will be sending multiple events to the client.
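On the wire, an SSE response is a `text/event-stream` body in which each event is a group of `field: value` lines ended by a blank line. The helper names below are hypothetical, and serving the stream over an actual HTTP connection is left out of this sketch.

```python
def format_sse(data, event=None, event_id=None):
    """Serialize one event in the text/event-stream format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads are sent as one "data:" line per line of text.
    for part in str(data).splitlines() or [""]:
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"  # a blank line terminates the event


def event_stream(ticks):
    """Server generating data in a loop and sending multiple events."""
    for i in range(ticks):
        yield format_sse(f"tick {i}", event="tick", event_id=i)
```

A server would write each yielded string to the open response as it is produced, while the client only reads.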
More Related Content

PPTX
Web browser architecture
Nguyen Quang
 
PPTX
Unit - 1: ASP.NET Basic
KALIDHASANR
 
PPTX
Server Side Programming
Milan Thapa
 
PPT
introduction to Web system
hashim102
 
PPTX
Database systems - Chapter 1
shahab3
 
PPTX
Synchronous vs Asynchronous Programming
jeetendra mandal
 
PPTX
Client & server side scripting
baabtra.com - No. 1 supplier of quality freshers
 
PDF
parallel Questions &amp; answers
Md. Mashiur Rahman
 
Web browser architecture
Nguyen Quang
 
Unit - 1: ASP.NET Basic
KALIDHASANR
 
Server Side Programming
Milan Thapa
 
introduction to Web system
hashim102
 
Database systems - Chapter 1
shahab3
 
Synchronous vs Asynchronous Programming
jeetendra mandal
 
Client & server side scripting
baabtra.com - No. 1 supplier of quality freshers
 
parallel Questions &amp; answers
Md. Mashiur Rahman
 

Similar to System Design.pdf (20)

PDF
System Design Basics by Pratyush Majumdar
Pratyush Majumdar
 
PDF
Dealing with Enterprise Level Data
Mike Crabb
 
PPT
SQL or NoSQL, that is the question!
Andraz Tori
 
PPT
Key Challenges in Cloud Computing and How Yahoo! is Approaching Them
Yahoo Developer Network
 
PDF
Scalability Considerations
Navid Malek
 
PPTX
Black Friday and Cyber Monday- Best Practices for Your E-Commerce Database
Tim Vaillancourt
 
PPS
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Cal Henderson
 
PPTX
Scalable Web Architecture and Distributed Systems
hyun soomyung
 
PDF
Tech Winter Break @gdgkiit | System Design Essentials
Pragati Das
 
PDF
Scale from zero to millions of users.pdf
NedyalkoKarabadzhako
 
PPTX
Scaling your website
Alejandro Marcu
 
PDF
Scalable, good, cheap
Marc Cluet
 
ODP
Front Range PHP NoSQL Databases
Jon Meredith
 
PPS
Web20expo Scalable Web Arch
guest18a0f1
 
PPS
Web20expo Scalable Web Arch
royans
 
PPS
Web20expo Scalable Web Arch
mclee
 
PPS
Scalable Web Architectures - Common Patterns & Approaches
Cal Henderson
 
PPS
Scalable Web Arch
royans
 
PDF
Distributed Systems: scalability and high availability
Renato Lucindo
 
PPTX
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
Ontico
 
System Design Basics by Pratyush Majumdar
Pratyush Majumdar
 
Dealing with Enterprise Level Data
Mike Crabb
 
SQL or NoSQL, that is the question!
Andraz Tori
 
Key Challenges in Cloud Computing and How Yahoo! is Approaching Them
Yahoo Developer Network
 
Scalability Considerations
Navid Malek
 
Black Friday and Cyber Monday- Best Practices for Your E-Commerce Database
Tim Vaillancourt
 
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Cal Henderson
 
Scalable Web Architecture and Distributed Systems
hyun soomyung
 
Tech Winter Break @gdgkiit | System Design Essentials
Pragati Das
 
Scale from zero to millions of users.pdf
NedyalkoKarabadzhako
 
Scaling your website
Alejandro Marcu
 
Scalable, good, cheap
Marc Cluet
 
Front Range PHP NoSQL Databases
Jon Meredith
 
Web20expo Scalable Web Arch
guest18a0f1
 
Web20expo Scalable Web Arch
royans
 
Web20expo Scalable Web Arch
mclee
 
Scalable Web Architectures - Common Patterns & Approaches
Cal Henderson
 
Scalable Web Arch
royans
 
Distributed Systems: scalability and high availability
Renato Lucindo
 
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
Ontico
 
Ad

More from JitendraYadav351971 (20)

PDF
computer Graphic.pdf
JitendraYadav351971
 
PDF
sql notes.pdf
JitendraYadav351971
 
PDF
CSS notes.pdf
JitendraYadav351971
 
PDF
core java notes.pdf
JitendraYadav351971
 
PDF
Api Testing.pdf
JitendraYadav351971
 
PDF
ptyhon notes.pdf
JitendraYadav351971
 
PDF
SQL question.pdf
JitendraYadav351971
 
PDF
DSA notes.pdf
JitendraYadav351971
 
PDF
Mongo DB.pdf
JitendraYadav351971
 
PDF
c language.pdf
JitendraYadav351971
 
PDF
Interview Questions.pdf
JitendraYadav351971
 
PDF
OOPS Interview Questions.pdf
JitendraYadav351971
 
PDF
HTML.pdf
JitendraYadav351971
 
PDF
OOPs TOP Question for interview.pdf
JitendraYadav351971
 
PDF
Boost Frontend.pdf
JitendraYadav351971
 
PDF
java notes.pdf
JitendraYadav351971
 
PDF
Thermodynamics.pdf
JitendraYadav351971
 
PDF
C programming.pdf
JitendraYadav351971
 
PDF
Basics of HTML.pdf
JitendraYadav351971
 
PDF
Metaverse.pdf
JitendraYadav351971
 
computer Graphic.pdf
JitendraYadav351971
 
sql notes.pdf
JitendraYadav351971
 
CSS notes.pdf
JitendraYadav351971
 
core java notes.pdf
JitendraYadav351971
 
Api Testing.pdf
JitendraYadav351971
 
ptyhon notes.pdf
JitendraYadav351971
 
SQL question.pdf
JitendraYadav351971
 
DSA notes.pdf
JitendraYadav351971
 
Mongo DB.pdf
JitendraYadav351971
 
c language.pdf
JitendraYadav351971
 
Interview Questions.pdf
JitendraYadav351971
 
OOPS Interview Questions.pdf
JitendraYadav351971
 
OOPs TOP Question for interview.pdf
JitendraYadav351971
 
Boost Frontend.pdf
JitendraYadav351971
 
java notes.pdf
JitendraYadav351971
 
Thermodynamics.pdf
JitendraYadav351971
 
C programming.pdf
JitendraYadav351971
 
Basics of HTML.pdf
JitendraYadav351971
 
Metaverse.pdf
JitendraYadav351971
 
Ad

Recently uploaded (20)

PDF
Unit I Part II.pdf : Security Fundamentals
Dr. Madhuri Jawale
 
PDF
2025 Laurence Sigler - Advancing Decision Support. Content Management Ecommer...
Francisco Javier Mora Serrano
 
PDF
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
PDF
Machine Learning All topics Covers In This Single Slides
AmritTiwari19
 
PPTX
Victory Precisions_Supplier Profile.pptx
victoryprecisions199
 
PDF
Introduction to Ship Engine Room Systems.pdf
Mahmoud Moghtaderi
 
DOCX
SAR - EEEfdfdsdasdsdasdasdasdasdasdasdasda.docx
Kanimozhi676285
 
PDF
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
PPTX
22PCOAM21 Session 1 Data Management.pptx
Guru Nanak Technical Institutions
 
PPTX
Civil Engineering Practices_BY Sh.JP Mishra 23.09.pptx
bineetmishra1990
 
PPTX
Information Retrieval and Extraction - Module 7
premSankar19
 
PDF
2010_Book_EnvironmentalBioengineering (1).pdf
EmilianoRodriguezTll
 
PDF
Cryptography and Information :Security Fundamentals
Dr. Madhuri Jawale
 
PDF
20ME702-Mechatronics-UNIT-1,UNIT-2,UNIT-3,UNIT-4,UNIT-5, 2025-2026
Mohanumar S
 
PDF
Zero carbon Building Design Guidelines V4
BassemOsman1
 
PDF
AI-Driven IoT-Enabled UAV Inspection Framework for Predictive Maintenance and...
ijcncjournal019
 
PPT
Understanding the Key Components and Parts of a Drone System.ppt
Siva Reddy
 
PDF
Packaging Tips for Stainless Steel Tubes and Pipes
heavymetalsandtubes
 
PPTX
Chapter_Seven_Construction_Reliability_Elective_III_Msc CM
SubashKumarBhattarai
 
PPTX
MULTI LEVEL DATA TRACKING USING COOJA.pptx
dollysharma12ab
 
Unit I Part II.pdf : Security Fundamentals
Dr. Madhuri Jawale
 
2025 Laurence Sigler - Advancing Decision Support. Content Management Ecommer...
Francisco Javier Mora Serrano
 
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
Machine Learning All topics Covers In This Single Slides
AmritTiwari19
 
Victory Precisions_Supplier Profile.pptx
victoryprecisions199
 
Introduction to Ship Engine Room Systems.pdf
Mahmoud Moghtaderi
 
SAR - EEEfdfdsdasdsdasdasdasdasdasdasdasda.docx
Kanimozhi676285
 
67243-Cooling and Heating & Calculation.pdf
DHAKA POLYTECHNIC
 
22PCOAM21 Session 1 Data Management.pptx
Guru Nanak Technical Institutions
 
Civil Engineering Practices_BY Sh.JP Mishra 23.09.pptx
bineetmishra1990
 
Information Retrieval and Extraction - Module 7
premSankar19
 
2010_Book_EnvironmentalBioengineering (1).pdf
EmilianoRodriguezTll
 
Cryptography and Information :Security Fundamentals
Dr. Madhuri Jawale
 
20ME702-Mechatronics-UNIT-1,UNIT-2,UNIT-3,UNIT-4,UNIT-5, 2025-2026
Mohanumar S
 
Zero carbon Building Design Guidelines V4
BassemOsman1
 
AI-Driven IoT-Enabled UAV Inspection Framework for Predictive Maintenance and...
ijcncjournal019
 
Understanding the Key Components and Parts of a Drone System.ppt
Siva Reddy
 
Packaging Tips for Stainless Steel Tubes and Pipes
heavymetalsandtubes
 
Chapter_Seven_Construction_Reliability_Elective_III_Msc CM
SubashKumarBhattarai
 
MULTI LEVEL DATA TRACKING USING COOJA.pptx
dollysharma12ab
 

System Design.pdf

  • 2. System Design Basics ① 1) Try to break the problem into simpler modules ( Top down approach) 2) T alk about the trade - offs ( No solution is perfect) calculate the impact on system based on all the constraints and the end test cases . ← Focus on interviewer 's heproblem ] intentions . T Ask Abstract ] questions ( constraints & Functionality finding Requirements) bottlenecks idudas
  • 3. System Design Basics ccontd ) ② D Architectural pieces / resources available 2) How these resources work together 3) Utilization & Tradeoffs - consistent Hashing - CAP Theorem ✓ - Load balancing ✓ - queues f- caching - - Replication - 59L vs No - SQL I - Indexes ✓ - Proxies 1 - Data Partitioning ✓
  • 4. Load Balancing ③ ( Distributed system) Types of distribution - f Random Round - robin < Random ( weights for memory & CPU cycles) To utilize full scalability & redundancy , add 3 LB D User ¥ web server 2) Web server ¥ App server 1 Cache Server ( Internal platform) 3) Internal platform DB . #W Client LB er DB T ,, LB
  • 5. Smart clients Takes a pool of service hosts & balances load. → detects hosts that are not responsive → recovered hosts → addition of new hosts Load balancing functionality to DB (cache. Service * Attractive solution for developers ( small scale systems) As system grows → LBS ( standalone servers ) Hardware load Balancers : Expensive but high performance. e. g . Citrix Netscaler Not trivial to configure. Large companies tend to avoid this config . or use it as 1st point or contact to their system to serve user requests & Intra network uses smart clients / hybrid solution → ( Next page) for load balancing traffic .
  • 6. Software Load Balancers ' No pain of creation of smart client No cost of purchasing dedicated hardware [ hybrid approach HA Proxy OSS Load balancer -4 1) Running on client machine ' 2 ( locally bound port) e - g . local host : 9000 I F- managed by HA Proxy ( with efficient management of requests on the port ) 2) Running on intermediate server : Proxies running beth HA Proxy [ Manages health checks dirt server side components removal & addition of machines balances requests alc pools .
  • 7. Wortdof Databases = S9Lvs.NoS÷ iaa 1) structured D Unstructured . 2) Predefined schema 2) distributed 3) Data in rows & columns 3) dynamic schema Row One Entity Into column Separate data points mysar ¥'s::L:L: stores Oracle 9sa¥fl wait:p: ' :3; DB Postgres MariaDB
  • 8. an E o e r ÷÷÷÷÷÷÷÷÷÷÷÷÷÷÷÷÷÷÷÷÷÷÷÷÷ O e ai s I E E cs O S O th s I 3 is J Os e to 6 of = G IT u E s O U ' E s O S - I g O f÷÷÷¥÷÷÷÷÷÷÷÷÷. Et- s o U te O G O e e O : tr - I ' J S T O g e I E E O or I is E E ET o o ① ± . ÷EE or or o J - E e E E s
  • 9. is 3 Er er . . I 8 Ess E -3 o o - G o OS u c S s d d - E E OE s w t o E o o J E = O E E G - r d Id O is 8 e E t E ° i n u EE s E - E - E O E s - d s gu - so § n E d s f O s . - c O G e u f O d O O - 88 U I 0 or 6 by us is • u w S S s s o - W s e j d u E U o g O s a ou ou N & si u O or 883 A 88 it's too > u s I E ch t g - o o O s 't ⑤ I it Cs cs T g J as a Notre u E O o - v . I d g. E n S E KE u e I o O → of E s o s 8 s I I 8 of to E Z t - - n y U ou u @ at so . - In O Es O s 6 w o @ Ed O t T i N T S e e E O ng O w . - u & . o - E U - o E on s a • R g S w O - z - O or O U N b d s D J O s cry to Nos y U G - u - - t O u z > so I ' E O og u O si w Dw or • ← we t w O F acs n - o o 8 3 u f o S - T I Z o 's - E u 06 I d.E es ° 6 § 03 O T w E s if÷÷÷÷f÷÷÷i÷÷÷:ff÷÷:÷f÷÷i÷÷÷ G O z S T s 89 f- od ' I O U G o T u Es E o 8- or # O E D s @ u w u o J as 6 ' I t O L O t u ' Es D O 6 g y U d 62 g- U w u o g N s or y es or u O > u d U s - - d E nd o b Eire E # E E = T I n - ooo EEE - o ' E O o - I • as u a 1- o 's x G , + as I o ' 88 go.IT IT IT IT - w s q El if ¥4 . - e d J o I 0 8 u s u g Ev or on us or
  • 10. Reasonstouses.cl#BJ 1) You need to ensure ACID compliance : ACID compliance Reduces anomalies Protects integrity of the database . for many E - commerce & financial app " → ACID compliant DB is the first choice . 2) Your data is structured & unchanging . If your business is not experiencing rapid growth or sudden changes → No requirements of more servers → data is consistent then there's no reason to use system design to support variety of data & high traffic .
  • 11. Reasonstouse.NO#IB When all other components of system are fast → querying & searching for data bottleneck . NoSQL prevent data from being bottleneck . Big data large success for NoSQL. 1) To store large volumes of data C little Ino structure) No limit on type of data. Document DB Stores all data in one place ( No need of type of data) 2) Using cloud & storage to the fullest . Excellent cost saving solution . ( Easy spread of data across multiple servers to scale up ) OR commodity hlw on site ( affordable , smaller ) No headache or additional Stw & NoSQL DBS like Cassander designed to scale across multiple data centers out of the box. 3) Useful for rapid 1 agile development. If you're making quick iterations on schema SQL will slow you down .
  • 12. Achieved by CAP-heore€ tenyC All nodes see same data updating several nodes . at same time) before allowing g) reads E " " " " e. %%%, Not g pareieiontdera£ Availability [cassandra . Couch DB] Hr Hr Every request gets System continues to work response ( success ( failure) despite message loss ( partial Achieved by replicating Failure . data across different servers ( can sustain any amount of network failure without - resulting in failure of entire Data is sufficiently replicated network ) across combination of nodes / networks to keep the system up . .in:6#eo:::i:::::ia::i:n:::ans'
  • 13. We cannot build a datastore which is: 1) continually available, 2) sequentially consistent, 3) partition-failure tolerant. Because: to be consistent, all nodes should see the same set of updates in the same order. But if the network suffers a partition, an update in one partition might not make it to the other partitions, so a client can read data from an out-of-date partition after having read from an up-to-date partition. Solution: stop serving requests from the out-of-date partition, but then the service is no longer 100% available.
  • 14. Redundancy & Replication: Duplication of critical data and services → increases the reliability of the system. For critical services and data, ensure that multiple copies/versions are running simultaneously on different servers/databases. Secures against single-node failures. Provides backups if needed in a crisis. (Diagram: primary server and secondary server; active data is copied to mirrored data by data replication.) # Service redundancy: shared-nothing architecture. Every node is independent. No central service managing state. More resilient to failures; new servers can be added without special conditions, which helps scalability.
  • 15. Caching. Load balancing: scales horizontally. Caching: locality of reference principle. Used in almost every layer of computing. Application server cache: placing a cache directly on a request-layer node → local storage of response data. (Diagram: requests hit the cache on the request-layer node; on a miss, the response data is fetched and stored.) # A cache on one request-layer node can be located in memory (very fast) or on the node's local disk (faster than going to network storage). # Bottleneck: if the LB distributes requests randomly, the same request goes to different nodes → more cache misses. Overcome by 1) global caches, 2) distributed caches.
  • 16. Distributed Cache: Divided across nodes using a consistent hashing function. ## Easy to increase cache space by adding more nodes. ## Disadvantage: resolving a missing node. This can be handled by storing multiple copies of data on different nodes, but that makes it more complicated. ## Even if a node disappears, requests can pull data from the origin.
  • 17. Global Cache: # Single cache space for all the nodes. → Adding a cache server/file store (faster than the original store). # Difficult to manage if the number of clients/requests increases. Effective if 1) a fixed dataset needs to be cached, 2) special hardware provides fast I/O. # Forms of global cache (diagram): in one form the global cache sits in front of the database; in the other, the global cache contains hot data and the request nodes also talk to the database, which suits cases where the application logic understands the eviction strategy or hot spots better than the cache.
  • 18. CDN: Content Distribution Network. A cache store for sites that serve large amounts of static media. (Diagram: a request goes to the CDN; if the static media is available in the CDN's local storage it is served from there; if not available, the CDN requests it from the back-end server.) If the site isn't large enough to have its own CDN: for an easy future transition, serve static media from a separate subdomain (static.yourservice.com) using a lightweight Nginx server → cut over the DNS from your servers to a CDN later.
  • 19. Cache Invalidation: # Cached data needs to be coherent with the database. If data in the DB is modified, invalidate the cached data. # 3 schemes: 1) Write-through cache: data is written at the same time to both cache and DB. + Complete data consistency (cache = DB). + Fault tolerance in case of failure (no data loss). - High latency in writes (2 write operations). 2) Write-around cache: data is written directly to the DB, bypassing the cache. + No cache flooding for writes. - A read request for newly written data is a cache miss → higher latency.
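The write-through and write-around schemes can be compared with a small sketch. It assumes a plain dict stands in for the database; the class names and the `"hit"`/`"miss"` labels are illustrative, not taken from the slides.

```python
class WriteThroughCache:
    """Every write goes to the cache and the backing store together."""

    def __init__(self, db):
        self.db = db          # hypothetical backing store: a plain dict
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value   # write 1: cache
        self.db[key] = value      # write 2: database (the extra write latency)

    def read(self, key):
        if key in self.cache:
            return self.cache[key], "hit"
        value = self.db[key]      # miss: go to the database, then cache it
        self.cache[key] = value
        return value, "miss"


class WriteAroundCache(WriteThroughCache):
    """Writes bypass the cache; only reads populate it."""

    def write(self, key, value):
        self.db[key] = value          # the cache is not flooded by writes
        self.cache.pop(key, None)     # drop any stale cached copy


through = WriteThroughCache(db={})
through.write("user:1", "alice")
around = WriteAroundCache(db={})
around.write("user:1", "alice")

first_through = through.read("user:1")   # served from the cache
first_around = around.read("user:1")     # newly written data misses once
second_around = around.read("user:1")    # now cached by the first read
```

The first read after a write-around write pays the miss, which is the higher-latency case the slide lists as its drawback.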
  • 20. 3) Write-back cache: data is written to the cache only and the write is confirmed to the client; after some interval or under some specified conditions, data is written to the DB from the cache. + Low latency and high throughput for write-intensive applications. - Risk of data loss (only one copy, in the cache). # Cache Eviction Policies: 1) FIFO, 2) LIFO or FILO, 3) LRU, 4) MRU, 5) LFU, 6) Random Replacement.
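As an illustration of one eviction policy from the list, here is a minimal LRU (least recently used) sketch. It assumes a fixed item-count capacity and uses `collections.OrderedDict` to track recency; the class name is hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # oldest entry first, newest last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" becomes the most recently used entry
cache.put("c", 3)    # capacity exceeded, so "b" is evicted
```

FIFO would differ only in never calling `move_to_end` on reads, so the oldest inserted entry is always the one evicted.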
  • 21. Sharding / Data Partitioning: # Data partitioning: splitting up a DB/table across multiple machines → manageability, performance, availability and load balancing. ** After a certain scale point, it is cheaper and more feasible to scale horizontally by adding more machines instead of scaling vertically by adding beefier servers. # Methods of Partitioning: 1) Horizontal partitioning: different rows go into different tables. Range-based sharding, e.g. storing locations by zip code with different ranges in different tables: Table 1: zips below 100000; Table 2: zips above 100000; and so on. ** Cons: if the range boundaries are not chosen carefully, this leads to unbalanced servers, e.g. Table 1 can have more data than Table 2.
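Range-based sharding by zip code can be sketched as a lookup over sorted range boundaries. The boundaries, table names, and sample zip codes below are made up for illustration; the slide only gives 100000 as the first boundary.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical exclusive upper bounds for each table's zip range.
RANGE_BOUNDS = [100000, 200000, 300000]
TABLES = ["table_1", "table_2", "table_3", "table_4"]


def table_for_zip(zip_code):
    """Return the table whose range contains zip_code."""
    return TABLES[bisect_right(RANGE_BOUNDS, zip_code)]


# A skewed sample: most zips fall below the first boundary.
sample_zips = [10001, 10002, 94105, 60601, 150000]
load = Counter(table_for_zip(z) for z in sample_zips)
```

With this sample, `table_1` receives four of the five rows, which is the unbalanced-server problem the slide warns about when ranges are chosen carelessly.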
  • 22. Vertical Partitioning: # Feature-wise distribution of data across different servers, e.g. Instagram: DB server 1: user info; DB server 2: followers; DB server 3: photos. * Straightforward to implement. * Low impact on the app. - If the app experiences additional growth, a feature-specific DB may need to be partitioned further across various servers (e.g. it would not be possible for a single server to handle all metadata queries for 10 billion photos by 140 million users). Directory-based partitioning: a loosely coupled approach to work around the issues mentioned in the above two partitioning schemes. ** Create a lookup service that knows the current partitioning scheme and abstracts it away from the DB access code. Mapping (tuple key → DB server). Easy to add DB servers or change the partitioning scheme.
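A minimal sketch of the directory-based lookup service, assuming an in-memory dict holds the key-to-server mapping. The class, method, and server names are hypothetical.

```python
class ShardDirectory:
    """Lookup service: maps a tuple key to the DB server that stores it."""

    def __init__(self):
        self._mapping = {}

    def assign(self, key, server):
        self._mapping[key] = server

    def lookup(self, key):
        return self._mapping[key]


def save(directory, servers, key, record):
    """DB access code only asks the directory where a key lives."""
    servers[directory.lookup(key)][key] = record


servers = {"db-server-1": {}, "db-server-2": {}}
directory = ShardDirectory()
directory.assign("user:42", "db-server-1")
save(directory, servers, "user:42", {"name": "alice"})

# Changing the scheme means updating the directory, not the access code.
directory.assign("user:42", "db-server-2")
save(directory, servers, "user:42", {"name": "alice"})
```

A real migration would also move or delete the old copy on `db-server-1`; the sketch leaves that out to keep the focus on the indirection.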
  • 23. Partitioning Criteria: 1) Key- or hash-based partitioning: key attribute of the entity → hash function → partition number. # Effectively fixes the total number of servers/partitions, so if we add a new server/partition → change in hash function → downtime because of redistribution. Solution: consistent hashing. 2) List partitioning: each partition is assigned a list of values. New record → look up the key → store the record (partition based on the key).
  • 24. 3) Round-robin partitioning: uniform data distribution; with n partitions, the i-th tuple is assigned to partition (i mod n). 4) Composite partitioning: combination of the above partitioning schemes, e.g. hashing + list → consistent hashing: the hash reduces the key space to a size that can be listed. # Common Problems of Sharding: a sharded DB places extra constraints on the different operations, because operations across multiple tables or multiple rows in the same table are no longer running on a single server.
  • 25. 1) Joins and denormalization: joins on tables on a single server are straightforward. * Not feasible to perform joins on sharded tables → less efficient (data needs to be compiled from multiple servers). # Workaround: denormalize the DB so that the queries that previously required joins can be performed from a single table. Cons: perils of denormalization → data inconsistency. 2) Referential integrity: foreign keys on a sharded DB → difficult. * Most RDBMSs do not support foreign keys on a sharded DB. # If the application demands referential integrity on a sharded DB → enforce it in application code (SQL jobs to clean up dangling references).
  • 26. 3) Rebalancing: Reasons to change the sharding scheme: a) non-uniform distribution (data-wise); b) non-uniform load balancing (request-wise). Workaround: 1) add new DBs, 2) rebalance → change in partitioning scheme → data movement → downtime. We can use directory-based partitioning → highly complex → single point of failure (lookup service/table).
  • 27. Indexes: Well known because of databases. + Improves speed of retrieval. - Increased storage overhead. - Slower writes → write the data → update the index. Can be created using one or more columns. * Rapid random lookups and efficient access of ordered records. # Data structure: column → pointer to whole row. → Creates different views of the same data: very good for filtering/sorting of large data sets; no need to create additional copies. # Used for large datasets (TB in size) with small payloads (KB) spread over several physical devices → we need some way to find the correct physical location, i.e. indexes.
  • 28. Proxies: useful under high-load situations or if we have limited caching → batches several requests into one. (Diagram: clients → proxy → back-end server.) A proxy can: filter requests, log requests, transform requests (add/remove headers, encryption/decryption, compression), serve frequently used resources, and coordinate requests (request traffic optimization). Collapsed forwarding: collapse the same data access request into one. We can also use spatial locality: collapsing requests for data that is spatially close together to minimize reads from the origin.
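Collapsed forwarding can be sketched as grouping a batch of pending requests by the resource they ask for, so the origin is read once per distinct resource. The function names, client ids, and keys below are hypothetical, and the sketch assumes each client has one pending request.

```python
from collections import defaultdict


def collapse_requests(requests, fetch_from_origin):
    """Serve (client, key) requests with one origin read per distinct key."""
    waiting = defaultdict(list)
    for client, key in requests:
        waiting[key].append(client)

    responses = {}
    for key, clients in waiting.items():
        data = fetch_from_origin(key)    # single read for all waiting clients
        for client in clients:
            responses[client] = data
    return responses


origin_reads = []


def fetch_from_origin(key):
    origin_reads.append(key)             # record how often the origin is hit
    return f"contents of {key}"


batch = [("c1", "littleB"), ("c2", "littleB"), ("c3", "bigA"), ("c4", "littleB")]
responses = collapse_requests(batch, fetch_from_origin)
```

Four client requests result in two origin reads here, since three of them ask for the same resource.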
  • 29. Queues: Effectively manage requests in large-scale distributed systems. → In small systems → writes are fast. → In complex systems → high incoming load → individual writes take more time. * To achieve high performance and availability → the system needs to be asynchronous → queue. # Synchronous behaviour → degrades performance → load balancing becomes difficult for fair and balanced distribution. (Diagram: clients place tasks T1, T2, T3 in a queue in front of the server.)
  • 30. # Queues: asynchronous communication protocol → the client sends a task → gets an ACK (receipt) from the queue, which serves as a reference for the results in future → the client continues its work. # Limit on the size of a request and the number of requests in the queue. # Queue provides fault tolerance → protection from service outages/failures; highly robust → retries failed service requests; enforces quality-of-service guarantees (does NOT expose clients to outages). # Queues: distributed communication → open-source implementations → RabbitMQ, ZeroMQ, ActiveMQ, BeanstalkD.
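The submit-and-receipt flow can be sketched in a single process with the standard library's `queue.Queue`. This is only an in-process stand-in for a real message broker; the `submit` helper, the receipt format, and the "processing" step (upper-casing the payload) are assumptions for illustration.

```python
import queue
import threading
import uuid

tasks = queue.Queue(maxsize=100)   # cap on the number of queued requests
results = {}                       # receipt -> result, filled in by the worker


def submit(payload):
    """Enqueue a task and return a receipt the client can use later."""
    receipt = str(uuid.uuid4())
    tasks.put((receipt, payload))
    return receipt


def worker():
    while True:
        item = tasks.get()
        if item is None:           # sentinel: stop the worker
            tasks.task_done()
            break
        receipt, payload = item
        results[receipt] = payload.upper()   # stand-in for real processing
        tasks.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

receipt = submit("resize image 17")   # returns immediately with a receipt
# ... the client is free to continue other work here ...
tasks.join()                          # wait until the queued task is processed
tasks.put(None)
thread.join()
outcome = results[receipt]
```

The client never talks to the worker directly, so a slow or restarting worker delays the result without blocking `submit`.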
  • 31. Consistent Hashing: # Distributed Hash Table: index = hash_function(key). # Suppose we're designing a distributed caching system with n cache servers → server = hash_function(key) % n. Drawbacks: 1) NOT horizontally scalable → the addition of a new server results in a need to change all existing mappings (downtime of the system). 2) NOT load balanced (because of non-uniform distribution of data) → some caches: hot and saturated; other caches: idle and empty. How to tackle the above problems? Consistent hashing.
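The scalability drawback of `hash(key) % n` can be measured directly. The sketch below assumes MD5 as a stable hash and made-up key names, and counts how many keys change servers when a fifth server is added to four.

```python
import hashlib


def stable_hash(key):
    """Deterministic hash (Python's built-in hash() varies between runs)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


def server_for(key, n):
    return stable_hash(key) % n


keys = [f"key-{i}" for i in range(10_000)]
moved = sum(server_for(k, 4) != server_for(k, 5) for k in keys)
fraction_moved = moved / len(keys)
```

For a uniform hash, a key keeps its server only when its hash has the same remainder mod 4 and mod 5, which happens for 4 of every 20 values, so about four in five keys are remapped when going from 4 to 5 servers.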
  • 32. What is consistent hashing? → Very useful strategy for distributed caching and DHTs. → Minimizes reorganization when scaling up/down. → Only k/n keys need to be remapped (k = total number of keys, n = number of servers). How it works: suppose a typical hash function outputs in [0, 256). In consistent hashing, imagine all of these integers are placed on a ring (0, 1, 2, ..., 253, 254, 255), and we have 3 servers: A, B and C.
  • 33. 1) Given a list of servers, hash them to integers in the range (diagram: A, B and C placed on the 0..255 ring). 2) Map a key to a server: a) hash it to a single integer; b) move clockwise until you find a server; c) map the key to that server. (Diagram: h(key-1) maps to A, h(key-2) maps to B.)
  • 34. Adding a new server 'D' will result in moving 'key-2' to 'D'. (Diagram: ring with A, B, C and D; h(key-2) now maps to D.) Removing server 'A' will result in moving 'key-1' to 'D'. (Diagram: ring without A; h(key-1) now maps to D.)
  • 35. Consider a real-world scenario: data → randomly distributed → unbalanced caches. How to handle this issue? Virtual replicas: instead of mapping each node to a single point, we map it to multiple points (more replicas → more equal distribution → good load balancing). (Diagram: ring with several points for each of A, B, C and D.)
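The ring with virtual replicas can be sketched with a sorted list of points and a binary search for the next point clockwise. This is a minimal sketch, assuming MD5 for ring positions and 100 replicas per server; the class name and the `server#i` replica labels are hypothetical.

```python
import bisect
import hashlib


def ring_hash(value):
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, replicas=100):
        self.replicas = replicas   # virtual points per server
        self._points = []          # sorted ring positions
        self._owner = {}           # ring position -> server

    def add(self, server):
        for i in range(self.replicas):
            point = ring_hash(f"{server}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = server

    def remove(self, server):
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key):
        # First point clockwise from the key's hash, wrapping past the end.
        idx = bisect.bisect_right(self._points, ring_hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing()
for name in ("A", "B", "C"):
    ring.add(name)

keys = [f"key-{i}" for i in range(10_000)]
before = {k: ring.lookup(k) for k in keys}
ring.add("D")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` now belongs to D, and keys that stayed put did not have to be touched, which is the "only k/n keys are remapped" property.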
  • 36. Long-Polling vs WebSockets vs Server-Sent Events → client-server communication protocols. # HTTP protocol: the client sends a request → the server prepares a response → the server sends the response. # AJAX polling: clients repeatedly poll servers for data; similar to the HTTP protocol → requests are sent to the server at regular intervals (0.5 sec). Drawbacks: the client keeps asking the server for new data → a lot of responses are 'empty' → HTTP overhead. (Diagram: repeated request/response pairs between client and server.)
  • 37. # HTTP Long Polling: 'hanging GET'. The server does NOT send an empty response; it pushes a response to clients only when new data is available. 1) Client makes an HTTP request and waits for the response. 2) Server delays the response until an update is available or until a timeout occurs. 3) When an update is available → the server sends a full response. 4) Client sends a new long-poll request a) immediately after receiving the response, or b) after a pause to allow an acceptable latency period. 5) Each request has a timeout; the client needs to reconnect periodically due to timeouts. (Diagram: repeated LP request / full response pairs between client and server.)
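The "hold the request until data or timeout" behaviour of long polling can be simulated in-process with `threading.Condition`. No real HTTP is involved; the `LongPollChannel` class, its method names, and the sample message are assumptions for illustration.

```python
import threading


class LongPollChannel:
    """Server side of a long poll: hold the request until data or timeout."""

    def __init__(self):
        self._cond = threading.Condition()
        self._messages = []

    def publish(self, message):
        with self._cond:
            self._messages.append(message)
            self._cond.notify_all()      # wake any hanging requests

    def poll(self, last_seen, timeout):
        """Return messages newer than last_seen, waiting up to timeout seconds."""
        with self._cond:
            self._cond.wait_for(
                lambda: len(self._messages) > last_seen, timeout=timeout
            )
            return self._messages[last_seen:]


channel = LongPollChannel()

# No update arrives: the request times out and the client would reconnect.
timed_out = channel.poll(last_seen=0, timeout=0.05)

# An update arrives while the request is hanging: it is returned at once.
threading.Timer(0.05, channel.publish, args=["price=42"]).start()
update = channel.poll(last_seen=0, timeout=2.0)
```

The second `poll` returns as soon as the message is published rather than after the full two seconds, which is what distinguishes long polling from fixed-interval AJAX polling.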
  • 38. WebSockets: → Full-duplex communication channel over a single TCP connection. → Provides 'persistent communication' (client and server can send data at any time). → Bidirectional communication over an always-open channel. (Diagram: WebSocket handshake request → handshake success response → open communication channel between client and server.) → Lower overheads. → Real-time data transfer.
  • 39. Server-Sent Events (SSE): the client establishes a persistent and long-term connection with the server; the server uses this connection to send data to the client. ** If the client wants to send data to the server → requires another technology/protocol. (Diagram: data request using regular HTTP; always-open unidirectional communication channel from server to client; responses whenever new data is available.) → Best when we need real-time data from server to client, OR the server is generating data in a loop and will be sending multiple events to the client.
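What the server writes onto an SSE connection can be sketched as a generator of text frames. The slides do not show the wire format; the sketch assumes the standard `text/event-stream` framing (optional `event:` line, one `data:` line per line of payload, blank line as terminator), and the function names and sample readings are hypothetical.

```python
import json


def format_sse(data, event=None):
    """Build one event frame in text/event-stream framing."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    for line in data.splitlines() or [""]:
        lines.append(f"data: {line}")    # multi-line payloads repeat "data:"
    return "\n".join(lines) + "\n\n"     # blank line ends the event


def event_stream(readings):
    """Server generating data in a loop and sending one event per item."""
    for reading in readings:
        yield format_sse(json.dumps(reading), event="reading")


frames = list(event_stream([{"temp": 21}, {"temp": 22}]))
```

A web framework would stream these frames over a single long-lived HTTP response; the client only reads, which matches the unidirectional channel described on the slide.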