Distributed Systems + NodeJS
Bruno Bossola
TEL AVIV 30 MARCH 2017
@bbossola
Whoami
● Developer since 1988
● XP Coach 2000+
● Co-founder of JUG Torino
● Java Champion since 2005
● I live in London, love the weather...
Agenda
● Distributed programming
● How does it work, what does it mean
● The CAP theorem
● CAP explained with live demos!
– CA system using two phase commit
– AP system using sloppy quorums
– CP system using majority quorums
● Q&A
This is a 30-minute version of the original 120-minute presentation!
Timeline: 0 min., 10 mins., 20 mins.
Distributed programming
● Do we need it?
Distributed programming
● Any system should deal with two tasks:
– Storage
– Computation
● We will look here at the storage part of the equation. We need three basic properties:
– Scalability
– Availability
– Consistency
Scalability
● The ability of a system/network/process to:
– handle a growing amount of work
– be enlarged to accommodate new growth
A scalable system continues to meet the needs of its users as the scale increases
How do we scale? partitioning
● Slice the dataset into smaller independent sets
● Reduces the impact of dataset growth
– improves performance by limiting the amount of data to be examined
– improves availability by allowing partitions to fail independently
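A minimal sketch of the slicing idea, assuming simple hash partitioning over in-memory maps. The `PartitionedStore` class and the `fnv1a` helper are hypothetical names for illustration, not taken from the talk's code. Each key maps to exactly one partition, so a lookup only examines that partition's data.

```javascript
// FNV-1a style string hash: deterministic, returns an unsigned 32-bit integer.
function fnv1a(key) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Hypothetical store that slices the dataset into independent partitions.
class PartitionedStore {
  constructor(partitionCount) {
    this.partitions = Array.from({ length: partitionCount }, () => new Map());
  }

  // The same key always lands on the same partition.
  partitionFor(key) {
    return fnv1a(key) % this.partitions.length;
  }

  set(key, value) {
    this.partitions[this.partitionFor(key)].set(key, value);
  }

  get(key) {
    return this.partitions[this.partitionFor(key)].get(key);
  }
}

const store = new PartitionedStore(4);
store.set('alice', 1);
store.set('bob', 2);
console.log(store.get('alice'), 'lives on partition', store.partitionFor('alice'));
```

Plain modulo placement is the simplest choice here; it reshuffles most keys when the partition count changes, which is why real systems often prefer other placement schemes.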
How do we scale? partitioning
● But it can also be a source of problems
– what happens if a partition becomes unavailable?
– what if it becomes slower?
– what if it becomes unresponsive?
How do we scale? replication
● Copies of the same data on multiple machines
● Benefits:
– allows more servers to take part in the computation
– improves performance by making additional computing power and bandwidth available
– improves availability by creating copies of the data
How do we scale? replication
● But it's also a source of problems
– there are independent copies of the data
– they need to be kept in sync across multiple machines
● Your system must follow a consistency model
[Figure: replicas holding different versions of the same data (v4, v5, v7, v8)]
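To make the sync problem concrete, here is a small simulation, assuming three in-memory replicas of a single item tagged with a version number. The `Replica` class, the `write` and `sync` helpers, and the version numbers are illustrative assumptions, not the talk's code. A replica that is offline during a write keeps its old version until a later sync pass copies the newest one over.

```javascript
// One copy of a data item, tagged with the version it last saw.
class Replica {
  constructor(name) {
    this.name = name;
    this.version = 0;
    this.value = undefined;
    this.online = true;
  }

  // Apply an update unless the replica is offline or already newer.
  apply(version, value) {
    if (!this.online) return false; // the update never reaches this copy
    if (version > this.version) {
      this.version = version;
      this.value = value;
    }
    return true;
  }
}

const replicas = [new Replica('a'), new Replica('b'), new Replica('c')];

// Naive replication: send the update to every replica and hope it arrives.
function write(version, value) {
  replicas.forEach((r) => r.apply(version, value));
}

// Sync pass: copy the newest known version to every stale replica.
function sync() {
  const newest = replicas.reduce((a, b) => (b.version > a.version ? b : a));
  replicas.forEach((r) => r.apply(newest.version, newest.value));
}

write(4, 'v4');
replicas[2].online = false; // replica c misses the next update
write(8, 'v8');
replicas[2].online = true;

const versionsBefore = replicas.map((r) => r.version); // a and b at 8, c still at 4
sync();
const versionsAfter = replicas.map((r) => r.version); // all at 8
console.log(versionsBefore, versionsAfter);
```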
Availability
● The proportion of time a system is in a functioning condition
● The system is fault-tolerant
– the ability of your system to behave in a well defined manner once a fault
occurs
● All clients can always read and write
– In distributed systems this is achieved by redundancy
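As a worked example of "proportion of time", availability can be computed as uptime divided by total time. The second function shows why redundancy helps, under the simplifying assumption (mine, not the talk's) that replicas fail independently: the service is up as long as at least one of n replicas is up. Function names are hypothetical.

```javascript
// availability = uptime / (uptime + downtime)
function availability(uptimeHours, downtimeHours) {
  return uptimeHours / (uptimeHours + downtimeHours);
}

// Probability that at least one of n replicas is up, assuming each replica
// is up with probability a and failures are independent: 1 - (1 - a)^n
function redundantAvailability(a, n) {
  return 1 - Math.pow(1 - a, n);
}

const single = availability(99, 1);              // 99 hours up, 1 hour down
const doubled = redundantAvailability(single, 2); // two such replicas
const tripled = redundantAvailability(single, 3); // three such replicas
console.log(single, doubled, tripled);
```

With these numbers a single node is available 99% of the time, two replicas roughly 99.99%, and three roughly 99.9999%. Real failures are often correlated, so treat these figures as an upper bound.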
Consistency
● Any read on a data item X returns a value corresponding to the result of the
most recent write on X.
● Each client always has the same view of the data
● Also known as “Strong Consistency”
Consistency flavours
● Strong consistency
– every replica sees every update in the same order.
– no two replicas may have different values at the same time.
● Weak consistency
– every replica will see every update, but possibly in different orders.
● Eventual consistency
– every replica will eventually see every update and will eventually agree on
all values.
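A small sketch of how replicas that see updates in different orders can still end up agreeing. It assumes a last-write-wins rule keyed on a timestamp carried by each update, which is only one possible conflict resolution strategy and is not claimed to be the one used in the talk's demos. The `applyUpdate` function and the sample updates are hypothetical.

```javascript
// Keep an update only if it is newer than what the replica already holds.
function applyUpdate(state, update) {
  const current = state.get(update.key);
  if (!current || update.ts > current.ts) {
    state.set(update.key, { value: update.value, ts: update.ts });
  }
  return state;
}

const updates = [
  { key: 'x', value: 'v4', ts: 4 },
  { key: 'x', value: 'v8', ts: 8 },
  { key: 'x', value: 'v5', ts: 5 },
];

// Replica A sees the updates in one order, replica B in the reverse order.
const replicaA = updates.reduce(applyUpdate, new Map());
const replicaB = [...updates].reverse().reduce(applyUpdate, new Map());

console.log(replicaA.get('x').value, replicaB.get('x').value); // v8 v8
```

The sketch ignores ties between equal timestamps and assumes every update eventually reaches every replica; without that delivery guarantee the replicas would not converge.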
The CAP theorem
● Consistency
● Availability
● Partition tolerance
The CAP theorem
● You cannot have all three :(
● You can select only two properties at once
Sorry, this has been mathematically proven and no, it has not been debunked.
The CAP theorem
CA systems!
● You selected consistency and availability!
● Strict quorum protocols (two/multi phase commit)
● Most RDBMS
Hey! A network partition will f**k you up good!
The CAP theorem
AP systems!
● You selected availability and partition tolerance!
● Sloppy quorums and conflict resolution protocols
● Amazon Dynamo, Riak, Cassandra
The CAP theorem
CP systems!
● You selected consistency and partition tolerance!
● Majority quorum protocols (Paxos, Raft, ZAB)
● Apache ZooKeeper, Google Spanner
NodeJS time!
● Let's write our brand new key value store
● We will code all three different flavours
● We will have many nodes, fully replicated
● No sharding
● We will kill servers!
● We will trigger network partitions!
– (no worries, it's a simulation!)
CA key-value store
● Uses classic two-phase commit
● Works like a local system
● Not partition tolerant
● Classic RDBMS (strict quorum)
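The two-phase commit idea can be sketched in a few lines. This is an in-memory simulation under stated assumptions, not the code from the talk's repository: the `Participant` class, the `reachable` flag standing in for a network partition, and the `twoPhaseCommit` coordinator function are all hypothetical. In phase one every participant votes, in phase two the write is committed only if all votes were yes.

```javascript
class Participant {
  constructor(name) {
    this.name = name;
    this.reachable = true; // set to false to simulate a partition
    this.data = new Map();
    this.staged = null;
  }

  // Phase 1: stage the write and vote yes, or vote no if unreachable.
  prepare(key, value) {
    if (!this.reachable) return false;
    this.staged = { key, value };
    return true;
  }

  // Phase 2, success path: make the staged write visible.
  commit() {
    if (this.staged) this.data.set(this.staged.key, this.staged.value);
    this.staged = null;
  }

  // Phase 2, failure path: discard the staged write.
  abort() {
    this.staged = null;
  }
}

// Coordinator: commit only if every participant voted yes.
function twoPhaseCommit(participants, key, value) {
  const votes = participants.map((p) => p.prepare(key, value));
  const allYes = votes.every(Boolean);
  participants.forEach((p) => (allYes ? p.commit() : p.abort()));
  return allYes;
}

const nodes = [new Participant('n1'), new Participant('n2'), new Participant('n3')];
const first = twoPhaseCommit(nodes, 'k', 'v1');  // all nodes reachable
nodes[2].reachable = false;                      // simulate a network partition
const second = twoPhaseCommit(nodes, 'k', 'v2'); // one vote is missing
console.log(first, second, nodes[0].data.get('k')); // true false v1
```

The second write is rejected everywhere, so all copies stay identical, but the store stops accepting writes as soon as one node is cut off. That is the "not partition tolerant" bullet in practice.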
AP key-value store
● Eventually consistent design
● Prioritizes availability over consistency
● Dynamo, Riak, Cassandra (sloppy quorum)
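A sloppy quorum can be illustrated with a simplified simulation in the spirit of Dynamo's hinted handoff. Everything below is an assumption made for illustration (the `Node` class, `sloppyWrite`, `handoff`, the `up` flag), not the talk's implementation. When a preferred node is down, the write is parked on a fallback node with a hint naming the intended owner, and delivered once the owner is back.

```javascript
class Node {
  constructor(name) {
    this.name = name;
    this.up = true;
    this.data = new Map();
    this.hints = []; // writes held on behalf of unreachable nodes
  }
}

// Write to the preferred nodes; for each one that is down, park the write
// on a live fallback node. Succeeds once w acknowledgements are collected.
function sloppyWrite(preferred, fallbacks, w, key, value) {
  let acks = 0;
  const spare = fallbacks.filter((n) => n.up);
  for (const node of preferred) {
    if (node.up) {
      node.data.set(key, value);
      acks++;
    } else if (spare.length > 0) {
      spare.shift().hints.push({ owner: node, key, value });
      acks++;
    }
  }
  return acks >= w;
}

// Hinted handoff: deliver parked writes whose owner is reachable again.
function handoff(node) {
  node.hints = node.hints.filter((hint) => {
    if (!hint.owner.up) return true; // keep the hint for a later attempt
    hint.owner.data.set(hint.key, hint.value);
    return false;
  });
}

const [a, b, c, d] = ['a', 'b', 'c', 'd'].map((name) => new Node(name));
c.up = false; // one of the three preferred nodes is down
const accepted = sloppyWrite([a, b, c], [d], 3, 'k', 'v1');
const staleRead = c.data.get('k'); // undefined: c has not seen the write yet
c.up = true;
handoff(d);
console.log(accepted, staleRead, c.data.get('k')); // true undefined v1
```

The write is accepted even though a preferred node is missing, which keeps the store available, at the cost of `c` being temporarily out of date. The sketch carries no versions, so a real system would still need conflict resolution when the hint arrives.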
CP key-value store
● Uses majority quorum (raft)
● Guarantees eventual consistency
● ZooKeeper, Spanner (majority quorum)
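The core rule behind a majority quorum fits in a short sketch. Note that this is not Raft: there is no leader election, log, or term handling here, only the "refuse unless a majority is reachable" check, simulated in memory. The `QuorumStore` class, the `majority` helper, and the `up` flag are hypothetical names.

```javascript
// Smallest number of nodes that forms a majority of n.
function majority(n) {
  return Math.floor(n / 2) + 1;
}

class QuorumStore {
  constructor(nodeCount) {
    this.nodes = Array.from({ length: nodeCount }, () => ({ up: true, data: new Map() }));
  }

  // Accept a write only when a majority of all nodes is reachable.
  set(key, value) {
    const reachable = this.nodes.filter((n) => n.up);
    if (reachable.length < majority(this.nodes.length)) {
      return false; // refuse the write: consistency is preferred over availability
    }
    reachable.forEach((n) => n.data.set(key, value));
    return true;
  }
}

const kv = new QuorumStore(5);
const results = [];
results.push(kv.set('k', 'v1')); // 5 of 5 reachable
kv.nodes[3].up = false;
kv.nodes[4].up = false;
results.push(kv.set('k', 'v2')); // 3 of 5 reachable, still a majority
kv.nodes[2].up = false;
results.push(kv.set('k', 'v3')); // 2 of 5 reachable, write refused
console.log(majority(5), results); // 3 [ true, true, false ]
```

With five nodes the store tolerates two failures; on the minority side of a partition it refuses writes instead of letting the copies diverge.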
Q&A
Amazon Dynamo:
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
The RAFT consensus algorithm:
https://raft.github.io/
http://thesecretlivesofdata.com/raft/
The code used in this presentation:
https://github.com/bbossola/sysdist
