RESILIENT DISTRIBUTED DATASETS: A
FAULT-TOLERANT ABSTRACTION FOR
IN-MEMORY CLUSTER COMPUTING
MATEI ZAHARIA, MOSHARAF CHOWDHURY, TATHAGATA DAS, ANKUR DAVE, JUSTIN MA, MURPHY MCCAULEY,
MICHAEL J. FRANKLIN, SCOTT SHENKER, ION STOICA.
NSDI'12 PROCEEDINGS OF THE 9TH USENIX CONFERENCE ON NETWORKED SYSTEMS DESIGN AND IMPLEMENTATION
PAPERS WE LOVE AMSTERDAM
AUGUST 13, 2015
@gabriele_modena
(C) PRESENTATION BY GABRIELE MODENA, 2015
About me
• CS.ML
• Data science & predictive modelling
• with a sprinkle of systems work
• Hadoop & c. for data wrangling &
crunching numbers
• … and Spark
We present Resilient Distributed Datasets
(RDDs), a distributed memory abstraction
that lets programmers perform in-memory
computations on large clusters in a fault-
tolerant manner. RDDs are motivated by two
types of applications that current computing
frameworks handle inefficiently: iterative
algorithms and interactive data mining tools.
How
• Review (concepts from) key related work
• RDD + Spark
• Some critiques
Related work
• MapReduce
• Dryad
• Hadoop Distributed FileSystem (HDFS)
• Mesos
What’s an iterative algorithm anyway?

data = input data               # multiple input scans
w = <target vector>             # a shared data structure
for i in num_iterations:        # at each iteration, do something
    for item in data:
        update(w)               # update the shared data structure
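The pattern above can be made concrete with a small, runnable sketch: estimating the mean of a dataset by repeated gradient steps on the squared error. The names `data`, `w` and `update` mirror the pseudocode and are illustrative, not Spark API.

```python
# Illustrative iterative algorithm: every iteration scans the full input
# and updates a shared value w (here converging towards the data mean).

data = [2.0, 4.0, 6.0, 8.0]    # "input data", scanned once per iteration
w = 0.0                        # the shared state updated at every step
learning_rate = 0.05

def update(w, item):
    # one gradient step on the squared error (w - item)^2
    return w - learning_rate * 2 * (w - item)

for i in range(200):           # num_iterations
    for item in data:          # full scan of the input at each iteration
        w = update(w, item)

# w approaches the mean of the data (5.0), up to learning-rate noise
```

In a MapReduce setting each outer iteration would be a separate job re-reading the input from disk, which is exactly the inefficiency the paper targets.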
HDFS
• GFS paper (2003)
• Distributed storage (with replication)
• Block ops
• NameNode tracks file locations (blocks)
[Diagram: three DataNodes coordinated by a NameNode]
MapReduce
• Google paper (2004)
• Apache Hadoop (~2007)
• Divide-and-conquer functional model
• Goes hand in hand with HDFS
• Structure data as (key, value) pairs
1. Map(): filter and project; emit (k, v) pairs
2. Reduce(): aggregate and summarise; group by key and count
[Diagram: HDFS blocks => Map, Map, Map => Reduce, Reduce => HDFS]

Example, word count over the lines "This is a test", "Yes it is a test":
Map output: (This, 1), (is, 1), (a, 1), (test, 1), (Yes, 1), (it, 1), (is, 1), (a, 1), (test, 1)
Reduce output: (This, 1), (is, 2), (a, 2), (test, 2), (Yes, 1), (it, 1)
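The word-count flow above can be mimicked on a single machine with plain Python, to make the Map(), shuffle and Reduce() phases explicit. No Hadoop API is involved; this is only a sketch of the model.

```python
# Minimal single-machine sketch of MapReduce word count.
from itertools import groupby
from operator import itemgetter

lines = ["This is a test", "Yes it is a test"]

# Map(): project each record into (k, v) pairs
mapped = [(word, 1) for line in lines for word in line.split()]

# shuffle: group pairs by key (Hadoop sorts between map and reduce)
mapped.sort(key=itemgetter(0))

# Reduce(): aggregate each group, here by summing the counts
counts = {key: sum(v for _, v in group)
          for key, group in groupby(mapped, key=itemgetter(0))}
```

The framework's real work is in the shuffle step, which here is a local sort but in Hadoop means moving data across the network between map and reduce tasks.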
(c) Image from Apache Tez http://tez.apache.org
Critiques of MR and HDFS
• Great when records (and jobs) are independent
• In reality expect data to be shuffled across the
network
• Latency measured in minutes
• Performance hit for iterative methods
• Composability monsters
• Meant for batch workflows
Dryad
• Microsoft paper (2007)
• Inspired Apache Tez
• Generalisation of MapReduce via I/O
pipelining
• Applications are directed acyclic graphs
of tasks
Dryad
DAG dag = new DAG("WordCount");
dag.addVertex(tokenizerVertex)
   .addVertex(summerVertex)
   .addEdge(new Edge(tokenizerVertex,
                     summerVertex,
                     edgeConf.createDefaultEdgeProperty()));
MapReduce and Dryad
SELECT a.country, COUNT(b.place_id)
FROM place a JOIN tweets b ON (a.place_id = b.place_id)
GROUP BY a.country;
(c) Image from Apache Tez http://tez.apache.org. Modified.
Critiques of Dryad
• No explicit abstraction for data sharing
• Must express data representations as a DAG
• Partial solution: DryadLINQ
• No notion of a distributed filesystem
• How to handle large inputs?
• Local writes / remote reads?
Resilient Distributed Datasets
Read-only, partitioned collection of records
=> a distributed immutable array
Accessed via coarse-grained transformations
=> apply a function (a Scala closure) to all elements of the array
[Diagram: objects laid out across partitions]
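The two properties above can be sketched in plain Python: the collection is a list of partitions, and a coarse-grained transformation ships one function to every partition, producing a new collection instead of mutating elements in place. The names here (`partitions`, `map_partitions`) are illustrative, not the Spark API.

```python
# Sketch of coarse-grained transformations on a read-only,
# partitioned collection ("RDD" model, not Spark itself).

partitions = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # a dataset with 3 partitions

def map_partitions(parts, f):
    # coarse-grained: the same function f is applied to every element of
    # every partition; individual elements are never updated in place
    return [[f(x) for x in part] for part in parts]

doubled = map_partitions(partitions, lambda x: x * 2)

# the original collection is untouched (immutability)
```

Because the transformation is a single function applied everywhere, it is cheap to log (one closure) and to replay on a lost partition, which is what makes lineage-based fault tolerance practical.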
Spark
Runtime and API
• Transformations - lazily create RDDs
  wc = dataset.flatMap(tokenize)
              .reduceByKey(add)
• Actions - execute computation
  wc.collect()
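The split between lazy transformations and eager actions can be sketched in plain Python: transformations only record what to do, and the action walks the recorded chain. The class below mirrors the flatMap/reduceByKey/collect shape of the slide but is a toy, not Spark.

```python
# Toy lazy dataset: transformations build up a pipeline description,
# collect() is the only method that actually computes anything.

class LazyDataset:
    def __init__(self, data, ops=()):
        self.data, self.ops = data, ops

    # transformations: return a new LazyDataset, compute nothing yet
    def flat_map(self, f):
        return LazyDataset(self.data, self.ops + (("flat_map", f),))

    def reduce_by_key(self, f):
        return LazyDataset(self.data, self.ops + (("reduce_by_key", f),))

    # action: execute the recorded pipeline
    def collect(self):
        records = self.data
        for kind, f in self.ops:
            if kind == "flat_map":
                records = [y for x in records for y in f(x)]
            else:  # reduce_by_key
                acc = {}
                for k, v in records:
                    acc[k] = f(acc[k], v) if k in acc else v
                records = list(acc.items())
        return records

tokenize = lambda line: [(w, 1) for w in line.split()]
add = lambda a, b: a + b

wc = LazyDataset(["a b a"]).flat_map(tokenize).reduce_by_key(add)
# nothing has run yet; collect() triggers the computation
result = dict(wc.collect())
```

Laziness is what lets the scheduler see the whole chain of transformations at once and pipeline or reorder work before anything executes.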
Applications
• Driver code defines RDDs and invokes actions
• Submit to long-lived workers, that store partitions in memory
• Scala closures are serialised as Java objects and passed across the network over HTTP
• Variables bound to the closure are saved in the serialised object
• Closures are deserialised on each worker and applied to the RDD (partition)
• Mesos takes care of resource management
[Diagram: a Driver ships tasks to Workers; each Worker holds input data partitions in RAM and returns results]
Data persistence
1. In memory, as deserialised Java objects
2. In memory, as serialised data
3. On disk
RDD checkpointing
Memory management via an LRU eviction policy
.persist() an RDD for future reuse
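The LRU eviction policy mentioned above can be sketched with an OrderedDict standing in for the in-memory partition cache. Capacity, key names and the `PartitionCache` class are all illustrative, not Spark internals.

```python
# Sketch of LRU eviction for cached partitions.
from collections import OrderedDict

class PartitionCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()

    def get(self, key):
        if key not in self.cache:
            return None
        self.cache.move_to_end(key)           # mark as recently used
        return self.cache[key]

    def put(self, key, value):
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict least recently used

cache = PartitionCache(capacity=2)
cache.put("rdd1-part0", [1, 2])
cache.put("rdd1-part1", [3, 4])
cache.get("rdd1-part0")             # touch part0, making part1 the LRU entry
cache.put("rdd2-part0", [5, 6])     # over capacity: part1 is evicted
```

An evicted partition is not lost: it can always be recomputed from lineage, or read back from disk if persisted there.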
Lineage
lines = spark.textFile("hdfs://...")
errors = lines.filter(_.startsWith("ERROR"))
errors.persist()
errors.filter(_.contains("HDFS"))
      .map(_.split('\t')(3))
      .collect()

Lineage graph:
lines
  => filter(_.startsWith("ERROR")) => errors
  => filter(_.contains("HDFS")) => hdfs errors
  => map(_.split('\t')(3)) => time fields

Fault recovery
If a partition is lost, derive it back from the lineage.
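Fault recovery from lineage can be sketched in plain Python: each dataset records its parent and the transformation that produced it, so a lost partition is rebuilt by walking back to the source and re-applying each step. The graph, log lines and names below are illustrative, not Spark's internal classes.

```python
# Sketch of lineage-based recovery for the log-mining example above.

lines = ["ERROR\tdisk\tfull\t12:00", "INFO\tok", "ERROR\tHDFS\tdown\t13:30"]

# lineage recorded as name -> (parent, transformation)
lineage = {
    "errors":      ("lines",  lambda d: [l for l in d if l.startswith("ERROR")]),
    "hdfs_errors": ("errors", lambda d: [l for l in d if "HDFS" in l]),
    "time_fields": ("hdfs_errors", lambda d: [l.split("\t")[3] for l in d]),
}

def recompute(name):
    # walk the lineage back to the source and re-apply each transformation
    if name == "lines":
        return lines                 # stable storage (HDFS) is the root
    parent, f = lineage[name]
    return f(recompute(parent))

# if the "time_fields" partition is lost, derive it back from lineage
recovered = recompute("time_fields")
```

In real Spark a persisted intermediate (`errors.persist()`) would short-circuit this walk, so only the steps after the cached RDD are replayed.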
Representation
Challenge: track lineage across transformations.
Common RDD interface:
1. List of partitions
2. Data locality for a partition p
3. List of dependencies
4. Iterator function to compute a dataset based on its parents
5. Metadata for the partitioning scheme
Narrow dependencies
Allow pipelined execution on one cluster node.
Examples: map, filter, union
Wide dependencies
Require data from all parent partitions to be available and shuffled across the nodes with a MapReduce-like operation.
Examples: groupByKey, join with inputs not co-partitioned
Scheduling
Tasks are allocated based on data locality (delay scheduling)
1. An action is triggered => compute the RDD
2. Based on lineage, build a graph of stages to execute
3. Each stage contains as many pipelined
transformations with narrow dependencies as
possible
4. Launch tasks to compute missing partitions from
each stage until it has computed the target RDD
5. If a task fails => re-run it on another node, as long as
its stage’s parents are still available.
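Steps 2-3 above can be sketched in plain Python: walk the lineage graph back from the target RDD, pipeline narrow dependencies into the current stage, and cut a new stage at every wide dependency. This is a simplification (it cuts at every shuffle edge, so A and B land in separate stages, whereas Spark's figure draws the shuffle producer and its output in one stage); the graph and names are illustrative.

```python
# Sketch of stage construction: narrow deps are pipelined into the same
# stage, wide deps (shuffles) start a new stage.

# each RDD: list of (parent, "narrow" | "wide") dependencies
deps = {
    "A": [],
    "B": [("A", "wide")],                       # groupBy
    "C": [],
    "D": [("C", "narrow")],                     # map
    "E": [],
    "F": [("D", "narrow"), ("E", "narrow")],    # union
    "G": [("B", "wide"), ("F", "wide")],        # join
}

def build_stages(target, stages=None, stage=None):
    if stages is None:
        stage = []
        stages = [stage]
    stage.append(target)
    for parent, kind in deps[target]:
        if kind == "narrow":
            build_stages(parent, stages, stage)      # pipelined: same stage
        else:
            new_stage = []
            stages.append(new_stage)
            build_stages(parent, stages, new_stage)  # shuffle: cut a stage
    return stages

stages = build_stages("G")
# C, D, E and F are pipelined into one stage; each wide edge cuts a new one
```

Note how the whole map/union chain (C, D, E, F) collapses into a single stage: that pipelining is where Spark avoids materialising intermediate results.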
Job execution
B = A.groupBy
D = C.map
F = D.union(E)
G = B.join(F)
G.collect()

[Diagram: A => groupBy => B; C => map => D; D, E => union => F; B, F => join => G.
Stage 1 computes B from A; Stage 2 computes F from C, D and E; Stage 3 joins B and F into G.]
Evaluation
Some critiques (of the paper)
• How general is this approach?
• We are still doing MapReduce
• Concerns wrt iterative algorithms still stand
• CPU-bound workloads?
• Linear algebra?
• How much tuning is required?
• How does the partitioner work?
• What is the cost of reconstructing an RDD from lineage?
• Performance when data does not fit in memory
• E.g. a join between two very large non-co-partitioned RDDs
References (Theory)
Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. Zaharia et al., Proceedings of NSDI'12. https://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf
Spark: cluster computing with working sets. Zaharia et al., Proceedings of HotCloud'10. http://people.csail.mit.edu/matei/papers/2010/hotcloud_spark.pdf
The Google File System. Ghemawat, Gobioff, Leung. 19th ACM Symposium on Operating Systems Principles, 2003. http://research.google.com/archive/gfs.html
MapReduce: Simplified Data Processing on Large Clusters. Dean, Ghemawat. OSDI'04: Sixth Symposium on Operating System Design and Implementation. http://research.google.com/archive/mapreduce.html
Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks. Isard, Budiu, Yu, Birrell, Fetterly. European Conference on Computer Systems (EuroSys), Lisbon, Portugal, March 21-23, 2007. http://research.microsoft.com/en-us/projects/dryad/eurosys07.pdf
Mesos: a platform for fine-grained resource sharing in the data center. Hindman et al., Proceedings of NSDI'11. https://www.cs.berkeley.edu/~alig/papers/mesos.pdf
References (Practice)
• An overview of the pyspark API through pictures. https://github.com/jkthompson/pyspark-pictures
• Barry Brumitt’s presentation on MapReduce design patterns (UW CSE490). http://courses.cs.washington.edu/courses/cse490h/08au/lectures/MapReduceDesignPatterns-UW2.pdf
• The Dryad Project. http://research.microsoft.com/en-us/projects/dryad/
• Apache Spark. http://spark.apache.org
• Apache Hadoop. https://hadoop.apache.org
• Apache Tez. https://tez.apache.org
• Apache Mesos. http://mesos.apache.org
More Related Content

What's hot (20)

PDF
Resilient Distributed Datasets
Alessandro Menabò
 
PDF
Interactive Graph Analytics with Spark-(Daniel Darabos, Lynx Analytics)
Spark Summit
 
PPTX
Beyond Hadoop 1.0: A Holistic View of Hadoop YARN, Spark and GraphLab
Vijay Srinivas Agneeswaran, Ph.D
 
PDF
Big data distributed processing: Spark introduction
Hektor Jacynycz García
 
DOCX
Big data processing using - Hadoop Technology
Shital Kat
 
PDF
Hadoop ensma poitiers
Rim Moussa
 
PPTX
Spark & Cassandra at DataStax Meetup on Jan 29, 2015
Sameer Farooqui
 
PDF
Python in an Evolving Enterprise System (PyData SV 2013)
PyData
 
PPTX
Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015
Deanna Kosaraju
 
PPTX
MATLAB, netCDF, and OPeNDAP
The HDF-EOS Tools and Information Center
 
PDF
A sql implementation on the map reduce framework
eldariof
 
PDF
Boston Spark Meetup event Slides Update
vithakur
 
PPT
BDAS RDD study report v1.2
Stefanie Zhao
 
PPTX
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...
Xiao Qin
 
PPTX
Spark 计算模型
wang xing
 
PDF
Large Scale Math with Hadoop MapReduce
Hortonworks
 
PDF
Hot-Spot analysis Using Apache Spark framework
Supriya .
 
PPTX
A 3 dimensional data model in hbase for large time-series dataset-20120915
Dan Han
 
PPTX
Presentation sreenu dwh-services
Sreenu Musham
 
Resilient Distributed Datasets
Alessandro Menabò
 
Interactive Graph Analytics with Spark-(Daniel Darabos, Lynx Analytics)
Spark Summit
 
Beyond Hadoop 1.0: A Holistic View of Hadoop YARN, Spark and GraphLab
Vijay Srinivas Agneeswaran, Ph.D
 
Big data distributed processing: Spark introduction
Hektor Jacynycz García
 
Big data processing using - Hadoop Technology
Shital Kat
 
Hadoop ensma poitiers
Rim Moussa
 
Spark & Cassandra at DataStax Meetup on Jan 29, 2015
Sameer Farooqui
 
Python in an Evolving Enterprise System (PyData SV 2013)
PyData
 
Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015
Deanna Kosaraju
 
MATLAB, netCDF, and OPeNDAP
The HDF-EOS Tools and Information Center
 
A sql implementation on the map reduce framework
eldariof
 
Boston Spark Meetup event Slides Update
vithakur
 
BDAS RDD study report v1.2
Stefanie Zhao
 
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...
Xiao Qin
 
Spark 计算模型
wang xing
 
Large Scale Math with Hadoop MapReduce
Hortonworks
 
Hot-Spot analysis Using Apache Spark framework
Supriya .
 
A 3 dimensional data model in hbase for large time-series dataset-20120915
Dan Han
 
Presentation sreenu dwh-services
Sreenu Musham
 

Viewers also liked (20)

PDF
Spark in 15 min
Christophe Marchal
 
PPTX
Apache Spark Components
Girish Khanzode
 
PDF
Spark fundamentals i (bd095 en) version #1: updated: april 2015
Ashutosh Sonaliya
 
PDF
Unikernels: in search of a killer app and a killer ecosystem
rhatr
 
PDF
Type Checking Scala Spark Datasets: Dataset Transforms
John Nestor
 
PDF
New Analytics Toolbox DevNexus 2015
Robbie Strickland
 
PDF
臺灣高中數學講義 - 第一冊 - 數與式
Xuan-Chao Huang
 
PPTX
Think Like Spark: Some Spark Concepts and a Use Case
Rachel Warren
 
PDF
Apache Spark: killer or savior of Apache Hadoop?
rhatr
 
PPTX
Apache Spark Introduction @ University College London
Vitthal Gogate
 
PPTX
Think Like Spark
Alpine Data
 
PDF
Hadoop Spark Introduction-20150130
Xuan-Chao Huang
 
PDF
Hadoop to spark_v2
elephantscale
 
PDF
Escape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Piotr Kolaczkowski
 
PDF
What’s New in Spark 2.0: Structured Streaming and Datasets - StampedeCon 2016
StampedeCon
 
PPTX
Intro to Spark development
Spark Summit
 
PDF
Beneath RDD in Apache Spark by Jacek Laskowski
Spark Summit
 
PPT
Apache Spark Introduction and Resilient Distributed Dataset basics and deep dive
Sachin Aggarwal
 
PDF
2016 Spark Summit East Keynote: Matei Zaharia
Databricks
 
PDF
Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0
Databricks
 
Spark in 15 min
Christophe Marchal
 
Apache Spark Components
Girish Khanzode
 
Spark fundamentals i (bd095 en) version #1: updated: april 2015
Ashutosh Sonaliya
 
Unikernels: in search of a killer app and a killer ecosystem
rhatr
 
Type Checking Scala Spark Datasets: Dataset Transforms
John Nestor
 
New Analytics Toolbox DevNexus 2015
Robbie Strickland
 
臺灣高中數學講義 - 第一冊 - 數與式
Xuan-Chao Huang
 
Think Like Spark: Some Spark Concepts and a Use Case
Rachel Warren
 
Apache Spark: killer or savior of Apache Hadoop?
rhatr
 
Apache Spark Introduction @ University College London
Vitthal Gogate
 
Think Like Spark
Alpine Data
 
Hadoop Spark Introduction-20150130
Xuan-Chao Huang
 
Hadoop to spark_v2
elephantscale
 
Escape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Piotr Kolaczkowski
 
What’s New in Spark 2.0: Structured Streaming and Datasets - StampedeCon 2016
StampedeCon
 
Intro to Spark development
Spark Summit
 
Beneath RDD in Apache Spark by Jacek Laskowski
Spark Summit
 
Apache Spark Introduction and Resilient Distributed Dataset basics and deep dive
Sachin Aggarwal
 
2016 Spark Summit East Keynote: Matei Zaharia
Databricks
 
Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0
Databricks
 
Ad

Similar to Resilient Distributed Datasets (20)

PPTX
Study Notes: Apache Spark
Gao Yunzhong
 
PDF
Introduction to Apache Spark
Vincent Poncet
 
PPT
Spark training-in-bangalore
Kelly Technologies
 
PDF
Spark cluster computing with working sets
JinxinTang
 
PPTX
Apache Spark
SugumarSarDurai
 
PPT
11. From Hadoop to Spark 1:2
Fabio Fumarola
 
PDF
Spark
newmooxx
 
PPTX
Apache spark - History and market overview
Martin Zapletal
 
PDF
IRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET Journal
 
PDF
Introduction to apache spark
Muktadiur Rahman
 
PDF
Introduction to apache spark
JUGBD
 
PPTX
2016-07-21-Godil-presentation.pptx
D21CE161GOSWAMIPARTH
 
PDF
Apache Spark and DataStax Enablement
Vincent Poncet
 
PPTX
Real time hadoop + mapreduce intro
Geoff Hendrey
 
PDF
20140614 introduction to spark-ben white
Data Con LA
 
PDF
BDM25 - Spark runtime internal
David Lauzon
 
DOCX
Hadoop Seminar Report
Atul Kushwaha
 
PDF
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Trieu Nguyen
 
PDF
Introduction to Spark Training
Spark Summit
 
Study Notes: Apache Spark
Gao Yunzhong
 
Introduction to Apache Spark
Vincent Poncet
 
Spark training-in-bangalore
Kelly Technologies
 
Spark cluster computing with working sets
JinxinTang
 
Apache Spark
SugumarSarDurai
 
11. From Hadoop to Spark 1:2
Fabio Fumarola
 
Spark
newmooxx
 
Apache spark - History and market overview
Martin Zapletal
 
IRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET Journal
 
Introduction to apache spark
Muktadiur Rahman
 
Introduction to apache spark
JUGBD
 
2016-07-21-Godil-presentation.pptx
D21CE161GOSWAMIPARTH
 
Apache Spark and DataStax Enablement
Vincent Poncet
 
Real time hadoop + mapreduce intro
Geoff Hendrey
 
20140614 introduction to spark-ben white
Data Con LA
 
BDM25 - Spark runtime internal
David Lauzon
 
Hadoop Seminar Report
Atul Kushwaha
 
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Trieu Nguyen
 
Introduction to Spark Training
Spark Summit
 
Ad

Recently uploaded (20)

PDF
Using AI/ML for Space Biology Research
VICTOR MAESTRE RAMIREZ
 
PPTX
SHREYAS25 INTERN-I,II,III PPT (1).pptx pre
swapnilherage
 
PDF
1750162332_Snapshot-of-Indias-oil-Gas-data-May-2025.pdf
sandeep718278
 
PDF
InformaticsPractices-MS - Google Docs.pdf
seshuashwin0829
 
PPTX
Aict presentation on dpplppp sjdhfh.pptx
vabaso5932
 
PPTX
What Is Data Integration and Transformation?
subhashenia
 
PDF
apidays Singapore 2025 - From API Intelligence to API Governance by Harsha Ch...
apidays
 
PPTX
04_Tamás Marton_Intuitech .pptx_AI_Barometer_2025
FinTech Belgium
 
PPTX
Comparative Study of ML Techniques for RealTime Credit Card Fraud Detection S...
Debolina Ghosh
 
PPTX
办理学历认证InformaticsLetter新加坡英华美学院毕业证书,Informatics成绩单
Taqyea
 
PDF
Data Science Course Certificate by Sigma Software University
Stepan Kalika
 
PDF
Business implication of Artificial Intelligence.pdf
VishalChugh12
 
PPTX
apidays Singapore 2025 - The Quest for the Greenest LLM , Jean Philippe Ehre...
apidays
 
PDF
UNISE-Operation-Procedure-InDHIS2trainng
ahmedabduselam23
 
PDF
apidays Singapore 2025 - Building a Federated Future, Alex Szomora (GSMA)
apidays
 
PDF
Research Methodology Overview Introduction
ayeshagul29594
 
PPTX
01_Nico Vincent_Sailpeak.pptx_AI_Barometer_2025
FinTech Belgium
 
PPTX
big data eco system fundamentals of data science
arivukarasi
 
PDF
Technical-Report-GPS_GIS_RS-for-MSF-finalv2.pdf
KPycho
 
PDF
A GraphRAG approach for Energy Efficiency Q&A
Marco Brambilla
 
Using AI/ML for Space Biology Research
VICTOR MAESTRE RAMIREZ
 
SHREYAS25 INTERN-I,II,III PPT (1).pptx pre
swapnilherage
 
1750162332_Snapshot-of-Indias-oil-Gas-data-May-2025.pdf
sandeep718278
 
InformaticsPractices-MS - Google Docs.pdf
seshuashwin0829
 
Aict presentation on dpplppp sjdhfh.pptx
vabaso5932
 
What Is Data Integration and Transformation?
subhashenia
 
apidays Singapore 2025 - From API Intelligence to API Governance by Harsha Ch...
apidays
 
04_Tamás Marton_Intuitech .pptx_AI_Barometer_2025
FinTech Belgium
 
Comparative Study of ML Techniques for RealTime Credit Card Fraud Detection S...
Debolina Ghosh
 
办理学历认证InformaticsLetter新加坡英华美学院毕业证书,Informatics成绩单
Taqyea
 
Data Science Course Certificate by Sigma Software University
Stepan Kalika
 
Business implication of Artificial Intelligence.pdf
VishalChugh12
 
apidays Singapore 2025 - The Quest for the Greenest LLM , Jean Philippe Ehre...
apidays
 
UNISE-Operation-Procedure-InDHIS2trainng
ahmedabduselam23
 
apidays Singapore 2025 - Building a Federated Future, Alex Szomora (GSMA)
apidays
 
Research Methodology Overview Introduction
ayeshagul29594
 
01_Nico Vincent_Sailpeak.pptx_AI_Barometer_2025
FinTech Belgium
 
big data eco system fundamentals of data science
arivukarasi
 
Technical-Report-GPS_GIS_RS-for-MSF-finalv2.pdf
KPycho
 
A GraphRAG approach for Energy Efficiency Q&A
Marco Brambilla
 

Resilient Distributed Datasets

  • 1. RESILIENT DISTRIBUTED DATASETS: A FAULT-TOLERANT ABSTRACTION FOR IN-MEMORY CLUSTER COMPUTING MATEI ZAHARIA, MOSHARAF CHOWDHURY, TATHAGATA DAS, ANKUR DAVE, JUSTIN MA, MURPHY MCCAULEY, MICHAEL J. FRANKLIN, SCOTT SHENKER, ION STOICA. NSDI'12 PROCEEDINGS OF THE 9TH USENIX CONFERENCE ON NETWORKED SYSTEMS DESIGN AND IMPLEMENTATION PAPERS WE LOVE AMSTERDAM AUGUST 13, 2015 @gabriele_modena
  • 2. (C) PRESENTATION BY GABRIELE MODENA, 2015 About me • CS.ML • Data science & predictive modelling • with a sprinkle of systems work • Hadoop & c. for data wrangling & crunching numbers • … and Spark
  • 3. (C) PRESENTATION BY GABRIELE MODENA, 2015
  • 4. (C) PRESENTATION BY GABRIELE MODENA, 2015 We present Resilient Distributed Datasets (RDDs), a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault- tolerant manner. RDDs are motivated by two types of applications that current computing frameworks handle inefficiently: iterative algorithms and interactive data mining tools.
  • 5. (C) PRESENTATION BY GABRIELE MODENA, 2015 How • Review (concepts from) key related work • RDD + Spark • Some critiques
  • 6. (C) PRESENTATION BY GABRIELE MODENA, 2015 Related work • MapReduce • Dryad • Hadoop Distributed FileSystem (HDFS) • Mesos
  • 7. (C) PRESENTATION BY GABRIELE MODENA, 2015 What’s an iterative algorithm anyway? data = input data w = <target vector> for i in num_iterations: for item in data: update(w) Multiple input scans At each iteration, do something Update a shared data structure
  • 8. (C) PRESENTATION BY GABRIELE MODENA, 2015 HDFS • GFS paper (2003) • Distributed storage (with replication) • Block ops • NameNode hashes file locations (blocks) Data Node Data Node Data Node Name Node
  • 9. (C) PRESENTATION BY GABRIELE MODENA, 2015 HDFS • GFS paper (2003) • Distributed storage (with replication) • Block ops • NameNode hashes file locations (blocks) Data Node Data Node Data Node Name Node
  • 10. (C) PRESENTATION BY GABRIELE MODENA, 2015 HDFS • GFS paper (2003) • Distributed storage (with replication) • Block ops • NameNode hashes file locations (blocks) Data Node Data Node Data Node Name Node
  • 11. (C) PRESENTATION BY GABRIELE MODENA, 2015 MapReduce • Google paper (2004) • Apache Hadoop (~2007) • Divide and conquer functional model • Goes hand-in-hand with HDFS • Structure data as (key, value) 1. Map(): filter and project emit (k, v) pairs 2. Reduce(): aggregate and summarise group by key and count Map Map Map Reduce Reduce HDFS (blocks) HDFS
  • 12. (C) PRESENTATION BY GABRIELE MODENA, 2015 MapReduce • Google paper (2004) • Apache Hadoop (~2007) • Divide and conquer functional model • Goes hand-in-hand with HDFS • Structure data as (key, value) 1. Map(): filter and project emit (k, v) pairs 2. Reduce(): aggregate and summarise group by key and count Map Map Map Reduce Reduce HDFS (blocks) HDFS This is a test Yes it is a test …
  • 13. (C) PRESENTATION BY GABRIELE MODENA, 2015 MapReduce • Google paper (2004) • Apache Hadoop (~2007) • Divide and conquer functional model • Goes hand-in-hand with HDFS • Structure data as (key, value) 1. Map(): filter and project emit (k, v) pairs 2. Reduce(): aggregate and summarise group by key and count Map Map Map Reduce Reduce HDFS (blocks) HDFS This is a test Yes it is a test … (This,1), (is, 1), (a, 1), (test., 1), (Yes, 1), (it, 1), (is, 1)
  • 14. (C) PRESENTATION BY GABRIELE MODENA, 2015 MapReduce • Google paper (2004) • Apache Hadoop (~2007) • Divide and conquer functional model • Goes hand-in-hand with HDFS • Structure data as (key, value) 1. Map(): filter and project emit (k, v) pairs 2. Reduce(): aggregate and summarise group by key and count Map Map Map Reduce Reduce HDFS (blocks) HDFS This is a test Yes it is a test … (This,1), (is, 1), (a, 1), (test., 1), (Yes, 1), (it, 1), (is, 1) (This, 1), (is, 2), (a, 2), (test, 2), (Yes, 1), (it, 1)
  • 15. (C) PRESENTATION BY GABRIELE MODENA, 2015 (c) Image from Apache Tez http://tez.apache.org
  • 16. (C) PRESENTATION BY GABRIELE MODENA, 2015 Critiques of MR and HDFS • Great when records (and jobs) are independent • In reality, expect data to be shuffled across the network • Latency measured in minutes • Performance hit for iterative methods • Composability monsters • Meant for batch workflows
  • 17. (C) PRESENTATION BY GABRIELE MODENA, 2015 Dryad • Microsoft paper (2007) • Inspired Apache Tez • Generalisation of MapReduce via I/O pipelining • Applications are (directed acyclic) graphs of tasks
  • 18. (C) PRESENTATION BY GABRIELE MODENA, 2015 Dryad DAG

DAG dag = new DAG("WordCount");
dag.addVertex(tokenizerVertex)
   .addVertex(summerVertex)
   .addEdge(new Edge(tokenizerVertex,
                     summerVertex,
                     edgeConf.createDefaultEdgeProperty()));
  • 19. (C) PRESENTATION BY GABRIELE MODENA, 2015 MapReduce and Dryad SELECT a.country, COUNT(b.place_id) FROM place a JOIN tweets b ON (a.place_id = b.place_id) GROUP BY a.country; (c) Image from Apache Tez http://tez.apache.org. Modified.
  • 20. (C) PRESENTATION BY GABRIELE MODENA, 2015 Critiques of Dryad • No explicit abstraction for data sharing • Must express the computation as a DAG • Partial solution: DryadLINQ • No notion of a distributed filesystem • How to handle large inputs? • Local writes / remote reads?
  • 21. (C) PRESENTATION BY GABRIELE MODENA, 2015 Resilient Distributed Datasets
Read-only, partitioned collection of records
=> a distributed immutable array
accessed via coarse-grained transformations
=> apply a function (Scala closure) to all elements of the array
[diagram: objects grouped into partitions]
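The "distributed immutable array" idea can be sketched in a few lines of plain Python (a hypothetical mini-RDD, not Spark's API): a read-only list of partitions, where a coarse-grained transformation applies one closure to every element of every partition and yields a new dataset instead of mutating the old one.

```python
class MiniRDD:
    """Toy stand-in for an RDD: an immutable, partitioned collection."""

    def __init__(self, partitions):
        # Tuples make the partitions effectively read-only.
        self.partitions = [tuple(p) for p in partitions]

    def map(self, f):
        # Coarse-grained transformation: one function over ALL elements,
        # returning a NEW dataset; the original is never mutated.
        return MiniRDD([[f(x) for x in p] for p in self.partitions])

    def collect(self):
        return [x for p in self.partitions for x in p]

rdd = MiniRDD([[1, 2], [3, 4], [5, 6]])
doubled = rdd.map(lambda x: x * 2)
print(doubled.collect())  # [2, 4, 6, 8, 10, 12]
print(rdd.collect())      # original unchanged: [1, 2, 3, 4, 5, 6]
```

Coarse granularity is what makes lineage cheap to log: one record per transformation, not one per element.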
  • 23. (C) PRESENTATION BY GABRIELE MODENA, 2015 Spark Runtime and API
• Transformations - lazily create RDDs
wc = dataset.flatMap(tokenize)
            .reduceByKey(add)
• Actions - execute computation
wc.collect()
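The transformation/action split can be illustrated with a minimal lazy pipeline in plain Python (a hypothetical sketch, not Spark's implementation): transformations only record what to do; nothing runs until an action forces the chain.

```python
class LazyDataset:
    """Toy lazy pipeline: ops are recorded, not executed."""

    def __init__(self, data, ops=()):
        self.data, self.ops = data, ops

    def flat_map(self, f):
        # Transformation: just remember f; no data is touched yet.
        return LazyDataset(self.data, self.ops + (("flat_map", f),))

    def collect(self):
        # Action: now actually run the recorded chain.
        out = list(self.data)
        for kind, f in self.ops:
            out = [y for x in out for y in f(x)]
        return out

ds = LazyDataset(["a b", "b c"]).flat_map(str.split)
# No work has happened yet; the action triggers it:
print(ds.collect())  # ['a', 'b', 'b', 'c']
```

Laziness lets the scheduler see the whole chain before running it, which is what enables pipelining narrow dependencies into a single stage.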
  • 30. (C) PRESENTATION BY GABRIELE MODENA, 2015 Applications • Driver code defines RDDs and invokes actions • Submit to long-lived workers, that store partitions in memory • Scala closures are serialised as Java objects and passed across the network over HTTP • Variables bound to the closure are saved in the serialised object • Closures are deserialised on each worker and applied to the RDD (partition) • Mesos takes care of resource management [diagram: driver sending tasks to workers that hold input-data partitions in RAM and return results]
  • 31. (C) PRESENTATION BY GABRIELE MODENA, 2015 Data persistence 1. in memory as deserialized Java objects 2. in memory as serialized data 3. on disk • RDD checkpointing • Memory management via LRU eviction policy • .persist() an RDD for future reuse
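The LRU eviction policy for cached partitions can be sketched with a small cache in plain Python (a hypothetical model; capacity is counted in partitions here for simplicity, not bytes):

```python
from collections import OrderedDict

class PartitionCache:
    """Toy LRU cache: the least recently used partition is evicted
    when the cache is full; a miss means recomputing from lineage."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # partition_id -> data

    def get(self, pid):
        if pid in self.cache:
            self.cache.move_to_end(pid)  # mark as recently used
            return self.cache[pid]
        return None  # miss: caller recomputes from lineage

    def put(self, pid, data):
        self.cache[pid] = data
        self.cache.move_to_end(pid)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used

cache = PartitionCache(capacity=2)
cache.put("p0", [1]); cache.put("p1", [2])
cache.get("p0")           # touch p0, so p1 becomes the LRU entry
cache.put("p2", [3])      # evicts p1
print(list(cache.cache))  # ['p0', 'p2']
```

An eviction is not a failure: the dropped partition can always be rebuilt from its lineage on the next access.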
  • 32. (C) PRESENTATION BY GABRIELE MODENA, 2015 Lineage
lines = spark.textFile("hdfs://...")
errors = lines.filter(_.startsWith("ERROR"))
errors.persist()
errors.filter(_.contains("HDFS"))
      .map(_.split('\t')(3))
      .collect()
  • 39. (C) PRESENTATION BY GABRIELE MODENA, 2015 Lineage
lines = spark.textFile("hdfs://...")
errors = lines.filter(_.startsWith("ERROR"))
errors.persist()
errors.filter(_.contains("HDFS"))
      .map(_.split('\t')(3))
      .collect()
[lineage graph: lines -> filter(_.startsWith("ERROR")) -> errors -> filter(_.contains("HDFS")) -> hdfs errors -> map(_.split('\t')(3)) -> time fields]
  • 42. (C) PRESENTATION BY GABRIELE MODENA, 2015 Lineage Fault recovery: if a partition is lost, derive it back from the lineage
lines = spark.textFile("hdfs://...")
errors = lines.filter(_.startsWith("ERROR"))
errors.persist()
errors.filter(_.contains("HDFS"))
      .map(_.split('\t')(3))
      .collect()
[lineage graph: lines -> errors -> hdfs errors -> time fields]
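Lineage-based recovery can be sketched in plain Python (a hypothetical model, not Spark's internals): each dataset remembers its parent and the closure that produced it, so a lost partition is recomputed rather than restored from a replica.

```python
class LineageRDD:
    """Toy RDD that can rebuild a lost partition from its lineage."""

    def __init__(self, partitions=None, parent=None, fn=None):
        self.partitions = partitions  # None => not materialized
        self.parent, self.fn = parent, fn

    def map(self, fn):
        # Record the lineage edge; compute nothing yet.
        return LineageRDD(parent=self, fn=fn)

    def compute(self, i):
        if self.partitions is not None and self.partitions[i] is not None:
            return self.partitions[i]
        # Partition lost (or never built): re-derive it from the parent.
        return [self.fn(x) for x in self.parent.compute(i)]

base = LineageRDD(partitions=[[1, 2], [3, 4]])
derived = base.map(lambda x: x * 10)
derived.partitions = [None, None]  # simulate losing both partitions
print(derived.compute(1))          # recomputed from lineage: [30, 40]
```

Only the lost partition is recomputed, and independent partitions can be rebuilt in parallel on different nodes.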
  • 43. (C) PRESENTATION BY GABRIELE MODENA, 2015 Representation Challenge: track lineage across transformations 1. Set of partitions 2. Data locality for partition p 3. List of dependencies on parent RDDs 4. Iterator function to compute a dataset based on its parents 5. Metadata for the partitioning scheme
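The five-part interface above can be written down as an abstract class (a sketch in Python with illustrative method names, not Spark's internal Scala API):

```python
class AbstractRDD:
    """Sketch of the common interface every RDD exposes to the scheduler."""

    def partitions(self):
        """1. The set of partitions (atomic pieces of the dataset)."""
        raise NotImplementedError

    def preferred_locations(self, p):
        """2. Nodes where partition p can be accessed faster (data locality)."""
        return []

    def dependencies(self):
        """3. The list of dependencies on parent RDDs."""
        return []

    def iterator(self, p, parent_iters):
        """4. Compute the elements of partition p given its parents."""
        raise NotImplementedError

    def partitioner(self):
        """5. Metadata on the partitioning scheme (e.g. hash-partitioned)."""
        return None
```

The scheduler never needs to know what a concrete RDD does; these five methods are enough to place tasks, pipeline narrow dependencies, and rebuild lost partitions.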
  • 44. (C) PRESENTATION BY GABRIELE MODENA, 2015 Narrow dependencies: pipelined execution on one cluster node (e.g. map, filter, union)
  • 45. (C) PRESENTATION BY GABRIELE MODENA, 2015 Wide dependencies: require data from all parent partitions to be available and shuffled across the nodes using a MapReduce-like operation (e.g. groupByKey, join with inputs not co-partitioned)
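The shuffle behind a wide dependency can be sketched in plain Python (a toy model, not Spark's shuffle machinery): every input partition contributes records to every output partition, routed by a hash partitioner on the key.

```python
def shuffle_group_by_key(input_partitions, num_output_partitions):
    """Toy groupByKey: route each (key, value) pair to the output
    partition chosen by hashing the key, grouping values per key."""
    outputs = [{} for _ in range(num_output_partitions)]
    for part in input_partitions:  # data from ALL parents is needed
        for key, value in part:
            bucket = outputs[hash(key) % num_output_partitions]
            bucket.setdefault(key, []).append(value)
    return outputs

parts = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]
grouped = shuffle_group_by_key(parts, 2)
# All values for "a" end up together in exactly one output partition.
merged = {}
for d in grouped:
    merged.update(d)
print(merged)  # {'a': [1, 3], 'b': [2], 'c': [4]} (split across 2 partitions)
```

Because every output partition may need records from every input partition, a wide dependency forces a stage boundary in the scheduler.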
  • 46. (C) PRESENTATION BY GABRIELE MODENA, 2015 Scheduling Tasks are allocated based on data locality (delay scheduling) 1. An action is triggered => compute the RDD 2. Based on lineage, build a graph of stages to execute 3. Each stage contains as many pipelined transformations with narrow dependencies as possible 4. Launch tasks to compute missing partitions from each stage until the target RDD is computed 5. If a task fails => re-run it on another node, as long as its stage's parents are still available
  • 48. (C) PRESENTATION BY GABRIELE MODENA, 2015 Job execution union map groupBy join B C D E F G Stage 3Stage 2 A Stage 1 B = A.groupBy D = C.map F = D.union(E) G = B.join(F) G.collect()
  • 59. (C) PRESENTATION BY GABRIELE MODENA, 2015 Job execution map C union D E join B F G Stage 3Stage 2 groupBy A Stage 1 B = A.groupBy D = C.map F = D.union(E) G = B.join(F) G.collect()
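Stage construction on this DAG can be sketched in plain Python (a simplified model, not Spark's DAGScheduler: walk the lineage graph backwards from the target RDD and cut a new stage at every wide/shuffle edge, keeping narrow dependencies pipelined in one stage; the grouping of shuffle outputs differs slightly from how the figure draws its stage boxes):

```python
def build_stages(rdd_deps, wide, target):
    """rdd_deps: rdd -> list of parent rdds; wide: set of (parent, child)
    edges that require a shuffle. Yields stages, innermost first."""
    stage, frontier = [], [target]
    while frontier:
        rdd = frontier.pop()
        stage.append(rdd)
        for parent in rdd_deps.get(rdd, []):
            if (parent, rdd) in wide:
                # Shuffle boundary: the parent belongs to an earlier stage.
                yield from build_stages(rdd_deps, wide, parent)
            else:
                # Narrow dependency: pipeline the parent into this stage.
                frontier.append(parent)
    yield stage

# The DAG from the slides: groupBy and join are the wide (shuffle) edges.
deps = {"B": ["A"], "D": ["C"], "F": ["D", "E"], "G": ["B", "F"]}
wide = {("A", "B"), ("B", "G"), ("F", "G")}
stages = list(build_stages(deps, wide, "G"))
print(stages)  # [['A'], ['B'], ['F', 'E', 'D', 'C'], ['G']]
```

Only stages whose output partitions are missing actually launch tasks; if B were already persisted in memory, its sub-graph would be skipped entirely.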
  • 60. (C) PRESENTATION BY GABRIELE MODENA, 2015 Evaluation
  • 61. (C) PRESENTATION BY GABRIELE MODENA, 2015 Some critiques (of the paper) • How general is this approach? • We are still doing MapReduce • Concerns wrt iterative algorithms still stand • CPU-bound workloads? • Linear algebra? • How much tuning is required? • How does the partitioner work? • What is the cost of reconstructing an RDD from lineage? • Performance when data does not fit in memory • E.g. a join between two very large non-co-partitioned RDDs
  • 62. (C) PRESENTATION BY GABRIELE MODENA, 2015 References (Theory)
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. Zaharia et al., Proceedings of NSDI '12. https://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf
Spark: Cluster Computing with Working Sets. Zaharia et al., Proceedings of HotCloud '10. http://people.csail.mit.edu/matei/papers/2010/hotcloud_spark.pdf
The Google File System. Ghemawat, Gobioff, Leung, 19th ACM Symposium on Operating Systems Principles, 2003. http://research.google.com/archive/gfs.html
MapReduce: Simplified Data Processing on Large Clusters. Dean, Ghemawat, OSDI '04: Sixth Symposium on Operating System Design and Implementation. http://research.google.com/archive/mapreduce.html
Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks. Isard, Budiu, Yu, Birrell, Fetterly, European Conference on Computer Systems (EuroSys), Lisbon, Portugal, March 21-23, 2007. http://research.microsoft.com/en-us/projects/dryad/eurosys07.pdf
Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center. Hindman et al., Proceedings of NSDI '11. https://www.cs.berkeley.edu/~alig/papers/mesos.pdf
  • 63. (C) PRESENTATION BY GABRIELE MODENA, 2015 References (Practice) • An overview of the pyspark API through pictures https://github.com/jkthompson/ pyspark-pictures • Barry Brumitt’s presentation on MapReduce design patterns (UW CSE490) http://courses.cs.washington.edu/courses/cse490h/08au/lectures/ MapReduceDesignPatterns-UW2.pdf • The Dryad Project http://research.microsoft.com/en-us/projects/dryad/ • Apache Spark http://spark.apache.org • Apache Hadoop https://hadoop.apache.org • Apache Tez https://tez.apache.org • Apache Mesos http://mesos.apache.org