Graph Stream Processing
spinning fast, large-scale, complex analytics
Paris Carbone
PhD Candidate @ KTH
Committer @ Apache Flink
We want to analyse…
fast, complex, large-scale data
But why do we need
large-scale, complex and fast data analysis?
> to answer big complex questions faster
>Hej Siri_
Get me the best route to work right now
…with the fewest human drivers
Look up a pizza recipe all of my friends like but did not eat yesterday…
or the day before yesterday
oh! And no kebab pizza!
Siri, is it possible to re-unite all data scientists in the world?
no matter if they use Spark or Flink or just ipython
3000 AD
> to answer big complex questions faster
best route to work right now …with the fewest human drivers
Look up a pizza recipe all of my friends like but did not eat yesterday… or the day before yesterday
oh! And no kebab pizza!
re-unite all data scientists in the world? no matter if they use Spark or Flink or just ipython
FIRST WORLD PROBLEM
30000 AD
> to answer big complex questions faster
best route to work right now …with the fewest human drivers
oh! And no kebab pizza!
re-unite all data scientists in the world? use Spark or Flink or just ipython
FIRST EARTH WORLD PROBLEM
Still, fast analytics might save us some day…
• We can access patient movements and fb, twitter
and pretty much all social media interactions
• Can we stop a pandemic?
• Or can we predict fast where the virus can spread?
Now how do we analyse…
fast, complex, large-scale data?
distributed: everything is many
graph: everything is a graph
streaming: everything is a stream
it all started…
as a first world problem question
but then things escalated quickly…
…and machinery got cheaper and we
suddenly realised that we have big data
Thus,
Distributed Graph processing was born
Map Reduce
1. Store Partitioned Data
2. Send Local computation (map)
3. now shuffle it on disks
4. merge the results (reduce)
5. Store the result back
Distributed Graph Processing
1. Store Updates to DFS
2. Load graph snapshot (mem)
3. Compute round~superstep
4. Store updates
5. …repeat
DFS: distributed file system
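The five Map Reduce steps above can be sketched in a few lines of Python. This is only an in-memory illustration, not how any real engine is implemented: the partitions, the vertex-degree job and the function names (map_phase, shuffle, reduce_phase) are assumptions made for the example, and a real system would shuffle across machines through disk.

```python
from collections import defaultdict

def map_phase(partition):
    # Step 2: local computation, emit (vertex, 1) for both endpoints of each edge.
    for u, v in partition:
        yield u, 1
        yield v, 1

def shuffle(mapped):
    # Step 3: group intermediate pairs by key (done on disk in a real system).
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Step 4: merge the values of every key into one result.
    return {key: sum(values) for key, values in groups.items()}

def map_reduce(partitions):
    mapped = [pair for part in partitions for pair in map_phase(part)]
    return reduce_phase(shuffle(mapped))

# Step 1: the edge list is stored as two partitions (hypothetical data).
partitions = [[(1, 3), (3, 4)], [(2, 4), (2, 5), (4, 5)]]
# Step 5 would write this dictionary back to the DFS.
degrees = map_reduce(partitions)
print(degrees)
```

Every job pays for the full store, shuffle and store cycle, which is what makes multi-round graph algorithms expensive in this model.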
Distributed Graph processing example
• We want to compute the Connected Components of a distributed graph.
• Basic computation element (map): vertex
• Updates : messages to other vertices
[Figure sequence: connected components by label propagation on a graph with vertices 1 to 8 and two components, {1, 2, 3, 4, 5} and {6, 7, 8}]
ROUND 0: every vertex holds its own id as label and sends it to its neighbours
ROUND 1: labels of vertices 1 to 8 are 1, 2, 1, 2, 2, 6, 6, 6
ROUND 2: labels are 1, 2, 1, 1, 2, 6, 6, 6
ROUND 3: labels are 1, 1, 1, 1, 1, 6, 6, 6
ROUND 4: No messages, DONE!
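The rounds above can be replayed with a small vertex-centric sketch. It is a single-process illustration of the superstep idea, not the code of Pregel or any other system. The edge list is an assumption: it is inferred from the labels and message counts shown in the slides, which do not list the edges explicitly.

```python
def connected_components(edges):
    # Build an undirected adjacency list.
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    # Round 0: every vertex is labelled with its own id and sends it out.
    labels = {v: v for v in neighbours}
    inbox = {v: [] for v in neighbours}
    for v in neighbours:
        for n in neighbours[v]:
            inbox[n].append(labels[v])

    supersteps = 0
    while any(inbox.values()):          # stop when no messages are in flight
        supersteps += 1
        outbox = {v: [] for v in neighbours}
        for v, messages in inbox.items():
            # A vertex adopts the smallest label it hears about and,
            # only if its label changed, notifies its neighbours.
            if messages and min(messages) < labels[v]:
                labels[v] = min(messages)
                for n in neighbours[v]:
                    outbox[n].append(labels[v])
        inbox = outbox
    return labels, supersteps

# Assumed edge list, reconstructed from the figure.
edges = [(1, 3), (3, 4), (2, 4), (2, 5), (4, 5), (6, 7), (6, 8), (7, 8)]
labels, supersteps = connected_components(edges)
print(labels, supersteps)
```

With these edges the labels after each superstep match ROUND 1 to ROUND 3, and the fourth superstep produces no messages.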
Distributed Graph processing systems
• Examples of Load-Compute-Store systems: Pregel, GraphX (Spark), GraphLab, PowerGraph
• Same execution strategy - Same problems
• It’s slow
• Too much re-computation ($€) for nothing.
• Real World Updates anyone?
…and streaming came
to mess / make everything
fast and simple
real world event records
• local state stays here
• local computation too
The Dataflow™
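A minimal sketch of what "local state stays here, local computation too" can mean: event records are routed by key to parallel tasks, and each task updates only the state it owns. The class and function names (CountingTask, run_dataflow) are made up for this illustration and are not Flink APIs.

```python
from collections import defaultdict

class CountingTask:
    """One parallel task that owns the state of the keys routed to it."""

    def __init__(self):
        self.counts = defaultdict(int)   # local state, never shared

    def process(self, key):
        self.counts[key] += 1            # local computation
        return key, self.counts[key]

def run_dataflow(events, parallelism=2):
    tasks = [CountingTask() for _ in range(parallelism)]
    output = []
    for key in events:                   # one event record at a time
        task = tasks[hash(key) % parallelism]   # same key, same task
        output.append(task.process(key))
    return output

result = run_dataflow(["a", "b", "a", "c", "a"])
print(result)
```

Because a key always lands on the same task, no task ever needs to read another task's state.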
Streaming is so advanced that…
• subsecond latency and high throughput finally coexist
• it does fault tolerance without batch writes*
• late data** is handled gracefully
* https://arxiv.org/abs/1506.08603
** http://dl.acm.org/citation.cfm?id=2824076
…but what about complex problems?
can we make it happen?
• Problem: Can’t keep an infinite graph in memory and do complex stuff
universe → ??
>it was never about the graph silly, it was about answering complex questions, remember?
universe → summary → answers ;)
Examples of Summaries
• Spanners : distance estimation
• Sparsifiers : cut estimation
• Sketches : homomorphic properties
[Figure: the algorithm run on the graph gives R1, the algorithm run on the summary gives R2, and R1 ~ R2]
Distributed Graph streaming example
Connected Components on a stream of edges (additions)
[Figure sequence: the edges (3,1), (5,2), (5,4), (7,6), (8,6), (4,2), (4,3), (8,7), (4,1) arrive one at a time. Each edge starts a new component, joins an existing one, merges two components, or changes nothing. The summary keeps only the components, labelled by their smallest vertex: it ends with the labels 1 and 6.]
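One common way to keep such a component summary over an edge stream is a disjoint-set (union-find) structure. The sketch below assumes that choice, which the slides do not name, and reads the edge order off the figure. It is single-pass: each edge is seen once and then dropped.

```python
class DisjointSet:
    """Connected-components summary: stores one parent pointer per vertex."""

    def __init__(self):
        self.parent = {}

    def find(self, v):
        self.parent.setdefault(v, v)
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path halving
            v = self.parent[v]
        return v

    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru != rv:
            # Keep the smallest vertex id as the component label.
            self.parent[max(ru, rv)] = min(ru, rv)

    def components(self):
        groups = {}
        for v in list(self.parent):
            groups.setdefault(self.find(v), set()).add(v)
        return groups

# Edge additions in the order they appear in the figure.
edge_stream = [(3, 1), (5, 2), (5, 4), (7, 6), (8, 6),
               (4, 2), (4, 3), (8, 7), (4, 1)]

summary = DisjointSet()
for u, v in edge_stream:
    summary.union(u, v)     # the edge itself is not stored

print(summary.components())
```

The summary grows with the number of vertices, not the number of edges, and it can answer "are u and v connected?" at any point in the stream.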
But is this efficient?
Sure, we can distribute the edges and summaries
any systems in mind?
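Distributing the edges and summaries could look like the sketch below: each partition folds its share of the edges into a partial summary, and partial summaries are then merged into one. The two-way split and the helper names (local_summary, merge, labels) are assumptions for illustration, not the API of a particular system.

```python
def find(parent, v):
    parent.setdefault(v, v)
    while parent[v] != v:
        parent[v] = parent[parent[v]]   # path halving
        v = parent[v]
    return v

def union(parent, u, v):
    ru, rv = find(parent, u), find(parent, v)
    if ru != rv:
        parent[max(ru, rv)] = min(ru, rv)

def local_summary(edges):
    # Runs independently on every partition of the edge stream.
    parent = {}
    for u, v in edges:
        union(parent, u, v)
    return parent

def merge(left, right):
    # Combine two partial summaries: link every vertex of `right`
    # to its root inside a copy of `left`.
    merged = dict(left)
    for v in list(right):
        union(merged, v, find(right, v))
    return merged

def labels(parent):
    return {v: find(parent, v) for v in list(parent)}

# Hypothetical split of the edge stream over two partitions.
partitions = [
    [(3, 1), (5, 2), (5, 4), (7, 6)],
    [(8, 6), (4, 2), (4, 3), (8, 7), (4, 1)],
]
partials = [local_summary(p) for p in partitions]
combined = partials[0]
for partial in partials[1:]:
    combined = merge(combined, partial)

print(labels(combined))
```

Only the summaries travel between partitions, so the merge cost depends on the number of vertices seen rather than on the number of edges.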
Gelly Stream
Graph stream processing with Apache Flink
Gelly Stream Overview
Gelly (on DataSet)
➤ Static Graphs
➤ Multi-Pass Algorithms
➤ Full Computations
Gelly-Stream (on DataStream)
➤ Dynamic Graphs
➤ Single-Pass Algorithms
➤ Approximate Computations
Both sit on top of the Distributed Dataflow and Deployment layers
Gelly Stream Status
➤ Properties and Metrics
➤ Transformations
➤ Aggregations
➤ Discretization
➤ Neighborhood Aggregations
➤ Graph Streaming Algorithms
  ➤ Connected Components
  ➤ Bipartiteness Check
  ➤ Window Triangle Count
  ➤ Triangle Count Estimation
  ➤ Continuous Degree Aggregate
wait, so now we can detect
connected components right away?
Solved! But how about our other issues now?
>Hej Siri_
Siri, is it possible to re-unite all data scientists in the world?
no matter if they use Spark or Flink or just ipython
>
Gelly-Stream to the rescue
graphStream.filterVertices(DataScientists())
.slice(Time.of(10, MINUTE), EdgeDirection.IN)
.applyOnNeighbors(FindPairs())
wendy checked_in glaze
steve checked_in glaze
tom checked_in joe’s_grill
sandra checked_in glaze
rafa checked_in joe’s_grill
[Figure: wendy, steve and sandra each have an edge to glaze; tom and rafa each have an edge to joe’s_grill]
{wendy, steve}
{steve, sandra}
{wendy, sandra}
{tom, rafa}
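The pairing step in this example can be imitated without Flink. The sketch below takes the five check-ins as the contents of one 10-minute slice, groups people by the venue they point to (the incoming edges of each venue), and emits every pair per venue. The find_pairs function is a stand-in written for this illustration, not the FindPairs used in the talk, and the vertex filter is left out.

```python
from collections import defaultdict
from itertools import combinations

def find_pairs(check_ins):
    # Group the sources of incoming edges by target vertex (the venue).
    visitors = defaultdict(list)
    for person, _action, venue in check_ins:
        if person not in visitors[venue]:
            visitors[venue].append(person)
    # Emit every unordered pair of people seen at the same venue.
    pairs = []
    for people in visitors.values():
        pairs.extend(frozenset(p) for p in combinations(people, 2))
    return pairs

# The check-in events of one time slice, as listed above.
window = [
    ("wendy", "checked_in", "glaze"),
    ("steve", "checked_in", "glaze"),
    ("tom", "checked_in", "joe's_grill"),
    ("sandra", "checked_in", "glaze"),
    ("rafa", "checked_in", "joe's_grill"),
]
pairs = find_pairs(window)
for pair in pairs:
    print(sorted(pair))
```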
>Hej Siri_
Siri, is it possible to re-unite all data scientists in the world?
no matter if they use Spark or Flink or just ipython
> yes
The next step
• Iterative model* on streams for deeper analytics
• More Summaries
• Better Out-Of-Core State Integration
• Ad Hoc Graph Queries
Large-scale, Complex, Fast, Deep Analytics
* http://dl.acm.org/citation.cfm?id=2983551
Try out Gelly-Stream*
because all questions matter
@SenorCarbone
*https://github.com/vasia/gelly-streaming
