STEPHAN EWEN, CO-FOUNDER & CTO
STREAM PROCESSING
FROM APPLICATIONS TO PLATFORMS
A platform makes building new applications simple by taking care of the common and repeatable parts.
Internal streaming data platforms built with Apache Flink
Observation 1
Stream Processing is about building applications
Batch / Data Lake Architecture
a.k.a. collect now, figure out later
Streaming / Data-driven Applications
build applications directly on data streams
Observation 2
Stream Processing changes the database-centric architecture
Recall last Flink Forward…
Classic tiered architecture: compute layer; database layer (application working state + historic state)
Streaming architecture: compute + application state; stream storage and snapshot storage (backup)
Changing the Two Tier Architecture
Classic tiered architecture: reads/writes across tier boundary
Streaming architecture: all modifications are local; asynchronous writes of large blobs
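The contrast above can be sketched in a few lines of Python. This is a toy model, not Flink code: `LocalStateCounter`, `run`, and the list used as a snapshot store are hypothetical names chosen for illustration. Every state update touches only local memory, and durability comes from periodically writing the whole state out as one blob. A real system would upload that blob asynchronously; here a plain list append stands in for the upload.

```python
from collections import defaultdict


class LocalStateCounter:
    """Toy operator: every modification is local, durability comes from snapshots."""

    def __init__(self, snapshot_store):
        self.counts = defaultdict(int)        # application state, kept in local memory
        self.offset = 0                       # position in the input stream
        self.snapshot_store = snapshot_store  # stand-in for blob storage (assumption)

    def process(self, key):
        # No read or write crosses a tier boundary here.
        self.counts[key] += 1
        self.offset += 1

    def snapshot(self):
        # Copy the state into one blob; a real system would upload it asynchronously.
        blob = {"offset": self.offset, "counts": dict(self.counts)}
        self.snapshot_store.append(blob)
        return blob

    @classmethod
    def restore(cls, snapshot_store):
        # Rebuild the operator from the latest snapshot, if there is one.
        op = cls(snapshot_store)
        if snapshot_store:
            blob = snapshot_store[-1]
            op.offset = blob["offset"]
            op.counts = defaultdict(int, blob["counts"])
        return op


def run(events, op, snapshot_every):
    """Process events from the operator's current offset, snapshotting periodically."""
    for event in events[op.offset:]:
        op.process(event)
        if op.offset % snapshot_every == 0:
            op.snapshot()
    return op
```

After a failure, `LocalStateCounter.restore(store)` resumes from the last snapshot and replays only the events after its offset, which is why the stream storage has to be replayable.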
Application Platforms
Resource Manager
Logging
Metrics
CI / CD
Kubernetes
Kubernetes
deploying new applications
scaling applications
Kubernetes & Stateful Applications
Kubernetes
Database
What about stateful containers?
Kubernetes
• Example: Scaling down a replicated database
• 3 replicas, 4 node scale down
need to move or reorganize data before container shutdown
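To make the data-movement point concrete, here is a minimal Python sketch under stated assumptions: state is split into a fixed number of partitions ("key groups"), each node owns a contiguous range of them, and the cluster shrinks from 4 nodes to 3. The function names and the numbers are illustrative, and the sketch does not model replication. It only shows that every partition on the departing node, plus some on the remaining nodes, has to be relocated before the container can stop.

```python
def assign(num_groups, num_nodes):
    """Map each key group to a node index using contiguous ranges."""
    return {g: g * num_nodes // num_groups for g in range(num_groups)}


def plan_scale_down(num_groups, old_nodes, new_nodes):
    """List (key_group, source_node, target_node) for every group that must move."""
    before = assign(num_groups, old_nodes)
    after = assign(num_groups, new_nodes)
    return [
        (g, before[g], after[g])
        for g in range(num_groups)
        if before[g] != after[g]
    ]


# 12 key groups, shrinking from 4 nodes to 3 (hypothetical numbers).
moves = plan_scale_down(12, 4, 3)
for group, src, dst in moves:
    print(f"key group {group}: node {src} -> node {dst}")
```

An orchestrator that only knows about containers cannot derive this plan; the stateful application has to provide it and has to finish the moves before shutdown.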
Stateful Questions
• consistent stateful upgrades
  • application evolution and bug fixes
• migration of application state
  • cluster migration, A/B testing
• re-processing and reinstatement
  • fix corrupt results, bootstrap new applications
• state evolution (schema evolution)
Kubernetes: container-based resource orchestration
Apache Flink: stateful stream processing & snapshots
dA Platform: container-based platform for stateful data-driven applications
Application Manager: code, resource, config, and snapshot management
Diagram labels: App Manager, Kubernetes, Storage, Resource Allocation, Job Control, Snapshot Management, CI/CD, Web interface
Versioned Applications, not Jobs/Jars
Stream Processing Application: Version 1, Version 2, Version 3 (each version: code and application snapshot), linked by upgrade steps
New Application: Version 2a, Version 3a, created by fork / duplicate
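One way to picture "versioned applications" is as a small data model in which each version pairs a code reference with the snapshot it was started from. The Python sketch below is an assumption-laden illustration, not the dA Platform API: `AppVersion`, `Application`, `upgrade`, `fork`, and `fake_snapshot` are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(frozen=True)
class AppVersion:
    label: str               # e.g. "Version 2"
    code: str                # reference to a code artifact (hypothetical)
    snapshot: Optional[str]  # snapshot this version was started from, if any


@dataclass
class Application:
    name: str
    versions: List[AppVersion] = field(default_factory=list)

    def upgrade(self, label: str, code: str,
                take_snapshot: Callable[["Application"], str]) -> AppVersion:
        # Snapshot the running version (if any), then start new code from that state.
        snapshot = take_snapshot(self) if self.versions else None
        version = AppVersion(label, code, snapshot)
        self.versions.append(version)
        return version

    def fork(self, new_name: str, label: str, code: str,
             take_snapshot: Callable[["Application"], str]) -> "Application":
        # A fork starts a separate application from a snapshot of this one.
        snapshot = take_snapshot(self)
        forked = Application(new_name)
        forked.versions.append(AppVersion(label, code, snapshot))
        return forked


def fake_snapshot(app: Application) -> str:
    """Stand-in for triggering a snapshot; returns a made-up snapshot identifier."""
    return f"{app.name}/snapshot-after-{app.versions[-1].label}"
```

For example, calling `upgrade` twice on an application and then `fork` yields a second application whose first version starts from the state of the original's latest version, while the original's history is left untouched.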
Architecture
Apache Flink: stateful stream processing
Kubernetes: container platform
Logging
Metrics
dA Application Manager: application lifecycle management
What could the future of a Streaming Data Platform look like?
The Usual Suspects
• Role-based access control
• Metadata management
• Cross Datacenter Failover / Disaster Recovery
Support for Batch Processing
Everything is a stream. Finite applications as a special case.
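The idea that a finite (batch) input is just a stream that happens to end can be sketched in plain Python, without any Flink API. In this illustrative example, `running_counts` and `sensor_events` are hypothetical names. The same processing function is applied unchanged to a bounded list and to an unbounded generator; only the source decides whether the application terminates.

```python
import itertools
from collections import Counter


def running_counts(stream):
    """Emit (event, count so far) for every event; works for any iterable."""
    counts = Counter()
    for event in stream:
        counts[event] += 1
        yield event, counts[event]


def sensor_events():
    """Unbounded source: never terminates on its own."""
    for i in itertools.count():
        yield "even" if i % 2 == 0 else "odd"


# Finite application: the input ends, so the job ends.
finite_result = list(running_counts(["a", "b", "a"]))

# Continuous application: same logic, here cut off after four results for the demo.
first_four = list(itertools.islice(running_counts(sensor_events()), 4))
```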
Periodic Bursty Stream Processing
Bursty Event Stream (events only at end-of-day) over time
Checkpoint / Savepoint Store
Support a Broad Developer Audience
Streaming Data Platform
…
Use Case Vertical Libraries
Streaming Data Platform
SQL, CEP, …, Machine Learning
Apache Flink: stateful stream processing
Kubernetes: container platform
Logging
Metrics
dA Application Manager: application lifecycle management
dA Platform is a turnkey solution for stateful stream processing with Apache Flink.

Flink Forward San Francisco 2018 keynote: Stephan Ewen - "What turns stream processing from a tool into a platform?"