Jamie Grier

@jamiegrier



data-artisans.com
Building Advanced Streaming Applications:
The Latest and Greatest of Flink and Kafka
Original creators of Apache Flink®
Providers of the dA Platform, a supported Flink distribution
The Latest Features - Quick Overview
▪ Rescalable State
▪ Async I/O Support
▪ Streaming SQL (Dynamic Tables)
▪ Flexible Deployment Options
▪ Enhanced Security
The Latest Features
▪ ProcessFunction API
▪ Queryable State API
▪ Excellent support for advanced applications that are:
  ▪ Flexible
  ▪ Stateful
  ▪ Event Driven
  ▪ Time Driven
Rescalable State
▪ Separates state parallelism from task parallelism
▪ Enables autoscaling integrations while maintaining stateful computations
▪ Handled efficiently via key groups
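A minimal sketch of how key groups can decouple state from task parallelism. The object and method names are hypothetical, and a plain `hashCode` stands in for the extra hashing a real runtime would apply. The point is that a key always maps to the same key group, and rescaling only changes which parallel instance owns each contiguous range of key groups.

```scala
object KeyGroups {
  // Hypothetical helper: map a key to one of `maxParallelism` key groups.
  // The key group of a key never changes, whatever the current parallelism is.
  def keyGroupFor(key: Any, maxParallelism: Int): Int =
    Math.floorMod(key.hashCode, maxParallelism)

  // Each parallel operator instance owns a contiguous range of key groups,
  // so rescaling moves whole key groups rather than individual keys.
  def operatorIndexFor(keyGroup: Int, maxParallelism: Int, parallelism: Int): Int =
    keyGroup * parallelism / maxParallelism
}
```

With an assumed `maxParallelism` of 128, key group 37 is owned by instance 0 at parallelism 2, instance 1 at parallelism 4, and instance 2 at parallelism 8.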
Rescalable State
[Diagram sequence over four slides: Source → Map → Filter → Window → Sink, with State attached to Map, Filter, and Window; the Map/Filter/Window chain is shown with 1, 2, 4, and then 8 parallel instances]
State is partitioned by key
Asynchronous I/O Support
▪ Make asynchronous calls to external services from a streaming job
▪ Efficiently keeps a configurable number of asynchronous calls in flight
▪ Correctly handles failure scenarios: restarts failed async calls, etc.
Asynchronous I/O Support
[Diagram labels: Async I/O, User Code]
Asynchronous I/O Support
Little’s Law:
throughput = occupancy / latency
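A small worked example of the formula, with assumed numbers (10 ms latency per call, 1 versus 100 calls in flight) that do not come from the slides. The helper name is hypothetical.

```scala
object LittlesLaw {
  // throughput (requests/second) = occupancy (requests in flight) / latency (seconds)
  // Latency is passed in milliseconds, hence the factor of 1000.
  def throughputPerSecond(occupancy: Int, latencyMillis: Double): Double =
    occupancy * 1000.0 / latencyMillis
}
```

Under these assumptions a synchronous operator (occupancy 1) tops out at 100 requests per second, while 100 concurrent requests at the same latency allow up to 10,000 per second. Raising occupancy is the lever asynchronous I/O provides when the latency itself cannot be reduced.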
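Asynchronous I/O Support
[Diagram: Sync. I/O vs. Async. I/O against a database. Legend: sendRequest(x), receiveResponse(x), wait. With Sync. I/O each request waits for its response before the next one is sent; with Async. I/O the requests a, b, c, d overlap]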
Asynchronous I/O Support
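// imports assumed for the Flink Scala DataStream API
import java.util.concurrent.TimeUnit
import org.apache.flink.streaming.api.scala._

// create the original stream
val stream: DataStream[String] = ...
// apply the async I/O transformation
val resultStream: DataStream[(String, String)] =
  AsyncDataStream.unorderedWait(
    stream,                      // input stream
    new AsyncDatabaseRequest(),  // async function
    1000,                        // timeout
    TimeUnit.MILLISECONDS,       // time unit of the timeout
    100)                         // maximum number of concurrent requests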
Asynchronous I/O Support
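import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}
// package names as of the Flink 1.3 Scala API (an assumption; later releases differ)
import org.apache.flink.streaming.api.scala.async.{AsyncCollector, AsyncFunction}

class AsyncDatabaseRequest extends AsyncFunction[String, (String, String)] {
  // `client` is not defined on the slide: it stands for an asynchronous
  // database client whose query(key) returns a Future[String]

  // execution context on which the future's callback runs
  implicit lazy val executor: ExecutionContext = ExecutionContext.global

  override def asyncInvoke(str: String, asyncCollector: AsyncCollector[(String, String)]): Unit = {
    // issue the asynchronous request, receive a future for the result
    val resultFuture: Future[String] = client.query(str)
    // once the request completes, forward the result (or the error) to the collector
    resultFuture.onComplete {
      case Success(result) => asyncCollector.collect(Iterable((str, result)))
      case Failure(error)  => asyncCollector.collect(error)
    }
  }
}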
Streaming SQL: Querying Dynamic Tables
Streaming SQL: Stream to Table
Append Mode
Streaming SQL: Stream to Table
Update Mode
Streaming SQL: Queries
Streaming SQL: Windowed Queries
Streaming SQL: Table to Stream
Redo Stream
Streaming SQL: Table to Stream
Undo / Redo Stream
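A plain-Scala simulation of an undo / redo stream, assuming the dynamic table is the continuously updated result of a per-key count. It does not use the Table API, and `Change` and `countChangelog` are hypothetical names. Each update to a row emits an undo message for the old value followed by a redo message for the new one.

```scala
object RetractStream {
  // isAdd = true is a "redo" (add) message, isAdd = false an "undo" (retract) message.
  final case class Change(isAdd: Boolean, key: String, count: Long)

  // Turns a stream of keys into the changelog of a continuously updated
  // per-key count table.
  def countChangelog(keys: Seq[String]): Seq[Change] = {
    val counts = scala.collection.mutable.Map.empty[String, Long]
    keys.flatMap { key =>
      val previous = counts.get(key)
      val updated = previous.getOrElse(0L) + 1
      counts(key) = updated
      // undo the previous row if there was one, then redo with the new value
      previous.map(c => Change(false, key, c)).toSeq :+ Change(true, key, updated)
    }
  }
}
```

For the input keys a, b, a this yields: add (a, 1), add (b, 1), retract (a, 1), add (a, 2). A consumer that applies the messages in order always holds the current table contents.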
Flexible Deployment Options
▪ YARN
▪ Mesos
▪ Docker Swarm
▪ Kubernetes
Flexible Deployment Options
▪ DC/OS
▪ Amazon EMR
▪ Google Dataproc
Enhanced Security
▪ SSL
▪ Kerberos
  ▪ Kafka
  ▪ ZooKeeper
  ▪ Hadoop
Advanced Event-Driven Applications
▪ ProcessFunction API
▪ Queryable State API
▪ Excellent support for advanced applications that are:
  ▪ Flexible
  ▪ Stateful
  ▪ Event Driven
  ▪ Time Driven
Example: FlinkTrade
▪ Overall Requirements:
  ▪ Consume “starting position” and “quote” streams
  ▪ Process complex, time-oriented trading rules
  ▪ Trade out of positions to our advantage if possible
  ▪ Provide a dashboard of currently held positions to traders and asset managers
▪ Complex Rules:
  ▪ We only make trades where the Bid Price is above our current Ask Price
  ▪ When a trade is made, we increase our Ask Price to optimize our profits
  ▪ Positions have a set time-to-live, after which we try to trade out of them more aggressively by decreasing the Ask Price over time
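The complex rules above can be sketched as pure functions over a position. This is an illustration only, not the talk's implementation: the field names, the price increments, and the time-to-live value are all assumptions, and how many shares each trade sells is left out. In a Flink job this logic would presumably sit behind the ProcessFunction API, with `onQuote` driven by events and `onTimer` driven by timers.

```scala
object TradingRules {
  // One held position; all field names are illustrative.
  final case class Position(symbol: String, askPrice: BigDecimal, openedAtMillis: Long)

  // Assumed tuning constants; the slides give no concrete values.
  val AskIncrement: BigDecimal = BigDecimal("0.10")
  val AskDecrement: BigDecimal = BigDecimal("0.05")
  val TimeToLiveMillis: Long = 60000L

  // Event-driven rules: trade only when the bid is above our ask,
  // and raise the ask after each trade. Returns (updated position, traded?).
  def onQuote(p: Position, bidPrice: BigDecimal): (Position, Boolean) =
    if (bidPrice > p.askPrice) (p.copy(askPrice = p.askPrice + AskIncrement), true)
    else (p, false)

  // Time-driven rule: once the time-to-live has elapsed,
  // every timer tick lowers the ask to trade out more aggressively.
  def onTimer(p: Position, nowMillis: Long): Position =
    if (nowMillis - p.openedAtMillis >= TimeToLiveMillis)
      p.copy(askPrice = p.askPrice - AskDecrement)
    else p
}
```

For a position with an ask of 140.50, a bid of 140.40 leaves it untouched, a bid of 140.60 triggers a trade and moves the ask to 140.60, and a timer tick after the assumed 60-second time-to-live moves the ask down to 140.45.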
Example: FlinkTrade
[Diagram: Quotes and Starting Positions → Trading Engine (holding Positions) → Trades; Positions feed the Trader Dashboard]
● Event Driven Processing
● Complex Trading Rules
● Time Based Logic
Trader Dashboard
SYMBOL  SHARES  BUY PRICE  ASK PRICE  LAST TRADE PRICE  Profit
AAPL    10,000  140.40     140.50     140.40            $10,921.00
GOOG    20,000  846.81     846.91     846.81            $12,021.00
TWTR     8,000   15.12      15.22      15.12             $4,032.00
Example: FlinkTrade
[Diagram: Quotes and Starting Positions feed 12 parallel Trading Engine instances, each holding its own Positions, producing Trades; the positions table from the previous slide is repeated]
Example: FlinkTrade
Let’s look at the code
We are hiring!
data-artisans.com/careers
@jamiegrier
@ApacheFlink
@dataArtisans

Kafka Summit NYC 2017 - Building Advanced Streaming Applications using the Latest from Apache Flink & Kafka