https://www.risingwave.com/
Rethinking State Management
in Cloud-Native Streaming Systems
Yingjun Wu
RisingWave Labs
About Us
• Yingjun Wu
  • Founder @RisingWave Labs
  • Software Engineer @AWS Redshift
  • Researcher @IBM Research - Almaden
  • Ph.D., National University of Singapore
• RisingWave Labs
  • Series-A startup founded in January 2021
  • Building RisingWave, a cloud-native streaming database
Stream Processing: Values and Costs
[Figure, modified from https://www.oreilly.com/content/ubers-case-for-incremental-processing-on-hadoop/. Axis: result freshness (unbounded, < 1 hour, < 10 min, < 1 min, < 1 sec). Labels: batch processing, stream processing, business value, cost ($$$$ on one build, $ on the final build).]
Cost in Stream Processing
• Normal execution
• Failure recovery
• Elastic scaling
State management!
Stateful Stream Processing
• Stateful operators
  • Aggregation/GroupBy
  • Join
  • Window
  • …
• Consider joining two streams
  • Impression stream
  • Click stream
Impression (adId, impressionTime)
Click (adId, clickTime)
Output (adId, impressionTime, clickTime)
State: hash table for impression stream
State: hash table for click stream
How to manage internal states?
• High performance & small footprint
• Fast failure recovery
• Smooth elastic scaling
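The two-stream join above can be sketched as a symmetric hash join. This is a minimal illustration, not RisingWave's implementation: the class name `StreamJoin` and its methods are hypothetical, and the state is kept unbounded in memory, which a real system would avoid.

```python
from collections import defaultdict


class StreamJoin:
    """Toy symmetric hash join on adId (hypothetical names, in-memory only)."""

    def __init__(self):
        # One hash table per input stream: this is the operator's state.
        self.impressions = defaultdict(list)  # adId -> [impressionTime, ...]
        self.clicks = defaultdict(list)       # adId -> [clickTime, ...]

    def on_impression(self, ad_id, impression_time):
        # Remember the impression, then probe the click table.
        self.impressions[ad_id].append(impression_time)
        return [(ad_id, impression_time, c) for c in self.clicks.get(ad_id, ())]

    def on_click(self, ad_id, click_time):
        # Remember the click, then probe the impression table.
        self.clicks[ad_id].append(click_time)
        return [(ad_id, i, click_time) for i in self.impressions.get(ad_id, ())]
```

Each arriving event is inserted into its own table and probed against the other, so both tables grow with the input. That growth is why the way state is stored matters for footprint, recovery, and scaling.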
Stream Processing: History
• Single node era
• Big data era
• Cloud era
State Management: Single Node Era
• Consider joining two streams
  • Impression stream
  • Click stream
Impression (adId, impressionTime)
Click (adId, clickTime)
Output (adId, impressionTime, clickTime)
State: hash table for impression stream
State: hash table for click stream
State Management: Big Data Era
• Node (machine) is the minimum resource unit
• If you run out of compute or storage resources, just add more nodes!
• Coupled compute-storage architecture
• Embarrassingly parallel execution
• Utilizes resources in a brute-force manner
• Consider joining two streams
  • Impression stream
  • Click stream
Impression (adId, impressionTime)
Click (adId, clickTime)
Output (adId, impressionTime, clickTime)
State: hash table for impression stream
State: hash table for click stream
[Diagram: six State boxes after adding nodes]
Consumes too many resources!
State Management: Cloud Era
[Diagram: Storage (S3) and Compute (EC2) as separate services]
• Compute and storage resources are managed separately
• If you run out of compute, just buy more compute instances!
• Storage resources can scale automatically!
• Decoupled compute and storage
• Build the execution engine on top of cloud storage
• Consider joining two streams
  • Impression stream
  • Click stream
Impression (adId, impressionTime)
Click (adId, clickTime)
Output (adId, impressionTime, clickTime)
State: hash table for impression stream
State: hash table for click stream
• Naïve solution: maintain state in remote cloud storage
  • State is stored in S3; compute runs in EC2
  • If you run out of compute, just add more EC2 instances!
  • If you run out of storage, S3 scales automatically!
State manipulation becomes remote access!
• Unfortunately, S3 is too slow to support low-latency processing!
• Moreover, S3 is charged on a per-request basis!
Tiered Storage
• Luckily, we can maintain data in different services
  • EC2: “volatile” storage
    • Super fast!
    • Data is lost if it is not well replicated
  • EBS: “semi-persistent” storage
    • Fast
    • 99.999% durability (5 nines)
  • S3: persistent storage
    • Slow
    • 99.999999999% durability (11 nines)
Used together: tiered storage
Tiered Storage for State Management
• Use an LSM-tree-like structure to maintain internal states across different storage media
[Diagram: streaming data is ingested into EC2 (hot data); compaction moves it to EBS (warm data), then to S3 (cold data)]
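A rough sketch of the tiering idea follows. It is an assumption-heavy toy: the class `TieredStateStore`, its size limits, and the use of plain dictionaries in place of sorted runs are all illustrative choices, and the three dictionaries only stand in for EC2 memory, EBS, and S3.

```python
class TieredStateStore:
    """Toy three-tier store: hot (memory), warm (local disk), cold (object store).

    Plain dicts stand in for the sorted runs of a real LSM tree (assumption).
    """

    def __init__(self, hot_limit=2, warm_limit=4):
        self.hot, self.warm, self.cold = {}, {}, {}
        self.hot_limit, self.warm_limit = hot_limit, warm_limit

    def put(self, key, value):
        # New data always lands in the hot tier.
        self.hot[key] = value
        if len(self.hot) > self.hot_limit:
            self._compact(self.hot, self.warm)
        if len(self.warm) > self.warm_limit:
            self._compact(self.warm, self.cold)

    @staticmethod
    def _compact(upper, lower):
        # Merge the upper tier into the lower one; newer values win.
        lower.update(upper)
        upper.clear()

    def get(self, key):
        # Search from the freshest tier down, so the newest version is found first.
        for name, tier in (("hot", self.hot), ("warm", self.warm), ("cold", self.cold)):
            if key in tier:
                return tier[key], name
        return None, None
```

Reads that fall through to the cold tier are the slow, per-request-billed accesses the earlier slides warn about, so the tier limits trade memory footprint against remote reads.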
Rethinking State Management Design
Big data vs. cloud
• Big data: coupled compute-storage architecture
  • Compute nodes: State, State, State
  • Persistent storage: States, written by Checkpoint
• Cloud: decoupled compute-storage architecture
  • Compute nodes: State, State, State, each serving as a Cache
  • Persistent storage: States (“state as checkpoint”)
Small state? Big state?
Failure Recovery
• Coupled compute-storage architecture
  • Compute nodes: State, State, State
  • Persistent storage: States, written by Checkpoint
  • Recovery: restore State from the checkpoint
• Decoupled compute-storage architecture (“state as checkpoint”)
  • Compute nodes: Cache, Cache, Cache
  • Persistent storage: States
  • Recovery: read from remote state
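The decoupled recovery path can be illustrated with a small sketch. Everything here is hypothetical: `RemoteState` stands in for persistent storage, `ComputeNode` for a compute node with a local cache, and the write-through policy is a simplification chosen to keep the example short.

```python
class RemoteState:
    """Stand-in for persistent storage (for example, an object store)."""

    def __init__(self):
        self.data = {}
        self.reads = 0  # counts remote reads, to make cache misses visible

    def read(self, key):
        self.reads += 1
        return self.data.get(key)

    def write(self, key, value):
        self.data[key] = value


class ComputeNode:
    """Compute node whose local state is only a cache over the remote state."""

    def __init__(self, remote):
        self.remote = remote
        self.cache = {}

    def put(self, key, value):
        self.cache[key] = value
        self.remote.write(key, value)  # write-through, an assumption for brevity

    def get(self, key):
        if key not in self.cache:      # cache miss: fall back to remote state
            self.cache[key] = self.remote.read(key)
        return self.cache[key]
```

A replacement node starts with an empty cache and needs no bulk restore; it pays one remote read per key the first time that key is touched.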
Elastic Scaling
• Coupled compute-storage architecture
  • Compute nodes: State, State, State
  • Persistent storage: States, written by Checkpoint
  • Scale out
• Decoupled compute-storage architecture (“state as checkpoint”)
  • Compute nodes: Cache, Cache, Cache
  • Persistent storage: States
  • Scale out
Challenging Problems: #1
• LSM tree compaction
[Diagram: EC2 (hot data), EBS (warm data), S3 (cold data), with compaction between tiers]
Compaction can result in performance drops!
• Remote compaction?
• Lambda function?
Still incurs a high CPU utilization rate!
Challenging Problems: #2
• Cache miss
[Diagram: EC2 (hot data), EBS (warm data), S3 (cold data), with compaction between tiers]
High latency can occur due to cache misses!
• Out-of-order processing
• Overlap fetching from S3 with computation
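One way to picture overlapping fetches with computation is the asyncio sketch below. It is only an illustration under stated assumptions: `fetch_from_remote` simulates object-store latency with a sleep, `process` is a hypothetical operator loop, and results are emitted in completion order rather than arrival order.

```python
import asyncio


async def fetch_from_remote(key, remote, delay=0.01):
    # Simulated object-store read; the delay stands in for network latency.
    await asyncio.sleep(delay)
    return remote[key]


async def process(events, cache, remote):
    results = []   # emitted in completion order, not arrival order
    fetches = {}   # key -> in-flight fetch task
    deferred = []  # events parked until their state arrives
    for key in events:
        if key in cache:
            results.append((key, cache[key]))  # cache hit: handled right away
        else:
            if key not in fetches:
                fetches[key] = asyncio.create_task(fetch_from_remote(key, remote))
            deferred.append(key)
        await asyncio.sleep(0)  # yield so in-flight fetches can make progress
    for key in deferred:
        cache[key] = await fetches[key]  # awaiting a finished task is cheap
        results.append((key, cache[key]))
    return results
```

Events that hit the cache are not blocked behind a miss, and all outstanding fetches run concurrently instead of one after another.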
Challenging Problems: #3
• Implementing “state as checkpoint”
• Multi-version concurrency control
• Use “epoch” to identify versions
[Diagram: decoupled compute-storage architecture; compute nodes hold Cache, persistent storage holds States]
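The slides name multi-version concurrency control and epochs without giving a layout, so the following is a guess at the general shape rather than the actual design. The class `EpochStore` and its per-key sorted version lists are assumptions made for illustration.

```python
import bisect


class EpochStore:
    """Toy multi-version store: every write is tagged with an epoch (hypothetical)."""

    def __init__(self):
        self.versions = {}  # key -> (sorted list of epochs, matching list of values)

    def write(self, key, value, epoch):
        epochs, values = self.versions.setdefault(key, ([], []))
        i = bisect.bisect_left(epochs, epoch)
        if i < len(epochs) and epochs[i] == epoch:
            values[i] = value          # overwrite within the same epoch
        else:
            epochs.insert(i, epoch)    # keep versions ordered by epoch
            values.insert(i, value)

    def read(self, key, epoch):
        """Return the newest value written at or before `epoch`, else None."""
        epochs, values = self.versions.get(key, ([], []))
        i = bisect.bisect_right(epochs, epoch)
        return values[i - 1] if i else None
```

Reading at a fixed epoch gives a consistent snapshot even while later epochs are being written, which is what lets the stored state double as the checkpoint.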
Performance Evaluation
• I will not show any performance numbers in this talk!
• Not a fan of performance “bench-marketing”
• The objective is to maximize cost efficiency, not performance
• Yes, we have the performance numbers, and they look nice!
• DM me if you want to read the performance report!
  • yingjunwu@risingwave-labs.com
Thank you! Questions?
risingwave.com/linkedin
License 2.0
risingwave.com/twitter
risingwave.com/github
risingwave.com/slack
