Why Serverless Flink Matters - Blazing Fast Stream Processing Made Scalable

Mayank Juneja, Confluent
Jean-Sébastien Brunner, Confluent

Agenda
1. Stream Processing
a. Overview and Challenges
2. Why Flink?
a. Why is it so popular?
b. Challenges of self managed Flink
3. Serverless Flink + Demo

Stream processing is computing over unbounded streams of data
[Diagram labels: Real-time Data (A Sale, A Shipment, A Trade, A Customer Experience); Stream Processor; Rich Front-End Customer Experiences; Real-Time Backend Operations]

Stream processing use cases

Data Exploration
Engineers and analysts both need to be able to simply read and understand the event streams stored in Kafka
● Metadata discovery
● Throughput analysis
● Data sampling
● Interactive query

Data Pipelines
Data pipelines are used to enrich, curate, and transform event streams, creating new derived event streams
● Filtering
● Joins
● Projections
● Aggregations
● Flattening
● Enrichment

Real-time Apps
Whole ecosystems of apps feed on event streams, automating action in real time
● Threat detection
● Quality of Service
● Fraud detection
● Intelligent routing
● Alerting
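
To make the data pipeline operations listed above more concrete, here is a minimal sketch in plain Python (no Flink involved) that chains filtering, projection, enrichment, and a running aggregation over a small in-memory list standing in for an event stream. The event fields (`status`, `customer_id`, `amount`), the `regions` lookup table, and all function names are assumptions made for illustration.

```python
def filter_events(events, predicate):
    """Filtering: keep only events for which predicate(event) is true."""
    return (e for e in events if predicate(e))

def project(events, fields):
    """Projection: keep only the named fields of each event."""
    return ({f: e[f] for f in fields} for e in events)

def enrich(events, lookup, key, new_field):
    """Enrichment: add new_field by looking up event[key] in a reference table."""
    return ({**e, new_field: lookup.get(e[key])} for e in events)

def running_sum(events, group_field, value_field):
    """Aggregation: yield the updated total per group after every event."""
    totals = {}
    for e in events:
        group = e[group_field]
        totals[group] = totals.get(group, 0) + e[value_field]
        yield group, totals[group]

# Hypothetical input events and reference data.
orders = [
    {"order_id": 1, "customer_id": "c1", "amount": 30, "status": "paid"},
    {"order_id": 2, "customer_id": "c2", "amount": 15, "status": "cancelled"},
    {"order_id": 3, "customer_id": "c1", "amount": 20, "status": "paid"},
]
regions = {"c1": "EU", "c2": "US"}

paid = filter_events(orders, lambda e: e["status"] == "paid")
slim = project(paid, ["customer_id", "amount"])
with_region = enrich(slim, regions, "customer_id", "region")
updates = list(running_sum(with_region, "region", "amount"))
```

Each stage consumes one stream and produces a new derived stream, and the aggregation emits an updated result per event rather than a single final answer, which is the main difference from a batch query.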

Challenges with Stream Processing
● Ordering and Timing: How do you handle out-of-order and late events?
● State Management: How do you manage state in a distributed environment?
● Fault Tolerance: Do you need exactly-once semantics?
● Scalability: How do you scale for unexpectedly large throughput?
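
One common way to approach the out-of-order and late events question is event-time windowing driven by a watermark. The sketch below is a simplified, single-threaded illustration in plain Python, not Flink's implementation: the watermark is assumed to trail the largest event time seen so far by a fixed `max_out_of_orderness`, windows are tumbling, and events whose window has already closed are collected as late. All names and parameters are hypothetical.

```python
from collections import defaultdict

def tumbling_event_time_counts(events, window_size, max_out_of_orderness):
    """Count events per tumbling event-time window.

    events: iterable of (event_time, payload) in arrival order.
    Returns (emitted, late): emitted is a list of (window_start, count) in
    emission order; late holds events that arrived after their window closed.
    """
    open_windows = defaultdict(int)   # window_start -> count so far
    emitted, late = [], []
    max_seen = float("-inf")
    watermark = float("-inf")

    for event_time, payload in events:
        window_start = (event_time // window_size) * window_size
        if window_start + window_size <= watermark:
            # The watermark already passed this window's end: the event is late.
            late.append((event_time, payload))
        else:
            open_windows[window_start] += 1

        # Advance the watermark: it trails the largest event time seen.
        max_seen = max(max_seen, event_time)
        watermark = max_seen - max_out_of_orderness

        # Emit every window whose end is at or before the watermark.
        for start in sorted(w for w in open_windows if w + window_size <= watermark):
            emitted.append((start, open_windows.pop(start)))

    # End of input: flush whatever is still open.
    for start in sorted(open_windows):
        emitted.append((start, open_windows[start]))
    return emitted, late

arrivals = [(1, "a"), (3, "b"), (2, "c"), (12, "d"), (9, "e"), (25, "f"), (4, "g")]
emitted, late = tumbling_event_time_counts(arrivals, window_size=10, max_out_of_orderness=5)
```

In this run the event at time 9 arrives after the event at time 12 but is still counted in the first window, because the watermark has not yet passed that window's end. The event at time 4 arrives after the window closed and is reported as late.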

What’s great about Apache Flink?
● Scalability: Flink is capable of supporting stream processing workloads at hyper scale, as evidenced by its broad adoption by leading digital native companies
● Language Flexibility: Flink supports Java, Python, & SQL without making major tradeoffs in functionality, enabling developers to work in their language of choice
● Unified Processing: Flink supports stream processing and batch processing through one technology, rather than needing separate tools
Flink is a top 5 Apache project and is leveraged as the stream processing engine for >25% of Kafka users

Stream Processing with Flink
Challenges and Flink Features
● Ordering and Timing
○ Event time processing
○ Watermarks
● State Management
○ Local and in-memory states for all computations
● Fault Tolerance
○ Exactly once semantics
○ Distributed snapshots / checkpoints
● Scalability & Performance
○ Elastic scale out
○ Network Traffic Optimization
○ Backpressure Handling
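
The link between checkpoints and exactly-once semantics can be illustrated with a toy model. The sketch below is plain Python under strong simplifying assumptions: one operator, one replayable source, and a checkpoint that stores the source offset together with a copy of the operator state. It is not how Flink's distributed snapshots are implemented, and it does not model sinks. The class and function names are hypothetical.

```python
import copy

class CountingJob:
    """Toy job: counts records per key from a replayable source."""

    def __init__(self, records):
        self.records = records
        self.offset = 0              # next source position to read
        self.counts = {}             # operator state
        self.checkpoint = (0, {})    # (offset, state) of the last checkpoint

    def take_checkpoint(self):
        # Snapshot the source position and the state together.
        self.checkpoint = (self.offset, copy.deepcopy(self.counts))

    def restore(self):
        # Roll both the source position and the state back to the snapshot.
        self.offset, saved = self.checkpoint
        self.counts = copy.deepcopy(saved)

    def process_one(self):
        key = self.records[self.offset]
        self.counts[key] = self.counts.get(key, 0) + 1
        self.offset += 1

    def done(self):
        return self.offset >= len(self.records)

def run(records, checkpoint_every, fail_at=None):
    """Run the job; optionally simulate one crash when the offset hits fail_at."""
    job = CountingJob(records)
    failed = False
    while not job.done():
        if fail_at is not None and not failed and job.offset == fail_at:
            failed = True
            job.restore()   # progress since the last checkpoint is lost, then replayed
            continue
        job.process_one()
        if job.offset % checkpoint_every == 0:
            job.take_checkpoint()
    return job.counts

records = ["a", "b", "a", "c", "a", "b", "c"]
clean = run(records, checkpoint_every=3)
recovered = run(records, checkpoint_every=3, fail_at=5)
```

Because the offset and the state are restored as a pair, records processed after the last checkpoint are replayed without being counted twice, so the run with a failure ends with the same counts as the clean run.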

So is Flink the perfect stream
processor?

Self Managed Flink comes with its own challenges
● Configuration and Setup
● Monitoring
● Cost Management
● Security

Challenge #1 - Configuration and Setup
- Resource allocation: Provisioning resources (CPU, memory, storage) for each Flink job can be a complicated task
- Dependency management
  - Connectors, databases
- Configuration
  - Standalone vs k8s vs YARN, Application mode vs Session mode

Challenge #2 - Monitoring and Maintenance
- Metrics:
  - Filtering down to the most relevant metrics for your application can be overwhelming
- Version upgrades
  - Upgrading Flink versions is painful, especially when backward compatibility must be preserved
- Disaster recovery
  - Needs regular backups, checkpointing, savepoints
[Figure: Flink downloads, Mar 2023]

Challenge #3 - Total Cost of Ownership
- Hardware costs: Significant investment is required in hardware, which can end up underutilized
- Expertise: Hiring skilled professionals who can set up, manage and maintain Flink
- Opportunity cost: Less time spent on developing the core product or service

Challenge #4 - Security
- RBAC: Flink lacks built-in capabilities for granular role-based access control
- Encryption: Data encryption at rest for Flink state backends does not come out of the box
- Multi-tenancy: Insufficient capabilities to support multi-tenancy within the same cluster

Powerful SQL Streaming Operators
● Time windows
○ Time-based windows
○ Event-density windows
○ Event-based windows: every single event can trigger a new window
● Pattern Matching
○ Complex Event Processing
● Streaming Joins
○ Stream-to-stream joins
○ Temporal joins
○ Lookup joins
○ Versioned joins
○ etc.
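
As an illustration of what a temporal (versioned) join computes, the following plain-Python sketch joins each order with the version of a reference table that was valid at the order's event time. It mimics the semantics only, not Flink SQL syntax or execution. The order tuples, the `rate_versions` structure, and the function name are assumptions made for this example.

```python
from bisect import bisect_right

def temporal_join(orders, rate_versions):
    """Join each order with the rate version valid at the order's event time.

    orders: list of (order_time, currency, amount).
    rate_versions: dict currency -> list of (valid_from, rate), sorted by valid_from.
    Orders with no version valid yet are dropped (inner-join behaviour).
    """
    joined = []
    for order_time, currency, amount in orders:
        versions = rate_versions.get(currency, [])
        # Index of the last version whose valid_from <= order_time.
        idx = bisect_right(versions, (order_time, float("inf"))) - 1
        if idx < 0:
            continue
        rate = versions[idx][1]
        joined.append((order_time, currency, amount * rate))
    return joined

# Hypothetical versioned table: the EUR rate changes at time 10.
rate_versions = {"EUR": [(0, 1.5), (10, 2.0)]}
orders = [(5, "EUR", 100), (10, "EUR", 100), (12, "EUR", 50), (3, "GBP", 10)]
converted = temporal_join(orders, rate_versions)
```

The order at time 5 is converted with the rate that was valid then, even though a newer rate exists. That is the point of joining against a versioned table rather than against its latest snapshot.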

Solution - Serverless Flink
- Evergreen runtime: once you submit a job it can run 24/7, and you don't need to take care of any upgrade (security patches, Flink versions, etc.); it just runs.
- Elastic autoscaling of the compute pools:
  - Elastic scale up of the pool, with a user-defined maximum
  - Elastic scale down of the pool, with scale-to-zero when nothing runs
- Usage-based billing
- Separation of compute (Flink) and storage (Kafka)
  - Scale independently to get the best cost and best performance
  - Optimization of communications for even better cost/performance

Autoscale and monitoring at the job level
● Per task dynamic scaling
○ Rescale based on backpressure and utilization of the vertices, not only
based on CPU or infrastructure-level metrics
○ Take into account the throughput from the source
● Job level metrics and monitoring
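
The scaling ideas above (per-task rescaling driven by utilization and backpressure, a user-defined pool maximum, and scale-to-zero) can be sketched as a simple decision rule. This is a hypothetical heuristic written for illustration, not Confluent's autoscaler: the target busy ratio of 0.7, the function names, and the "add at least one subtask when backpressured" rule are all assumptions.

```python
import math

def desired_parallelism(current, busy_ratio, backpressured,
                        max_parallelism, target_busy=0.7):
    """Pick a new parallelism for one task (vertex) of a running job.

    current: current number of parallel subtasks (>= 1).
    busy_ratio: fraction of time the subtasks spend processing (0.0 to 1.0).
    backpressured: True if the task is slowing down its upstream.
    """
    # Scale so that utilization lands near the target busy ratio.
    wanted = math.ceil(current * busy_ratio / target_busy)
    if backpressured:
        # Backpressure means the task cannot keep up: grow by at least one.
        wanted = max(wanted, current + 1)
    # A running task keeps at least one subtask and respects the cap.
    return max(1, min(wanted, max_parallelism))

def pool_size(task_parallelisms, max_pool):
    """Pool capacity: sum of task demands, capped by the user-defined maximum.

    An empty list (nothing running) yields zero, i.e. scale-to-zero.
    """
    return min(sum(task_parallelisms), max_pool)

scale_up = desired_parallelism(4, 0.95, False, max_parallelism=16)
scale_down = desired_parallelism(4, 0.2, False, max_parallelism=16)
pressured = desired_parallelism(4, 0.5, True, max_parallelism=16)
capped = desired_parallelism(8, 1.0, False, max_parallelism=10)
```

The decision uses job-level signals (how busy each task is, whether it exerts backpressure) rather than host CPU alone, which matches the slide's distinction between job-level and infrastructure-level metrics.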

Flink and Kafka closer together
Flink and Kafka are closer together, which reduces:
● Latency
● Network cost
With Fetch-from-follower the optimization can be done at the Availability Zone level.
[Diagram labels: Confluent Cloud; High bandwidth; Low bandwidth]
Apache Flink in Confluent Cloud
Serverless Flink SQL
● ANSI-SQL with powerful streaming operators
Rich Experience
● Rich CLI Experience
● SQL Editor with "workspaces"
Complete and Secure
● Integration with Schema Registry and Governance
● Support for user authentication and service accounts

Support various use-cases and Personas
● Developers: Java & SQL; Streaming Apps
● Data Analysts: ANSI SQL; Data Exploration
● Data Engineers: SQL & Python; Data Pipelines
Tools: IDE & SQL CLI, Notebooks, UI / BI / JDBC

Thanks!
Q&A
Feedback/comments/questions: flink-preview@confluent.io
