Confluent Kafka and KSQL:
Streaming Data Pipelines Made Easy
Kairo Tavares
kairo.tavares@hpe.com
Goals
How the Confluent Platform can solve problems in a company
What streaming data pipelines are and what challenges they bring
How KSQL can make your streaming data pipeline easier
2
Agenda
– Kafka 101
– Confluent Kafka
– Streaming Data Pipeline
– KSQL
– Demo
3
Kafka 101
4
Common problems that we face
Extract, Transform, Load (ETL)
5
Common problems that we face
Microservices
6
Let's organize it
7
Kafka 101
Apache Kafka
Kafka® is used for building real-time data pipelines and streaming apps.
It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.
8
https://kafka.apache.org/
Kafka 101
9
Kafka 101
Producing Data
10
Kafka 101
Consuming Data
11
Simple Consumer
Consumer Groups
Confluent Platform
12
Confluent Platform
13
Founded by the team that built Apache Kafka®, Confluent builds an event streaming platform that enables companies to easily access data as real-time streams.
Development and Connectivity
Stream Processing
Management and Operations
https://www.confluent.io/
Confluent Platform
Deployment and Connectivity
– Kafka Connect & Connectors
– Schema Registry
– Kafka Clients
– REST Proxy
– MQTT Proxy
14
Confluent Platform
Avro Format
– Open source data format
– Chosen for a number of reasons:
  – Direct mapping to and from JSON
  – Compact format
  – Very fast to serialize and deserialize
  – Support for several programming languages
  – A rich, extensible schema language defined in pure JSON
  – The best notion of compatibility for evolving your data over time
  – Built-in documentation
15
Confluent Platform
Management and Operations
– Control Center
  – Manage key operations and monitor the health and performance of Kafka clusters and data streams with curated dashboards directly from a GUI.
– Replicator
  – Replicate Kafka topics across data centers and public clouds to ensure disaster recovery and build distributed data pipelines.
– Auto Data Balancer
  – Optimize your resource utilization by invoking a rack-aware algorithm that automatically rebalances partitions across a Kafka cluster.
– Security Controls
  – Enable pass-through client credentials from REST Proxy / Schema Registry to Kafka broker. Map AD/LDAP groups to Kafka ACLs.
– Operator
  – Automate deployment of the complete Confluent Platform, including Kafka, as a cloud-native application on Kubernetes.
16
Confluent Platform
Stream Processing
– Kafka Streams (Library)
  – Build mission-critical real-time applications with a simple Java library and a Kafka cluster, with no additional framework or cluster needed.
– KSQL (Service)
  – Build real-time stream processing applications against Apache Kafka using simple SQL-like semantics.
17
Confluent Platform
License
18
Streaming Data Pipeline
19
Streaming Data Pipeline
But what is a stream?
20
– A stream is an unbounded, continuous flow of records
– Data is real-time
– Immutable events
– Records are key-value pairs (Kafka)
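In KSQL terms, a stream like this is declared on top of an existing Kafka topic. A minimal sketch, assuming a hypothetical topic named pageviews with JSON values (the topic and column names are illustrative and do not come from the slides):

```sql
-- Hypothetical topic and columns, for illustration only.
-- The statement registers a schema over the topic; it does not copy or modify the data.
CREATE STREAM pageviews (
  viewtime BIGINT,
  userid   VARCHAR,
  pageid   VARCHAR
) WITH (
  KAFKA_TOPIC  = 'pageviews',
  VALUE_FORMAT = 'JSON'
);
```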
Streaming Data Pipeline
Stream Processing
21
– Per-record, millisecond delay
– Data filtering
– Data transformation and conversions
– Data enrichment with joins
– Data manipulation with scalar functions
– Stateful processing
– Data aggregation
– Windowing processing
Streaming Data Pipeline
Windowing
22
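Windowing groups an unbounded stream into finite time buckets so that aggregates can be computed per bucket. A minimal KSQL sketch, assuming a stream named pageviews with a userid column is already registered (both names are hypothetical):

```sql
-- Tumbling window: fixed-size, non-overlapping 1-minute buckets.
CREATE TABLE pageviews_per_minute AS
  SELECT userid, COUNT(*) AS views
  FROM pageviews
  WINDOW TUMBLING (SIZE 1 MINUTE)
  GROUP BY userid;

-- Alternative window clauses for the same query:
-- Hopping: 1-minute windows that start every 10 seconds, so they overlap.
--   WINDOW HOPPING (SIZE 1 MINUTE, ADVANCE BY 10 SECONDS)
-- Session: a window per key that closes after 30 seconds of inactivity.
--   WINDOW SESSION (30 SECONDS)
```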
KSQL
23
KSQL
Architecture and components
24
https://docs.confluent.io/current/ksql/docs/concepts/ksql-architecture.html
KSQL
Why KSQL?
25
KSQL
Kafka Streams Library vs KSQL
26
KSQL
Streams vs Tables
27
KSQL
Filtering
28
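A filter in KSQL is a WHERE clause on a continuous query. A minimal sketch, assuming a registered stream named pageviews and a hypothetical page identifier 'home':

```sql
-- Derive a new stream that keeps only the records matching the predicate.
-- Stream, column, and value names are assumptions for illustration.
CREATE STREAM pageviews_home AS
  SELECT userid, pageid, viewtime
  FROM pageviews
  WHERE pageid = 'home';
```

The derived stream is backed by its own Kafka topic and keeps updating as new records arrive.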
KSQL
Aggregation
29
How many page views per region for each user?
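One way to answer a question like this is a GROUP BY with COUNT(*). A sketch, assuming a hypothetical stream pageviews_enriched that carries both userid and regionid:

```sql
-- Continuously maintained count of page views per user and region.
-- The result is a table: each (userid, regionid) pair has one current value.
CREATE TABLE pageviews_by_user_region AS
  SELECT userid, regionid, COUNT(*) AS pageview_count
  FROM pageviews_enriched
  GROUP BY userid, regionid;
```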
KSQL
Aggregation Functions
30
UDF and UDAF
KSQL
Joins
31
Table
Stream
KSQL
Click Stream - ETL
32
Stream
(clickstream)
{userid, page, action, usage_time}
Table
(users)
{user_id, level, gender, age}
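The enrichment step for this pipeline could be sketched as a stream-table join. This assumes the clickstream stream and the users table are already registered with the columns listed above, and that users is keyed by user_id; the output name clickstream_enriched is hypothetical:

```sql
-- Enrich each click event with the matching user's attributes.
-- LEFT JOIN keeps clicks even when no user record exists yet.
CREATE STREAM clickstream_enriched AS
  SELECT c.userid, c.page, c.action, c.usage_time,
         u.level, u.gender, u.age
  FROM clickstream c
  LEFT JOIN users u ON c.userid = u.user_id;
```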
KSQL
Credit Card Fraud – Anomaly detection
33
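A common way to express this kind of anomaly detection in KSQL is a windowed count with a HAVING threshold. A sketch, assuming a hypothetical stream authorization_attempts with a card_number column; the 5-second window and the threshold of 3 are illustrative values, not figures from the slides:

```sql
-- Flag cards with more than 3 authorization attempts inside a 5-second window.
CREATE TABLE possible_fraud AS
  SELECT card_number, COUNT(*) AS attempts
  FROM authorization_attempts
  WINDOW TUMBLING (SIZE 5 SECONDS)
  GROUP BY card_number
  HAVING COUNT(*) > 3;
```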
KSQL
Error Monitoring
34
KSQL
Arithmetic Operations
35
KSQL
Machine Learning Prediction
36
-- anomaly() is not a built-in KSQL function; it is assumed to be a UDF wrapping the ML model.
CREATE STREAM AnomalyDetectionWithFilter AS
SELECT rowtime, eventid, anomaly(sensorinput) AS Anomaly
FROM carsensor
WHERE anomaly(sensorinput) > 5;
Demo
https://docs.confluent.io/current/quickstart/ce-docker-quickstart.html
37
Thanks
kairo.tavares@hpe.com
38

Editor's Notes

  • #9: Apache Kafka was originally developed at LinkedIn and was open sourced in early 2011. It graduated from the Apache Incubator on 23 October 2012. In 2014, Jun Rao, Jay Kreps, and Neha Narkhede, who had worked on Kafka at LinkedIn, founded Confluent, a company focused on Kafka. According to a 2014 Quora post, Kreps named the software after the author Franz Kafka because it is "a system optimized for writing", and he liked Kafka's work.