1
Integrating Apache Kafka
and Elastic using the
Connect Framework
2
I ❤ Elastic 😁
Kafka
Cluster
3
Apache Kafka®
Kafka
A distributed commit log. Publish and subscribe to streams of records. Highly scalable, high throughput. Supports transactions. Persisted data.
Reads are a single seek and writes are append-only.
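The commit-log idea on this slide can be illustrated with a toy in-memory log. This is only a sketch of the concept (append at the end, read from an offset); the `CommitLog` class and its method names are made up for illustration and say nothing about how Kafka actually stores data on disk.

```python
class CommitLog:
    """Toy append-only log (illustrative only, not Kafka's storage format)."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Write a record at the end of the log and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Jump to an offset once, then read sequentially from there."""
        return self._records[offset:offset + max_records]


log = CommitLog()
first = log.append("order-created")
second = log.append("order-paid")
print(first, second)   # 0 1
print(log.read(1))     # ['order-paid']
```

Existing records are never modified; a reader only needs to remember the offset it has reached.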
4
Apache Kafka®
Kafka Streams API
Write standard Java applications & microservices to process your data in real time
Kafka Connect API
Reliable and scalable integration of Kafka
with other systems – no coding required.
Diagram labels: Orders; Table; Customers; Kafka Streams API
5
Many Systems are a bit of a mess…
6
The Streaming Platform
7
The Streaming Platform
8
Why Kafka & Elastic?
Event-Centric Thinking
Diagram: the web app, mobile app and APIs publish events such as “A product was viewed” to the Streaming Platform, which feeds Elasticsearch, Hadoop, Security and Monitoring.
System Availability and Event Buffering
Producer Elasticsearch
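One way to read this slide is that the log keeps accepting events while Elasticsearch is unavailable, and the sink catches up afterwards. The sketch below illustrates that reading only; `BufferedTopic`, `drain` and the event names are hypothetical and do not correspond to any Kafka API.

```python
class BufferedTopic:
    """Toy buffer between a producer and a sink (illustrative only)."""

    def __init__(self):
        self.events = []     # everything the producer has written
        self.committed = 0   # offset of the next event the sink still needs

    def produce(self, event):
        # The producer is never blocked by the state of the sink.
        self.events.append(event)

    def drain(self, sink_available, index):
        """Deliver pending events via index() if the sink is up; return the count."""
        if not sink_available:
            return 0
        pending = self.events[self.committed:]
        for event in pending:
            index(event)
        self.committed += len(pending)
        return len(pending)


topic = BufferedTopic()
indexed = []
topic.produce("view-1")
topic.produce("view-2")
print(topic.drain(False, indexed.append))  # 0, sink is down and nothing is lost
topic.produce("view-3")
print(topic.drain(True, indexed.append))   # 3, sink is back and catches up
print(indexed)                             # ['view-1', 'view-2', 'view-3']
```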
Native Stream Processing
Diagram labels: Raw; Stream Processing App; SLA breaches; Alert; Serve
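The diagram suggests a stream processing app that turns raw events into a stream of SLA breaches used for alerting. A minimal sketch of that filtering step follows; the event fields, the 500 ms threshold and the function name are assumptions made for illustration, and a real deployment would use a stream processing library rather than a Python generator.

```python
SLA_LIMIT_MS = 500  # hypothetical threshold, not taken from the slides


def sla_breaches(raw_events, limit_ms=SLA_LIMIT_MS):
    """Yield only the events whose response time exceeds the limit."""
    for event in raw_events:
        if event["response_ms"] > limit_ms:
            yield event


raw = [
    {"service": "checkout", "response_ms": 120},
    {"service": "search", "response_ms": 900},
]
alerts = [f"ALERT {e['service']}: {e['response_ms']} ms" for e in sla_breaches(raw)]
print(alerts)  # ['ALERT search: 900 ms']
```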
Visualise & Analyse data from Kafka
16
Integrating Elastic and Kafka
17
Integrating Elastic with Kafka - Beats, Logstash
Beats
output.kafka:
  hosts: ["localhost:9092"]
  topic: 'logs'
  required_acks: 1
Logstash
output {
  kafka {
    topic_id => "logstash_logs_json"
    bootstrap_servers => "localhost:9092"
    codec => json
  }
}
18
19
Kafka Connect
Diagram labels: Kafka Brokers; Kafka Connect (Tasks, Workers); Sources and Sinks, e.g. Amazon S3, syslog, flat file
20
Kafka -> Elasticsearch
21
Kafka Connect's Elasticsearch Sink
{
  "name": "es-sink",
  "config": {
    "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
    "connection.url": "http://localhost:9200",
    "type.name": "kafka-connect",
    "topics": "foobar"
  }
}
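A connector definition like this is typically submitted to a Kafka Connect worker through its REST API with a POST to `/connectors`. The sketch below builds that request in Python without sending it. The worker address `http://localhost:8083` is an assumption (it is the usual default port, but the slides do not state it), `build_request` is a helper invented here, and `type.name` is given as the plain value `kafka-connect`.

```python
import json
import urllib.request

CONNECT_URL = "http://localhost:8083/connectors"  # assumed worker address

connector = {
    "name": "es-sink",
    "config": {
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "connection.url": "http://localhost:9200",
        "type.name": "kafka-connect",
        "topics": "foobar",
    },
}


def build_request(definition, url=CONNECT_URL):
    """Build (but do not send) the POST request that would create the connector."""
    body = json.dumps(definition).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(connector)
print(request.get_method(), request.full_url)

# To submit it for real (requires a running Connect worker):
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Building the request separately from sending it makes the payload easy to inspect before anything touches a live cluster.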
22
Kafka Connect to stream Kafka Topics to Elasticsearch
23
Kafka Connect
Elasticsearch Sink Properties
https://docs.confluent.io/current/connect/connect-elasticsearch/docs/configuration_options.html
24
Sink properties: Converters
• Kafka Connect uses pluggable converters for both message key and value deserialisation
• JSON, Avro, String, Protobuf, etc.
• Specify the converter in the Kafka Connect configuration, e.g.
  key.converter=org.apache.kafka.connect.json.JsonConverter
  value.converter=org.apache.kafka.connect.json.JsonConverter
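The idea of pluggable converters, chosen independently for key and value, can be sketched as a lookup of deserialisation functions. This is a conceptual illustration only: the `CONVERTERS` table and `deserialise` function are hypothetical stand-ins and not the Kafka Connect converter interface.

```python
import json

# Hypothetical stand-ins for converter classes: each turns raw bytes into a value.
CONVERTERS = {
    "json": lambda raw: json.loads(raw.decode("utf-8")),
    "string": lambda raw: raw.decode("utf-8"),
}


def deserialise(raw_key, raw_value, key_converter, value_converter):
    """Apply separately configured converters to the key and the value."""
    return CONVERTERS[key_converter](raw_key), CONVERTERS[value_converter](raw_value)


key, value = deserialise(b"user-42", b'{"c1": 100, "c2": "bar"}', "string", "json")
print(key)    # user-42
print(value)  # {'c1': 100, 'c2': 'bar'}
```

Swapping the format means changing the configured name, not the code that consumes the records.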
25
Schemas & Document Mappings
26
Schemas in Kafka Connect - JSON
{
  "schema": {
    "type": "struct",
    "fields": [
      {"type": "int32", "optional": true, "field": "c1"},
      {"type": "string", "optional": true, "field": "c2"},
      {"type": "int64", "optional": false,
       "name": "org.apache.kafka.connect.data.Timestamp", "field": "create_ts"},
      {"type": "int64", "optional": false,
       "name": "org.apache.kafka.connect.data.Timestamp", "field": "update_ts"}
    ],
    "optional": false,
    "name": "foobar"
  },
  "payload": {
    "c1": 100,
    "c2": "bar",
    "create_ts": 1516747629000,
    "update_ts": 1516747629000
  }
}
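To show how the `schema` and `payload` parts of this envelope relate, the sketch below parses the same message, checks that every non-optional field is present, and decodes the `Timestamp` fields, which carry milliseconds since the Unix epoch. The helper names `missing_required` and `timestamps` are invented for this example; this is not how Kafka Connect itself validates messages.

```python
import json
from datetime import datetime, timezone

message = json.loads("""
{"schema": {"type": "struct", "optional": false, "name": "foobar",
  "fields": [
    {"type": "int32", "optional": true, "field": "c1"},
    {"type": "string", "optional": true, "field": "c2"},
    {"type": "int64", "optional": false,
     "name": "org.apache.kafka.connect.data.Timestamp", "field": "create_ts"},
    {"type": "int64", "optional": false,
     "name": "org.apache.kafka.connect.data.Timestamp", "field": "update_ts"}]},
 "payload": {"c1": 100, "c2": "bar",
             "create_ts": 1516747629000, "update_ts": 1516747629000}}
""")

TIMESTAMP = "org.apache.kafka.connect.data.Timestamp"


def missing_required(msg):
    """Names of non-optional schema fields that are absent or null in the payload."""
    return [f["field"] for f in msg["schema"]["fields"]
            if not f["optional"] and msg["payload"].get(f["field"]) is None]


def timestamps(msg):
    """Decode Timestamp logical-type fields (epoch milliseconds) to UTC datetimes."""
    return {f["field"]: datetime.fromtimestamp(msg["payload"][f["field"]] / 1000,
                                               tz=timezone.utc)
            for f in msg["schema"]["fields"] if f.get("name") == TIMESTAMP}


print(missing_required(message))                     # []
print(timestamps(message)["create_ts"].isoformat())  # 2018-01-23T22:47:09+00:00
```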
27
Kafka Connect + Schema Registry = WIN
Diagram labels: Avro Message; Schema Registry; Avro Schema; Kafka Connect
28
Single Message Transform (SMT) -- Extract, TRANSFORM, Load…
• Modify events before storing in Kafka:
    • Mask/drop sensitive information
    • Set partitioning key
    • Store lineage
    • Cast data types
• Modify events going out of Kafka:
    • Direct events to different Elasticsearch indexes
    • Mask/drop sensitive information
    • Cast data types to match destination
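The kinds of per-message changes listed above (masking, casting, routing to a different index) can be sketched as small functions applied to one record at a time. This only illustrates the idea of a transform chain; the function names, record fields and index-naming rule are assumptions, and real Single Message Transforms are configured on the connector rather than written like this.

```python
def mask_field(record, field):
    """Return a copy of the record with the sensitive field blanked out."""
    return {**record, field: ""}


def cast_field(record, field, to_type):
    """Return a copy of the record with one field cast to the target type."""
    return {**record, field: to_type(record[field])}


def route_index(topic, record, field):
    """Hypothetical routing rule: derive the target index from a record field."""
    return f"{topic}-{record[field]}"


def apply_transforms(record, transforms):
    """Apply each transform in order; every transform sees one message at a time."""
    for transform in transforms:
        record = transform(record)
    return record


event = {"user": "alice", "card": "4111111111111111", "amount": "19.99", "region": "eu"}
chain = [
    lambda r: mask_field(r, "card"),
    lambda r: cast_field(r, "amount", float),
]
out = apply_transforms(event, chain)
print(out)
print(route_index("orders", out, "region"))  # orders-eu
```

Each function returns a new dictionary, so the original event is left untouched and transforms can be reordered freely.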
29
Confluent Platform: Enterprise Streaming based on Apache Kafka®
Diagram labels:
Database Changes | Log Events | IoT Data | Web Events | …
CRM | Data Warehouse | Database | Hadoop | Data Integration | …
Monitoring | Analytics | Custom Apps | Transformations | Real-time Applications | …
Confluent Platform
  Apache Kafka®: Core | Connect API | Streams API
  Data Compatibility: Schema Registry
  Monitoring & Administration: Confluent Control Center | Security
  Operations: Replicator | Auto Data Balancing
  Development and Connectivity: Clients | Connectors | REST Proxy | CLI
  SQL Stream Processing: KSQL
Legend: Apache Open Source | Confluent Open Source | Confluent Enterprise
30
https://www.confluent.io/download/
Streaming ETL, powered by Apache Kafka and Confluent Platform
https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-1/
https://docs.confluent.io/current/connect/connect-elasticsearch/docs/
