Apache Kafka is a fast, scalable, durable and
distributed messaging system.

Prerequisites & Installation

We need basic Java programming skills plus access to:
- Apache Kafka 0.9.0
- Apache Maven 3.0 or later
- Git

Step 1: Download Kafka
Download the Apache Kafka 0.9.0 release and un-tar it.

Kafka is designed for distributed, high-throughput systems. It tends
to work very well as a replacement for a more traditional message
broker. Compared with other messaging systems, Kafka offers better
throughput, built-in partitioning, replication and inherent fault
tolerance, which makes it a good fit for large-scale message
processing applications.


What is a Messaging System?

A messaging system is responsible for transferring data from one
application to another, so the applications can focus on the data
rather than on how to share it. Distributed messaging is based on
the concept of reliable message queuing: messages are queued
asynchronously between client applications and the messaging system.
Two messaging patterns are available: point-to-point and
publish-subscribe (pub-sub). Most messaging systems follow the
pub-sub pattern.

Point-to-Point Messaging System

In a point-to-point system, messages are persisted in a queue. One
or more consumers can read from the queue, but each message is
consumed by at most one consumer. Once a consumer reads a message,
it disappears from the queue. A typical example is an order
processing system, where each order is processed by exactly one
order processor, but multiple order processors can work at the same
time. The following diagram depicts the structure.

[Diagram: Sender -> Message Queue -> Receiver]
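
The point-to-point behaviour described above can be sketched with the Java standard library alone. This is not Kafka code: a LinkedBlockingQueue stands in for the message queue, and the names PointToPointSketch and drain are made up for illustration. The sketch only shows that a message taken by one consumer is no longer visible to the other.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: an in-memory queue standing in for a point-to-point broker.
final class PointToPointSketch {

    // Alternates between two consumers until the queue is empty.
    // Returns the messages each consumer received, in order.
    static List<List<String>> drain(BlockingQueue<String> queue) {
        List<String> consumerA = new ArrayList<>();
        List<String> consumerB = new ArrayList<>();
        boolean turnA = true;
        String message;
        // poll() removes the head of the queue, so no other consumer sees it again.
        while ((message = queue.poll()) != null) {
            (turnA ? consumerA : consumerB).add(message);
            turnA = !turnA;
        }
        return Arrays.asList(consumerA, consumerB);
    }

    public static void main(String[] args) {
        BlockingQueue<String> orders = new LinkedBlockingQueue<>();
        orders.add("order-1");
        orders.add("order-2");
        orders.add("order-3");

        List<List<String>> received = drain(orders);
        System.out.println("Processor A: " + received.get(0)); // [order-1, order-3]
        System.out.println("Processor B: " + received.get(1)); // [order-2]
        System.out.println("Left in queue: " + orders.size()); // 0
    }
}
```

Each order ends up with exactly one processor, which matches the order processing example.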

Publish-Subscribe Messaging System

In the publish-subscribe system, messages are persisted in a topic.
Unlike in a point-to-point system, consumers can subscribe to one or
more topics and consume all the messages in those topics. Message
producers are called publishers and message consumers are called
subscribers. A real-life example is Dish TV, which publishes
different channels such as sports, movies and music; anyone can
subscribe to their own set of channels and receive them whenever
those channels are available.

[Diagram: Sender -> Message Queue -> Receiver, Receiver, Receiver]
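
For contrast with the queue model, here is a minimal publish-subscribe sketch, again using only the Java standard library. The Topic class and its subscribe and publish methods are hypothetical names for this illustration. A real Kafka topic works differently in detail (subscribers pull from a retained log rather than having messages pushed to them), so treat this purely as a picture of the delivery semantics.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: an in-memory topic standing in for a pub-sub broker.
final class PubSubSketch {

    // A hypothetical topic that delivers every published message to every subscriber.
    static final class Topic {
        private final List<List<String>> subscriberInboxes = new ArrayList<>();

        // Registers a subscriber and returns the inbox its messages are delivered to.
        List<String> subscribe() {
            List<String> inbox = new ArrayList<>();
            subscriberInboxes.add(inbox);
            return inbox;
        }

        // Unlike a queue, publishing copies the message to all subscribers.
        void publish(String message) {
            for (List<String> inbox : subscriberInboxes) {
                inbox.add(message);
            }
        }
    }

    public static void main(String[] args) {
        Topic sports = new Topic();
        List<String> viewer1 = sports.subscribe();
        List<String> viewer2 = sports.subscribe();

        sports.publish("kick-off");
        sports.publish("full-time");

        System.out.println("Viewer 1: " + viewer1); // [kick-off, full-time]
        System.out.println("Viewer 2: " + viewer2); // [kick-off, full-time]
    }
}
```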

Benefits

Following are a few benefits of Kafka:
- Reliability: Kafka is distributed, partitioned, replicated and
  fault tolerant.
- Scalability: the Kafka messaging system scales easily without
  downtime.
- Durability: Kafka uses a distributed commit log, which means
  messages are persisted on disk as fast as possible; hence it is
  durable.
- Performance: Kafka has high throughput for both publishing and
  subscribing. It maintains stable performance even when many
  terabytes of messages are stored.

Kafka is very fast and, with replication configured appropriately,
is designed to avoid downtime and data loss.

Need for Kafka

Kafka is a unified platform for handling all real-time data feeds.
It supports low-latency message delivery and provides fault
tolerance in the presence of machine failures. It can handle a large
number of diverse consumers. Kafka is very fast: benchmarks have
reported about 2 million writes per second. Kafka persists all data
to disk, which in practice means that writes first go to the OS page
cache (RAM) and are then flushed to disk. This makes it very
efficient to transfer data from the page cache to a network socket.
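
The page-cache-to-socket transfer mentioned above is exposed in Java through FileChannel.transferTo, which lets the operating system move file bytes to another channel without first copying them into application buffers. The sketch below is an illustration under assumptions, not Kafka's broker code: the class and method names are hypothetical, and a ByteArrayOutputStream stands in for the network socket, so no real zero-copy path is exercised here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative only: sending a log file to a channel with FileChannel.transferTo.
final class TransferSketch {

    // Sends the whole file to the target channel and returns the number of bytes sent.
    static long sendFile(Path file, WritableByteChannel target) throws IOException {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = source.size();
            long position = 0;
            // transferTo may move fewer bytes than requested, so loop until done.
            while (position < size) {
                position += source.transferTo(position, size - position, target);
            }
            return position;
        }
    }

    // Writes data to a temporary file, sends it through sendFile, and returns what arrived.
    static byte[] roundTrip(byte[] data) {
        try {
            Path log = Files.createTempFile("segment", ".log");
            try {
                Files.write(log, data);
                // A byte array stands in for the network socket in this sketch.
                ByteArrayOutputStream sink = new ByteArrayOutputStream();
                sendFile(log, Channels.newChannel(sink));
                return sink.toByteArray();
            } finally {
                Files.deleteIfExists(log);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] received = roundTrip("message-1\nmessage-2\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("Bytes received: " + received.length);
    }
}
```

With a SocketChannel as the target, the same sendFile method would let the operating system choose the most direct transfer path it supports.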

Kafka main terminologies

Kafka's main terms include topics, brokers, producers and consumers.
- Topics: A stream of messages belonging to a particular category is
  called a topic. Data is stored in topics.
- Partition: A topic may have many partitions, so it can handle an
  arbitrary amount of data.
- Partition offset: Each message within a partition has a unique
  sequence ID called its offset.
- Replicas of a partition: Replicas are backups of a partition.
  Clients never read from or write to replicas directly; replicas
  exist to prevent data loss.
- Brokers: Brokers are simple systems responsible for maintaining
  the published data. Each broker may have zero or more partitions
  per topic. For example, if a topic has N partitions and there are
  N brokers, each broker will have one partition.
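
The relationship between partitions, offsets and brokers can be made concrete with a small in-memory model. Everything here is hypothetical (the Partition class, append, read and brokerFor are illustration names), and the round-robin placement is an assumption chosen to match the N-partitions-on-N-brokers example; a real cluster's partition assignment is more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a toy model of partitions, offsets and broker placement.
final class PartitionSketch {

    // A hypothetical in-memory partition: an append-only list whose index is the offset.
    static final class Partition {
        private final List<String> log = new ArrayList<>();

        // Appends a message and returns the offset assigned to it.
        long append(String message) {
            log.add(message);
            return log.size() - 1;
        }

        // Reads the message stored at the given offset.
        String read(long offset) {
            return log.get((int) offset);
        }
    }

    // Spreads partitions over brokers round-robin; with N partitions and N brokers,
    // each broker ends up with exactly one partition.
    static int brokerFor(int partition, int brokerCount) {
        return partition % brokerCount;
    }

    public static void main(String[] args) {
        Partition p0 = new Partition();
        System.out.println(p0.append("first"));  // 0
        System.out.println(p0.append("second")); // 1
        System.out.println(p0.read(1));          // second

        for (int partition = 0; partition < 3; partition++) {
            System.out.println("partition " + partition + " -> broker " + brokerFor(partition, 3));
        }
    }
}
```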

Producer

The sample producer is a classic Java application with a main()
method. This application must:
- Initialize and configure a producer
- Use the producer to send messages

1- Producer Initialization
Creating a producer is quite simple: you just need to create an
instance of the org.apache.kafka.clients.producer.KafkaProducer
class with a set of properties. This looks like:

    producer = new KafkaProducer(properties);

In this example, the configuration is externalized in a property
file, with the following entries:
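
As a rough idea of what such a configuration can contain, the sketch below builds a java.util.Properties object with a few commonly used producer settings. These are not the entries from the original property file: the broker address localhost:9092, the choice of string serializers and the acks value are all assumptions made for illustration, and the class name ProducerConfigSketch is hypothetical.

```java
import java.util.Properties;

// Illustrative only: a typical minimal producer configuration, built in code.
final class ProducerConfigSketch {

    // Returns example settings; every value here is an assumption, not the original file.
    static Properties producerProperties() {
        Properties props = new Properties();
        // Address of at least one broker to bootstrap from (assumed local setup).
        props.setProperty("bootstrap.servers", "localhost:9092");
        // Serializers turn keys and values into bytes before they are sent.
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Wait for all in-sync replicas to acknowledge each write.
        props.setProperty("acks", "all");
        return props;
    }

    public static void main(String[] args) {
        // With the Kafka client library on the classpath, these properties would be
        // passed to the constructor: new KafkaProducer<>(producerProperties())
        producerProperties().list(System.out);
    }
}
```

Keeping the same keys in a property file and loading them with Properties.load gives the externalized form described above.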


2- Message posting

Once you have a producer instance, you can post messages to a topic
using the ProducerRecord class. In the two-argument form used here,
a ProducerRecord takes:
- the topic name
- the message value

As you can guess, sending a message to the topic is straightforward:

    ...
    producer.send(new ProducerRecord("fast-messages", "This is a dummy message"));
    ...

Producer End

Once you are done with the producer, call producer.close(), which
blocks until all the messages have been sent to the server. This
call is placed in a finally block to guarantee that it is called. A
Kafka producer can also be used in a try-with-resources construct.

    ...
    } finally {
        producer.close();
    }
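
The try-with-resources alternative can be sketched without a running broker by using a stand-in class. FakeProducer below is hypothetical and only records what happens to it; the point is that close() runs automatically when the try block exits, whether or not sending failed.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: try-with-resources with a stand-in for a real producer.
final class CloseSketch {

    // Hypothetical stand-in for a producer; it only records what happens to it.
    static final class FakeProducer implements AutoCloseable {
        final List<String> events = new ArrayList<>();

        void send(String message) {
            if (message.isEmpty()) {
                throw new IllegalArgumentException("empty message");
            }
            events.add("sent:" + message);
        }

        @Override
        public void close() {
            events.add("closed");
        }
    }

    // Sends all messages; close() runs when the try block exits, even on failure.
    static List<String> sendAll(FakeProducer producer, String... messages) {
        try (FakeProducer p = producer) {
            for (String message : messages) {
                p.send(message);
            }
        } catch (IllegalArgumentException e) {
            // The resource has already been closed by the time this handler runs.
            producer.events.add("failed");
        }
        return producer.events;
    }

    public static void main(String[] args) {
        System.out.println(sendAll(new FakeProducer(), "a", "b")); // [sent:a, sent:b, closed]
        System.out.println(sendAll(new FakeProducer(), "a", ""));  // [sent:a, closed, failed]
    }
}
```

The second run shows the ordering: the resource is closed before the catch block executes, which is the same guarantee the finally block provides by hand.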

Consumer

The Consumer class, like the producer, is a simple Java class with a
main method.
This sample consumer uses the HdrHistogram library to record and
analyze the messages received from the fast-messages topic, and
Jackson to parse JSON messages.

ZooKeeper

ZooKeeper is used for managing and coordinating Kafka brokers. The
ZooKeeper service is mainly used to notify producers and consumers
about the presence of a new broker in the Kafka system or the
failure of a broker. Based on the notifications received from
ZooKeeper about the presence or failure of a broker, producers and
consumers decide how to proceed and start coordinating their work
with another broker.

Kafka stores basic metadata in ZooKeeper, such as information about
topics, brokers, consumer offsets (queue readers) and so on.
