Introduction to Apache Kafka - Part 1

This document introduces Apache Kafka, a fast, scalable, and fault-tolerant publish-subscribe messaging system used for building data pipelines. It covers key topics such as Kafka's high-level overview, use cases, partition distribution, and the replication protocol, along with basic operational commands. The document also references various resources for further exploration of Kafka's functionalities.

Topics Covered
➢ What is Kafka
➢ Why Kafka
➢ High level overview
➢ Use cases
➢ Key terminology
➢ Partitions distribution over brokers
➢ Replication protocol
➢ Demo

Building Data Pipelines
This is bad data pipelining

Building Data Pipelines
Kafka decouples data pipelines

Use cases
➢ Messaging
➢ Website Activity Tracking
➢ Metrics
➢ Log Aggregation
➢ Real-Time Stream Processing
➢ Event Sourcing
➢ Commit Log
➢ Internet Of Things (IoT)

Anatomy of a Topic
For each topic, the Kafka cluster maintains a partitioned log that looks like this:
http://kafka.apache.org/images/log_anatomy.png
The number of partitions for a topic is configurable. In this example, the topic has three partitions.
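
To make the partitioned log concrete, here is a minimal sketch in shell. It is a toy model, not Kafka's implementation: each partition is a plain file, and a record's offset is simply its line position within that file. The names partition_for, produce, NUM_PARTITIONS and LOG_DIR are made up for this illustration. Kafka's default partitioner hashes the key bytes with murmur2; cksum is used here only as a stand-in hash.

```shell
#!/bin/sh
# Toy model of a partitioned log. NOT Kafka's implementation.
NUM_PARTITIONS=3          # matches the three-partition example above
LOG_DIR=$(mktemp -d)      # hypothetical storage: one file per partition

# Map a key to a partition number in [0, NUM_PARTITIONS).
# Assumption: cksum (CRC) stands in for Kafka's murmur2-based hashing.
partition_for() {
  h=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  echo $(( h % NUM_PARTITIONS ))
}

# Append a record to its partition and print "<partition> <offset>".
# The offset counts records within that partition only.
produce() {
  key=$1
  value=$2
  p=$(partition_for "$key")
  file="$LOG_DIR/partition-$p.log"
  touch "$file"
  offset=$(wc -l < "$file" | tr -d ' ')
  printf '%s\t%s\n' "$key" "$value" >> "$file"
  echo "$p $offset"
}
```

Calling produce user-1 login and then produce user-1 logout lands both records in the same partition, at offsets 0 and 1, because the same key always hashes to the same partition. Records with other keys may go to other partitions, each of which keeps its own independent offset sequence.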

Reading & Writing From Topic
Topic with two partitions:
https://content.linkedin.com/content/dam/engineering/en-us/blog/migrated/partitioned_log_0.png

Partitions Distribution
Who is responsible for these tasks?

Responsibility Of Controller
● managing the states of partitions and replicas
● performing administrative tasks like reassigning partitions

Roles For Partition
➢ Each partition has one server which acts as the "leader" and zero or more servers which act as "followers".
➢ The leader handles all read and write requests for the partition while the followers passively replicate the leader.
➢ If the leader fails, one of the followers will automatically become the new leader.
➢ Each server acts as a leader for some of its partitions and a follower for others, so load is well balanced within the cluster.
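
The leader and follower roles can be sketched with a toy shell model of a single partition. This is an illustration under stated assumptions, not Kafka's replication protocol: the broker ids, the function names fail_broker and followers, and the "first surviving replica becomes leader" rule are all made up here. In Kafka itself, the controller normally picks the new leader from the replicas that are in sync with the old one.

```shell
#!/bin/sh
# Toy model of one partition's replica set. NOT Kafka's actual protocol.
REPLICAS="1 2 3"   # hypothetical broker ids that host this partition
LEADER=1           # broker currently serving reads and writes

# Remove a failed broker from the replica set. If it was the leader,
# promote the first surviving replica (a simplifying assumption).
fail_broker() {
  failed=$1
  remaining=""
  for b in $REPLICAS; do
    [ "$b" = "$failed" ] || remaining="$remaining $b"
  done
  REPLICAS=${remaining# }
  if [ "$failed" = "$LEADER" ]; then
    LEADER=${REPLICAS%% *}
  fi
}

# Print the brokers that are currently followers for this partition.
followers() {
  out=""
  for b in $REPLICAS; do
    [ "$b" = "$LEADER" ] || out="$out $b"
  done
  echo "${out# }"
}
```

Starting from this state, followers prints 2 3. After fail_broker 1, the leader becomes broker 2 and followers prints 3. Failing a follower instead leaves the leader unchanged.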

Basic Operations
● List all topics created:
bin/kafka-topics.sh --list --zookeeper localhost:2181
● Describe a topic:
bin/kafka-topics.sh --zookeeper localhost:2181 --topic topic-name --describe

Basic Operations
Adding a topic:
$ bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my_topic_name
Modifying a topic:
$ bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic my_topic_name --partitions 4
Deleting a topic:
$ bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic my_topic_name

Basic Operations
Balancing Leadership:
$ bin/kafka-preferred-replica-election.sh --zookeeper localhost:2181
Or configure Kafka to do this automatically by setting the following configuration:
auto.leader.rebalance.enable=true
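
The commands above use the --zookeeper flag, which matches the Kafka releases these slides were written against. Newer releases of kafka-topics.sh accept --bootstrap-server instead (Kafka 3.0 removed --zookeeper from that tool), and preferred-leader election is done with kafka-leader-election.sh. The sketch below wraps the newer forms in small helper functions. The broker address localhost:9092, the DRY_RUN switch, and the function names are assumptions for illustration. Check the flags against the documentation for your Kafka version before relying on them.

```shell
#!/bin/sh
# Assumptions (not from the slides): a broker reachable at localhost:9092
# and a Kafka release whose tools accept --bootstrap-server.
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = also execute it

# Print the command line; execute it only when DRY_RUN=0.
run() {
  echo "$*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

list_topics() {
  run bin/kafka-topics.sh --bootstrap-server "$BOOTSTRAP" --list
}

# Argument: topic name.
describe_topic() {
  run bin/kafka-topics.sh --bootstrap-server "$BOOTSTRAP" --describe --topic "$1"
}

# Arguments: topic name, partition count, replication factor.
create_topic() {
  run bin/kafka-topics.sh --bootstrap-server "$BOOTSTRAP" --create \
    --topic "$1" --partitions "$2" --replication-factor "$3"
}

# Trigger a preferred-leader election for every partition.
elect_preferred_leaders() {
  run bin/kafka-leader-election.sh --bootstrap-server "$BOOTSTRAP" \
    --election-type preferred --all-topic-partitions
}
```

With the default DRY_RUN=1, calling create_topic my_topic_name 1 3 only prints the command it would run, which makes it easy to review before executing against a real cluster with DRY_RUN=0.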

References
● http://kafka.apache.org/documentation.html
● https://engineering.linkedin.com/kafka/benchmarking-apache-k
● http://www.confluent.io/blog/tutorial-getting-started-with-the-new
● http://kafka-summit.org
● http://www.confluent.io/blog/hands-free-kafka-replication-a-less

Thanks
Presenters:
@_himaniarora
@_satendrakumar
Organizer:
@knolspeak
http://www.knoldus.com
