TL;DR Kafka Metrics
Gantigmaa Selenge
Overview
[Diagram: a Kafka cluster (brokers and a controller) and a client application (producer and consumer)]
Broker metrics
Alert metrics
UnderMinIsrPartitionCount
kafka.server:type=ReplicaManager,name=UnderMinIsrPartitionCount
UnderReplicatedPartitions
kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions
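A minimal sketch of how the two broker alert gauges above might be evaluated. It assumes the values have already been read from each broker (over JMX or from a scraper) into a plain dictionary. The function name, the dictionary layout, and the short gauge labels are hypothetical; the rule "any value above zero fires" reflects that both gauges normally sit at 0 on a healthy cluster.

```python
from typing import Dict, List

# Hypothetical sample layout: broker id -> {gauge label -> current value}.
Samples = Dict[int, Dict[str, int]]

# Short labels standing in for the two ReplicaManager gauges (assumed names).
ALERT_GAUGES = ("under_min_isr", "under_replicated")


def firing_alerts(samples: Samples) -> List[str]:
    """Return one message for every broker gauge whose value is above zero."""
    alerts = []
    for broker_id in sorted(samples):
        for gauge in ALERT_GAUGES:
            value = samples[broker_id].get(gauge, 0)
            if value > 0:
                alerts.append(f"broker {broker_id}: {gauge}={value}")
    return alerts
```

Calling firing_alerts on a sample where every gauge is 0 returns an empty list, so the result can be used directly as the "should we page someone" signal.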
Cluster Performance
Metrics to monitor
BytesInPerSec | BytesOutPerSec
kafka.server:type=BrokerTopicMetrics,name={BytesInPerSec|BytesOutPerSec}
ReplicationBytesInPerSec | ReplicationBytesOutPerSec
kafka.server:type=BrokerTopicMetrics,name={ReplicationBytesInPerSec|ReplicationBytesOutPerSec}
RequestHandlerAvgIdlePercent
kafka.server:type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent
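A minimal sketch of how an idle-ratio gauge such as RequestHandlerAvgIdlePercent could be turned into a status. It assumes the value has already been sampled as a fraction between 0 and 1. The function name and both thresholds are illustrative assumptions (0.3 is a guideline often quoted for Kafka idle ratios, not a value from the slides).

```python
def idle_status(idle_ratio: float,
                warn_below: float = 0.3,
                critical_below: float = 0.1) -> str:
    """Classify an idle ratio (0 = always busy, 1 = always idle)."""
    if not 0.0 <= idle_ratio <= 1.0:
        raise ValueError(f"idle ratio must be within [0, 1], got {idle_ratio}")
    if idle_ratio < critical_below:
        return "critical"
    if idle_ratio < warn_below:
        return "warning"
    return "ok"
```

The thresholds are parameters rather than constants so they can be tuned per cluster.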
Unbalanced cluster
Metrics to monitor
PartitionCount | LeaderCount
kafka.server:type=ReplicaManager,name={PartitionCount|LeaderCount}
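One way to spot an unbalanced cluster from LeaderCount is to compare each broker's share of partition leaders with an even split. The sketch below assumes the per-broker LeaderCount values have already been collected into a dictionary. The function names and the 10% tolerance are hypothetical choices, not from the slides.

```python
from typing import Dict


def leader_shares(leader_counts: Dict[int, int]) -> Dict[int, float]:
    """Return each broker's fraction of all partition leaders."""
    total = sum(leader_counts.values())
    if total == 0:
        return {broker: 0.0 for broker in leader_counts}
    return {broker: count / total for broker, count in leader_counts.items()}


def is_unbalanced(leader_counts: Dict[int, int], tolerance: float = 0.10) -> bool:
    """True if any broker's share differs from an even split by more than tolerance."""
    if not leader_counts or sum(leader_counts.values()) == 0:
        return False  # nothing to compare yet
    even_share = 1.0 / len(leader_counts)
    shares = leader_shares(leader_counts)
    return any(abs(share - even_share) > tolerance for share in shares.values())
```

The same comparison could be run on PartitionCount to check replica placement rather than leadership.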
Slow network
Metrics to monitor
RequestsPerSec
kafka.network:type=RequestMetrics,name=RequestsPerSec,request={Produce|FetchConsumer|FetchFollower}
NetworkProcessorAvgIdlePercent
kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent
RequestQueueTimeMs | RequestQueueSize
kafka.network:type=RequestMetrics,name=RequestQueueTimeMs,request={Produce|FetchConsumer|FetchFollower}
kafka.network:type=RequestChannel,name=RequestQueueSize
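To make RequestQueueSize easier to read on a dashboard, it can be expressed as a fraction of the queue's capacity. This is only a sketch: the function name is hypothetical, and it assumes the capacity equals the broker's queued.max.requests setting (500 is that setting's documented default, so pass the real value if it has been changed).

```python
def queue_fill(queue_size: int, queue_capacity: int = 500) -> float:
    """Return how full the request queue is, as a fraction capped at 1.0."""
    if queue_capacity <= 0:
        raise ValueError("queue_capacity must be positive")
    if queue_size < 0:
        raise ValueError("queue_size must not be negative")
    return min(queue_size / queue_capacity, 1.0)
```

A fill level that stays close to 1.0 suggests that requests are arriving faster than the request handlers can drain them.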
Controller metrics
Alert metrics
ActiveControllerCount
kafka.controller:type=KafkaController,name=ActiveControllerCount
OfflinePartitionsCount
kafka.controller:type=KafkaController,name=OfflinePartitionsCount
LeaderElectionRateAndTimeMs
kafka.controller:type=ControllerStats,name=LeaderElectionRateAndTimeMs
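ActiveControllerCount is usually checked as a cluster-wide sum: exactly one broker should report 1 and every other broker 0. A minimal sketch of that check, assuming the per-broker values have already been collected; the function name and the status strings are hypothetical.

```python
from typing import Dict


def controller_status(active_counts: Dict[int, int]) -> str:
    """Summarise ActiveControllerCount values reported by each broker."""
    total = sum(active_counts.values())
    if total == 1:
        return "ok"
    if total == 0:
        return "no active controller"
    return "multiple active controllers"
```

Either non-"ok" result is worth an alert if it persists beyond a brief controller failover.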
Client metrics
External monitoring for clusters
Alert metrics
connection-count
kafka.[producer|consumer]:type=[producer|consumer]-metrics,client-id=([-.\w]+)
incoming-byte-rate | outgoing-byte-rate
kafka.[producer|consumer]:type=[producer|consumer]-metrics,client-id=([-.\w]+)
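The client MBean name pattern above can be applied as a regular expression to pull out the client type and client-id. The sketch below takes the client-id group to be `[-.\w]+`. The function name is hypothetical, and the extra check that both "producer|consumer" parts agree is an added assumption.

```python
import re
from typing import Optional, Tuple

# Regex form of the client MBean name pattern; group 3 captures the client-id.
CLIENT_MBEAN = re.compile(
    r"kafka\.(producer|consumer):type=(producer|consumer)-metrics,client-id=([-.\w]+)"
)


def parse_client_mbean(name: str) -> Optional[Tuple[str, str]]:
    """Return (client_type, client_id), or None if the name does not match."""
    match = CLIENT_MBEAN.fullmatch(name)
    if match is None or match.group(1) != match.group(2):
        return None
    return match.group(1), match.group(3)
```

Grouping connection-count and the byte rates by the returned client-id makes it possible to alert per application rather than per JVM.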
Monitoring tools
How does it all fit together?
[Diagram components: client application (producer and consumer JVMs), Kafka cluster (broker, controller), jmx_exporter, Prometheus server, Alert Manager, PagerDuty, Grafana]
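In a pipeline like this, jmx_exporter serves JMX metrics in the Prometheus text exposition format, which Prometheus then scrapes. As a rough sketch of what that text looks like to a consumer, the parser below reads sample lines into a dictionary. It is deliberately simplified (it assumes label values contain no "}" and ignores timestamps), and the function name is hypothetical. Actual metric names depend on the jmx_exporter rules in use.

```python
import re
from typing import Dict

# metric name, optional {labels}, then the sample value.
SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)')


def parse_exposition(text: str) -> Dict[str, float]:
    """Map each series (name plus label block) to its sampled value."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE.match(line)
        if match:
            series = match.group(1) + (match.group(2) or "")
            samples[series] = float(match.group(3))
    return samples
```

In practice Prometheus does this parsing itself; the sketch is only meant to show the shape of the data that flows from jmx_exporter to the Prometheus server.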
https://github.com/tinaselenge
https://www.linkedin.com/in/gselenge
https://developers.redhat.com/topics/kafka-kubernetes
https://kafka.apache.org/documentation/#monitoring
https://strimzi.io/docs/operators/0.36.1/full/overview#metrics-overview_str
https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability
Thank you
Red Hat Sessions
Tuesday
2:00 PM - 2:45 PM, Breakout Room 4
Getting the Balance Right with Kafka Connect
Kate Stanley
5:30 PM - 6:15 PM, Breakout Room 7
Safeguarding Your Kafka Data with Encryption-at-rest
Tom Bentley
Wednesday
2:00 PM - 2:45 PM, Breakout Room 3
Meet the Apache Kafka Committers
Tom Bentley
2:15 PM - 3:00 PM, Meetup Hub
Running Kafka on Kube the Native Way with Operators (plus Kafka Connect book signing)
Kate Stanley

TL;DR Kafka Metrics | Kafka Summit London