SlideShare a Scribd company logo
1
Stream Processing with
Apache KafkaTM and .NET
Matt Howlett
Confluent Inc.
2
Agenda
Some Typical Use Cases
Technical Overview
[break]
Live Demo in C#
[let’s build a massively scalable web crawler… in 30 minutes]
3
Typical Use Cases
4
• Application Logs
{
“log_level”: 7,
“time”: “2017-03-03 11:45:05.737”,
“consumer-id”: “rdkafka#consumer-1”,
“method”: “RECV”,
“addr”: “10.0.0.14:9092/0”,
”message”: “Received HeartbeatResponse (v0, 2 bytes, CorrId 8, rrt 0.00ms)
}
Analytics
• Click / Meta Event Data
{
“ip”: “192.168.0.43”,
“time”: “2017-03-03 11:45:05.737”,
“user_id”: 7423653,
”product_id”: 62345334,
“page”: “product.detail”,
“data”: “32da—bfe89-116ac”
}
5
192.168.1.13 - - [23/Aug/2010:03:50:59 +0000] "POST /wordpress3/wp-admin/admin-ajax.php HTTP/1.1" 200 2
"http://www.example.com/wordpress3/wp-admin/post-new.php" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; en-US)
AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.472.25 Safari/534.3"
• Web Server Logs
Stack Trace:
at Confluent.Kafka.IntegrationTests.Tests.ConsumeMessage(Consumer consumer, Message`2 dr, String testString) in
/git/confluent-kafka-dotnet/test/Confluent.Kafka.IntegrationTests/Tests/SimpleProduceConsume.cs:line 72
at Confluent.Kafka.IntegrationTests.Tests.SimpleProduceConsume(String bootstrapServers, String topic, String
partitionedTopic) in /git/confluent-kafka-dotnet/test/Confluent.Kafka.IntegrationTests/Tests/SimpleProduceConsume.cs:line 65
• Stack Traces
6
Log Analytics v1.0
Log
files
ETL
tool
7
Potential Problems
- Spikes in usage
- Real world applications often have non-uniform usage patterns
- Want to avoid huge over-provisioning
- Upgrades / outages
- What if you want to do something else with the data?
- What if you want to adopt something other than elastic search?
Missed Opportunities
8
Log Analytics v2
Kafka
connect
Kafka
Kafka
connect
Log
files
9
+ Alerting + Fraud/Spam Detection
Kafka
Connect
Kafka Kafka
Connect
Log
files
User
Info
IP
Addr.
Info fraud detection
stream processor
alerting
10
kafka
DWH
search stream processingapps
K/V monitoring real-time analytics Hadoop
rdbms
Before you know it:
11
• Central to architecture at many
companies
• Across industries
12
Technical Overview
13
14
● Persisted
● Append only
● Immutable
● Delete earliest data based on time / size / never
15
• Allows topics to scale past
constraints of single server
• Message → partition_id
deterministic. Partitioning
relevant to application.
• Ordering guarantees per
partition but not across
partitions
16
Apache Kafka Replication
• cheap durability!
• choose # acks for
message produced
confirmation
17
Apache Kafka Consumer Groups
Partitions are spread across brokers
18
19
Discount code: kafcom17
Use the Apache Kafka community discount code to get $50 off
www.kafka-summit.org
Kafka Summit New York: May 8
Kafka Summit San Francisco: August 28
Presented by
20
Live Demo
21
Basic Operation
Links
https://www.confluent.io/download/
https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
https://github.com/mhowlett/south-bay-dotnet
Starting
./bin/zookeeper-server-start ./etc/kafka/zookeeper.properties
./bin/kafka-server-start ./etc/kafka/server.properties
Create Topics
./bin/kafka-topics –zookeeper localhost:2181 --create --topic url-queue --partitions 12 --replication-factor 1
./bin/kafka-topics –zookeeper localhost:2181 --create --topic pages --partitions 12 --replication-factor 1
List High Watermark Offsets
./bin/kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic pages --time -1
22
Server parameters you’re likely to want to tweak
dataDir=<data dir> # location of database snapshots
autopurge.purgeInterval=12 # time interval in hours for which purge task triggered (default: no purge)
Kafka
Zookeeper
Low Memory
log.dir=<data dir> # location of kafka log data
auto.create.topics.enable=false # whether or not topics are auto-create when referenced if don’t exist
delete.topic.enable=true # topics cannot be deleted unless this is set
log.retention.hours=1000000 # ~infinite retention
log.cleaner.dedupe.buffer.size=20000000 # pre-allocated compaction buffer size (bytes)
KAFKA_HEAP_OPTS="-Xmx128M -Xms128M” ./bin/kafka-server-start server.properties
KAFKA_HEAP_OPTS="-Xmx64M –Xms64M” ./bin/zookeeper-server-start zookeeper.properties
23
Thank You
@matt_howlett
@confluentinc

More Related Content

What's hot (20)

PPTX
Apache Kafka
emreakis
 
PDF
Kafka Connect and Streams (Concepts, Architecture, Features)
Kai Wähner
 
PDF
From Zero to Hero with Kafka Connect
confluent
 
PPTX
Introduction to Apache Kafka
Jeff Holoman
 
PDF
Apache Kafka Architecture & Fundamentals Explained
confluent
 
PDF
KSQL Intro
confluent
 
PDF
ksqlDB: A Stream-Relational Database System
confluent
 
PDF
What is Apache Kafka and What is an Event Streaming Platform?
confluent
 
PPTX
Envoy and Kafka
Adam Kotwasinski
 
PDF
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Kai Wähner
 
PPTX
Nginx Reverse Proxy with Kafka.pptx
wonyong hwang
 
PDF
ksqlDB: Building Consciousness on Real Time Events
confluent
 
PDF
Flink powered stream processing platform at Pinterest
Flink Forward
 
PDF
An Introduction to Apache Kafka
Amir Sedighi
 
PDF
Apache Kafka in the Airline, Aviation and Travel Industry
Kai Wähner
 
PDF
IBM API Connect - overview
Ramy Bassem
 
PPTX
Spring Boot+Kafka: the New Enterprise Platform
VMware Tanzu
 
PDF
Apache Kafka - Martin Podval
Martin Podval
 
PPTX
Apache Kafka at LinkedIn
Discover Pinterest
 
PPTX
Apache Kafka Best Practices
DataWorks Summit/Hadoop Summit
 
Apache Kafka
emreakis
 
Kafka Connect and Streams (Concepts, Architecture, Features)
Kai Wähner
 
From Zero to Hero with Kafka Connect
confluent
 
Introduction to Apache Kafka
Jeff Holoman
 
Apache Kafka Architecture & Fundamentals Explained
confluent
 
KSQL Intro
confluent
 
ksqlDB: A Stream-Relational Database System
confluent
 
What is Apache Kafka and What is an Event Streaming Platform?
confluent
 
Envoy and Kafka
Adam Kotwasinski
 
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Kai Wähner
 
Nginx Reverse Proxy with Kafka.pptx
wonyong hwang
 
ksqlDB: Building Consciousness on Real Time Events
confluent
 
Flink powered stream processing platform at Pinterest
Flink Forward
 
An Introduction to Apache Kafka
Amir Sedighi
 
Apache Kafka in the Airline, Aviation and Travel Industry
Kai Wähner
 
IBM API Connect - overview
Ramy Bassem
 
Spring Boot+Kafka: the New Enterprise Platform
VMware Tanzu
 
Apache Kafka - Martin Podval
Martin Podval
 
Apache Kafka at LinkedIn
Discover Pinterest
 
Apache Kafka Best Practices
DataWorks Summit/Hadoop Summit
 

Similar to Stream Processing with Apache Kafka and .NET (20)

PDF
Developing Realtime Data Pipelines With Apache Kafka
Joe Stein
 
PDF
Connect K of SMACK:pykafka, kafka-python or?
Micron Technology
 
PPTX
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
DataWorks Summit/Hadoop Summit
 
PPTX
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Joe Stein
 
PDF
Introduction to Apache NiFi 1.11.4
Timothy Spann
 
PDF
Big Data Open Source Security LLC: Realtime log analysis with Mesos, Docker, ...
DataStax Academy
 
PDF
Introduction to apache kafka
Samuel Kerrien
 
PPTX
Apache Kafka
Joe Stein
 
PPTX
Building Event-Driven Systems with Apache Kafka
Brian Ritchie
 
PPTX
Deploying windows containers with kubernetes
Ben Hall
 
PDF
JConWorld_ Continuous SQL with Kafka and Flink
Timothy Spann
 
PDF
Introduction to apache kafka, confluent and why they matter
Paolo Castagna
 
PPTX
Apache Kafka with Spark Streaming: Real-time Analytics Redefined
Edureka!
 
PPTX
DEVNET-1140 InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...
Cisco DevNet
 
PPTX
Mcas log collector deck
Matt Soseman
 
PDF
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Guido Schmutz
 
PPTX
Colorado OpenStack 5th Birthday Monasca Operations
dlfryar
 
PDF
Apache Kafka - Scalable Message-Processing and more !
Guido Schmutz
 
PPTX
Teaching Apache Spark: Demonstrations on the Databricks Cloud Platform
Yao Yao
 
PPTX
Typesafe spark- Zalando meetup
Stavros Kontopoulos
 
Developing Realtime Data Pipelines With Apache Kafka
Joe Stein
 
Connect K of SMACK:pykafka, kafka-python or?
Micron Technology
 
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
DataWorks Summit/Hadoop Summit
 
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Joe Stein
 
Introduction to Apache NiFi 1.11.4
Timothy Spann
 
Big Data Open Source Security LLC: Realtime log analysis with Mesos, Docker, ...
DataStax Academy
 
Introduction to apache kafka
Samuel Kerrien
 
Apache Kafka
Joe Stein
 
Building Event-Driven Systems with Apache Kafka
Brian Ritchie
 
Deploying windows containers with kubernetes
Ben Hall
 
JConWorld_ Continuous SQL with Kafka and Flink
Timothy Spann
 
Introduction to apache kafka, confluent and why they matter
Paolo Castagna
 
Apache Kafka with Spark Streaming: Real-time Analytics Redefined
Edureka!
 
DEVNET-1140 InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...
Cisco DevNet
 
Mcas log collector deck
Matt Soseman
 
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Guido Schmutz
 
Colorado OpenStack 5th Birthday Monasca Operations
dlfryar
 
Apache Kafka - Scalable Message-Processing and more !
Guido Schmutz
 
Teaching Apache Spark: Demonstrations on the Databricks Cloud Platform
Yao Yao
 
Typesafe spark- Zalando meetup
Stavros Kontopoulos
 
Ad

More from confluent (20)

PDF
Stream Processing Handson Workshop - Flink SQL Hands-on Workshop (Korean)
confluent
 
PPTX
Webinar Think Right - Shift Left - 19-03-2025.pptx
confluent
 
PDF
Migration, backup and restore made easy using Kannika
confluent
 
PDF
Five Things You Need to Know About Data Streaming in 2025
confluent
 
PDF
Data in Motion Tour Seoul 2024 - Keynote
confluent
 
PDF
Data in Motion Tour Seoul 2024 - Roadmap Demo
confluent
 
PDF
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
confluent
 
PDF
Confluent per il settore FSI: Accelerare l'Innovazione con il Data Streaming...
confluent
 
PDF
Data in Motion Tour 2024 Riyadh, Saudi Arabia
confluent
 
PDF
Build a Real-Time Decision Support Application for Financial Market Traders w...
confluent
 
PDF
Strumenti e Strategie di Stream Governance con Confluent Platform
confluent
 
PDF
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
confluent
 
PDF
Building Real-Time Gen AI Applications with SingleStore and Confluent
confluent
 
PDF
Unlocking value with event-driven architecture by Confluent
confluent
 
PDF
Il Data Streaming per un’AI real-time di nuova generazione
confluent
 
PDF
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
confluent
 
PDF
Break data silos with real-time connectivity using Confluent Cloud Connectors
confluent
 
PDF
Building API data products on top of your real-time data infrastructure
confluent
 
PDF
Speed Wins: From Kafka to APIs in Minutes
confluent
 
PDF
Evolving Data Governance for the Real-time Streaming and AI Era
confluent
 
Stream Processing Handson Workshop - Flink SQL Hands-on Workshop (Korean)
confluent
 
Webinar Think Right - Shift Left - 19-03-2025.pptx
confluent
 
Migration, backup and restore made easy using Kannika
confluent
 
Five Things You Need to Know About Data Streaming in 2025
confluent
 
Data in Motion Tour Seoul 2024 - Keynote
confluent
 
Data in Motion Tour Seoul 2024 - Roadmap Demo
confluent
 
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue...
confluent
 
Confluent per il settore FSI: Accelerare l'Innovazione con il Data Streaming...
confluent
 
Data in Motion Tour 2024 Riyadh, Saudi Arabia
confluent
 
Build a Real-Time Decision Support Application for Financial Market Traders w...
confluent
 
Strumenti e Strategie di Stream Governance con Confluent Platform
confluent
 
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks
confluent
 
Building Real-Time Gen AI Applications with SingleStore and Confluent
confluent
 
Unlocking value with event-driven architecture by Confluent
confluent
 
Il Data Streaming per un’AI real-time di nuova generazione
confluent
 
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
confluent
 
Break data silos with real-time connectivity using Confluent Cloud Connectors
confluent
 
Building API data products on top of your real-time data infrastructure
confluent
 
Speed Wins: From Kafka to APIs in Minutes
confluent
 
Evolving Data Governance for the Real-time Streaming and AI Era
confluent
 
Ad

Recently uploaded (20)

PPT
Ericsson LTE presentation SEMINAR 2010.ppt
npat3
 
PPTX
Agentforce World Tour Toronto '25 - MCP with MuleSoft
Alexandra N. Martinez
 
PDF
Go Concurrency Real-World Patterns, Pitfalls, and Playground Battles.pdf
Emily Achieng
 
PDF
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
PPTX
Future Tech Innovations 2025 – A TechLists Insight
TechLists
 
PPTX
The Project Compass - GDG on Campus MSIT
dscmsitkol
 
PDF
SIZING YOUR AIR CONDITIONER---A PRACTICAL GUIDE.pdf
Muhammad Rizwan Akram
 
PDF
Transcript: Book industry state of the nation 2025 - Tech Forum 2025
BookNet Canada
 
PDF
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
PPTX
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
PDF
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
PDF
[Newgen] NewgenONE Marvin Brochure 1.pdf
darshakparmar
 
PPTX
New ThousandEyes Product Innovations: Cisco Live June 2025
ThousandEyes
 
PPTX
From Sci-Fi to Reality: Exploring AI Evolution
Svetlana Meissner
 
PDF
Staying Human in a Machine- Accelerated World
Catalin Jora
 
DOCX
Cryptography Quiz: test your knowledge of this important security concept.
Rajni Bhardwaj Grover
 
PDF
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
PDF
Future-Proof or Fall Behind? 10 Tech Trends You Can’t Afford to Ignore in 2025
DIGITALCONFEX
 
PPTX
Agentforce World Tour Toronto '25 - Supercharge MuleSoft Development with Mod...
Alexandra N. Martinez
 
PDF
UiPath DevConnect 2025: Agentic Automation Community User Group Meeting
DianaGray10
 
Ericsson LTE presentation SEMINAR 2010.ppt
npat3
 
Agentforce World Tour Toronto '25 - MCP with MuleSoft
Alexandra N. Martinez
 
Go Concurrency Real-World Patterns, Pitfalls, and Playground Battles.pdf
Emily Achieng
 
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
Future Tech Innovations 2025 – A TechLists Insight
TechLists
 
The Project Compass - GDG on Campus MSIT
dscmsitkol
 
SIZING YOUR AIR CONDITIONER---A PRACTICAL GUIDE.pdf
Muhammad Rizwan Akram
 
Transcript: Book industry state of the nation 2025 - Tech Forum 2025
BookNet Canada
 
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
[Newgen] NewgenONE Marvin Brochure 1.pdf
darshakparmar
 
New ThousandEyes Product Innovations: Cisco Live June 2025
ThousandEyes
 
From Sci-Fi to Reality: Exploring AI Evolution
Svetlana Meissner
 
Staying Human in a Machine- Accelerated World
Catalin Jora
 
Cryptography Quiz: test your knowledge of this important security concept.
Rajni Bhardwaj Grover
 
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
Future-Proof or Fall Behind? 10 Tech Trends You Can’t Afford to Ignore in 2025
DIGITALCONFEX
 
Agentforce World Tour Toronto '25 - Supercharge MuleSoft Development with Mod...
Alexandra N. Martinez
 
UiPath DevConnect 2025: Agentic Automation Community User Group Meeting
DianaGray10
 

Stream Processing with Apache Kafka and .NET