From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 Hours to 30 Minutes
Kafka Summit London 2024
Rafael Natali
/rafaelnatali
@rafaelmnatali
marionete.co.uk
PROBLEMS
INEFFICIENT
SLOW
HOPELESS
FAULTY
SLUGGISH
UNWORKABLE
USELESS
INVESTIGATION
MONITORING
Enable JMX Metrics
Integrate Prometheus + Grafana
Overall view of Kafka Connect
DOCUMENTATION
https://www.confluent.io/en-gb/blog/how-to-increase-throughput-on-kafka-connect-source-connectors/
record-send-total: 20,000,000 records between 17:00h and 00:00h (7h)
batch-size-avg: <16 KB*
records-per-request-avg: 35
*Kafka producer default value for batch.size (16384 bytes)
ASSUMPTION
Increasing the batch.size will make the ingestion faster.
TESTING
BATCH.SIZE INCREASE
batch.size = number of records * record size average in bytes
"producer.override.batch.size": 739500
batch.size = 1500 * 493 bytes
batch.size = 739500 bytes
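The sizing rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic on the slide: the function name batch_size_bytes and the variable names are hypothetical, and the 1500 records and 493 bytes are the figures from the slide. For the override to take effect, the Connect worker's connector.client.config.override.policy must permit producer overrides.

```python
import json


def batch_size_bytes(records_per_batch: int, avg_record_bytes: int) -> int:
    """batch.size = number of records * record size average in bytes."""
    return records_per_batch * avg_record_bytes


# Figures taken from the slide: 1500 records per batch, 493 bytes per record.
target_batch_size = batch_size_bytes(1500, 493)

# Fragment to merge into the source connector configuration.
connector_override = {"producer.override.batch.size": target_batch_size}

print(json.dumps(connector_override, indent=2))
```

The printed fragment goes into the connector's JSON configuration alongside its existing settings.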
RESULTS
records-per-request-avg: 1500
batch-size-avg: 600 KB
record-send-total: 20,000,000 records between 09:15h and 09:45h (30min)
SUMMARY
Ingestion time: from 7h to 30min
batch-size-avg: from <16 KB to 600 KB
records-per-request-avg: from 35 to 1500
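As a rough cross-check of these figures, the sketch below derives throughput and an approximate request count from the numbers on the slides. The variable names are hypothetical, and the request counts are only estimates, since records-per-request-avg is an average rather than an exact per-request value.

```python
RECORDS = 20_000_000  # record-send-total reported for both runs


def records_per_second(records: int, seconds: int) -> float:
    """Average throughput over the whole ingestion window."""
    return records / seconds


before_rate = records_per_second(RECORDS, 7 * 3600)  # 7h run
after_rate = records_per_second(RECORDS, 30 * 60)    # 30min run
speedup = after_rate / before_rate

# Estimated number of produce requests, from records-per-request-avg.
requests_before = RECORDS / 35
requests_after = RECORDS / 1500

print(f"before: {before_rate:.0f} records/s, about {requests_before:.0f} requests")
print(f"after:  {after_rate:.0f} records/s, about {requests_after:.0f} requests")
print(f"speedup: {speedup:.0f}x")
```

Under these assumptions the same 20,000,000 records are sent in far fewer, larger requests, which is consistent with the shorter ingestion time.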
