Kafka and Hadoop as components of architecture - Martin Strycek
Kafka
Kafka is a distributed streaming platform.
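A minimal sketch of the core idea behind that definition: a Kafka partition behaves like an append-only log in which every record gets an offset, and each consumer tracks how far it has read. This is a toy in-memory model, not the Kafka client API; the class name, function names, and event values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy stand-in for one Kafka partition: an append-only list of records."""
    records: list = field(default_factory=list)

    def append(self, value):
        """Append a record and return the offset it was stored at."""
        self.records.append(value)
        return len(self.records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records records starting at the given offset."""
        return self.records[offset:offset + max_records]

    def end_offset(self):
        """Offset that the next appended record will receive."""
        return len(self.records)


def consumer_lag(log, committed_offset):
    """Number of records the consumer has not processed yet."""
    return log.end_offset() - committed_offset


# Producer side: append three hypothetical events.
log = PartitionLog()
for event in ["signup", "page_view", "purchase"]:
    log.append(event)

# Consumer side: read a batch of two, then advance the committed offset.
committed = 0
batch = log.read(committed, max_records=2)
committed += len(batch)
```

The difference between the end of the log and a consumer's committed offset (the lag) is a common first thing to watch when monitoring a stream.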
Hadoop
The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing.
How Kafka and Hadoop got into Exponea?
● We had our in-memory database: super fast, but in memory
● Our customers were scared that they would have to pay a lot
● They wanted the freedom to run analyses on all data
● We had some trouble processing data
Kafka + MapR
● We were appending data to files containing JSON records
○ HDFS does not support append
● We started using Kafka 0.8.2.1
● We had no idea how to monitor the whole stack
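The append pattern from the first bullet can be sketched on a local filesystem. This is only an illustration: the one-JSON-document-per-line layout, the file name, and the event fields are assumptions, and the code does not talk to HDFS or MapR.

```python
import json
import os
import tempfile


def append_events(path, events):
    """Append each event to the file as one JSON document per line."""
    with open(path, "a", encoding="utf-8") as f:
        for event in events:
            f.write(json.dumps(event, sort_keys=True) + "\n")


def read_events(path):
    """Read the file back, parsing every non-empty line as JSON."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical events written in two separate append calls.
path = os.path.join(tempfile.mkdtemp(), "events.jsonl")
append_events(path, [{"customer": "c1", "type": "page_view"}])
append_events(path, [{"customer": "c2", "type": "purchase"}])
events = read_events(path)
```

Opening the file in append mode is what makes this pattern depend on the filesystem supporting appends, which is the constraint the slide points at.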
Where we are now
● We are using Kafka to stream data to
● We have our first Spark jobs that are part of the application
○ Recommendation
○ Predictions
○ Campaigns overview
○ Loading data to
Recommendation
● We are using Oryx 2
○ But we need multitenancy
● We have MapR
○ But it ships with a different Spark version than Oryx 2
● We are using Oryx 2
○ But it works with a different version of Kafka
What next?
● How about we use
● How about we create better local storage for
● We need another cluster for testing
○ Bare metal? AWS? Google Cloud?
● library to be usable in
● We want to run a workshop for all of you who want to try it out but don't have a place to do so.
Exponea Culture
● Freedom & responsibility
● Big impact
● Team
● Proficiency
Global ambitions, best company to work for
Thanks!
