Kafka Lambda architecture with mirroring

  • 1. Master Plan (Lambda Architecture)
      Diagram labels:
      - Variety Sources, REST/IS
      - Kafka Cluster, Mirroring, Kafka Cluster (Mirror)
      - Speed Layer: Storm/Spark (real-time processing)
      - Batch Layer: Hadoop Client for Camus, Hadoop Cluster (shown twice)
      - Hive/Pig, ELT, DWH, HBase, Analytics
  • 2. Kafka (Mirroring) Configuration
      Diagram labels:
      - REST/IS sources (three shown)
      - Kafka Cluster A, Kafka Cluster B
      - Mirroring
      - Kafka Cluster C, Kafka Cluster D
      - Topic: “Data” (shown twice)
      - Integration of all the sources coming to clusters A and B under the topic name “Data” (shown twice)
      - Batch Layer (writing the data into the Hadoop cluster)
      - Speed Layer (feeding data from Kafka to Storm)
      Host Configuration:
      - Quad-Core AMD Opteron(TM) processor
      - 8 GB RAM
      - 320 GB
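
The mirroring layout on slide 2 can be sketched as a small in-memory model. This is only an illustration and not the deck's actual setup: it does not use Kafka, Camus, or Storm. The `Cluster` class, the `mirror` function, the message fields, and the offset bookkeeping are all hypothetical names introduced here. The choice of cluster C for the batch layer and cluster D for the speed layer is also an assumption, since the slide text does not say which mirror feeds which layer.

```python
from collections import defaultdict


class Cluster:
    """Toy stand-in for a Kafka cluster: topic name -> append-only message list."""

    def __init__(self, name):
        self.name = name
        self.topics = defaultdict(list)

    def produce(self, topic, message):
        self.topics[topic].append(message)


def mirror(sources, targets, topic, offsets):
    """Copy not-yet-mirrored messages from every source topic to every target.

    `offsets` records, per source cluster, how many messages were already
    copied, so repeated calls do not duplicate data.
    """
    for source in sources:
        start = offsets.get(source.name, 0)
        new_messages = source.topics[topic][start:]
        for target in targets:
            for message in new_messages:
                target.produce(topic, message)
        offsets[source.name] = start + len(new_messages)


TOPIC = "Data"  # topic name taken from the slide
cluster_a, cluster_b = Cluster("A"), Cluster("B")
cluster_c, cluster_d = Cluster("C"), Cluster("D")

# Sources write to clusters A and B (hypothetical payloads).
cluster_a.produce(TOPIC, {"source": "rest-1", "value": 3})
cluster_a.produce(TOPIC, {"source": "rest-2", "value": 4})
cluster_b.produce(TOPIC, {"source": "rest-3", "value": 5})

# Mirror A and B into C and D; the second call finds nothing new to copy.
offsets = {}
mirror([cluster_a, cluster_b], [cluster_c, cluster_d], TOPIC, offsets)
mirror([cluster_a, cluster_b], [cluster_c, cluster_d], TOPIC, offsets)

# Batch layer (assumed to read from C): persist the full integrated topic.
batch_store = list(cluster_c.topics[TOPIC])

# Speed layer (assumed to read from D): update a running aggregate per message.
running_total = 0
for message in cluster_d.topics[TOPIC]:
    running_total += message["value"]
```

Both mirrors end up holding the union of what arrived at A and B under the same topic name, which is the "integration of all the sources" the slide describes. Separate mirrors for the two layers mean a slow batch load does not compete with the real-time consumer for the same brokers.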