Hadoop Technologies
Architecture Overview

@senthil245

Mail - senthil245@gmail.com
DISTRIBUTED CLUSTER ARCHITECTURE: MASTER/SLAVE
HADOOP CORE
MAPREDUCE PATTERNS
WHEN MAPREDUCE
• Because MapReduce runs across a cluster of computing nodes, the architecture is very scalable.
• In other words, if the data size increases by a factor of x, processing time should stay roughly constant, provided the cluster grows by a predictable, fixed factor of y.
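
The scalability claim rests on map tasks being independent of one another, so each partition of the input can be handed to a different node. The following is a minimal single-process sketch of that idea using word count. It does not use Hadoop's API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are illustrative assumptions, not part of the original slides.

```python
from collections import defaultdict


def map_phase(partition):
    """Emit (word, 1) pairs for one partition; needs no data from other partitions."""
    return [(word.lower(), 1) for line in partition for word in line.split()]


def shuffle(mapped_outputs):
    """Group all emitted values by key, the step that sits between map and reduce."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the values for each key; each key can be reduced independently."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(partitions):
    # On a real cluster each partition would be processed by a separate map task;
    # here the tasks simply run one after another in the same process.
    mapped = [map_phase(p) for p in partitions]
    return reduce_phase(shuffle(mapped))


# Hypothetical input, split into two partitions.
partitions = [
    ["hadoop stores data", "hadoop processes data"],
    ["more data more nodes"],
]
counts = word_count(partitions)
```

Because the result does not depend on how the input is partitioned, adding nodes only changes how many partitions are processed at the same time, not the answer.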

• The graph on the right illustrates the relationship between data size (x-axis) and processing time (y-axis).
• The blue curve is the process using traditional programming; the black curve is the process using Hadoop. When the data size is small, traditional programming performs better because bootstrapping Hadoop is expensive (copying data within the cluster, inter-node communication, etc.).

• Once the data size is big enough, the Hadoop bootstrap penalty becomes negligible.
• Hence Hadoop is best suited for Big Data crunching, ideally at petabyte scale, and is not suited for implementing common data integration patterns.
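
The trade-off described above can be sketched with a toy cost model: a single machine pays no startup cost, while a cluster pays a fixed bootstrap cost but divides the work across its nodes. Every number here (10 nodes, 0.1 GB/s per node, 60 s of startup) and every function name is a hypothetical assumption chosen for illustration. None of it is a measurement, and real clusters do not scale this cleanly.

```python
def traditional_time(data_gb, throughput_gb_per_s=0.1):
    """Single-machine time in seconds: no startup cost, one worker."""
    return data_gb / throughput_gb_per_s


def hadoop_time(data_gb, nodes=10, throughput_gb_per_s=0.1, startup_s=60.0):
    """Cluster time in seconds: fixed startup cost plus work split evenly across nodes."""
    return startup_s + data_gb / (nodes * throughput_gb_per_s)


def crossover_gb(nodes=10, throughput_gb_per_s=0.1, startup_s=60.0):
    """Data size at which both models take the same time (requires nodes > 1)."""
    # startup + d / (n * r) = d / r   =>   d = startup * r * n / (n - 1)
    return startup_s * throughput_gb_per_s * nodes / (nodes - 1)


threshold = crossover_gb()

for size_gb in (1, threshold, 1000):
    print(f"{size_gb:8.2f} GB  traditional={traditional_time(size_gb):9.1f}s  "
          f"hadoop={hadoop_time(size_gb):9.1f}s")
```

Below the crossover size the fixed startup cost dominates and the single machine wins. Above it the startup cost becomes a shrinking share of the total, which is the behaviour the two curves illustrate. In the same model, doubling both the data and the node count leaves `hadoop_time` unchanged.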
Hadoop Ecosystem Architecture Overview
APACHE SQOOP
APACHE FLUME
APACHE CHUKWA
HDFS
APACHE OOZIE – WORKFLOW SCHEDULER (CHECK AZKABAN & LINKEDIN OPENSOURCE)
PIG AND HQL (DO NOT USE HQL)
APACHE S4 (STREAM PROCESSING) (ALSO CHECK KAFKA AND STORM)
APACHE ZOOKEEPER SERVICE (ALSO CHECK APACHE HUE)
APACHE HIVE
APACHE HCATALOG, HIVE AND HBASE

