Intro to Hadoop
Hadoop
• Open Source Software
• Exceeds Enterprise VLDB
• Storage Processing Costs per gigabyte
• Structured and Unstructured data
• Evolved from Google's GFS and MapReduce papers
• Developed by Yahoo
• Technical Reasons for
• Business Use Case
Hadoop Ecosystem
• Hive: Structured Data
• Pig: Structured and Semi-structured
• Sqoop: Data Transfer Tool
• HBase: Real-Time Table Access
• HCatalog: Metadata view of HDFS data
• Flume: Log Stream
• Oozie: Workflow
Hadoop 1.0 Core Architecture
MapReduce (distributed processing)
Hadoop Distributed File System (storage)
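
The split between distributed processing and storage can be illustrated with a small word count, the usual introductory MapReduce example. This is only a sketch in plain Python that runs in a single process; it does not use Hadoop or any Hadoop API. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input lines are assumptions made for illustration, with each input line standing in for a chunk of data that HDFS would hold on a separate node.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, between the map and reduce steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = []
    for line in lines:  # on a cluster, each chunk would be mapped in parallel
        pairs.extend(map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input: two lines standing in for two storage blocks.
    sample = [
        "hadoop stores data in hdfs",
        "hadoop processes data with mapreduce",
    ]
    print(word_count(sample))
```

In a real cluster the map and reduce functions would run as separate tasks on many machines and the framework would handle the shuffle; the sketch only shows how the three steps fit together.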
Hadoop 2 YARN
Apache Tez
[Diagram: Hadoop 1.0 stack (HDFS, MapReduce) compared with Hadoop 2.0 stack (HDFS, YARN, MapReduce)]
