Paper name : Big Data Analytics
Staff : Mrs M. Florence Dayana M. C. A., M.Phil., (Ph.D.)
Class : II- M.Sc.(Computer Science)
Semester : IV
Unit : IV
Topic : Hadoop Foundation for Analytics
Hadoop Foundation for Analytics
HISTORY OF HADOOP:
• Hadoop was created by Doug Cutting and Mike Cafarella in 2005.
• It was first developed to support distribution for the
Nutch search engine project.
• Doug Cutting named it after his son’s toy elephant.
• Around that time, he was working at Yahoo.
• Hadoop is an open-source distributed processing framework.
It manages data processing and storage for big data
applications running on clustered systems.
ARCHITECTURE OF HADOOP:
Hadoop has two major layers:
• Processing/computation layer (MapReduce)
• Storage layer (Hadoop Distributed File System, HDFS)
A Hadoop application runs in an environment that provides
distributed storage and computation across a cluster of
computers. Hadoop is designed to scale from a single server
to thousands of machines, each offering local computation
and storage.
MAPREDUCE:
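The slide content for this heading did not survive, so the following is only a minimal sketch of the general map, shuffle, and reduce idea behind the processing layer, using word counting as the example. It runs in a single Python process and does not use the Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are illustrative assumptions, not names from Hadoop.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence (hypothetical single-process driver)."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big cluster", "data node"]))
```

In a real cluster the map and reduce steps would run in parallel on many machines, close to where the data is stored; here they run one after another purely to show the data flow.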
HDFS:
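The slide content for this heading is also missing. As a rough illustration of the storage layer, the sketch below shows how a file might be cut into fixed-size blocks and how each block might be copied to several machines. The 128 MB block size and the replication factor of 3 are assumed values chosen for the example, and the round-robin placement is a deliberate simplification; real HDFS placement takes the cluster's rack layout into account. The names split_into_blocks and place_replicas are hypothetical.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size in bytes (128 MB)
REPLICATION = 3                 # assumed number of copies per block


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the sizes of the blocks that a file of file_size bytes is cut into."""
    if file_size <= 0:
        return []
    full_blocks, remainder = divmod(file_size, block_size)
    # Only the last block may be smaller than block_size.
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_replicas(num_blocks, nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes (toy round-robin policy)."""
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


if __name__ == "__main__":
    blocks = split_into_blocks(300 * 1024 * 1024)  # a 300 MB file
    print(len(blocks), "blocks")
    print(place_replicas(len(blocks), ["node1", "node2", "node3", "node4"]))
```

Because every block lives on more than one machine, losing a single machine does not lose data, and computation can be scheduled on whichever machine already holds a copy.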
COMPONENTS OF HADOOP:
• Hive:
Hive is an open-source data warehouse framework that
structures and queries data using a SQL-like language called
HiveQL.
• Ambari:
Ambari was designed to simplify Hadoop management by
providing a simple web interface for managing and
monitoring Apache Hadoop clusters.
• HBase:
HBase is an open-source, distributed, non-relational (NoSQL) database
for Hadoop that provides random, real-time read/write access to
big data.
• Pig:
Pig is an open-source platform for processing large sets of data. Its
scripts are written in a high-level language called Pig Latin and are
compiled into MapReduce jobs, and the data does not need a predefined
format.
• ZooKeeper:
ZooKeeper is an open-source centralized service for maintaining
configuration information, naming, providing distributed
synchronization, and providing group services.
