Technical Seminar
on
HADOOP TECHNOLOGY
Under the Guidance of
P.V.R.K.MURTHY, M.Tech
Assistant Professor
CONTENTS
What is Hadoop Technology?
Why Hadoop?
Developers of Hadoop Technology
Famous Hadoop users
Hadoop Features
Hadoop Architecture
Core Components of Hadoop
Hadoop High-Level Architecture
Hadoop cluster
CONTENTS…
What is HDFS
HDFS – Name Node features
HDFS – Name Node architecture
HDFS – Data Node
Hadoop MapReduce
Benefits of Hadoop
Conclusion
References
HADOOP TECHNOLOGY
What is Hadoop Technology?
•Hadoop is the best-known technology used for Big Data.
•It is a large-scale batch data processing system.
Why Hadoop?
•Distributed cluster system
•Platform for massively scalable applications
•Enables parallel data processing
Developers of Hadoop Technology:
Michael J. Cafarella
Doug Cutting
Famous Hadoop users
Hadoop Features
•Hadoop Common provides access to the file systems supported by Hadoop
•The Hadoop Common package contains the JAR files and scripts needed to start Hadoop
•The package also provides source code, documentation and a contribution section that includes projects from the Hadoop Community.
HADOOP ARCHITECTURE
Core-Components of Hadoop:
Hadoop Distributed File System (HDFS).
MapReduce.
What is HDFS?
•Distributed file system
•Traditional hierarchical file organization
•Single namespace for the entire cluster
•Write-once-read-many access model
•Aware of the network topology
Hadoop High-Level Architecture
Hadoop cluster
•A small Hadoop cluster includes a single master and multiple worker nodes
Master node:
Data Node
Job Tracker
Task Tracker
Name Node
Slave node:
Data Node
Task Tracker
HDFS – Name Node Features
Metadata in main memory:
•List of files
•List of blocks for each file
•List of Data Nodes for each block
•File attributes
•Creation time
•Records every change in the metadata
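The bullets above can be pictured as a handful of in-memory maps plus a change log. The following is a minimal sketch, not the actual Name Node implementation: the class `NameNodeMetadata`, its method names, and the `replication` attribute are hypothetical and chosen only for illustration.

```python
import time
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    blocks: list                 # ordered list of block ids for the file
    creation_time: float         # file attribute: creation time
    attributes: dict = field(default_factory=dict)  # other file attributes


class NameNodeMetadata:
    """Hypothetical in-memory model of the metadata listed on the slide."""

    def __init__(self):
        self.files = {}            # list of files: path -> FileEntry
        self.block_locations = {}  # block id -> set of Data Node names
        self.edit_log = []         # records every change in the metadata

    def create_file(self, path, block_ids, replication=3):
        # "replication" is an illustrative attribute, not taken from the slide.
        self.files[path] = FileEntry(
            list(block_ids), time.time(), {"replication": replication}
        )
        self.edit_log.append(("create", path, tuple(block_ids)))

    def report_block(self, block_id, data_node):
        # Called when a Data Node announces that it holds a block.
        self.block_locations.setdefault(block_id, set()).add(data_node)

    def locate(self, path):
        # For each block of the file, list the Data Nodes that hold it.
        return [
            (b, sorted(self.block_locations.get(b, ())))
            for b in self.files[path].blocks
        ]
```

A client asking where to read `/logs/a.txt` would call `locate("/logs/a.txt")` and get back the block ids in order, each with the Data Nodes that reported it.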
HDFS – Name Node architecture
Primary Name Node: RAM, HDD
Secondary Name Node: RAM, HDD
Steps performed by the Secondary Name Node:
1. Pull transaction log
2. Merge changes
3. Store to HDD
4. Push
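The four numbered steps (pull, merge, store, push) can be sketched as follows. This is a simplified illustration under stated assumptions: the namespace image is modelled as a plain dictionary, and the names `PrimaryNameNode`, `SecondaryNameNode`, `apply_edits` and `checkpoint` are hypothetical, not Hadoop APIs.

```python
def apply_edits(image, edits):
    """Return a new namespace image with the logged edits applied in order."""
    merged = dict(image)
    for op, path, value in edits:
        if op == "create":
            merged[path] = value
        elif op == "delete":
            merged.pop(path, None)
    return merged


class PrimaryNameNode:
    def __init__(self):
        self.image_on_disk = {}  # last stored image (HDD)
        self.edit_log = []       # transaction log written since that image

    def log(self, op, path, value=None):
        self.edit_log.append((op, path, value))


class SecondaryNameNode:
    def __init__(self):
        self.image_on_disk = {}

    def checkpoint(self, primary):
        edits = list(primary.edit_log)                      # 1. pull transaction log
        merged = apply_edits(primary.image_on_disk, edits)  # 2. merge changes
        self.image_on_disk = merged                         # 3. store to HDD
        primary.image_on_disk = dict(merged)                # 4. push
        # Assumption: the primary can drop the edits that were just merged.
        del primary.edit_log[:len(edits)]
        return merged
```

After a checkpoint the primary starts from a compact image and a short log, which is the point of the merge: the log does not grow without bound.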
HDFS – Data Node
•Block server: stores data in the local file system
•Periodic validation of checksums
•Periodically sends a report of all existing blocks to the Name Node
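The three Data Node duties above can be sketched in a few lines. This is an illustrative model only: the `DataNode` class and its methods are hypothetical, blocks are kept in a dictionary rather than in local files, and CRC32 from Python's `zlib` stands in for whatever checksum a real deployment uses.

```python
import zlib


class DataNode:
    """Hypothetical block server with checksum validation and block reports."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block id -> (data bytes, checksum recorded at write time)

    def store(self, block_id, data):
        self.blocks[block_id] = (data, zlib.crc32(data))

    def verify(self):
        # Periodic validation: recompute each checksum and list the mismatches.
        return sorted(
            block_id
            for block_id, (data, checksum) in self.blocks.items()
            if zlib.crc32(data) != checksum
        )

    def block_report(self):
        # Report of all existing blocks, as it would be sent to the Name Node.
        return {"node": self.name, "blocks": sorted(self.blocks)}
```

A scheduler would call `verify()` and `block_report()` on a timer; any block id returned by `verify()` is a candidate for replacement from another replica.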
Hadoop MapReduce
Job Tracker:
Splits a job into map and reduce tasks
Schedules tasks on cluster nodes
Task Tracker:
Runs the map and reduce tasks assigned to it and periodically reports status to the Job Tracker
MapReduce implementation:
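As a rough illustration of the programming model, here is a word count written as map, shuffle and reduce phases in plain Python. It runs in a single process and does not use Hadoop; the function names `map_phase`, `shuffle`, `reduce_phase` and `run_job` are hypothetical. On a real cluster each input split would be handled by a separate map task.

```python
from collections import defaultdict


def map_phase(split):
    # Map: emit a (word, 1) pair for every word in one input split.
    for word in split.split():
        yield word.lower(), 1


def shuffle(pairs):
    # Shuffle: group all emitted values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: combine the values of one key into a single result.
    return key, sum(values)


def run_job(splits):
    pairs = [pair for split in splits for pair in map_phase(split)]
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())
```

Calling `run_job(["big data big", "data cluster"])` counts `big` twice, `data` twice and `cluster` once. Because each split is mapped independently, the map calls could run in parallel on the nodes that hold the data.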
Benefits of Hadoop
•Cost-saving, efficient and reliable data processing
•Provides an economically scalable solution
•Storing and processing of large amounts of data
•Data grid operating system
•It is deployed on industry-standard servers rather than expensive specialized data storage systems.
•Parallel processing of huge amounts of data across inexpensive, industry-standard servers.
CONCLUSION
Why commodity hardware?
Because it is cheaper
Hadoop is designed to tolerate faults
Why HDFS?
Network bandwidth vs. seek latency
Why the MapReduce programming model?
Parallel programming
Large data sets
Moving computation to data
Single compute + data cluster
REFERENCES
•Apache Hadoop! (http://hadoop.apache.org)
•Hadoop on Wikipedia (http://en.wikipedia.org/wiki/Hadoop)
•Cloudera - Apache Hadoop for the Enterprise (http://www.cloudera.com)