Analysis of Data Placement Strategy 
based on Computing Power of Nodes on 
Heterogeneous Hadoop Clusters 
Sanket Reddy Chintapalli 
Advisor - Dr. Xiao Qin
Presentation Overview 
● Synopsis 
● MapReduce Programming Model Overview 
● HDFS Overview 
● Motivation 
● Design 
● Software Description 
● Hardware Description 
● Results 
● Conclusion
Synopsis 
● Data placement strategy 
● Heterogeneous Clusters 
● Computing Power 
● Calculating Computing Ratio 
● WordCount and Grep
MapReduce Model 
● Hadoop 1.0 and Hadoop 2.0 
● Master - Slave Model 
● JobTracker and TaskTracker Hadoop 1.0 
● YARN Hadoop 2.0 
● Resource Manager YARN 
● Application Manager YARN 
● Node Manager YARN 
● MapReduce Flow
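The MapReduce flow listed above (map, shuffle, reduce) can be illustrated with a minimal single-process sketch of WordCount. This is not Hadoop code; the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` are hypothetical and only mirror the three stages of the model.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit a (word, 1) pair for every whitespace-separated token.
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # would do between the map and reduce stages.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce: sum the counts collected for one word.
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

In a real cluster the map calls run on the nodes that hold the input blocks, which is why the placement of those blocks affects job completion time.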
MapReduce Model
MapReduce Model - 1.0
MapReduce Model - YARN - 2.0
MapReduce Model - Flow
HDFS 
● Namenode 
● Datanode 
● Replication 
● Federated Namenodes
HDFS Architecture
HDFS Federated Namenodes
HDFS Federated Namenodes 
● Scalability 
● Performance 
● Isolation - overload
Motivation
Software Description 
● Hadoop 2.3.0 
● Maven 
● Eclipse 
● Protocol Buffers
Hardware Description
Design 
Run WordCount and Grep Applications on 
individual nodes
Design 
Calculate Computing Power of Individual Nodes for 
a specific application
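The slides do not state the formula for the computing ratio, so the following is a sketch under an assumption: each node's ratio is its measured run time for the application divided by the run time of the fastest node, and each node's share of the data is inversely proportional to that ratio. The function names and the sample run times are hypothetical.

```python
def computing_ratios(runtimes):
    # Assumption: ratio = node run time / fastest run time,
    # so the fastest node gets 1.0 and slower nodes get larger values.
    fastest = min(runtimes.values())
    return {node: t / fastest for node, t in runtimes.items()}


def data_shares(ratios):
    # Assumption: a node's share of the input is proportional to 1 / ratio,
    # normalized so that all shares sum to 1.
    weights = {node: 1.0 / r for node, r in ratios.items()}
    total = sum(weights.values())
    return {node: w / total for node, w in weights.items()}


# Hypothetical run times (seconds) of one application on three nodes.
runtimes = {"A": 10.0, "B": 20.0, "C": 40.0}
ratios = computing_ratios(runtimes)
shares = data_shares(ratios)
```

With these sample numbers the ratios are 1, 2, and 4, so node A would hold 4/7 of the data, node B 2/7, and node C 1/7. Because the ratio is measured per application, WordCount and Grep may yield different ratios on the same hardware.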
Design 
● Evaluate Hadoop distribution by running Grep and 
WordCount together on all nodes 
● Run the CRBalancer to balance the nodes 
● Finally, re-run the applications to measure the effect 
of the data placement strategy.
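The balancing step above can be sketched as a planning problem: given per-node data shares and the current block counts, compute a target count for each node and a list of block moves from over-full to under-full nodes. This is an illustrative sketch only, not the CRBalancer implementation; the names `target_blocks` and `plan_moves` are hypothetical, and the sketch ignores replication and rack constraints that a real HDFS balancer has to respect.

```python
def target_blocks(total_blocks, shares):
    # Turn fractional shares into integer block counts, handing any
    # leftover blocks to the nodes with the largest fractional parts.
    raw = {node: total_blocks * s for node, s in shares.items()}
    targets = {node: int(r) for node, r in raw.items()}
    leftover = total_blocks - sum(targets.values())
    by_remainder = sorted(raw, key=lambda n: raw[n] - targets[n], reverse=True)
    for node in by_remainder[:leftover]:
        targets[node] += 1
    return targets


def plan_moves(current, targets):
    # Greedily pair nodes holding too many blocks with nodes holding too few.
    surplus = {n: current[n] - targets[n] for n in current if current[n] > targets[n]}
    deficit = {n: targets[n] - current[n] for n in current if current[n] < targets[n]}
    moves = []
    for src in sorted(surplus):
        for dst in sorted(deficit):
            if surplus[src] == 0:
                break
            count = min(surplus[src], deficit[dst])
            if count > 0:
                moves.append((src, dst, count))
                surplus[src] -= count
                deficit[dst] -= count
    return moves


# Hypothetical cluster: 12 blocks spread evenly, one node twice as fast.
shares = {"fast": 0.5, "mid": 0.25, "slow": 0.25}
current = {"fast": 4, "mid": 4, "slow": 4}
targets = target_blocks(12, shares)
moves = plan_moves(current, targets)
```

For this sample input the plan moves one block from each slower node to the fast node, so the fast node ends with half of the blocks.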
Design - Algorithm 
CRBalancer Strategy
Implementation 
● CRBalancer 
● CRBalancingPolicy 
● CRNamenodeConnector
Results - WordCount
Results - Grep
Questions?

HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nodes on Heterogeneous Hadoop Clusters