HADOOP INTERVIEW QUESTIONS
Reach Us: Radiantits.com
Contact Us: +12105037100
1) What is Hadoop MapReduce?
Hadoop MapReduce is the framework used to process large data sets
in parallel across a Hadoop cluster. Data analysis follows a
two-step process: a map step and a reduce step.
2) How does Hadoop MapReduce work?
MapReduce runs in two phases. In a word-count job, for example, the
map phase counts the words in each document, while the reduce phase
aggregates those per-document counts across the entire collection.
During the map phase the input data is divided into splits, which are
analysed by map tasks running in parallel across the Hadoop cluster.
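The word-count flow described above can be sketched in plain Python. This is only an illustration of the two phases, not Hadoop code: the function names and the sample splits are hypothetical, and everything runs in one process rather than across a cluster.

```python
from collections import defaultdict


def map_phase(split):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def reduce_phase(pairs):
    """Sum the counts for each word across all map outputs."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


# Each string stands in for one input split handled by one map task.
splits = ["hadoop stores data", "hadoop processes data"]
map_outputs = [pair for split in splits for pair in map_phase(split)]
word_counts = reduce_phase(map_outputs)
```

In a real job the map tasks would run on different nodes and the framework would move their output to the reducers.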
3) What is shuffling in MapReduce?
The shuffle is the process by which the system sorts the map outputs
and transfers them to the reducers as inputs.
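A rough single-process sketch of what the shuffle produces: map output pairs are sorted by key and the values for each key are grouped, so that a reducer receives each key once with all of its values. The function name and sample data are hypothetical, and the network transfer between nodes is not modelled.

```python
from itertools import groupby
from operator import itemgetter


def shuffle(map_outputs):
    """Sort map output by key, then group the values belonging to each key."""
    ordered = sorted(map_outputs, key=itemgetter(0))
    return [
        (key, [value for _, value in group])
        for key, group in groupby(ordered, key=itemgetter(0))
    ]


map_outputs = [("data", 1), ("hadoop", 1), ("data", 1), ("cluster", 1)]
reducer_input = shuffle(map_outputs)
```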
4) What is the distributed cache in the MapReduce framework?
The distributed cache is an important feature of the MapReduce
framework. DistributedCache is used when you want to share files
across all nodes in a Hadoop cluster. The files can be executable
JAR files or simple properties files.
5) What is the NameNode in Hadoop?
The NameNode is the node where Hadoop stores all the file location
information for HDFS (Hadoop Distributed File System). In other
words, the NameNode is the centrepiece of an HDFS file system. It
keeps a record of all the files in the file system and tracks where
the file data is kept across the machines of the cluster.
7) What is a heartbeat in HDFS?
A heartbeat is a signal sent from a DataNode to the NameNode, and
from a TaskTracker to the JobTracker. If the NameNode or JobTracker
stops receiving the signal, it assumes there is a problem with that
DataNode or TaskTracker.
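The detection logic can be sketched as a simple timeout check on the receiving side. The class, the node names, and the 30-second timeout are all hypothetical choices for illustration and are not Hadoop's actual settings or code.

```python
class HeartbeatMonitor:
    """Tracks when each worker node last sent a heartbeat."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}

    def record(self, node_id, now):
        """Store the time of the latest heartbeat from a node."""
        self.last_seen[node_id] = now

    def suspect_nodes(self, now):
        """Return nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=30)  # hypothetical timeout
monitor.record("datanode-1", now=0)
monitor.record("datanode-2", now=0)
monitor.record("datanode-1", now=25)  # datanode-2 stays silent
suspects = monitor.suspect_nodes(now=40)
```

Here only the node that stopped reporting is flagged, since its last heartbeat is 40 seconds old while the other node reported 15 seconds ago.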
8) What is a combiner, and when should you use one in a
MapReduce job?
Combiners are used to increase the efficiency of a MapReduce
program. A combiner reduces the amount of data that needs to be
transferred across to the reducers. If the operation performed is
commutative and associative, you can use your reducer code as a
combiner. Hadoop does not guarantee that the combiner will be
executed.
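For a word count, where addition is both commutative and associative, the saving can be illustrated with a small Python sketch. The function name and sample data are hypothetical, and the code only imitates what a combiner does to the output of a single map task.

```python
from collections import Counter


def combine(map_output):
    """Pre-aggregate one map task's output before it is sent to the reducers."""
    counts = Counter()
    for word, count in map_output:
        counts[word] += count
    return sorted(counts.items())


# Output of one map task: four pairs would cross the network without a combiner.
map_output = [("data", 1), ("hadoop", 1), ("data", 1), ("data", 1)]
combined = combine(map_output)
```

After combining, two pairs are transferred instead of four, and the reducer still arrives at the same totals. Because the combiner may not run at all, the job has to produce correct results without it.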
9) What happens when a data node fails?
When a data node fails:
- The JobTracker and the NameNode detect the failure.
- All tasks that were running on the failed node are rescheduled.
- The NameNode replicates the user's data to another node.
10) What is the function of the MapReduce partitioner?
The partitioner makes sure that all the values of a single key go to the
same reducer, which in turn helps distribute the map output evenly over
the reducers.
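A minimal sketch of hash partitioning, assuming three reducers. Hadoop's default HashPartitioner takes the key's hash code modulo the number of reduce tasks; the version below uses CRC32 as a stand-in hash, because Python's built-in string hash changes between processes. The function name and sample keys are hypothetical.

```python
import zlib


def partition(key, num_reducers):
    """Map a key to a reducer index; equal keys always get the same index."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


keys = ["data", "hadoop", "data", "cluster", "hadoop"]
assignments = [(key, partition(key, 3)) for key in keys]
```

Every occurrence of a key gets the same reducer index, so all of its values meet at one reducer. How evenly the load is spread depends on the hash function and on how the keys are distributed.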
THANK YOU
