POORNIMA INSTITUTE OF ENGINEERING & 
TECHNOLOGY, JAIPUR 
DEPARTMENT OF COMPUTER ENGINEERING 
A 
PRACTICAL TRAINING PRESENTATION 
ON 
BIG DATA HADOOP 
SESSION 2014 – 15 
Presented By:
Ashutosh Tiwari (CE/11/083)
Ashok Rayal (CE/11/025)
Guided By:
Dr. E.S. Pilli, Assistant Professor
CS Department, MNIT, Jaipur
Topics 
1. Organization Details 
2. Training Details 
3. Technology Specification 
4. Project Summary 
5. Snapshots 
6. Conclusion
ORGANIZATION PROFILE 
• Name: Malaviya National Institute of Technology (MNIT), Jaipur
• MNIT, Jaipur is one of the 30 National Institutes of Technology in India.
• MNIT was established in 1963, inspired by Pt. Madan Mohan Malaviya.
• The institute's director is I. K. Bhat, and the chairman of the Board of Governors is Dr. K. K. Aggarwal.
• Organization's contacts:
  Email: espilli.cse@mnit.ac.in
  Website: www.mnit.ac.in
Training Details 
• Start Date: 28/05/2014
• End Date: 09/07/2014
• No. of Days: 45 (30 + 15)
• Timing: 9 AM to 5 PM
• Our training at MNIT was broadly divided into three phases:
  o Case study of Hadoop and related papers (first 30 days).
  o Hadoop cluster setup (first 30 days).
  o Implementation of Near Duplicate Detection using Hadoop MapReduce (last 15 days).
ABOUT PROJECT 
Near Duplicate Detection: 
• Comparative analysis of the millions of documents that exist on a network, to find documents whose similarity meets a predefined threshold value.
• Near duplicate detection is widely used in web crawling and many other data mining tasks.
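The threshold idea above can be illustrated with a small, single-machine sketch. The slides do not say which similarity measure the project used, so this example assumes word shingles compared with Jaccard similarity; the class name NearDuplicateSketch, its method names, and the sample documents are all hypothetical. It compares every pair of documents directly, which is only practical for small inputs, whereas the project distributed the work with Hadoop MapReduce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: not the project's actual code.
final class NearDuplicateSketch {

    // Break a document into overlapping runs of k words ("shingles").
    static Set<String> shingles(String text, int k) {
        String[] words = text.toLowerCase().trim().split("\\s+");
        Set<String> result = new HashSet<>();
        for (int i = 0; i + k <= words.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(words, i, i + k)));
        }
        return result;
    }

    // Jaccard similarity (assumed measure): |A intersect B| / |A union B|.
    static double jaccard(Set<String> a, Set<String> b) {
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        if (unionSize == 0) {
            return 0.0; // two empty sets: treat as no evidence of similarity
        }
        return (double) intersection.size() / unionSize;
    }

    // Return index pairs (i, j), i < j, whose similarity meets the threshold.
    static List<int[]> nearDuplicatePairs(List<String> docs, int k, double threshold) {
        List<Set<String>> sets = new ArrayList<>();
        for (String doc : docs) {
            sets.add(shingles(doc, k));
        }
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < sets.size(); i++) {
            for (int j = i + 1; j < sets.size(); j++) {
                if (jaccard(sets.get(i), sets.get(j)) >= threshold) {
                    pairs.add(new int[] {i, j});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> docs = List.of(
                "the quick brown fox jumps over the lazy dog",
                "the quick brown fox jumps over the lazy cat",
                "hadoop stores large files across many machines");
        for (int[] pair : nearDuplicatePairs(docs, 2, 0.5)) {
            System.out.println("near duplicates: " + pair[0] + " and " + pair[1]);
        }
    }
}
```

With the sample input, only the first two documents share enough two-word shingles to pass the assumed threshold of 0.5. A lower threshold reports more pairs and a higher one fewer, which is the trade-off the predefined threshold controls.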
TECHNOLOGY SPECIFICATION OF PROJECT
Project: Near Duplicate Detection
Technology Used:
• Hadoop
• MapReduce
• HDFS
• SSH and Shell Scripting
• Java
SNAPSHOTS - HDFS
SNAPSHOTS - MAPREDUCE PROCESSING
SNAPSHOTS - OUTPUT
CONCLUSION 
• Training in big data showed us the current trends in the IT industry and how technology increasingly contributes to human development.
• Big Data is the future. A lot of research is currently under way in this field. As data grows at an ever faster rate, there is a strong need for tools and technologies that can handle it.
• Hadoop is a fast-emerging framework used by many big firms such as Facebook, Microsoft, IBM, Yahoo and Amazon, among others.
• Our experience at MNIT was excellent, as it gave us the platform and support for our tasks and case study.
Presentation on Big Data Hadoop (Summer Training Demo)