Name: Aman Adhikari
Email: adhikariaman01@gmail.com
• Machine learning, a branch of AI, is the construction and study of systems that can learn from existing data.
It is used in fields such as:
-- Information retrieval
-- Identifying key topics in large collections of text
-- Biology
-- Linear algebra, etc.
• Apache Mahout is an Apache Software Foundation project to create scalable machine learning libraries under the Apache License.
WHY MAHOUT?
Many open source machine learning libraries either:
• Lack community
• Lack documentation and examples
• Lack scalability
• Lack the Apache License
• Or are not research-oriented
• Mahout began life in 2008 as a subproject of Apache Lucene (a search and text-mining API).
• Lucene committers felt it should be a separate project, and Mahout absorbed the Taste collaborative filtering project.
• In April 2010, Mahout became a top-level Apache project.
• Google News sees about 3.5 million new news articles per day and clusters each one with related articles within minutes, so that the grouped news is delivered while it is still timely.
Another example: Picasa.
• Mahout makes use of Hadoop.
• Many algorithms do not scale to massive machine clusters on their own, but a MapReduce framework such as Apache Hadoop does.
• Mahout converts algorithms to work at scale on top of Hadoop.
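
To make the idea of running an algorithm on top of a MapReduce framework more concrete, here is a minimal single-process sketch in Python. It does not use Hadoop or Mahout; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample data are assumptions made for illustration. It counts how often two items appear together in user histories, which is one kind of computation that splits naturally into a map step and a reduce step.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(record):
    """Emit ((item_a, item_b), 1) for every pair of items in one user's history."""
    _user, items = record
    for a, b in combinations(sorted(set(items)), 2):
        yield (a, b), 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one key."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list of records."""
    emitted = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(emitted).items())


# Hypothetical sample data: (user, items the user interacted with).
histories = [
    ("alice", ["book", "lamp", "pen"]),
    ("bob", ["book", "pen"]),
    ("carol", ["lamp", "pen"]),
]
cooccurrence = run_job(histories)
```

Because each call to `map_phase` looks at only one record and each call to `reduce_phase` looks at only one key, a framework is free to spread those calls across many machines.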
• Recommender engines (collaborative filtering)
• Clustering
• Classification
• Extensive framework for collaborative filtering.
• Recommenders:
-- User-based
-- Item-based
• Online and offline support
-- Offline can utilize Hadoop
• Recommenders of this kind are used by Amazon, Facebook, etc.
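
As a rough illustration of the user-based approach, the following Python sketch predicts ratings for items a user has not seen by weighting other users' ratings by how similar those users are. It is not Mahout's Taste API; the functions `cosine_similarity` and `recommend`, the choice of cosine similarity, and the ratings data are all assumptions made for this example.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dictionaries."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=3):
    """Rank unseen items by the similarity-weighted average of other users' ratings."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only score items the user has not rated
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "alice": {"A": 5, "B": 3},
    "bob": {"A": 5, "B": 3, "C": 4},
    "carol": {"A": 1, "B": 5, "D": 2},
}
recommendations = recommend(ratings, "alice")
```

An item-based recommender follows the same pattern but compares items by the users who rated them instead of comparing users by the items they rated.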
Introduction to Apache Mahout
• Clustering techniques attempt to group a large number of things together into clusters that share some similarity.
• Algorithms: K-means, Fuzzy K-means
• The Summly app likewise groups similar stories from different news sites and presents a brief summary of them in the app (the same concept as Google News).
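
The K-means idea can be sketched in a few lines of plain Python. This is not Mahout's implementation: the function names, the fixed iteration count, and the choice to seed the centroids with the first `k` points (so the run is deterministic) are assumptions for illustration, and the points are made-up 2-D data.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, iterations=10):
    """Plain k-means; the first k points are used as the initial centroids."""
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignments


# Hypothetical data: two loose groups of 2-D points.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
centroids, assignments = kmeans(points, k=2)
```

Fuzzy K-means differs in that each point receives a degree of membership in every cluster rather than a single hard assignment.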
• Classification techniques decide how much a thing is or isn’t part of some type or category, or how much it does or doesn’t have some attribute.
• Examples:
-- Yahoo Mail spam checker
-- Facebook face detection
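
A spam checker is a convenient way to illustrate classification. The sketch below is a tiny naive Bayes text classifier in plain Python. It is only an assumption about how such a filter could work, not a description of Yahoo Mail or of Mahout's classifiers; the `train` and `classify` functions and the four training messages are hypothetical.

```python
from collections import Counter
from math import log


def train(examples):
    """Count words per label; examples are (text, label) pairs."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def classify(text, model):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts = model
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_examples = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, counts in word_counts.items():
        score = log(label_counts[label] / total_examples)  # prior
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training messages.
examples = [
    ("win cash prize now", "spam"),
    ("cheap prize offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
model = train(examples)
spam_result = classify("cash prize offer", model)
ham_result = classify("monday meeting", model)
```

The per-label score computed inside `classify` is what expresses "how much" a message belongs to each category; the function simply returns the label with the largest score.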
• Mahout is a young, open source, scalable machine learning library from Apache.
• Its techniques are no longer just theory; they are deployed to solve real-world problems in areas such as e-commerce, video, and pictures.
• Scalability is the major issue, and Hadoop comes to the rescue.