Machine Learning ICS 178 Instructor: Max Welling
What is Expected? Class; Homework/Projects (40%); Midterm (20%); Final (40%). For the projects, students should form teams. This class needs your active participation: please ask questions and participate in discussions (there is no such thing as a dumb question).
Syllabus
1: Introduction: overview, examples, goals, probability, conditional independence, matrices, eigenvalue decompositions.
2: Optimization and Data Visualization: stochastic gradient descent, coordinate descent, centering, sphering, histograms, scatter plots.
3: Classification I: empirical risk minimization, k-nearest neighbors, decision stumps, decision trees.
4: Classification II: random forests, boosting.
5: Neural networks: perceptron, logistic regression, multi-layer networks, back-propagation.
6: Regression: least squares regression.
7: Clustering: k-means, single linkage, agglomerative clustering, MDL penalty.
8: Dimensionality reduction: principal components analysis, Fisher linear discriminant analysis.
9: Reinforcement learning: MDPs, TD- and Q-learning, value iteration.
10: Bayesian methods: Bayes rule, generative models, naive Bayes classifier.
Machine Learning according to
The ability of a machine to improve its performance based on previous results.
The process by which computer systems can be directed to improve their performance over time. Examples are neural networks and genetic algorithms.
Subspecialty of artificial intelligence concerned with developing methods for software to learn from experience or extract knowledge from examples in a database.
The ability of a program to learn from experience, that is, to modify its execution on the basis of newly acquired information.
Machine learning is an area of artificial intelligence concerned with the development of techniques which allow computers to "learn". More specifically, machine learning is a method for creating computer programs by the analysis of data sets. Machine learning overlaps heavily with statistics, since both fields study the analysis of data, but unlike statistics, machine learning is concerned with the algorithmic complexity of computational implementations. ...
Some Examples: ZIP code recognition; loan application classification; signature recognition; voice recognition over the phone; credit card fraud detection; spam filtering; suggesting other products at Amazon.com; marketing; stock market prediction; expert-level chess and checkers systems; biometric identification (fingerprints, DNA, iris scan, face); machine translation; web search; document and information retrieval; camera surveillance; robosoccer; and so on.
Can Computers play Humans at Chess? Chess playing is a classic AI problem: it is well defined, yet very complex and difficult for humans to play well. Conclusion: YES, today’s computers can beat even the best human. [Figure: ratings in points for Garry Kasparov (current World Champion), Deep Blue, and Deep Thought]
2005 DARPA Grand Challenge The Grand Challenge is an off-road robot competition devised by DARPA (Defense Advanced Research Projects Agency) to promote research in the area of autonomous vehicles. The challenge consists of building a robot capable of navigating 175 miles through desert terrain in less than 10 hours, with no human intervention. http://www.grandchallenge.org/
2007 DARPA Challenge http://www.darpa.mil/grandchallenge/overview.asp
Netflix Challenge http://www.netflixprize.com/leaderboard Netflix awards $1M to the person who improves their system by 10%. The relevant machine learning problem goes under the name “user recommendation system” or “collaborative filtering”. When you shop online at Amazon.com, they recommend books based on the links you click. For Netflix, the relevant problem is predicting movie-rating values for users. Data: movies (approx. 17,770), users (approx. 240,000), total of approx. 400,000,000 nonzero entries (99% sparse).
Netflix Challenge [Figure: four histograms: number of movies by mean movie rating; number of users by mean user rating; number of movies by number of ratings; number of users by number of ratings. Source: http://www.netflixprize.com/community/viewtopic.php?id=103]
The Task The user-movie matrix has many missing entries: Joe did not happen to rate “ET”. Netflix wants to recommend unseen movies to users based on the movies they have seen (and rated!) in the past. To recommend movies, we are asked to fill in the missing entries for Joe with predicted ratings and pick the movies with the highest predicted ratings. Where does the information come from? Say we want to predict the rating for Joe and ET. I: Mary has rated all the movies that Joe has seen in the past very similarly to Joe. She has also seen ET and rated it 5. What would you predict for Joe? II: StarTrek has obtained very similar ratings to ET from all users. StarTrek was rated 4 by Joe. What would you predict for ET?
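The two sources of information above (similar users, similar movies) can be sketched as a similarity-weighted average. This is only a minimal illustration, not the method used in class or by Netflix: the toy table, the extra names (Bob, Alien, Titanic), and the choice of cosine similarity over co-rated entries are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical toy table: user -> {movie: rating}. A missing key is a
# missing rating (Joe has not rated "ET"). All numbers are made up.
ratings = {
    "Joe":  {"StarTrek": 4, "Alien": 5, "Titanic": 1},
    "Mary": {"StarTrek": 4, "Alien": 5, "Titanic": 2, "ET": 5},
    "Bob":  {"StarTrek": 1, "Alien": 2, "Titanic": 5, "ET": 2},
}

def similarity(a, b):
    """Cosine similarity restricted to the keys both rows share."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = sqrt(sum(a[k] ** 2 for k in shared))
    norm_b = sqrt(sum(b[k] ** 2 for k in shared))
    return dot / (norm_a * norm_b)

def predict(table, row, col):
    """Fill in table[row][col] as a similarity-weighted average of the
    values other rows have in column `col`."""
    num = den = 0.0
    for other, values in table.items():
        if other == row or col not in values:
            continue
        weight = similarity(table[row], values)
        num += weight * values[col]
        den += weight
    return num / den if den else None

def transpose(table):
    """Turn user -> {movie: rating} into movie -> {user: rating}."""
    out = {}
    for row, cols in table.items():
        for col, value in cols.items():
            out.setdefault(col, {})[row] = value
    return out

# Case I: use users who rate like Joe (Mary counts more than Bob).
user_based = predict(ratings, "Joe", "ET")
# Case II: use movies rated like ET, weighted by Joe's own ratings.
item_based = predict(transpose(ratings), "ET", "Joe")
print(user_based, item_based)
```

Because Mary's ratings track Joe's much more closely than Bob's do, her 5 for ET dominates the user-based prediction. The item-based call reuses the same function on the transposed table, so the roles of users and movies are simply swapped.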
Your Homework & Project You will team up with one or more partners and implement the algorithms that we discuss in class on the Netflix problem. Our goal is to get high up on the leaderboard. This involves both trying out various learning techniques (machine learning) and dealing with the large size of the data (data mining). Towards the end we will combine all our algorithms to get a final score. Every class (starting next week) one team will give a presentation to report on their progress and share their experience. Read this article on how good these systems can be: http://www.theonion.com/content/node/57311?utm_source=onion_rss_daily
Text Data Text corpora are widely available in digital form these days (scanned journals, scanned newspapers, blogs, ...). We can mine this text and discover interesting patterns: which topics are present in an article, or which article or webpage in the corpus is most similar or relevant. Here the data has a very similar format: word tokens (approx. 20,000), documents (up to 1,000,000), 99% sparse.
Text Data Each document is represented as a vector of counts, one for each word in the vocabulary: [20,5,3,0,1,0,2,0,0,0,5,0,...], where the first entry counts “the” and the second counts “president”. So, in this article the word “president” appeared 5 times (can you guess a topic?). Here we don’t want to fill in missing entries (sparse means “0”, not missing). Our task is to find, for instance, which documents are most similar (document retrieval). Many more data matrices have the same format: for instance, gene-expression data is a matrix of genes vs. experiments where the values represent the “activity level” of the gene in that experiment. Can we identify diseases?
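A small sketch of the count-vector representation and of document retrieval may help here. The vocabulary, the three documents, and the use of cosine similarity as the notion of "most similar" are assumptions chosen for illustration; the slides do not fix a particular similarity measure.

```python
from collections import Counter
from math import sqrt

# Hypothetical five-word vocabulary and three tiny documents.
vocabulary = ["the", "president", "election", "gene", "protein"]
docs = {
    "news1": "the president won the election",
    "news2": "the election of the president",
    "bio":   "the gene encodes the protein",
}

def count_vector(text, vocab):
    """One count per vocabulary word; words outside the vocabulary are ignored."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]

def cosine(u, v):
    """Cosine similarity between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

vectors = {name: count_vector(text, vocabulary) for name, text in docs.items()}

def most_similar(query):
    """Name of the other document whose count vector is closest to `query`."""
    others = (name for name in vectors if name != query)
    return max(others, key=lambda name: cosine(vectors[query], vectors[name]))

print(vectors["news1"])        # counts for "the", "president", ...
print(most_similar("news1"))
```

Note that a zero in these vectors is a real observation (the word did not occur), unlike the missing ratings in the Netflix matrix.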
Why is this cool/important? Modern technologies generate data at an unprecedented scale. The amount of data doubles every year. “One petabyte is equivalent to the text in one billion books, yet many scientific instruments, including the Large Synoptic Survey Telescope, will soon be generating several petabytes annually.” (“2020 Computing: Science in an exponential world”, Nature, published online 22 March 2006.) Computers dominate our daily lives: science, industry, the army, our social interactions, etc. We can no longer “eyeball” the images captured by some satellite for interesting events, or check every webpage for some topic. We need to trust computers to do the work for us.
