Technical Area:
      Machine Learning and Pattern Recognition

Examiner: Alex “Sandy” Pentland
Toshiba Professor in Media Arts and Sciences
MIT Media Laboratory

Description
The objective of this area is to familiarize myself with the main techniques and
algorithms of machine learning and pattern recognition. The goal is to understand the
high-level advantages and disadvantages of the various approaches, and thus how they
can be used, combined, and improved.

Supervised learning
   • Graphical Models (Bayes Nets*, HMMs, decision theory)
   • Instance based learning (NN, KNN)
   • Decision trees* (ID3, C4.5)
   • Sequential Learning
           o Dynamic Bayesian Networks* (HMMs)
           o Sliding window*/Recurrent Sliding window (RNN, RDT)
           o Conditional Random fields
   • Linear/non-linear regression and classification
       • Generalized linear discriminant
       • Neural Networks (Perceptron, MLPs, RBFs)
       • Support Vector Machines*
Unsupervised learning
   • Partitional* (Generative: Mixture of Gaussians, reconstructive: K-means)
   • Hierarchical clustering* (single, complete, average link)
   • Spectral clustering
Semi-supervised learning* (Cluster-classify, transductive SVMs)
Reinforcement learning (Q-learning, TD learning)
Ensemble Methods* (Weighted majority, Bagging, Boosting)
Density estimation (NN, Kernel based, Bayesian approaches)
Feature Extraction techniques (Fisher, PCA, ICA)
Feature Selection techniques (Filtering, wrapping, Bayesian)
Parameter estimation techniques (ML, MAP, MC, EM, BP, CV)
* Covered in more depth


Written Requirement
The written requirement for this area will consist of a 24-hour take-home exam.

Signature: ______________________________ Date: _____________
Reading list
The reading list is structured as follows:


Fundamentals

Graphical Models
R. Duda, P. Hart, and D. Stork, "Chapter 2. Bayesian Decision Theory," in
Pattern Classification, 2nd ed: John Wiley & Sons, 2000.

S. Russell and P. Norvig, "Probabilistic Reasoning Systems," in Artificial
intelligence: a Modern Approach: Prentice Hall, 1995, pp. 436-467.

M. Jordan and C. Bishop, An Introduction to Graphical Models: to be published,
2001. CH 1, 5-7

K. Murphy, An Introduction to Graphical Models, Technical report, Intel
Research, May 2001.

R. Cowell, "Introduction to Inference for Bayesian Networks," in Learning in
Graphical Models: MIT Press, 1998, pp. 9-26.

R. Cowell, "Advanced Inference in Bayesian Networks," in Learning in Graphical
Models: MIT Press, 1998, pp. 27-49.

J. S. Yedidia and W. T. Freeman, "Understanding belief propagation and its
generalizations," in Exploring Artificial Intelligence in the New Millennium,
Chap. 8, S. a. T. Books, Ed., 2003, pp. 236-239.

D. Heckerman, "A Tutorial on Learning with Bayesian Networks," in Learning in
Graphical Models: MIT Press, 1998, pp. 301-354.

M. I. Jordan, Z. Ghahramani, T. Jaakkola, and L. K. Saul, "An Introduction to
Variational Methods for Graphical Models," in Learning in Graphical Models: MIT
Press, 1998, pp. 105-162.

D. J. C. MacKay, "Introduction to Monte Carlo Methods," in Learning in Graphical
Models: MIT Press, 1998, pp. 175-204.

C. Andrieu, N. d. Freitas, A. Doucet, and M. Jordan, "An Introduction to MCMC
for Machine Learning," Machine Learning, vol. 50, pp. 5-43, 2003.

H. Guo and W. Luo, "Implementation and Evaluation of Exact and Approximate
Dynamic Bayesian Network Inference Algorithms."


The Influence Model
S. Basu, T. Choudhury, and B. Clarkson, Learning Human Interactions with the
Influence Model, Technical report 539, Massachusetts Institute of Technology,
Media Lab, 2001.

A. K. Jammalamadaka, Aspects of Inference for the Influence Model and Related
Graphical Models, Master's Thesis, Electrical Engineering and Computer
Science, Massachusetts Institute of Technology.


Parameter estimation techniques (ML, MAP, MC, EM, BP, CV)
R. Duda, P. Hart, and D. Stork, "Chapter 3. Maximum Likelihood and Bayesian
Parameter Estimation," in Pattern Classification, 2nd ed: John Wiley & Sons,
2000.

C. M. Bishop, "Parameter Optimization Algorithms," in Neural Networks for
Pattern Recognition: Oxford University Press, 1995, pp. 253-292.


Instance based learning (NN, KNN)
R. Duda, P. Hart, and D. Stork, "Chapter 4. Non-parametric Techniques," in
Pattern Classification, 2nd ed: John Wiley & Sons, 2000.

T. Mitchell, "Instance-Based Learning," in Machine Learning: McGraw Hill, 1997.

C. Atkeson, A. Moore, and S. Schaal, "Locally Weighted Learning," in AI Review,
vol. 11: Kluwer, 1997, pp. 11-73.


Decision trees* (ID3, C4.5)
R. Duda, P. Hart, and D. Stork, "Chapter 8. Non-metric Methods," in Pattern
Classification, 2nd ed: John Wiley & Sons, 2000.

T. Mitchell, "Decision Tree Learning," in Machine Learning: McGraw Hill, 1997.

S. R. Safavian and D. Landgrebe, "A Survey of Decision Tree Classifier
Methodology," in IEEE Trans. Systems, Man, Cybernetics, vol. 21, 1991, pp.
660-674.

P. Domingos and F. Provost, Well-trained {PETs}: Improving Probability
Estimation Trees, CDER Working Paper #00-04-IS, Stern School of Business,
New York University, NY, NY 2000.

G. Bakiri and T. G. Dietterich, "Achieving High-Accuracy Text-to-Speech with
Machine Learning," in Data Mining in Speech Synthesis, B. Damper, Ed., 1997.


Sequential learning
R. Duda, P. Hart, and D. Stork, "Chapter 3.10 Hidden Markov Models," in Pattern
Classification, 2nd ed: John Wiley & Sons, 2000.

L. R. Rabiner and B.-H. Juang, "A Theory and Implementation of Hidden Markov
Models," in Fundamentals of Speech Recognition: Prentice Hall, 1993,
pp. 321-389.

K. P. Murphy, Hidden Semi-Markov Models (HSMMs), Technical report, 2002.

K. Murphy, Dynamic Bayesian Networks: Representation, Inference and
Learning, Ph.D thesis, Computer Science Division, UC Berkeley, 2002.

T. G. Dietterich, "Machine Learning for Sequential Data: A Review," in Structural,
Syntactic, and Statistical Pattern Recognition, vol. 2396, Lecture Notes in
Computer Science, Ed.: Springer-Verlag, 2002, pp. 15-30.

H. M. Wallach, Conditional Random Fields: An Introduction, Technical Report
MS-CIS-04-21, Department of Computer and Information Science, University of
Pennsylvania, 2004.

J. Lafferty, A. McCallum, and F. Pereira, "Conditional Random Fields:
Probabilistic Models for Segmenting and Labeling Sequential Data," in
Proceedings of the Eighteenth International Conference on Machine Learning
(ICML-2001), 2001.

A. McCallum, "Efficiently Inducing Features of Conditional Random Fields," in
Proceedings of the 19th Conference on Uncertainty in Artificial Intelligence (UAI
'03), 2003.

B. Pearlmutter, Dynamic Recurrent Neural Networks, Technical Report
CMU-CS-90-196, Carnegie Mellon University School of Computer Science,
Pittsburgh, PA, 1990.


Linear/non-linear regression and classification
R. Duda, P. Hart, and D. Stork, "Chapter 5. Linear Discriminant Functions," in
Pattern Classification, 2nd ed: John Wiley & Sons, 2000.

A. K. Jain, R. P. W. Duin, and J. C. Mao, "Statistical Pattern Recognition: A
Review," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.
22, 1, pp. 4-37, 2000.


Neural Networks
R. Duda, P. Hart, and D. Stork, "Chapter 6. Multi-layer Neural Networks," in
Pattern Classification, 2nd ed: John Wiley & Sons, 2000.

A. K. Jain, J. Mao, and K. M. Mohiuddin, "Artificial Neural Networks: A Tutorial,"
in IEEE Computer, vol. 29, 1996, pp. 31-44.

J. Sima and P. Orponen, "General-Purpose Computation with Neural Networks:
A Survey of Complexity Theoretic Results," in Neural Computation, vol. 15, 2003,
pp. 2727-2778.

C. M. Bishop, "Single Layer Networks," in Neural Networks for Pattern
Recognition: Oxford University Press, 1995, pp. 77-112.

C. M. Bishop, "The Multi-Layer Perceptron," in Neural Networks for Pattern
Recognition: Oxford University Press, 1995, pp. 116-137.

Support Vector Machines
C. J. C. Burges, "A Tutorial on Support Vector Machines for Pattern
Recognition," Data Mining and Knowledge Discovery, vol. 2, pp. 121-167, 1998.

R. Herbrich, "Kernel Classifiers from a Machine Learning Perspective," in
Learning Kernel Classifiers: The MIT Press, pp. 49-66.

C. Hsu, C. Chang, and C. Lin, A Practical Guide to Support Vector Classification,
Technical Report Department of Computer Science and Information Engineering,
National Taiwan University, 2003.


Density estimation
R. Duda, P. Hart, and D. Stork, "Chapter 4. Non-parametric Techniques," in
Pattern Classification, 2nd ed: John Wiley & Sons, 2000.

C. M. Bishop, "Probabilistic Density Estimation," in Neural Networks for Pattern
Recognition: Oxford University Press, 1995, pp. 33-73.


Unsupervised learning
R. Duda, P. Hart, and D. Stork, "Chapter 10. Unsupervised Learning and
Clustering," in Pattern Classification, 2nd ed: John Wiley & Sons, 2000.

P. Berkhin, Survey Of Clustering Data Mining Techniques, Technical Report
Accrue Software, San Jose, CA 2002.

G. Fung, A Comprehensive Overview of Basic Clustering Algorithms, Technical
Report, University of Wisconsin, Madison, 2001.

A. K. Jain, M. N. Murty, and P. J. Flynn, "Data Clustering: A Review," ACM
Computing Surveys, vol. 31, 3, pp. 264-323, 1999.

A. Ng, M. Jordan, and Y. Weiss, "On Spectral Clustering: Analysis and an
Algorithm," in Advances in Neural Information Processing Systems, 2001.

S. Zhong, Probabilistic Model-Based Clustering of Time Series, Ph.D Qualifying
proposal University of Texas at Austin, Austin, Texas, May 2002.


Semi-supervised learning* (Cluster-classify, transductive SVMs)

X. Zhu, Semi-Supervised Learning with Graphs, Technical Report CMU-LTI-05-192,
Carnegie Mellon University, 2005.

A. Blum and T. Mitchell, "Combining Labeled and Unlabeled Data with
Co-Training," in Proceedings of the 11th Conference on Computational Learning
Theory. Madison, Wisconsin, United States: Morgan Kaufmann Publishers, 1998,
pp. 92-100.

O. Chapelle, J. Weston, and B. Schölkopf, "Cluster Kernels for Semi-supervised
Learning," in Advances in Neural Information Processing Systems, vol. 15.
Cambridge, MA: The MIT Press, 2003, pp. 585-592.

K. Bennett and A. Demiriz, "Semi-Supervised Support Vector Machines," in
Proceedings of Advances in Neural Information Processing Systems, M. S.
Kearns, S. A. Solla, and D. A. Cohn, Eds.: The MIT Press, 1998, pp. 368-374.

K. P. Bennett and A. Demiriz, "Optimization Approaches to Semi-Supervised
Learning," in Applications and Algorithms of Complementarity, L.
Mangasarian and J. S. Pang, Eds.: Kluwer Academic Publishers, 2000.

T. Joachims, "Transductive Inference for Text Classification using Support Vector
Machines," in Proceedings of The 16th International Conference on Machine
Learning (ICML' 99). San Francisco, CA: Morgan Kaufmann Publishers, 1999,
pp. 200-209.


Reinforcement learning (Q-learning, TD learning)
L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement Learning: a
Survey," Journal of Artificial Intelligence Research, vol. 4, pp. 237-285, 1996.

R. S. Sutton and A. G. Barto, "The Reinforcement Learning Problem," in
Reinforcement Learning: The MIT Press, pp. 51-81.

R. S. Sutton and A. G. Barto, "Temporal-Difference Learning," in Reinforcement
Learning: The MIT Press, pp. 133-157.

R. S. Sutton, "Learning to Predict by the Methods of Temporal Differences,"
Machine Learning, vol. 3, pp. 9-44, 1988.


Feature Extraction techniques (Fisher, PCA, ICA)
R. Duda, P. Hart, and D. Stork, "Chapter 3.8 Component Analysis and
Discriminants," in Pattern Classification, 2nd ed: John Wiley & Sons, 2000.

C. J. C. Burges, Geometric Methods for Feature Extraction and Dimensional
Reduction: A Guided Tour, Technical Report MSR-TR-2004-55, Microsoft
Research, Redmond, WA 2003.


Ensemble Methods

T. G. Dietterich, "Ensemble Methods in Machine Learning," in Proceedings of the
First International Workshop on Multiple Classifier Systems, vol. 1857, 2000,
pp. 1-15.

Y. Freund and R. Schapire, "A short introduction to boosting," in Journal of
Japanese Society for Artificial Intelligence, vol. 11, 1999, pp. 771-780.

E. Bauer and R. Kohavi, "An Empirical Comparison of Voting Classification
Algorithms: Bagging, Boosting, and Variants," in Machine Learning, vol. 36,
1999, pp. 105-139.

T. G. Dietterich, "Machine-Learning Research: Four Current Directions," in The
AI Magazine, vol. 18, 1998, pp. 97-136.


Applications to activity recognition
M. Philipose, K. Fishkin, D. Fox, H. Kautz, D. Patterson, and M. Perkowitz,
"Guide: Towards Understanding Daily Life via Auto-Identification and Statistical
Analysis," in Proceedings of The 2nd International Workshop on Ubiquitous
Computing for Pervasive Healthcare Applications (UbiHealth ‘03). Seattle, WA,
2003.

D. Wilson, "Simultaneous Tracking & Activity Recognition (STAR) Using Many
Anonymous, Binary Sensors," to be published in Proceedings of The 3rd
International Conference on Pervasive Computing (Pervasive ‘05). Munich,
Germany, 2005.

D. Patterson, D. Fox, H. Kautz, and M. Philipose, "Expressive, Tractable and
Scalable Techniques for Modeling Activities of Daily Living," in Proceedings of
The 2nd International Workshop on Ubiquitous Computing for Pervasive
Healthcare Applications (UbiHealth '03). Seattle, WA, 2003.

D. Patterson, L. Liao, D. Fox, and H. Kautz, "Inferring High Level Behavior from
Low Level Sensors," in Fifth Annual Conference on Ubiquitous Computing
(UBICOMP 2003). Seattle, WA, 2003.

M. Philipose, K. P. Fishkin, M. Perkowitz, D. J. Patterson, D. Hahnel, D. Fox, and
H. Kautz, "Inferring Activities from Interactions with Objects," IEEE Pervasive
Computing Magazine, vol. 3, 4, 2004.

M. Perkowitz, M. Philipose, D. J. Patterson, and K. Fishkin, "Mining Models of
Human Activities from the Web," in Proceedings of The Thirteenth International
World Wide Web Conference (WWW '04). New York, USA, 2004.

D. Patterson, D. Fox, H. Kautz, and M. Philipose, Sporadic State Estimation for
General Activity Inference, Technical report irs_tr_04_003_a, Intel Research
Seattle and the University of Washington, June 2004.

D. Patterson, Modeling Details of the Activity Tracker, Technical report
irs_tr_04_003_a, Intel Research Seattle and the University of Washington,
Seattle, WA, July 2004.

N. M. Oliver, B. Rosario, and A. Pentland, "A Bayesian Computer Vision System
for Modeling Human Interactions," IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 22, 8, pp. 831-843, 2000.

N. Oliver, E. Horvitz, and A. Garg, "Hierarchical Representations for Learning
and Inferring Office Activity from Multimodal Information," in Proceedings of the
4th International Conference on Multimodal Interfaces, 2002.


S. Luhr, H. Bui, S. Venkatesh, and G. West, "Recognition of Human Activity
Through Hierarchical Stochastic Learning," in Proceedings of The IEEE
International Conference on Pervasive Computing and Communications: IEEE
Press, 2003.

More Related Content

What's hot (20)

PDF
An Extensive Review on Generative Adversarial Networks GAN’s
ijtsrd
 
PDF
September 2021 - Top 10 Read Articles in Signal & Image Processing
sipij
 
DOC
Comparison of relational and attribute-IEEE-1999-published ...
butest
 
PDF
July 2021: Top Read Articles in Signal & Image Processing
sipij
 
PDF
A NEW CODING METHOD IN PATTERN RECOGNITION FINGERPRINT IMAGE USING VECTOR QUA...
International Journal of Technical Research & Application
 
PDF
Prasanna Raut CV
Prasanna Raut
 
PDF
chalenges and apportunity of deep learning for big data analysis f
maru kindeneh
 
PDF
Handwriting identification using deep convolutional neural network method
TELKOMNIKA JOURNAL
 
PDF
Text classification based on gated recurrent unit combines with support vecto...
IJECEIAES
 
PDF
Most Cited Articles in Academia --Signal & Image Processing : An Internationa...
sipij
 
PDF
Most Viewed Articles in Academia for an year - International Journal of Ubiqu...
ijujournal
 
PDF
M.tech computerunitwise
chetanvchaudhari
 
PDF
Eat it, Review it: A New Approach for Review Prediction
vivatechijri
 
PDF
Prediction of Student's Performance with Deep Neural Networks
CSCJournals
 
PDF
Top Read Articles in April 2020 - IJU
ijujournal
 
PDF
Trends of machine learning in 2020 - International Journal of Artificial Inte...
gerogepatton
 
PDF
A new study of dss based on neural network and data mining
Attaporn Ninsuwan
 
PDF
IRJET- Deep Learning Techniques for Object Detection
IRJET Journal
 
DOC
Dr. B.M.Patil
Dr.Bankat M Patil
 
PDF
Use of artificial neural network in pattern recognition
kamalsrit
 
An Extensive Review on Generative Adversarial Networks GAN’s
ijtsrd
 
September 2021 - Top 10 Read Articles in Signal & Image Processing
sipij
 
Comparison of relational and attribute-IEEE-1999-published ...
butest
 
July 2021: Top Read Articles in Signal & Image Processing
sipij
 
A NEW CODING METHOD IN PATTERN RECOGNITION FINGERPRINT IMAGE USING VECTOR QUA...
International Journal of Technical Research & Application
 
Prasanna Raut CV
Prasanna Raut
 
chalenges and apportunity of deep learning for big data analysis f
maru kindeneh
 
Handwriting identification using deep convolutional neural network method
TELKOMNIKA JOURNAL
 
Text classification based on gated recurrent unit combines with support vecto...
IJECEIAES
 
Most Cited Articles in Academia --Signal & Image Processing : An Internationa...
sipij
 
Most Viewed Articles in Academia for an year - International Journal of Ubiqu...
ijujournal
 
M.tech computerunitwise
chetanvchaudhari
 
Eat it, Review it: A New Approach for Review Prediction
vivatechijri
 
Prediction of Student's Performance with Deep Neural Networks
CSCJournals
 
Top Read Articles in April 2020 - IJU
ijujournal
 
Trends of machine learning in 2020 - International Journal of Artificial Inte...
gerogepatton
 
A new study of dss based on neural network and data mining
Attaporn Ninsuwan
 
IRJET- Deep Learning Techniques for Object Detection
IRJET Journal
 
Dr. B.M.Patil
Dr.Bankat M Patil
 
Use of artificial neural network in pattern recognition
kamalsrit
 

Viewers also liked (11)

PPT
Ict Vision And Strategy Development
Alan McSweeney
 
PPTX
Machine Learning Introduction for Digital Business Leaders
Sudha Jamthe
 
PDF
Reproducibility and automation of machine learning process
Denis Dus
 
PDF
Directions towards a cool consumer review platform using machine learning (ml...
Dhwaj Raj
 
PDF
Lessons learned
hexgnu
 
PDF
Is Machine learning for your business? - Girls in Tech Luxembourg
Marie-Adélaïde Gervis
 
PDF
.Net development with Azure Machine Learning (AzureML) Nov 2014
Mark Tabladillo
 
PDF
Assignment of arbitrarily distributed random samples to the fixed probability...
Denis Dus
 
PDF
Requirements for next generation of Cloud Computing: Case study with multiple...
David Lary
 
PDF
Machine Learning part 2 - Introduction to Data Science
Frank Kienle
 
PPTX
Introduction to Machine Learning
Lior Rokach
 
Ict Vision And Strategy Development
Alan McSweeney
 
Machine Learning Introduction for Digital Business Leaders
Sudha Jamthe
 
Reproducibility and automation of machine learning process
Denis Dus
 
Directions towards a cool consumer review platform using machine learning (ml...
Dhwaj Raj
 
Lessons learned
hexgnu
 
Is Machine learning for your business? - Girls in Tech Luxembourg
Marie-Adélaïde Gervis
 
.Net development with Azure Machine Learning (AzureML) Nov 2014
Mark Tabladillo
 
Assignment of arbitrarily distributed random samples to the fixed probability...
Denis Dus
 
Requirements for next generation of Cloud Computing: Case study with multiple...
David Lary
 
Machine Learning part 2 - Introduction to Data Science
Frank Kienle
 
Introduction to Machine Learning
Lior Rokach
 
Ad

Similar to Technical Area: Machine Learning and Pattern Recognition (20)

DOCX
ML Project(by-Ethem-Alpaydin)-Introduction-to-Machine-Learni-24.docx
audeleypearl
 
PPTX
Techniques Machine Learning
DataminingTools Inc
 
PPTX
Intro to machine learning
Akshay Kanchan
 
PPTX
Simple overview of machine learning
priyadharshini R
 
PPTX
machine learning algorithm.pptx
SasmitaDash28
 
PDF
Introduction Machine Learning Syllabus
Andres Mendez-Vazquez
 
PPTX
Dive into Machine Learning Event MUGDSC.pptx
RakshaAgrawal21
 
PPTX
Dive into Machine Learning Event--MUGDSC
RakshaAgrawal21
 
PDF
Presentation-19.08.2024hvug7gugyvuvugugugugugug
amanna7980
 
PDF
Introduction to conventional machine learning techniques
Xavier Rafael Palou
 
PDF
ml basics ARTIFICIAL INTELLIGENCE, MACHINE LEARNING, TYPES OF MACHINE LEARNIN...
EmanAmir9
 
PDF
IRJET - A Survey on Machine Learning Algorithms, Techniques and Applications
IRJET Journal
 
PDF
A Compendium of Various Applications of Machine Learning
IRJET Journal
 
PPS
Brief Tour of Machine Learning
butest
 
PPT
Machine learning and deep learning algorithms
KannanA29
 
PPTX
5. Machine Learning.pptx
ssuser6654de1
 
PDF
A Few Useful Things to Know about Machine Learning
nep_test_account
 
PPTX
Introduction to Machine Learning
Panimalar Engineering College
 
PPTX
INTRODUCTION TO ML basics of ml that one should know
PriyanshuGupta285178
 
PPT
Introduction to Machine Learning Aristotelis Tsirigos
butest
 
ML Project(by-Ethem-Alpaydin)-Introduction-to-Machine-Learni-24.docx
audeleypearl
 
Techniques Machine Learning
DataminingTools Inc
 
Intro to machine learning
Akshay Kanchan
 
Simple overview of machine learning
priyadharshini R
 
machine learning algorithm.pptx
SasmitaDash28
 
Introduction Machine Learning Syllabus
Andres Mendez-Vazquez
 
Dive into Machine Learning Event MUGDSC.pptx
RakshaAgrawal21
 
Dive into Machine Learning Event--MUGDSC
RakshaAgrawal21
 
Presentation-19.08.2024hvug7gugyvuvugugugugugug
amanna7980
 
Introduction to conventional machine learning techniques
Xavier Rafael Palou
 
ml basics ARTIFICIAL INTELLIGENCE, MACHINE LEARNING, TYPES OF MACHINE LEARNIN...
EmanAmir9
 
IRJET - A Survey on Machine Learning Algorithms, Techniques and Applications
IRJET Journal
 
A Compendium of Various Applications of Machine Learning
IRJET Journal
 
Brief Tour of Machine Learning
butest
 
Machine learning and deep learning algorithms
KannanA29
 
5. Machine Learning.pptx
ssuser6654de1
 
A Few Useful Things to Know about Machine Learning
nep_test_account
 
Introduction to Machine Learning
Panimalar Engineering College
 
INTRODUCTION TO ML basics of ml that one should know
PriyanshuGupta285178
 
Introduction to Machine Learning Aristotelis Tsirigos
butest
 
Ad

More from butest (20)

PDF
EL MODELO DE NEGOCIO DE YOUTUBE
butest
 
DOC
1. MPEG I.B.P frame之不同
butest
 
PDF
LESSONS FROM THE MICHAEL JACKSON TRIAL
butest
 
PPT
Timeline: The Life of Michael Jackson
butest
 
DOCX
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
butest
 
PDF
LESSONS FROM THE MICHAEL JACKSON TRIAL
butest
 
PPTX
Com 380, Summer II
butest
 
PPT
PPT
butest
 
DOCX
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
butest
 
DOC
MICHAEL JACKSON.doc
butest
 
PPTX
Social Networks: Twitter Facebook SL - Slide 1
butest
 
PPT
Facebook
butest
 
DOCX
Executive Summary Hare Chevrolet is a General Motors dealership ...
butest
 
DOC
Welcome to the Dougherty County Public Library's Facebook and ...
butest
 
DOC
NEWS ANNOUNCEMENT
butest
 
DOC
C-2100 Ultra Zoom.doc
butest
 
DOC
MAC Printing on ITS Printers.doc.doc
butest
 
DOC
Mac OS X Guide.doc
butest
 
DOC
hier
butest
 
DOC
WEB DESIGN!
butest
 
EL MODELO DE NEGOCIO DE YOUTUBE
butest
 
1. MPEG I.B.P frame之不同
butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
butest
 
Timeline: The Life of Michael Jackson
butest
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
butest
 
Com 380, Summer II
butest
 
PPT
butest
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
butest
 
MICHAEL JACKSON.doc
butest
 
Social Networks: Twitter Facebook SL - Slide 1
butest
 
Facebook
butest
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
butest
 
Welcome to the Dougherty County Public Library's Facebook and ...
butest
 
NEWS ANNOUNCEMENT
butest
 
C-2100 Ultra Zoom.doc
butest
 
MAC Printing on ITS Printers.doc.doc
butest
 
Mac OS X Guide.doc
butest
 
hier
butest
 
WEB DESIGN!
butest
 

Technical Area: Machine Learning and Pattern Recognition

  • 1. Technical Area: Machine Learning and Pattern Recognition Examiner: Alex “Sandy” Pentland Toshiba Professor in Media Arts and Sciences MIT Media Laboratory Description The objective of this area is to familiarize myself with the main techniques and algorithms of machine learning and pattern recognition. The goal is to understand high- level advantages and disadvantages of various approaches to better understand how they can be used, combined, and improved. Supervised learning • Graphical Models (Bayes Nets*, HMMs, decision theory) • Instance based learning (NN, KNN) • Decision trees* (ID3, C4.5) • Sequential Learning o Dynamic Bayesian Networks* (HMMs) o Sliding window*/Recurrent Sliding window (RNN, RDT) o Conditional Random fields • Linear/non-linear regression and classification • Generalized linear discriminant • Neural Networks (Perceptron, MLPs, RBFs) • Support Vector Machines* Unsupervised learning • Partitional* (Generative: Mixture of Gaussians, reconstructive:K-means) • Hierarchical clustering* (single, complete, average link) • Spectral clustering Semi-supervised learning* (Cluster-classify, transductive SVMs) Reinforcement learning (Q-learning, TD learning) Ensemble Methods* (Weighted majority, Bagging, Boosting) Density estimation (NN, Kernel based, Bayesian approaches) Feature Extraction techniques (Fisher, PCA, ICA) Feature Selection techniques (Filtering, wrapping, Bayesian) Parameter estimation techniques (ML, MAP, MC, EM, BP, CV) * Covered in more depth Written Requirement The written requirement for this area will consist of a 24-hour take-home exam. Signature: ______________________________ Date: _____________
  • 2. Reading list The reading list is structured as follow: Fundamentals Graphical Models R. Duda, P. Hart, and D. Stork, "Chapter 2. Bayesian Decision Theory," in Pattern Classification, 2nd ed: John Wiley & Sons, 2000. S. Russell and P. Norvig, "Probabilistic Reasoning Systems," in Artificial intelligence: a Modern Approach: Prentice Hall, 1995, pp. 436-467. M. Jordan and C. Bishop, An Introduction to Graphical Models: to be published, 2001. CH 1, 5-7 K. Murphy, An Introduction to Graphical Models, Technical report, Intel Research, May 2001. R. Cowell, "Introduction to Inference for Bayesian Networks," in Learning in Graphical Models: MIT Press, 1998, pp. 9-26. R. Cowell, "Advanced Inference in Bayesian Networks," in Learning in Graphical Models: MIT Press, 1998, pp. 27-49. J. S. Yedidia and W. T. Freeman, "Understanding belief propagation and its generalizations," in Exploring Artificial Intelligence in the New Millenium, vol. Chap 8, S. a. T. Books, Ed., 2003, pp. 236-239. D. Heckerman, "A Tutorial on Learning with Bayesian Networks," in Learning in Graphical Models: MIT Press, 1998, pp. 301-354. M. I. Jordan, Z. Ghahramani, T. Jaakkola, and L. K. Saul, "An Introduction to Variational Methods for Graphical Models," in Learning in Graphical Models: MIT Press, 1998, pp. 105-162. D. J. C. MacKay, "Introduction to Monte Carlo Methods," in Learning in Graphical Models: MIT Press, 1998, pp. 175-204. C. Andrieu, N. d. Freitas, A. Doucet, and M. Jordan, "An Introduction to MCMC for Machine Learning," Machine Learning, vol. 50, pp. 5-43, 2003. H. Guo and W. Luo, "Implementation and Evaluation of Exact and Approximate Dynamic Bayesian Network Inference Algorithms."
  • 3. The Influence Model S. Basu, T. Choudhury, and B. Clarkson, Learning Human Interactions with the Influence Model, Technical report 539, Massachusetts Institute of Technology, Media Lab, 2001. A. K. Jammalamadaka, Aspects of Inference for the Influence Model and Related Graphical Models, Master's Thesis, Electrical Engineering and Computer Science, Massachusetts Institute of Technology. Parameter estimation techniques (ML, MAP, MC, EM, BP, CV) R. Duda, P. Hart, and D. Stork, "Chapter 3. Maximum Likelihood and Bayesian Parameter Estimation," in Pattern Classification, 2nd ed: John Wiley & Sons, 2000. C. M. Bishop, "Parameter Optimization Algorithms," in Neural Networks for Pattern Recognition: Oxford University Press, 1995, pp. 253-292. Instance based learning (NN, KNN) R. Duda, P. Hart, and D. Stork, "Chapter 4. Non-parametric Techniques," in Pattern Classification, 2nd ed: John Wiley & Sons, 2000. T. Mitchell, "Instance-Based Learning," in Machine Learning: McGraw Hill, 1997. C. Atkeson, A. Moore, and S. Schaal, "Locally Weighted Learning," in AI Review, vol. 11: Kluwer, 1997, pp. 11-73. Decision trees* (ID3, C4.5) R. Duda, P. Hart, and D. Stork, "Chapter 8. Non-metric Methods," in Pattern Classification, 2nd ed: John Wiley & Sons, 2000. T. Mitchell, "Decision Tree Learning," in Machine Learning: McGraw Hill, 1997 S. R. Safavian and D. Landgrebe, "A Survey of Decision Tree Classifier Methodology," in IEEE Trans. Systems, Man, Cybernetics, vol. 21, 1991, pp. 660-674. P. Domingos and F. Provost, Well-trained {PETs}: Improving Probability Estimation Trees, CDER Working Paper #00-04-IS, Stern School of Business, New York University, NY, NY 2000. G. Bakiri and T. G. Dietterich, "Achieving High-Accuracy Text-to-Speech with Machine Learning," in Data Mining in Speech Syntesis, B. Damper, Ed., 1997.
  • 4. Sequential learning R. Duda, P. Hart, and D. Stork, "Chapter 3.1 Hidden Markov Models," in Pattern Classification, 2nd ed: John Wiley & Sons, 2000. L. R. Rabiner and B.-H. Juang, "A Theory and Implementation of Hidden Markov Models," in Fundamentals of Speech Recognition: Prentice Hall, 1993, pp. 321- 389. K. P. Murphy, Hidden Semi-Markov Models (HSMMs), Technical report, 2002. K. Murphy, Dynamic Bayesian Networks: Representation, Inference and Learning, Ph.D thesis, Computer Science Division, UC Berkeley, 2002. T. G. Dietterich, "Machine Learning for Sequential Data: A Review," in Structural, Syntactic, and Statistical Pattern Recognition, vol. 2396, Lecture Notes in Computer Science, Ed.: Springer-Verlag, 2002, pp. 15-30. H. M. Wallach, Conditional Random Fields: An Introduction, Technical Report MS-CIS-04-21, Department of Computer and Information Science, University of Pennsylvania, 2004. J. Lafferty, A. McCallum, and F. Pereira, "Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequential Data," in In Proceedings of the Eighteenth International conference on Machine Learning (ICML-2001), 2001 A. McCallum, "Efficiently Inducing Features of Conditional Random Fields," in Proceedings of the 19ty Conference in Uncertainty in Artificial Intelligence (UIA '03), 2003. B. Pearlmutter, Dynamic Recurrent Neural Networks, Technical Report CMU-CS- 90-196, Carnegie Mellon University School of Computer Science, Pittsburgh, PA 1990. Linear/non-linear regression and classification R. Duda, P. Hart, and D. Stork, "Chapter 5. Linear Discriminant Functions," in Pattern Classification, 2nd ed: John Wiley & Sons, 2000. A. K. Jain, R. P. W. Duin, and J. C. Mao, "Statistical Pattern Recognition: A Review," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, 1, pp. 4-37, 2000.
  • 5. Neural Networks R. Duda, P. Hart, and D. Stork, "Chapter 6. Multi-layer Neural Networks," in Pattern Classification, 2nd ed: John Wiley & Sons, 2000. A. K. Jain, J. Mao, and K. M. Mohiuddin, "Artificial Neural Networks: A Tutorial," in IEEE Computer, vol. 29, 1996, pp. 31-44. J. Sima and P. Orponen, "General-Purpose Computation with Neural Networks: A Survey of Complexity Theoretic Results," in Neural Computation, vol. 15, 2003, pp. 2727-2778 C. M. Bishop, "Single Layer Networks," in Neural Networks for Pattern Recognition: Oxford University Press, 1995, pp. 77-112. C. M. Bishop, "The Multi-Layer Perceptron," in Neural Networks for Pattern Recognition: Oxford University Press, 1995, pp. 116-137. Support Vector Machines C. J. C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, vol. 2, pp. 121-167, 1998. R. Herbrich, "Kernel Classifiers from a Machine Learning Perspective," in Learning Kernel Classifiers: The MIT Press, pp. 49-66. C. Hsu, C. Chang, and C. Lin, A Practical Guide to Support Vector Classification, Technical Report Department of Computer Science and Information Engineering, National Taiwan University, 2003. Density estimation R. Duda, P. Hart, and D. Stork, "Chapter 4. Non-parametric Techniques," in Pattern Classification, 2nd ed: John Wiley & Sons, 2000. C. M. Bishop, "Probabilistic Density Estimation," in Neural Networks for Pattern Recognition: Oxford University Press, 1995, pp. 33-73. Unsupervised learning R. Duda, P. Hart, and D. Stork, "Chapter 10. Unsupervised Learning and Clustering," in Pattern Classification, 2nd ed: John Wiley & Sons, 2000. P. Berkhin, Survey Of Clustering Data Mining Techniques, Technical Report Accrue Software, San Jose, CA 2002. G. Fung, A Comprehensive Overview of Basic Clustering Algorithms, Technical Report, University of Winsconsin, Madison, 2001.
  • 6. A. K. Jain, M. N. Murty, and P. J. Flynn, "Data Clustering: A Review," ACM Computing Surveys, vol. 31, 3, pp. 264-323, 1999.
A. Ng, M. Jordan, and Y. Weiss, "On Spectral Clustering: Analysis and an Algorithm," in Advances in Neural Information Processing Systems, 2001.
S. Zhong, Probabilistic Model-Based Clustering of Time Series, Ph.D. qualifying proposal, University of Texas at Austin, Austin, Texas, May 2002.

Semi-supervised learning* (Cluster-classify, transductive SVMs)
X. Zhu, Semi-Supervised Learning with Graphs, Technical Report CMU-LTI-05-192, Carnegie Mellon University, 2005.
A. Blum and T. Mitchell, "Combining Labeled and Unlabeled Data with Co-Training," in Proceedings of the 11th Conference on Computational Learning Theory. Madison, Wisconsin, United States: Morgan Kaufmann Publishers, 1998, pp. 92-100.
O. Chapelle, J. Weston, and B. Schölkopf, "Cluster Kernels for Semi-supervised Learning," in Advances in Neural Information Processing Systems, vol. 15. Cambridge, MA: The MIT Press, 2003, pp. 585-592.
K. Bennett and A. Demiriz, "Semi-Supervised Support Vector Machines," in Proceedings of Advances in Neural Information Processing Systems, M. S. Kearns, S. A. Solla, and D. A. Cohn, Eds.: The MIT Press, 1998, pp. 368-374.
K. P. Bennett and A. Demiriz, "Optimization Approaches to Semi-Supervised Learning," to appear in Applications and Algorithms of Complementarity, L. Mangasarian and J. S. Pang, Eds.: Kluwer Academic Publishers, 2000.
T. Joachims, "Transductive Inference for Text Classification using Support Vector Machines," in Proceedings of The 16th International Conference on Machine Learning (ICML '99). San Francisco, CA: Morgan Kaufmann Publishers, 1999, pp. 200-209.

Reinforcement learning (Q-learning, TD learning)
L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement Learning: A Survey," Journal of Artificial Intelligence Research, vol. 4, pp. 237-285, 1996.
R. S. Sutton and A. G. Barto, "The Reinforcement Learning Problem," in Reinforcement Learning: The MIT Press, pp. 51-81.
  • 7. R. S. Sutton and A. G. Barto, "Temporal-Difference Learning," in Reinforcement Learning: The MIT Press, pp. 133-157.
R. S. Sutton, "Learning to Predict by the Method of Temporal Differences," Machine Learning, vol. 3, pp. 9-44, 1988.

Feature Extraction techniques (Fisher, PCA, ICA)
R. Duda, P. Hart, and D. Stork, "Chapter 3.8 Component Analysis and Discriminants," in Pattern Classification, 2nd ed: John Wiley & Sons, 2000.
C. J. C. Burges, Geometric Methods for Feature Extraction and Dimensional Reduction: A Guided Tour, Technical Report MSR-TR-2004-55, Microsoft Research, Redmond, WA 2003.

Ensemble Methods
T. G. Dietterich, "Ensemble Methods in Machine Learning," in Proceedings of the First International Workshop on Multiple Classifier Systems, vol. 1857, 2000, pp. 1-15.
Y. Freund and R. Schapire, "A Short Introduction to Boosting," in Journal of Japanese Society for Artificial Intelligence, vol. 11, 1999, pp. 771-780.
E. Bauer and R. Kohavi, "An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants," in Machine Learning, vol. 36, 1999, pp. 105-139.
T. G. Dietterich, "Machine-Learning Research: Four Current Directions," in The AI Magazine, vol. 18, 1997, pp. 97-136.

Applications to activity recognition
M. Philipose, K. Fishkin, D. Fox, H. Kautz, D. Patterson, and M. Perkowitz, "Guide: Towards Understanding Daily Life via Auto-Identification and Statistical Analysis," in Proceedings of The 2nd International Workshop on Ubiquitous Computing for Pervasive Healthcare Applications (UbiHealth '03). Seattle, WA, 2003.
D. Wilson, "Simultaneous Tracking & Activity Recognition (STAR) Using Many Anonymous, Binary Sensors," to be published in Proceedings of The 3rd International Conference on Pervasive Computing (Pervasive '05). Munich, Germany, 2005.
  • 8. D. Patterson, D. Fox, H. Kautz, and M. Philipose, "Expressive, Tractable and Scalable Techniques for Modeling Activities of Daily Living," in Proceedings of The 2nd International Workshop on Ubiquitous Computing for Pervasive Healthcare Applications (UbiHealth '03). Seattle, WA, 2003.
D. Patterson, L. Liao, D. Fox, and H. Kautz, "Inferring High Level Behavior from Low Level Sensors," in Fifth Annual Conference on Ubiquitous Computing (UBICOMP 2003). Seattle, WA, 2003.
M. Philipose, K. P. Fishkin, M. Perkowitz, D. J. Patterson, D. Hahnel, D. Fox, and H. Kautz, "Inferring Activities from Interactions with Objects," IEEE Pervasive Computing Magazine, vol. 3, 4, 2004.
M. Perkowitz, M. Philipose, D. J. Patterson, and K. Fishkin, "Mining Models of Human Activities from the Web," in Proceedings of The Thirteenth International World Wide Web Conference (WWW '04). New York, USA, 2004.
D. Patterson, D. Fox, H. Kautz, and M. Philipose, Sporadic State Estimation for General Activity Inference, Technical report irs_tr_04_003_a, Intel Research Seattle and the University of Washington, June 2004.
D. Patterson, Modeling Details of the Activity Tracker, Technical report irs_tr_04_003_a, Intel Research Seattle and the University of Washington, Seattle, WA, July 2004.
N. M. Oliver, B. Rosario, and A. Pentland, "A Bayesian Computer Vision System for Modeling Human Interactions," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, 8, pp. 831-843, 2000.
N. Oliver, E. Horvitz, and A. Garg, "Hierarchical Representations for Learning and Inferring Office Activity from Multimodal Information," in Proceedings of the 4th International Conference on Multimodal Interfaces, 2002.
S. Luhr, H. Bui, S. Venkatesh, and G. West, "Recognition of Human Activity Through Hierarchical Stochastic Learning," in Proceedings of The IEEE International Conference on Pervasive Computing and Communications: IEEE Press, 2003.