Machine Learning Introduction
Textbook: Machine Learning, Tom M. Mitchell, McGraw-Hill; supplementary material (for presentation) from Reinforcement Learning: An Introduction, R. S. Sutton and A. G. Barto, The MIT Press, 1998.
Machine Learning
How to construct computer programs that automatically improve with experience.
Applications: data mining (medical applications, 1989), detecting fraudulent credit card transactions (1989), information filtering for users' reading preferences, autonomous vehicles, backgammon at the level of world champions (1992), speech recognition (1989), optimizing energy cost.
Machine learning theory:
How does learning performance vary with the number of training examples presented?
What learning algorithms are most appropriate for various types of learning tasks?
Example programs: http://www.cs.cmu.edu/~tom/mlbook.html
Face recognition
Decision tree learning code
Data for financial loan analysis
Bayes classifier code
Data for analyzing text documents
Theoretical Studies
Fundamental relationship among the number of training examples observed, the number of hypotheses under consideration, and the expected error in learned hypotheses
Biological systems
Def. A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Outline
Why Machine Learning?
What is a well-defined learning problem?
An example: learning to play checkers
What questions should we ask about Machine Learning?
Why Machine Learning?
Recent progress in algorithms and theory
Growing flood of online data
Computational power is available
Budding industry
Three niches for machine learning:
Data mining: using historical data to improve decisions (e.g., medical records → medical knowledge)
Software applications we can't program by hand: autonomous driving, speech recognition
Self-customizing programs: a newsreader that learns user interests
Typical Datamining Task (1/2)
Data: (table of patient records shown on the original slide)
Typical Datamining Task (2/2)
Given: 9,714 patient records, each describing a pregnancy and birth; each record contains 215 features.
Learn to predict: classes of future patients at high risk for emergency Cesarean section.
Datamining Result
One of 18 learned rules:
If no previous vaginal delivery, and abnormal 2nd-trimester ultrasound, and malpresentation at admission,
then probability of emergency C-section is 0.6.
Accuracy over training data: 26/41 = .63; over test data: 12/20 = .60.
Credit Risk Analysis (1/2)
Data: (table of customer records shown on the original slide)
Credit Risk Analysis (2/2)
Rules learned from synthesized data:
If Other-Delinquent-Accounts > 2, and Number-Delinquent-Billing-Cycles > 1, then Profitable-Customer? = No [deny credit card application]
If Other-Delinquent-Accounts = 0, and (Income > $30k) or (Years-of-Credit > 3), then Profitable-Customer? = Yes [accept credit card application]
Other Prediction Problems (1/2) and (2/2): (examples shown as figures on the original slides)
Problems Too Difficult to Program by Hand ALVINN [Pomerleau] drives 70 mph on highways
Software that Customizes to User http://www.wisewire.com
Where Is this Headed? (1/2)
Today: tip of the iceberg
First-generation algorithms: neural nets, decision trees, regression, ...
Applied to well-formatted databases
Budding industry
Where Is this Headed? (2/2) Opportunity for tomorrow: enormous impact Learn across full mixed-media data Learn across multiple internal databases, plus the web and newsfeeds Learn by active experimentation Learn decisions rather than predictions Cumulative, lifelong learning Programming languages with learning embedded?
Relevant Disciplines Artificial intelligence Bayesian methods Computational complexity theory Control theory Information theory Philosophy Psychology and neurobiology Statistics . . .
What is the Learning Problem? Learning = Improving with experience at some task Improve over task T, with respect to performance measure P, based on experience E. E.g., Learn to play checkers T: Play checkers P: % of games won in world tournament E: opportunity to play against self
Learning to Play Checkers T: Play checkers P: Percent of games won in world tournament What experience? What exactly should be learned? How shall it be represented? What specific algorithm to learn it?
Type of Training Experience Direct or indirect? Teacher or not? A problem: is training experience representative of performance goal?
Choose the Target Function
ChooseMove : Board → Move ?
V : Board → R ?
. . .
Possible Definition for Target Function V
If b is a final board state that is won, then V(b) = 100.
If b is a final board state that is lost, then V(b) = -100.
If b is a final board state that is drawn, then V(b) = 0.
If b is not a final state in the game, then V(b) = V(b'), where b' is the best final board state that can be achieved starting from b and playing optimally until the end of the game.
This gives correct values, but is not operational.
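The case analysis above can be written down directly. A minimal sketch, illustrated on a tiny invented game tree rather than real checkers (the tree, the outcome table, and all names are assumptions for illustration):

```python
# Sketch of the ideal target function V on a tiny invented game tree.
# Final states map to outcome values; a non-final state takes the value
# of the best board state reachable from it, as in the slide's definition.

TREE = {                 # state -> successor states (hypothetical game)
    "start": ["a", "b"],
    "a": ["win1", "draw1"],
    "b": ["lose1"],
}
OUTCOME = {"win1": 100, "draw1": 0, "lose1": -100}  # final-state values

def V(b):
    if b in OUTCOME:                   # final board state: won/drawn/lost
        return OUTCOME[b]
    return max(V(s) for s in TREE[b])  # best achievable continuation

print(V("start"))  # -> 100
```

This also shows why the definition is not operational: evaluating V requires searching all the way to the end of the game.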
Choose Representation for Target Function
A collection of rules?
A neural network?
A polynomial function of board features?
. . .
A Representation for the Learned Function
w0 + w1·bp(b) + w2·rp(b) + w3·bk(b) + w4·rk(b) + w5·bt(b) + w6·rt(b)
bp(b): number of black pieces on board b
rp(b): number of red pieces on b
bk(b): number of black kings on b
rk(b): number of red kings on b
bt(b): number of red pieces threatened by black (i.e., which can be taken on black's next turn)
rt(b): number of black pieces threatened by red
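Evaluating this linear form is a one-liner. A sketch assuming the six features arrive as a plain tuple (a real system would extract bp..rt from a board object; the weight values here are illustrative, not learned):

```python
# Linear evaluation: w0 + w1*bp + w2*rp + w3*bk + w4*rk + w5*bt + w6*rt.

def v_hat(w, features):
    """w: weights (w0..w6); features: (bp, rp, bk, rk, bt, rt) for a board."""
    return w[0] + sum(wi * fi for wi, fi in zip(w[1:], features))

w = [0.0, 1.0, -1.0, 3.0, -3.0, 0.5, -0.5]  # illustrative weights
print(v_hat(w, (12, 12, 0, 0, 0, 0)))       # symmetric opening position -> 0.0
```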
Obtaining Training Examples
V(b): the true target function
V̂(b): the learned function
Vtrain(b): the training value
One rule for estimating training values: Vtrain(b) ← V̂(Successor(b))
Choose Weight Tuning Rule
LMS weight update rule. Do repeatedly:
Select a training example b at random.
1. Compute the error: error(b) = Vtrain(b) − V̂(b)
2. For each board feature fi, update weight wi: wi ← wi + c · fi · error(b)
Here c is some small constant, say 0.1, to moderate the rate of learning.
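The two steps above map directly to code. A self-contained sketch trained on two invented (features, Vtrain) pairs — the feature counts and target values are synthetic, not from Mitchell's system:

```python
import random

def v_hat(w, f):
    # Linear hypothesis: w0 + sum of wi * fi over the six board features.
    return w[0] + sum(wi * fi for wi, fi in zip(w[1:], f))

def lms_step(w, features, v_train, c=0.1):
    error = v_train - v_hat(w, features)        # 1. compute error(b)
    w[0] += c * error                           # bias term (f0 = 1)
    for i, fi in enumerate(features, start=1):  # 2. wi <- wi + c*fi*error(b)
        w[i] += c * fi * error
    return w

# Synthetic pairs: (bp, rp, bk, rk, bt, rt) -> training value.
examples = [((2, 1, 1, 0, 1, 0), 20.0), ((1, 2, 0, 1, 0, 1), -20.0)]
random.seed(0)
w = [0.0] * 7
for _ in range(500):
    f, v = random.choice(examples)
    lms_step(w, f, v)
# After repeated updates, v_hat(w, f) approaches each training value.
```

The repeated small corrections are the whole trick: each step nudges the weights in proportion to how much each feature contributed to the error.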
Final Design
The performance system: plays games
The critic: detects discrepancies (analysis)
The generalizer: generates new hypotheses
The experiment generator: generates new problems
Learning Methods
Backgammon: extended the feature set to six features; reinforcement learning
Neural network: learns from the raw board itself; after roughly one million games of self-play, plays at a level comparable to humans
Nearest-neighbor algorithm: stores many training examples, then handles a new case by finding and reusing the closest stored one
Genetic algorithm: creates many candidate programs and evolves them through survival of the fittest
Explanation-based learning: learns by analyzing the reasons for winning and losing
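Of these, nearest neighbor is the simplest to sketch: there is no training phase at all, just storage and lookup. The toy feature vectors and labels below are invented for illustration:

```python
# Nearest-neighbor classification: answer a query with the label of the
# stored example whose features are closest (squared Euclidean distance).

def nearest_neighbor(examples, query):
    """examples: list of (features, label) pairs; returns the closest label."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(examples, key=lambda ex: dist2(ex[0], query))[1]

store = [((0.0, 0.0), "lose"), ((1.0, 1.0), "win")]
print(nearest_neighbor(store, (0.9, 0.8)))  # -> win
```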
Design Choices
Some Issues in Machine Learning
What algorithms can approximate functions well (and when)?
How does the number of training examples influence accuracy?
How does the complexity of the hypothesis representation impact it?
How does noisy data influence accuracy?
What are the theoretical limits of learnability?
How can prior knowledge of the learner help?
What clues can we get from biological learning systems?
How can systems alter their own representations?
