Chapter 8 Covering (Rules-based) Algorithm Data Mining Technology
Chapter 8 Covering (Rules-based) Algorithm Written by Shakhina Pulatova Presented by Zhao Xinyou 2007.11.13 Data Mining Technology Some materials (examples) are taken from websites.
Contents
1. What is the Covering (Rule-based) algorithm?
2. Classification Rules - Straightforward
   1. If-Then rule
   2. Generating rules from Decision Tree
3. Rule-based Algorithm
   1. The 1R Algorithm / Learn One Rule
   2. The PRISM Algorithm
   3. Other Algorithm
4. Application of Covering algorithm
5. Discussion on e/m-learning application
Introduction-App-1 PP87-88 Training data consists of records described by attributes. Rules can be given by people or generated by computer. Setting: 1. (0, 1.75) short 2. [1.75, 1.95) medium 3. [1.95, ..) tall
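The three intervals in the setting above can be read as hand-written if-then rules. The following is a minimal sketch, assuming heights are given in metres; the function name `classify_height` is hypothetical and not part of the slides.

```python
def classify_height(height_m: float) -> str:
    """Apply the three interval rules from the setting to one height value."""
    if height_m < 1.75:      # rule 1: below 1.75 -> short
        return "short"
    if height_m < 1.95:      # rule 2: [1.75, 1.95) -> medium
        return "medium"
    return "tall"            # rule 3: 1.95 and above -> tall


# Boundary values fall into the interval that is closed on the left.
labels = [classify_height(h) for h in (1.60, 1.75, 1.94, 1.95, 2.10)]
print(labels)
```

Rules of this kind are "given by people"; the rest of the chapter is about letting the computer generate them from training data.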
Introduction-App-2 PP87-88 How can all tall people in data set B be found, based on the training data A? [Figure: training data A + data set B]
What is Rule-based Algorithm? Definition: Each classification method uses an algorithm to generate rules from the sample data. These rules are then applied to new data. Rule-based algorithms provide mechanisms that generate rules by 1. concentrating on a specific class at a time; 2. maximizing the probability of the desired classification. The rules should be compact, easy to interpret, and accurate. PP87-88
Classification Rules- Straightforward If-Then rule Generating rules from Decision Tree PP88-89
Formal Specification of Rule-based Algorithm A classification rule, r=<a, c>, consists of: a (antecedent/precondition): a series of tests that can be evaluated as true or false; c (consequent/conclusion): the class or classes that apply to instances covered by rule r. PP88 [Figure: decision tree that tests a=0 at the root and b=0 on each branch (yes/no edges), with leaf classes X and Y; the four paths correspond to a=0,b=0; a=0,b=1; a=1,b=0; a=1,b=1]
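A rule r=<a, c> maps naturally onto a small data structure: a predicate for the antecedent and a label for the consequent. The sketch below is only an illustration; the names `Rule`, `covers`, and `classify`, and the height thresholds in the example rules, are assumptions rather than anything defined in the slides.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]


@dataclass
class Rule:
    antecedent: Callable[[Record], bool]  # a: test evaluated as true/false on a tuple
    consequent: str                       # c: class assigned to covered tuples

    def covers(self, record: Record) -> bool:
        return self.antecedent(record)


def classify(rules: List[Rule], record: Record) -> Optional[str]:
    """Return the class of the first rule that covers the record, else None."""
    for rule in rules:
        if rule.covers(record):
            return rule.consequent
    return None


rules = [
    Rule(lambda r: r["height"] >= 1.95, "tall"),
    Rule(lambda r: 1.75 <= r["height"] < 1.95, "medium"),
]
print(classify(rules, {"height": 2.0}))   # covered by the first rule
print(classify(rules, {"height": 1.5}))   # covered by no rule
```

A record that no rule covers is returned as `None` here; a real rule set would usually add a default class.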
Remarks of Straightforward classification The antecedent contains a predicate that can be evaluated as true or false against each tuple in the database. These rules relate directly to the corresponding decision tree (DT) that could be created. A DT can always be used to generate rules, but the two are not equivalent. Differences: - the tree has an implied order in which the splitting is performed; rules have no order. - a tree is created by looking at all classes; when generating rules, only one class must be examined at a time. PP88-89
If-Then rule A straightforward way to perform classification is to generate if-then rules that cover all cases. Example 1 PP88
Generating rules from Decision Tree -1-Con’ Decision Tree 2
Generating rules from Decision Tree -2-Con’ [Figure: decision tree with internal nodes a, b, c, d, branches labelled y/n, and leaf classes x and y]
Generating rules from Decision Tree -3-Con’
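Generating rules from a decision tree amounts to emitting one rule per root-to-leaf path: the tests along the path form the antecedent and the leaf gives the class. The sketch below assumes a hypothetical tree encoding (a leaf is a class name, an internal node is a pair of attribute and branches) and a made-up tree; the trees on the slides are not recoverable from the transcript.

```python
def tree_to_rules(tree, path=()):
    """Return one (conditions, class) rule for every root-to-leaf path."""
    if isinstance(tree, str):                 # leaf: the path so far is a rule
        return [(list(path), tree)]
    attribute, branches = tree                # internal node: (attribute, {value: subtree})
    rules = []
    for value, subtree in branches.items():
        rules.extend(tree_to_rules(subtree, path + ((attribute, value),)))
    return rules


# Hypothetical tree: test a first, then b on the a=0 branch.
tree = ("a", {0: ("b", {0: "X", 1: "Y"}), 1: "Y"})

rules = tree_to_rules(tree)
for conditions, label in rules:
    tests = " and ".join(f"{attr}={value}" for attr, value in conditions)
    print(f"if {tests} then class={label}")
```

Every leaf yields a rule, so a tree with duplicated subtrees produces correspondingly repetitive rules.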
Remarks Rules generated from a DT may be more complex and harder to understand than necessary. Adding a new test or rule requires reshaping the whole tree. Rules obtained without decision trees are often more compact and accurate, so many other covering algorithms have been proposed. PP89-90 [Figure: decision tree over a, b, c, d in which the subtree testing c and d is repeated on several branches (duplicate subtrees); a second tree tests a=0 and then b=0, with leaf classes X and Y; example rule: a=1 and c=0 -> Y]
Rule-based Classification Generate rules The 1R Algorithm / Learn One Rule The PRISM Algorithm Other Algorithm PP90
Generating rules without Decision Trees-1-con’ Goal: find rules that identify the instances of a specific class. Generate the “best” rule possible by optimizing the desired classification probability. Usually, the “best” attribute-value pair is chosen. Remark: these techniques are also called covering algorithms because they attempt to generate rules which exactly cover a specific class.
Generate Rules-Example-2-Con' Example 3 Question: We want to generate a rule to classify persons as tall. Basic format of the rule: if ? then class = tall. Goal: replace “?” with predicates that can be used to obtain the “best” probability of being tall. PP90
Generate Rules-Algorithms-3-Con' 1. Generate rule R on training data S; 2. Remove the training data covered by rule R; 3. Repeat the process until no instances of the target class remain. PP90
Generate Rules-Example-4-Con' Sequential Covering (i) Original data (ii) Step 1: r = NULL (iii) Step 2: R1 found, r = R1 (iv) Step 3: R1 and R2 found, r = R1 U R2 (v) Step 4: R1, R2 and R3 found, r = R1 U R2 U R3 [Figure label: Wrong Class]
1R Algorithm/ Learn One Rule-Con’ A simple and cheap method: it generates only a one-level decision tree and classifies an object on the basis of a single attribute. Idea: rules are constructed to test a single attribute and branch for every value of that attribute. For each branch, the class with the best classification is the one occurring most often in the training data. PP91
1R Algorithm/ Learn One Rule-Con’ Steps: PP91
1. Construct rules that test a single attribute and branch for every value of that attribute.
2. For each branch, the class with the best classification is the one occurring most often.
3. Take the class with the largest count as the rule for that branch.
4. Evaluate the error rate of each attribute's rules.
5. Choose the attribute with the minimum error rate.
Example counts for attribute Gender (S = short, M = medium, T = tall):
Gender  S   M   T    Rule    Error
F       2   5   1    F->M    3
M       1   4   10   M->T    5
Total Error for Gender = 8. The other attributes are evaluated the same way (A2: Total Error = 3; An: Total Error = ..).
1R Algorithm
Input:
  D   // Training data
  T   // Attributes to consider for rules
  C   // Classes
Output:
  R   // Rules
Algorithm:
  R = Φ;
  for all A in T do
    R_A = Φ;
    for all possible values, v, of A do
      for all C_j ∈ C do
        find count(C_j);
      end for
      let C_m be the class with the largest count;
      R_A = R_A ∪ ((A = v) -> (class = C_m));
    end for
    ERR_A = number of tuples incorrectly classified by R_A;
  end for
  R = R_A where ERR_A is minimum
Example: D = training data, T = {Gender, Height}, C = {{F, M}, {0, ∞}} (C1, C2)
Gender  Short  Medium  Tall  Rule
F       3      6       0     R1 = F->medium
M       1      2       3     R2 = M->tall
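The pseudocode above translates almost line for line into Python. In the sketch below the function name `one_r` and the helper `rows_of` are hypothetical, and the 15 training rows are constructed only so that their gender and height-interval counts match the counts quoted in this example; how gender and height combine within a single row is an assumption.

```python
from collections import Counter, defaultdict


def one_r(rows, attributes, target):
    """Return (attribute, rules, errors) for the attribute with the fewest errors."""
    best = None
    for attr in attributes:
        counts = defaultdict(Counter)             # value -> class counts
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # One rule per value: predict the most frequent class for that value.
        rules = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        # Tuples not in the majority class of their branch are misclassified.
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best


def rows_of(gender, height, cls, n):
    """Build n identical rows (hypothetical helper for the toy data)."""
    return [{"gender": gender, "height": height, "class": cls} for _ in range(n)]


# Assumed joint distribution; only the marginal counts come from the example.
rows = (
    rows_of("F", "(0, 1.6]", "short", 2)
    + rows_of("F", "(1.6, 1.7]", "short", 1)
    + rows_of("M", "(1.6, 1.7]", "short", 1)
    + rows_of("F", "(1.7, 1.8]", "medium", 3)
    + rows_of("F", "(1.8, 1.9]", "medium", 3)
    + rows_of("M", "(1.8, 1.9]", "medium", 1)
    + rows_of("M", "(1.9, 2.0]", "medium", 1)
    + rows_of("M", "(1.9, 2.0]", "tall", 1)
    + rows_of("M", "(2.0, inf)", "tall", 2)
)

attr, rules, errors = one_r(rows, ["gender", "height"], "class")
print(attr, errors, len(rows))
```

Restricting the attribute list to `["gender"]` gives the gender-only rule set, which makes it easy to compare the two total error counts.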
Example 5 - 1R-3-Con’
Option  Attribute          Rules                   Error  Total Error
1       Gender             F->medium               3/9    6/15
                           M->tall                 3/6
2       Height (Step=0.1)  (0, 1.6]->short         0/2    1/15
                           (1.6, 1.7]->short       0/2
                           (1.7, 1.8]->medium      0/3
                           (1.8, 1.9]->medium      0/4
                           (1.9, 2.0]->medium      1/2
                           (2.0, ∞)->tall          0/2
…       ...                …
Result: rules based on height.
Example 6 -1R PP92-93
Option  Attribute    Rules           Error  Total Error
1       outlook      Sunny->no       2/5    4/14
                     Overcast->yes   0/4
                     Rainy->yes      2/5
2       temperature  Hot->no         2/4    5/14
                     Mild->yes       2/6
                     Cool->yes       1/4
3       humidity     High->no        3/7    4/14
                     Normal->yes     1/7
4       windy        False->yes      2/8    5/14
                     True->no        3/6
Result: outlook and humidity tie at 4/14, so either rule set can be chosen: rules based on humidity (High->no, Normal->yes) OR rules based on outlook (Sunny->no, Overcast->yes, Rainy->yes).
PRISM Algorithm-Con’ PRISM generates rules for each class by looking at the training data and adding rules that completely describe all tuples in that class. It generates only correct ("perfect") rules: the accuracy of the rules PRISM constructs is 100% on the training data. The success of a rule is measured by p/t, where - p is the number of positive instances covered by the rule, - t is the total number of instances covered by the rule.
Example counts for attribute Gender (S = short, M = medium, T = tall), target class tall:
Gender  S   M   T    p    t
F       2   5   1    1    8
M       0   0   10   10   10
Gender=Male has p/t = 10/10, so R = (Gender = Male). The other attributes (A2 ... An) are evaluated the same way.
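The p/t measure for the gender example can be checked with a few lines of arithmetic. This is a small sketch using the class counts quoted on this slide; the function name `p_over_t` is hypothetical.

```python
# Class counts per gender value, as quoted in the example (target class: tall).
counts = {
    "F": {"short": 2, "medium": 5, "tall": 1},
    "M": {"short": 0, "medium": 0, "tall": 10},
}


def p_over_t(class_counts, cls):
    """Return (p, t): positives covered and total instances covered."""
    p = class_counts[cls]
    t = sum(class_counts.values())
    return p, t


scores = {value: p_over_t(c, "tall") for value, c in counts.items()}
perfect = [value for value, (p, t) in scores.items() if p == t]
print(scores, perfect)
```

Only the values whose ratio is exactly 100% qualify as PRISM rules, which is why Gender = Male is selected and Gender = Female is not.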
PRISM Algorithm Steps
Input: D // Training data, C // Classes
Output: R // Rules
1. For the target class, compute p/t for every (attribute = value) pair.
2. Find one or more (attribute = value) pairs with p/t = 100%.
3. Select these (attribute = value) pairs as rules.
4. Repeat steps 1-3 until no data of that class remain in D.
Example 8-Con’-which class may be tall? Compute the value p/t. Which one is 100%? PP94-95
Num  (Attribute, value)      p/t
1    Gender = F              0/9
2    Gender = M              3/6
3    Height ≤ 1.6            0/2
4    1.6 < Height ≤ 1.7      0/2
5    1.7 < Height ≤ 1.8      0/3
6    1.8 < Height ≤ 1.9      0/4
7    1.9 < Height ≤ 2.0      1/2
8    2.0 < Height            2/2
R1 = 2.0 < Height
The interval 1.9 < Height ≤ 2.0 is then split further: PP94-96
Num  (Attribute, value)      p/t
…    …                       …
     1.9 < Height ≤ 1.95     0/1
     1.95 < Height ≤ 2.0     1/1
R2 = 1.95 < Height ≤ 2.0
R = R1 U R2
Example 9-Con’-which days may play? Compute the value p/t. The predicate outlook=overcast correctly implies play=yes on all four rows. R1 = if outlook=overcast, then play=yes
Example 9-Con’ R2 = if humidity=normal and windy=false, then play=yes
Example 9-Con’ R3 =….. R = R1 U R2 U R3 U…
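The play example can be reproduced with a short PRISM-style sketch. Several things here are assumptions rather than the presenter's method: the rows are the widely used 14-row weather data set (its per-attribute counts agree with the 1R error table in Example 6), a rule is refined by adding one condition at a time until it covers only the target class, and ties in p/t are broken by the larger p and then by attribute order. The name `prism` is hypothetical.

```python
WEATHER = """\
sunny hot high false no
sunny hot high true no
overcast hot high false yes
rainy mild high false yes
rainy cool normal false yes
rainy cool normal true no
overcast cool normal true yes
sunny mild high false no
sunny cool normal false yes
rainy mild normal false yes
sunny mild normal true yes
overcast mild high true yes
overcast hot normal false yes
rainy mild high true no
"""
ATTRIBUTES = ["outlook", "temperature", "humidity", "windy"]
ROWS = [dict(zip(ATTRIBUTES + ["play"], line.split())) for line in WEATHER.splitlines()]


def prism(rows, attributes, target, cls):
    """Return a list of rules; each rule is a dict of attribute -> required value."""
    remaining, rules = list(rows), []
    while any(r[target] == cls for r in remaining):
        covered, conditions = remaining, {}
        # Add conditions until the rule covers only rows of the target class.
        while any(r[target] != cls for r in covered) and len(conditions) < len(attributes):
            best_key, best_test = None, None
            for attr in attributes:
                if attr in conditions:
                    continue
                for value in sorted({r[attr] for r in covered}):
                    subset = [r for r in covered if r[attr] == value]
                    p = sum(r[target] == cls for r in subset)
                    key = (p / len(subset), p)          # p/t first, then p
                    if best_key is None or key > best_key:
                        best_key, best_test = key, (attr, value)
            attr, value = best_test
            conditions[attr] = value
            covered = [r for r in covered if r[attr] == value]
        rules.append(conditions)
        remaining = [r for r in remaining if r not in covered]  # drop covered rows
    return rules


rules = prism(ROWS, ATTRIBUTES, "play", "yes")
for i, rule in enumerate(rules, start=1):
    tests = " and ".join(f"{a}={v}" for a, v in rule.items())
    print(f"R{i} = if {tests}, then play=yes")
```

Under these assumptions the first two rules printed are the R1 and R2 given above; the later rules depend on the tie-breaking choice and should not be read as the slide's R3.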
Application of Covering Algorithm Deriving classification rules for diagnosing illness, business planning, banking, and government. Machine learning. Text classification. Applying it to photos, however, is difficult. And so on.
Application on E-learning/M-learning Adaptive and personalized learning materials Virtual Group Classification Initial Learner’s information Classification of learning styles or some Provide adaptive and personalized materials Collect learning styles feedback Chapter 2 or 3 Similarity, Bayesian… Rule-based algorithm
Discussion
