Compression-Based Induction and Genome Data

R Hewett, JH Leuchner, CM Teng, SD Mooney, TE Klein
FLAIRS, 2002 - cdn.aaai.org
Abstract
Our previous work developed SORCER, a learning system that induces a set of rules from a data set represented as a second-order decision table. Second-order decision tables are database relations in which rows have sets of atomic values as components. Using sets of values, which are interpreted as disjunctions, provides compact representations that facilitate efficient management and enhance comprehensibility. SORCER generates classifiers with a near-minimum number of rows. The induction algorithm can be viewed as a table compression technique in which a table of training data is transformed into a second-order table with fewer rows by merging rows in ways that preserve consistency with the training data. In this paper we propose three new mechanisms in SORCER: (1) compression by removal of table columns, (2) inclusion of simple rules based on statistics, and (3) a method for partitioning continuous data into discrete clusters. We apply our approach to classify clinical phenotypes of a genetic collagenous disorder, Osteogenesis imperfecta, using a data set of point mutations in the COLIA1 gene. Preliminary results show that, on average over ten 10-fold cross-validations, SORCER obtained an error estimate of 16.7%, compared to 35.1% obtained from the decision tree learner C4.5.
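The row-merging idea in the abstract can be illustrated with a minimal sketch. This is not SORCER's actual procedure, which the abstract does not specify. It assumes a simple greedy pairwise merge, treats "consistency" as "a merged row covers no training example of a different class", and uses hypothetical helper names (covers, merge, consistent, compress) and a made-up toy table.

```python
# Illustrative sketch only: a greedy row merge over a second-order table.
# All names and the merge order are assumptions, not taken from the paper.

def covers(row_sets, example):
    """True if each attribute value of `example` lies in the row's value set."""
    return all(v in s for v, s in zip(example, row_sets))

def merge(a, b):
    """Union two rows attribute by attribute (sets read as disjunctions)."""
    return tuple(x | y for x, y in zip(a, b))

def consistent(row_sets, label, training):
    """A row is consistent if it covers no training example of another class."""
    return not any(covers(row_sets, x) and y != label for x, y in training)

def compress(training):
    """Merge same-class rows pairwise while consistency is preserved."""
    rows = []
    for x, y in training:
        # Start from a first-order table: one singleton set per attribute.
        r = (tuple(frozenset([v]) for v in x), y)
        if r not in rows:
            rows.append(r)
    merged = True
    while merged:
        merged = False
        for i in range(len(rows)):
            for j in range(i + 1, len(rows)):
                (a, la), (b, lb) = rows[i], rows[j]
                if la != lb:
                    continue
                c = merge(a, b)
                if consistent(c, la, training):
                    rows[i] = (c, la)
                    del rows[j]
                    merged = True
                    break
            if merged:
                break  # restart the scan after every successful merge
    return rows

# Hypothetical toy data: two attributes, two classes.
toy = [
    (("a", "x"), "pos"),
    (("a", "y"), "pos"),
    (("b", "x"), "neg"),
    (("b", "y"), "neg"),
]
print(len(compress(toy)))  # 2
```

On this toy table the four training rows collapse into two second-order rows, one per class, each with the value set {x, y} in the second column. A greedy merge like this gives no guarantee of reaching the minimum number of rows; it only shows how set-valued cells let rows be combined without contradicting the training data.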