Redundant Feature Elimination for Multi-Class Problems
Annalisa Appice - University of Bari
Michelangelo Ceci - University of Bari
Simon Rawles - University of Bristol
Peter Flach - University of Bristol
We consider the problem of eliminating redundant Boolean features for a given data set, where a feature is redundant if it separates the classes less well than another feature or set of features. Lavrac et al. proposed the algorithm REDUCE, which works by pairwise comparison of features, i.e., it eliminates a feature if it is redundant with respect to another feature. Their algorithm operates in an ILP setting and is restricted to two-class problems. In this paper we improve their method and extend it to multiple classes. Central to our approach is the notion of a neighbourhood of examples: a set of examples of the same class where the number of different features between examples is relatively small. Redundant features are eliminated by applying a revised version of the REDUCE method to each pair of neighbourhoods of different class. We analyse the performance of our method on a range of data sets.
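
The pairwise comparison described above can be illustrated for the two-class case with a minimal sketch. It is not the authors' implementation. The coverage test is an assumption: feature f is taken to cover feature g when f is true on every positive example where g is true, and false on every negative example where g is false. The names `covers` and `eliminate_redundant` are hypothetical, and the neighbourhood-based multi-class extension is not shown.

```python
from typing import List, Sequence


def covers(f: Sequence[bool], g: Sequence[bool], labels: Sequence[bool]) -> bool:
    """Return True if feature f covers feature g (assumed criterion).

    f, g and labels are aligned per example; labels[i] is True for a
    positive example and False for a negative one.
    """
    for f_val, g_val, positive in zip(f, g, labels):
        # g is true on a positive example but f is not: f does not cover g.
        if positive and g_val and not f_val:
            return False
        # g is false on a negative example but f is not: f does not cover g.
        if not positive and not g_val and f_val:
            return False
    return True


def eliminate_redundant(features: List[Sequence[bool]],
                        labels: Sequence[bool]) -> List[int]:
    """Return the indices of the features kept after pairwise elimination.

    A feature is dropped as soon as some other still-kept feature covers
    it, so of two identical features exactly one survives.
    """
    kept = list(range(len(features)))
    for i in range(len(features)):
        if any(j != i and covers(features[j], features[i], labels) for j in kept):
            kept.remove(i)
    return kept


if __name__ == "__main__":
    labels = [True, True, False, False]
    f0 = [True, True, False, False]   # separates the two classes perfectly
    f1 = [True, False, False, True]   # covered by f0 under the assumed test
    f2 = [False, True, True, False]   # covered by f0 under the assumed test
    print(eliminate_redundant([f0, f1, f2], labels))
```

Under this assumed criterion, coverage is transitive, so removing a feature never leaves a previously eliminated feature without a surviving feature that covers it.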