The Effect of Class Imbalance, Complexity, Size, and Learning Distribution on Classifier Performance

Publication Date

2011

Document Type

Article

Issue

3-4

Abstract

Classes of real-world datasets have various properties (such as imbalance, size, complexity, and class distribution) that make the classification task more difficult. We investigate the robustness of six classification techniques on data exhibiting various combinations of these properties. One artificial domain and six real-world datasets are used in the experiments. The results of our analysis indicate that frequency-based classifiers (such as the fuzzy and Bayes classifiers) are more robust across varying imbalance, size, complexity, and training distribution. © 2011 Inderscience Enterprises Ltd.

Keywords

Classification, Fuzzy sets, Imbalance data, Learning distribution
