Classifier Ensemble Methods
Abstract
Multiclassifier systems, the focus of this article, provide scientists and data professionals with powerful techniques for tackling complex datasets. The basic idea behind the multiclassifier approach is to average the decisions or hypotheses of a diverse group of classifiers in order to produce a better decision or hypothesis.
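The averaging idea can be made concrete with a minimal sketch. It assumes each classifier outputs one score per class and that the combined decision is the class with the highest mean score; the function names `average_rule` and `decide` and the example scores are illustrative assumptions, not taken from the article.

```python
from typing import List, Sequence


def average_rule(scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class scores across classifiers (one row per classifier)."""
    n_classifiers = len(scores)
    n_classes = len(scores[0])
    return [
        sum(row[c] for row in scores) / n_classifiers
        for c in range(n_classes)
    ]


def decide(scores: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest averaged score."""
    avg = average_rule(scores)
    return max(range(len(avg)), key=avg.__getitem__)


# Hypothetical outputs of three classifiers on one two-class sample.
example_scores = [
    [0.6, 0.4],  # classifier 1 leans toward class 0
    [0.4, 0.6],  # classifier 2 leans toward class 1
    [0.9, 0.1],  # classifier 3 is confident in class 0
]
combined = average_rule(example_scores)
winner = decide(example_scores)
```

Here the single dissenting classifier is outweighed by the other two, which is the sense in which a diverse group can yield a better decision than any one member.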
As an introduction to our subject, we begin with a detailed examination of the canonical single-classifier system, which provides the mathematical foundation for our presentation of multiclassifier systems. We then describe some important methods for constructing multiclassifier systems at each of four levels: the classifier level, the combination level, the data level, and the feature level.
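As a rough illustration of two of these levels, the sketch below perturbs the data level by bootstrap resampling (as in bagging) and the feature level by drawing a random feature subset (as in the random subspace method). These two techniques are chosen here only as familiar examples; the abstract does not say which methods the article covers, and the helper names and sizes are assumptions.

```python
import random
from typing import List


def bootstrap_indices(n_samples: int, rng: random.Random) -> List[int]:
    """Data level: draw n_samples training indices with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def random_subspace(n_features: int, k: int, rng: random.Random) -> List[int]:
    """Feature level: pick k distinct feature indices at random."""
    return sorted(rng.sample(range(n_features), k))


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical setup: 10 training samples, 8 features, 3 ensemble members.
# Each member would be trained on its own resampled rows and feature subset.
members = [
    {
        "rows": bootstrap_indices(10, rng),
        "features": random_subspace(8, 4, rng),
    }
    for _ in range(3)
]
```

Each member sees a different view of the training set, which is one common way to obtain the diversity that the combination step relies on.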
We end our overview with a section that provides guidance for experimentally constructing general-purpose (GP) multiclassifier systems.
Wiley Encyclopedia of Electrical and Electronics Engineering