Classification, Overview
Abstract
Classification is concerned with (i) constructing classes and (ii) assigning objects to one of several predetermined classes. Classes may be constructed to balance measures of the external differences between classes, which should be large, against the internal differences within classes, which should be small, or so that subsequent assignment is optimal. The objective when assigning to classes may be to minimize errors of misclassification, to minimize costs, or both. Both (i) and (ii) exist in probabilistic and nonprobabilistic forms. The nature of the things being classified is crucial. Is each a single item representative of many identical items, or representative of a population of similar items characterized by a probability distribution? Are they initially totally unstructured, or are they already assigned, explicitly or implicitly, to some previous classification? Class construction includes the multivariate mixture problem (probabilistic), and partition problems and many methods for hierarchical classification (nonprobabilistic). Assignment problems include discriminant analysis (probabilistic), diagnostic keys and tables (nonprobabilistic), and expert systems for diagnosis (hybrid).
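The distinction between (i) and (ii) can be illustrated with a minimal nonprobabilistic sketch. It is not a method prescribed by this entry. It assumes one-dimensional numeric data, exactly two classes, and the within-class sum of squares as the measure of internal difference. The function names (within_ss, best_two_class_split, assign) and the sample values are hypothetical, chosen only for illustration.

```python
def mean(xs):
    return sum(xs) / len(xs)


def within_ss(groups):
    """Total within-class sum of squared deviations (internal differences)."""
    return sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)


def best_two_class_split(values):
    """(i) Class construction as a partition problem.

    Assumption: for 1-D data only splits of the sorted values into two
    contiguous groups are considered; the split with the smallest
    within-class sum of squares is kept.
    """
    ordered = sorted(values)
    splits = [(ordered[:k], ordered[k:]) for k in range(1, len(ordered))]
    return min(splits, key=within_ss)


def assign(x, groups):
    """(ii) Assignment: index of the predetermined class with the nearest mean."""
    means = [mean(g) for g in groups]
    return min(range(len(groups)), key=lambda i: abs(x - means[i]))


# Illustrative data only; not taken from any real study.
data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
classes = best_two_class_split(data)
print(classes)               # the two constructed classes
print(assign(1.5, classes))  # class index for a new object
print(assign(4.0, classes))
```

In this sketch the classes are built first, without any labels, and new objects are assigned afterwards to the classes already fixed. A probabilistic treatment would replace the sum-of-squares criterion with a mixture model for construction and a discriminant rule for assignment.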