Cluster Analysis of Subjects, Nonhierarchical Methods

Cluster analysis investigates whether a given set of data consists of relatively distinct groups of observations. The methods considered here partition the individuals into a specified number of groups and assign a numerical index to the partition. This index indicates how well the partition describes the data and therefore allows competing partitions to be compared.

The numerical indices most commonly used are derived from three matrices representing total dispersion, within-group dispersion and between-group dispersion. Several indices have been suggested; they differ in the implicit assumptions they make about the shape of any clusters present. Once an index has been selected, the partition that optimizes it is sought. It is usually impractical to examine every possible partition, so algorithms have been developed that search for the optimum value of the clustering criterion.
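As a minimal sketch of the quantities described above, the following Python code computes the total, within-group and between-group dispersion matrices for a given partition and evaluates two widely used indices, trace(W) and det(W). The function names, the toy data and the choice of these two indices are illustrative assumptions; the text above does not name specific criteria. The helper `stirling2` counts the partitions of n individuals into g non-empty groups, which suggests why exhaustive search is rarely feasible.

```python
import numpy as np

def dispersion_matrices(X, labels):
    """Return (T, W, B): total, within-group and between-group dispersion."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    grand_mean = X.mean(axis=0)
    centered = X - grand_mean
    T = centered.T @ centered              # total dispersion
    p = X.shape[1]
    W = np.zeros((p, p))                   # within-group dispersion
    B = np.zeros((p, p))                   # between-group dispersion
    for g in np.unique(labels):
        Xg = X[labels == g]
        mean_g = Xg.mean(axis=0)
        dev = Xg - mean_g
        W += dev.T @ dev
        diff = (mean_g - grand_mean)[:, None]
        B += len(Xg) * (diff @ diff.T)
    return T, W, B

def stirling2(n, g):
    """Number of ways to partition n individuals into g non-empty groups."""
    # Recurrence: S(n, g) = g * S(n-1, g) + S(n-1, g-1), with S(0, 0) = 1.
    table = [[0] * (g + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, min(i, g) + 1):
            table[i][j] = j * table[i - 1][j] + table[i - 1][j - 1]
    return table[n][g]

# Hypothetical toy data: two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
good = [0, 0, 0, 1, 1, 1]   # partition matching the visible groups
bad = [0, 1, 0, 1, 0, 1]    # partition mixing the two groups

T, W_good, B_good = dispersion_matrices(X, good)
_, W_bad, _ = dispersion_matrices(X, bad)

# Smaller values of either index indicate a tighter partition.
trace_good, trace_bad = np.trace(W_good), np.trace(W_bad)
det_good, det_bad = np.linalg.det(W_good), np.linalg.det(W_bad)

n_partitions = stirling2(10, 3)  # partitions of 10 individuals into 3 groups
```

In this sketch T equals W + B for any partition, so minimizing trace(W) is equivalent to maximizing trace(B). Even for 10 individuals and 3 groups there are already 9330 distinct partitions, and the count grows very rapidly with the number of individuals.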



Encyclopedia of Biostatistics
