Data Mining, Software Packages for
Abstract
The term data mining refers to the identification, within a typically large database, of new, valid, and interesting patterns. While data mining has become most popular in contexts such as database marketing, most of the methods under the data mining umbrella have also been widely applied in biostatistics. We describe the main applications of data mining that have recently arisen in biostatistics and introduce the reader to some of the available data mining software packages, with reference to biostatistical needs.