Volume 6, Issue 1 pp. 462-471
Original Article with Visuanimation

Bump hunting by topological data analysis

Max Sommerfeld

Felix Bernstein Institute for Mathematical Statistics in the Biosciences, University of Göttingen, Göttingen 37077, Germany

Giseon Heo

School of Dentistry, University of Alberta, Edmonton, Alberta T6G 2R7, Canada

Peter Kim

Department of Mathematics and Statistics, University of Guelph, Guelph, Ontario N1G 2W1, Canada

Stephen T. Rush

School of Medical Sciences, Örebro Universitet, Örebro SE-701 82, Sweden

J. S. Marron

Corresponding Author

Department of Statistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA

Email: [email protected]

First published: 27 November 2017
Citations: 5

Abstract

A topological data analysis approach is taken to the challenging problem of finding and validating the statistical significance of local modes in a data set. As with the SIgnificance of the ZERo (SiZer) approach to this problem, statistical inference is performed in a multi-scale way, that is, across bandwidths. The key contribution is a two-parameter approach to the persistent homology representation. For each kernel bandwidth, a sub-level set filtration of the resulting kernel density estimate is computed. Inference based on the resulting persistence diagram indicates statistical significance of modes. It is seen through a simulated example, and by analysis of the famous Hidalgo stamps data, that the new method has more statistical power for finding bumps than SiZer. Copyright © 2017 John Wiley & Sons, Ltd.
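
The per-bandwidth step described in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation. It assumes a Gaussian kernel evaluated on a regular grid, and it tracks modes through the super-level sets of the density estimate, which is equivalent to a sub-level set filtration of the negated estimate. The function names `gaussian_kde_on_grid` and `mode_persistence`, the simulated sample, and the persistence cutoff of 0.02 are all hypothetical choices made for illustration; the cutoff in particular is not the statistical inference procedure of the paper.

```python
import numpy as np


def gaussian_kde_on_grid(data, grid, h):
    """Gaussian kernel density estimate with bandwidth h, evaluated on grid."""
    z = (grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))


def mode_persistence(f):
    """0-dimensional persistence pairs (birth, death) of the modes of f.

    Grid points are added from highest to lowest value. A new component is
    born at each local maximum; when two components meet, the one with the
    lower peak dies at the current height (elder rule).
    """
    n = len(f)
    parent = [-1] * n   # -1 marks grid points not yet added
    birth = {}          # root index -> height of that component's peak
    pairs = []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in np.argsort(-f, kind="stable"):
        i = int(i)
        parent[i] = i
        birth[i] = f[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                young, old = (ri, rj) if birth[ri] < birth[rj] else (rj, ri)
                if birth[young] > f[i]:  # skip zero-persistence pairs
                    pairs.append((birth[young], f[i]))
                parent[young] = old

    # The global maximum never merges into anything; assign it death 0.
    pairs.append((f.max(), 0.0))
    return sorted(pairs, key=lambda p: p[0] - p[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical bimodal sample, used only to exercise the functions.
    rng = np.random.default_rng(0)
    data = np.concatenate([rng.normal(-2.0, 0.5, 150), rng.normal(2.0, 0.5, 150)])
    grid = np.linspace(-5.0, 5.0, 501)

    # Second parameter of the two-parameter view: the kernel bandwidth.
    for h in (0.05, 0.2, 0.8, 3.0):
        pairs = mode_persistence(gaussian_kde_on_grid(data, grid, h))
        n_prominent = sum(1 for b, d in pairs if b - d > 0.02)  # arbitrary cutoff
        print(f"h={h}: {len(pairs)} modes, {n_prominent} with persistence > 0.02")
```

Each bandwidth yields one persistence diagram, so scanning over bandwidths gives the multi-scale, two-parameter view the abstract refers to. Deciding which persistence values are statistically significant requires the inference step of the paper, which this sketch does not attempt.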