Volume 42, Issue 8 e70093

ORIGINAL ARTICLE

The Performance of Distances Between Time Series: An In-Depth Comparison

Margarida G. M. S. Cardoso (Corresponding Author)

orcid.org/0000-0001-6239-7283

Instituto Universitário de Lisboa (ISCTE-IUL), Business Research Unit (BRU-IUL), Lisbon, Portugal

Correspondence:

Margarida G. M. S. Cardoso ([email protected])

Ana A. Martins

orcid.org/0000-0003-3733-6619

CIMA-Research Centre for Mathematics and Applications & CIMOSM–Centro de Investigação Em Modelação e Optimização de Sistemas Multifuncionais, Instituto Superior de Engenharia de Lisboa-ISEL, Instituto Politécnico de Lisboa, Lisbon, Portugal

First published: 24 June 2025

https://doi.org/10.1111/exsy.70093

Funding: This research was supported by Fundação para a Ciência e a Tecnologia, grant UIDB/00315/2020 (DOI: 10.54499/UIDB/00315/2020). It was also supported by Instituto Politécnico Lisboa (IPL) with reference IPL/IDI&CA2023/ELForcast2_ISEL and Fundação para a Ciência e a Tecnologia, Portugal, through the project UID/MAT/04674/2013, CIMA and ISEL.

ABSTRACT

The performance of distance measures between time series has been discussed in diverse studies. Most equate performance with the accuracy obtained when a specific distance is used in a 1-Nearest Neighbour (1-NN) classifier. Few studies have addressed the related computation time, and no systematic analysis of the associations between the distances' performance (1-NN-based accuracy and computation time) and the time series' characteristics has been presented yet. We propose to fill this research gap by analysing these relationships considering the following features: the training and test sets' dimensions, the time series' length, the number of classes, and the classes' separability as measured by the Average Silhouette index. This last characteristic was not mentioned in previous studies. A methodological approach is devised to compare nine distance measures, including three recently proposed combined distances (COMB and two variants). We resort to a stepwise method for multiple comparisons, accounting for the experiment-wise error rate, to obtain homogeneous groups of distances with indistinguishable performances. The CART algorithm is used to explore the relationships between the accuracy values corresponding to each distance measure under study (target) and the time series' characteristics (predictors). Our analyses are based on datasets from the UCR time series classification archive. We conclude that the combined distance (COMB), the dynamic time warping distance (DTW), and the complexity invariance distance (CID) are consistently included in the subset of best-performing distances in all experimental scenarios. Of these three, CID has a significantly lower computational cost. We find that the classes' separability is the time series' attribute most associated with the distances' performance.
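To make the notion of 1-NN-based accuracy concrete, the following minimal Python sketch scores a toy test set with three of the distances named above: Euclidean, DTW, and CID. It is an illustration under stated assumptions, not the authors' implementation: the toy series, the function names, the unconstrained DTW warping window with squared pointwise cost, and the guard for zero-complexity series are all choices made here for the example. The CID correction factor follows the commonly cited definition (the Euclidean distance multiplied by the ratio of the larger to the smaller complexity estimate).

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Unconstrained DTW with squared pointwise cost (assumed settings)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def complexity_estimate(a):
    """Length of the 'stretched' series: root of summed squared first differences."""
    return math.sqrt(sum((a[i] - a[i + 1]) ** 2 for i in range(len(a) - 1)))


def cid(a, b):
    """Complexity-invariant distance: Euclidean distance times a complexity factor."""
    ce_a, ce_b = complexity_estimate(a), complexity_estimate(b)
    lo, hi = min(ce_a, ce_b), max(ce_a, ce_b)
    # Guard for constant series is an assumption made for this sketch.
    factor = 1.0 if hi == 0.0 else hi / max(lo, 1e-12)
    return euclidean(a, b) * factor


def one_nn_accuracy(train_X, train_y, test_X, test_y, dist):
    """Share of test series whose nearest training series has the same label."""
    correct = 0
    for x, y in zip(test_X, test_y):
        nearest = min(range(len(train_X)), key=lambda i: dist(x, train_X[i]))
        correct += train_y[nearest] == y
    return correct / len(test_X)


# Hypothetical toy data: two classes, rising and falling series.
train_X = [[0, 1, 2, 3, 4], [0, 1, 2, 3, 5], [4, 3, 2, 1, 0], [5, 3, 2, 1, 0]]
train_y = ["up", "up", "down", "down"]
test_X = [[0, 0, 1, 2, 3], [4, 4, 3, 2, 1]]
test_y = ["up", "down"]

for name, dist in [("Euclidean", euclidean), ("DTW", dtw), ("CID", cid)]:
    print(name, one_nn_accuracy(train_X, train_y, test_X, test_y, dist))
```

The sketch also hints at the cost difference reported above: the DTW routine fills a table whose size grows with the product of the two series' lengths, whereas the Euclidean and CID routines make only single passes over the series.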

Open Research

Data Availability Statement

The datasets were drawn from the University of California, Riverside (UCR) Time Series Classification Archive.

Volume 42, Issue 8, August 2025, e70093
