Volume 29, Issue 7 e4094
Special Issue Paper

Using adaptive runtime filtering to support an event-based performance analysis

Jonas Stolle

Corresponding Author

Center for Information Services and High Performance Computing (ZIH), Technische Universität Dresden, Dresden, 01062 Germany

Correspondence to: Jonas Stolle, Center for Information Services and High Performance Computing (ZIH), Technische Universität Dresden, Dresden 01062, Germany.

E-mail: [email protected]

Michael Wagner

Center for Information Services and High Performance Computing (ZIH), Technische Universität Dresden, Dresden, 01062 Germany

Barcelona Supercomputing Center (BSC), Barcelona, 08034 Spain

Jens Doleschal

Center for Information Services and High Performance Computing (ZIH), Technische Universität Dresden, Dresden, 01062 Germany

Felix Schmitt

Center for Information Services and High Performance Computing (ZIH), Technische Universität Dresden, Dresden, 01062 Germany

NVIDIA, Santa Clara, 95050 CA, USA

Holger Brunst

Center for Information Services and High Performance Computing (ZIH), Technische Universität Dresden, Dresden, 01062 Germany

First published: 24 February 2017
Citations: 1

Summary

Event-based performance monitoring and analysis are effective means of tuning parallel applications for optimal resource usage. In this article, we address the data capacity challenge that arises when the tracing methodology is applied to large-scale parallel applications and long execution times. Existing approaches use static, pre-defined event filters to reduce the performance data to a manageable size. In contrast, we propose self-guided filters that automatically adapt to an application's runtime behaviour and therefore do not require any prior knowledge or previous application executions. Our contribution consists of four adaptive runtime filters, each of which targets a specific type of data redundancy. The filters focus on detecting identical events in loop iterations, constant events with no variation in time, and very short, highly frequent, and typically not very meaningful events that have a severe impact on the total data volume. We evaluate our prototype implementation with five real-world applications and achieve a data reduction of two orders of magnitude while increasing execution time by less than 1%. We also show that the qualitative impact of our filters on performance analysis in state-of-the-art analysis tools can be reduced by adding feedback methods and statistical information to the filtered traces. Copyright © 2017 John Wiley & Sons, Ltd.
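
To make the idea of a self-guided filter for very short, highly frequent events more concrete, the following is a minimal Python sketch. It is not the authors' prototype: the class name `ShortRegionFilter`, the thresholds `min_calls` and `max_mean_duration`, and the decision rule (mean duration over all calls seen so far) are assumptions chosen purely for illustration. The filter observes each region at runtime, starts dropping its events once the region has proven to be short and frequent, and keeps per-region counters so that statistical information about the dropped events remains available.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RegionStats:
    """Per-region statistics that survive filtering (hypothetical layout)."""
    calls: int = 0
    total_time: float = 0.0
    filtered: bool = False
    dropped: int = 0


class ShortRegionFilter:
    """Illustrative adaptive filter: drops events of regions that turn out
    to be short and frequent, without any prior knowledge of the application."""

    def __init__(self, min_calls=100, max_mean_duration=1e-6):
        # Both thresholds are assumptions, not values from the article.
        self.min_calls = min_calls
        self.max_mean_duration = max_mean_duration
        self.stats = defaultdict(RegionStats)

    def record(self, region, duration):
        """Return True if the event should be written to the trace."""
        s = self.stats[region]
        s.calls += 1
        s.total_time += duration
        if s.filtered:
            s.dropped += 1  # keep a count so the trace can report what was removed
            return False
        mean = s.total_time / s.calls
        if s.calls >= self.min_calls and mean <= self.max_mean_duration:
            s.filtered = True  # takes effect from the next call onward
        return True


def run_demo():
    """Feed a tiny synthetic event stream through the filter."""
    f = ShortRegionFilter(min_calls=3, max_mean_duration=1.0)
    events = [("tiny", 0.5)] * 5 + [("solver", 10.0)] * 5
    kept = [region for region, duration in events if f.record(region, duration)]
    return kept, f.stats["tiny"].dropped
```

In this sketch, `run_demo()` keeps the first three `tiny` events, drops the remaining two, and keeps all `solver` events, because their mean duration stays above the threshold. A real tracing library would apply such a decision inside its event-recording path and would likely write the retained counters into the trace as summary records.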
