Volume 32, Issue 3 e4467
SPECIAL ISSUE PAPER

Optimizing point-to-point communication between adaptive MPI endpoints in shared memory

Sam White (Corresponding Author)

Department of Computer Science, University of Illinois at Urbana-Champaign, IL 61801-2302, USA

Email: [email protected]
Laxmikant V. Kale

Department of Computer Science, University of Illinois at Urbana-Champaign, IL 61801-2302, USA
First published: 12 March 2018
Citations: 7

Summary

Adaptive MPI (AMPI) is an implementation of the MPI standard that virtualizes ranks as user-level threads rather than OS processes. In this work, we optimize AMPI's communication performance based on the locality of the communicating endpoints within a cluster of SMP nodes. We distinguish point-to-point messages whose endpoints are co-located on the same execution unit from those whose endpoints reside in the same process but on different execution units. We demonstrate how the messaging semantics of Charm++ both enable and constrain AMPI's implementation, and we motivate extensions to Charm++ that address these limitations. Using the OSU micro-benchmark suite, we show that our locality-aware design offers lower latency, higher bandwidth, and a reduced memory footprint for applications.
