Chapter 13

Techniques for Fast Computation

David Porter

Minnesota Supercomputing Institute (MSI), University of Minnesota, Minneapolis, MN, USA

First published: 27 April 2018

Summary

This chapter focuses on techniques for achieving both good computational performance and good parallel scaling. It reviews issues that prevent good scaling and techniques for overcoming them. Strategies for improving computational performance and reducing time to solution are described and illustrated with working examples. The chapter details the hardware of the processors, nodes, and parallel compute cluster used for benchmarks. Techniques covered include vectorization, cache reuse, thread-level parallelism, and distributed-memory parallel computation. The distributed-memory examples show how a cluster of compute nodes can work together on a single problem to reduce compute time by large factors. The chapter summarizes the results and principles for good performance and scaling, characterizes the performance gained by these techniques in terms of reduced time to solution, and points to examples of how these methods can be applied to a variety of modeling problems.
