Volume 27, Issue 6 pp. 24-32
Article

Improving the performance of global communication on a three-dimensional torus network

Yasushi Kawakura

Nonmember

Real World Computing Partnership Toshiba Laboratory, Kawasaki, Japan 210

Yasushi Kawakura received his B.S. in Computer Science from Kyoto University in 1988. In 1990 he obtained his M.S. from Kyoto University and joined Toshiba. He is at the Research and Development Center of Toshiba. He has been working on application techniques for parallel computers and on research and development of programming environments. He is a member of the Information Processing Society.

Noboru Tanabe

Member

Real World Computing Partnership Toshiba Laboratory, Kawasaki, Japan 210

Noboru Tanabe received his B.S. in Electrical Engineering in 1985 and his M.S. in 1987, both from Yokohama Kokuritsu University. In 1987 he joined Toshiba, and he is currently at the Research and Development Center of Toshiba. He has been working on research and development of architectures for a computer dedicated to LU decomposition of sparse matrices, highly parallel AI machines, and multiparadigm massively parallel teraflops machines. He is a member of the Information Processing Society.

Shigeru Oyanagi

Member

Real World Computing Partnership Toshiba Laboratory, Kawasaki, Japan 210

Shigeru Oyanagi received his B.S. in Engineering Science in 1972 and his Ph.D. in 1977 from Kyoto University. In 1977 he joined Toshiba. He is currently a senior member of the second laboratory of the Information and Communication System Research Laboratory of Toshiba Research and Development Center. He has been working on research and development of parallel computer architectures, system software, and applications. He is a member of the Information Processing Society, IEEE, and ACM.

Abstract

A high-speed one-to-all broadcasting algorithm is proposed whose performance degrades little as the number of processors in a massively parallel computer increases. The network topology considered is a 3D torus. Two methods are discussed for a system that broadcasts by repeating one-to-one communications. The first uses paths with a smaller maximum transfer number to reduce the number of transfers; the second presets the hardware to reduce the overhead of each individual one-to-one communication. The methods are evaluated using a double-loop model consisting of an inner loop for local processing and an outer loop for global communication. With these methods, scalability improves, and a 4.2-fold speedup in program execution can be achieved on a 32K-processor system.
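
As a rough illustration of why path choice matters on a torus, the following minimal sketch computes the minimum hop count between two nodes of a 3D torus and the largest such hop count from a broadcast root. It is not the algorithm of the paper: the function names `torus_hops` and `max_hops_from`, the 8 x 8 x 8 shape, and the reading of "transfer number" as link hops are all assumptions made for this example.

```python
from itertools import product


def torus_hops(a, b, dims):
    """Minimum number of link hops between nodes a and b on a torus.

    Each dimension has wraparound links, so the distance along one axis
    is the shorter of the two ways around the ring.
    """
    return sum(min((x - y) % n, (y - x) % n) for x, y, n in zip(a, b, dims))


def max_hops_from(root, dims):
    """Largest minimum-hop distance from root to any node (brute force)."""
    nodes = product(*(range(n) for n in dims))
    return max(torus_hops(root, node, dims) for node in nodes)


# Hypothetical small torus; the abstract does not state the machine's shape.
dims = (8, 8, 8)
root = (0, 0, 0)
print(max_hops_from(root, dims))
```

With wraparound links, the farthest node along a ring of n nodes is n // 2 hops away, so the result for an n x n x n torus is 3 * (n // 2). Without the wraparound links the corresponding figure would be 3 * (n - 1), which is one way to see the benefit of choosing paths that exploit the torus structure.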