Top Global Award for Young Technologists Goes to Researcher Who Advanced AI with High-Performance Computers

Torsten Hoefler Awarded ACM Prize in Computing for Facilitating Breakthroughs in Research and Industry

ACM named Torsten Hoefler, a Professor at ETH Zurich, the recipient of the 2024 ACM Prize in Computing for fundamental contributions to high-performance computing and the ongoing AI revolution. Hoefler developed many of the core capabilities of modern supercomputers and defined key aspects of the algorithms for distributing AI models on them.

The ACM Prize in Computing recognizes early-to-mid-career computer scientists whose research contributions have fundamental impact and broad implications. The award carries a prize of $250,000, from an endowment provided by Infosys Ltd, a global leader in next-generation digital services and consulting.

Overview
High-performance computing (HPC) plays a critical role in AI applications, which require a great deal of computing power. The work of Hoefler and his colleagues to scale network design and programming in supercomputers has revolutionized the capabilities of these large systems. For example, AI algorithms can now be processed on hundreds of thousands of nodes (computers or servers).

Hoefler’s advances in interconnection networks, programming, and parallel algorithms broke new ground in facilitating the use of large-scale massively parallel clusters. His numerous innovations across the whole supercomputer stack—including key contributions such as MPI-3 nonblocking collective operations, foundational parallelism strategies for AI models, and high-performance networking systems—have pushed the boundaries of parallel systems design and translated into dramatic improvements in supercomputer performance and scalability. Many of those innovations are incorporated into the largest and most powerful machines today.

Key Contributions

Message Passing Interface 3
Hoefler played a major role in the evolution of the Message Passing Interface (MPI), an informal industry standard for exchanging messages between numerous individual nodes throughout an HPC network. A messaging standard allows synchronization of the activities of each individual computer, sharing data between nodes, and direction and control of the entire parallel network. The MPI-3 standard, in which Hoefler played a leading role, was adopted in 2012 and made possible many of the critical advances in HPC for simulations and AI applications over the past several years.

For MPI-3, Hoefler chaired both the “Process Topologies” and “Collective Operations” working groups. His nonblocking collective operations such as Allreduce, Allgather, Bcast, and their respective blocking versions are included in various collective communication libraries—even beyond MPI-3. These operations power the core of distributed deep learning today.
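As a rough illustration of why nonblocking collectives matter, the sketch below simulates the pattern in plain Python: start an Allreduce, do independent work while it is in flight, then wait for the result. It is not MPI code. In MPI-3 itself the corresponding calls are `MPI_Iallreduce` and `MPI_Wait`; the helper names `allreduce_sum`, `iallreduce_sum`, and `demo` are hypothetical, and a background thread stands in for the network.

```python
from concurrent.futures import ThreadPoolExecutor


def allreduce_sum(per_rank_values):
    """Blocking Allreduce semantics: every rank receives the elementwise sum."""
    total = [sum(column) for column in zip(*per_rank_values)]
    # Each simulated rank gets its own copy of the reduced vector.
    return [list(total) for _ in per_rank_values]


def iallreduce_sum(executor, per_rank_values):
    """Nonblocking variant: start the reduction and return a handle immediately."""
    return executor.submit(allreduce_sum, per_rank_values)


def demo():
    # One gradient vector per simulated rank (illustrative values).
    gradients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    with ThreadPoolExecutor(max_workers=1) as pool:
        request = iallreduce_sum(pool, gradients)      # start the collective
        local_work = sum(i * i for i in range(1000))   # independent work overlaps with it
        reduced = request.result()                     # wait for completion
    return reduced, local_work


if __name__ == "__main__":
    reduced, _ = demo()
    print(reduced)
```

In distributed deep learning, the overlapped work is typically the backward pass for the next layer, while the gradients of the previous layer are being summed across nodes.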

3D Parallelism
Hoefler was among the first to discover and develop the now well-known notion of “3D parallelism,” which drives infrastructure design for the whole AI industry. Subsequently, he and his collaborators continued to develop many innovative techniques for efficient pipelining, sparse communication, model sparsity, and quantization. Together, this work has enabled a cumulative 10x to 1000x acceleration of AI workloads on modern computers.
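The announcement does not define 3D parallelism. A minimal sketch follows, assuming the common reading in which the three dimensions are data, pipeline, and tensor (operator) parallelism, and every processor holds one coordinate along each axis. The function names and the rank layout (tensor index varying fastest) are illustrative assumptions, not a description of any specific framework.

```python
from itertools import product


def rank_to_coords(rank, dp, pp, tp):
    """Map a flat rank to (data, pipeline, tensor) coordinates."""
    if not 0 <= rank < dp * pp * tp:
        raise ValueError("rank out of range")
    tensor = rank % tp                # fastest-varying axis (assumed layout)
    pipeline = (rank // tp) % pp
    data = rank // (tp * pp)
    return data, pipeline, tensor


def parallel_groups(dp, pp, tp):
    """List the ranks that communicate with each other along each axis."""

    def rank_of(d, p, t):
        return (d * pp + p) * tp + t

    return {
        # Ranks that split the same layer's tensors.
        "tensor": [[rank_of(d, p, t) for t in range(tp)]
                   for d, p in product(range(dp), range(pp))],
        # Ranks that hold consecutive stages of the same model replica.
        "pipeline": [[rank_of(d, p, t) for p in range(pp)]
                     for d, t in product(range(dp), range(tp))],
        # Ranks that hold the same model shard and average gradients.
        "data": [[rank_of(d, p, t) for d in range(dp)]
                 for p, t in product(range(pp), range(tp))],
    }


if __name__ == "__main__":
    print(rank_to_coords(5, dp=2, pp=2, tp=2))
    print(parallel_groups(2, 2, 2)["data"])
```

Under this layout, the total number of processors is the product of the three degrees, which is one reason the scheme bears on how networks for AI systems are dimensioned.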

Routing Protocols and Network Topologies
The low-level network routing protocols and network topologies that Hoefler and his colleagues developed for networks such as Myrinet and InfiniBand power thousands of AI and HPC supercomputers. These contributions form central pieces of modern high-performance AI systems that are used to train large language models such as ChatGPT.


About the ACM Prize in Computing

The ACM Prize in Computing recognizes an early- to mid-career fundamental innovative contribution in computing that, through its depth, impact and broad implications, exemplifies the greatest achievements in the discipline. The award carries a prize of $250,000. Financial support is provided by an endowment from Infosys Ltd. The ACM Prize in Computing was previously known as the ACM-Infosys Foundation Award in the Computing Sciences from 2007 through 2015. ACM Prize recipients are invited to participate in the Heidelberg Laureate Forum, an annual networking event that brings together young researchers from around the world with recipients of the ACM A.M. Turing Award, the Abel Prize, the Fields Medal, and the IMU Abacus Medal.

2024 ACM Prize in Computing Laureate

Torsten Hoefler is a Professor of Computer Science at ETH Zurich (the Swiss Federal Institute of Technology), where he serves as Director of the Scalable Parallel Computing Laboratory. He is also the Chief Architect for AI and Machine Learning at the Swiss National Supercomputing Centre (CSCS). Hoefler received a Diplom Informatik (Master of Computer Science) from Chemnitz University of Technology and a PhD in Computer Science from Indiana University.

Hoefler’s honors include the Max Planck-Humboldt Medal, an award for outstanding mid-career scientists; the IEEE CS Sidney Fernbach Award, which recognizes outstanding contributions in the application of high-performance computers; and the ACM Gordon Bell Prize, which recognizes outstanding achievement in high-performance computing. He is a member of the European Academy of Sciences (Academia Europaea), a Fellow of IEEE, and a Fellow of ACM.


Notable Papers by Torsten Hoefler

Torsten Hoefler has not only developed many of the core capabilities of modern supercomputers, but also defined key aspects of the algorithms for distributing AI models on them. His most important papers include:

LLAMP: Assessing Network Latency Tolerance of HPC Applications With Linear Programming
Initial Publication: SC '24 Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis
Note: Best Paper Award at SC24

HEAR: Homomorphically Encrypted Allreduce
Initial Publication: SC '23 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
Note: Best Student Paper Award at SC23 and SC23 Reproducibility Advancement Award

The Graph Database Interface: Scaling Online Transactional And Analytical Graph Workloads to Hundreds of Thousands Of Cores
Initial Publication: SC '23 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
Note: Best Paper Finalist

ProbGraph: High-Performance and High-Accuracy Graph Mining With Probabilistic Set Representations
Initial Publication: SC '22 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Note: Best Paper at SC22

HammingMesh: A Network Topology for Large-Scale Deep Learning
Initial Publication: SC '22 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Note: SC22 Reproducibility Advancement Award

A Data-Centric Approach to Extreme-Scale Ab Initio Dissipative Quantum Transport Simulations
Initial Publication: SC '19 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
Note: Gordon Bell Prize Winner

Selecting Technical Papers for an Interdisciplinary Conference: The PASC Review Process
Initial Publication: PASC '16 Proceedings of the Platform for Advanced Scientific Computing Conference

Remote Memory Access Programming in MPI-3
Initial Publication: ACM Transactions on Parallel Computing (TOPC), Volume 2, Issue 2

Enabling Highly-Scalable Remote Memory Access Programming With MPI-3 One Sided
Initial Publication: SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Note: SC13 Best Paper Award and Best Student Paper Finalist

Generic Topology Mapping Strategies for Large-Scale Parallel Architectures
Initial Publication: ICS '11 Proceedings of the International Conference on Supercomputing

Characterizing the Influence of System Noise On Large-Scale Applications by Simulation
Initial Publication: SC '10 Proceedings of the 2010 ACM/IEEE International Conference for High Performance

Implementation and Performance Analysis of Non-Blocking Collective Operations for MPI
Initial Publication: SC '07 Proceedings of the 2007 ACM/IEEE conference on Supercomputing