Torsten Hoefler

ACM Prize in Computing

Germany - 2024

citation

For fundamental contributions to high-performance computing and the ongoing AI revolution

Professor Hoefler's contributions to scalable network design in supercomputers have revolutionized the capabilities of these large systems, which now scale to hundreds of thousands of nodes. His expertise in interconnection networks, including his early work on InfiniBand optimizations as well as his major role in the evolution of the Message Passing Interface (MPI), broke new ground in facilitating the use of large-scale, massively parallel clusters. Moreover, his numerous innovations in novel network topologies, routing, congestion avoidance, and performance (including key contributions to Slim Fly, PERCS, and HammingMesh) have not only pushed the boundaries of network design but have also translated into dramatic improvements in supercomputer performance and scalability, and have been widely adopted by the largest machines today.

Hoefler is a main contributor to the MPI-3 specification, for which he chaired both the "Process Topologies" and "Collective Operations" working groups. His nonblocking collective operations, such as Iallreduce, Iallgather, and Ibcast, and their counterparts in various collective communication libraries now power the core of distributed deep learning. Furthermore, low-level network routing protocols that he and his team developed for InfiniBand power thousands of AI and HPC supercomputers. These contributions form central pieces of modern high-performance AI systems that are used to train large language models. Hoefler's work benefits hundreds of thousands of AI and HPC programmers and millions of people who profit from the resulting technology and societal changes.

Hoefler not only developed many of the core capabilities of modern supercomputers but also defined key aspects of the algorithms for distributing AI models on them. He was among the first to discover and popularize the now well-known notion of "3D parallelism", which drives infrastructure design across the AI industry. He and his collaborators went on to develop many innovative techniques for efficient pipelining, sparse communication, model sparsity, and quantization. Collectively, those algorithmic contributions enable a cumulative 10x to 1,000x acceleration of deep learning workloads today.

Equally noteworthy is Professor Hoefler's dedication to benchmarking and reproducibility. His pioneering work in establishing best practices in both areas has set the gold standard for rigor and transparency in HPC research.

ACM Fellows

Switzerland - 2022

citation

For foundational contributions to High Performance Computing and the application of HPC techniques to machine learning

ACM Senior Member

Switzerland - 2020

ACM Gordon Bell Prize

Switzerland - 2019

Sustained Application Performance / Novelty of Programming Approach

citation

For A Data-Centric Approach to Extreme-Scale Ab initio Dissipative Quantum Transport Simulations
