Andrei Broder

Digital Library

ACM Paris Kanellakis Theory and Practice Award

USA - 2020

citation

For the discovery and analysis of balanced allocations, known as the power of two choices, and their extensive applications to practice

ACM has named Yossi Azar of Tel Aviv University, Andrei Broder of Google Research, Anna Karlin of the University of Washington, Michael Mitzenmacher of Harvard University, and Eli Upfal of Brown University recipients of the ACM Paris Kanellakis Theory and Practice award for their breakthrough work on the Balanced Allocations paradigm, also known as the power of two choices, an elegant theoretical work which has had profound and widespread practical impact.

When n balls are thrown into n bins chosen uniformly at random, it is known that with high probability, the maximum load on any bin is bounded by (lg n/lg lg n) (1+o(1)). Azar, Broder, Karlin, and Upfal (STOC 94) proved that adding a little bit of choice can work wonders. When throwing each ball, instead of choosing one bin at random, choose two bins at random, and then place the ball in the bin with the lesser load. This minor change brings on an exponential improvement; with high probability, the maximal load in any bin is bounded by (lg lg n/lg 2)+O(1). In the same work they extended the selection of bins from two to d, choosing the bin with the least load for the new ball, obtaining with high probability a bound of (ln ln n/ ln d)+O(1). These results were greatly extended by Mitzenmacher in his 1996 PhD dissertation: he removed the sequential setting in the 1994 work and developed a framework for using the power of two choices in queuing systems. These works led to a wide adoption of the power of two choices paradigm in the networking community, with applications in many areas in Computer Science.

Since bins and balls are the basic model for analyzing data structures such as hashing or processes like load balancing of jobs in servers, it is not surprising that the power of two choices has led to a wide range of practical applications. These include i-Google's web index which uses Balanced Allocations, Akamai's overlay routing network which applies an algorithm inspired by the power of two choices paradigm, and highly reliable distributed data storage systems used by Microsoft and Dropbox which are based on variants of the power of two paradigm. There are many other software systems that use the paradigm as an important ingredient.

The Balanced Allocations paper and the follow-up work on the power of two choices are elegant theoretical results, and their content had and will surely continue to have a demonstrable effect on the practice of computing.

Press Release

ACM Paris Kanellakis Theory and Practice Award

USA - 2012

citation

With Moses S Charikar and Piotr Indyk, for their groundbreaking work on Locality-Sensitive Hashing that has had great impact in many fields of computer science including computer vision, databases, information retrieval, machine learning, and signal processing.

Hashing is one of the fundamental techniques in computer science: generating challenging theoretical questions as well as finding application in almost all fields of computer science. In many of these applications a refined form of hashing is required, where keys that are close in the original space are more likely to be mapped to the same hash value. This is the problem that locality-sensitive hashing addresses. As a result similarity search problems can be solved in the hashed space instead of the original space, leading often to significant improvements either in efficiency or quality.

Motivated by the need to efficiently compute near-duplicate web pages, Andrei Broder introduced specific locality-sensitive hash functions, called minhash functions, and used them to estimate the similarity of sets. Piotr Indyk and the late Rajeev Motwani extended locality-sensitive hash functions to a wider range of distance functions and showed how to use them to design approximate nearest neighbor algorithms with sub-linear query time. Moses Charikar subsequently introduced a highly efficient family of locality-sensitive hash functions, called simhash functions, for angular distances as well as a new variant of the nearest neighbor search algorithm.

Locality-sensitive hashing techniques have been widely used in many fields of computer science including computer vision, databases, information retrieval, data mining, machine learning, and signal processing. Furthermore, they have inspired the development of other hashing-based methods for similarity search such as parameter sensitive hashing, semantic hashing, spectral hashing, and kernelized LSH.

The work has also enabled other significant theoretical developments. In particular, it has led to the design of nearest neighbor algorithms for more complex metric spaces that play an important role in various applications. It has also resulted in new discoveries in, and connections to, other areas such as vector quantization, communication complexity, and streaming algorithms.

ACM Fellows

USA - 2007

citation

For contributions to algorithms and web technology.