Fellow
Poland – 2015
CITATION

For contributions to high-dimensional geometric computing, streaming/sketching algorithms, and the Sparse Fourier Transform.

CITATION

With Andrei Broder and Moses S Charikar, for their groundbreaking work on Locality-Sensitive Hashing that has had great impact in many fields of computer science including computer vision, databases, information retrieval, machine learning, and signal processing.


Hashing is one of the fundamental techniques in computer science: generating challenging theoretical questions as well as finding application in almost all fields of computer science. In many of these applications a refined form of hashing is required, where keys that are close in the original space are more likely to be mapped to the same hash value. This is the problem that locality-sensitive hashing addresses. As a result similarity search problems can be solved in the hashed space instead of the original space, leading often to significant improvements either in efficiency or quality.

Motivated by the need to efficiently compute near-duplicate web pages, Andrei Broder introduced specific locality-sensitive hash functions, called minhash functions, and used them to estimate the similarity of sets. Piotr Indyk and the late Rajeev Motwani extended locality-sensitive hash functions to a wider range of distance functions and showed how to use them to design approximate nearest neighbor algorithms with sub-linear query time. Moses Charikar subsequently introduced a highly efficient family of locality-sensitive hash functions, called simhash functions, for angular distances as well as a new variant of the nearest neighbor search algorithm.

Locality-sensitive hashing techniques have been widely used in many fields of computer science including computer vision, databases, information retrieval, data mining, machine learning, and signal processing. Furthermore, they have inspired the development of other hashing-based methods for similarity search such as parameter sensitive hashing, semantic hashing, spectral hashing, and kernelized LSH.

The work has also enabled other significant theoretical developments. In particular, it has led to the design of nearest neighbor algorithms for more complex metric spaces that play an important role in various applications. It has also resulted in new discoveries in, and connections to, other areas such as vector quantization, communication complexity, and streaming algorithms.

Press Release

Andrei Broder, Moses Charikar, Piotr Indyk named recipients of the 2012 Paris Kanellakis Theory and Practice Award 

 

Broder, Charikar, and Indyk were recognized for their work on algorithms that allow for quickly finding similar entries in large databases, known as locality-sensitive hashing (LSH), These algorithms can drastically reduce the computational time needed for retrieving similar items, at the cost of a small probability of failing to find the absolute closest match.  LSH has impacted fields as diverse as computer vision, databases, information retrieval, data mining, machine learning, and signal processing.  
 
Andrei Broder introduced specific locality-sensitive min-hash functions, used to estimate the similarity of data sets and identify near-duplicate documents.  He is a Google Distinguished Scientist.  
 
Piotr Indyk, with the late Rajeev Motwani, extended LSH functions to a wider range of distance functions, and applied them to design efficient approximate nearest neighbor algorithms. Indyk is a professor at MIT's Computer Science and Artificial Intelligence Lab.  
 
Moses Charikar introduced sim-hash functions for angular distances. He is a professor of Computer Science at Princeton University.