Ramakrishnan Srikant

ACM Grace Murray Hopper Award

USA - 2002

citation

For his seminal work on mining association rules, which has led to association rules becoming a key data mining tool as well as part of the core syllabus in database and data mining courses.

An association rule corresponds to a set of items that occur together; the goal is to automatically discover all significant associations present in a dataset. However, for a dataset of a million items there are on the order of a trillion possible two-item associations, and the number of combinations grows exponentially for longer associations. Dr. Srikant identified novel pruning techniques and data structures that made discovery of association rules practical for real-life datasets. He generalized association rules along three orthogonal dimensions: discovering associations across different levels of a hierarchy over the items, sequential patterns, and quantitative attributes. He invented pruning techniques and data structures for each of the dimensions that kept the execution times practical, and showed how to push constraints over the set of items in the discovered associations into the mining algorithms.
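
As a rough illustration of the kind of pruning described above, the sketch below uses the downward-closure idea commonly associated with Apriori: an itemset can only be frequent if every one of its subsets is frequent, so most candidates are discarded before they are ever counted. The function name `frequent_itemsets`, the toy baskets, and the `min_support` threshold are hypothetical choices for this example, and the sketch makes no attempt to reproduce the data structures from Dr. Srikant's papers. The last line counts the unordered item pairs over a million items.

```python
from itertools import combinations
from math import comb


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Join frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune: every (k-1)-subset of the candidate must be frequent.
            if all(frozenset(sub) in current for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count only the candidates that survived pruning.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {s: c for s, c in counts.items() if c >= min_support}
        result.update(current)
        k += 1
    return result


# Hypothetical market baskets, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = frequent_itemsets(baskets, min_support=3)

# Unordered two-item combinations over a million distinct items.
pair_count = comb(1_000_000, 2)
```

In this toy run, the three-item candidates containing both bread and beer, or both milk and beer, are never counted, because those pairs already fell below the threshold. That is the saving that matters as the item universe grows.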

Dr. Srikant cleverly resolved the conflict between data mining and privacy by developing techniques for "privacy preserving data mining" that exploit the difference between the level that concerns privacy advocates (individual data) and the level at which data mining algorithms run (aggregated data). User data is randomized to prevent recovery of anything meaningful at the individual level, while still allowing recovery of aggregate information to build mining models.
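
The individual-versus-aggregate distinction can be sketched with a deliberately simplified stand-in: each user adds zero-mean uniform noise to a numeric value before reporting it, and the collector recovers only the population mean. This is an assumption-laden toy, not the reconstruction procedure from Dr. Srikant's work, which recovers more than a mean. The ages, the noise spread, and the `randomize` helper are all hypothetical.

```python
import random


def randomize(value, spread, rng):
    """Return value plus zero-mean uniform noise drawn from [-spread, +spread]."""
    return value + rng.uniform(-spread, spread)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical private values held by individual users.
true_ages = [rng.randint(18, 80) for _ in range(100_000)]

# Each user reports only a perturbed value.
reported = [randomize(age, spread=50, rng=rng) for age in true_ages]

true_mean = sum(true_ages) / len(true_ages)
# The noise has mean zero, so it averages out across many users.
estimated_mean = sum(reported) / len(reported)

# Largest distortion applied to any single reported value.
worst_individual_error = max(abs(r - a) for r, a in zip(reported, true_ages))
```

A single reported value can be off by tens of years and so says little about the person behind it, while the mean over many users stays close to the true mean.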

The number of citations of Dr. Srikant's work is evidence of his tremendous impact in the research community. His VLDB '94 paper, describing the Apriori algorithm for mining association rules, has over 500 citations.

The commercial impact of Dr. Srikant's work is equally impressive. He was a key architect for IBM Intelligent Miner, which has been widely recognized as a most technically sophisticated data mining product.