Margo Seltzer

ACM Athena Lecturer Award

Canada - 2023

citation

For foundational research in file and storage systems, pioneering research in data provenance, impactful software contributions in Berkeley DB, and tireless dedication to service and mentoring

Margo Seltzer is one of the most influential researchers and practitioners in systems, broadly construed. Her impact spans file and storage systems, the architecture and development of popular software systems such as Berkeley DB, and the capture and use of data provenance.

Seltzer was one of the original authors of Berkeley DB (with Bostic and Olson), which was an early and influential example of the NoSQL movement and pioneered the "dual-license" approach to software licensing. Its design embodies simplicity, quality, and elegance in a high-performance key-value store that was the de facto data store for web applications for many years. Seltzer's research in log-structured file systems is well known for its careful and nuanced evaluation of various approaches, and she adapted these ideas for use in UNIX file systems and in updates of file system metadata. Seltzer pioneered whole-system data provenance to enhance our ability to assess the quality of information by understanding where the data comes from, who is using the data, and how it was obtained. Her research demonstrated how provenance could be practically supported at the system level and then used to develop important applications in security and compliance. Her subsequent work focused on applications of provenance, including intrusion detection, attack attribution, computational reproducibility, and software engineering.

Seltzer is an outstanding educator, mentor, and leader in the community. She is currently the Cheriton Family Chair in Computer Science at the University of British Columbia and was previously the Herchel Smith Professor of Computer Science, a Harvard College Professor, Director of the Center for Research in Computation for Society, and Associate Dean for Computer Science and Engineering in the Harvard John A. Paulson School of Engineering and Applied Sciences. She has received several awards for excellence in teaching and leadership for her work broadening participation in computer science. She is deeply involved in mentoring, and several of her former students have become leaders in academia and industry. She has served as program chair for conferences in systems and databases, serves on numerous scientific and national advisory boards, and was a co-founder of Sleepycat Software. The breadth of Seltzer's influence is reflected by her election as a Fellow of the ACM, the National Academy of Engineering, and the American Academy of Arts and Sciences, and by her receipt of the ACM Software System Award, the ACM SIGMOD Systems Award, and the USENIX Lifetime Achievement Award.

ACM Software System Award

USA - 2020

The University of British Columbia

citation

For Berkeley DB, which was an early exemplar of the NoSQL movement and pioneered the "dual-license" approach to software licensing

Since 1991, Berkeley DB has been a pervasive force underlying the modern Internet: It is a part of nearly every POSIX or POSIX-like system, as well as the GNU standard C library (glibc) and many higher-level scripting languages. Berkeley DB was the transactional key/value store for a range of first- and second-generation Internet services, including account management, mail and identity servers, online trading platforms and many other software-as-a-service platforms.

As an open source package, Berkeley DB is an invaluable teaching tool, allowing students to see under the hood of a tool that they have grown familiar with by use. The code is clean, well structured, and well documented -- it had to be as it was meant to be consumed and used by an unlimited number of software developers.

As originally created by Seltzer, Olson and Bostic, Berkeley DB was distributed as part of the University of California's Fourth Berkeley Software Distribution. Seltzer and Bostic subsequently founded Sleepycat Software in 1996 to continue development of Berkeley DB and provide commercial support. Olson joined in 1997, and for ten years, Berkeley DB was the de facto data store for major web infrastructure. As the first production-quality, commercial key/value store, it helped launch the NoSQL movement; as the engine behind Amazon's Dynamo and the University of Michigan's SLAPD server, Berkeley DB helped move non-relational databases into the public eye.

Sleepycat Software pioneered the "dual-license" model of software licensing: use and redistribution in open source applications were always free, and companies could choose a commercial license for support or to distribute Berkeley DB as part of proprietary packages. This model led the way for a number of other open source companies and has been widely adopted in open source communities. The open source Berkeley DB release includes all the features of the complete commercial version, and developers building prototypes with open source releases suffer no delay when transitioning to a proprietary product that embeds Berkeley DB.

In summary, Berkeley DB has been one of the most useful, powerful, reliable, and long-lived software packages. The longevity of Berkeley DB's contribution is particularly impressive in an industry with frequent software system turnover.

ACM Fellows

USA - 2011

citation

For contributions to data management and computing systems.

Background