Statistics Colloquium Series

Date: 

Monday, February 6, 2023, 12:00pm to 1:00pm

Location: 

Science Center, Room 316

Our upcoming event in the Statistics Department Colloquium Series is scheduled for this Monday, February 6th, from 12:00 – 1:00pm (ET) and will be an in-person presentation in Science Center Rm. 316. The speaker will be Cynthia Rudin, the Earl D. McLean, Jr. Professor of Computer Science, Electrical and Computer Engineering, Statistical Science, Mathematics, and Biostatistics & Bioinformatics at Duke University, where she directs the Interpretable Machine Learning Lab.

Title: Do Simpler Machine Learning Models Exist and How Can We Find Them?

Abstract: While the trend in machine learning has been toward building more complicated (black box) models, such models are not as useful for high-stakes decisions: black box models have led to mistakes in bail and parole decisions in criminal justice, flawed models in healthcare, and inexplicable loan decisions in finance. Simpler, interpretable models would be better. Thus, we consider questions that diametrically oppose the trend in the field: for which types of datasets would we expect to get simpler models at the same level of accuracy as black box models? If such simpler-yet-accurate models exist, how can we use optimization to find them? In this talk, I present an easy calculation to check for the possibility of a simpler (yet accurate) model before computing one. This calculation indicates that simpler-but-accurate models exist in practice more often than you might think. Also, some types of these simple models are (surprisingly) small enough that they can be memorized or printed on an index card. I will discuss our work on finding the full set of optimal and near-optimal sparse decision trees, as well as optimization for sparse generalized additive models (sparse GAMs). At the end, I will mention current work on scoring systems (NeurIPS 2022), dimension reduction for visualization (JMLR, 2021), optimal sparse regression trees (AAAI, 2023), and interpretable neural networks (NeurIPS 2019). This is joint work with many wonderful students, including Lesia Semenova, Chudi Zhong, Zhi Chen, Rui Xin, Jiachang Liu, Hayden McTavish, Jay Wang, Reto Achermann, Ilias Karimalis, and Jacques Chen, as well as senior collaborators Margo Seltzer, Ron Parr, Brandon Westover, Aaron Struck, Berk Ustun, and Takuya Takagi.