Colloquium Series: Michael Li


Monday, March 18, 2024, 12:00pm to 1:00pm

Our upcoming event for the Statistics Department Colloquium Series is scheduled for Monday, March 18, from 12:00 – 1:00pm (ET) and will be an in-person presentation in Science Center Rm. 316. Lunch will be provided to guests following the talk. This week's speaker will be Michael Li of the Harvard Business School.

Title: The Cram Method for Efficient Simultaneous Learning and Evaluation

Abstract: We introduce the 'cram' method, a general and efficient approach to simultaneous learning and evaluation using a generic machine learning (ML) algorithm. In a single pass of batched data, the proposed method repeatedly trains an ML algorithm and tests its empirical performance. Because it utilizes the entire sample for both learning and evaluation, cramming is significantly more data-efficient than traditional sample-splitting. The cram method also naturally accommodates online learning algorithms, leading to a computationally efficient methodology. To demonstrate the power of the cram method, we consider the standard policy learning setting where cramming is applied to the same data to both develop an optimal individualized treatment rule (ITR) and estimate the average outcome under the learned ITR. We show that under a minimal set of assumptions, the resulting crammed evaluation estimator is consistent and asymptotically normal. While this result requires a relatively weak stabilization condition on the ML algorithm, we develop a simple, generic method that can be used with any policy learning algorithm to satisfy this condition. Our extensive simulation studies show that, when compared to sample-splitting, cramming reduces the evaluation standard error by more than 40% while improving the performance of the learned policy.