Statistics Colloquium Series

Date: Monday, April 4, 2022, 12:00pm to 1:00pm

Location: Science Center, Room 316

Our upcoming event for the Statistics Department Colloquium Series is scheduled for this Monday, April 4th, from 12:00 to 1:00pm (ET) and will be an in-person presentation in Room 316 of the Science Center. The speaker is Lucas Janson, who is an Assistant Professor of Statistics at Harvard University.
 
Title: Controlled Discovery and Localization of Signals via Bayesian Linear Programming (BLiP)
 
Abstract: In many statistical problems, it is necessary to simultaneously discover signals and localize them as precisely as possible. For instance, genetic fine-mapping studies aim to discover causal genetic variants, but the strong local dependence structure of the genome makes it hard to identify the exact locations of those variants. So the statistical task is to output as many regions as possible and have those regions be as small as possible, while controlling how many outputted regions contain no signal. The same problem arises in any application where signals cannot be perfectly localized, such as locating stars in astronomical sky surveys and change point detection in time series data. However, there are two competing objectives: maximizing the number of discoveries and minimizing the size of those discoveries (all while controlling false discoveries), so our first contribution is to propose a single unified measure we call the resolution-adjusted power that formally trades off these two objectives and hence, in principle, can be maximized subject to a constraint on false discoveries. We take a Bayesian approach, but the resulting posterior optimization problem is intractable due to its non-convexity and high-dimensionality. Thus our second contribution is Bayesian Linear Programming (BLiP), a method which overcomes this intractability to jointly detect and localize signals in a way that verifiably nearly maximizes the expected resolution-adjusted power while provably controlling false discoveries. BLiP is very computationally efficient and can wrap around any Bayesian model and algorithm for approximating the posterior distribution over signal locations. Applying BLiP on top of existing state-of-the-art Bayesian analyses of UK Biobank data (for genetic fine-mapping) and the Sloan Digital Sky Survey (for astronomical point source detection) increased the resolution-adjusted power by 30-120% with just a few minutes of computation. 
BLiP is implemented in the new packages pyblip (Python) and blipr (R).
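
For readers who want a concrete picture of the optimization described in the abstract, the following is a minimal toy sketch in Python. It is not the pyblip or blipr API, and it is not the speaker's implementation. Several details are assumptions made purely for illustration: the function name select_regions, the use of contiguous intervals as candidate regions, the weight 1/|G| for a region G (so smaller regions count for more), the linear form of the false discovery constraint, and the simple rounding rule that keeps regions whose LP value exceeds 0.5. The sketch uses scipy.optimize.linprog to solve the linear programming relaxation.

```python
import numpy as np
from scipy.optimize import linprog


def select_regions(samples, max_size=2, q=0.1):
    """Toy region selection from posterior draws (illustrative only).

    samples: (n_draws, n_locations) 0/1 array; entry 1 means the draw
             places a signal at that location.
    q:       nominal false discovery level (assumed linear constraint).
    """
    n_loc = samples.shape[1]
    # Candidate regions: contiguous half-open intervals [start, stop).
    regions = [(s, s + k)
               for k in range(1, max_size + 1)
               for s in range(n_loc - k + 1)]
    # Posterior probability that each region contains at least one signal.
    pip = np.array([samples[:, a:b].any(axis=1).mean() for a, b in regions])
    size = np.array([b - a for a, b in regions])

    # Objective: maximize sum_G pip_G / |G| * x_G (linprog minimizes, so negate).
    c = -(pip / size)

    # Disjointness: each location is covered by at most one selected region.
    a_disjoint = np.array([[1.0 if a <= j < b else 0.0 for a, b in regions]
                           for j in range(n_loc)])
    # Assumed false discovery constraint: sum_G (1 - pip_G - q) * x_G <= 0.
    a_fdr = (1.0 - pip - q)[None, :]

    res = linprog(c,
                  A_ub=np.vstack([a_disjoint, a_fdr]),
                  b_ub=np.concatenate([np.ones(n_loc), [0.0]]),
                  bounds=(0, 1),
                  method="highs")
    if not res.success:
        raise RuntimeError(res.message)

    # Crude rounding of the LP relaxation: keep regions with x > 0.5.
    # Overlapping regions cannot both exceed 0.5, so the result is disjoint,
    # but the false discovery constraint is not re-checked after rounding.
    return [r for r, x in zip(regions, res.x) if x > 0.5]


# Hypothetical posterior draws over 6 locations: one signal is always at
# location 0; a second signal is at location 3 in half the draws and at
# location 4 in the other half.
samples = np.zeros((4, 6), dtype=int)
samples[:, 0] = 1
samples[:2, 3] = 1
samples[2:, 4] = 1

selected = select_regions(samples)
print(selected)
```

In this toy example the first signal can be pinned to a single location, while the second can only be localized to the pair of locations 3 and 4, so the selection should pair a size-one region with a size-two region. This mirrors the trade-off in the abstract between the number of discoveries and their size.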