Recent Advances in Post-Selection Statistical Inference
We describe the problem of “post-selection inference,” which addresses the following challenge: having mined a set of data to find potential associations, how do we properly assess the strength of those associations? The fact that we have “cherry-picked” (searched for the strongest associations) means that we must set a higher bar for declaring the associations we see significant. This challenge has become more important in the era of big data and complex statistical modeling: the cherry tree (the dataset) can be very large, and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent developments in post-selection inference and illustrate their use in forward stepwise regression, the lasso, and other settings.
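The need for a higher bar after cherry-picking can be illustrated with a minimal simulation (a hypothetical sketch, not part of the original work): under a global null, where the response is independent of all candidate predictors, selecting the strongest sample correlation and testing it with the usual unadjusted cutoff produces far more “significant” findings than the nominal 5% level.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p, reps, alpha = 100, 50, 2000, 0.05
# Under the null, sqrt(n) * (sample correlation) is approximately N(0, 1),
# so the usual two-sided 5% test rejects when |cor| > 1.96 / sqrt(n).
cutoff = 1.96 / np.sqrt(n)

false_positives = 0
for _ in range(reps):
    X = rng.standard_normal((n, p))   # p candidate predictors
    y = rng.standard_normal(n)        # response independent of every predictor
    cors = X.T @ y / n                # sample correlations (up to scaling)
    best = np.max(np.abs(cors))       # cherry-pick the strongest association
    if best > cutoff:                 # naive test that ignores the selection step
        false_positives += 1

rate = false_positives / reps
# With p = 50 independent candidates, the naive "error rate" is near
# 1 - 0.95^50, roughly 0.9, rather than the nominal 0.05.
```

A valid post-selection procedure must account for the maximization over the p candidates, which is exactly what the methods described above are designed to do.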
This is joint work with Jonathan Taylor, Richard Lockhart, Ryan Tibshirani, and others.