Abstract: Model-X knockoffs is a wrapper that transforms essentially any feature importance measure into a variable selection algorithm, which discovers true effects while rigorously controlling the expected fraction of false positives. A frequently discussed challenge in applying this method is the construction of knockoff variables, which are synthetic variables obeying a crucial exchangeability property with the explanatory variables under study. This paper introduces techniques for knockoff generation in great...
Title: Recent developments on unbiased Monte Carlo methods
Abstract: Monte Carlo estimators, based on Markov chains or interacting particle systems, are typically biased when run with a finite number of iterations (or a finite number of particles). Although this bias is usually considered unavoidable, and negligible in the usual asymptotic sense, it is an important obstacle on the path towards scalable numerical integration on large-scale distributed computing systems. In a series of works that build on the seminal paper of Glynn and Rhee (2014), a...
Mendelian randomization: A comprehensive statistical approach and applications to preventing heart disease
Mendelian randomization (MR) can give an unbiased estimate of a confounded causal effect by using genetic variants as instrumental variables. The summary-data MR design is rapidly gaining popularity in practice due to the increasing availability of large-scale genome-wide association studies. As we enter the "MR of every risk factor on every disease outcome" era, existing statistical methods still have several major limitations and lack theoretical...
Stability-driven deep model interpretation and provably fast MCMC sampling
Data science is transforming many traditional ways in which we approach scientific problems. While the abundance of data and algorithms generates a lot of excitement in statistical modeling, serious concerns are being raised about how to reliably and efficiently extract scientific knowledge from data and models.
In this talk, I will address particular reliability and efficiency issues that arise from my PhD study on a neuroscience project. Understanding how primates process...