Statistics Colloquium: Yingying Fan (University of Southern California)

Date: 

Monday, November 8, 2021, 12:00pm to 1:00pm

Location: 

Zoom - please contact emilie_campanelli@fas.harvard.edu for more information

Title:

Asymptotic Properties of High-Dimensional Random Forests

Abstract:

As a flexible nonparametric learning tool, random forests has been widely applied to various real applications with appealing empirical performance, even in the presence of a high-dimensional feature space. Unveiling the underlying mechanisms has led to some important recent theoretical results on the consistency of the random forests algorithm and its variants. However, to our knowledge, all existing works concerning random forests consistency under the setting of high dimensionality were done for various modified random forests models where the splitting rules are independent of the response. In light of this, in this paper we derive the consistency rates for the original version of the random forests algorithm in a general high-dimensional nonparametric regression setting through a bias-variance decomposition analysis. Our new theoretical results show that random forests can indeed adapt to high dimensionality. In particular, we investigate in depth the conditions under which random forests controls the bias. Furthermore, our bias analysis characterizes explicitly how the random forests bias depends on the sample size, tree height, and column subsampling parameter. Some limitations of our current results are also discussed. I will also briefly discuss statistical inference based on random forests estimation. This talk is based on joint work with Chien-Ming Chi, Jinchi Lv, and Patrick Vossler.