ePCA: Exponential family PCA
Many applications, such as photon-limited imaging and genomics, involve large datasets with entries from exponential family distributions. A natural goal is to estimate the covariance structure and principal components of the noiseless distribution. Principal Component Analysis (PCA), the standard method for this setting, can be inefficient when the noise is non-Gaussian. In this talk we present ePCA, a methodology for PCA on exponential family distributions. ePCA involves the eigendecomposition of a new covariance matrix estimator, constructed in a deterministic, non-iterative way using moment calculations, shrinkage, and random matrix theory. We provide several theoretical justifications for our estimator, including the Marchenko-Pastur law in high dimensions. We illustrate ePCA by denoising molecular diffraction maps obtained using photon-limited X-ray free-electron laser (XFEL) imaging. This is joint work with Lydia T. Liu and Amit Singer.
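
To make the "moment calculations" ingredient concrete, here is a minimal sketch for the Poisson case only. It relies on the standard identity Cov(Y) = Cov(X) + diag(E[X]) when Y given X is Poisson with mean X, so subtracting the diagonal matrix of sample means from the sample covariance gives a debiased estimate of Cov(X). The function name `debiased_poisson_covariance`, the synthetic rank-one data, and all parameter values are illustrative assumptions, not taken from the talk, and the shrinkage and random-matrix steps of ePCA are not shown.

```python
import numpy as np


def debiased_poisson_covariance(Y):
    """Return (sample mean, debiased covariance) for an (n, p) array of counts.

    Hypothetical helper: it implements only the diagonal moment correction
    for Poisson noise, not the full ePCA estimator.
    """
    Y = np.asarray(Y, dtype=float)
    n = Y.shape[0]
    mean = Y.mean(axis=0)
    centered = Y - mean
    sample_cov = centered.T @ centered / n
    # Poisson noise: Var(Y_j | X_j) = X_j, so Cov(Y) = Cov(X) + diag(E[X]).
    return mean, sample_cov - np.diag(mean)


# Synthetic example (assumed setup): rank-one latent intensities X,
# observed through Poisson noise.
rng = np.random.default_rng(0)
n, p = 20000, 5
u = np.array([1.0, 2.0, 3.0, 2.0, 1.0])   # latent direction
z = rng.uniform(0.5, 1.5, size=n)          # per-sample scale
X = 10.0 * z[:, None] * u[None, :]         # noiseless intensities, shape (n, p)
Y = rng.poisson(X)                         # observed counts

mean, S_d = debiased_poisson_covariance(Y)

# Principal components come from the eigendecomposition of the
# debiased matrix; eigh returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(S_d)
top = eigvecs[:, -1]
alignment = abs(top @ u) / np.linalg.norm(u)
print(f"alignment of top eigenvector with latent direction: {alignment:.3f}")
```

With many samples relative to the dimension, the leading eigenvector of the debiased matrix should align closely with the latent direction. In the high-dimensional regime the abstract refers to, the additional shrinkage and random-matrix corrections become relevant.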