BEGIN:VCALENDAR
VERSION:2.0
X-WR-CALNAME;VALUE=TEXT:Colloquium Series: Tracy Ke
PRODID:-//Harvard events data//EN
BEGIN:VEVENT
UID:event_1636531_0
SUMMARY:Colloquium Series: Tracy Ke
DESCRIPTION:<p>Our upcoming event for the Statistics Colloquium Series is scheduled for Monday, April 27 from 12:00 to 1:00 pm (ET) and will be an in-person presentation at Maxwell-Dworkin 134A/B. Lunch will be provided to guests following the talk. This week's speaker will be faculty member Tracy Ke from our Statistics department.</p><p>&nbsp;</p><p><span><strong>Title:</strong> Integrating Large Language Models with Statistical Methods for Text Analysis</span></p><p>&nbsp;</p><p><span><strong>Abstract:</strong> In this talk, I will present several recent projects at the intersection of large language models (LLMs) and statistical methodology for text analysis. A central theme of this work is to treat a pre-trained LLM as a feature generator and to develop principled statistical models for the resulting representations.</span></p><p><span>I will begin with a recent paper introducing PPTM, a new topic model for LLM-generated word embeddings. In this framework, each document is modeled as a mixture of K latent nonparametric densities (“topics”). Empirical studies demonstrate that PPTM effectively captures context-dependent topic structure in real-world text corpora.</span></p><p><span>I will then discuss three follow-up projects ranging from theory to applications. The first addresses a fundamental question motivated by PPTM: how to optimally demix mixtures of nonparametric densities. We establish the minimax optimal rate and propose a rate-optimal estimator. The second explores a downstream application of topic modeling, introducing a topic-aware Bradley–Terry–Luce model for ranking problems such as journal evaluation and LLM leaderboards. The third engages with generative AI, proposing a framework for training LLMs to generate documents with pre-specified topic weights.</span></p><p><span>This work is based on collaborations with Morgane Austern, Jianqing Fan, Yuanchuan Guo, John Lafferty, Tianle Liu, Gabriel Moryoussef, Zhaoyang Shi, and Yuxin Tao.</span></p>
LOCATION:Maxwell-Dworkin 134A/B
STATUS:CONFIRMED
DTSTART:20260427T160000Z
DTEND:20260427T170000Z
END:VEVENT
END:VCALENDAR