About

I’m a postdoctoral fellow at Harvard University as part of the Harvard Data Science Initiative, and I’m also an affiliate of the Laboratory for Information & Decision Systems (LIDS) at MIT. I work on understanding and improving the implicit world models learned by generative models in AI. I aim to formalize the concept of recovering a world model, derive methods to evaluate recovery, and build procedures to improve models. I study these questions both in traditional AI domains and in the social sciences, where many applications center on making accurate inferences about structure.

I completed my PhD in computer science at Columbia University, where I was advised by David Blei. During my PhD, I was an NSF GRFP Fellow and Cheung-Kong Innovation Doctoral Fellow. I also interned at Google AI and Facebook AI Research. Upon graduating, I received the Morton B. Friedman Memorial Prize for excellence in engineering.

Here is my curriculum vitae. Here is a video I recorded about the methods I’ve worked on to evaluate world models, and here are articles from the Wall Street Journal and Quanta Magazine about this work.

Email: kvafa AT g.harvard.edu

Research Overview

In one line of research, I build tools to extract the implicit world models of generative models. In another, I focus on real-world robustness: building models with enough structure to support complex use cases like decision-making. As an application, I’m interested in building foundation models for statistical estimation problems, especially in the social sciences, where many questions concern measurement and latent structure.

My recent research is summarized below. See my CV for a comprehensive list of papers.

Extracting Implicit World Models

While generative models are trained to make accurate predictions, we often hope that they recover structure about the real world. Most optimistically, we want them to learn accurate world models. I’ve worked on defining theoretical notions of world model recovery and developing empirical procedures to evaluate models. This video summarizes my recent work in this area, and I also organized the ICML 2025 workshop on this topic.

  • Evaluating the World Model Implicit in a Generative Model
    Keyon Vafa, Justin Chen, Ashesh Rambachan, Jon Kleinberg, Sendhil Mullainathan
    Neural Information Processing Systems (NeurIPS) [spotlight], 2024
    [Paper] [Code] [BibTeX]
    Press: Wall Street Journal, Nature, MIT News, Harvard Gazette, Quanta Magazine

  • What has a Foundation Model Found? Using Inductive Bias to Probe for World Models
    Keyon Vafa, Peter Chang, Ashesh Rambachan, Sendhil Mullainathan
    International Conference on Machine Learning (ICML), 2025
    [Paper] [Code] [BibTeX]
    Press: BBC, MIT News

  • Potemkin Understanding in Large Language Models
    Marina Mancoridis, Keyon Vafa, Bec Weeks, Sendhil Mullainathan
    International Conference on Machine Learning (ICML), 2025
    [Paper] [Code] [BibTeX]

Real-World Robustness

Even if generative models don’t have coherent world models, they can still be useful if they perform well under the ways people actually use them. I’ve worked on methods for evaluating and improving the real-world robustness of generative models, in applications like decision-making and steering. I also organized a workshop on a related topic, the NeurIPS 2024 Workshop on Behavioral Machine Learning.

  • Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function
    Keyon Vafa, Ashesh Rambachan, Sendhil Mullainathan
    International Conference on Machine Learning (ICML), 2024
    [Paper] [Code] [MIT News] [BibTeX]

  • What’s Producible May Not Be Reachable: Measuring the Steerability of Generative Models
    Keyon Vafa, Sarah Bentley, Jon Kleinberg, Sendhil Mullainathan
    Neural Information Processing Systems (NeurIPS), 2025
    [Paper] [Code] [BibTeX]

Foundation Models for Statistical Estimation

While foundation models can make accurate predictions on social science data, the ultimate goal in many settings isn’t predicting outcomes, but rather using these predictive models to estimate structural quantities. I’ve worked on adapting foundation models and developing new fine-tuning procedures to address these goals.

  • Estimating Wage Disparities Using Foundation Models
    Keyon Vafa, Susan Athey, David Blei
    Proceedings of the National Academy of Sciences (PNAS), 2025
    [PNAS] [arXiv] [Code] [BibTeX]

  • LABOR-LLM: Language-Based Occupational Representations with Large Language Models
    Susan Athey, Herman Brunborg, Tianyu Du, Ayush Kanodia, Keyon Vafa
    [Revise and resubmit at Quantitative Economics], 2025
    [Paper] [BibTeX]

  • CAREER: A Foundation Model for Labor Sequence Data
    Keyon Vafa, Emil Palikot, Tianyu Du, Ayush Kanodia, Susan Athey, David Blei
    Transactions of Machine Learning Research (TMLR), 2024
    [Paper] [Code] [Data Skeptic Podcast] [Video] [BibTeX]