About

I’m a postdoctoral fellow at Harvard University as part of the Harvard Data Science Initiative, and I’m also an affiliate of LIDS at MIT. I work on developing new evaluation methodology for generative models in AI. I’m interested in capabilities that benchmarks can’t measure, like testing an LLM’s world model (i.e., whether it understands the world). While these ideas can seem nebulous, my goal is to formalize them, derive evaluation methods, and apply them to models as they’re deployed.

I completed my PhD in computer science at Columbia University, where I was advised by David Blei. During my PhD, I was an NSF GRFP Fellow and Cheung-Kong Innovation Doctoral Fellow. I also interned at Google AI and Facebook AI Research. Upon graduating, I received the Morton B. Friedman Memorial Prize for excellence in engineering.

Here is my curriculum vitae. Here is a video I recorded about the methods I’ve worked on to evaluate world models, and here are articles from the Wall Street Journal and Quanta Magazine about this work.

Email: kvafa AT g.harvard.edu

Research Overview

My research focuses on three areas: building tools to evaluate and improve the world models of generative models; adapting foundation models to address statistical estimation problems in the social sciences; and behavioral machine learning, i.e., improving model performance by incorporating and quantifying insights from the behavioral sciences.

My recent research is summarized below. See my CV for a comprehensive list of papers.

World Models

While generative models are trained to make accurate predictions, we often hope that they also recover structure about the real world. In other words, we want them to learn accurate world models. I’ve worked on defining theoretical notions of world model recovery and developing empirical procedures to evaluate models. This video summarizes my recent work in this area, and I also organized the ICML 2025 workshop on this topic.

Statistical Estimation

While foundation models can make accurate predictions about social science data, the ultimate goal in many settings isn’t predicting outcomes, but rather using these predictive models to estimate statistical quantities. I’ve worked on adapting foundation models and developing new fine-tuning procedures to address these goals.

  • Estimating Wage Disparities Using Foundation Models
    Keyon Vafa, Susan Athey, David Blei
    Proceedings of the National Academy of Sciences (PNAS), 2025
    [PNAS] [arXiv] [Code] [BibTeX]

  • CAREER: A Foundation Model for Labor Sequence Data
    Keyon Vafa, Emil Palikot, Tianyu Du, Ayush Kanodia, Susan Athey, David Blei
    Transactions of Machine Learning Research (TMLR), 2024
    [Paper] [Code] [Data Skeptic Podcast] [Video] [BibTeX]

Behavioral Machine Learning

Even if generative models don’t have coherent world models, they can still be useful as long as they perform well for the people who use them. While there’s a field full of insights about human behavior — the behavioral sciences — these insights are often qualitative and cannot easily be incorporated into AI systems. I’ve worked on behavioral machine learning: translating insights from the behavioral sciences into formal, computational models in order to evaluate and improve AI systems. I also organized the NeurIPS 2024 workshop on this topic.

  • Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function
    Keyon Vafa, Ashesh Rambachan, Sendhil Mullainathan
    International Conference on Machine Learning (ICML), 2024
    [Paper] [Code] [MIT News] [BibTeX]

  • What’s Producible May Not Be Reachable: Measuring the Steerability of Generative Models
    Keyon Vafa, Sarah Bentley, Jon Kleinberg, Sendhil Mullainathan
    [Paper] [Code] [BibTeX]