About

I’m a postdoctoral fellow at Harvard University as part of the Harvard Data Science Initiative. I work on behavioral machine learning: developing tools to evaluate how AI models understand the world so we can create shared understanding between models and people. I’m also an affiliate with LIDS at MIT.

I completed my PhD in computer science at Columbia University in 2023, where I was advised by David Blei. During my PhD, I was an NSF GRFP Fellow and Cheung-Kong Innovation Doctoral Fellow. Upon graduating, I received the Morton B. Friedman Memorial Prize for excellence in engineering.

Announcement: I’m co-organizing the ICML 2025 Workshop on Assessing World Models. If you’re interested in understanding the world models of AI systems, come by to check out the submitted papers and talks.

Here is my curriculum vitae.

Email: kvafa AT g.harvard.edu

Recent Papers

Here are some papers from the past year that are representative of what I work on. See my CV for a complete list.

What has a Foundation Model Found? Using Inductive Bias to Probe for World Models
K Vafa, P Chang, A Rambachan, S Mullainathan
International Conference on Machine Learning (ICML), 2025
[Paper] [Code]

Potemkin Understanding in Large Language Models
M Mancoridis, K Vafa, B Weeks, S Mullainathan
International Conference on Machine Learning (ICML), 2025
[Paper] [Code]

Estimating Wage Disparities Using Foundation Models
K Vafa, S Athey, D Blei
Proceedings of the National Academy of Sciences (PNAS), 2025
[PNAS] [arXiv] [Code]

Evaluating the World Model Implicit in a Generative Model
K Vafa, J Y Chen, A Rambachan, J Kleinberg, S Mullainathan
Neural Information Processing Systems (NeurIPS), 2024 [spotlight]
[Paper] [Code] [Twitter summary] Press: Nature, MIT News, Wall Street Journal

Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function
K Vafa, A Rambachan, S Mullainathan
International Conference on Machine Learning (ICML), 2024
[Paper] [Code] [MIT News]