About
I’m a Ph.D. candidate in the CILVR lab at NYU Courant, co-advised by Rob Fergus and Lerrel Pinto. My research is supported by a DeepMind Ph.D. Scholarship and an NSF Graduate Research Fellowship.
I’m interested in large generative models that can solve hard tasks in settings like code synthesis, reasoning, decision-making, and open-ended interaction. Recently, I’ve been thinking about:
- How should we train language/vision-language models to exhibit better inference-time trade-offs between performance and computational budget? Can inference-time compute close the performance gap between models of different scales?
- What is the most efficient way to generate synthetic data for improving language model capabilities?
- When is next-token prediction a sufficient pretraining objective for reasoning and decision-making? How does the structure of data impact the downstream expressivity of model representations?
My work touches on generative modeling and reinforcement learning across modalities (vision, natural language, simulators, etc.). I’m also broadly interested in scientific applications of deep learning, such as weather and climate modeling.
I’ve spent time working on improving small language model reasoners with the GenAI/AI Frontiers Teams at Microsoft Research and studying ML-powered weather/climate simulators with the Applied Science Team at Google Research. I did my undergrad in mathematics at MIT, where I was exceptionally lucky to be mentored by Kelsey R. Allen, Gigliola Staffilani, and Raffaele Ferrari.