Rishabh Agarwal – Google Brain

I am a founding member at Periodic Labs and an Adjunct Professor at McGill University. I briefly worked on reinforcement learning and reasoning at Meta. Before that, I was a staff research scientist on the Google DeepMind team. I finished my PhD at Mila under the guidance of Aaron Courville and Marc Bellemare. Previously, I spent a year on Geoffrey Hinton's amazing team at Google Brain, Toronto. Earlier, I graduated in Computer Science and Engineering from IIT Bombay.

My current research revolves around RL and LLMs, and my prior work has received an outstanding paper award at NeurIPS. I was also a core contributor to the Gemma and Gemini models.

Current PhD Students

Morgane Moss (Co-supervised with Aaron Courville)

Past Interns & Student Researchers

Max Schwarzer (BBF, Now ChatGPT lead @ OpenAI)
Devvrit Khatri (Scaling RL Compute, PhD @ UT Austin)
Yongchao Zhou (DistillSpec, Now @ x.AI)
Arian Hosseini (V-STaR, Now @ GDM)
Jesse Farebrother (Stop Regressing, PhD @ McGill)
Lunjun Zhang (Generative RMs, PhD @ UofT)
Charline Le Lan (RL Generalization, Now Gemini Flash @ GDM)
Michael Noukhovitch (Asynchronous RL for LLMs, PhD @ Mila)
Wenda Xu (Speculative KD, Now @ Google)
Hritik Bansal (Compute-Optimal STaR / KD / W2S, PhD @ UCLA)
Josh P Zitovsky (Offline Model Selection, Now @ Amazon)
Amrith Setlur (Advantage for PRMs, PhD @ UC Berkeley)
Ghada Sokar (Dormant Neurons, Now @ GDM)
Siddhant Agarwal (Undergrad Researcher, Now PhD @ UT Austin)

News

Thinking Machines blog post based on my 2023 paper about On-policy Distillation of LLMs.
Delta Podcast on research career and lessons learned so far.
Talk at CVPR 2025 on The Bitter Lesson for RL: Verification as the Key to Reasoning LLMs. [YouTube]
Tutorial on Post-Training Distillation of LLMs at Google. [Podcast @ Youtube]
7 papers accepted at ICLR 2025, including Generative Verifiers, SCoRE, Speculative KD, Async RLHF, and Inference-aware RL for LLMs.
Panelist on the Inference Time LLM Algorithms Tutorial at NeurIPS 2024.
Guest lecture at McGill on RL, Reasoning, and Verifiers.