Deep (Learning) Focus
Rubric-Based Rewards for RL
Extending the benefits of large-scale RL training to non-verifiable domains...
Most Popular
Decoder-Only Transformers: The Workhorse of Generative LLMs
Mar 4, 2024 • Cameron R. Wolfe, Ph.D.
Demystifying Reasoning Models
Feb 18, 2025 • Cameron R. Wolfe, Ph.D.
AI Agents from First Principles
Jun 9, 2025 • Cameron R. Wolfe, Ph.D.
Understanding and Using Supervised Fine-Tuning (SFT) for Language Models
Sep 11, 2023 • Cameron R. Wolfe, Ph.D.
Latest
Continual Learning with RL for LLMs
Exploring the impressive continual learning capabilities of RL training...
Jan 26 • Cameron R. Wolfe, Ph.D.
GRPO++: Tricks for Making RL Actually Work
How to go from the vanilla GRPO algorithm to functional RL training at scale...
Jan 5 • Cameron R. Wolfe, Ph.D.
Olmo 3 and the Open LLM Renaissance
Fully-open artifacts with the potential to make LLM research a reality for anyone...
Dec 15, 2025 • Cameron R. Wolfe, Ph.D.
Group Relative Policy Optimization (GRPO)
How the algorithm that teaches LLMs to reason actually works...
Nov 24, 2025 • Cameron R. Wolfe, Ph.D.
PPO for LLMs: A Guide for Normal People
Understanding the complex RL algorithm that gave us modern LLMs...
Oct 27, 2025 • Cameron R. Wolfe, Ph.D.
REINFORCE: Easy Online RL for LLMs
How to get the benefits of online RL without the complexity of PPO...
Sep 29, 2025 • Cameron R. Wolfe, Ph.D.
Online versus Offline RL for LLMs
A deep dive into the online-offline performance gap in LLM alignment...
Sep 8, 2025 • Cameron R. Wolfe, Ph.D.
Deep (Learning) Focus
I contextualize and explain important topics in AI research.
Recommendations
LLM Watch • Pascal Biese
AI for Software Engineers • Logan Thorneloe
Interconnects AI • Nathan Lambert
Artificial Intelligence Made Simple • Devansh
Javarevisited Newsletter • javinpaul