Tinker, smol-RL and QDoRA
Observations from Tinker RL training API abstractions for post-training models
Working notes
Working notes, partial arguments, and technical breadcrumbs worth keeping close.
Featured note
Part 2 teaser and setup for smol-RL experiments with Tinker
Recent notes
Observations from Tinker RL training API abstractions for post-training models
Understanding deep research agents/models/queries/tasks
Documenting learnings, scripts, tricks, knowledge in a different way