Tinker, smol-RL and QDoRA (Part 2)
Part 2 goes live on Feb 28, 2026.

This is a short preface to Part 2; the full write-up will be published on that date. If you want the context from Part 1 first, start here: Tinker, smol-RL and QDoRA.

TL;DR: In Part 1, I framed reproducibility as a practical problem in modern LLM work, not just a philosophical ideal. Even with greedy decoding and fixed seeds, determinism can break in subtle ways: GPU type, numerical precision, kernel choices, and non-deterministic log-probs in MoE models all conspire to make “run it again” less reliable than we like to admit. That led to a simple question: do we need a better abstraction layer, one that hides the model-ops complexity while keeping the critical knobs for determinism explicit and repeatable? ...
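To make the "numerical precision and kernel choices" point concrete, here is a minimal sketch in plain Python. It is an illustration only, not code from Part 1: the helper names `accumulate` and `greedy_pick` are hypothetical, and the three-element sum stands in for whatever reduction a real GPU kernel performs. The idea is that floating-point addition is not associative, so two kernels that accumulate the same values in a different order can disagree in the last bit, and under greedy decoding that is enough to flip a near-tied argmax.

```python
def accumulate(values):
    """Sum values strictly in the given order (one fixed reduction order)."""
    total = 0.0
    for v in values:
        total += v
    return total


def greedy_pick(logits):
    """Return the index of the largest logit; ties go to the lowest index."""
    return max(range(len(logits)), key=logits.__getitem__)


# The same three contributions, accumulated in two different orders,
# as two different kernels (or GPU types) might do.
contributions = [0.1, 0.2, 0.3]
forward = accumulate(contributions)
backward = accumulate(list(reversed(contributions)))

# The two sums differ only in the last bit of the float.
print(forward == backward)

# Token 0 has a fixed logit; token 1's logit depends on the reduction order.
# With a near-tie, the greedy choice depends on which order was used.
print(greedy_pick([0.6, forward]), greedy_pick([0.6, backward]))
```

Nothing here is random and no seed is involved, which is why fixing seeds alone does not guarantee that a rerun produces the same tokens.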