David Hall is the Research Engineering Lead at the Stanford Center for Research on Foundation Models, where he leads the development of Levanter. Previously, he was a cofounder of the conversational AI startup Semantic Machines, which Microsoft acquired in 2018. Before that, he completed a PhD in Natural Language Processing at UC Berkeley, where he held a Google PhD Fellowship. David's research interests have included syntactic parsing, computational historical linguistics and social science, conversational AI, agents in games like StarCraft, accelerating NLP algorithms, and now foundation models.
Levanter: Legible, Scalable, Reproducible Foundation Models with JAX
In this talk, I'll describe Levanter, a new JAX framework for training foundation models that we developed at the Stanford Center for Research on Foundation Models (CRFM). We designed Levanter to be legible, scalable, and reproducible. Levanter uses our new named tensor library, Haliax, to improve code legibility and enable flexible model parallelism: just 10 lines of code add FSDP and/or tensor parallelism without any changes to the model code. This legibility and flexibility do not come at the expense of efficiency: Levanter can achieve over 50% model FLOPs utilization (MFU) on a v3-256 TPU. Thanks to JAX, Levanter also offers bitwise reproducibility. I'll briefly touch on some of its other features, including Hugging Face compatibility and our online data preprocessing. I'll also describe some of the research that Levanter has enabled or accelerated, including Anticipatory Music Transformers, the new optimizer Sophia, and Backpack architectures.