Sebastian Raschka: Understanding LLMs from First Principles
Sebastian Raschka is an LLM research engineer and one of the most effective AI educators working today. His approach is distinctive: build things from scratch to understand them deeply.
His book “Build a Large Language Model (From Scratch)” and its companion repository with 84k GitHub stars exemplify this philosophy. Rather than treating LLMs as black boxes, he walks readers through implementing every component—tokenization, attention mechanisms, pretraining, instruction finetuning—step by step in PyTorch.
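To give a flavor of the granularity involved, here is what one of those from-scratch steps looks like: a single byte-pair-encoding-style merge, the core operation behind most LLM tokenizers. This is a simplified sketch for illustration, not code from the book.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent token pairs and return the most common one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge_pair(tokens, pair, new_token):
    """Replace every occurrence of `pair` with `new_token`."""
    out, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

tokens = list("low lower lowest")          # start from characters
pair = most_frequent_pair(tokens)          # ('l', 'o') occurs three times
tokens = merge_pair(tokens, pair, "lo")    # vocabulary grows by one merged token
```

Repeating this merge loop until a target vocabulary size is reached is, in essence, how BPE tokenizers are trained.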
Career Path
Raschka’s background spans academia and industry. He was a professor of statistics at the University of Wisconsin-Madison before joining Lightning AI as a senior engineer. This dual perspective shapes his teaching: rigorous theory backed by practical implementation.
He now works as an independent LLM research engineer, splitting time between research, writing, and open source contributions. His newsletter Ahead of AI reaches over 159,000 subscribers.
The From-Scratch Method
The core of Raschka’s educational philosophy: implementation drives understanding.
From his book description:
“In Build a Large Language Model (From Scratch), you’ll learn and understand how large language models (LLMs) work from the inside out by coding them from the ground up, step by step.”
This isn’t about reinventing wheels for production use. It’s about building mental models. When you implement multi-head attention yourself, you understand why certain prompting strategies work. When you code the training loop, you grasp why certain hyperparameters matter.
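The attention case makes the point concrete. A minimal causal scaled-dot-product attention in NumPy (a single head, no learned projections — a sketch of the mechanism, not a production implementation) shows directly why each token's output is a weighted mix of earlier tokens only:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def causal_attention(Q, K, V):
    """Scaled dot-product attention with a causal mask:
    each position attends only to itself and earlier positions."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # (T, T) similarities
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores[mask] = -np.inf                               # hide future positions
    weights = softmax(scores)                            # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
T, d = 4, 8                                              # sequence length, head dim
Q, K, V = (rng.standard_normal((T, d)) for _ in range(3))
out, w = causal_attention(Q, K, V)
```

Once you have written the masking line yourself, facts like "the first token can never see later context" stop being trivia and become obvious consequences of the code.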
His latest project extends this to reasoning models. The reasoning-from-scratch repository applies the same implementation-first approach to inference-time scaling techniques—chain-of-thought, self-consistency, verification, and search methods.
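Of those techniques, self-consistency is the easiest to sketch: sample several reasoning paths, extract each path's final answer, and majority-vote. The example below uses hypothetical hard-coded sampled answers in place of actual model calls; it illustrates the voting step only.

```python
from collections import Counter

def self_consistency(final_answers):
    """Self-consistency: given the final answers extracted from several
    sampled chain-of-thought runs, return the majority-vote answer."""
    return Counter(final_answers).most_common(1)[0][0]

# Hypothetical final answers from five sampled reasoning paths:
sampled_answers = ["42", "41", "42", "42", "39"]
best = self_consistency(sampled_answers)
```

The intuition: individual reasoning paths are noisy, but errors tend to scatter across different wrong answers while correct paths converge, so the vote amplifies the signal.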
Key Resources
Books:
- “Build a Large Language Model (From Scratch)” (2024)
- “Machine Learning with PyTorch and Scikit-Learn” (2022)
- “Build a Reasoning Model (From Scratch)” (in progress, early access available)
Open Source:
- LLMs-from-scratch — 84k stars, companion to his book
- deeplearning-models — 17.4k stars, reference implementations
- litgpt — 13.1k stars, production-ready LLM recipes (core contributor)
- reasoning-from-scratch — reasoning model implementations
Newsletter: Ahead of AI covers LLM research developments in technical depth while keeping the explanations accessible.
Writing Style
Raschka’s technical writing is direct and code-forward. He shows the implementation, then explains the reasoning. No hand-waving, no “the details are left as an exercise.”
From a recent post on inference-time scaling:
“This idea is straightforward. If we are willing to spend a bit more compute, and more time at inference time, we can get the model to produce better answers.”
This directness extends to his newsletter, which maintains technical rigor while remaining accessible to practitioners who want to understand, not just use, the latest developments.
Practical Takeaways
Build to understand. Implementing a technique yourself—even a simplified version—creates deeper understanding than reading papers or using APIs.
Code is documentation. Raschka’s repositories serve as living documentation. The code demonstrates exactly what the concepts mean in practice.
Bridge theory and practice. His academic background shows in the theoretical grounding; his industry experience shows in the focus on what actually works.
Teach at the right level. His materials don’t assume you already know everything, but they also don’t waste time on irrelevant basics.
Links
- Website: sebastianraschka.com
- GitHub: @rasbt (35k+ followers)
- Twitter/X: @rasbt
- Newsletter: Ahead of AI
- YouTube: @SebastianRaschka