You send a prompt to ChatGPT and seconds later, a response streams back word by word. What's actually happening inside? This chapter walks through the process end to end, from the moment text enters the model to the moment a prediction comes out. The rest of the course builds each piece of this pipeline from the ground up.
The Core Idea
At its core, a Large Language Model does one thing: predict the next word. When you give it "The cat sat on the", the model looks at those words and outputs a probability for every possible next word in its vocabulary.
The model samples one word from the high-probability candidates, appends it to the input, and predicts again.
Each new word feeds back in as input, and the loop continues until the response is complete. Every response you see from a language model is built one word at a time.
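This loop can be sketched in a few lines of Python. The "model" below is just a made-up table mapping the last word to a probability distribution over next words; a real LLM computes these probabilities with a neural network over the entire context. The table, the `<end>` marker, and all probabilities are invented for illustration.

```python
import random

# Toy stand-in for a model: last word -> probabilities over next words.
# All entries here are made up purely to illustrate the loop.
TOY_PROBS = {
    "the": {"cat": 0.5, "mat": 0.3, "dog": 0.2},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"on": 0.9, "down": 0.1},
    "on": {"the": 1.0},
    "mat": {"<end>": 1.0},
    "dog": {"<end>": 1.0},
}

def generate(prompt_words, max_new_words=10, seed=0):
    """Repeatedly predict a next word, append it, and feed it back in."""
    rng = random.Random(seed)
    words = list(prompt_words)
    for _ in range(max_new_words):
        dist = TOY_PROBS.get(words[-1], {"<end>": 1.0})
        # Sample one word according to its probability.
        next_word = rng.choices(list(dist), weights=list(dist.values()))[0]
        if next_word == "<end>":
            break
        words.append(next_word)
    return words

print(" ".join(generate(["the", "cat", "sat", "on", "the"])))
```

The essential shape is the same in a real system: predict, sample, append, repeat. Only the way the probabilities are computed changes.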
The Full Pipeline
When you send a prompt to an LLM and receive a response, your text goes through a series of transformations.
Don't worry about understanding exactly how or why these steps happen right now. We will explore each one in detail later. For now, just focus on the high-level stages your text passes through.
Characters become bytes, raw numbers computers can process.
Bytes are grouped into meaningful chunks called tokens, each with a numeric ID, giving the model a shorter sequence to work with.
Each token ID becomes a vector, a list of numbers that captures the token's meaning.
The Transformer processes the full sequence and predicts what comes next.
The predicted token is converted back to text, and the loop continues until the response is complete.
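The stages above can be traced with a toy example. Real LLMs use byte-level subword tokenizers (such as BPE) and learned embeddings with hundreds of dimensions; the word-level vocabulary, token IDs, and two-number "embeddings" below are all made up to keep each stage visible.

```python
text = "The cat sat"

# Stage 1: characters become bytes (numbers computers can process).
raw_bytes = list(text.encode("utf-8"))

# Stage 2: text is grouped into tokens, each mapped to an integer ID.
# (A toy word-level vocabulary; real tokenizers work on byte chunks.)
vocab = {"The": 0, "cat": 1, "sat": 2, "on": 3, "the": 4}
token_ids = [vocab[word] for word in text.split()]

# Stage 3: each token ID becomes a vector (tiny made-up embeddings here).
embeddings = {0: [0.1, 0.3], 1: [0.9, 0.2], 2: [0.4, 0.8],
              3: [0.5, 0.5], 4: [0.1, 0.3]}
vectors = [embeddings[tid] for tid in token_ids]

# Stage 4 would run the Transformer over these vectors to score every
# possible next token. Stage 5 decodes the chosen ID back to text:
id_to_word = {i: w for w, i in vocab.items()}
predicted_id = 3                  # pretend the model predicted this ID
print(id_to_word[predicted_id])   # prints "on"
```

Each later chapter replaces one of these toy stages with the real mechanism.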
Learning from Data
LLMs are built to predict the next word, but how do they get good at it? A freshly created model is just millions of random numbers. It knows nothing about language, grammar, or the world. If you asked it to complete "The cat sat on the", it might confidently answer "purple" or "seventeen".
These random numbers become useful through training. The model processes billions of text examples, and after each prediction, its parameters are adjusted to reduce the error between what it guessed and what actually came next.
A training example from text data.
The model guesses the next word. At first, it's random (e.g., "banana").
The actual next word is "throne". We measure the error.
The model adjusts its internal numbers to make "throne" more likely next time.
By repeating this process billions of times across massive datasets, the model's parameters gradually shift from random noise into structured representations. It learns to recognize grammar, word relationships, and factual associations, all from predicting the next word.
What You Will Build
In this course, we will build every piece of this pipeline from scratch. By the end, you will understand not just what an LLM does, but how and why each component exists.
Throughout the course, you'll implement what you learn through coding challenges at the end of each chapter. Most challenges run directly in the built‑in editor and give feedback immediately. For local project workflows, check the setup page for the current CLI status and rollout plan.
Let's start with the very first step: how computers see text.