Understanding GPT
A visual guide to how language models work
This site explains a tiny GPT model trained on baby names. No machine learning background needed.
01
Tokenizer
The model cannot read letters—it only understands numbers. Learn how text becomes tokens.
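For a taste of what this step does, here is a minimal character-level tokenizer sketch in Python. The alphabet, the newline end-of-name marker, and the resulting token ids are illustrative assumptions, not necessarily the exact values microgpt.py uses.

```python
# Minimal character-level tokenizer sketch (vocabulary and ids are illustrative).
chars = list("abcdefghijklmnopqrstuvwxyz") + ["\n"]  # "\n" marks the end of a name
stoi = {ch: i for i, ch in enumerate(chars)}          # character -> token id
itos = {i: ch for ch, i in stoi.items()}              # token id -> character

def encode(text: str) -> list[int]:
    """Turn each character into its integer token id."""
    return [stoi[ch] for ch in text]

def decode(ids: list[int]) -> str:
    """Turn token ids back into characters."""
    return "".join(itos[i] for i in ids)

print(encode("emma"))          # [4, 12, 12, 0]
print(decode(encode("emma")))  # "emma"
```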
02
Embeddings
Each token becomes a list of numbers. See how the model represents both letters and positions.
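A minimal sketch of the idea, assuming a 27-token vocabulary and made-up sizes (the real model's dimensions may differ): each token id picks a row from a token table, each position picks a row from a position table, and the two are added.

```python
import numpy as np

vocab_size, block_size, n_embd = 27, 8, 16        # illustrative sizes
rng = np.random.default_rng(0)
tok_emb = rng.normal(size=(vocab_size, n_embd))   # one vector per token (learned in training)
pos_emb = rng.normal(size=(block_size, n_embd))   # one vector per position

ids = [4, 12, 12, 0]                    # token ids for "emma" from the tokenizer sketch
x = tok_emb[ids] + pos_emb[: len(ids)]  # each token's vector plus its position's vector
print(x.shape)                          # (4, 16): four tokens, each a list of 16 numbers
```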
03
Forward Pass
Follow one token through the model. Watch attention, normalization, and prediction unfold.
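The sketch below compresses the forward pass into one attention head, a simple normalization, and an output projection, all with random stand-in weights. It shows the shape of the computation, not microgpt.py's exact architecture.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
T, C, vocab_size = 4, 16, 27                  # tokens so far, vector size, vocabulary size
x = rng.normal(size=(T, C))                   # stand-in for the embedded tokens

# One attention head: every token looks back at the tokens before it.
Wq, Wk, Wv = (rng.normal(size=(C, C)) for _ in range(3))
q, k, v = x @ Wq, x @ Wk, x @ Wv
scores = q @ k.T / np.sqrt(C)                 # how strongly each token attends to each other
scores[np.triu(np.ones((T, T), dtype=bool), k=1)] = -np.inf  # causal mask: no looking ahead
att = softmax(scores) @ v                     # each token becomes a mix of earlier tokens

# Normalize (a stand-in for the model's layer normalization), then score the vocabulary.
h = (att - att.mean(-1, keepdims=True)) / (att.std(-1, keepdims=True) + 1e-5)
logits = h @ rng.normal(size=(C, vocab_size))
probs = softmax(logits[-1])                   # the last token's prediction for what comes next
print(probs.shape, round(probs.sum(), 3))     # (27,) 1.0
```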
04
Training
The model learns from examples. See loss decrease as predictions improve over time.
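As a stand-in for the full model, the sketch below trains a toy bigram table with the same loop GPT training uses: measure cross-entropy loss on (token, next token) pairs, compute a gradient, nudge the weights, and watch the loss fall. The training pairs come from the name "emma"; the learning rate is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size = 27
W = rng.normal(scale=0.1, size=(vocab_size, vocab_size))  # W[cur] = scores for the next token

pairs = [(4, 12), (12, 12), (12, 0), (0, 26)]  # (current, correct next) token ids for "emma\n"
lr = 1.0                                       # illustrative learning rate
for step in range(50):
    loss = 0.0
    for cur, nxt in pairs:
        logits = W[cur]
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        loss += -np.log(probs[nxt])   # cross-entropy: low probability on the truth = high loss
        grad = probs.copy()
        grad[nxt] -= 1.0              # gradient of the loss with respect to the logits
        W[cur] -= lr * grad           # gradient descent: nudge weights to reduce the loss
    if step % 10 == 0:
        print(f"step {step:2d}: avg loss {loss / len(pairs):.3f}")  # the loss should fall
```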
05
Inference
A trained model generates new names. Control creativity with temperature sampling.
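Temperature sampling in miniature: divide the model's output scores (logits) by a temperature before turning them into probabilities. The logits below are made up; low temperature concentrates probability on the top choice, high temperature spreads it out.

```python
import numpy as np

def sample_next(logits: np.ndarray, temperature: float, rng) -> int:
    """Pick the next token id; lower temperature = safer, higher = more creative."""
    scaled = logits / temperature             # temperature rescales the scores
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()                      # softmax: scores -> probabilities
    return int(rng.choice(len(probs), p=probs))

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.5, 0.1])       # made-up scores over a tiny vocabulary
for t in (0.5, 1.0, 2.0):
    picks = [sample_next(logits, t, rng) for _ in range(1000)]
    print(f"temperature {t}: top token chosen {picks.count(0) / 10:.0f}% of the time")
```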
Based on microgpt.py by Andrej Karpathy