Understanding GPT
A visual guide to how language models work
This site explains a tiny GPT model trained on baby names. No machine learning background needed.
01
Tokenizer
The model cannot read letters—it only understands numbers. Learn how text becomes tokens.
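For a taste of what this step does, here is a minimal character-level tokenizer sketch in Python. The alphabet, the newline end-of-name marker, and the resulting token ids are illustrative assumptions, not necessarily the exact values microgpt.py uses.

```python
# Minimal character-level tokenizer sketch (vocabulary and ids are illustrative).
chars = list("abcdefghijklmnopqrstuvwxyz") + ["\n"]  # "\n" marks the end of a name
stoi = {ch: i for i, ch in enumerate(chars)}          # character -> token id
itos = {i: ch for ch, i in stoi.items()}              # token id -> character

def encode(text: str) -> list[int]:
    """Turn each character into its integer token id."""
    return [stoi[ch] for ch in text]

def decode(ids: list[int]) -> str:
    """Turn token ids back into characters."""
    return "".join(itos[i] for i in ids)

print(encode("emma"))          # [4, 12, 12, 0]
print(decode(encode("emma")))  # "emma"
```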
02
Embeddings
Each token becomes a list of numbers. See how the model represents both letters and positions.
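A minimal sketch of the idea, assuming a 27-token vocabulary and made-up sizes (the real model's dimensions may differ): each token id picks a row from a token table, each position picks a row from a position table, and the two are added.

```python
import numpy as np

vocab_size, block_size, n_embd = 27, 8, 16        # illustrative sizes
rng = np.random.default_rng(0)
tok_emb = rng.normal(size=(vocab_size, n_embd))   # one vector per token (learned in training)
pos_emb = rng.normal(size=(block_size, n_embd))   # one vector per position

ids = [4, 12, 12, 0]                    # token ids for "emma" from the tokenizer sketch
x = tok_emb[ids] + pos_emb[: len(ids)]  # each token's vector plus its position's vector
print(x.shape)                          # (4, 16): four tokens, each a list of 16 numbers
```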
03
Forward Pass
Follow one token through the model. Watch attention, normalization, and prediction unfold.
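The sketch below compresses the forward pass into one attention head, a simple normalization, and an output projection, all with random stand-in weights. It shows the shape of the computation, not microgpt.py's exact architecture.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
T, C, vocab_size = 4, 16, 27                  # tokens so far, vector size, vocabulary size
x = rng.normal(size=(T, C))                   # stand-in for the embedded tokens

# One attention head: every token looks back at the tokens before it.
Wq, Wk, Wv = (rng.normal(size=(C, C)) for _ in range(3))
q, k, v = x @ Wq, x @ Wk, x @ Wv
scores = q @ k.T / np.sqrt(C)                 # how strongly each token attends to each other
scores[np.triu(np.ones((T, T), dtype=bool), k=1)] = -np.inf  # causal mask: no looking ahead
att = softmax(scores) @ v                     # each token becomes a mix of earlier tokens

# Normalize (a stand-in for the model's layer normalization), then score the vocabulary.
h = (att - att.mean(-1, keepdims=True)) / (att.std(-1, keepdims=True) + 1e-5)
logits = h @ rng.normal(size=(C, vocab_size))
probs = softmax(logits[-1])                   # the last token's prediction for what comes next
print(probs.shape, round(probs.sum(), 3))     # (27,) 1.0
```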
04
Training
The model learns from examples. See loss decrease as predictions improve over time.
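As a stand-in for the full model, the sketch below trains a toy bigram table with the same loop GPT training uses: measure cross-entropy loss on (token, next token) pairs, compute a gradient, nudge the weights, and watch the loss fall. The training pairs come from the name "emma"; the learning rate is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size = 27
W = rng.normal(scale=0.1, size=(vocab_size, vocab_size))  # W[cur] = scores for the next token

pairs = [(4, 12), (12, 12), (12, 0), (0, 26)]  # (current, correct next) token ids for "emma\n"
lr = 1.0                                       # illustrative learning rate
for step in range(50):
    loss = 0.0
    for cur, nxt in pairs:
        logits = W[cur]
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        loss += -np.log(probs[nxt])   # cross-entropy: low probability on the truth = high loss
        grad = probs.copy()
        grad[nxt] -= 1.0              # gradient of the loss with respect to the logits
        W[cur] -= lr * grad           # gradient descent: nudge weights to reduce the loss
    if step % 10 == 0:
        print(f"step {step:2d}: avg loss {loss / len(pairs):.3f}")  # the loss should fall
```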
05
Inference
A trained model generates new names. Control creativity with temperature sampling.
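Temperature sampling in miniature: divide the model's output scores (logits) by a temperature before turning them into probabilities. The logits below are made up; low temperature concentrates probability on the top choice, high temperature spreads it out.

```python
import numpy as np

def sample_next(logits: np.ndarray, temperature: float, rng) -> int:
    """Pick the next token id; lower temperature = safer, higher = more creative."""
    scaled = logits / temperature             # temperature rescales the scores
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()                      # softmax: scores -> probabilities
    return int(rng.choice(len(probs), p=probs))

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.5, 0.1])       # made-up scores over a tiny vocabulary
for t in (0.5, 1.0, 2.0):
    picks = [sample_next(logits, t, rng) for _ in range(1000)]
    print(f"temperature {t}: top token chosen {picks.count(0) / 10:.0f}% of the time")
```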
Based on microgpt.py by Andrej Karpathy