Phase 5 simulation

Transformer Explorer.

Type a sentence and watch a real GPT-2 small process it inside your browser. See how text gets broken into tokens, watch attention flow between them across the model’s 12 layers and 12 heads, and click Generate next token to grow the sentence one piece at a time.


What you’re seeing

A transformer language model doesn’t see characters or words — it sees tokens: chunks of text from a fixed vocabulary of 50,257 entries. The first row shows your sentence after GPT-2’s byte-pair tokenizer has done its work, including the numeric ID each token has in the vocabulary.
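The merging idea behind byte-pair encoding can be sketched in a few lines. This is a toy illustration only — GPT-2’s real tokenizer works on bytes and applies a fixed table of learned merge ranks, not frequency counts computed on the fly:

```python
from collections import Counter

def bpe_merge_step(tokens):
    """One BPE-style step: fuse the most frequent adjacent pair into a new token."""
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return tokens, None
    (a, b), _ = pairs.most_common(1)[0]
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
            merged.append(a + b)   # replace the pair with one fused token
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged, a + b

tokens = list("the theme then")   # start from individual characters
for _ in range(3):
    tokens, new_token = bpe_merge_step(tokens)
    print(new_token, tokens)      # "th" merges first, then "the"
```

After a few thousand such merges, common words and word fragments become single vocabulary entries — which is why frequent words usually cost one token while rare words split into several.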

Each layer of GPT-2 has twelve attention heads, and each head decides — for every token — how much to attend to each token before it (GPT-2’s attention is causal, so a token never looks ahead). The bipartite graph shows one head’s attention map at a time: line opacity is the attention weight from a token on the left to a token on the right. Different heads pick up different patterns; middle layers tend to be the most visually rich.
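The weights drawn as line opacity come from the standard scaled dot-product attention formula. A minimal sketch of one head, with random stand-in vectors (the names `q`, `k`, `v` and the sizes are illustrative, not GPT-2’s actual dimensions):

```python
import numpy as np

def causal_attention(q, k, v):
    """One head: softmax(Q K^T / sqrt(d)) with a causal mask, then a weighted sum of V."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                  # (T, T) raw affinities
    mask = np.triu(np.ones_like(scores), k=1).astype(bool)
    scores[mask] = -np.inf                          # tokens can't look ahead
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights, weights @ v

rng = np.random.default_rng(0)
T, d = 5, 8                                         # 5 tokens, head dimension 8
q, k, v = (rng.standard_normal((T, d)) for _ in range(3))
w, out = causal_attention(q, k, v)
```

Each row of `w` is one token’s attention distribution — exactly what one column of lines in the bipartite graph visualizes.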

The bar chart is the model’s probability distribution over the next token. Temperature reshapes the same logits — low temperature sharpens the distribution toward the top candidate, high temperature flattens it toward uniform. Click Generate to sample a token from that distribution and watch the model continue.
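Temperature works by dividing the logits before the softmax, then sampling from the result. A small self-contained sketch (the logit values are made up for illustration):

```python
import math, random

def softmax_with_temperature(logits, temperature):
    """Divide logits by temperature, then softmax: T < 1 sharpens, T > 1 flattens."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                                # subtract max for stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 1.0, 0.5]                      # hypothetical next-token scores
cold = softmax_with_temperature(logits, 0.3)       # confident: mass piles on one token
hot = softmax_with_temperature(logits, 2.0)        # chaotic: mass spreads out

# Sampling one token index, weighted by the distribution:
token = random.choices(range(len(logits)), weights=hot)[0]
```

At low temperature the same logits put nearly all probability on the top token; at high temperature the runners-up get a real chance, which is why generations turn erratic.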