The Context Window

A finite desk

On turn forty of a long chat, your agent suddenly forgets the user's name — or the API rejects the request outright with "too many tokens." Nothing broke in your code. You simply tried to put more on the desk than the desk can hold. Every model has a hard limit, and once the conversation crosses it, something has to come off.

The context window is that desk: a fixed amount of room measured in tokens. Everything the model reads on a turn — the system prompt, every past message, the tool results — has to lie on the desk at once, because the model has no memory between calls. It only knows what you hand it this turn. So when the history grows past the budget, you don't get to keep it all; you choose. And the cheap, reliable heuristic is recency: the newest messages usually carry the live intent, so you keep the newest that fit and slide the oldest off the edge.

Concretely, imagine a 100-token desk and four messages: a 30-token system line, a 20-token question, a 25-token reply, and a fresh 40-token question. Walking newest → oldest, you take the 40, then the 25 (running total 65), then the 20 (85). The 30-token system line would push the total to 115, so it falls off. You keep 3 of 4 messages and spend 85 of the 100 tokens. The most recent turn survives; the oldest message, here the system instruction, is sacrificed.
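The walk above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `fit_newest_first` is hypothetical, and the token counts are typed in by hand, where a real agent would measure them with its model's tokenizer.

```python
from typing import List, Tuple


def fit_newest_first(token_counts: List[int], budget: int) -> Tuple[List[int], int]:
    """Return (indices kept, tokens used), walking newest -> oldest.

    token_counts is ordered oldest first, the way a chat history is stored.
    (Hypothetical helper for illustration only.)
    """
    kept: List[int] = []
    used = 0
    for i in range(len(token_counts) - 1, -1, -1):
        if used + token_counts[i] > budget:
            break  # first message that does not fit: it and everything older fall off
        used += token_counts[i]
        kept.append(i)
    kept.reverse()  # report the kept indices in original (oldest-first) order
    return kept, used


# The four messages from the example: system line, question, reply, fresh question.
counts = [30, 20, 25, 40]
kept, used = fit_newest_first(counts, budget=100)
print(kept, used)  # [1, 2, 3] 85
```

One design choice is worth noticing: the loop stops at the first message that does not fit rather than skipping it and trying older ones. That keeps the surviving history a contiguous run of the most recent turns, which matches the "slide the oldest off the edge" picture.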

This is the first lever every agent pulls, and getting it wrong is expensive both ways: keep too much and the request errors out or costs more; drop too aggressively and the agent loses the thread. The later lessons in this track all exist because plain keep/drop is lossy — but you have to master the budget before you can be clever about it.

The exercise uses exactly that conversation: four messages whose tokens sum past the 100-token window. Walk from the newest backward, keeping each message only while the running total still fits the budget, and print which messages stay, which get dropped, and the used/budget total. You are done when the used total never exceeds 100 and the oldest message reads DROP.
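One possible shape for that exercise is sketched here in Python. The `Message` class, the `window_report` function, the role labels, and the output format are all assumptions made for illustration; the lesson only fixes the four token counts, the 100-token budget, and the KEEP/DROP outcome.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    role: str    # illustrative label; the lesson only describes the messages informally
    tokens: int  # hand-assigned count; a real agent would ask a tokenizer


def window_report(messages: List[Message], budget: int) -> List[str]:
    """Label each message KEEP or DROP, then append a used/budget line."""
    keep = [False] * len(messages)
    used = 0
    # Walk newest -> oldest, keeping messages only while the total still fits.
    for i in reversed(range(len(messages))):
        if used + messages[i].tokens > budget:
            break
        used += messages[i].tokens
        keep[i] = True
    lines = [
        f"{'KEEP' if k else 'DROP'} {m.role} ({m.tokens} tokens)"
        for m, k in zip(messages, keep)
    ]
    lines.append(f"used {used}/{budget}")
    return lines


conversation = [
    Message("system", 30),
    Message("user", 20),
    Message("assistant", 25),
    Message("user", 40),
]
report = window_report(conversation, budget=100)
print("\n".join(report))
```

With these inputs the report marks the system message DROP, the other three KEEP, and ends with `used 85/100`, which satisfies both completion conditions.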

Context isn't free or infinite. Fitting the window — deciding what to keep on the desk and what to let fall away — is the first real memory skill.

This is 1 of 50 lessons.