Memory · Free preview

Summarize to Remember

Compress the past

When a conversation history outgrows its budget, you compact it: collapse the oldest turns into a one-line summary and keep the most recent turns verbatim — trading exact wording you no longer need for room to keep going.

In the first lesson you fit the window by dropping the oldest messages off the desk. That works until it doesn't: drop the turn where the user said "book a flight to Tokyo" and the agent later confirms a booking to nowhere. Plain keep/drop is lossy in the worst way — it throws out facts you still need just because they're old. The conversation can't grow forever, but the past isn't all noise.
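To make that failure concrete, here is a minimal sketch of plain keep/drop. The helper name `dropOldest`, the six illustrative turns, and the budget of 2 are assumptions chosen for this example, not the first lesson's actual code.

```javascript
// Hypothetical helper: keep only the newest `budget` turns, discard the rest.
function dropOldest(history, budget) {
  if (budget <= 0) return []; // guard: slice(-0) would return everything
  return history.slice(-budget);
}

const turns = [
  "say hello to the user",
  "ask what they need",
  "look up flight options",
  "book a flight to Tokyo",
  "window seat please",
  "confirm the booking",
];

// With a budget of 2 (chosen for illustration), the destination is dropped.
const kept = dropOldest(turns, 2);
console.log(kept); // [ 'window seat please', 'confirm the booking' ]
console.log(kept.some((t) => t.includes("Tokyo"))); // false
```

The agent is left holding "confirm the booking" with no record of where the flight goes.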

Compaction is the smarter trade. Once the history grows past your budget, you don't delete the oldest turns — you collapse them into a single summary line and keep the most recent turns word-for-word. The agent still knows earlier work happened and roughly what it was, but it stops paying full token price for every old turn. You're swapping high-fidelity wording you no longer need for the room to keep going. The recent turns stay verbatim because that's where the live, exact detail lives.

Take a six-turn history (say hello, ask what they need, look up flights, book a flight to Tokyo, window seat please, confirm the booking) with budget = 4 and keepRecent = 3. Compaction folds the oldest three turns into one line, "summary: 3 earlier turns", and leaves the last three intact. Six lines become four: the agent still knows earlier work happened, and it keeps the exact recent intent it must act on. In a real agent that summary line would be a model-written sentence (for example, that it greeted the user and looked up flights); here you build the mechanism that decides what gets folded and what stays.
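The worked example above can be sketched as a small function. This is one possible shape, not the graded solution: the name `compact` and the choice to return an untouched copy when the history already fits are assumptions, and it expects `keepRecent` to be a non-negative integer.

```javascript
// Hypothetical helper: fold the oldest turns into one summary line
// and keep the last `keepRecent` turns verbatim.
function compact(history, budget, keepRecent) {
  // Under or at budget: nothing to compact, return a copy.
  if (history.length <= budget) return history.slice();

  // Number of oldest turns that get folded into the summary.
  const foldCount = history.length - keepRecent;
  if (foldCount <= 0) return history.slice(); // nothing old enough to fold

  const recent = history.slice(foldCount); // the verbatim tail
  return [`summary: ${foldCount} earlier turns`, ...recent];
}

const turns = [
  "say hello to the user",
  "ask what they need",
  "look up flight options",
  "book a flight to Tokyo",
  "window seat please",
  "confirm the booking",
];

for (const line of compact(turns, 4, 3)) console.log(`turn: ${line}`);
// turn: summary: 3 earlier turns
// turn: book a flight to Tokyo
// turn: window seat please
// turn: confirm the booking
```

The placeholder summary only counts the folded turns; a real agent would swap that string for a model-written sentence while keeping the same fold boundary.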

This is the pattern that lets agents run for hours instead of minutes. Get the boundary wrong — summarize too much and you lose the live request; too little and you blow the budget — and the agent either forgets the task or chokes.
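One way to reason about that boundary: after compaction the history holds one summary line plus `keepRecent` verbatim turns, so it fits only when `keepRecent + 1 <= budget`. The sketch below checks this for a six-turn history with a budget of 4; the helper name `compactedLength` is an assumption made for illustration.

```javascript
// Hypothetical helper: how many lines remain after compaction.
function compactedLength(totalTurns, budget, keepRecent) {
  if (totalTurns <= budget) return totalTurns; // already fits, nothing folded
  const folded = totalTurns - keepRecent;
  if (folded <= 0) return totalTurns; // nothing old enough to fold
  return 1 + keepRecent; // one summary line + the verbatim tail
}

const totalTurns = 6;
const budget = 4;

for (let keepRecent = 1; keepRecent <= 5; keepRecent++) {
  const length = compactedLength(totalTurns, budget, keepRecent);
  const verdict = length <= budget ? "fits" : "over budget";
  console.log(`keepRecent=${keepRecent}: ${length} lines, ${verdict}`);
}
// keepRecent=1: 2 lines, fits
// keepRecent=2: 3 lines, fits
// keepRecent=3: 4 lines, fits
// keepRecent=4: 5 lines, over budget
// keepRecent=5: 6 lines, over budget
```

Small values fit easily but push live turns such as "book a flight to Tokyo" into the summary; values above `budget - 1` keep everything verbatim and still overflow.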

Below is exactly that history and budget. When history.length exceeds budget, replace the oldest (history.length - keepRecent) turns with a single "summary: N earlier turns" line, keep the last keepRecent turns verbatim, then print the compacted history. Done means one summary line counting 3 earlier turns, followed by the three most recent turns unchanged.

Compaction isn't deletion — it's a trade: you swap exact wording you no longer need for room to keep going.

In the full academy, you write and run this — live, graded:

// A conversation that has grown past what fits in the budget.
const history = [
  "say hello to the user",
  "ask what they need",
  "look up flight options",
  "book a flight to Tokyo",
  "window seat please",
  "confirm the booking",
];

// We can only afford to keep this many turns of context.
const budget = 4;
// How many of the most-recent turns to keep word-for-word.
const keepRecent = 3;

// 🗜️ Compact the history so it fits the budget.
// If history is over budget, collapse the OLDEST turns into a single
// "summary: N earlier turns" line, then keep the last keepRecent verbatim.
// TODO: replace this — right now it just keeps everything.
const compacted = history.slice();

for (const line of compacted) console.log(`turn: ${line}`);


This is 1 of 50 lessons.
