Retrieval · Free preview

Chunking a Document

Cut knowledge into pieces

Chunking splits a long document into small retrievable units — packed by whole sentences so each unit stays under a size budget without cutting a sentence in half.

Last lesson you decided when a question needs grounding. Now you have to make the documents retrievable — and you can't just hand the model a 40-page handbook. A retriever fetches pieces, and the model has a finite context window, so a whole document is both too big to embed as one unit and too coarse to rank: a refund question would drag in the entire file, shipping and hours and all. So before you search anything, you chunk it — cut the text into small units that each fit a size budget.

The catch is where you cut. Slice by raw character count and you get garbage like "...refund within 30 da" — a fragment that means nothing on its own and matches nothing cleanly. The fix is to respect sentence boundaries: pack whole sentences into a chunk, adding them one at a time, and only seal the chunk and start a fresh one when the next sentence would push it past the limit. Every unit stays a complete thought.
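
To see the problem concretely, here is a minimal sketch of the naive approach. The helper `sliceByChars`, the sample `policy` text, and the 37-character size are all made up for illustration; none of them come from the lesson's code.

```javascript
// Hypothetical helper: cut text every `size` characters,
// paying no attention to sentence boundaries.
function sliceByChars(text, size) {
  const pieces = [];
  for (let i = 0; i < text.length; i += size) {
    pieces.push(text.slice(i, i + size));
  }
  return pieces;
}

// Made-up sample text for illustration only.
const policy = "You can request a refund within 30 days. Shipping takes 5 days.";

const pieces = sliceByChars(policy, 37);
pieces.forEach((p, i) => console.log(`piece ${i + 1}: ${JSON.stringify(p)}`));
// piece 1: "You can request a refund within 30 da"
// piece 2: "ys. Shipping takes 5 days."
```

The first piece ends in "30 da" and the second opens with "ys.", so neither one carries the refund window on its own.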

Walk through the worked example. With maxWords = 10 and the text "The cat sat on the mat. The dog ran fast. Birds fly high in the sky. Fish swim deep.", you add "The cat sat on the mat." (6 words), then "The dog ran fast." (4 words). That totals 10, right at the budget. The next sentence, "Birds fly high in the sky." (6 words), would overflow, so you seal chunk 1 with the first two sentences and open chunk 2, which ends up holding the birds and fish sentences (6 + 3 = 9 words). Notice the break landed between sentences, never inside one.
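
The same walk can be traced in code. This sketch assumes the sentences are already split, and `countWords`, `running`, and `trace` are illustrative names rather than part of the lesson's starter code. It only tracks the running word count and which chunk each sentence lands in; it does not build the chunks themselves.

```javascript
const sentences = [
  "The cat sat on the mat.",
  "The dog ran fast.",
  "Birds fly high in the sky.",
  "Fish swim deep.",
];
const maxWords = 10;

// Count whitespace-separated words in one sentence.
const countWords = (s) => s.trim().split(/\s+/).length;

let running = 0; // words already in the open chunk
let chunkNo = 1; // which chunk is currently open
const trace = [];

for (const s of sentences) {
  const n = countWords(s);
  // If this sentence would overflow a non-empty chunk, start a new one.
  if (running > 0 && running + n > maxWords) {
    chunkNo += 1;
    running = 0;
  }
  running += n;
  trace.push({ chunk: chunkNo, words: n, running });
  console.log(`chunk ${chunkNo} | +${n} words -> ${running}/${maxWords} | ${s}`);
}
// chunk 1 | +6 words -> 6/10 | The cat sat on the mat.
// chunk 1 | +4 words -> 10/10 | The dog ran fast.
// chunk 2 | +6 words -> 6/10 | Birds fly high in the sky.
// chunk 2 | +3 words -> 9/10 | Fish swim deep.
```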

Why it matters: chunk too small and a single answer gets split across two pieces, so neither ranks well; chunk too large and every retrieval drags in irrelevant text that crowds the context and dilutes the match. Sentence-packing to a budget is the cheap, sturdy default.

Below, the document is already split into sentences for you. Walk them in order, greedily fill a chunk up to maxWords words, and when the next sentence won't fit, seal the current chunk and begin a new one. Print each chunk as "chunk N: " followed by its text. You are done when the output shows two chunks, each made of whole sentences and neither over budget.

A chunk is the smallest thing your agent can retrieve — make each one a complete thought, not a fragment.

In the full academy, you write and run this — live, graded:

// A retriever can only fetch small pieces, so we cut the document into chunks.
const doc =
  "The cat sat on the mat. The dog ran fast. Birds fly high in the sky. Fish swim deep.";
const maxWords = 10; // each chunk must hold <= this many words

// Split the doc into sentences (each keeps its trailing period).
const sentences = doc.match(/[^.]+\./g).map((s) => s.trim());
const wordCount = (s) => s.split(/\s+/).length;

// TODO: pack WHOLE sentences greedily into chunks of <= maxWords words.
// Right now we just dump the entire document as one chunk — fix that.
const chunks = [doc];

chunks.forEach((c, i) => console.log(`chunk ${i + 1}: ${c}`));
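
One possible way to fill in the TODO, offered as a sketch rather than the academy's reference solution. `packSentences` is a hypothetical helper name, and the `|| []` guard for text with no periods is an addition to the starter code. One policy choice here is also an assumption: a single sentence longer than maxWords is kept whole in its own chunk instead of being cut.

```javascript
const doc =
  "The cat sat on the mat. The dog ran fast. Birds fly high in the sky. Fish swim deep.";
const maxWords = 10; // each chunk should hold <= this many words

// Split into sentences; fall back to an empty list if nothing matches.
const sentences = (doc.match(/[^.]+\./g) || []).map((s) => s.trim());
const wordCount = (s) => s.split(/\s+/).length;

// Hypothetical helper: greedily pack whole sentences into chunks.
function packSentences(sentenceList, budget) {
  const packed = [];
  let current = []; // sentences in the open chunk
  let currentWords = 0; // word total of the open chunk

  for (const sentence of sentenceList) {
    const n = wordCount(sentence);
    // Seal the open chunk when the next sentence would not fit.
    // The length check keeps an oversized sentence from producing
    // an empty chunk: it simply gets a chunk of its own (assumption).
    if (current.length > 0 && currentWords + n > budget) {
      packed.push(current.join(" "));
      current = [];
      currentWords = 0;
    }
    current.push(sentence);
    currentWords += n;
  }

  // Flush whatever is left in the final open chunk.
  if (current.length > 0) packed.push(current.join(" "));
  return packed;
}

const chunks = packSentences(sentences, maxWords);
chunks.forEach((c, i) => console.log(`chunk ${i + 1}: ${c}`));
// chunk 1: The cat sat on the mat. The dog ran fast.
// chunk 2: Birds fly high in the sky. Fish swim deep.
```

The comparison is `>` rather than `>=`, so a chunk may land exactly on the budget, as chunk 1 does at 10 words.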

This is 1 of 50 lessons.
