Cut knowledge into pieces
Last lesson you decided when a question needs grounding. Now you have to make the documents retrievable — and you can't just hand the model a 40-page handbook. A retriever fetches pieces, and the model has a finite context window, so a whole document is both too big to embed as one unit and too coarse to rank: a refund question would drag in the entire file, shipping and hours and all. So before you search anything, you chunk it — cut the text into small units that each fit a size budget.
The catch is where you cut. Slice by raw character count and you get garbage like "...refund within 30 da" — a fragment that means nothing on its own and matches nothing cleanly. The fix is to respect sentence boundaries: pack whole sentences into a chunk, adding them one at a time, and only seal the chunk and start a fresh one when the next sentence would push it past the limit. Every unit stays a complete thought.
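To see the problem concretely, here is a minimal sketch of raw character slicing. The sample text, the helper name slice_by_chars, and the width of 37 are all made up for illustration; they are not part of the lesson's exercise.

```python
# Hypothetical sample text, chosen only to illustrate the failure mode.
text = "You may request a refund within 30 days of purchase. Shipping takes 5 days."


def slice_by_chars(text: str, width: int) -> list[str]:
    """Cut text every `width` characters, ignoring word and sentence boundaries."""
    return [text[i:i + width] for i in range(0, len(text), width)]


pieces = slice_by_chars(text, 37)
for piece in pieces:
    print(repr(piece))
# The first piece is 'You may request a refund within 30 da':
# the cut lands in the middle of the word "days".
```

No text is lost, since the pieces join back into the original, but the first piece is not a complete thought and the second starts mid-word.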
Walk through the worked example. With maxWords = 10 and the text "The cat sat on the mat. The dog ran fast. Birds fly high in the sky. Fish swim deep.", you add "The cat sat on the mat." (6 words), then "The dog ran fast." (4), which makes 10, right at the budget. The next sentence, "Birds fly high in the sky." (6), would overflow, so you seal chunk 1 with those two and open chunk 2 for the birds and fish (6 + 3 = 9 words). Notice the break landed between sentences, never inside one.
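The walk-through can be sketched in code roughly as follows. This is one possible shape, not the only correct one: the function name pack_sentences and the snake_case max_words (standing in for maxWords) are assumptions, and words are counted by splitting on whitespace.

```python
def pack_sentences(sentences: list[str], max_words: int) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_words words."""
    chunks: list[str] = []
    current: list[str] = []   # sentences in the chunk being filled
    current_words = 0         # word count of the chunk being filled

    for sentence in sentences:
        n = len(sentence.split())  # whitespace-separated word count
        # Seal the current chunk if this sentence would push it past the budget.
        if current and current_words + n > max_words:
            chunks.append(" ".join(current))
            current = []
            current_words = 0
        current.append(sentence)
        current_words += n

    # Seal whatever is left over after the last sentence.
    if current:
        chunks.append(" ".join(current))
    return chunks


sentences = [
    "The cat sat on the mat.",
    "The dog ran fast.",
    "Birds fly high in the sky.",
    "Fish swim deep.",
]

chunks = pack_sentences(sentences, max_words=10)
for i, chunk in enumerate(chunks, start=1):
    print(f"chunk {i}: {chunk}")
# chunk 1: The cat sat on the mat. The dog ran fast.
# chunk 2: Birds fly high in the sky. Fish swim deep.
```

One design choice in this sketch: a single sentence longer than max_words is kept whole in its own chunk rather than being cut, so that chunk can exceed the budget. Whether to allow that or split further is a decision the lesson leaves open.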
Why it matters: chunk too small and a single answer gets split across two pieces, so neither ranks well; chunk too large and every retrieval drags in irrelevant text that crowds the context and dilutes the match. Sentence-packing to a budget is the cheap, sturdy default.
Below, the document is already split into sentences for you. Walk them in order, greedily fill a chunk up to maxWords words, and when the next sentence won't fit, seal the current chunk and begin a new one. Print each as chunk N: .... Done means two chunks, each whole-sentence, neither over budget.
A chunk is the smallest thing your agent can retrieve — make each one a complete thought, not a fragment.