Frozen knowledge needs grounding
A support user asks your agent "what's our refund window?" and it answers, with total confidence, "14 days." Your actual policy is 30. Nobody lied — the model simply never saw your handbook. Its weights froze the instant training ended, and the most dangerous thing about that snapshot is that the model can't feel its own edge: it fills the gap with the most plausible-sounding number and moves on.
Inside the snapshot live the timeless, public facts — the boiling point of water, who painted the Mona Lisa — and the model recalls those beautifully. Three kinds of question fall outside it. Anything private: our Q3 board deck, your internal pricing. Anything recent: today's deploy, this quarter's numbers — events that happened after the training cutoff. Anything document- specific: the contract attached to this very message. The model was trained on none of these, so the correct move isn't to recall — it's to retrieve the real text and hand it over.
The skill that comes before any of that is triage: looking at a question and
knowing which side of the line it sits on. Take "What did our Q3 board deck
conclude?" — the word our points at a private file the model has never read, so
it's needs-retrieval. Compare "What is the chemical symbol for gold?" — a fixed
public fact, safely in-model. Get this wrong and you either retrieve when you
didn't need to (slow, wasteful) or trust memory when you shouldn't (a confident,
checkable lie reaches the user).
Below is a list of questions. Write classify to return "needs-retrieval" for the
ones that reach past frozen memory — scan for signals like our, today, latest,
attached — and "in-model" for the timeless facts. Print <question> -> <verdict>
for each. Done means every private/recent/document question is flagged for grounding
and no general fact is.
Retrieval isn't about making the model smarter — it's about feeding it the facts it was never trained on, so it can stop guessing and start grounding.