Test behavior, not just strings
Exact match scored you 75% last lesson — but it can't grade free-form replies at all. A billing agent that says "Your refund of $129.50 will arrive in 5 business days" is a great answer, but it will never equal any single gold string you write down: the wording and order drift run to run. Hold a free-form reply to byte-equality and every good answer fails. You don't care about the exact string here — you care about a property of it.
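To see that brittleness concretely, here is a minimal sketch. The gold string and the second reply are made up for illustration; both replies would satisfy a customer, yet only one survives byte-equality.

```javascript
// Hypothetical gold string and two replies; both replies are correct answers.
const gold = "Your refund of $129.50 will arrive in 5 business days";
const replies = [
  "Your refund of $129.50 will arrive in 5 business days",
  "We've issued a $129.50 refund; expect it within 5 business days.",
];

// Exact match: passes only on byte-for-byte equality with the gold string.
const exactMatch = (reply) => reply === gold;

const results = replies.map(exactMatch);
console.log(results); // [ true, false ]
```

The second reply carries the same amount and the same timeline, and exact match still marks it wrong.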
An assertion eval checks that property directly. Instead of output === gold, you
ask narrower yes/no questions: does the reply contain "refund"? Does it match a
$00.00 price format? Is the extracted total in range? Each is a boolean, and the
reply passes only if it satisfies the ones that matter. This is exactly how this
academy grades your code — not by exact text, but by checking your output's shape.
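Those yes/no questions can be sketched as three small functions. This is a minimal sketch, not the academy's actual grader: the helper extractAmount and the 0 to 100 range are assumptions chosen for illustration, and the reply is the refund sentence from earlier.

```javascript
// Three assertion helpers; each answers one yes/no question about a reply.
const contains = (reply, needle) => reply.includes(needle);
const matchesFormat = (reply, pattern) => pattern.test(reply);
// Two explicit comparisons: a chained lo <= n <= hi misbehaves in JavaScript.
const inRange = (n, lo, hi) => lo <= n && n <= hi;

const reply = "Your refund of $129.50 will arrive in 5 business days";

// Hypothetical helper: pull the first dollar amount out of the reply so the
// range check has a number to work with. Returns NaN when no amount is found.
const extractAmount = (text) => {
  const m = text.match(/\$(\d+\.\d{2})/);
  return m ? Number(m[1]) : NaN;
};

// Each entry is a boolean; the 0..100 range is an assumed limit.
const checks = {
  containsRefund: contains(reply, "refund"),
  priceFormat: matchesFormat(reply, /\$\d+\.\d{2}/),
  amountInRange: inRange(extractAmount(reply), 0, 100),
};

for (const [name, ok] of Object.entries(checks)) {
  console.log(`${name}: ${ok ? "PASS" : "FAIL"}`);
}
```

Keeping each check as its own named boolean means a failing reply tells you which property broke, not just that something did.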
Take that refund reply. contains(reply, "refund") → the substring is there, PASS.
matchesFormat(reply, /\$\d+\.\d{2}/) → $129.50 fits the pattern, PASS. But
inRange(129.5, 0, 100) → 129.5 is above the upper bound of 100, FAIL, and a real assertion
has to say so, not wave it through. That honest FAIL is the whole value: assertions flag
"refund went out, but for too much money," a defect exact match can't even see.
Below, that agent answered a billing question and three assertion functions are stubbed
to always return true — rubber-stamping every check. Fill them in so they test the
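reply for real: contains runs .includes, matchesFormat runs regex.test, and inRange
checks lo <= n && n <= hi (a chained lo <= n <= hi compares a boolean to hi in
JavaScript, so 129.5 would slip through). You are done when the two passes stay PASS and
the range check turns to FAIL.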
A good eval is a question with a yes/no answer about the output: substring, format, range. String equality is just one more assertion, and the most brittle of them all.