Re-rank the Results

A sharper second pass

First-pass keyword retrieval optimizes for recall: it gathers the candidates. Putting the best answer on top takes precision, so a second pass re-scores those candidates with a sharper signal (such as an exact-phrase bonus) and reorders the top results.

Your keyword retriever from earlier is fast and honest, but blunt: it scores a chunk by how many query words it contains, regardless of whether those words land together. Ask it about the return policy and the top candidate it hands back, c1, is a chunk that happens to sprinkle return, policy, and refund across a sentence about shipping — five scattered word-hits, keywordScore 5. Meanwhile c2, the chunk that literally opens "Our return policy..." — the exact thing the user asked for — sits at keywordScore 4, a rank below. First-pass retrieval optimized for recall: it pulled in everything plausibly relevant. It did not optimize for precision: putting the single best chunk on top.
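To make the bluntness concrete, here is a minimal sketch of a word-count scorer. The retriever from the earlier lesson is not shown here, so countWordHits is a hypothetical stand-in, not the scorer that produced the keywordScore values above. It only illustrates that counting word hits cannot tell scattered words from an adjacent phrase.

```javascript
// Hypothetical stand-in for a first-pass keyword scorer (an assumption, not the
// lesson's actual retriever): count how many of the query's words occur in the text.
function countWordHits(text, query) {
  const words = new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
  const queryWords = query.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return queryWords.filter((w) => words.has(w)).length;
}

// Same two query words in both sentences; only one contains the phrase itself.
const scattered = "Our policy is that you may return the box.";
const together = "Our return policy covers the box.";

console.log(countWordHits(scattered, "return policy")); // 2
console.log(countWordHits(together, "return policy")); // 2
```

Both sentences score the same, which is exactly the gap a second pass is meant to close.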

That's the job of a re-ranker: a cheap second pass that re-scores the candidates the first pass already found, using a signal too expensive or too sharp to apply to the whole corpus. You never re-search — you only reorder the short list you already have. Here the sharper signal is the exact phrase: a chunk that contains the literal string return policy is almost certainly more on-topic than one that merely scatters those words, so you give it a bonus. Combined score = keywordScore + (contains exact phrase ? 2 : 0). Re-sort by that, and the ranking changes.
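The combined-score formula can be sketched as a small function. The names combinedScore and PHRASE_BONUS are illustrative, and lowercasing both sides before matching is an assumption: the lesson does not say how case should be handled.

```javascript
const PHRASE_BONUS = 2; // the +2 bonus from the formula above

// Combined score = keywordScore + bonus when the text contains the exact phrase.
// Lowercasing both sides is an assumption; the lesson does not specify case handling.
function combinedScore(candidate, query) {
  const hasPhrase = candidate.text.toLowerCase().includes(query.toLowerCase());
  return candidate.keywordScore + (hasPhrase ? PHRASE_BONUS : 0);
}

// Two made-up candidates: one scatters the words, one contains the phrase.
const scatteredHit = { text: "Our policy on returns and shipping.", keywordScore: 5 };
const phraseHit = { text: "Our return policy lasts 30 days.", keywordScore: 4 };

console.log(combinedScore(scatteredHit, "return policy")); // 5
console.log(combinedScore(phraseHit, "return policy")); // 6
```

A plain substring check is enough here because the bonus only asks whether the literal phrase appears, not where or how often.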

Walk it. c1 has keywordScore 5 but no exact phrase → stays 5. c2 has keywordScore 4 and contains "return policy" → 4 + 2 = 6. c3 has keywordScore 3 and also contains the phrase → 3 + 2 = 5. c4 is about store hours → 1. Sort descending (ties keep original order) and the top three flip to c2 (6), c1 (5), c3 (5). The chunk a human would have picked is now rank 1 — and notice c1 didn't vanish, it just got out-precised by the chunk that says the exact thing.
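The arithmetic above can be checked with a short sketch. The scores and phrase flags are copied from the walk-through; the variable names (walk, hasPhrase, reranked) are illustrative. It relies on Array.prototype.sort being stable, which the language guarantees since ES2019, so tied candidates keep their original order.

```javascript
// Values copied from the walk-through; hasPhrase records whether the chunk
// contains the exact phrase "return policy".
const walk = [
  { id: "c1", keywordScore: 5, hasPhrase: false },
  { id: "c2", keywordScore: 4, hasPhrase: true },
  { id: "c3", keywordScore: 3, hasPhrase: true },
  { id: "c4", keywordScore: 1, hasPhrase: false },
];

// Re-score, then sort highest first. The sort is stable, so c1 and c3
// (both 5) stay in their original relative order.
const reranked = walk
  .map((c) => ({ id: c.id, score: c.keywordScore + (c.hasPhrase ? 2 : 0) }))
  .sort((a, b) => b.score - a.score);

const topThree = reranked.slice(0, 3).map((c) => `${c.id} (${c.score})`);
console.log(topThree.join(", ")); // c2 (6), c1 (5), c3 (5)
```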

Why it matters: this two-stage shape (wide, cheap recall, then a sharper re-rank that stays cheap because it only sees the short list) is how real retrieval systems get both coverage and a good top result without running the expensive signal over millions of documents. The first pass decides what's in the running; the re-ranker decides what wins.
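One way to picture the two-stage shape is a function that takes both scorers as parameters. This is a sketch under stated assumptions, not a production design: retrieveThenRerank, cheapScore, sharpScore, and k are hypothetical names, and dropping zero-score documents in the first pass is a choice made for this illustration.

```javascript
// Two-stage retrieval sketch. All names here are hypothetical.
// cheapScore runs on every document; sharpScore runs only on the k survivors.
function retrieveThenRerank(corpus, cheapScore, sharpScore, k) {
  // Stage 1: score everything cheaply, keep the top k non-zero matches.
  const shortList = corpus
    .map((doc) => ({ doc, first: cheapScore(doc) }))
    .filter((entry) => entry.first > 0)
    .sort((a, b) => b.first - a.first)
    .slice(0, k);

  // Stage 2: add the sharper signal and reorder only the short list.
  return shortList
    .map((entry) => ({ doc: entry.doc, score: entry.first + sharpScore(entry.doc) }))
    .sort((a, b) => b.score - a.score);
}

// Toy usage: count sharpScore calls to see that stage 2 touches only k documents.
const corpus = ["policy on returns", "return policy details", "store hours", "shipping rates", "refund policy"];
let sharpCalls = 0;
const results = retrieveThenRerank(
  corpus,
  (doc) => ["return", "policy"].filter((w) => doc.includes(w)).length,
  (doc) => {
    sharpCalls += 1;
    return doc.includes("return policy") ? 2 : 0;
  },
  2,
);

console.log(results.map((r) => r.doc)); // [ 'return policy details', 'policy on returns' ]
console.log(sharpCalls); // 2
```

The corpus has five documents, but the sharper scorer runs only twice. That ratio is what keeps the second pass affordable as the corpus grows.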

Below you get four candidates with their first-pass keywordScore and the query return policy. The print loop is wired; the re-rank is yours. Re-score each candidate as keywordScore + 2 when its text contains the exact phrase (otherwise just keywordScore), sort by that combined score (highest first, ties keep original order), and print the top three as rank <n>: <id> (score <N>). Done means c2 sits at rank 1: the order differs from what raw keywordScore alone would give.

Recall decides what makes the short list; a re-ranker decides what tops it. The cheapest precision win is often a second pass over candidates you already have.

Starter code for the exercise:

// First-pass keyword retrieval already handed you these CANDIDATES.
// Each has a keywordScore (how well it matched on individual query words).
const query = "return policy";
const candidates = [
  { id: "c1", text: "Send items back within the window; refunds are handled per our policy on returns and shipping.", keywordScore: 5 },
  { id: "c2", text: "Our return policy lets customers return any item within 30 days for a full refund.", keywordScore: 4 },
  { id: "c3", text: "The return policy covers unopened goods and is listed on the receipt.", keywordScore: 3 },
  { id: "c4", text: "Store hours are nine to five on weekdays.", keywordScore: 1 },
];

// Combined score = keywordScore + a +2 bonus when the chunk contains the EXACT
// query phrase. The exact phrase is a sharper signal than scattered word matches.
//
// TODO: re-score every candidate with the combined score, then sort by it
