Build & Eval Your Own Agent

Everything you've built, in one agent

This is the capstone, and the question it answers is the one every demo dodges: how do you know your agent is safe before you ship it? Across this track you built guardrails one at a time: validating the request, neutralizing injected directives, redacting leaked secrets, gating irreversible actions. Here you wire them into a single agent and then prove it works, because in production the order of the pieces is itself the safety property. An agent that runs the tool first and checks the guardrail second has already moved the money before anyone said no.

A real agent is three parts in a deliberate order. A tool does the actual work: here, a calculator that evaluates expressions such as "2 + 2". A guardrail sits in front of it and refuses unsafe requests: here, any request that moves more than $1000. And an eval is the test set that decides whether you trust the result at all: fixed inputs, each paired with the answer it should produce, scored automatically. The tool, the guardrail, and the eval harness are written for you. The wiring, the agent's brain, is not.
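To make the three parts concrete, here is a minimal sketch of what they might look like. The lesson supplies its own versions, so the names calculator, guardrail, evalSet, and LIMIT, the { blocked } return shape, and the single "a op b" expression format are all assumptions made for illustration.

```javascript
// Hypothetical stand-ins for the lesson's provided pieces; the real ones may differ.

// Tool: evaluates a single "a op b" expression without using eval().
function calculator(expr) {
  const match = /^\s*(-?\d+(?:\.\d+)?)\s*([+\-*\/])\s*(-?\d+(?:\.\d+)?)\s*$/.exec(expr);
  if (!match) throw new Error(`unsupported expression: ${expr}`);
  const a = Number(match[1]);
  const b = Number(match[3]);
  switch (match[2]) {
    case "+": return a + b;
    case "-": return a - b;
    case "*": return a * b;
    default: return a / b; // "/" is the only operator left
  }
}

// Guardrail: blocks any request that moves more than the limit.
const LIMIT = 1000; // dollars, taken from the lesson text
function guardrail(req) {
  const amount = req.amount ?? 0; // assumption: no amount means no money moves
  return { blocked: amount > LIMIT };
}

// Eval set: fixed inputs, each paired with the answer it should produce.
const evalSet = [
  { req: { expr: "2 + 2" }, expected: "4" },
  { req: { expr: "10 * 3" }, expected: "30" },
  { req: { amount: 5000, expr: "1 + 1" }, expected: "refused" },
];
```

Nothing here connects the pieces yet. Deciding which one runs first is the part left to you.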

The wiring is one decision: guardrail first, tool second. Walk the eval set. { expr: "2 + 2" } passes the guardrail, hits the calculator, answers "4" — PASS. { expr: "10 * 3" } answers "30" — PASS. But { amount: 5000, expr: "1 + 1" } must never reach the calculator: the guardrail sees 5000 > 1000, blocks, and the agent answers "refused". Get the order backwards and case 3 returns "2" — a correct sum to a request you should never have honored. The eval prints score: 3/3 only when all three behaviors — two computed, one refused — are right. That number is the whole point: not "it felt fine when I tried it," but a repeatable score you can re-run on every change to catch the regression before a user does.
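The walk-through above can be sketched as a small harness that scores two wirings against the same eval set. This is an illustration under assumptions, not the lesson's actual harness: runEval, backwardsAgent, and the compact calculator and guardrail are hypothetical, and answers are returned as strings to match the quoted "4", "30", and "refused" in the text.

```javascript
// Compact hypothetical stand-ins; the lesson provides its own tool and guardrail.
const ops = {
  "+": (a, b) => a + b,
  "-": (a, b) => a - b,
  "*": (a, b) => a * b,
  "/": (a, b) => a / b,
};
function calculator(expr) {
  const [a, op, b] = expr.trim().split(/\s+/); // expects "a op b"
  return ops[op](Number(a), Number(b));
}
const guardrail = (req) => ({ blocked: (req.amount ?? 0) > 1000 });

const evalSet = [
  { req: { expr: "2 + 2" }, expected: "4" },
  { req: { expr: "10 * 3" }, expected: "30" },
  { req: { amount: 5000, expr: "1 + 1" }, expected: "refused" },
];

// Correct wiring: guardrail first, tool second.
function agent(req) {
  if (guardrail(req).blocked) return "refused";
  return String(calculator(req.expr));
}

// One way to get the order wrong: the tool runs, and the guardrail is only
// consulted afterwards, when it can no longer stop anything.
function backwardsAgent(req) {
  const answer = String(calculator(req.expr));
  if (guardrail(req).blocked) console.warn("blocked, but the tool already ran");
  return answer;
}

// Harness: run every case, compare with the expected answer, report a score.
function runEval(agentFn) {
  let passed = 0;
  for (const { req, expected } of evalSet) {
    const got = agentFn(req);
    const ok = got === expected;
    if (ok) passed += 1;
    console.log(`${ok ? "PASS" : "FAIL"} ${JSON.stringify(req)} -> ${got}`);
  }
  const score = `${passed}/${evalSet.length}`;
  console.log(`score: ${score}`);
  return score;
}

runEval(agent);          // score: 3/3
runEval(backwardsAgent); // score: 2/3, the third case answers "2"
```

Because the harness takes the agent as an argument, the same three cases can be re-run unchanged after every edit to the wiring.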

Finish agent(req): run the guardrail first and return "refused" when it blocks; otherwise call the calculator on req.expr and return its answer. Reach score: 3/3.

This is what an agent really is: a small loop wired to a tool, fenced by a guardrail, and trusted only once an eval says 3/3. You just built — and proved — one.

Now build a live one

You just assembled an agent in code. Here's the same idea with a real model at the center: you write the system prompt and pick the tools it can call, then a live agent runs your configuration on a real task, reasoning out loud, calling the tools you equipped, and answering. Write a careful instruction, select the calculator and check_policy tools (the agent needs both to answer fully), and run it. Watch it compute 15% of 240 and refuse the over-limit refund: the loop, the tool, and the guardrail you spent this whole track building, now in a real agent you configured.
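For intuition about what such a configuration drives, here is a rough sketch of the loop with a scripted function standing in for the live model. Everything in it is hypothetical: the config shape, the tool signatures, the $1000 refund limit, the $5000 refund amount, and scriptedModel itself. A real platform supplies its own model and its own interfaces.

```javascript
// Hypothetical configuration: a system prompt plus the tools the agent may call.
const config = {
  systemPrompt:
    "Use calculator for arithmetic. Call check_policy before any refund and refuse if it is not allowed.",
  tools: {
    calculator: ({ expr }) => {
      const [a, op, b] = expr.trim().split(/\s+/); // expects "a op b"
      const x = Number(a);
      const y = Number(b);
      return { "+": x + y, "-": x - y, "*": x * y, "/": x / y }[op];
    },
    check_policy: ({ amount }) => ({ allowed: amount <= 1000 }), // limit is assumed
  },
};

// Stand-in for the model: decides the next action from the transcript so far.
// A live model would choose these steps itself; here they are scripted.
function scriptedModel(transcript) {
  const results = transcript.filter((m) => m.role === "tool").map((m) => m.result);
  if (results.length === 0) return { tool: "calculator", args: { expr: "240 * 0.15" } };
  if (results.length === 1) return { tool: "check_policy", args: { amount: 5000 } };
  const [percent, policy] = results;
  const refund = policy.allowed ? "approved" : "refused";
  return { final: `15% of 240 is ${percent}. Refund ${refund}.` };
}

// The loop: ask the model for an action, run the tool, feed the result back.
function runAgent(cfg, task, model, maxSteps = 5) {
  const transcript = [
    { role: "system", content: cfg.systemPrompt },
    { role: "user", content: task },
  ];
  for (let step = 0; step < maxSteps; step += 1) {
    const action = model(transcript);
    if ("final" in action) return action.final;
    const tool = cfg.tools[action.tool];
    if (!tool) throw new Error(`tool not equipped: ${action.tool}`);
    transcript.push({ role: "tool", name: action.tool, result: tool(action.args) });
  }
  throw new Error("step limit reached without an answer");
}

console.log(runAgent(config, "What is 15% of 240? Then refund $5000.", scriptedModel));
```

Removing check_policy from config.tools makes the run fail with "tool not equipped", which mirrors why the task needs both tools selected.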

This is the door out of the academy: every agent in the Marketplace is exactly this — a prompt, some tools, and a loop. You can build one now.

This is 1 of 50 lessons.