Think, act, observe, repeat
Plan-then-act decides the whole route up front. But some steps can't be planned in advance — they depend on what an earlier action actually returns. ReAct weaves reasoning and action together: the agent thinks one step, calls a tool, observes the real result, and only then thinks about the next move. The observation is the hinge — each new thought reacts to ground truth instead of a guess.
Three pieces make the loop work, and you'll build all three:
- Thought — a short reason for the next move ("I need 12*8 first").
- Action / Observation — call a tool, then read back what it returned. This is what separates ReAct from blind planning: the result re-enters the agent's context.
- Stop condition — the loop repeats, but it must end. When a step is an answer, you emit the final value and break. No stop condition means a loop that spins forever.
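The three pieces above can be sketched as one small loop. This is a minimal illustration, not the lesson's reference code: the names react_loop, policy, tools, and max_steps, along with the tuple step format, are assumptions made for this example. The policy stands in for whatever produces the next thought (a model or a script), and max_steps is an extra safety cap in case the policy never emits an answer.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical step format, assumed for this sketch:
#   ("act", thought, tool_name, tool_input)  -> call a tool
#   ("answer", thought, final_value)         -> stop and return
Step = Tuple[str, ...]


def react_loop(
    policy: Callable[[List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> str:
    history: List[str] = []  # observations seen so far
    for _ in range(max_steps):
        step = policy(history)  # the thought depends on real observations
        kind, thought = step[0], step[1]
        print(f"Thought: {thought}")
        if kind == "answer":  # stop condition: emit the value and leave
            return step[2]
        tool_name, tool_input = step[2], step[3]
        print(f"Action: {tool_name}({tool_input!r})")
        observation = tools[tool_name](tool_input)
        print(f"Observation: {observation}")
        history.append(observation)  # the result re-enters the context
    raise RuntimeError("no answer within max_steps")


def toy_policy(history: List[str]) -> Step:
    # A toy stand-in for the agent: act once, then answer from the observation.
    if not history:
        return ("act", "I need the length of the word first", "length", "hinge")
    return ("answer", f"The tool returned {history[-1]}, so I can answer", history[-1])


toy_tools = {"length": lambda text: str(len(text))}
result = react_loop(toy_policy, toy_tools)
print(f"Answer: {result}")
```

Note that the second thought is built from history[-1], the value the tool actually returned, rather than from anything decided before the loop started.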
Watch this agent work out a date difference with one tool, then feed that real number into a second tool — each thought shaped by the last observation.
Now build the mechanism yourself. The goal is "What is 12 * 8, then add 4?" You're given a fixed calc(expr) tool and a deterministic script of steps. The starter takes one action and quits with Answer: (unknown) because it never observes and never loops. Implement the loop body so it prints each Thought, Action, and Observation in turn, feeds 96 forward into 96+4, and stops on the answer step with Answer: 100.
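For orientation, here is one possible shape for the finished loop. The starter's real data structures are not shown here, so everything below is an assumption: calc is a stand-in for the provided tool, SCRIPT is a hypothetical step list, and the {prev} placeholder is an invented convention for feeding the last observation forward.

```python
import re


def calc(expr: str) -> str:
    # Stand-in for the lesson's calc tool (assumed behavior): accepts only
    # digits, spaces, parentheses, and + - * / before evaluating.
    if not re.fullmatch(r"[0-9+\-*/(). ]+", expr):
        raise ValueError(f"unsupported expression: {expr!r}")
    return str(eval(expr, {"__builtins__": {}}, {}))


# Hypothetical deterministic script; "{prev}" is filled with the last observation.
SCRIPT = [
    {"thought": "I need 12*8 first", "action": "12*8"},
    {"thought": "Now add 4 to the result", "action": "{prev}+4"},
    {"thought": "I have the final value", "answer": "{prev}"},
]


def run(script):
    lines = []
    prev = ""  # most recent observation
    for step in script:
        lines.append(f"Thought: {step['thought']}")
        if "answer" in step:  # stop condition: emit the answer and break
            lines.append(f"Answer: {step['answer'].format(prev=prev)}")
            break
        expr = step["action"].format(prev=prev)  # feed the observation forward
        lines.append(f"Action: calc({expr})")
        prev = calc(expr)
        lines.append(f"Observation: {prev}")
    else:
        # The script ran out without ever reaching an answer step.
        lines.append("Answer: (unknown)")
    return lines


trace = run(SCRIPT)
print("\n".join(trace))
```

Under these assumptions the trace shows Observation: 96 after the first action, Action: calc(96+4) for the second, and ends with Answer: 100. The for/else branch only fires when no answer step is reached, which mirrors the starter's Answer: (unknown) fallback.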
ReAct's power is the observation step: the agent reacts to what the tool actually returned, then loops — and a real stop condition is what turns that loop into an answer instead of an infinite spin.