system, user, assistant
In the last lesson the policy lived in a system prompt. But how does that prompt reach the model alongside the actual conversation? If you mashed it all into one string — "You are a terse travel agent. I want to go somewhere warm. Lisbon. Book it." — the model would have no way to tell its own past words from yours, or to know which sentence is a standing instruction versus a thing someone just said. It would lose the thread, contradict itself, and treat your request as if it had said it. The structure that prevents this is roles.
A model call isn't one flat string — it's an ordered list of messages, each
tagged with a role. The system message carries the standing instructions (last
lesson's policy). user messages are what the human says. assistant messages are
what the model said back. The model reads the whole list top to bottom, and the role
tags are how it parses the dialogue: this line is my instruction, this one is theirs,
this one was me a moment ago.
Concretely, the call here is five messages: system ("You are a terse travel agent.")
leads, then the turns alternate — user "somewhere warm," assistant "Lisbon," user
"Book it," assistant "Booked. Confirmation TRX-1199." This is also how memory works in
later stations: you grow the conversation by appending role-tagged messages, never by
editing one giant string.
Below, the system instruction and the alternating turns are given to you separately.
Assemble them into one list — the system message first, then every turn — and
print it, one line per message as role: text. "Done" is five role: text lines
with system: on top.
A conversation isn't a paragraph; it's a stack of labeled envelopes. The labels are what let the model tell its own voice from yours.