How to Think About Agents
Why a Mental Model Matters
Most of the friction I see when teams adopt LLMs is not technical — it is conceptual. Engineers reach for the model the way they reach for a library: an input goes in, an output comes out, the contract is fixed. That mental model fits a sorting function. It does not fit a system that reads instructions, infers intent, and produces something new every time it is called.
Over the past few months I have had the pleasure of working with Ryan, a startup founder focused on integrating AI into real products. The pattern that has emerged from our collaboration is simple: the teams that get the most out of LLMs are the ones that change how they think about them before they change how they call them.
This article is about that shift in thinking.
Agents Are Not Functions — They Are Collaborators
A useful starting point: an LLM behaves less like a function and more like a smart, eager, slightly forgetful junior teammate.
- A function fails the same way every time. An agent fails differently each time.
- A function does not need context. An agent needs almost nothing but context.
- A function does not improve with examples. An agent does — dramatically.
Once you accept that you are designing for a collaborator instead of a callee, several things follow naturally. You stop writing rigid contracts and start writing briefs. You stop chasing determinism and start designing for variance. You stop measuring "did it return the right shape" and start measuring "did it accomplish the goal."
The shift sounds small. It changes almost every architectural decision downstream.
The Three Levers You Actually Control
When you build with an agent, you control surprisingly few things — and you should be honest about which ones move the needle.
1. Context
Context is the single biggest lever. Most "the model is bad at X" complaints turn out to be "the model was never told what it needed to know about X."
Good context is:
- Specific — a paragraph of the actual document, not a summary of it.
- Relevant — only what is needed for this step, not the whole world.
- Fresh — the current state of the system, not yesterday's snapshot.
A useful test: if you handed the same prompt to a competent stranger, could they do the task? If not, the agent cannot either.
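The stranger test can be made mechanical by gating prompt assembly behind a checklist. A minimal sketch — the `ContextBundle` class, its field names, and the 20,000-character cutoff are all illustrative choices, not a real library's API:

```python
from dataclasses import dataclass, field

@dataclass
class ContextBundle:
    """Everything the agent needs for one step (names are illustrative)."""
    task: str                                   # the specific instruction
    material: str = ""                          # the actual document, not a summary
    state: dict = field(default_factory=dict)   # current system state, not a snapshot

    def stranger_test(self) -> list:
        """Return reasons a competent stranger could NOT do this task."""
        problems = []
        if not self.task.strip():
            problems.append("no task stated")
        if not self.material.strip():
            problems.append("no source material included")
        if len(self.material) > 20_000:
            problems.append("too much material: trim to what this step needs")
        return problems

def build_prompt(ctx: ContextBundle) -> str:
    """Refuse to call the model until the context passes the stranger test."""
    gaps = ctx.stranger_test()
    if gaps:
        raise ValueError(f"context fails the stranger test: {gaps}")
    return f"{ctx.task}\n\n--- MATERIAL ---\n{ctx.material}\n\n--- STATE ---\n{ctx.state}"
```

The useful part is the refusal: a prompt that would confuse a stranger never reaches the model in the first place.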
2. Constraints
The smaller the problem you hand an agent, the better it performs. "Write me a marketing site" is a project. "Rewrite this headline to be under ten words and emphasize speed" is a task. Agents are excellent at tasks and mediocre at projects.
Practically, this means breaking work into the smallest unit that still produces something useful, and letting the agent iterate inside that unit instead of attempting the whole arc at once.
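That decomposition can be expressed directly: a project is just a list of small tasks, and each task gets its own iteration budget. A sketch, assuming `model` is any prompt-to-text callable; the ten-word check stands in for whatever acceptance test a real task would define:

```python
def run_task(model, instruction: str, material: str, max_rounds: int = 3) -> str:
    """Iterate on ONE small task until it passes its acceptance check."""
    draft = model(f"{instruction}\n\n{material}")
    for _ in range(max_rounds - 1):
        if len(draft.split()) <= 10:   # stand-in acceptance test: under ten words
            break
        draft = model(f"Shorten this to under ten words:\n{draft}")
    return draft

def run_project(model, tasks: list) -> list:
    """A project is a sequence of small tasks, each iterated on its own."""
    return [run_task(model, instruction, material) for instruction, material in tasks]
```

The agent never sees "the marketing site"; it only ever sees one headline at a time.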
3. Feedback Loops
The third lever is what happens after the agent produces an output. Did it work? How do you know? Can the agent see the result of its own action and adjust?
A tight feedback loop — run the code, read the error, try again — turns a one-shot generator into something that resembles actual reasoning. A loose feedback loop, where the agent emits a wall of text and a human reads it three hours later, throws away most of the value.
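The run-the-code, read-the-error loop fits in a few lines. A sketch, assuming `model` is a prompt-to-code callable; the retry count and the wording of the repair prompt are arbitrary choices:

```python
import subprocess
import sys

def run_python(code: str) -> tuple:
    """Execute candidate code in a subprocess; return (ok, output_or_error)."""
    proc = subprocess.run([sys.executable, "-c", code],
                          capture_output=True, text=True, timeout=30)
    return proc.returncode == 0, (proc.stdout if proc.returncode == 0 else proc.stderr)

def generate_with_feedback(model, task: str, attempts: int = 3) -> str:
    """Tight loop: generate, run, feed the error back, try again."""
    prompt = task
    for _ in range(attempts):
        code = model(prompt)
        ok, output = run_python(code)
        if ok:
            return code
        prompt = f"{task}\n\nYour last attempt failed with:\n{output}\nFix it."
    raise RuntimeError(f"no working code after {attempts} attempts")
```

The agent sees the actual traceback, not a human's paraphrase of it, and that is what makes the loop tight.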
When to Reach for an Agent (and When Not To)
Not every problem needs an LLM. A heuristic I have come to trust:
| Use an agent when… | Use traditional code when… |
|---|---|
| The input is unstructured (natural language, documents) | The input has a known schema |
| The task requires judgment or interpretation | The task is deterministic |
| The "correct answer" is fuzzy or context-dependent | There is one right answer |
| You need flexibility across many similar-but-different cases | The cases are few and well-defined |
| Iteration and adaptation are valuable | Repeatability and auditability are non-negotiable |
The mistake I see most often is using an agent for a problem that a regex would solve faster, cheaper, and more reliably. The opposite mistake — trying to handcraft rules for something genuinely fuzzy — is just as common and usually more expensive.
Designing the Brief, Not the API
When Ryan and I work through a new feature, we spend most of the time on the brief to the agent, not on the code that calls it. The brief usually has four parts:
- Role — who the agent is in this interaction ("You are reviewing a customer support ticket").
- Goal — what success looks like ("Decide if this needs a human, and if so, why").
- Context — the actual material to work with.
- Examples — one or two demonstrations of the kind of output we want.
That is it. No JSON schemas the model has to fight against. No twelve-step workflows. The structure is in the brief; the freedom is in the response.
When the output is wrong, we do not patch the calling code — we revise the brief. Ninety percent of the time, the brief was the bug.
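The four-part brief is simple enough to assemble with string formatting. A sketch; the section labels are illustrative, and the point is the order of the parts, not the exact markup:

```python
from typing import Optional

def build_brief(role: str, goal: str, context: str,
                examples: Optional[list] = None) -> str:
    """Assemble the four-part brief: role, goal, context, examples."""
    parts = [f"Role: {role}", f"Goal: {goal}", f"Context:\n{context}"]
    for sample_input, sample_output in (examples or []):
        parts.append(f"Example input:\n{sample_input}\nExample output:\n{sample_output}")
    return "\n\n".join(parts)
```

Because the whole brief lives in one place, "revise the brief" is a one-function change rather than an archaeology dig through the calling code.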
Treat Outputs as Drafts, Not Decisions
A subtle but important reframing: an agent's output is a draft. Sometimes it is a draft of code, sometimes a draft of a decision, sometimes a draft of a message. Drafts are useful. Drafts are not authoritative.
The systems that work in production tend to share a shape:
- The agent produces a draft.
- A second pass — another agent, a rule, a human, or the result of running the code — validates it.
- Only validated drafts move forward.
This is the same instinct that produced code review, type checking, and tests. It applies just as cleanly to non-deterministic systems, and it is the cheapest insurance policy you can buy against the failure modes of LLMs.
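The draft-then-validate shape reduces to one small loop, where the validator can be another agent, a rule, or a harness around a human reviewer. A sketch with hypothetical names:

```python
from typing import Callable

def drafts_to_decisions(generate: Callable[[str], str],
                        validate: Callable[[str], bool],
                        task: str, max_drafts: int = 3) -> str:
    """Only validated drafts move forward; unvalidated ones are retried."""
    for _ in range(max_drafts):
        draft = generate(task)
        if validate(draft):
            return draft
    raise ValueError(f"no draft passed validation after {max_drafts} tries")
```

Swapping the validator — a JSON parse, a test suite, a second model — changes the guarantees without touching the generator at all.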
Practical Heuristics
A few rules of thumb that have held up across projects:
- If you are arguing with the model, the model is missing context. Add context before you change tactics.
- Smaller, cheaper models with great prompts beat bigger, more expensive models with sloppy prompts. Almost every time.
- An agent that can see the result of its action is worth ten that cannot. Wire up the feedback loop early.
- Examples beat instructions. When in doubt, show, don't tell.
- Logs are your debugger. Save every prompt and every response. You will read them.
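Saving every prompt and response takes one wrapper. A sketch that appends each exchange to a JSONL file; the path and record shape are choices of convenience, not a standard:

```python
import json
import time
from pathlib import Path

def logged(model, log_path: str = "agent_log.jsonl"):
    """Wrap any prompt -> text callable so every exchange is appended
    to a JSONL file you can grep and replay later."""
    path = Path(log_path)
    def wrapper(prompt: str) -> str:
        response = model(prompt)
        record = {"ts": time.time(), "prompt": prompt, "response": response}
        with path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
        return response
    return wrapper
```

Because the wrapper has the same call signature as the model, it can be dropped into any of the loops above without changing them.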
Conclusion
The leap from "LLM as API" to "agent as collaborator" is the most useful conceptual move you can make when building AI-powered products. It changes what you optimize for, how you debug, and how you decide whether to use the tool at all.
The principles are not new — give clear context, constrain the problem, design feedback loops, treat outputs as drafts. They are the same principles that make working with humans effective. That is not a coincidence. The interesting part of building with agents is not the model; it is the system you put around it.
Build the system as if you were onboarding a thoughtful new teammate, and the agent will rise to meet it.

Written by Roman Khrystynych, founder of Khrystynych Innovations Inc — an AI and Web3 consultancy specializing in multimodal RAG, AI automation, AI training, and smart contract engineering on Ethereum and Solana.
Have a project in mind? Let's talk.