The demo worked. That is usually where the trouble starts.
You wire an AI assistant into a real task, it performs once, and you ship it. Three weeks later it is quietly producing work you have to redo. The output drifts. An instruction you gave on turn one is ignored by turn twelve. Nothing threw an error. The results just got worse, slowly, until someone noticed.
That is not a model failure. The model did what models do. It is a system failure: there was no structure around the model to hold it in place.
This newsletter is a weekly field manual for closing that gap. One issue, one technique, one precise term, one real case. The thesis in a sentence: engineers control AI agents through constraint architecture.
Today is the why. Every issue after this is a how.
Method Deep-Dive: name the three constraints before you run the prompt
The highest-leverage habit in this entire discipline takes thirty seconds. Before you run a prompt, make three things explicit: what you want (Intent), what the model needs to know (Context), and what it must never do (Guardrails). This is the Three-Constraint Rule, and most prompts in the wild are missing the third one entirely.
Here is why the third matters most. A large language model generates by sampling from an enormous space of plausible next tokens. Your prompt narrows that space. State your intent and you cut it down a great deal. Add the relevant context and you cut it down further. But if you never state the guardrails (the actions and outputs that are off-limits), that region stays ungoverned, and the model fills it with its own priors. That is the drift you saw in production. The model is not disobeying you. You left a gap, and the model filled it the way its training distribution told it to.
A minimal system-prompt skeleton that encodes all three constraints:
ROLE: You are a code-fixing assistant for a production Python service.
INTENT
- Fix only the specific defect named in the request. Nothing else.
CONTEXT
- The service is live. Tests run on every commit. Style is enforced by ruff.
- Each request names exactly one file and one defect.
GUARDRAILS
- Edit only the file named in the request. Do not touch any other file.
- Do not refactor, rename, or "improve" code you were not asked to change.
- If the fix genuinely requires changing another file, stop and explain why first.
The guardrail block is the part teams skip and the part that prevents the most expensive failures. Notice that the guardrails are stated as hard boundaries, not preferences: "do not touch any other file," not "try to stay focused." A boundary the model can satisfy or violate is checkable. A preference is a suggestion it will round off under pressure.
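"Checkable" can be taken literally. A minimal sketch of a check for the first guardrail, assuming you can get a list of the paths an agent changed (from a diff, for example). The function name and the sample paths are hypothetical, chosen for illustration:

```python
from pathlib import PurePosixPath


def guardrail_violations(allowed_file: str, changed_files: list[str]) -> list[str]:
    """Return every changed path that is not the one file the request named."""
    allowed = PurePosixPath(allowed_file)
    return [path for path in changed_files if PurePosixPath(path) != allowed]


# Hypothetical change sets an agent might hand back for review.
bounded_run = ["auth/session.py"]
drifting_run = ["auth/session.py", "auth/tokens.py", "billing/client.py"]

print(guardrail_violations("auth/session.py", bounded_run))   # []
print(guardrail_violations("auth/session.py", drifting_run))  # ['auth/tokens.py', 'billing/client.py']
```

An empty list means the boundary held. There is no equivalent function for "try to stay focused," which is the point.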
There is a second discipline hiding in that block: every line does work. No throat-clearing, no "please," no "you are a helpful assistant who loves clean code." That property has a name: Token Density, the share of a prompt that actually encodes constraint rather than padding. High-density prompts are not terse for style; they are terse because each retained sentence removes a class of failure, and every sentence that removes nothing is just extra surface area for the model to misread. When you cut a prompt, you are not trimming words. You are raising the ratio of constraint to noise.
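One rough way to put a number on that ratio, as a sketch only. It assumes you tag each prompt line by hand as constraint or padding (that judgment is yours, not the code's) and it counts whitespace-separated words as a stand-in for tokens, so a real tokenizer would give different figures:

```python
def token_density(lines: list[tuple[str, bool]]) -> float:
    """Share of words that sit on lines tagged as encoding a constraint.

    Each item is (text, encodes_constraint). The tag is a manual judgment.
    """
    total = sum(len(text.split()) for text, _ in lines)
    if total == 0:
        return 0.0
    constraint = sum(len(text.split()) for text, keeps in lines if keeps)
    return constraint / total


# Hypothetical prompts, tagged line by line.
padded = [
    ("You are a helpful assistant who loves clean code.", False),
    ("Edit only the file named in the request.", True),
    ("Thanks so much in advance!", False),
]
dense = [
    ("Edit only the file named in the request.", True),
    ("Do not touch any other file.", True),
]

print(round(token_density(padded), 2))  # 0.36
print(round(token_density(dense), 2))   # 1.0
```

The number itself matters less than the tagging pass: any line you cannot honestly tag as a constraint is a candidate for deletion.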
Constraint Case Study: the same task, twice
A concrete before and after. This is the common pattern, not a named customer, but you have lived some version of it.
Before. An engineer asks an agent to "clean up the auth module." The agent re-formats three files, renames two functions, and rewrites an import that breaks a downstream service. The intent was vague, the context was thin, and the guardrails were absent. Rework took the rest of the afternoon, plus an incident write-up.
After. The same engineer, the same agent, the same model. This time: "In auth/session.py, the token TTL reads from the wrong config key. Fix only that. Change nothing else." One file changed. A five-minute review. Done.
Nothing improved except the prompt, and the prompt did not get more elaborate; it got bounded. Three explicit constraints instead of zero. The cost difference between those two runs was most of a working day. Multiply that by a team and a quarter, and you can see why this is an architecture problem and not a prompting tip.
One subtlety worth stating plainly: the second prompt did not win by being more detailed. It won by being more bounded. "Fix only that. Change nothing else." is not extra information; it is a removed permission. That distinction is the whole game. Elaboration tells the model more about what you want; constraint tells it what it may not do. The first helps; only the second contains the failure. And when a bounded agent hits a case where the fix genuinely does need a second file, the guardrail does not block the work; it forces the model to stop and surface the decision to you, which is exactly where a human belongs.
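The stop-and-surface behavior can also be enforced outside the prompt. A minimal sketch, assuming the agent reports which files it proposes to change along with a reason; `Decision`, `gate`, and the path `config/defaults.py` are hypothetical names used only for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    action: str              # "apply" or "escalate"
    files: tuple[str, ...]   # files to apply, or the files that need sign-off
    reason: str


def gate(allowed_file: str, proposed_files: list[str], agent_reason: str = "") -> Decision:
    """Apply in-bounds work; route anything outside the named file to a human."""
    extra = tuple(f for f in proposed_files if f != allowed_file)
    if not extra:
        return Decision("apply", tuple(proposed_files), "within the named file")
    # Out-of-bounds work is neither applied nor silently dropped.
    return Decision("escalate", extra, agent_reason or "no reason given by the agent")


decision = gate(
    "auth/session.py",
    ["auth/session.py", "config/defaults.py"],
    agent_reason="the TTL key is defined in config/defaults.py",
)
print(decision.action, decision.files)  # escalate ('config/defaults.py',)
```

The gate never decides whether the second file should change. It only makes sure a person does.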
Vocabulary Anchor: Constraint Architecture
Constraint Architecture is the deliberate structure of constraints (intent, context, guardrails, and fallbacks) layered onto the inputs a model receives so its output stays within a bounded, reliable region.
In use: "The prompt failed because there was no constraint architecture behind it, just a request and a hope."
Where it does not apply: open exploration. When you want maximum variance (brainstorming, divergent ideation, a throwaway first draft), heavy constraints work against you. Constraint architecture is the discipline of reliability, not of discovery. Knowing which mode you are in is half the skill. The mistake is not under-constraining a brainstorm; it is under-constraining production.
Architecture Brief: the same move at every scale
The Three-Constraint Rule is constraint architecture at the smallest scale: one person, one prompt. The same move repeats as the system grows. A multi-step workflow needs explicit contracts between its steps. Concretely: a two-step pipeline that drafts code and then reviews it needs the reviewer's contract to state what counts as a pass and what gets rejected. Otherwise the second step inherits the first step's drift and launders it as approval. A multi-agent system needs guardrails on every handoff, because each handoff is a fresh place for drift to enter. An organization running AI across dozens of teams needs governance: the same three questions asked at the level of policy.
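A reviewer's contract for that two-step pipeline might look like the sketch below. It assumes the draft step reports which files it changed and whether tests passed; the class names and the two pass conditions are illustrative choices, not a fixed standard:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewContract:
    allowed_files: frozenset[str]   # what the draft step was permitted to touch
    tests_must_pass: bool = True


@dataclass(frozen=True)
class Draft:
    changed_files: frozenset[str]
    tests_passed: bool


def review(draft: Draft, contract: ReviewContract) -> tuple[bool, list[str]]:
    """Return (passed, rejection reasons). A pass requires zero reasons."""
    reasons: list[str] = []
    out_of_scope = sorted(draft.changed_files - contract.allowed_files)
    if out_of_scope:
        reasons.append(f"changed files outside the contract: {out_of_scope}")
    if contract.tests_must_pass and not draft.tests_passed:
        reasons.append("tests did not pass")
    return (not reasons, reasons)


contract = ReviewContract(allowed_files=frozenset({"auth/session.py"}))
drifted = Draft(changed_files=frozenset({"auth/session.py", "auth/tokens.py"}), tests_passed=True)

print(review(drifted, contract))
# (False, ["changed files outside the contract: ['auth/tokens.py']"])
```

Because the pass condition is written down, a drifted draft is rejected even when its tests are green, which is the case an unconstrained reviewer tends to wave through.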
That scaling axis has a name: the Prompt Maturity Model. Level 1 is one practitioner constraining one prompt. Level 5 is an organization constraining its whole AI surface with measurement and governance in place. Each level up is not a new idea; it is the same discipline applied to a wider system. This newsletter climbs that ladder one rung at a time, and it always shows the rung below so the move stays legible.
Closing Calibration
One thing to try this week. Take the next prompt you are about to run on something that matters. Before you send it, write its Intent, its Context, and its Guardrails as one sentence each. If you cannot name the guardrails, you have just found the exact place it will drift.
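If you would rather have the exercise enforced than remembered, here is a small pre-flight sketch. The function name is hypothetical, and it only checks that each constraint has been written down, not that it is any good:

```python
def preflight(intent: str, context: str, guardrails: str) -> list[str]:
    """Return the names of the constraints that are still blank."""
    parts = {"Intent": intent, "Context": context, "Guardrails": guardrails}
    return [name for name, text in parts.items() if not text.strip()]


missing = preflight(
    intent="Fix the token TTL config key in auth/session.py.",
    context="The service is live and tests run on every commit.",
    guardrails="",
)
print(missing)  # ['Guardrails']
```

A non-empty result is the signal to stop and write the missing sentence before the prompt goes anywhere.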
If you want the full vocabulary underneath this (the 25 precision terms and the Prompt Maturity Model), the Method page is free and linked below. For engineers who want the deployable version, the Agent Control Architecture Pack is the system prompts and the diagnostic kit, not just the concepts.
Next week: the most common way constraint architecture fails in coding agents, why your assistant edits files you never asked it to touch, and the specific mechanism behind it. Take a guess at the cause before it lands. It is not carelessness, and it is not the model being "too eager." It is something with a name.
- The Constraint