Patterns
Patterns are reusable verification structures for agentic systems. Each card names the failure it controls, the mechanism it uses, the observable signal it produces, and the evidence behind it. Two ways in: start from the failure you are seeing, or browse by family.
Start from the failure
First the pain, then the pattern. Each failure lists the patterns to reach for, closest fit first.
- The agent says done, but the work is not actually done. The workflow has no external signal for completion.
- The agent reviews itself and misses obvious problems. The verifier is too close to the generator.
- The same prompt produces different outcomes across runs. Variance is leaking through sampling, state, timing, or judge behavior.
- The check passes because the environment already looked right. The verifier observes a true fact, but not causality.
- Agents disagree, loop, or escalate randomly. The system has no explicit routing policy for uncertainty.
- Tool calls are messy, unsafe, or hard to verify. The boundary between model intent, tool input, and policy is ambiguous.
Browse by family
Context and State
What exists before the model acts. 5 patterns.
- Causal Tag: Stamp every event the agent emits with a stable joinable identifier and, when applicable, a parent identifier, so verification can attribute observed effects to specific agent actions rather than inferring causality from temporal proximity in shared ambient state.
- Constitution: Represent the system's verification criteria as explicit, versioned, machine-readable data, rather than as scattered prompt prose.
- Guardrail Decorator: Wrap a model call, tool call, or other model-output boundary in a policy decorator that can deny, replace, sanitize, or convert errors, so policy lives in code at the boundary the model crosses instead of in prompt prose the model is asked to obey.
- State Baseline: Capture the relevant environment or process state before an action under verification, so the verifier can prove the action caused the observed change instead of accepting state that already existed.
- Trajectory Cursor: Maintain an explicit, structured record of where the agent is in its multi-step process and what happened at each boundary, so the verifier and the next turn can read the trajectory instead of inferring it from chat history or model recall.
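As a rough illustration of how Causal Tag and State Baseline can work together, the sketch below snapshots a toy environment before the agent acts, tags each emitted event, and accepts an effect only if the state changed and a tagged event accounts for it. Everything here is hypothetical and chosen for the example: the `Event` shape, the dict-based environment, and the helper names `capture_baseline`, `write_file`, and `caused_by` are assumptions, not part of any pattern specification.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Event:
    """One agent-emitted event carrying a causal tag (hypothetical shape)."""
    tag: str                # stable, joinable identifier
    parent: Optional[str]   # tag of the event that triggered this one, if any
    action: str
    path: str


def capture_baseline(env: Dict[str, str]) -> Dict[str, str]:
    """Snapshot the relevant state before the action under verification."""
    return dict(env)


def write_file(env: Dict[str, str], log: List[Event], path: str,
               content: str, parent: Optional[str] = None) -> Event:
    """Toy agent action: mutate the environment and log a tagged event."""
    event = Event(tag=uuid.uuid4().hex, parent=parent,
                  action="write_file", path=path)
    env[path] = content
    log.append(event)
    return event


def caused_by(env: Dict[str, str], baseline: Dict[str, str],
              log: List[Event], tag: str, path: str) -> bool:
    """True only if the path changed since the baseline AND the tagged
    event in the log targets that path."""
    changed = env.get(path) != baseline.get(path)
    attributed = any(e.tag == tag and e.path == path for e in log)
    return changed and attributed


# The environment already "looks right" for report.md before the agent acts.
env = {"report.md": "final"}
log: List[Event] = []
baseline = capture_baseline(env)

noop = write_file(env, log, "report.md", "final")            # changes nothing
real = write_file(env, log, "notes.md", "draft", noop.tag)   # real change

print(caused_by(env, baseline, log, noop.tag, "report.md"))  # False
print(caused_by(env, baseline, log, real.tag, "notes.md"))   # True
```

The first check fails even though `report.md` holds the expected content, because the baseline shows that content existed before the action. A real system would likely snapshot only the state the verifier cares about instead of copying the whole environment.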
Verification
How to turn judgment into observable signals. 6 patterns.
- Adversarial Frame: Replace tone-level skepticism instructions with admissibility rules that define what counts as proof, name common shortcut paths to reject, and invert the verifier's default from "accept if plausible" to "fail unless backed by trusted evidence."
- Blind Oracle: Derive expected evidence from the spec, the question, or independent re-execution without conditioning that derivation on the agent's draft, reasoning trace, or shortcut history.
- Comparator: Express verification comparison as a named operator from a finite family, so the verdict is a deterministic function of (expected, observed, operator) rather than a model's interpretation of "does this look right?"
- Delta: Verify the success of an agent's actions by asserting on the change in environment state rather than the absolute environment state.
- Executable Analog: Translate a subjective, language-based verification step into a deterministic, programmatic execution step that yields a binary pass/fail signal independent of the agent's judgment.
- Judge Harness: Wrap an LLM judge in a structural harness of perturbation, repetition, calibration, and reporting so that one judge verdict becomes a measured signal with visible consistency and bias controls.
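The Comparator and Delta descriptions above can be sketched together in a few lines. This is a minimal illustration under stated assumptions: the operator family (`EQUALS`, `CONTAINS`, `AT_LEAST`), the `compare` and `delta` helpers, and the dict snapshots are invented for the example and are not a fixed vocabulary from the patterns themselves.

```python
from enum import Enum
from typing import Any, Dict

# Sentinel so a newly added key whose value is None still counts as a change.
_MISSING = object()


class Op(Enum):
    """A small, closed family of comparison operators (hypothetical set)."""
    EQUALS = "equals"
    CONTAINS = "contains"
    AT_LEAST = "at_least"


def compare(expected: Any, observed: Any, op: Op) -> bool:
    """Verdict as a deterministic function of (expected, observed, operator)."""
    if op is Op.EQUALS:
        return observed == expected
    if op is Op.CONTAINS:
        return expected in observed
    if op is Op.AT_LEAST:
        return observed >= expected
    raise ValueError(f"unsupported operator: {op}")


def delta(before: Dict[str, Any], after: Dict[str, Any]) -> Dict[str, Any]:
    """Keys added or changed between two snapshots (deletions are ignored
    in this sketch)."""
    return {k: v for k, v in after.items() if before.get(k, _MISSING) != v}


before = {"ticket_42": "closed", "tests_passed": 10}
after = {"ticket_42": "closed", "tests_passed": 12}
change = delta(before, after)

# Absolute-state check: passes, but the ticket was already closed.
absolute_check = compare("closed", after["ticket_42"], Op.EQUALS)

# Delta check: the ticket is not part of the change, so the claim fails.
delta_check = compare("ticket_42", change, Op.CONTAINS)

# Delta check on a value that really moved.
progress_check = compare(12, change.get("tests_passed", 0), Op.AT_LEAST)

print(change)          # {'tests_passed': 12}
print(absolute_check)  # True
print(delta_check)     # False
print(progress_check)  # True
```

The contrast between `absolute_check` and `delta_check` is the point: the same environment fact passes one and fails the other, depending on whether the verifier asserts on state or on change.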
Orchestration
How to control bias through independence and feedback routing. 6 patterns.
- Adversary: Assign a structurally separate role whose only job is to find failures in another role's output, and require that role to emit a negative channel the orchestrator can inspect.
- Backpressure: When a downstream check fails, route the failure back upstream as structured rerun context within a bounded retry budget, instead of swallowing the failure or retrying blindly.
- Cross-Family: Run high-leverage generation and high-leverage assessment on deliberately different model families, and record both identities, so shared training-data biases and shared latent priors cannot pass undetected through the verification boundary.
- Debate: Run bounded disagreement among multiple roles before a decision, with turn order, round count, phase, and consensus threshold held in orchestration state instead of model discretion.
- Escalation Chain: Route work to a higher-authority or different-capability handler through a typed, validated handoff, so the next owner is code-level state instead of a model's memory of who to call.
- Tool Adapter: Normalize model-emitted tool calls at a typed boundary: derive or fetch a schema, validate arguments before invocation, call the tool with typed arguments, and return a typed observation.
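To make the Backpressure entry above more concrete, here is one possible shape for a bounded retry loop that feeds structured failures back to the generator. It is a sketch, not a reference implementation: `Failure`, `Outcome`, `run_with_backpressure`, and the toy generator and check are all hypothetical names, and the generator stands in for what would normally be a model call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Failure:
    """Structured rerun context handed back upstream (hypothetical shape)."""
    attempt: int
    detail: str


@dataclass
class Outcome:
    ok: bool
    output: Optional[str]
    failures: List[Failure]


def run_with_backpressure(
    generate: Callable[[List[Failure]], str],
    check: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> Outcome:
    """Rerun `generate` with accumulated failures until `check` passes
    or the retry budget is spent. `check` returns None on success and a
    failure description otherwise."""
    failures: List[Failure] = []
    for attempt in range(1, max_attempts + 1):
        output = generate(list(failures))   # pass a copy as rerun context
        problem = check(output)
        if problem is None:
            return Outcome(ok=True, output=output, failures=failures)
        failures.append(Failure(attempt=attempt, detail=problem))
    # Budget exhausted: surface the failures instead of swallowing them.
    return Outcome(ok=False, output=None, failures=failures)


def toy_generate(failures: List[Failure]) -> str:
    """Stand-in for a model call: improves only when told what failed."""
    return "Summary written." if failures else "TODO: write summary"


def toy_check(output: str) -> Optional[str]:
    return "output still contains TODO" if "TODO" in output else None


result = run_with_backpressure(toy_generate, toy_check)
print(result.ok, result.output, len(result.failures))
# True Summary written. 1

stuck = run_with_backpressure(lambda failures: "TODO", toy_check, max_attempts=2)
print(stuck.ok, len(stuck.failures))
# False 2
```

Holding the budget and the failure list in the orchestrator, not in the prompt, means that a run which never converges ends with an inspectable record of every failed attempt.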