Build agentic systems that verify their work with evidence, not opinion.
LLMs cannot reliably self-correct through naive self-review [1]. AI design patterns make probabilistic systems more reliable by reducing variance and countering systematic bias through explicit context, executable checks, and controlled orchestration. The goal of verification design is to approach deterministic behavior where full determinism is unavailable.
Two failure classes
Variance failures
Sampling instability, ambient state, context contamination, asynchronous timing, non-deterministic tool state.
Bias failures
Sycophancy, self-review blindness, judge preference bias, confirmation framing, same-family blind spots.
Variance is the new coupling. Anchoring is the new cohesion. Each pattern in this catalog names what it constrains.
Context and State
What exists before the model acts. 5 patterns.
- Causal Tag: Stamp every event the agent emits with a stable joinable identifier and, when applicable, a parent identifier, so verification can attribute observed effects to specific agent actions rather than inferring causality from temporal proximity in shared ambient state.
- Constitution: Represent the system's verification criteria as explicit, versioned, machine-readable data, rather than as scattered prompt prose.
- Guardrail Decorator: Wrap a model call, tool call, or other model-output boundary in a policy decorator that can deny, replace, sanitize, or convert errors, so policy lives in code at the boundary the model crosses instead of in prompt prose the model is asked to obey.
- State Baseline: Capture the relevant environment or process state before an action under verification, so the verifier can prove the action caused the observed change instead of accepting state that already existed.
- Trajectory Cursor: Maintain an explicit, structured record of where the agent is in its multi-step process and what happened at each boundary, so the verifier and the next turn can read the trajectory instead of inferring it from chat history or model recall.
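As a rough illustration of two patterns from this list, the sketch below pairs a Causal Tag with a Guardrail Decorator. All names in it (`Event`, `guardrail`, `PolicyDenied`, `run_tool`) are hypothetical and chosen for this example only; the catalog does not prescribe an API, and the tool is a stub that executes nothing.

```python
import functools
import uuid
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class Event:
    """Causal Tag: each event carries a stable id and, if any, its parent's id."""
    name: str
    parent_id: Optional[str] = None
    event_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class PolicyDenied(Exception):
    """Raised by the guardrail when a call is refused."""


def guardrail(deny_if: Callable[[str], bool], sanitize: Callable[[str], str]):
    """Guardrail Decorator: deny before the call, sanitize after it."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        @functools.wraps(fn)
        def inner(arg: str) -> str:
            if deny_if(arg):
                raise PolicyDenied(f"{fn.__name__}: input rejected by policy")
            return sanitize(fn(arg))
        return inner
    return wrap


@guardrail(
    deny_if=lambda command: "rm -rf" in command,
    sanitize=lambda output: output.replace("SECRET", "[redacted]"),
)
def run_tool(command: str) -> str:
    # Hypothetical tool stub: a real tool would execute the command.
    return f"ran {command}; token=SECRET"


# Events form a joinable chain through parent_id.
plan = Event("plan")
write = Event("write_file", parent_id=plan.event_id)
```

Under these assumptions, `run_tool("ls")` returns its output with the secret redacted, while `run_tool("rm -rf /")` raises `PolicyDenied` before the tool body runs. The policy holds regardless of what the prompt says, and a verifier can join `write` back to `plan` by identifier rather than by timestamp.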
Verification
How to turn judgment into observable signals. 6 patterns.
- Adversarial Frame: Replace tone-level skepticism instructions with admissibility rules that define what counts as proof, name common shortcut paths to reject, and invert the verifier's default from "accept if plausible" to "fail unless backed by trusted evidence."
- Blind Oracle: Derive expected evidence from the spec, the question, or independent re-execution without conditioning that derivation on the agent's draft, reasoning trace, or shortcut history.
- Comparator: Express verification comparison as a named operator from a finite family, so the verdict is a deterministic function of (expected, observed, operator) rather than a model's interpretation of "does this look right?"
- Delta: Verify the success of an agent's actions by asserting on the change in environment state rather than the absolute environment state.
- Executable Analog: Translate a subjective, language-based verification step into a deterministic, programmatic execution step that yields a binary pass/fail signal independent of the agent's judgment.
- Judge Harness: Wrap an LLM judge in a structural harness of perturbation, repetition, calibration, and reporting so that one judge verdict becomes a measured signal with visible consistency and bias controls.
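The Comparator and Delta entries above can be sketched together, with a before-snapshot standing in for a State Baseline. This is a minimal illustration under assumptions: the operator names (`eq`, `ge`, `contains`, `subset`) and the helpers `compare` and `delta` are invented here, and environment state is modeled as a plain set of file names.

```python
import operator
from typing import Any, Callable, Dict, Set

# Comparator: a finite family of named operators (names are illustrative).
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": operator.eq,
    "ge": lambda expected, observed: observed >= expected,
    "contains": lambda expected, observed: expected in observed,
    "subset": lambda expected, observed: set(expected) <= set(observed),
}


def compare(expected: Any, observed: Any, op: str) -> bool:
    """Return a verdict that depends only on (expected, observed, op)."""
    if op not in OPERATORS:
        raise ValueError(f"unknown operator: {op}")
    return OPERATORS[op](expected, observed)


def delta(before: Set[str], after: Set[str]) -> Dict[str, Set[str]]:
    """Delta: describe what changed between two state snapshots."""
    return {"added": after - before, "removed": before - after}


# The action really created report.md, so the delta shows it.
change = delta({"a.txt"}, {"a.txt", "report.md"})
created = compare({"report.md"}, change["added"], "eq")

# report.md already existed, so the absolute state looks right but the delta is empty.
stale = delta({"a.txt", "report.md"}, {"a.txt", "report.md"})
not_created = compare({"report.md"}, stale["added"], "eq")
```

Here `created` is `True` and `not_created` is `False`. A check on the final state alone would pass in both cases, which is the gap the Delta pattern is meant to close.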
Orchestration
How to control bias through independence and feedback routing. 6 patterns.
- Adversary: Assign a structurally separate role whose only job is to find failures in another role's output, and require that role to emit a negative channel the orchestrator can inspect.
- Backpressure: When a downstream check fails, route the failure back upstream as structured rerun context within a bounded retry budget, instead of swallowing the failure or retrying blindly.
- Cross-Family: Run high-leverage generation and high-leverage assessment on deliberately different model families, and record both identities, so shared training-data biases and shared latent priors cannot pass undetected through the verification boundary.
- Debate: Run bounded disagreement among multiple roles before a decision, with turn order, round count, phase, and consensus threshold held in orchestration state instead of model discretion.
- Escalation Chain: Route work to a higher-authority or different-capability handler through a typed, validated handoff, so the next owner is code-level state instead of a model's memory of who to call.
- Tool Adapter: Normalize model-emitted tool calls at a typed boundary: derive or fetch a schema, validate arguments before invocation, call the tool with typed arguments, and return a typed observation.
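One possible shape for Backpressure is sketched below. It is an assumption-laden illustration rather than a reference implementation: `CheckResult`, `Outcome`, and `run_with_backpressure` are hypothetical names, and the generator and checker are stubs standing in for a model call and a downstream check.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CheckResult:
    passed: bool
    reason: str = ""


@dataclass
class Outcome:
    output: Optional[str]  # None when the retry budget is exhausted
    attempts: int
    failures: List[str]    # every failure reason, in order


def run_with_backpressure(
    generate: Callable[[List[str]], str],
    check: Callable[[str], CheckResult],
    max_attempts: int = 3,
) -> Outcome:
    """Feed each failure reason back upstream, within a bounded budget."""
    failures: List[str] = []
    for attempt in range(1, max_attempts + 1):
        # The generator sees a copy of all failure reasons so far.
        candidate = generate(list(failures))
        result = check(candidate)
        if result.passed:
            return Outcome(candidate, attempt, failures)
        failures.append(result.reason)
    return Outcome(None, max_attempts, failures)


def stub_generate(feedback: List[str]) -> str:
    # Hypothetical generator: improves only when it receives feedback.
    return "patch with tests" if feedback else "patch"


def stub_check(candidate: str) -> CheckResult:
    if "tests" in candidate:
        return CheckResult(True)
    return CheckResult(False, "missing tests")


recovered = run_with_backpressure(stub_generate, stub_check)
exhausted = run_with_backpressure(lambda feedback: "patch", stub_check, max_attempts=2)
```

In this sketch `recovered` succeeds on the second attempt because the first failure reason reached the generator, while `exhausted` stops after two attempts with `output` set to `None` and both failure reasons recorded. The failure is neither swallowed nor retried without context.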