Principles

Verification design rests on nine principles. They emerged from research on LLM self-correction, verification chains, and agent evaluation. The patterns in this catalog implement these principles in code.

The core finding

LLMs cannot reliably correct their own reasoning through naive self-review [1]. This is the most replicated finding across the recent literature on LLM verification. Performance often degrades when a model is asked to review its own work without external feedback.

1. External signals over self-review

Prefer signals that come from outside the model: tests, builds, linters, type checkers, API responses, browser DOM extraction. These produce binary pass or fail results that do not depend on the agent's judgment, which makes them sycophancy-proof.
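A minimal sketch of an external signal, assuming the check can be expressed as a command whose exit code means pass or fail. The names `Signal` and `run_check` are illustrative and not part of the catalog.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Signal:
    name: str
    passed: bool
    detail: str


def run_check(name: str, command: list[str], timeout: float = 60.0) -> Signal:
    """Run an external command and reduce it to a binary pass/fail signal."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # A check that cannot run counts as a failure, never as a pass.
        return Signal(name, False, str(exc))
    # The exit code decides; no model is asked whether the output looks fine.
    detail = (result.stdout + result.stderr).strip()
    return Signal(name, result.returncode == 0, detail)


if __name__ == "__main__":
    # Stand-in command; a real project would call its own test runner or linter.
    print(run_check("smoke", [sys.executable, "-c", "print('ok')"]))
```

The agent only reads `passed`. It gets no opportunity to reinterpret a failing run as acceptable.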

2. Independence between generation and verification

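If the verifier can see the original output, it tends to repeat the same errors. Structure verification as extract, then compare, not as read and opine: the verifier derives its own answer from the source, and only then is that answer compared with the original output.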
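One way to picture extract, then compare is the sketch below. It assumes a hypothetical invoice format with a `Total:` line. The extractor reads only the source text and the comparison is mechanical. In an LLM setting the extractor could be a second model call that never receives the first answer.

```python
import re


def extract_total(invoice_text: str) -> float:
    """Independent extraction: reads only the source, never the agent's answer."""
    match = re.search(r"Total:\s*\$?([0-9]+(?:\.[0-9]{2})?)", invoice_text)
    if match is None:
        raise ValueError("no total found in source")
    return float(match.group(1))


def verify_claim(source_text: str, claimed_total: float,
                 tolerance: float = 0.005) -> bool:
    """Compare step: a mechanical equality check, not an opinion on the claim."""
    extracted = extract_total(source_text)
    return abs(extracted - claimed_total) <= tolerance
```

Because `extract_total` never sees `claimed_total`, an error in the original answer cannot leak into the extraction.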

3. Step-level checkpoints

Verify intermediate steps throughout the workflow, not just the final output. Step-level verification catches errors at the layer where they originate, before they compound.
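A minimal sketch of step-level checkpoints, assuming the workflow can be modeled as a list of (name, transform, check) steps. `run_with_checkpoints` and `CheckpointError` are illustrative names.

```python
from typing import Any, Callable

# Each step is (name, transform, check); the check sees the transform's output.
Step = tuple[str, Callable[[Any], Any], Callable[[Any], bool]]


class CheckpointError(Exception):
    """Raised at the first step whose output fails its check."""


def run_with_checkpoints(value: Any, steps: list[Step]) -> Any:
    for name, transform, check in steps:
        value = transform(value)
        # Verify immediately, so a bad value never reaches the next step.
        if not check(value):
            raise CheckpointError(f"checkpoint failed after step '{name}'")
    return value


steps: list[Step] = [
    ("parse", lambda s: [int(x) for x in s.split(",")], lambda xs: len(xs) > 0),
    ("square", lambda xs: [x * x for x in xs], lambda xs: all(x >= 0 for x in xs)),
    ("sum", sum, lambda total: isinstance(total, int)),
]
```

When a check fails, the error names the step where the problem originated, so the failure is not discovered several steps later in the final output.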

4. Adversarial framing

Ask what could fail, not what looks right. Confirmatory framing produces unreliable results by default: SycEval measured an overall sycophancy rate of 58.19 percent across major model families [2].

5. Explicit criteria

State criteria that a check can pass or fail, for example: no hardcoded values, all error paths handled, no TODOs remain. Specific criteria constrain rationalization; vague instructions invite it.
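Criteria like these can be written down as named checks. The sketch below is an illustration only. The regular expressions are rough approximations chosen for the example, and the specific criteria (a localhost URL standing in for "hardcoded values", a bare `except:` standing in for unhandled error paths) are assumptions.

```python
import re
from typing import Callable

# Each criterion is a name plus a predicate over the source text.
CRITERIA: dict[str, Callable[[str], bool]] = {
    "no TODO or FIXME markers remain":
        lambda src: re.search(r"\b(TODO|FIXME)\b", src) is None,
    "no bare except clauses":
        lambda src: re.search(r"^\s*except\s*:", src, re.MULTILINE) is None,
    "no hardcoded localhost URLs":
        lambda src: "http://localhost" not in src,
}


def evaluate(source: str) -> dict[str, bool]:
    """Return one pass/fail result per named criterion."""
    return {name: check(source) for name, check in CRITERIA.items()}
```

The result is a per-criterion report, which leaves less room for a reviewer to conclude that the code "looks fine overall".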

6. Executable verification is king

"Run the tests" is the single most reliable verification step available. For any check that depends on agent judgment, find the executable analog.
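As a small illustration of an executable analog: "does this config look valid?" can become "does it parse, and are the required keys present?". The function name and the required keys are assumptions for the example.

```python
import json


def config_is_valid(raw: str, required_keys: set[str]) -> bool:
    """Executable replacement for the question 'does this JSON look right?'."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # The parser, not a reviewer, decides whether the text is valid JSON.
        return False
    return isinstance(parsed, dict) and required_keys.issubset(parsed.keys())
```

The same move applies elsewhere: compile the code, run the migration against a scratch database, or call the endpoint, in place of asking whether it looks correct.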

7. Cross-family beats self-verification

When verification is LLM-based, use a model from a different family than the generator. Self-verification and intra-family verification are systematically biased toward accepting incorrect outputs.
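A guard like the following could enforce this rule in an orchestration layer. It is a sketch under assumptions: the `Model` record and the family labels are placeholders, not real model identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    family: str  # placeholder label such as "family-a"


def choose_verifier(generator: Model, candidates: list[Model]) -> Model:
    """Return the first candidate from a different family than the generator."""
    for candidate in candidates:
        if candidate.family != generator.family:
            return candidate
    # Refuse to fall back silently to same-family verification.
    raise LookupError(f"no verifier outside family '{generator.family}'")
```

Raising an error here is a deliberate choice: a missing cross-family verifier surfaces as a configuration problem, not as a quiet downgrade to self-verification.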

8. Simulate debate

In a single-agent context, instruct the agent to argue against its own output before concluding. Competitive disagreement reveals more than confirmatory review.
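One possible shape for a simulated debate is a fixed sequence of prompts in which objections come before any conclusion. The wording below is an illustrative assumption, not a tested prompt. Each later prompt assumes the model's earlier replies remain in context.

```python
def debate_prompts(task: str, draft: str) -> list[str]:
    """Return prompts to send in order within a single conversation."""
    return [
        # Round 1: attack the draft before anything is concluded.
        f"Task: {task}\nDraft answer:\n{draft}\n\n"
        "Argue that this draft is wrong. List the three most likely failures, "
        "each with the input or condition that would trigger it.",
        # Round 2: defend or repair, objection by objection.
        "Respond to each objection: either fix the draft or explain, "
        "with evidence, why the objection does not apply.",
        # Round 3: only now conclude.
        "Give the final answer and state which objections led to changes.",
    ]
```

The ordering is the point: the agent commits to concrete objections before it has a chance to confirm its own output.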

9. Isolate verification from ambient state

Assertions must prove the system's actions caused the expected outcome, not that the environment happened to already contain matching data. Delta-based assertions and tagged test data isolate causality.
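A sketch of a delta-based assertion with tagged test data, using an in-memory SQLite table as a stand-in for the real environment. The `orders` table, `create_order`, and the tag format are hypothetical.

```python
import sqlite3
import uuid
from typing import Callable

Action = Callable[[sqlite3.Connection, str], None]


def create_order(conn: sqlite3.Connection, customer: str) -> None:
    # Hypothetical system under test: inserts one order row.
    conn.execute("INSERT INTO orders (customer) VALUES (?)", (customer,))


def count_orders(conn: sqlite3.Connection, customer: str) -> int:
    row = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE customer = ?", (customer,)
    ).fetchone()
    return row[0]


def action_creates_one_order(conn: sqlite3.Connection, action: Action) -> bool:
    # Unique tag: rows that already exist cannot match it by accident.
    tag = f"verify-{uuid.uuid4().hex}"
    before = count_orders(conn, tag)
    action(conn, tag)
    after = count_orders(conn, tag)
    # Assert on the change, not on the absolute state of the table.
    return after - before == 1


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
    conn.execute("INSERT INTO orders (customer) VALUES ('alice')")  # ambient data
    print(action_creates_one_order(conn, create_order))
```

A check such as "the table contains at least one order" would pass here even if the action did nothing, because of the pre-existing row. The tagged delta check does not.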