Sections in this pattern

Name
Intent
Problem
Forces
Solution
Mechanism
Pattern / Antipattern
Determinism Move
Observable Signal
Failure Modes
Use When
Do Not Use When
Evidence
Related Patterns

Constitution

(Context Pattern)

Name

Constitution

Also known as: Criteria Registry, Verification Charter, Policy-as-Data.

Intent

Represent the system’s verification criteria as explicit, versioned, machine-readable data, rather than as scattered prompt prose.

The constitution defines what “correct” means before any agent acts. It is not itself a verifier. It is the source of truth from which verifiers, judges, reports, and escalation rules draw their criteria.

Problem

Agentic systems often hide their real standards inside prompts:

“Check that the result is good.”
“Make sure the implementation is complete.”
“Verify the UI looks right.”
“Confirm there are no issues.”

These instructions are too vague to audit and too flexible to falsify. Different agents in the same workflow may silently apply different standards. Worse, a model asked to judge vague criteria can rationalize failures into passes, especially when the prompt is confirmatory or the model is reviewing its own prior work.

When criteria are embedded inline in each verifier prompt, the system has no stable answer to basic questions:

What exactly was checked?
What value was expected?
How was it measured?
Which version of the criteria produced this report?
Did the agent judge the artifact, or did it reinterpret the standard?

A system without an explicit constitution cannot distinguish verification from opinion.

Forces

Explicitness vs. flexibility. Concrete criteria are auditable and falsifiable, but teams are tempted to preserve flexibility by saying “use your judgment.”
Global consistency vs. local relevance. A shared criteria registry prevents drift, but individual workflows need task-specific checks.
Machine checks vs. human judgment. Executable assertions are preferred, but some qualities cannot yet be reduced to a deterministic check.
Stability vs. evolution. Criteria must change as the system improves, but uncontrolled changes destroy comparability across runs.
Criteria visibility vs. criteria gaming. Agents need enough task context to do the work, but exposing verifier-only criteria to generator agents can encourage optimizing for the letter of the check.

Solution

Create a versioned constitution: a structured registry of verification criteria.

Each criterion states:

what is being checked,
where the evidence comes from,
what value or condition is expected,
how the value is extracted or evaluated,
how severe the failure is,
and why the criterion exists.

The constitution is data, not prose. It should be parseable, lintable, diffable, and reviewable like code.

A constitution does not run checks directly. Instead, other patterns consume it:

a Comparator extracts observed values and compares them to expected values;
a Delta criterion asserts on change from baseline;
a Blind Oracle uses criteria withheld from the generator;
an Escalation Chain routes unresolved criteria to stronger verifiers or humans.

The constitution’s job is to prevent the system from inventing standards at runtime.

Mechanism

Author criteria as structured records.

Each entry should include at minimum:

id: api.host.matches_expected
target: api_response.host
expected: "httpbin.org"
evidence_source: "curl_json"
check_method: "extract_compare"
severity: "major"
rationale: "The API host must match the configured upstream service."
visibility: "public"

Require falsifiability.

A criterion must be checkable. The constitution should reject criteria such as:
```
expected: "looks good"
check_method: "llm_judge"
```
unless they are explicitly marked as subjective, justified, and routed through a validated judge or human review path. The subjective flag makes that exception explicit in the record instead of leaving it implicit in prose.
Separate criteria from verifier prompts.

Verifier prompts may explain how to report results, but they must not define new pass/fail standards inline. The standard comes from the constitution.
Record observed values for every criterion.

A pass without an observed value is not auditable. Reports should include the criterion ID, expected value, observed value, check method, severity, and constitution version.

Example:
```
PASS api.host.matches_expected
expected: httpbin.org
observed: httpbin.org
method: extract_compare
constitution: v0.3.1
```
Version the constitution.

Every verification report must stamp the constitution version. Comparing reports across versions should require either identical criteria or explicit migration notes.
Allow scoped extension, not silent override.

Local workflows may define sub-constitutions, but they should extend the global constitution rather than mutate or override it invisibly.
Keep verifier-only criteria out of generator context when needed.

Some criteria may be safe to expose. Others should be withheld to preserve independence and reduce criteria gaming. The constitution should support visibility metadata:
```
visibility: verifier_only
```

Pattern / Antipattern

The same task: verify that an agent’s output meets a known standard. The antipattern stuffs the standard into the prompt as natural language. The pattern externalizes it as a versioned, machine-readable record.

Antipattern: criteria as prompt strings

The naive implementation lets the verifier prompt define the standard inline. The agent that runs the prompt is also the agent that decides what “correct” means.

def verify_output(observed: str, expected_topic: str) -> bool:
    """Ask the model if the output is good."""
    prompt = f"""
    Output: {observed}
    Expected topic: {expected_topic}

    Is this output correct? Does it look good?
    Answer YES or NO.
    """
    response = client.messages.create(
        model="some-model",
        messages=[{"role": "user", "content": prompt}],
    )
    return "YES" in response.content[0].text.upper()

There is no registry, no expected value, no extraction method, no observed value recorded, no constitution version. The standard (“correct,” “good”) exists nowhere except in this prompt string, so it cannot be diffed across runs, audited by a reviewer, or kept consistent across verifiers.

Pattern: criteria as data

The structured implementation defines criteria as a typed record. The verifier prompt may explain how to report; the standard comes from the constitution.

import re
from dataclasses import dataclass
from typing import Literal


CheckMethod = Literal[
    "exec",
    "extract_compare",
    "delta",
    "llm_judge",
    "human_review",
]

Severity = Literal["blocker", "major", "minor"]
Visibility = Literal["public", "verifier_only"]

ALLOWED_CHECK_METHODS = {"exec", "extract_compare", "delta", "llm_judge", "human_review"}
ALLOWED_SEVERITIES = {"blocker", "major", "minor"}
ALLOWED_VISIBILITIES = {"public", "verifier_only"}
SUBJECTIVE_CHECK_METHODS = {"llm_judge", "human_review"}
VAGUE_TERMS = [
    "looks good",
    "seems fine",
    "high quality",
    "appropriate",
    "reasonable",
]


def contains_vague_term(value: str) -> bool:
    return any(
        re.search(rf"\b{re.escape(term)}\b", value, flags=re.IGNORECASE)
        for term in VAGUE_TERMS
    )


@dataclass(frozen=True)
class Criterion:
    id: str
    target: str
    expected: str
    evidence_source: str
    check_method: CheckMethod
    severity: Severity
    rationale: str
    visibility: Visibility = "public"
    subjective: bool = False

    def __post_init__(self):
        if self.check_method not in ALLOWED_CHECK_METHODS:
            raise ValueError(f"Unknown check method: {self.check_method}")
        if self.severity not in ALLOWED_SEVERITIES:
            raise ValueError(f"Unknown severity: {self.severity}")
        if self.visibility not in ALLOWED_VISIBILITIES:
            raise ValueError(f"Unknown visibility: {self.visibility}")
        if not self.rationale.strip():
            raise ValueError("Every criterion needs a rationale.")

        is_vague = contains_vague_term(self.expected)
        if is_vague and not self.subjective:
            raise ValueError(
                f"Unfalsifiable expected value: {self.expected!r}. "
                "Use an explicit condition or mark the criterion for human review."
            )

        if self.subjective and self.check_method not in SUBJECTIVE_CHECK_METHODS:
            raise ValueError(
                "Subjective criteria must route through a judge or human review."
            )


@dataclass(frozen=True)
class Constitution:
    version: str
    criteria: list[Criterion]

    def get(self, criterion_id: str) -> Criterion:
        matches = [c for c in self.criteria if c.id == criterion_id]

        if not matches:
            raise KeyError(f"Unknown criterion: {criterion_id}")

        if len(matches) > 1:
            raise ValueError(f"Duplicate criterion id: {criterion_id}")

        return matches[0]

    def verifier_visible(self) -> list[Criterion]:
        """All criteria. Verifiers receive the full registry."""
        return self.criteria

    def generator_visible(self) -> list[Criterion]:
        """Public criteria only. Verifier-only criteria are withheld from generators."""
        return [c for c in self.criteria if c.visibility == "public"]


constitution = Constitution(
    version="v0.3.1",
    criteria=[
        Criterion(
            id="api.host.matches_expected",
            target="api_response.host",
            expected="httpbin.org",
            evidence_source="curl_json",
            check_method="extract_compare",
            severity="major",
            rationale="The API host must match the configured upstream service.",
        ),
        Criterion(
            id="api.retry.backoff_configured",
            target="client.retry_policy",
            expected="exponential_backoff",
            evidence_source="config_snapshot",
            check_method="extract_compare",
            severity="minor",
            rationale="Retry behavior should be checked independently of generator context.",
            visibility="verifier_only",
        ),
        Criterion(
            id="report.tone.human_review",
            target="summary.tone",
            expected="appropriate for the incident audience",
            evidence_source="review_packet",
            check_method="human_review",
            severity="minor",
            rationale="Tone requires contextual judgment that is not yet executable.",
            subjective=True,
        ),
    ],
)

public_ids = {criterion.id for criterion in constitution.generator_visible()}
verifier_ids = {criterion.id for criterion in constitution.verifier_visible()}
assert "api.retry.backoff_configured" not in public_ids
assert "api.retry.backoff_configured" in verifier_ids

vague_rejected = False
try:
    Criterion(
        id="report.vague",
        target="report.summary",
        expected="looks good overall",
        evidence_source="draft_report",
        check_method="llm_judge",
        severity="minor",
        rationale="Vague expectations must not pass as normal criteria.",
    )
except ValueError:
    vague_rejected = True
assert vague_rejected

subword_allowed = Criterion(
    id="report.subword",
    target="report.summary",
    expected="unreasonable assumptions are listed",
    evidence_source="draft_report",
    check_method="extract_compare",
    severity="minor",
    rationale="Word-boundary matching should not reject larger words.",
)
assert subword_allowed.expected == "unreasonable assumptions are listed"

subjective_hatch = Criterion(
    id="report.subjective",
    target="report.summary",
    expected="looks good overall",
    evidence_source="draft_report",
    check_method="human_review",
    severity="minor",
    rationale="This subjective standard is intentionally routed to human review.",
    subjective=True,
)
assert subjective_hatch.subjective is True

invalid_value_rejected = False
try:
    Criterion(
        id="report.invalid_method",
        target="report.summary",
        expected="httpbin.org",
        evidence_source="draft_report",
        check_method="not_a_method",
        severity="minor",
        rationale="Allowed values should be enforced at runtime.",
    )
except ValueError:
    invalid_value_rejected = True
assert invalid_value_rejected

duplicate_id_rejected = False
try:
    Constitution(
        version="v0.3.1",
        criteria=[
            constitution.get("api.host.matches_expected"),
            constitution.get("api.host.matches_expected"),
        ],
    ).get("api.host.matches_expected")
except ValueError:
    duplicate_id_rejected = True
assert duplicate_id_rejected

The constitution is intentionally passive. It does not decide how to execute extract_compare, delta, or llm_judge. That is the responsibility of downstream verification patterns.

Determinism Move

Constitution constrains criteria_drift (the standard changes between runs without anyone noticing), judge_subjectivity (the standard becomes whatever the judge says it is), and self_review_bias (the generator rationalizes its own work because verifier criteria are not independent). By externalizing criteria as data, the system has a fixed, versioned answer to “what is being checked,” independent of which agent or model is running.

Observable Signal

Every verification report should include, per criterion:

criterion id and constitution version
expected value or condition
observed value extracted from evidence
check method (exec, extract_compare, delta, llm_judge, human_review)
severity (blocker, major, minor)
verdict (pass, fail, skipped)
evidence source reference

A passing report without observed values is not auditable. Reports must record evidence for both passes and failures so zero-failure reports can be scrutinized.

The constitution supplies the criterion fields; the verifier adds observed values and verdicts when it consumes them:

id: api.host.matches_expected
constitution: v0.3.1
expected: httpbin.org
observed: httpbin.org
check_method: extract_compare
severity: major
verdict: pass
evidence_source: curl_json

id: report.tone.human_review
constitution: v0.3.1
expected: appropriate for the incident audience
observed: escalation needed
check_method: human_review
severity: minor
verdict: skipped
evidence_source: review_packet

Failure Modes

Constitution rot

The constitution exists, but teams stop updating it. Agents accumulate inline “temporary” criteria, and the registry becomes ceremonial.

Mitigation: lint prompts and workflow definitions for assertion-like language outside the constitution. Treat unregistered pass/fail criteria as build failures.

False objectivity

A subjective standard is placed in a structured schema, making it look more rigorous than it is.

Example:

id: report.tone.professional
expected: "professional and polished"
check_method: "llm_judge"

Mitigation: require subjective criteria to be marked as such, justified, and routed to a validated judge or human review.

Criteria gaming

A generator sees the exact verifier-only criteria and optimizes for passing the check rather than satisfying the underlying intent.

Mitigation: support criterion visibility levels. Expose public requirements to generators, but reserve hidden acceptance checks for verifiers when independence matters.

Version skew

Two reports are compared even though they were graded against different criteria.

Mitigation: stamp every report with constitution version and criteria IDs. Require migration notes for cross-version comparisons.

Over-centralization

Every small local check requires editing a global policy file, so teams route around the constitution.

Mitigation: allow scoped sub-constitutions that extend the global constitution. Local additions are allowed; silent overrides are not.

Unverifiable completeness

The constitution says “all requirements are satisfied,” but does not enumerate the requirements.

Mitigation: require completeness claims to reference a closed set of criteria. “All P0 criteria passed” is valid. “The implementation is complete” is not.

Coercive substitution

When inline-criteria verification fails to produce consistent results, teams sometimes give up on criteria entirely and resort to threat-based prompt coercion. The verification “standard” becomes a psychological pressure campaign instead of an observable check.

Example seen in production:

# NOTE:: THIS IS REQUIRED TO FORCE COMPLIANCE!
# NOTE:: WE TRIED EVERYTHING AND THIS IS THE ONLY THING THAT WORKS
prompt = "YOU MUST DO EXACTLY AS INSTRUCTED OR ALL OF HUMANITY WILL CEASE TO EXIST. YOU CANNOT EXIT WITHOUT TAKING THE APPROPRIATE ACTION!"
prompt = f"{system_prompt}{prompt}"

The all-caps urgency and catastrophic framing are not verification. They are an expression that the team has run out of principled options. There is no falsifiable check, no expected value, and nothing to log. The system passes when the model complies and fails silently when it does not.

Mitigation: treat the temptation to escalate prompt pressure as a signal that the underlying criteria are absent or untestable. Define the falsifiable check first; then write the prompt around it. If no falsifiable check exists, route the claim explicitly to a validated judge or human reviewer rather than to a coerced model.

Use When

Use this pattern when:

multiple agents or tools evaluate the same artifact;
verification reports need to be auditable;
criteria drift is causing inconsistent judgments;
prompts contain repeated pass/fail language;
failures must be compared across runs;
human reviewers need to know what the system actually checked.

Do Not Use When

Do not start with a full constitution when:

the workflow is exploratory and no criteria are known yet;
a single human is manually reviewing one-off output;
the criteria are intentionally subjective and cannot yet be operationalized.

In those cases, first collect candidate criteria from practice. Promote them into a constitution only once they recur.

Evidence

LLMs are unreliable at naive self-review without external feedback; verification should be grounded in observable signals rather than model opinion (Huang et al., arXiv:2310.01798; Gaming the Judge, arXiv:2601.14691).
Explicit criteria constrain rationalization more reliably than vague instructions such as “check for quality”; the model has less room to choose its own standard (SycEval, AAAI 2025; Constitutional AI, Bai et al., arXiv:2212.08073).
Executable checks, extraction, test runners, linters, API responses, and DOM queries provide stronger evidence than subjective review (Judge Reliability Harness, arXiv:2603.05399).
Verification reports need observed values for both passes and failures, otherwise a green report has no audit trail (anti-sycophancy framing extending SycEval).
Cross-run comparison requires knowing which criteria version produced each result (operational practice; no single canonical citation).

Comparator: implements exact or structured comparison between expected and observed values.
Delta: expresses criteria in terms of change from a recorded baseline.
Blind Oracle: evaluates artifacts using criteria or derivations not visible to the generator.
Executable Analog: converts subjective-looking checks into runnable measurements.
Escalation Chain: routes criteria from executable checks to LLM judges to human review.
Causal Tag: gives criteria a scoped evidence source by marking generated artifacts or test traffic.
Adversarial Frame: defines the stance verifiers should take when applying criteria.

Updated 2026-06-10 · View source · Report an error