repo-scoping/workplans/RREG-WP-0014-agentic-characteristic-acceptance.md

id: RREG-WP-0014
type: workplan
title: Agentic Characteristic Acceptance
domain: capabilities
repo: repo-scoping
status: active
owner: codex
topic_slug: foerster-capabilities
created: 2026-05-15
updated: 2026-05-15
state_hub_workstream_id: 7feaa5b5-32d8-4b8e-b377-cbb3ddacf64a

Agentic Characteristic Acceptance

Deterministic rules should not automatically accept candidate characteristics. Determinism is strongest in two places: fast, source-linked observation (facts and provenance) and the application of transparent rejection or downgrade criteria (formal quality checks, schema validation, duplicate detection, and clear negative filters).

Acceptance is a judgement step. When automation stands in for human judgement, it should be agentic: inspect the evidence, apply the visible quality criteria, explain the decision, and leave a reviewable trace. Deterministic rules may invalidate, downgrade, or require review, but they should not silently promote a candidate into approved registry truth.

T01: Define Acceptance Policy Boundary

id: RREG-WP-0014-T01
status: done
priority: high
state_hub_task_id: "4bc2e749-ec9e-45d4-8095-63181efb752b"

Write the policy boundary between deterministic gates and acceptance judgement.

Policy principles:

  • Deterministic scanners generate observed facts and source refs.
  • Deterministic quality gates can reject, downgrade, merge, flag, or require review when criteria are formally expressible.
  • Deterministic quality gates cannot approve candidate characteristics.
  • Human reviewers can approve.
  • Trusted agentic reviewers can approve only after producing an evidence-based rationale.
  • All automated review outcomes must be inspectable and reversible.
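The principles above can be read as a permission table. The following is a minimal sketch, not the project's implementation: the `ReviewerType` enum, the outcome strings, and the `is_allowed` helper are all hypothetical names chosen for illustration, and the human outcome set beyond "approve" is an assumption.

```python
from enum import Enum
from typing import Optional


class ReviewerType(Enum):
    DETERMINISTIC_GATE = "deterministic-gate"
    HUMAN = "human"
    AGENT = "agent"


# Illustrative outcome names (assumed); the real vocabulary belongs to the policy document.
ALLOWED_OUTCOMES = {
    # Deterministic gates have no "approve" entry by construction.
    ReviewerType.DETERMINISTIC_GATE: frozenset(
        {"reject", "downgrade", "merge", "flag", "require_review"}
    ),
    ReviewerType.HUMAN: frozenset({"approve", "reject", "downgrade"}),
    ReviewerType.AGENT: frozenset(
        {"approve", "reject", "downgrade", "request_human_review"}
    ),
}


def is_allowed(
    reviewer: ReviewerType, outcome: str, rationale: Optional[str] = None
) -> bool:
    """Return True if this reviewer type may record this outcome."""
    if outcome not in ALLOWED_OUTCOMES[reviewer]:
        return False
    # Agentic approval additionally requires a non-empty rationale.
    if reviewer is ReviewerType.AGENT and outcome == "approve":
        return bool(rationale and rationale.strip())
    return True
```

Keeping "approve" out of the deterministic set, rather than checking for it at call sites, means a new gate cannot approve by accident.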

Acceptance criteria:

  • Documentation states that deterministic auto-approval is prohibited.
  • Existing "trusted auto-approve" terminology is marked for replacement or migration.
  • The allowed deterministic outcomes are explicitly listed.
  • The allowed agentic outcomes are explicitly listed.

Implementation note 2026-05-15: added docs/acceptance-policy.md with policy version acceptance-policy/v1, the deterministic gate versus acceptance judgement boundary, allowed deterministic/human/agentic outcomes, audit requirements, and migration language for legacy trusted_auto_approve_candidate_graph terminology. Added a documentation test that locks in the boundary language.

T02: Create Transparent Quality Criteria Registry

id: RREG-WP-0014-T02
status: todo
priority: high
state_hub_task_id: "101998a4-8cf8-4df0-8d05-c4e2041c0cac"

Create a reviewable quality criteria registry for candidate characteristics.

Initial criteria should cover:

  • Source-role quality: intent/docs/source/tests are stronger than fixtures, schema examples, agent guidance, CI/tooling, dependency declarations, or derived scope.
  • Native utility: owned/facade/adapter claims require explicit product evidence; dependency, tooling, configuration, fixture, schema-example, and mention-only claims are not native capabilities.
  • Hierarchy fit: features should support their parent capability; misplaced API/CLI surfaces should be flagged.
  • Evidence sufficiency: candidate claims need source refs that support the actual abstraction, not just matching vocabulary.
  • Circularity: generated SCOPE.md text cannot be primary proof for rebuilding the same characteristic model.
  • Fixture contamination: tests and expectation files can prove scanner behavior but should not become repo-native product capability claims.

Acceptance criteria:

  • Criteria are stored in a versioned, human-readable format.
  • Each criterion has an identifier, description, severity, deterministic action if applicable, and reviewer guidance.
  • Criteria can be listed through CLI and/or API.
  • Assessment and review records include the criteria version used.
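One possible shape for a registry entry, sketched under assumptions: the `Criterion` fields mirror the acceptance criteria above, but the version string, criterion IDs, severity labels, and action names are hypothetical placeholders rather than decided values.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical version label; the real scheme is still to be defined.
CRITERIA_VERSION = "quality-criteria/example"


@dataclass(frozen=True)
class Criterion:
    id: str
    description: str
    severity: str
    deterministic_action: Optional[str]  # None when only a reviewer can judge
    reviewer_guidance: str


REGISTRY = (
    Criterion(
        id="circularity",
        description="Generated SCOPE.md text is used as primary proof.",
        severity="high",
        deterministic_action="invalidate",
        reviewer_guidance="Look for independent evidence in docs or source.",
    ),
    Criterion(
        id="hierarchy-fit",
        description="Feature does not support its parent capability.",
        severity="medium",
        deterministic_action=None,
        reviewer_guidance="Check whether the API/CLI surface is misplaced.",
    ),
)


def list_criteria() -> List[Dict[str, Optional[str]]]:
    """Return registry entries as plain dicts, each tagged with the version."""
    return [
        {
            "criteria_version": CRITERIA_VERSION,
            "id": c.id,
            "severity": c.severity,
            "deterministic_action": c.deterministic_action,
        }
        for c in REGISTRY
    ]
```

A CLI or API listing could serialize the result of `list_criteria()` directly, which keeps the criteria version attached to every row.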

T03: Implement Deterministic Quality Gate Outcomes

id: RREG-WP-0014-T03
status: todo
priority: high
state_hub_task_id: "d599c084-a207-4910-9d0b-578d0c50f282"

Apply quality criteria before any human or agentic acceptance step.

Acceptance criteria:

  • Candidate abilities, capabilities, features, and evidence can carry gate outcomes such as pass, downgraded, rejected, requires_review, and invalidated.
  • Rejected or invalidated candidates remain auditable with reason codes.
  • Downgraded candidates remain visible but cannot be accepted without explicit reviewer override.
  • Deterministic gates never mark a candidate as approved.
  • The known repo-scoping LLM-provider self-scan failure is flagged before acceptance.
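As an illustration of gate outcomes with reason codes, here is a small sketch. The outcome values come from the list above; the `run_gate` function, the reason-code strings, and the single source-role rule it applies are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence, Tuple


class GateOutcome(Enum):
    # Note: there is deliberately no APPROVED member.
    PASS = "pass"
    DOWNGRADED = "downgraded"
    REJECTED = "rejected"
    REQUIRES_REVIEW = "requires_review"
    INVALIDATED = "invalidated"


@dataclass(frozen=True)
class GateResult:
    outcome: GateOutcome
    reason_codes: Tuple[str, ...] = ()


# Stronger source roles, following the source-role quality criterion.
STRONG_ROLES = frozenset({"intent", "docs", "source", "tests"})


def run_gate(source_roles: Sequence[str]) -> GateResult:
    """Apply one hypothetical rule based on the roles of a candidate's source refs."""
    if not source_roles:
        return GateResult(GateOutcome.REJECTED, ("no-source-refs",))
    if not STRONG_ROLES.intersection(source_roles):
        return GateResult(GateOutcome.DOWNGRADED, ("weak-source-roles-only",))
    # PASS means "no formal objection", not "approved".
    return GateResult(GateOutcome.PASS)
```

A candidate that passes still waits for a human or agentic reviewer; the gate result only records that no formal criterion fired.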

T04: Replace Trusted Auto-Approval With Agentic Review

id: RREG-WP-0014-T04
status: todo
priority: high
state_hub_task_id: "b0d29756-7460-4ffa-8d56-d94cfb34e94f"

Replace trusted_auto_approve_candidate_graph behavior with an agentic review workflow.

Acceptance criteria:

  • Existing API/CLI/UI affordances no longer present deterministic auto-approval as a safe path.
  • A configured agentic reviewer receives the candidate graph, source refs, quality-gate outcomes, criteria version, and repository context.
  • The reviewer can approve, reject, downgrade, request human review, relink, or propose edits.
  • Each agentic approval includes a rationale tied to evidence and criteria.
  • If no agentic reviewer is configured, candidates remain pending review.
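The workflow above might be wired roughly as follows. This is a sketch under assumptions: `ReviewRequest`, `ReviewDecision`, the status strings, and the fallback to human review when an approval lacks a rationale are hypothetical choices, not confirmed behavior.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class ReviewRequest:
    candidate_graph: dict
    source_refs: Tuple[str, ...]
    gate_outcomes: dict
    criteria_version: str
    repository_context: str


@dataclass(frozen=True)
class ReviewDecision:
    status: str
    rationale: str = ""


# An agentic reviewer is modeled as any callable from request to decision.
Reviewer = Callable[[ReviewRequest], ReviewDecision]


def review(request: ReviewRequest, reviewer: Optional[Reviewer]) -> ReviewDecision:
    """Route a candidate through the configured reviewer, if there is one."""
    if reviewer is None:
        # No reviewer configured: the candidate stays pending.
        return ReviewDecision(status="pending_review")
    decision = reviewer(request)
    if decision.status == "approved" and not decision.rationale.strip():
        # Assumed fallback: an approval without rationale is not honored.
        return ReviewDecision(
            status="requires_human_review",
            rationale="agent approval lacked a rationale",
        )
    return decision
```

Modeling the reviewer as an optional callable makes the "no reviewer configured" case an explicit branch rather than a silent default.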

T05: Add Review Decision Audit Trail

id: RREG-WP-0014-T05
status: todo
priority: high
state_hub_task_id: "0d12559a-831e-40ff-bf82-85f45b763f07"

Extend review decisions so acceptance history is useful for later audits and self-scoping assessments.

Acceptance criteria:

  • Review decisions record reviewer type: human, agent, deterministic-gate, or migration.
  • Agentic decisions record reviewer identity/configuration, criteria version, prompt or policy version, evidence inspected, and rationale.
  • Deterministic gate decisions record rule IDs and outcomes, not approval.
  • Review records distinguish "candidate accepted as-is" from "accepted after edits/relinks".
  • Existing decisions remain readable through a migration or compatibility view.
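A review record that enforces these requirements at construction time could look like the sketch below. The field names and the `ReviewRecord` class are assumptions; only the four reviewer types and the required agentic fields are taken from the criteria above.

```python
from dataclasses import dataclass
from typing import Tuple

REVIEWER_TYPES = frozenset({"human", "agent", "deterministic-gate", "migration"})

# Fields an agentic decision must fill in (names are hypothetical).
AGENT_REQUIRED = (
    "reviewer_identity",
    "criteria_version",
    "policy_version",
    "evidence_inspected",
    "rationale",
)


@dataclass(frozen=True)
class ReviewRecord:
    reviewer_type: str
    outcome: str
    reviewer_identity: str = ""
    criteria_version: str = ""
    policy_version: str = ""
    evidence_inspected: Tuple[str, ...] = ()
    rationale: str = ""
    rule_ids: Tuple[str, ...] = ()
    accepted_after_edits: bool = False  # False means accepted as-is

    def __post_init__(self) -> None:
        if self.reviewer_type not in REVIEWER_TYPES:
            raise ValueError(f"unknown reviewer type: {self.reviewer_type}")
        if self.reviewer_type == "deterministic-gate" and self.outcome == "approved":
            raise ValueError("deterministic gates cannot record approval")
        if self.reviewer_type == "agent":
            missing = [name for name in AGENT_REQUIRED if not getattr(self, name)]
            if missing:
                raise ValueError(f"agent decision missing: {', '.join(missing)}")
```

Validating in `__post_init__` means an invalid record cannot exist in memory, so storage code does not need to repeat the checks.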

T06: Add Human Override And Criteria Refinement Flow

id: RREG-WP-0014-T06
status: todo
priority: medium
state_hub_task_id: "bcba3237-fb87-4a38-8e96-12b872d5e6a9"

Make quality criteria reviewable and refinable instead of hidden in code.

Acceptance criteria:

  • Reviewers can inspect which criteria fired for a candidate.
  • Reviewers can override a gate with a reason.
  • Overrides are searchable so repeated overrides can drive criteria changes.
  • Criteria changes are versioned and linked to workplans or decisions.
  • The UI makes it clear when a candidate is blocked by formal criteria versus merely awaiting judgement.
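To show how searchable overrides could feed criteria refinement, here is a minimal sketch. The `Override` record, the `frequently_overridden` helper, and the default threshold of 2 are hypothetical and only illustrate the idea of counting overrides per criterion.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Override:
    candidate_id: str
    criterion_id: str
    reason: str


def frequently_overridden(overrides: Iterable[Override], threshold: int = 2) -> List[str]:
    """Return criterion IDs overridden at least `threshold` times, sorted by ID."""
    counts = Counter(o.criterion_id for o in overrides)
    return sorted(cid for cid, n in counts.items() if n >= threshold)
```

A criterion that shows up here repeatedly is a candidate for revision in the next criteria version.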

T07: Regression Coverage For Acceptance Boundary

id: RREG-WP-0014-T07
status: todo
priority: high
state_hub_task_id: "37a22c89-ded5-42dd-aaa9-ece79477fcff"

Add tests that lock in the new acceptance boundary.

Acceptance criteria:

  • Deterministic analysis can generate facts and candidates but cannot approve them.
  • Deterministic gates can reject/downgrade/require review with reason codes.
  • Agentic review can approve only with a rationale and criteria version.
  • The repo-scoping self-scan LLM-provider failure is not accepted by deterministic rules.
  • Existing manual review and approval paths keep working.

T08: Migration And Compatibility Plan

id: RREG-WP-0014-T08
status: todo
priority: medium
state_hub_task_id: "3d5475f6-71a7-4ca7-aa69-573e91d1fe1e"

Plan the migration away from trusted deterministic auto-approval.

Acceptance criteria:

  • Existing approved maps created by trusted auto-approval can be identified.
  • Users can rebuild or re-review those maps without losing audit history.
  • API and CLI changes are documented with compatibility notes.
  • The old behavior is either removed or guarded behind an explicit deprecated migration mode that cannot run by default.
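The first and last criteria can be sketched together. Everything named here is an assumption for illustration: the `approved_by` field on a map record, the `RREG_ALLOW_DEPRECATED_AUTO_APPROVE` flag name, and both helper functions. Only the legacy `trusted_auto_approve_candidate_graph` identifier comes from the workplan.

```python
from typing import Iterable, List, Mapping

LEGACY_APPROVER = "trusted_auto_approve_candidate_graph"
# Hypothetical opt-in flag; the deprecated mode must be off unless set explicitly.
MIGRATION_FLAG = "RREG_ALLOW_DEPRECATED_AUTO_APPROVE"


def find_legacy_approved(maps: Iterable[Mapping[str, str]]) -> List[str]:
    """Return IDs of maps whose approval came from the legacy auto-approve path."""
    return [m["id"] for m in maps if m.get("approved_by") == LEGACY_APPROVER]


def deprecated_mode_enabled(env: Mapping[str, str]) -> bool:
    """The deprecated mode runs only when the flag is exactly "1"."""
    return env.get(MIGRATION_FLAG) == "1"
```

Passing the environment in as a mapping (for example `os.environ`) keeps the guard easy to test and makes the default-off behavior explicit.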

Completion Criteria

  • Deterministic rules no longer approve candidate characteristics.
  • Transparent, versioned quality criteria can reject, downgrade, invalidate, or require review.
  • Agentic review is the only automated path that can stand in for human acceptance.
  • Acceptance decisions are auditable, evidence-bound, and useful as a training signal for future self-scoping assessment.