---
id: RREG-WP-0014
type: workplan
title: "Agentic Characteristic Acceptance"
domain: capabilities
repo: repo-scoping
status: done
owner: codex
topic_slug: foerster-capabilities
created: "2026-05-15"
updated: "2026-05-15"
state_hub_workstream_id: "7feaa5b5-32d8-4b8e-b377-cbb3ddacf64a"
---
# Agentic Characteristic Acceptance
Deterministic rules should not automatically accept candidate
characteristics. Determinism is strongest at fast, source-linked observation and
at applying transparent rejection or downgrade criteria: facts, provenance,
formal quality checks, schema validation, duplicate detection, and clear
negative filters.

Acceptance is a judgement step. When automation stands in for human judgement,
it should be agentic: inspect the evidence, apply the visible quality criteria,
explain the decision, and leave a reviewable trace. Deterministic rules may
invalidate, downgrade, or require review, but they should not silently promote a
candidate into approved registry truth.
## T01: Define Acceptance Policy Boundary
```task
id: RREG-WP-0014-T01
status: done
priority: high
state_hub_task_id: "4bc2e749-ec9e-45d4-8095-63181efb752b"
```
Write the policy boundary between deterministic gates and acceptance
judgement.

Policy principles:
- Deterministic scanners generate observed facts and source refs.
- Deterministic quality gates can reject, downgrade, merge, flag, or require
review when criteria are formally expressible.
- Deterministic quality gates cannot approve candidate characteristics.
- Human reviewers can approve.
- Trusted agentic reviewers can approve only after producing an evidence-based
rationale.
- All automated review outcomes must be inspectable and reversible.

Acceptance criteria:
- Documentation states that deterministic auto-approval is prohibited.
- Existing "trusted auto-approve" terminology is marked for replacement or
migration.
- The allowed deterministic outcomes are explicitly listed.
- The allowed agentic outcomes are explicitly listed.

Implementation note 2026-05-15: added `docs/acceptance-policy.md` with policy
version `acceptance-policy/v1`, the deterministic gate versus acceptance
judgement boundary, allowed deterministic/human/agentic outcomes, audit
requirements, and migration language for legacy
`trusted_auto_approve_candidate_graph` terminology. Added a documentation test
that locks in the boundary language.
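
The policy principles above can be read as a small permission table. The following is a minimal sketch under stated assumptions, not the contents of `docs/acceptance-policy.md`: `ReviewerType`, `ALLOWED_OUTCOMES`, `is_allowed`, and the outcome strings are hypothetical names, and the human and agentic outcome sets beyond "approve" are guesses for illustration.

```python
from enum import Enum


class ReviewerType(Enum):
    DETERMINISTIC_GATE = "deterministic-gate"
    HUMAN = "human"
    AGENT = "agent"


# Hypothetical outcome vocabulary; only the deterministic set mirrors the
# policy principles listed in this task. "approve" is absent for gates.
ALLOWED_OUTCOMES = {
    ReviewerType.DETERMINISTIC_GATE: {
        "reject", "downgrade", "merge", "flag", "require_review",
    },
    ReviewerType.HUMAN: {"approve", "reject", "downgrade"},
    ReviewerType.AGENT: {"approve", "reject", "downgrade", "request_human_review"},
}


def is_allowed(reviewer: ReviewerType, outcome: str, rationale: str = "") -> bool:
    """Return True when the policy boundary permits this outcome."""
    if outcome not in ALLOWED_OUTCOMES[reviewer]:
        return False
    # Agentic approval additionally needs an evidence-based rationale.
    if reviewer is ReviewerType.AGENT and outcome == "approve":
        return bool(rationale.strip())
    return True
```

With this table, `is_allowed(ReviewerType.DETERMINISTIC_GATE, "approve")` is false regardless of input, which is the property the documentation test is meant to lock in.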
## T02: Create Transparent Quality Criteria Registry
```task
id: RREG-WP-0014-T02
status: done
priority: high
state_hub_task_id: "101998a4-8cf8-4df0-8d05-c4e2041c0cac"
```
Create a reviewable quality criteria registry for candidate characteristics.

Initial criteria should cover:
- Source-role quality: intent/docs/source/tests are stronger than fixtures,
schema examples, agent guidance, CI/tooling, dependency declarations, or
derived scope.
- Native utility: owned/facade/adapter claims require explicit product evidence;
dependency, tooling, configuration, fixture, schema-example, and mention-only
claims are not native capabilities.
- Hierarchy fit: features should support their parent capability; misplaced
API/CLI surfaces should be flagged.
- Evidence sufficiency: candidate claims need source refs that support the
actual abstraction, not just matching vocabulary.
- Circularity: generated `SCOPE.md` text cannot be primary proof for rebuilding
the same characteristic model.
- Fixture contamination: tests and expectation files can prove scanner behavior
but should not become repo-native product capability claims.

Acceptance criteria:
- Criteria are stored in a versioned, human-readable format.
- Each criterion has an identifier, description, severity, deterministic action
if applicable, and reviewer guidance.
- Criteria can be listed through CLI and/or API.
- Assessment and review records include the criteria version used.

Implementation note 2026-05-15: added
`docs/quality-criteria/acceptance-quality-criteria.v1.json` and schema
documentation with active criteria version `repo-scoping-quality-criteria/v1`.
Added `src/repo_registry/acceptance/criteria.py`, the
`repo-scoping list-quality-criteria` CLI command, `GET /quality-criteria`, and
threaded the active criteria version into self-scoping assessment engine
identity. Focused tests cover the registry, CLI, API, and assessment export
version binding.
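
To make the required criterion fields concrete, here is a hedged sketch of what one registry entry and a listing call might look like. It is not the actual `criteria.py` module or the JSON schema: `QualityCriterion`, `CRITERIA`, `list_quality_criteria`, the two example entries, and the severity vocabulary are assumptions; only the version string comes from the note above.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Version string taken from the implementation note; everything else is illustrative.
CRITERIA_VERSION = "repo-scoping-quality-criteria/v1"


@dataclass(frozen=True)
class QualityCriterion:
    identifier: str
    description: str
    severity: str  # assumed vocabulary, e.g. "blocking" or "warning"
    deterministic_action: Optional[str]  # None when only a reviewer can judge it
    reviewer_guidance: str


CRITERIA = (
    QualityCriterion(
        identifier="circularity",
        description="Generated SCOPE.md text cannot be primary proof.",
        severity="blocking",
        deterministic_action="invalidated",
        reviewer_guidance="Look for evidence outside the generated scope file.",
    ),
    QualityCriterion(
        identifier="evidence-sufficiency",
        description="Source refs must support the abstraction, not just its vocabulary.",
        severity="warning",
        deterministic_action=None,
        reviewer_guidance="Read the cited sources before accepting the claim.",
    ),
)


def list_quality_criteria() -> dict:
    """Return the registry in a JSON-serializable form, tagged with its version."""
    return {
        "criteria_version": CRITERIA_VERSION,
        "criteria": [asdict(c) for c in CRITERIA],
    }
```

Returning the version alongside the entries lets assessment and review records store the exact criteria version they were evaluated against.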
## T03: Implement Deterministic Quality Gate Outcomes
```task
id: RREG-WP-0014-T03
status: done
priority: high
state_hub_task_id: "d599c084-a207-4910-9d0b-578d0c50f282"
```
Apply quality criteria before any human or agentic acceptance step.

Acceptance criteria:
- Candidate abilities, capabilities, features, and evidence can carry gate
outcomes such as `pass`, `downgraded`, `rejected`, `requires_review`, and
`invalidated`.
- Rejected or invalidated candidates remain auditable with reason codes.
- Downgraded candidates remain visible but cannot be accepted without explicit
reviewer override.
- Deterministic gates never mark a candidate as approved.
- The known repo-scoping LLM-provider self-scan failure is flagged before
acceptance.

Implementation note 2026-05-15: added
`src/repo_registry/acceptance/gates.py` with deterministic quality-gate
outcomes tied to `repo-scoping-quality-criteria/v1`. Candidate graph API
responses and self-scoping assessment exports now include
`quality_gate_outcomes`. The legacy trusted auto-approval path now refuses
capabilities with blocking gate outcomes instead of approving them. Focused
tests cover provider-routing regression flags, circular generated-scope
evidence, serializable gate outcomes, candidate graph API exposure, and the
legacy auto-approval guard.
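
One way to guarantee that gates never approve is to give the outcome type no approval member at all. The sketch below illustrates that idea with a circularity check; it is an assumption-laden example, not the real `gates.py`. `GateOutcome`, `GateResult`, `circularity_gate`, and the reason codes are hypothetical, while the outcome values are the ones listed in the acceptance criteria.

```python
from dataclasses import dataclass
from enum import Enum


class GateOutcome(Enum):
    PASS = "pass"
    DOWNGRADED = "downgraded"
    REJECTED = "rejected"
    REQUIRES_REVIEW = "requires_review"
    INVALIDATED = "invalidated"
    # Deliberately no APPROVED member: a gate cannot express approval.


@dataclass(frozen=True)
class GateResult:
    candidate_id: str
    outcome: GateOutcome
    reason_code: str  # hypothetical reason-code vocabulary


def circularity_gate(candidate_id: str, source_paths: list) -> GateResult:
    """Invalidate a candidate whose only evidence is generated SCOPE.md text."""
    if not source_paths:
        return GateResult(candidate_id, GateOutcome.REQUIRES_REVIEW, "no_source_refs")
    if all(path.endswith("SCOPE.md") for path in source_paths):
        return GateResult(candidate_id, GateOutcome.INVALIDATED, "circular_generated_scope")
    # "pass" only means this gate raised no objection; acceptance is still pending.
    return GateResult(candidate_id, GateOutcome.PASS, "no_circularity_detected")
```

Because every result carries a reason code, rejected or invalidated candidates stay auditable even though they never reach a reviewer.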
## T04: Replace Trusted Auto-Approval With Agentic Review
```task
id: RREG-WP-0014-T04
status: done
priority: high
state_hub_task_id: "b0d29756-7460-4ffa-8d56-d94cfb34e94f"
```
Replace `trusted_auto_approve_candidate_graph` behavior with an agentic review
workflow.

Acceptance criteria:
- Existing API/CLI/UI affordances no longer present deterministic
auto-approval as a safe path.
- A configured agentic reviewer receives the candidate graph, source refs,
quality-gate outcomes, criteria version, and repository context.
- The reviewer can approve, reject, downgrade, request human review, relink,
or propose edits.
- Each agentic approval includes a rationale tied to evidence and criteria.
- If no agentic reviewer is configured, candidates remain pending review.

Implementation note 2026-05-15: completed the first migration by adding an
`AgenticReviewRequest`/`AgenticReviewer` boundary, routing normal API/CLI/UI
review requests to `request_agentic_review`, and leaving candidates pending with
an `agentic_review_unconfigured` review decision when no reviewer is configured.
Legacy `trusted_auto_approve` requests are treated as deprecated compatibility
input and routed to the same pending agentic-review path. Agentic reviewers now
return structured decisions with approve, approve-with-edits, reject, downgrade,
request-human-review, relink, and propose-edit actions. Approval decisions are
validated for rationale, criteria IDs, and evidence refs before being applied.
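
The routing described above can be sketched roughly as follows. The names `AgenticReviewRequest`, `AgenticReviewer`, `request_agentic_review`, and `agentic_review_unconfigured` appear in the note, but their real fields and signatures are not given here, so every field, the `AgenticDecision` and `StubReviewer` types, the `approval_incomplete` marker, and the return shape are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol


@dataclass
class AgenticReviewRequest:  # fields are assumed, not the real schema
    candidate_id: str
    source_refs: list
    quality_gate_outcomes: list
    criteria_version: str


@dataclass
class AgenticDecision:  # hypothetical structured decision
    action: str  # e.g. "approve", "reject", "request_human_review"
    rationale: str = ""
    criteria_ids: list = field(default_factory=list)
    evidence_refs: list = field(default_factory=list)


class AgenticReviewer(Protocol):
    def review(self, request: AgenticReviewRequest) -> AgenticDecision: ...


@dataclass
class StubReviewer:
    """Test double that returns a fixed decision."""
    decision: AgenticDecision

    def review(self, request: AgenticReviewRequest) -> AgenticDecision:
        return self.decision


def request_agentic_review(request: AgenticReviewRequest,
                           reviewer: Optional[AgenticReviewer]) -> tuple:
    """Return (candidate_status, decision_kind)."""
    if reviewer is None:
        # No reviewer configured: the candidate stays pending.
        return ("pending", "agentic_review_unconfigured")
    decision = reviewer.review(request)
    if decision.action == "approve":
        complete = bool(
            decision.rationale.strip() and decision.criteria_ids and decision.evidence_refs
        )
        # Approval without rationale, criteria, and evidence is not applied.
        return ("approved", "approve") if complete else ("pending", "approval_incomplete")
    return ("pending", decision.action)
```

In this sketch an approval that lacks a rationale, criteria IDs, or evidence refs leaves the candidate pending rather than raising, so a misbehaving reviewer cannot promote anything by accident.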
## T05: Add Review Decision Audit Trail
```task
id: RREG-WP-0014-T05
status: done
priority: high
state_hub_task_id: "0d12559a-831e-40ff-bf82-85f45b763f07"
```
Extend review decisions so acceptance history is useful for later audits and
self-scoping assessments.

Acceptance criteria:
- Review decisions record reviewer type: human, agent, deterministic-gate, or
migration.
- Agentic decisions record reviewer identity/configuration, criteria version,
prompt or policy version, evidence inspected, and rationale.
- Deterministic gate decisions record rule IDs and outcomes, not approval.
- Review records distinguish "candidate accepted as-is" from "accepted after
edits/relinks".
- Existing decisions remain readable through a migration or compatibility view.

Implementation note 2026-05-15: added the audit-trail compatibility view by
enriching listed review decisions with derived reviewer type, reviewer identity,
policy version, criteria version, rationale, criterion IDs, evidence refs,
accepted-after-edits marker, and decision kind. Existing stored decisions still
use the same table and remain readable. Deterministic gate evaluations now
record aggregate `quality_gate_evaluation` review decisions with criteria IDs,
outcome counts, criteria version, and a rationale that no approval was applied.
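
A compatibility view of this kind derives audit fields at read time instead of rewriting stored rows. The sketch below shows the general shape only. The stored-record keys (`decision_kind`, `reviewer`), the `agent:` prefix convention, the `approve_with_edits` kind, and the mapping of legacy auto-approvals to the `migration` reviewer type are all assumptions, not the repository's actual schema.

```python
def enrich_decision(stored: dict) -> dict:
    """Return a copy of a stored review decision with derived audit fields."""
    kind = stored.get("decision_kind", "")
    reviewer = str(stored.get("reviewer") or "")

    if kind == "quality_gate_evaluation":
        reviewer_type = "deterministic-gate"
    elif kind == "trusted_auto_approve_candidate_graph":
        # Assumed mapping: legacy auto-approvals surface as migration records.
        reviewer_type = "migration"
    elif reviewer.startswith("agent:"):
        # Assumed convention for agentic reviewer identities.
        reviewer_type = "agent"
    else:
        reviewer_type = "human"

    enriched = dict(stored)  # the stored row itself is left untouched
    enriched["reviewer_type"] = reviewer_type
    enriched["accepted_after_edits"] = kind == "approve_with_edits"
    return enriched
```

Deriving the fields on read keeps existing decisions in the same table while still letting an audit distinguish as-is acceptance from acceptance after edits.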
## T06: Add Human Override And Criteria Refinement Flow
```task
id: RREG-WP-0014-T06
status: done
priority: medium
state_hub_task_id: "bcba3237-fb87-4a38-8e96-12b872d5e6a9"
```
Make quality criteria reviewable and refinable instead of hidden in code.

Acceptance criteria:
- Reviewers can inspect which criteria fired for a candidate.
- Reviewers can override a gate with a reason.
- Overrides are searchable so repeated overrides can drive criteria changes.
- Criteria changes are versioned and linked to workplans or decisions.
- The UI makes it clear when a candidate is blocked by formal criteria versus
merely awaiting judgement.

Implementation note 2026-05-15: added quality-gate outcome rendering to the
analysis-run UI, a reviewer override form with required rationale, a
`quality_gate_override` review decision path in service/API/UI, and audit fields
that make overrides searchable through review decisions. Criteria changes remain
versioned in `docs/quality-criteria` and linked through workplan notes.
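
Searchable overrides matter because repeated overrides of the same criterion suggest the criterion itself needs revising. A minimal sketch of that query, assuming review decisions are available as dictionaries with hypothetical `decision_kind` and `criterion_ids` keys (the `quality_gate_override` kind is from the note; the function name and threshold are illustrative):

```python
from collections import Counter


def repeated_overrides(decisions: list, threshold: int = 2) -> dict:
    """Count overrides per criterion and keep those at or above the threshold."""
    counts = Counter(
        criterion_id
        for decision in decisions
        if decision.get("decision_kind") == "quality_gate_override"
        for criterion_id in decision.get("criterion_ids", [])
    )
    return {cid: n for cid, n in counts.items() if n >= threshold}
```

A criterion that shows up in this result is a candidate for a versioned criteria change linked to a workplan or decision.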
## T07: Regression Coverage For Acceptance Boundary
```task
id: RREG-WP-0014-T07
status: done
priority: high
state_hub_task_id: "37a22c89-ded5-42dd-aaa9-ece79477fcff"
```
Add tests that lock in the new acceptance boundary.

Acceptance criteria:
- Deterministic analysis can generate facts and candidates but cannot approve
them.
- Deterministic gates can reject/downgrade/require review with reason codes.
- Agentic review can approve only with a rationale and criteria version.
- The repo-scoping self-scan LLM-provider failure is not accepted by
deterministic rules.
- Existing manual review and approval paths keep working.

Implementation note 2026-05-15: added
`tests/test_acceptance_boundary.py` as an end-to-end regression boundary for
analysis and review. The suite proves deterministic analysis leaves generated
characteristics pending, deterministic quality gates can flag the provider
routing self-scan regression without approval, configured agentic review is the
only automated approval path and must leave rationale/criteria/evidence audit
metadata, and manual approval still produces human review decisions.
## T08: Migration And Compatibility Plan
```task
id: RREG-WP-0014-T08
status: done
priority: medium
state_hub_task_id: "3d5475f6-71a7-4ca7-aa69-573e91d1fe1e"
```
Plan the migration away from trusted deterministic auto-approval.

Acceptance criteria:
- Existing approved maps created by trusted auto-approval can be identified.
- Users can rebuild or re-review those maps without losing audit history.
- API and CLI changes are documented with compatibility notes.
- The old behavior is either removed or guarded behind an explicit deprecated
migration mode that cannot run by default.

Implementation note 2026-05-15: added a guarded migration compatibility layer
for historical `trusted_auto_approve_candidate_graph` records. The service now
requires `allow_deprecated_migration_mode=True` before replaying the legacy
auto-approval method, exposes an inventory of legacy auto-approval review debt,
and documents the dry-run/rebuild/re-review path in
`docs/migrations/trusted-auto-approval.md`. Added
`repo-scoping list-legacy-auto-approvals` and
`GET /review/migrations/trusted-auto-approvals` so existing maps produced by the
legacy path can be identified before rebuilds.
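
The guard on the legacy path can be pictured as a keyword-only flag that defaults to off. This is a sketch under assumptions, not the service code: `replay_legacy_auto_approval`, `DeprecatedMigrationModeRequired`, and the record keys are hypothetical; only the `allow_deprecated_migration_mode` flag name and the `trusted_auto_approve_candidate_graph` label come from the note above.

```python
class DeprecatedMigrationModeRequired(RuntimeError):
    """Raised when the legacy path is invoked without the explicit opt-in."""


def replay_legacy_auto_approval(candidate_ids: list, *,
                                allow_deprecated_migration_mode: bool = False) -> list:
    """Replay legacy auto-approval records, but only in explicit migration mode."""
    if not allow_deprecated_migration_mode:
        raise DeprecatedMigrationModeRequired(
            "legacy auto-approval is deprecated; pass "
            "allow_deprecated_migration_mode=True to replay historical records"
        )
    # Replayed records are labelled so later audits can identify them.
    return [
        {
            "candidate_id": candidate_id,
            "decision_kind": "trusted_auto_approve_candidate_graph",
            "reviewer_type": "migration",
        }
        for candidate_id in candidate_ids
    ]
```

Making the flag keyword-only means a caller cannot enable the deprecated mode by accident through a positional argument.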
## Completion Criteria
- Deterministic rules no longer approve candidate characteristics.
- Transparent, versioned quality criteria can reject, downgrade, invalidate, or
require review.
- Agentic review is the only automated path that can stand in for human
acceptance.
- Acceptance decisions are auditable, evidence-bound, and useful as training
signal for future self-scoping assessment.