tegwick de8d184a4b Add trusted auto-approval migration inventory

2026-05-15 16:56:42 +02:00

12 KiB

id	type	title	domain	repo	status	owner	topic_slug	created	updated	state_hub_workstream_id
RREG-WP-0014	workplan	Agentic Characteristic Acceptance	capabilities	repo-scoping	done	codex	foerster-capabilities	2026-05-15	2026-05-15	7feaa5b5-32d8-4b8e-b377-cbb3ddacf64a

Agentic Characteristic Acceptance

Deterministic rules should not automatically accept candidate characteristics. Determinism is strongest in two places: fast, source-linked observation (facts and provenance) and the application of transparent rejection or downgrade criteria (formal quality checks, schema validation, duplicate detection, and clear negative filters).

Acceptance is a judgement step. When automation stands in for human judgement, it should be agentic: inspect the evidence, apply the visible quality criteria, explain the decision, and leave a reviewable trace. Deterministic rules may invalidate, downgrade, or require review, but they should not silently promote a candidate into approved registry truth.

T01: Define Acceptance Policy Boundary

id: RREG-WP-0014-T01
status: done
priority: high
state_hub_task_id: "4bc2e749-ec9e-45d4-8095-63181efb752b"

Write the policy boundary between deterministic gates and acceptance judgement.

Policy principles:

Deterministic scanners generate observed facts and source refs.
Deterministic quality gates can reject, downgrade, merge, flag, or require review when criteria are formally expressible.
Deterministic quality gates cannot approve candidate characteristics.
Human reviewers can approve.
Trusted agentic reviewers can approve only after producing an evidence-based rationale.
All automated review outcomes must be inspectable and reversible.

Acceptance criteria:

Documentation states that deterministic auto-approval is prohibited.
Existing "trusted auto-approve" terminology is marked for replacement or migration.
The allowed deterministic outcomes are explicitly listed.
The allowed agentic outcomes are explicitly listed.

Implementation note 2026-05-15: added docs/acceptance-policy.md with policy version acceptance-policy/v1, the deterministic gate versus acceptance judgement boundary, allowed deterministic/human/agentic outcomes, audit requirements, and migration language for legacy trusted_auto_approve_candidate_graph terminology. Added a documentation test that locks in the boundary language.
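
The policy principles above can be read as a small permission table. The following is a minimal sketch of that reading, not the contents of docs/acceptance-policy.md: the names `ReviewerType`, `Outcome`, `ALLOWED_OUTCOMES`, and `is_allowed` are hypothetical, and giving humans and agents the full outcome set is an assumption made for illustration.

```python
from enum import Enum
from typing import Optional


class ReviewerType(Enum):
    DETERMINISTIC_GATE = "deterministic-gate"
    HUMAN = "human"
    AGENT = "agent"


class Outcome(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    DOWNGRADE = "downgrade"
    MERGE = "merge"
    FLAG = "flag"
    REQUIRE_REVIEW = "require_review"


# Gates get every outcome except APPROVE; that gap is the policy boundary.
GATE_OUTCOMES = frozenset(Outcome) - {Outcome.APPROVE}

ALLOWED_OUTCOMES = {
    ReviewerType.DETERMINISTIC_GATE: GATE_OUTCOMES,
    # Assumption: humans and agents may use the full outcome set.
    ReviewerType.HUMAN: frozenset(Outcome),
    ReviewerType.AGENT: frozenset(Outcome),
}


def is_allowed(
    reviewer: ReviewerType, outcome: Outcome, rationale: Optional[str] = None
) -> bool:
    """Return True if this reviewer type may record this outcome."""
    if outcome not in ALLOWED_OUTCOMES[reviewer]:
        return False
    # Agentic approval additionally requires a non-empty rationale.
    if reviewer is ReviewerType.AGENT and outcome is Outcome.APPROVE:
        return bool(rationale and rationale.strip())
    return True
```

Keeping approval out of the gate's outcome set means a deterministic rule cannot approve even by mistake, which is easier to test than a runtime check scattered across call sites.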

T02: Create Transparent Quality Criteria Registry

id: RREG-WP-0014-T02
status: done
priority: high
state_hub_task_id: "101998a4-8cf8-4df0-8d05-c4e2041c0cac"

Create a reviewable quality criteria registry for candidate characteristics.

Initial criteria should cover:

Source-role quality: intent/docs/source/tests are stronger than fixtures, schema examples, agent guidance, CI/tooling, dependency declarations, or derived scope.
Native utility: owned/facade/adapter claims require explicit product evidence; dependency, tooling, configuration, fixture, schema-example, and mention-only claims are not native capabilities.
Hierarchy fit: features should support their parent capability; misplaced API/CLI surfaces should be flagged.
Evidence sufficiency: candidate claims need source refs that support the actual abstraction, not just matching vocabulary.
Circularity: generated SCOPE.md text cannot be primary proof for rebuilding the same characteristic model.
Fixture contamination: tests and expectation files can prove scanner behavior but should not become repo-native product capability claims.

Acceptance criteria:

Criteria are stored in a versioned, human-readable format.
Each criterion has an identifier, description, severity, deterministic action if applicable, and reviewer guidance.
Criteria can be listed through CLI and/or API.
Assessment and review records include the criteria version used.

Implementation note 2026-05-15: added docs/quality-criteria/acceptance-quality-criteria.v1.json and schema documentation with active criteria version repo-scoping-quality-criteria/v1. Added src/repo_registry/acceptance/criteria.py, the repo-scoping list-quality-criteria CLI command, GET /quality-criteria, and threaded the active criteria version into self-scoping assessment engine identity. Focused tests cover the registry, CLI, API, and assessment export version binding.
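
To make the required criterion fields concrete, here is a sketch of how a versioned registry file could be loaded. Only the version string `repo-scoping-quality-criteria/v1` comes from this workplan. The JSON layout, the criterion IDs, the severity values, and the names `Criterion` and `load_registry` are assumptions and may differ from the real acceptance-quality-criteria.v1.json and criteria.py.

```python
import json
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Criterion:
    id: str
    description: str
    severity: str
    deterministic_action: Optional[str]  # None when only judgement applies
    reviewer_guidance: str


def load_registry(text: str) -> Tuple[str, List[Criterion]]:
    """Parse a registry document into (version, criteria)."""
    doc = json.loads(text)
    return doc["version"], [Criterion(**item) for item in doc["criteria"]]


# Hypothetical registry content, for illustration only.
EXAMPLE_REGISTRY = """
{
  "version": "repo-scoping-quality-criteria/v1",
  "criteria": [
    {
      "id": "circularity",
      "description": "Generated SCOPE.md text cannot be primary proof.",
      "severity": "high",
      "deterministic_action": "invalidate",
      "reviewer_guidance": "Look for independent source, docs, or tests."
    },
    {
      "id": "hierarchy-fit",
      "description": "Features should support their parent capability.",
      "severity": "medium",
      "deterministic_action": null,
      "reviewer_guidance": "Flag misplaced API/CLI surfaces."
    }
  ]
}
"""

version, criteria = load_registry(EXAMPLE_REGISTRY)
for criterion in criteria:
    print(version, criterion.id, criterion.severity, criterion.deterministic_action)
```

Returning the version alongside the criteria makes it straightforward to stamp assessment and review records with the criteria version that was active when they were produced.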

T03: Implement Deterministic Quality Gate Outcomes

id: RREG-WP-0014-T03
status: done
priority: high
state_hub_task_id: "d599c084-a207-4910-9d0b-578d0c50f282"

Apply quality criteria before any human or agentic acceptance step.

Acceptance criteria:

Candidate abilities, capabilities, features, and evidence can carry gate outcomes such as pass, downgraded, rejected, requires_review, and invalidated.
Rejected or invalidated candidates remain auditable with reason codes.
Downgraded candidates remain visible but cannot be accepted without explicit reviewer override.
Deterministic gates never mark a candidate as approved.
The known repo-scoping LLM-provider self-scan failure is flagged before acceptance.

Implementation note 2026-05-15: added src/repo_registry/acceptance/gates.py with deterministic quality-gate outcomes tied to repo-scoping-quality-criteria/v1. Candidate graph API responses and self-scoping assessment exports now include quality_gate_outcomes. The legacy trusted auto-approval path now refuses capabilities with blocking gate outcomes instead of approving them. Focused tests cover provider-routing regression flags, circular generated-scope evidence, serializable gate outcomes, candidate graph API exposure, and the legacy auto-approval guard.
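
One way to picture a gate is as a function that returns an outcome plus a reason code and has no way to return an approval. The sketch below uses the outcome names listed above; the `GateResult` type, the `circularity_gate` rule, the reason code, and the choice of blocking outcomes are hypothetical and are not taken from gates.py.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional, Sequence


class GateOutcome(str, Enum):
    # Deliberately no "approved" member: gates cannot approve.
    PASS = "pass"
    DOWNGRADED = "downgraded"
    REJECTED = "rejected"
    REQUIRES_REVIEW = "requires_review"
    INVALIDATED = "invalidated"


@dataclass(frozen=True)
class GateResult:
    outcome: GateOutcome
    reason_code: Optional[str] = None


# Assumption: everything except PASS blocks acceptance until a reviewer acts.
BLOCKING = frozenset(GateOutcome) - {GateOutcome.PASS}


def circularity_gate(evidence_paths: Sequence[str]) -> GateResult:
    """Invalidate a candidate whose only evidence is generated SCOPE.md text."""
    if evidence_paths and all(p.endswith("SCOPE.md") for p in evidence_paths):
        return GateResult(GateOutcome.INVALIDATED, "circular_generated_scope")
    # Empty evidence is left to a separate sufficiency criterion.
    return GateResult(GateOutcome.PASS)


def is_blocked(results: Iterable[GateResult]) -> bool:
    """True if any gate produced a blocking outcome."""
    return any(result.outcome in BLOCKING for result in results)
```

A `pass` here only means the gate found nothing to object to. The candidate still stays pending until a human or agentic reviewer decides.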

T04: Replace Trusted Auto-Approval With Agentic Review

id: RREG-WP-0014-T04
status: done
priority: high
state_hub_task_id: "b0d29756-7460-4ffa-8d56-d94cfb34e94f"

Replace trusted_auto_approve_candidate_graph behavior with an agentic review workflow.

Acceptance criteria:

Existing API/CLI/UI affordances no longer present deterministic auto-approval as a safe path.
A configured agentic reviewer receives the candidate graph, source refs, quality-gate outcomes, criteria version, and repository context.
The reviewer can approve, reject, downgrade, request human review, relink, or propose edits.
Each agentic approval includes a rationale tied to evidence and criteria.
If no agentic reviewer is configured, candidates remain pending review.

Implementation note 2026-05-15: completed the first migration by adding an AgenticReviewRequest/AgenticReviewer boundary, routing normal API/CLI/UI review requests to request_agentic_review, and leaving candidates pending with an agentic_review_unconfigured review decision when no reviewer is configured. Legacy trusted_auto_approve requests are treated as deprecated compatibility input and routed to the same pending agentic-review path. Agentic reviewers now return structured decisions with approve, approve-with-edits, reject, downgrade, request-human-review, relink, and propose-edit actions. Approval decisions are validated for rationale, criteria IDs, and evidence refs before being applied.
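
The two behaviours described in this note, staying pending when no reviewer is configured and validating approvals before applying them, can be sketched as follows. `request_agentic_review` and `agentic_review_unconfigured` are names from this workplan, but the signature, the `AgenticDecision` shape, the problem codes, and the `agentic_decision_invalid` and `not_approved` values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

APPROVAL_ACTIONS = frozenset({"approve", "approve-with-edits"})


@dataclass(frozen=True)
class AgenticDecision:
    action: str
    rationale: str = ""
    criteria_ids: Tuple[str, ...] = ()
    evidence_refs: Tuple[str, ...] = ()


def validate_approval(decision: AgenticDecision) -> List[str]:
    """Return problem codes; only approval actions are checked."""
    if decision.action not in APPROVAL_ACTIONS:
        return []
    problems = []
    if not decision.rationale.strip():
        problems.append("missing_rationale")
    if not decision.criteria_ids:
        problems.append("missing_criteria_ids")
    if not decision.evidence_refs:
        problems.append("missing_evidence_refs")
    return problems


Reviewer = Callable[[str], AgenticDecision]


def request_agentic_review(
    candidate_id: str, reviewer: Optional[Reviewer] = None
) -> Dict[str, object]:
    """Route a candidate to the reviewer, or leave it pending."""
    if reviewer is None:
        return {"candidate_id": candidate_id, "status": "pending",
                "decision": "agentic_review_unconfigured"}
    decision = reviewer(candidate_id)
    problems = validate_approval(decision)
    if problems:
        # An unsupported approval is not applied; the candidate stays pending.
        return {"candidate_id": candidate_id, "status": "pending",
                "decision": "agentic_decision_invalid", "problems": problems}
    status = "approved" if decision.action in APPROVAL_ACTIONS else "not_approved"
    return {"candidate_id": candidate_id, "status": status,
            "decision": decision.action}
```

In this sketch a legacy trusted_auto_approve request would simply call `request_agentic_review` with whatever reviewer is configured, so it ends up pending in the default case.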

T05: Add Review Decision Audit Trail

id: RREG-WP-0014-T05
status: done
priority: high
state_hub_task_id: "0d12559a-831e-40ff-bf82-85f45b763f07"

Extend review decisions so acceptance history is useful for later audits and self-scoping assessments.

Acceptance criteria:

Review decisions record reviewer type: human, agent, deterministic-gate, or migration.
Agentic decisions record reviewer identity/configuration, criteria version, prompt or policy version, evidence inspected, and rationale.
Deterministic gate decisions record rule IDs and outcomes, not approval.
Review records distinguish "candidate accepted as-is" from "accepted after edits/relinks".
Existing decisions remain readable through a migration or compatibility view.

Implementation note 2026-05-15: added the audit-trail compatibility view by enriching listed review decisions with derived reviewer type, reviewer identity, policy version, criteria version, rationale, criterion IDs, evidence refs, accepted-after-edits marker, and decision kind. Existing stored decisions still use the same table and remain readable. Deterministic gate evaluations now record aggregate quality_gate_evaluation review decisions with criteria IDs, outcome counts, criteria version, and a rationale that no approval was applied.
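
A compatibility view of this kind derives the audit fields at read time and leaves stored rows untouched. The sketch below illustrates the idea under stated assumptions: the stored keys (`decision_kind`, `reviewer_config`, `edits`) and the derivation rules are hypothetical, while the reviewer types and the `quality_gate_evaluation` and `trusted_auto_approve_candidate_graph` names come from this workplan.

```python
from typing import Any, Dict


def enrich_decision(stored: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a stored decision with derived audit fields."""
    kind = stored.get("decision_kind", "")
    if kind == "quality_gate_evaluation":
        reviewer_type = "deterministic-gate"
    elif kind == "trusted_auto_approve_candidate_graph":
        reviewer_type = "migration"
    elif stored.get("reviewer_config"):
        # Assumption: agentic decisions carry a reviewer configuration.
        reviewer_type = "agent"
    else:
        reviewer_type = "human"

    enriched = dict(stored)  # never mutate the stored row
    enriched["reviewer_type"] = reviewer_type
    # Older rows may lack these fields; expose them as None, not as errors.
    enriched.setdefault("criteria_version", None)
    enriched.setdefault("rationale", None)
    enriched["accepted_after_edits"] = bool(stored.get("edits"))
    return enriched
```

Because the enrichment is a pure function over the stored row, decisions written before the audit fields existed stay readable without a schema migration.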

T06: Add Human Override And Criteria Refinement Flow

id: RREG-WP-0014-T06
status: done
priority: medium
state_hub_task_id: "bcba3237-fb87-4a38-8e96-12b872d5e6a9"

Make quality criteria reviewable and refinable instead of hidden in code.

Acceptance criteria:

Reviewers can inspect which criteria fired for a candidate.
Reviewers can override a gate with a reason.
Overrides are searchable so repeated overrides can drive criteria changes.
Criteria changes are versioned and linked to workplans or decisions.
The UI makes it clear when a candidate is blocked by formal criteria versus merely awaiting judgement.

Implementation note 2026-05-15: added quality-gate outcome rendering to the analysis-run UI, a reviewer override form with required rationale, a quality_gate_override review decision path in service/API/UI, and audit fields that make overrides searchable through review decisions. Criteria changes remain versioned in docs/quality-criteria and linked through workplan notes.
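
Searchable overrides matter because a criterion that reviewers keep overriding is a candidate for revision. A minimal sketch of that query over review decisions is shown here. The `quality_gate_override` kind comes from this workplan; the `decision_kind` and `criterion_ids` keys and the `override_counts` helper are assumptions.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def override_counts(decisions: Iterable[Dict[str, Any]]) -> Counter:
    """Count how many override decisions touched each criterion."""
    counts: Counter = Counter()
    for decision in decisions:
        if decision.get("decision_kind") != "quality_gate_override":
            continue
        # set() so one decision counts a criterion at most once.
        counts.update(set(decision.get("criterion_ids", ())))
    return counts


# Hypothetical review decisions, for illustration only.
decisions = [
    {"decision_kind": "quality_gate_override", "criterion_ids": ["hierarchy-fit"]},
    {"decision_kind": "quality_gate_override",
     "criterion_ids": ["hierarchy-fit", "circularity"]},
    {"decision_kind": "quality_gate_evaluation", "criterion_ids": ["circularity"]},
]
print(override_counts(decisions).most_common())
```

Criteria at the top of this ranking are the ones worth revisiting in the next criteria version.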

T07: Regression Coverage For Acceptance Boundary

id: RREG-WP-0014-T07
status: done
priority: high
state_hub_task_id: "37a22c89-ded5-42dd-aaa9-ece79477fcff"

Add tests that lock in the new acceptance boundary.

Acceptance criteria:

Deterministic analysis can generate facts and candidates but cannot approve them.
Deterministic gates can reject/downgrade/require review with reason codes.
Agentic review can approve only with a rationale and criteria version.
The repo-scoping self-scan LLM-provider failure is not accepted by deterministic rules.
Existing manual review and approval paths keep working.

Implementation note 2026-05-15: added tests/test_acceptance_boundary.py as an end-to-end regression boundary for analysis and review. The suite proves deterministic analysis leaves generated characteristics pending, deterministic quality gates can flag the provider routing self-scan regression without approval, configured agentic review is the only automated approval path and must leave rationale/criteria/evidence audit metadata, and manual approval still produces human review decisions.

T08: Migration And Compatibility Plan

id: RREG-WP-0014-T08
status: done
priority: medium
state_hub_task_id: "3d5475f6-71a7-4ca7-aa69-573e91d1fe1e"

Plan the migration away from trusted deterministic auto-approval.

Acceptance criteria:

Existing approved maps created by trusted auto-approval can be identified.
Users can rebuild or re-review those maps without losing audit history.
API and CLI changes are documented with compatibility notes.
The old behavior is either removed or guarded behind an explicit deprecated migration mode that cannot run by default.

Implementation note 2026-05-15: added a guarded migration compatibility layer for historical trusted_auto_approve_candidate_graph records. The service now requires allow_deprecated_migration_mode=True before replaying the legacy auto-approval method, exposes an inventory of legacy auto-approval review debt, and documents the dry-run/rebuild/re-review path in docs/migrations/trusted-auto-approval.md. Added repo-scoping list-legacy-auto-approvals and GET /review/migrations/trusted-auto-approvals so existing maps produced by the legacy path can be identified before rebuilds.
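
The guard described here can be sketched as a keyword-only flag that defaults to off, paired with an inventory helper. The flag name `allow_deprecated_migration_mode` and the record kind `trusted_auto_approve_candidate_graph` come from this workplan. The function names, the exception type, the `map_id` key, and the dry-run return shape are hypothetical.

```python
from typing import Any, Dict, Iterable, List

LEGACY_KIND = "trusted_auto_approve_candidate_graph"


class DeprecatedMigrationModeRequired(RuntimeError):
    """Raised when the legacy path is invoked without explicit opt-in."""


def list_legacy_auto_approvals(
    decisions: Iterable[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Inventory review decisions produced by the legacy auto-approval path."""
    return [d for d in decisions if d.get("decision_kind") == LEGACY_KIND]


def replay_legacy_auto_approval(
    decisions: Iterable[Dict[str, Any]],
    *,
    allow_deprecated_migration_mode: bool = False,
    dry_run: bool = True,
) -> Dict[str, Any]:
    """Report which maps the legacy replay would touch; refuses by default."""
    if not allow_deprecated_migration_mode:
        raise DeprecatedMigrationModeRequired(
            "legacy auto-approval replay requires "
            "allow_deprecated_migration_mode=True"
        )
    legacy = list_legacy_auto_approvals(decisions)
    # The sketch only reports what would be replayed; it changes nothing.
    return {"dry_run": dry_run, "map_ids": [d["map_id"] for d in legacy]}
```

Making the flag keyword-only and defaulting `dry_run` to True means a caller has to opt in twice before any legacy behaviour could take effect.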

Completion Criteria

Deterministic rules no longer approve candidate characteristics.
Transparent, versioned quality criteria can reject, downgrade, invalidate, or require review.
Agentic review is the only automated path that can stand in for human acceptance.
Acceptance decisions are auditable, evidence-bound, and useful as training signal for future self-scoping assessment.
