Capture native self-assessment improvement

2026-05-15 18:34:00 +02:00
parent 2c3dad80d6
commit 83c39a7aa6
5 changed files with 9193 additions and 2 deletions
--- a/docs/self-scoping/README.md
+++ b/docs/self-scoping/README.md
@@ -17,6 +17,11 @@ instead of relying on memory or screenshots.
  generation still collapses repo-scoping's native surfaces under the forbidden
  provider-routing capability, but its source set no longer includes
  `var/checkouts/` contamination.
+- `assessments/repo-scoping-post-wp0016-native-2026-05-15.json` captures the
+  first deterministic challenger after native candidate generation recovery. It
+  matches every expected capability in the golden profile and shows no known
+  provider-routing regression. Generated candidates remain unapproved, pending
+  review against quality-gate outcomes `RREG-QC-001` and `RREG-QC-006`.
 - `workflow.md` explains how to run challenger assessments, interpret outcomes,
  and decide whether to update the golden profile or fix the engine.
 - `outcomes/` stores append-only reviewer decisions created from side-by-side
--- a/docs/self-scoping/assessments/repo-scoping-post-wp0016-native-2026-05-15.json
+++ b/docs/self-scoping/assessments/repo-scoping-post-wp0016-native-2026-05-15.json
--- a/docs/self-scoping/assessments/repo-scoping-post-wp0016-native-2026-05-15.md
+++ b/docs/self-scoping/assessments/repo-scoping-post-wp0016-native-2026-05-15.md
@@ -0,0 +1,33 @@
+# Self-Scoping Comparison: repo-scoping-challenger-run-1
+
+- Status: `candidate_improvement`
+- Golden profile: `repo-scoping-golden-profile-v1`
+- Target repo: `repo-scoping`
+- Summary: Assessment covers the golden profile without known regression patterns.
+
+## Missing Expected Capabilities
+- None
+
+## Forbidden Native Capabilities Present
+- None
+
+## Known Regression Patterns
+- None
+
+## Misplaced Features
+- None
+
+## Matched Expected Capabilities
+- Explore Dependency And Impact Graphs
+- Generate And Maintain SCOPE.md
+- Generate Reviewable Candidate Characteristics
+- Index Source Content With Provenance
+- Provide Scope Context To Downstream Agents
+- Register And Track Repositories
+- Review And Approve Candidate Characteristics
+- Scan Repositories Into Observed Facts
+- Search Compare And Export Approved Profiles
+
+## Review Hints
+- Candidate appears better than the known golden checks.
+- Human or agentic review should still confirm source evidence quality.
--- a/tests/test_self_scoping_artifacts.py
+++ b/tests/test_self_scoping_artifacts.py
@@ -18,6 +18,13 @@ POST_WP0015_PATH = (
    / "assessments"
    / "repo-scoping-post-wp0015-clean-2026-05-15.json"
 )
+POST_WP0016_PATH = (
+    ROOT
+    / "docs"
+    / "self-scoping"
+    / "assessments"
+    / "repo-scoping-post-wp0016-native-2026-05-15.json"
+)
 GOLDEN_PROFILE_PATH = (
    ROOT
    / "docs"
@@ -117,6 +124,33 @@ def test_post_wp0015_self_scoping_artifact_is_cleanly_bound_and_unapproved():
    assert criteria == {"RREG-QC-002", "RREG-QC-003"}


+def test_post_wp0016_self_scoping_artifact_matches_golden_without_regression():
+    artifact = load_json(POST_WP0016_PATH)
+
+    capability_names = {
+        capability["name"]
+        for ability in artifact["generated_tree"]["abilities"]
+        for capability in ability["capabilities"]
+    }
+    expected_names = {
+        capability["name"]
+        for capability in load_json(GOLDEN_PROFILE_PATH)["ability"][
+            "expected_capabilities"
+        ]
+    }
+    regression_ids = {
+        item["id"] for item in artifact.get("known_regression_patterns", [])
+    }
+    criteria = {outcome["criterion_id"] for outcome in artifact["quality_gate_outcomes"]}
+
+    assert artifact["engine_identity"]["release_binding_status"] == "complete"
+    assert artifact["engine_identity"]["engine_dirty_state"] == "clean"
+    assert capability_names == expected_names
+    assert regression_ids == set()
+    assert artifact["approved_map"]["abilities"] == []
+    assert criteria == {"RREG-QC-001", "RREG-QC-006"}
+
+
 def test_golden_profile_names_expected_native_capabilities_and_forbidden_false_positive():
    profile = load_json(GOLDEN_PROFILE_PATH)

--- a/workplans/RREG-WP-0016-native-candidate-generation-recovery.md
+++ b/workplans/RREG-WP-0016-native-candidate-generation-recovery.md
@@ -4,7 +4,7 @@ type: workplan
 title: "Native Candidate Generation Recovery"
 domain: capabilities
 repo: repo-scoping
-status: active
+status: done
 owner: codex
 topic_slug: foerster-capabilities
 created: "2026-05-15"
@@ -101,7 +101,7 @@ and no misplaced API/CLI features were reported.

 ```task
 id: RREG-WP-0016-T03
-status: todo
+status: done
 priority: high
 state_hub_task_id: "9ae662c0-3858-4c7d-b06e-26e1c8da7921"
 ```
@@ -115,3 +115,12 @@ Acceptance criteria:
 - The comparison report shows matched expected capabilities.
 - Remaining gaps are captured as generator follow-up, golden-profile update, or
  human review notes.
+
+Implementation note 2026-05-15: captured
+`docs/self-scoping/assessments/repo-scoping-post-wp0016-native-2026-05-15.json`
+and the paired Markdown comparison report from a clean engine commit. The
+comparison status is `candidate_improvement`: all nine golden expected
+capabilities match, no known provider-routing regression is present, and no
+misplaced API/CLI features are reported. Generated candidates remain
+unapproved, with quality-gate outcomes `RREG-QC-001` and `RREG-QC-006`
+recorded for review.
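
The capability match asserted in `test_post_wp0016_self_scoping_artifact_matches_golden_without_regression` comes down to a set comparison between generated and expected capability names. The following is a minimal sketch of that comparison, assuming the JSON shapes the test reads (`generated_tree.abilities[].capabilities[].name` in the artifact and `ability.expected_capabilities[].name` in the golden profile). The helper names `generated_capability_names`, `expected_capability_names`, and `compare_capabilities` are hypothetical and are not part of the repository.

```python
from typing import Any


def generated_capability_names(artifact: dict[str, Any]) -> set[str]:
    """Collect capability names across every ability in the generated tree."""
    return {
        capability["name"]
        for ability in artifact["generated_tree"]["abilities"]
        for capability in ability["capabilities"]
    }


def expected_capability_names(profile: dict[str, Any]) -> set[str]:
    """Collect the capability names the golden profile expects."""
    return {
        capability["name"]
        for capability in profile["ability"]["expected_capabilities"]
    }


def compare_capabilities(
    artifact: dict[str, Any], profile: dict[str, Any]
) -> dict[str, list[str]]:
    """Hypothetical helper: split names into matched, missing, and unexpected."""
    generated = generated_capability_names(artifact)
    expected = expected_capability_names(profile)
    return {
        "matched": sorted(generated & expected),
        "missing": sorted(expected - generated),
        "unexpected": sorted(generated - expected),
    }


if __name__ == "__main__":
    # Tiny illustrative inputs, not the real artifact or golden profile.
    artifact = {
        "generated_tree": {
            "abilities": [
                {
                    "capabilities": [
                        {"name": "Register And Track Repositories"},
                        {"name": "Generate And Maintain SCOPE.md"},
                    ]
                }
            ]
        }
    }
    profile = {
        "ability": {
            "expected_capabilities": [
                {"name": "Register And Track Repositories"},
                {"name": "Generate And Maintain SCOPE.md"},
            ]
        }
    }
    print(compare_capabilities(artifact, profile))
```

The test's `capability_names == expected_names` assertion corresponds to both `missing` and `unexpected` being empty. Reporting the two differences separately may make a failing comparison easier to triage than a bare set-equality failure.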