Capture native self-assessment improvement
@@ -17,6 +17,11 @@ instead of relying on memory or screenshots.
generation still collapses repo-scoping's native surfaces under the forbidden
provider-routing capability, but its source set no longer includes
`var/checkouts/` contamination.
- `assessments/repo-scoping-post-wp0016-native-2026-05-15.json` captures the
first deterministic challenger after native candidate generation recovery. It
matches every expected capability in the golden profile and has no known
provider-routing regression, while still leaving generated candidates pending
review with quality-gate signals.
- `workflow.md` explains how to run challenger assessments, interpret outcomes,
and decide whether to update the golden profile or fix the engine.
- `outcomes/` stores append-only reviewer decisions created from side-by-side
File diff suppressed because it is too large
@@ -0,0 +1,33 @@
# Self-Scoping Comparison: repo-scoping-challenger-run-1

- Status: `candidate_improvement`
- Golden profile: `repo-scoping-golden-profile-v1`
- Target repo: `repo-scoping`
- Summary: Assessment covers the golden profile without known regression patterns.

## Missing Expected Capabilities
- None

## Forbidden Native Capabilities Present
- None

## Known Regression Patterns
- None

## Misplaced Features
- None

## Matched Expected Capabilities
- Explore Dependency And Impact Graphs
- Generate And Maintain SCOPE.md
- Generate Reviewable Candidate Characteristics
- Index Source Content With Provenance
- Provide Scope Context To Downstream Agents
- Register And Track Repositories
- Review And Approve Candidate Characteristics
- Scan Repositories Into Observed Facts
- Search Compare And Export Approved Profiles

## Review Hints
- Candidate appears better than the known golden checks.
- Human or agentic review should still confirm source evidence quality.
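A report with the layout above could be produced by a small renderer. The following is only a sketch under assumptions: the function name `render_comparison` and its parameters are hypothetical and do not come from the repository. It shows the one convention visible in the report, namely that an empty section is written as `- None`.

```python
def render_comparison(
    run_id: str,
    header: dict[str, str],
    sections: dict[str, list[str]],
) -> str:
    """Render a comparison as Markdown (hypothetical helper, not the repo's API).

    `header` holds the top bullet list (Status, Golden profile, ...).
    `sections` maps a section title to its items; an empty list becomes "- None".
    """
    lines = [f"# Self-Scoping Comparison: {run_id}", ""]
    lines += [f"- {label}: {value}" for label, value in header.items()]
    for title, items in sections.items():
        lines += ["", f"## {title}"]
        # Fall back to the "- None" marker when a section has no entries.
        lines += [f"- {item}" for item in items] or ["- None"]
    return "\n".join(lines) + "\n"
```

Because dictionaries preserve insertion order, the sections appear in the order the caller supplies them, which keeps the output deterministic for a given input.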
@@ -18,6 +18,13 @@ POST_WP0015_PATH = (
    / "assessments"
    / "repo-scoping-post-wp0015-clean-2026-05-15.json"
)
POST_WP0016_PATH = (
    ROOT
    / "docs"
    / "self-scoping"
    / "assessments"
    / "repo-scoping-post-wp0016-native-2026-05-15.json"
)
GOLDEN_PROFILE_PATH = (
    ROOT
    / "docs"
@@ -117,6 +124,33 @@ def test_post_wp0015_self_scoping_artifact_is_cleanly_bound_and_unapproved():
    assert criteria == {"RREG-QC-002", "RREG-QC-003"}


def test_post_wp0016_self_scoping_artifact_matches_golden_without_regression():
    artifact = load_json(POST_WP0016_PATH)

    capability_names = {
        capability["name"]
        for ability in artifact["generated_tree"]["abilities"]
        for capability in ability["capabilities"]
    }
    expected_names = {
        capability["name"]
        for capability in load_json(GOLDEN_PROFILE_PATH)["ability"][
            "expected_capabilities"
        ]
    }
    regression_ids = {
        item["id"] for item in artifact.get("known_regression_patterns", [])
    }
    criteria = {outcome["criterion_id"] for outcome in artifact["quality_gate_outcomes"]}

    assert artifact["engine_identity"]["release_binding_status"] == "complete"
    assert artifact["engine_identity"]["engine_dirty_state"] == "clean"
    assert capability_names == expected_names
    assert regression_ids == set()
    assert artifact["approved_map"]["abilities"] == []
    assert criteria == {"RREG-QC-001", "RREG-QC-006"}


def test_golden_profile_names_expected_native_capabilities_and_forbidden_false_positive():
    profile = load_json(GOLDEN_PROFILE_PATH)

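The test above calls a `load_json` helper that is defined outside the visible hunks. A minimal sketch of what it might look like follows, assuming it simply parses a UTF-8 JSON file. The `capability_names` function is a hypothetical extraction of the set comprehension the test uses inline, and is not part of the diff.

```python
import json
from pathlib import Path


def load_json(path: Path) -> dict:
    """Parse a JSON file (assumed behaviour of the helper the tests call)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def capability_names(artifact: dict) -> set[str]:
    """Flatten abilities -> capabilities into a set of capability names.

    Hypothetical helper mirroring the comprehension in the test; the key names
    ("generated_tree", "abilities", "capabilities", "name") are taken from the diff.
    """
    return {
        capability["name"]
        for ability in artifact["generated_tree"]["abilities"]
        for capability in ability["capabilities"]
    }
```

Comparing sets rather than lists means the assertion ignores ordering and duplicate names across abilities, so it fails only when a capability is missing or unexpected.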
@@ -4,7 +4,7 @@ type: workplan
title: "Native Candidate Generation Recovery"
domain: capabilities
repo: repo-scoping
-status: active
+status: done
owner: codex
topic_slug: foerster-capabilities
created: "2026-05-15"
@@ -101,7 +101,7 @@ and no misplaced API/CLI features were reported.

```task
id: RREG-WP-0016-T03
-status: todo
+status: done
priority: high
state_hub_task_id: "9ae662c0-3858-4c7d-b06e-26e1c8da7921"
```
@@ -115,3 +115,12 @@ Acceptance criteria:
- The comparison report shows matched expected capabilities.
- Remaining gaps are captured as generator follow-up, golden-profile update, or
human review notes.

Implementation note 2026-05-15: captured
`docs/self-scoping/assessments/repo-scoping-post-wp0016-native-2026-05-15.json`
and the paired Markdown comparison report from a clean engine commit. The
comparison status is `candidate_improvement`: all nine golden expected
capabilities match, no known provider-routing regression is present, no
misplaced API/CLI features are reported, and generated candidates remain
unapproved with transparent quality-gate outcomes `RREG-QC-001` and
`RREG-QC-006` for review.
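The reasoning in the implementation note (every expected capability matched, nothing forbidden, no regression) can be sketched as a set comparison. This is an illustration under assumptions, not the engine's actual logic: `Comparison`, `compare`, and the `needs_review` label are hypothetical, and only `candidate_improvement` is a status that appears in the source.

```python
from dataclasses import dataclass, field


@dataclass
class Comparison:
    """Outcome of comparing challenger capability names to a golden profile."""

    matched: list[str] = field(default_factory=list)
    missing: list[str] = field(default_factory=list)
    forbidden_present: list[str] = field(default_factory=list)
    regression_ids: list[str] = field(default_factory=list)

    @property
    def status(self) -> str:
        # "needs_review" is a placeholder label invented for this sketch;
        # the source only documents "candidate_improvement".
        clean = not (self.missing or self.forbidden_present or self.regression_ids)
        return "candidate_improvement" if clean else "needs_review"


def compare(generated, expected, forbidden=(), regression_ids=()) -> Comparison:
    """Classify generated capability names against expected and forbidden names."""
    generated, expected = set(generated), set(expected)
    return Comparison(
        matched=sorted(generated & expected),
        missing=sorted(expected - generated),
        forbidden_present=sorted(generated & set(forbidden)),
        regression_ids=sorted(regression_ids),
    )
```

Note that this sketch says nothing about approval: as the note states, generated candidates remain unapproved and still go through quality-gate review even when the comparison is clean.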