tegwick 458eb410c4 Capture clean self-assessment regression signal

2026-05-15 17:15:35 +02:00

id	type	title	domain	repo	status	owner	topic_slug	created	updated	state_hub_workstream_id
RREG-WP-0015	workplan	Self-Assessment Input Hygiene	capabilities	repo-scoping	done	codex	foerster-capabilities	2026-05-15	2026-05-15	23164348-0133-4e0c-9557-f6077a835aee

Self-Assessment Input Hygiene

The first post-WP0014 self-assessment rerun proved that the acceptance boundary is doing useful work, but it also exposed a sharper input problem: repo-scoping was scanning its own runtime var/checkouts/ directory. That pulled checked-out copies of llm-connect, markitect, and other repositories into repo-scoping's own candidate graph and recreated the provider-routing false positive from foreign source trees.

This workplan keeps the self-improvement loop honest by making the source set for self-assessment match the repository, not its local runtime cache.

T01: Exclude Runtime Checkout State From Scanning

id: RREG-WP-0015-T01
status: done
priority: high
state_hub_task_id: "67091ab9-fb56-4c6f-88c7-851fcef0a934"

Prevent deterministic scanning from reading repo-local runtime state such as var/checkouts/ when the repository is analyzed from its working tree.

Acceptance criteria:

var/ runtime content is excluded from scanner file traversal.
A regression test proves nested checkout files do not produce LLM-provider facts for the parent repo.
Normal repository documentation, source, test, and manifest scanning still works.

Implementation note 2026-05-15: added var to the deterministic scanner's ignored directory set and covered the repo-scoping-like failure with test_scanner_ignores_runtime_var_checkouts. Runtime checkout files no longer contribute documentation, language, or LLM-provider facts to the parent repo.
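The exclusion described in this task can be sketched as a traversal that prunes ignored directories before descending into them. This is a minimal illustration under stated assumptions, not the repo-scoping scanner itself: `IGNORED_DIRS`, `iter_scannable_files`, and `make_demo_repo` are hypothetical names, and only the `var` entry comes from the workplan. The sketch also assumes directories are ignored by name at any depth, which the workplan does not confirm.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical ignored-directory set; only "var" is taken from the workplan.
IGNORED_DIRS = {".git", "var"}


def iter_scannable_files(root):
    """Yield POSIX-style paths relative to root, skipping ignored directories."""
    root = Path(root)
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into ignored directories.
        # Assumption: matching is by directory name, at any depth.
        dirnames[:] = sorted(d for d in dirnames if d not in IGNORED_DIRS)
        for name in sorted(filenames):
            yield (Path(dirpath) / name).relative_to(root).as_posix()


def make_demo_repo():
    """Build a throwaway tree that mimics a repo with a nested runtime checkout."""
    root = Path(tempfile.mkdtemp())
    for rel in (
        "README.md",
        "src/scanner.py",
        "var/checkouts/llm-connect/providers.py",
    ):
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("placeholder\n")
    return root
```

Pruning `dirnames` in place, rather than filtering paths after the walk, means the nested checkout is never read at all, so it cannot contribute documentation, language, or LLM-provider facts to the parent repo.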

T02: Capture Clean Post-Acceptance Self-Assessment

id: RREG-WP-0015-T02
status: done
priority: high
state_hub_task_id: "81bb46e7-01dc-4c14-8a32-1d4d456dc209"

Rerun repo-scoping self-assess after input hygiene is fixed and save a reviewable challenger artifact and comparison report.

Acceptance criteria:

The artifact is release-bound to the repo-scoping commit that generated it.
The artifact no longer includes files from var/checkouts/.
The comparison report clearly separates remaining candidate-generation quality issues from approved registry truth.
The artifact/report names make their relationship to WP0014/WP0015 clear.

Implementation note 2026-05-15: captured docs/self-scoping/assessments/repo-scoping-post-wp0015-clean-2026-05-15.json and the paired Markdown comparison report. The artifact is release-bound to a clean engine commit, contains zero var/checkouts/ paths, leaves the approved map empty, and records quality-gate outcomes RREG-QC-002 and RREG-QC-003 against the remaining provider-routing candidate regression.
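The "zero var/checkouts/ paths" criterion can be checked mechanically. The sketch below makes no assumption about the artifact's schema, which this workplan does not describe: it walks every string in the parsed JSON and reports any that mention the runtime prefix. `find_runtime_paths` and `artifact_is_clean` are hypothetical helper names, not part of repo-scoping.

```python
import json

RUNTIME_PREFIX = "var/checkouts/"


def find_runtime_paths(node, prefix=RUNTIME_PREFIX):
    """Return every string in a parsed JSON value that contains prefix."""
    hits = []
    if isinstance(node, str):
        if prefix in node:
            hits.append(node)
    elif isinstance(node, dict):
        # Check keys as well as values, since paths may be used as keys.
        for key, value in node.items():
            hits.extend(find_runtime_paths(key, prefix))
            hits.extend(find_runtime_paths(value, prefix))
    elif isinstance(node, list):
        for item in node:
            hits.extend(find_runtime_paths(item, prefix))
    return hits


def artifact_is_clean(artifact_text):
    """True when the JSON artifact text references no runtime checkout paths."""
    return not find_runtime_paths(json.loads(artifact_text))
```

A check like this could be run against the captured JSON artifact before it is committed, so a regression in input hygiene shows up as a failed check rather than a polluted candidate graph.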

T03: Triage Remaining Generator Quality Gaps

id: RREG-WP-0015-T03
status: done
priority: medium
state_hub_task_id: "20b6f34e-1d92-407b-84dd-6e3ec7e77eb3"

Use the clean rerun to identify the next generator-quality workplan.

Acceptance criteria:

Remaining missing expected capabilities are summarized.
Remaining forbidden or downgraded candidates are summarized with source refs and quality-gate outcomes.
The next workplan is scoped around generator improvements, not deterministic acceptance.

Implementation note 2026-05-15: the clean challenger still generates only Route LLM Requests Across Providers, misses all curated expected repo-scoping capabilities, and misplaces API/CLI surfaces under provider routing. The approved map remains empty and quality gates flag the candidate with RREG-QC-002 and RREG-QC-003, so the next slice is generator quality. Created RREG-WP-0016 Native Candidate Generation Recovery to focus on separating provider vocabulary from native capability seeds and recovering repo-owned candidate families.
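The triage summary amounts to a set comparison between curated expected capabilities and generated candidates, annotated with quality-gate outcomes. The following is a rough sketch of that comparison only; `summarize_gaps` and its input shapes are assumptions for illustration, and the real comparison report format is not described here.

```python
def summarize_gaps(expected, generated, gate_outcomes):
    """Summarize missing expected capabilities and unexpected candidates.

    expected: iterable of curated capability titles
    generated: iterable of candidate titles produced by the generator
    gate_outcomes: dict mapping a candidate title to its quality-gate IDs
    """
    expected_set = set(expected)
    generated_set = set(generated)
    return {
        # Curated capabilities the generator failed to produce.
        "missing_expected": sorted(expected_set - generated_set),
        # Candidates outside the curated set, with any gate outcomes attached.
        "unexpected_candidates": [
            {
                "candidate": name,
                "quality_gates": sorted(gate_outcomes.get(name, [])),
            }
            for name in sorted(generated_set - expected_set)
        ],
    }
```

Applied to the clean rerun, the single generated candidate Route LLM Requests Across Providers would appear under `unexpected_candidates` with RREG-QC-002 and RREG-QC-003, and every curated capability would appear under `missing_expected`.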
