repo-scoping/workplans/RREG-WP-0015-self-assessment-input-hygiene.md

---
id: RREG-WP-0015
type: workplan
title: "Self-Assessment Input Hygiene"
domain: capabilities
repo: repo-scoping
status: done
owner: codex
topic_slug: foerster-capabilities
created: "2026-05-15"
updated: "2026-05-15"
state_hub_workstream_id: "23164348-0133-4e0c-9557-f6077a835aee"
---

# Self-Assessment Input Hygiene

The first post-WP0014 self-assessment rerun proved that the acceptance boundary
is doing useful work, but it also exposed a sharper input problem: repo-scoping
was scanning its own runtime `var/checkouts/` directory. That pulled checked-out
copies of `llm-connect`, `markitect`, and other repositories into repo-scoping's
own candidate graph, where the foreign source trees recreated the
provider-routing false positive.

This workplan keeps the self-improvement loop honest by making the source set
for self-assessment match the repository, not its local runtime cache.

## T01: Exclude Runtime Checkout State From Scanning

```task
id: RREG-WP-0015-T01
status: done
priority: high
state_hub_task_id: "67091ab9-fb56-4c6f-88c7-851fcef0a934"
```

Prevent deterministic scanning from reading repo-local runtime state such as
`var/checkouts/` when the repository is analyzed from its working tree.

Acceptance criteria:
- `var/` runtime content is excluded from scanner file traversal.
- A regression test proves nested checkout files do not produce LLM-provider
  facts for the parent repo.
- Normal repository documentation, source, test, and manifest scanning still
  works.

Implementation note 2026-05-15: added `var` to the deterministic scanner's
ignored directory set and covered the failure mode seen in repo-scoping with
the regression test `test_scanner_ignores_runtime_var_checkouts`. Runtime
checkout files no longer contribute documentation, language, or LLM-provider
facts to the parent repo.
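
For reviewers unfamiliar with the scanner, the exclusion can be pictured with
the minimal sketch below. It is illustrative only, not the scanner's actual
code: `IGNORED_DIRS` and `iter_scannable_files` are hypothetical names, and the
sketch assumes the ignore set matches directory names at any depth. Only the
`var` entry is confirmed by this workplan.

```python
import os
from pathlib import Path
from typing import Iterator

# Hypothetical ignore set. Only "var" is confirmed by this workplan;
# ".git" is included purely for illustration.
IGNORED_DIRS = {".git", "var"}


def iter_scannable_files(root: Path) -> Iterator[Path]:
    """Yield files under root, never descending into ignored directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place: os.walk (top-down) only visits what remains here.
        dirnames[:] = sorted(d for d in dirnames if d not in IGNORED_DIRS)
        for name in sorted(filenames):
            yield Path(dirpath, name)
```

Pruning `dirnames` in place stops traversal at the `var/` boundary, so nested
checkout files are never opened at all, rather than being read and filtered
out afterwards.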

## T02: Capture Clean Post-Acceptance Self-Assessment

```task
id: RREG-WP-0015-T02
status: done
priority: high
state_hub_task_id: "81bb46e7-01dc-4c14-8a32-1d4d456dc209"
```

Rerun `repo-scoping self-assess` after input hygiene is fixed and save a
reviewable challenger artifact and comparison report.

Acceptance criteria:
- The artifact is release-bound to the repo-scoping commit that generated it.
- The artifact no longer includes files from `var/checkouts/`.
- The comparison report clearly separates remaining candidate-generation
  quality issues from approved registry truth.
- The artifact/report names make their relationship to WP0014/WP0015 clear.

Implementation note 2026-05-15: captured
`docs/self-scoping/assessments/repo-scoping-post-wp0015-clean-2026-05-15.json`
and the paired Markdown comparison report. The artifact is release-bound to a
clean engine commit, contains zero `var/checkouts/` paths, leaves the approved
map empty, and records quality-gate outcomes `RREG-QC-002` and `RREG-QC-003`
against the remaining provider-routing candidate regression.
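
The "zero `var/checkouts/` paths" criterion can be rechecked by hand with a
sketch like the one below. It assumes nothing about the artifact schema, which
this workplan does not describe: it walks every string in the parsed JSON and
reports any that mention the runtime prefix. `iter_strings` and
`find_runtime_paths` are hypothetical helper names, not part of repo-scoping.

```python
from typing import Any, Iterator, List

RUNTIME_PREFIX = "var/checkouts/"


def iter_strings(node: Any) -> Iterator[str]:
    """Yield every string key and value found anywhere in parsed JSON."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from iter_strings(key)
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def find_runtime_paths(artifact: Any) -> List[str]:
    """Return the sorted, de-duplicated strings that mention the runtime prefix."""
    return sorted({s for s in iter_strings(artifact) if RUNTIME_PREFIX in s})
```

Passing the result of `json.load` on the saved artifact to
`find_runtime_paths` should return an empty list for a clean capture.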

## T03: Triage Remaining Generator Quality Gaps

```task
id: RREG-WP-0015-T03
status: done
priority: medium
state_hub_task_id: "20b6f34e-1d92-407b-84dd-6e3ec7e77eb3"
```

Use the clean rerun to identify the next generator-quality workplan.

Acceptance criteria:
- Remaining missing expected capabilities are summarized.
- Remaining forbidden or downgraded candidates are summarized with source refs
  and quality-gate outcomes.
- The next workplan is scoped around generator improvements, not deterministic
  acceptance.

Implementation note 2026-05-15: the clean challenger still generates only
`Route LLM Requests Across Providers`, misses all curated expected
repo-scoping capabilities, and misplaces API/CLI surfaces under provider
routing. The approved map remains empty and quality gates flag the candidate
with `RREG-QC-002` and `RREG-QC-003`, so the next slice is generator quality.
Created `RREG-WP-0016 Native Candidate Generation Recovery` to focus on
separating provider vocabulary from native capability seeds and recovering
repo-owned candidate families.