# Self-Scoping Assessment Artifacts

This directory contains repo-scoping's own baseline and assessment artifacts.
These files make scoping-engine changes comparable across releases, so reviewers
do not have to rely on memory or screenshots.

## Artifact Types

- `golden/repo-scoping-golden-profile.v1.json` is the curated target profile for
  repo-scoping itself.
- `assessments/repo-scoping-known-bad-2026-05-15-run-39.json` captures the
  known-bad self-analysis that promoted LLM-provider vocabulary into native
  repo-scoping capability truth.
- `assessments/repo-scoping-post-wp0015-clean-2026-05-15.json` captures the
  first clean, release-bound deterministic challenger after acceptance-boundary
  and input-hygiene work. It remains a rejected regression because candidate
  generation still collapses repo-scoping's native surfaces under the forbidden
  provider-routing capability, but its source set no longer includes
  `var/checkouts/` contamination.
- `workflow.md` explains how to run challenger assessments, interpret outcomes,
  and decide whether to update the golden profile or fix the engine.
- `outcomes/` stores append-only reviewer decisions created from side-by-side
  comparisons.
- `../schemas/self-scoping-assessment.schema.json` defines the immutable
  assessment-run artifact shape.

## Release Binding

Comparable assessment artifacts must bind generated results to the repo-scoping
engine release that produced them. A complete binding records package version,
engine git commit or release tag, dirty state, scanner version, candidate
generator version, quality criteria version, and prompt version when applicable.

The current known-bad artifact is marked `historical_incomplete` because the
original database run did not record the engine commit. It remains useful as a
negative regression seed, but future challenger artifacts should be fully bound
before they are accepted as comparable baselines.
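
As a rough illustration of the binding rule above, the sketch below models a
binding record and reports which parts are still missing. The class name, field
names, and the `uses_prompts` parameter are assumptions made for this example;
the authoritative shape is whatever the assessment schema defines.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class ReleaseBinding:
    """Hypothetical binding record; field names are illustrative only."""

    package_version: Optional[str] = None
    engine_ref: Optional[str] = None  # git commit or release tag
    dirty: Optional[bool] = None  # whether the working tree had local changes
    scanner_version: Optional[str] = None
    candidate_generator_version: Optional[str] = None
    quality_criteria_version: Optional[str] = None
    prompt_version: Optional[str] = None  # only required when prompts were used


def missing_binding_fields(
    binding: ReleaseBinding, uses_prompts: bool = False
) -> list[str]:
    """Return the names of binding fields that were never recorded."""
    # The prompt version is only part of a complete binding when applicable.
    not_required = set() if uses_prompts else {"prompt_version"}
    return [
        f.name
        for f in fields(binding)
        if f.name not in not_required and getattr(binding, f.name) is None
    ]
```

Under these assumptions, an artifact whose `engine_ref` was never recorded would
report `engine_ref` as missing, which is the situation the known-bad artifact is
in.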

## Review Use

When the engine changes, run repo-scoping against itself and export a challenger
assessment. Compare the challenger to the golden profile and to the negative
seed. Reviewers should be able to choose whether the old result, new result, or
neither is better, then store that judgement as a new assessment outcome.
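
One way to keep reviewer decisions append-only is to write each decision to a
fresh file and refuse to overwrite an existing one. The sketch below shows that
idea only; the function name, file naming scheme, and JSON keys are assumptions
and do not describe how repo-scoping actually stores outcomes.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_outcome(outcomes_dir: Path, preferred: str, note: str = "") -> Path:
    """Write one reviewer decision as a new file without replacing old ones."""
    # Hypothetical preference values mirroring "old, new, or neither".
    if preferred not in {"old", "new", "neither"}:
        raise ValueError(f"unexpected preference: {preferred!r}")

    outcomes_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    path = outcomes_dir / f"outcome-{stamp}.json"
    payload = {"preferred": preferred, "note": note, "recorded_at": stamp}

    # Mode "x" raises FileExistsError instead of overwriting an existing file.
    with path.open("x", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)
    return path
```

Exclusive-create mode is used here so that an accidental name collision fails
loudly rather than silently rewriting an earlier decision.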

The curator UI exposes this loop at `/ui/self-scoping`. It reads the golden and
assessment JSON files from this directory, highlights missing, forbidden, and
misplaced hierarchy entries, and records reviewer preference without mutating
the compared artifacts. The same page can compare two assessment runs directly
so reviewers can choose whether the old baseline or new challenger is better.

## Export Command

Export a completed analysis run as a challenger artifact:

```bash
repo-scoping export-assessment \
  --repo repo-scoping \
  --analysis-run 39 \
  --output docs/self-scoping/assessments/repo-scoping-challenger-run-39.json
```

The command reads an existing registry database and does not clone or scan the
target repository. It records the target analysis metadata, candidate graph,
approved map at export time, review decisions, fact and content summaries, known
regression patterns, and current repo-scoping engine identity.

Compare an assessment against the curated golden profile:

```bash
repo-scoping compare-assessment \
  --golden docs/self-scoping/golden/repo-scoping-golden-profile.v1.json \
  --assessment docs/self-scoping/assessments/repo-scoping-known-bad-2026-05-15-run-39.json \
  --format markdown
```

The first comparison report highlights missing expected capabilities, forbidden
native capabilities, known regression patterns, and misplaced API/CLI features.
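
The checks named above can be pictured as set operations over the two files. The
following is a minimal sketch, not the real comparison logic: every dictionary
key (`expected_capabilities`, `forbidden_capabilities`, `capabilities`,
`feature_parents`) is a hypothetical name chosen for the example, and known
regression patterns are left out.

```python
def compare_profile(golden: dict, assessment: dict) -> dict[str, list[str]]:
    """Compare an assessment against a golden profile (illustrative keys)."""
    expected = set(golden["expected_capabilities"])
    forbidden = set(golden["forbidden_capabilities"])
    observed = set(assessment["capabilities"])

    # A feature is misplaced when both sides list it under different parents.
    expected_parent = golden["feature_parents"]
    observed_parent = assessment["feature_parents"]
    misplaced = sorted(
        feature
        for feature, parent in observed_parent.items()
        if feature in expected_parent and expected_parent[feature] != parent
    )

    return {
        "missing": sorted(expected - observed),
        "forbidden": sorted(forbidden & observed),
        "misplaced": misplaced,
    }
```

A result with all three lists empty would correspond to a challenger that
matches the golden profile on these dimensions.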

Run the full self-assessment loop:

```bash
repo-scoping self-assess \
  --source-path . \
  --assessment-output docs/self-scoping/assessments/repo-scoping-challenger.json \
  --comparison-output docs/self-scoping/assessments/repo-scoping-challenger.md
```

By default this command is deterministic-only and leaves generated candidates
pending review. Add `--with-llm` only when a provider is configured and the run
should include LLM-assisted candidate extraction. Add `--fail-on-regression` in
CI when known regressions should fail the command; ordinary `needs_review`
comparisons still exit successfully.
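
For CI, a small wrapper can assemble the invocation and pass the exit status
through. This sketch uses only the flags shown in this document; the helper
names and keyword parameters are assumptions, and it presumes that a failing
command returns a non-zero exit status.

```python
import subprocess


def build_self_assess_command(
    source_path: str,
    assessment_output: str,
    comparison_output: str,
    fail_on_regression: bool = False,
    with_llm: bool = False,
) -> list[str]:
    """Assemble the self-assess argument list from the documented flags."""
    command = [
        "repo-scoping",
        "self-assess",
        "--source-path", source_path,
        "--assessment-output", assessment_output,
        "--comparison-output", comparison_output,
    ]
    if with_llm:
        command.append("--with-llm")  # only when a provider is configured
    if fail_on_regression:
        command.append("--fail-on-regression")  # intended for CI gating
    return command


def run_self_assess(command: list[str]) -> int:
    """Run the command and return its exit status for the CI job to act on."""
    return subprocess.run(command, check=False).returncode
```

A CI job could call `run_self_assess(build_self_assess_command(...,
fail_on_regression=True))` and fail the build when the result is non-zero.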