# Characteristic And Evidence Model

The registry should treat a repository profile as a characteristic tree.

## Characteristics

A characteristic is an interpreted claim about a repository. The current concrete levels are:

- Scope: the single root characteristic for the repository.
- Ability: a high-level thing the repository is meant to enable.
- Capability: a more specific capacity that contributes to an ability.
- Feature: a concrete user-facing, operational, interface, or implementation feature that contributes to a capability.

The regular target shape is:

```text
Scope -> Ability -> Capability -> Feature -> Observed Fact
```

This regular tree is an orientation tool, not a claim that every real repository is perfectly tree-shaped. Cross references and same-level references can be useful during review, but they are also quality signals: frequent same-level feature references may indicate that features are too coarse, too fine, or organized under the wrong capability.

## Facts, Source References, And Evidence

Observed facts are deterministic scanner output. They describe what was seen in the repository: files, languages, frameworks, routes, tests, documentation, provider names, configuration variables, and similar source-linked observations.

Facts can carry a source role so generation can separate product evidence from ambient context. Important roles include:

- `intent_summary`: `INTENT.md` or equivalent design-intent material describing why the repository should exist and what utility it is meant to provide.
- `derived_scope`: `SCOPE.md` or equivalent current-scope material. This is a derived or curated description of what is believed to be true now, not primary evidence for rebuilding the same characteristic model.
- `product_documentation`: README, docs, specifications, and user-facing guides.
- `implementation_source`: source code owned by the repository.
- `dependency_declaration`: manifests, imports, lockfiles, and package metadata.
- `configuration`, `ci_tooling`, `test_evidence`, and `agent_guidance`.

`INTENT.md` and `SCOPE.md` deliberately answer different questions. Intent is a design artifact: what the repository is supposed to become or provide. Scope is a derived current-state artifact: what the repository is understood to provide after evidence and review. A good `SCOPE.md` is valuable context, but using it as ordinary evidence for generated characteristics creates a circular model. Rebuilds should therefore prefer `INTENT.md`, product documentation, source, and tests; `SCOPE.md` should be used as comparison material or explicit bootstrap input only when a curator chooses that mode.

For repositories that already have a useful `SCOPE.md` but no `INTENT.md`, repo-scoping can perform a one-time bootstrap by copying the scope text into a new intent file with a clear provenance note. After that bootstrap, the files should diverge naturally: `INTENT.md` remains design intent, while `SCOPE.md` remains generated or curated current scope.

Provider, dependency, and tooling facts should also carry a utility relationship. A provider mentioned in documentation is usually a `mention`; an environment variable is usually `configure`; a manifest entry is usually `dependency`; implementation code under provider or adapter modules may be `owned` or `adapter`. Candidate generation should promote only relationships that show the repository provides the utility directly or intentionally exposes it as a facade/adapter. Mentions, dependencies, configuration, and tooling are context until a curator promotes them or stronger owned evidence appears.

Trusted auto-approval applies the same rule. A candidate capability must have source references and an eligible utility relationship (`owned`, `facade`, or `adapter`) before it can be approved automatically. Dependency, tooling, configuration, and mention-only candidates remain review material.
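
The auto-approval rule above can be sketched as a small predicate. This is only an illustration under stated assumptions: the `Candidate` class, its field names, and the `review_decision` function are hypothetical and do not come from the registry's actual schema. Only the three eligible relationship names are taken from the text.

```python
from dataclasses import dataclass, field

# Relationships the text names as eligible for trusted auto-approval.
ELIGIBLE_RELATIONSHIPS = frozenset({"owned", "facade", "adapter"})


@dataclass
class Candidate:
    """Hypothetical candidate capability; field names are illustrative."""
    name: str
    utility_relationship: str  # e.g. "owned", "mention", "configure", "dependency"
    source_refs: list[str] = field(default_factory=list)


def review_decision(candidate: Candidate) -> tuple[bool, str]:
    """Return (auto_approved, reason) for one candidate capability."""
    # Rule 1: a candidate without source references is never auto-approved.
    if not candidate.source_refs:
        return False, "no source references; needs curator review"
    # Rule 2: only owned, facade, or adapter relationships are eligible.
    if candidate.utility_relationship not in ELIGIBLE_RELATIONSHIPS:
        return False, (
            f"relationship '{candidate.utility_relationship}' is context only; "
            "needs curator review"
        )
    return True, (
        f"'{candidate.utility_relationship}' relationship with "
        f"{len(candidate.source_refs)} source reference(s)"
    )
```

Returning a reason alongside the verdict is a design choice of this sketch, so that both approved and skipped candidates carry a short justification a curator can read.
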
The review decision should explain both sides: why approved candidates were considered safe and why skipped candidates need curator review.

`INTENT.md` may also seed intended capabilities when it contains an explicit capability section. These intent-derived candidates are marked as review required because intent says what the repository is meant to provide, not what has already been proven. `SCOPE.md` sections with the same wording are not treated as equivalent input during rebuilds, because scope is derived from the registry model being rebuilt.

The motivating failure mode was a key-cape-like repository whose agent guidance and generic backend-adapter vocabulary looked superficially like LLM provider routing. That pattern should produce source-linked facts for the files that exist, but it should not become an LLM-provider capability unless there is provider-specific owned, facade, or adapter evidence. The scanner and generator should solve this by provenance and utility relationship rules, not by hard-coding product names.

Source references point from interpreted claims back to files or facts.

Evidence is support for a characteristic. It is not the same thing as an observed fact. Evidence may reference:

- Observed facts.
- Source files or content chunks.
- Lower-level characteristics, such as a capability using features as evidence.

Evidence should usually point downward in abstraction. An ability can use capabilities or features as support. A capability can use features or facts as support. A feature should usually use facts or source references as support, not abilities or capabilities. Same-level evidence references are allowed as review material, but should be treated as a possible organization smell.

## Implementation Direction

The current schema still stores evidence on capabilities, with textual references and source refs. The next additive schema step should generalize this without breaking existing data:

- Add a scope root per repository.
- Add typed evidence targets: supported characteristic kind/id.
- Add typed evidence references: fact, source ref, content chunk, or characteristic kind/id.
- Keep legacy evidence fields until migration/export/search have been updated.

The UI should make this relationship clear by presenting evidence as support under the characteristic it supports, not as a peer of features.

## Rebuilds And Supersession

Use a normal analysis rerun when the existing approved map is mostly trustworthy and the goal is to compare new evidence against prior candidates. Use a rebuild from scratch when approved characteristics are polluted by a bad extraction pattern, stale after a major rename, or circularly derived from old scope text.

A dry-run rebuild should be the first step. It scans current source, generates a fresh candidate graph, and reports what approved abilities, capabilities, features, and evidence would be superseded. A confirmed rebuild preserves audit history by recording which approved IDs were superseded, then clears the current approved map and leaves the fresh candidate graph for review or trusted auto-approval.

Curators should treat superseded characteristics as historical claims, not as deleted facts. They explain what the registry used to believe and why a rebuild was chosen over incremental correction.
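
The dry-run and confirmed rebuild steps described in this section might look roughly like the following. This is a minimal sketch, not the registry's implementation: `RebuildReport`, `dry_run_rebuild`, `confirm_rebuild`, and the use of plain string IDs with an in-memory audit list are all assumptions made for illustration. It also assumes that a confirmed rebuild supersedes every approved ID, since the approved map is cleared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RebuildReport:
    """Hypothetical dry-run result; producing it modifies nothing."""
    superseded_ids: tuple[str, ...]
    fresh_candidate_ids: tuple[str, ...]


def dry_run_rebuild(approved_ids: set[str], fresh_candidate_ids: set[str]) -> RebuildReport:
    """Report which approved IDs a confirmed rebuild would supersede."""
    # Assumption: the approved map is cleared on confirmation, so every
    # approved ID is reported, whether or not the fresh graph overlaps it.
    return RebuildReport(
        superseded_ids=tuple(sorted(approved_ids)),
        fresh_candidate_ids=tuple(sorted(fresh_candidate_ids)),
    )


def confirm_rebuild(approved_ids: set[str], audit_log: list[str], report: RebuildReport) -> None:
    """Record superseded IDs for audit history, then clear the approved map."""
    audit_log.extend(report.superseded_ids)
    approved_ids.clear()
```

Keeping the report separate from the confirmation step means a curator can inspect what would be superseded before anything in the approved map changes, and the audit log retains the superseded IDs as historical claims afterward.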