# Characteristic And Evidence Model

The registry should treat a repository profile as a characteristic tree.

## Characteristics

A characteristic is an interpreted claim about a repository. The current concrete
levels are:

- Scope: the single root characteristic for the repository.
- Ability: a high-level thing the repository is meant to enable.
- Capability: a more specific capacity that contributes to an ability.
- Feature: a concrete user-facing, operational, interface, or implementation
  feature that contributes to a capability.

The regular target shape is:

```text
Scope -> Ability -> Capability -> Feature -> Observed Fact
```

This regular tree is an orientation tool, not a claim that every real repository
is perfectly tree-shaped. Cross references and same-level references can be
useful during review, but they are also quality signals: frequent same-level
feature references may indicate that features are too coarse, too fine, or
organized under the wrong capability.

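As an illustration, the regular shape and the same-level reference signal could be modeled as below. This is a minimal sketch in Python, not the registry's actual schema: `Level`, `Characteristic`, `cross_refs`, and `same_level_refs` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Level(IntEnum):
    """Characteristic levels, ordered from most to least abstract."""
    SCOPE = 0
    ABILITY = 1
    CAPABILITY = 2
    FEATURE = 3


@dataclass
class Characteristic:
    """An interpreted claim about a repository (hypothetical shape)."""
    id: str
    level: Level
    title: str
    children: list["Characteristic"] = field(default_factory=list)
    # IDs of characteristics referenced outside the parent/child tree.
    cross_refs: list[str] = field(default_factory=list)

    def add_child(self, child: "Characteristic") -> "Characteristic":
        # The regular shape only allows a child exactly one level below.
        if child.level != self.level + 1:
            raise ValueError(
                f"{child.level.name} cannot sit directly under {self.level.name}"
            )
        self.children.append(child)
        return child


def same_level_refs(root: Characteristic) -> list[tuple[str, str]]:
    """Return (from_id, to_id) pairs whose endpoints share a level."""
    by_id: dict[str, Characteristic] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        by_id[node.id] = node
        stack.extend(node.children)
    return [
        (node.id, ref)
        for node in by_id.values()
        for ref in node.cross_refs
        if ref in by_id and by_id[ref].level == node.level
    ]
```

A reviewer could treat a growing list from `same_level_refs` as a prompt to re-examine feature granularity or placement, rather than as an error in itself.
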
## Facts, Source References, And Evidence

Observed facts are deterministic scanner output. They describe what was seen in
the repository: files, languages, frameworks, routes, tests, documentation,
provider names, configuration variables, and similar source-linked observations.
Facts can carry a source role so generation can separate product evidence from
ambient context. Important roles include:

- `intent_summary`: `INTENT.md` or equivalent design-intent material describing
  why the repository should exist and what utility it is meant to provide.
- `derived_scope`: `SCOPE.md` or equivalent current-scope material. This is a
  derived or curated description of what is believed to be true now, not primary
  evidence for rebuilding the same characteristic model.
- `product_documentation`: README, docs, specifications, and user-facing guides.
- `implementation_source`: source code owned by the repository.
- `dependency_declaration`: manifests, imports, lockfiles, and package metadata.
- `configuration`, `ci_tooling`, `test_evidence`, and `agent_guidance`.

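The roles listed above could be carried on each fact roughly as follows. This is a sketch under assumptions: the role values come from the list, while `SourceRole`, `ObservedFact`, its fields, and `group_by_role` are hypothetical names, not the scanner's real types.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class SourceRole(str, Enum):
    """Source roles named in this document."""
    INTENT_SUMMARY = "intent_summary"
    DERIVED_SCOPE = "derived_scope"
    PRODUCT_DOCUMENTATION = "product_documentation"
    IMPLEMENTATION_SOURCE = "implementation_source"
    DEPENDENCY_DECLARATION = "dependency_declaration"
    CONFIGURATION = "configuration"
    CI_TOOLING = "ci_tooling"
    TEST_EVIDENCE = "test_evidence"
    AGENT_GUIDANCE = "agent_guidance"


@dataclass(frozen=True)
class ObservedFact:
    """One source-linked observation (field names are assumptions)."""
    path: str
    kind: str
    role: SourceRole


def group_by_role(facts):
    """Bucket facts by source role so later stages can weigh them separately."""
    grouped = defaultdict(list)
    for fact in facts:
        grouped[fact.role].append(fact)
    return dict(grouped)
```

Grouping by role keeps the scanner deterministic while leaving the decision about which roles count as product evidence to the generation and review stages.
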
`INTENT.md` and `SCOPE.md` deliberately answer different questions. Intent is a
design artifact: what the repository is supposed to become or provide. Scope is
a derived current-state artifact: what the repository is understood to provide
after evidence and review. A good `SCOPE.md` is valuable context, but using it
as ordinary evidence for generated characteristics creates a circular model.
Rebuilds should therefore prefer `INTENT.md`, product documentation, source, and
tests; `SCOPE.md` should be used as comparison material or explicit bootstrap
input only when a curator chooses that mode.

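One way to express that preference is a small input selector for rebuilds. The sketch below is illustrative only: `select_rebuild_inputs` and its `bootstrap_from_scope` flag are hypothetical, facts are reduced to `(role, path)` pairs, and roles the paragraph does not mention are simply left out here, which is an assumption rather than a stated rule.

```python
# Roles the paragraph above says rebuilds should prefer.
PREFERRED_REBUILD_ROLES = frozenset({
    "intent_summary",
    "product_documentation",
    "implementation_source",
    "test_evidence",
})


def select_rebuild_inputs(facts, bootstrap_from_scope=False):
    """Split (role, path) facts into evidence and comparison-only material.

    `bootstrap_from_scope` models the explicit curator choice to feed
    SCOPE.md back in as input; it is off by default to avoid circularity.
    """
    evidence, comparison = [], []
    for role, path in facts:
        if role in PREFERRED_REBUILD_ROLES:
            evidence.append(path)
        elif role == "derived_scope":
            # Derived scope is comparison material unless a curator opts in.
            (evidence if bootstrap_from_scope else comparison).append(path)
        # Other roles are not classified by this sketch.
    return evidence, comparison
```

Keeping the opt-in as an explicit argument makes the circular path visible at the call site instead of hiding it in configuration.
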
For repositories that already have a useful `SCOPE.md` but no `INTENT.md`,
repo-scoping can perform a one-time bootstrap by copying the scope text into a
new intent file with a clear provenance note. After that bootstrap, the files
should diverge naturally: `INTENT.md` remains design intent, while `SCOPE.md`
remains generated or curated current scope.

Provider, dependency, and tooling facts should also carry a utility
relationship. A provider mentioned in documentation is usually a `mention`; an
environment variable is usually `configure`; a manifest entry is usually
`dependency`; implementation code under provider or adapter modules may be
`owned` or `adapter`. Candidate generation should promote only relationships
that show the repository provides the utility directly or intentionally exposes
it as a facade/adapter. Mentions, dependencies, configuration, and tooling are
context until a curator promotes them or stronger owned evidence appears.

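The promotion rule can be sketched as a lookup plus a predicate. The relationship values are the ones named in the paragraph; `Utility`, `USUAL`, the source-kind keys, `usual_relationship`, and `should_promote` are hypothetical. Implementation code has no entry in the lookup on purpose, since the text says it "may be" `owned` or `adapter` and that call needs evidence rather than a path-name guess.

```python
from enum import Enum


class Utility(str, Enum):
    """Utility relationships named in this document."""
    MENTION = "mention"
    CONFIGURE = "configure"
    DEPENDENCY = "dependency"
    OWNED = "owned"
    ADAPTER = "adapter"


# Only these show the repository provides or intentionally exposes the utility.
PROMOTABLE = frozenset({Utility.OWNED, Utility.ADAPTER})

# Usual relationship per source kind (the keys are assumed labels).
USUAL = {
    "documentation": Utility.MENTION,
    "environment_variable": Utility.CONFIGURE,
    "manifest_entry": Utility.DEPENDENCY,
}


def usual_relationship(source_kind):
    """Return the usual relationship, or None when no single default applies."""
    return USUAL.get(source_kind)


def should_promote(relationship, curator_promoted=False):
    """Promote owned/adapter evidence, or anything a curator explicitly promotes."""
    return curator_promoted or relationship in PROMOTABLE
```

For example, `should_promote(usual_relationship("manifest_entry"))` is false, so a bare manifest entry stays context until a curator or stronger owned evidence changes that.
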
Deterministic quality gates apply the same source and utility relationship
signals, but they do not approve automatically. Gates may reject, downgrade,
invalidate, flag, merge, or require review. Approval requires human judgement or
a configured agentic reviewer that records evidence, criteria version, and
rationale. Dependency, tooling, configuration, and mention-only candidates remain
review material.

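A sketch of how "gates never approve" might be enforced by construction: the outcome type has no approve member, and approval is a separate step that requires a review record. All names here (`GateOutcome`, `context_only_gate`, `ReviewRecord`, `approve`) and the `tooling` label are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class GateOutcome(str, Enum):
    """Outcomes a deterministic gate may return. There is no APPROVE member."""
    REJECT = "reject"
    DOWNGRADE = "downgrade"
    INVALIDATE = "invalidate"
    FLAG = "flag"
    MERGE = "merge"
    REQUIRE_REVIEW = "require_review"


# Relationships that remain review material on their own.
CONTEXT_ONLY = frozenset({"dependency", "tooling", "configure", "mention"})


def context_only_gate(relationships: set) -> Optional[GateOutcome]:
    """Require review when nothing beyond context relationships is present.

    Returning None means this gate raises no objection; it is not an approval.
    """
    if relationships <= CONTEXT_ONLY:
        return GateOutcome.REQUIRE_REVIEW
    return None


@dataclass(frozen=True)
class ReviewRecord:
    """What a human or configured agentic reviewer must record."""
    reviewer: str
    evidence_ids: tuple
    criteria_version: str
    rationale: str


def approve(candidate_id: str, review: ReviewRecord) -> dict:
    """Approve only when evidence, criteria version, and rationale are recorded."""
    if not (review.evidence_ids and review.criteria_version and review.rationale.strip()):
        raise ValueError("approval needs evidence, a criteria version, and a rationale")
    return {
        "candidate_id": candidate_id,
        "status": "approved",
        "reviewer": review.reviewer,
        "criteria_version": review.criteria_version,
    }
```

Separating the two paths means a gate bug can at worst block or flag a candidate; it cannot silently approve one.
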
`INTENT.md` may also seed intended capabilities when it contains an explicit
capability section. These intent-derived candidates are marked as review
required because intent says what the repository is meant to provide, not what
has already been proven. `SCOPE.md` sections with the same wording are not
treated as equivalent input during rebuilds, because scope is derived from the
registry model being rebuilt.

The motivating failure mode was a key-cape-like repository whose agent guidance
and generic backend-adapter vocabulary looked superficially like LLM provider
routing. That pattern should produce source-linked facts for the files that
exist, but it should not become an LLM-provider capability unless there is
provider-specific owned, facade, or adapter evidence. The scanner and generator
should solve this by provenance and utility relationship rules, not by
hard-coding product names.

Source references point from interpreted claims back to files or facts.

Evidence is support for a characteristic. It is not the same thing as an observed
fact. Evidence may reference:

- Observed facts.
- Source files or content chunks.
- Lower-level characteristics, such as a capability using features as evidence.

Evidence should usually point downward in abstraction. An ability can use
capabilities or features as support. A capability can use features or facts as
support. A feature should usually use facts or source references as support, not
abilities or capabilities.

Same-level evidence references are allowed as review material, but should be
treated as a possible organization smell.

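The direction rule lends itself to a small classifier. In this sketch the rank table and the `evidence_direction` function are hypothetical, and placing source references and content chunks at the same rank as facts is an assumption.

```python
# Lower rank means more abstract. Facts, source references, and content
# chunks share the lowest position (an assumption of this sketch).
RANK = {
    "scope": 0,
    "ability": 1,
    "capability": 2,
    "feature": 3,
    "fact": 4,
    "source_ref": 4,
    "content_chunk": 4,
}


def evidence_direction(supported_kind: str, reference_kind: str) -> str:
    """Classify an evidence link from the supported characteristic's view."""
    supported = RANK[supported_kind]
    reference = RANK[reference_kind]
    if reference > supported:
        return "downward"    # the usual, expected direction
    if reference == supported:
        return "same_level"  # allowed, but a possible organization smell
    return "upward"          # e.g. a feature citing a capability
```

A review UI could surface `same_level` and `upward` links for attention without blocking them, which matches the "usually" wording above.
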
## Implementation Direction

The current schema still stores evidence on capabilities, with textual
references and source refs. The next additive schema step should generalize this
without breaking existing data:

- Add a scope root per repository.
- Add typed evidence targets: supported characteristic kind/id.
- Add typed evidence references: fact, source ref, content chunk, or
  characteristic kind/id.
- Keep legacy evidence fields until migration/export/search have been updated.

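The additive step listed above might look roughly like this in application code. It is a sketch under assumptions, not the registry's schema: every class, field, and function name (`EvidenceTarget`, `EvidenceReference`, `Evidence`, `legacy_text`, `legacy_source_refs`, `from_legacy`) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

CHARACTERISTIC_KINDS = ("scope", "ability", "capability", "feature")
REFERENCE_TYPES = ("fact", "source_ref", "content_chunk", "characteristic")


@dataclass(frozen=True)
class EvidenceTarget:
    """The characteristic that a piece of evidence supports."""
    kind: str
    id: str

    def __post_init__(self):
        if self.kind not in CHARACTERISTIC_KINDS:
            raise ValueError(f"unknown characteristic kind: {self.kind}")


@dataclass(frozen=True)
class EvidenceReference:
    """What the evidence points at: a fact, source ref, chunk, or characteristic."""
    ref_type: str
    ref_id: str
    characteristic_kind: Optional[str] = None

    def __post_init__(self):
        if self.ref_type not in REFERENCE_TYPES:
            raise ValueError(f"unknown reference type: {self.ref_type}")
        is_characteristic = self.ref_type == "characteristic"
        if is_characteristic and self.characteristic_kind not in CHARACTERISTIC_KINDS:
            raise ValueError("characteristic references need a valid kind")
        if not is_characteristic and self.characteristic_kind is not None:
            raise ValueError("only characteristic references carry a kind")


@dataclass
class Evidence:
    target: EvidenceTarget
    references: list = field(default_factory=list)
    # Legacy fields are kept until migration/export/search are updated.
    legacy_text: Optional[str] = None
    legacy_source_refs: list = field(default_factory=list)


def from_legacy(capability_id: str, text: str, source_refs: list) -> Evidence:
    """Wrap capability-level legacy evidence in the typed shape, losing nothing."""
    return Evidence(
        target=EvidenceTarget("capability", capability_id),
        references=[EvidenceReference("source_ref", ref) for ref in source_refs],
        legacy_text=text,
        legacy_source_refs=list(source_refs),
    )
```

Because `from_legacy` copies the old fields alongside the typed ones, existing export and search code could keep reading the legacy values while new code adopts the typed references.
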
The UI should make this relationship clear by presenting evidence as support
under the characteristic it supports, not as a peer of features.

## Rebuilds And Supersession

Use a normal analysis rerun when the existing approved map is mostly trustworthy
and the goal is to compare new evidence against prior candidates. Use a rebuild
from scratch when approved characteristics are polluted by a bad extraction
pattern, stale after a major rename, or circularly derived from old scope text.

A dry-run rebuild should be the first step. It scans current source, generates a
fresh candidate graph, and reports what approved abilities, capabilities,
features, and evidence would be superseded. A confirmed rebuild preserves audit
history by recording which approved IDs were superseded, then clears the current
approved map and leaves the fresh candidate graph for review or trusted
auto-approval.

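The dry-run and confirmed paths can share one function that differs only in whether it mutates state. The following is a minimal sketch assuming an in-memory registry; `Registry`, its fields, and `rebuild` are hypothetical names, and scanning and candidate generation are taken as already done.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """In-memory stand-in for the registry (an assumption of this sketch)."""
    approved: dict                                  # approved id -> kind
    candidates: dict = field(default_factory=dict)  # candidate id -> kind
    superseded_log: list = field(default_factory=list)


def rebuild(registry: Registry, fresh_candidates: dict, confirm: bool = False) -> dict:
    """Report what a rebuild would supersede; apply it only when confirmed."""
    report = {
        "would_supersede": sorted(registry.approved),
        "fresh_candidates": sorted(fresh_candidates),
        "applied": False,
    }
    if not confirm:
        return report  # dry run: nothing is modified

    # Record the superseded IDs first so audit history survives the clear.
    registry.superseded_log.append({"superseded_ids": report["would_supersede"]})
    registry.approved = {}
    registry.candidates = dict(fresh_candidates)
    report["applied"] = True
    return report
```

Calling `rebuild(registry, fresh)` first and `rebuild(registry, fresh, confirm=True)` only after reading the report keeps the destructive step behind an explicit decision.
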
Curators should treat superseded characteristics as historical claims, not as
deleted facts. They explain what the registry used to believe and why a rebuild
was chosen over incremental correction.