# Repository Scoping Terminology

Repository Scoping turns repositories into reviewable, source-linked orientation
maps. The goal is not to infer every possible product story automatically; it is
to give humans and trusted agents a durable structure for understanding what a
repository is for and how that claim is supported.

## Product Identity

- Repository Scoping is the product and UI name.
- `repo-scoping` is the managed repository slug, Git remote identity, and State
  Hub repository identity.
- `repo_registry`, `REPO_REGISTRY_`, and `var/repo-registry.sqlite3` are retained
  compatibility names in code and local configuration.
- Repository Ability Registry and `repo-registry` are historical names from
  before the scope-oriented rename.

## Characteristic Model

A characteristic is any curated statement about a repository at one of the main
abstraction levels. The preferred orientation is a mostly tree-shaped model:

```text
Scope -> Ability -> Capability -> Feature -> Evidence -> Observed fact
```

Real repositories are messier than a perfect tree. Evidence may therefore refer
to facts or to lower-level characteristics. Same-level references are allowed
when useful, but they also signal that the hierarchy may need manual
normalization.

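As a minimal sketch of the model above, the following Python classifies an evidence link by comparing the level of the characteristic that owns it with the level of its target. The names `Level`, `Evidence`, `classify_evidence`, and `needs_normalization_review` are hypothetical and not taken from the codebase. Flagging links that point to a higher level is an assumption added for illustration; the text above only calls out same-level references.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    """Characteristic levels, ordered from most abstract to most concrete."""
    SCOPE = 0
    ABILITY = 1
    CAPABILITY = 2
    FEATURE = 3


@dataclass(frozen=True)
class Evidence:
    """A support link owned by a characteristic (hypothetical shape)."""
    owner_level: Level
    # None means the link points at an observed fact, not a characteristic.
    target_level: Optional[Level]


def classify_evidence(link: Evidence) -> str:
    """Describe where an evidence link points relative to its owner."""
    if link.target_level is None:
        return "fact"
    if link.target_level > link.owner_level:
        return "lower-level"
    if link.target_level == link.owner_level:
        return "same-level"
    return "higher-level"


def needs_normalization_review(link: Evidence) -> bool:
    """Same-level links are allowed but hint at hierarchy cleanup.

    Treating higher-level links the same way is an assumption of this sketch.
    """
    return classify_evidence(link) in {"same-level", "higher-level"}
```

With this shape, an ability supported by a capability is classified as `lower-level`, while a capability supported by another capability is classified as `same-level` and surfaces for manual review.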
## Terms

- Scope: the one root characteristic describing what the repository is about and
  where it is relevant.
- Ability: a high-level useful outcome the repository can provide.
- Capability: a more concrete thing the repository can do in support of an
  ability.
- Feature: a user-facing, API-facing, backend, UI, or operational behavior that
  contributes to a capability.
- Evidence: a support link for a characteristic. Evidence can point to observed
  facts or to lower-level characteristics.
- Observed fact: deterministic scanner output such as files, manifests,
  languages, tests, APIs, routes, commands, or documentation references.
- Intent: a design-time statement of expected repository utility. `INTENT.md`
  is the preferred file for this. It can guide candidate generation because it
  describes why the repository should exist.
- Derived scope: a current-state statement of what the repository is understood
  to provide. `SCOPE.md` is the preferred file for this. It is generated or
  curated from evidence and approved characteristics, so it should not be used
  as ordinary evidence for rebuilding those same characteristics.
- Intent bootstrap: a one-time migration that creates `INTENT.md` from an
  existing `SCOPE.md` when no intent file exists. The generated file carries a
  provenance note and should be reviewed as design intent.
- Source role: provenance metadata on a fact or content chunk, such as
  `intent_summary`, `derived_scope`, `product_documentation`,
  `implementation_source`, `dependency_declaration`, `configuration`,
  `ci_tooling`, `test_evidence`, or `agent_guidance`.
- Utility relationship: metadata describing how a fact relates to repository
  utility. Only `owned`, `facade`, and explicit `adapter` relationships should
  be promoted directly into provided capabilities or made eligible for trusted
  auto-approval.
- Owned capability: utility the repository provides through its own product
  behavior, source, interface, or documented intent.
- Facade capability: utility intentionally exposed through this repository even
  though important work is delegated elsewhere. Public wrapper APIs, CLI
  commands, or product documentation should make the facade role explicit.
- Adapter capability: utility that connects callers to another implementation
  through repository-owned adapter code. Generic use of the word "adapter" is
  not enough; the adapter needs source-linked evidence for the capability being
  exposed.
- Consumer/configuration relationship: evidence that the repository uses or
  configures something, such as an environment variable or client dependency,
  without itself providing that utility.
- Dependency relationship: evidence from manifests, imports, lockfiles, or
  package metadata. Dependencies belong in evidence or "How It Fits" context
  unless a curator promotes them.
- Tooling-context relationship: build, CI, release, or agent-operating context.
  Tooling can explain how the repository is worked on, but it should not define
  product capabilities on its own.
- Mention relationship: ambient text that names a provider, framework, sibling
  repo, or product concept without showing that the repository provides it.
- Candidate: a proposed characteristic or piece of evidence produced by
  deterministic heuristics or optional LLM assistance. Candidates are review
  inputs, not registry truth.
- Approved: curated registry truth that appears in ability maps, search, exports,
  and `SCOPE.md` generation.
- Rejected: a candidate judged false or irrelevant. Rejected entries are hidden
  by default but retained for audit and recovery.
- Rebuild from scratch: an explicit operation that regenerates candidates from
  current source after approved characteristics have become polluted or stale.
  Run it as a dry run first; confirmed rebuilds preserve audit history.
- Supersede: to mark prior approved characteristics as replaced by a rebuild or
  review correction. Superseded entries are kept to explain historical registry
  state rather than deleted.
- Classification: a main type plus optional additional attributes that help
  users filter and orient without forcing every item into a single rigid box.
- Dependency: a directed edge showing that one fact or characteristic affects
  another. Edges record type, strength, source, ownership, and whether the edge
  stays within the same layer.
- Staleness: a freshness state assigned when an upstream dependency changes and
  a downstream characteristic may no longer be current.
- Recalculation: an automated refresh of deterministic or mixed derived content
  after upstream changes. Curator-owned claims should be reviewed before the new
  value becomes approved registry truth.
- Propagation rate: the breadth and depth of downstream impact from changed
  inputs. High propagation can indicate rapid discovery, weak normalization, or
  brittle conceptual boundaries; it is a signal for review, not a score by
  itself.

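The utility-relationship terms above can be read as a promotion gate. The sketch below is one possible encoding, not the project's implementation. Only the values `owned`, `facade`, and `adapter` appear in the terminology; the other enum values, the function `can_promote_directly`, and the `has_source_linked_evidence` flag are hypothetical names chosen for illustration.

```python
from enum import Enum


class UtilityRelationship(Enum):
    OWNED = "owned"
    FACADE = "facade"
    ADAPTER = "adapter"
    # The string values below are assumed spellings, not confirmed identifiers.
    CONSUMER_CONFIGURATION = "consumer_configuration"
    DEPENDENCY = "dependency"
    TOOLING_CONTEXT = "tooling_context"
    MENTION = "mention"


def can_promote_directly(
    relationship: UtilityRelationship,
    *,
    has_source_linked_evidence: bool = False,
) -> bool:
    """Return True if a fact may become a provided capability without a curator.

    Adapters qualify only when they are explicit, modeled here as having
    source-linked evidence for the capability being exposed.
    """
    if relationship in (UtilityRelationship.OWNED, UtilityRelationship.FACADE):
        return True
    if relationship is UtilityRelationship.ADAPTER:
        return has_source_linked_evidence
    # Consumer/configuration, dependency, tooling, and mention relationships
    # stay as context unless a curator promotes them.
    return False
```

Under these assumptions a bare mention of a framework never becomes a capability automatically, and a module that merely has "adapter" in its name is held back until evidence is attached.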
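Dependency edges, staleness, and propagation rate fit together as a graph walk. The following is a rough sketch under stated assumptions: edges are plain `(upstream, downstream)` pairs, node identifiers such as `fact:pyproject` are made up, and `propagate_staleness` and `propagation_summary` are hypothetical helpers. Edge type, strength, and ownership are ignored here.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, Set, Tuple


def propagate_staleness(
    edges: Iterable[Tuple[str, str]],
    changed: Set[str],
) -> Dict[str, int]:
    """Return each downstream node that may be stale, with its hop distance.

    Breadth-first search guarantees the recorded distance is the shortest
    path from any changed input.
    """
    downstream = defaultdict(list)
    for upstream, target in edges:
        downstream[upstream].append(target)

    distance: Dict[str, int] = {}
    seen = set(changed)
    queue = deque((node, 0) for node in sorted(changed))
    while queue:
        node, depth = queue.popleft()
        for target in downstream.get(node, ()):
            if target not in seen:
                seen.add(target)
                distance[target] = depth + 1
                queue.append((target, depth + 1))
    return distance


def propagation_summary(distance: Dict[str, int]) -> Tuple[int, int]:
    """Return (breadth, depth): how many nodes were hit and how far it went."""
    return len(distance), max(distance.values(), default=0)


edges = [
    ("fact:pyproject", "feature:cli"),
    ("feature:cli", "capability:scan"),
    ("capability:scan", "ability:orient"),
    ("fact:readme", "feature:docs"),
]
stale = propagate_staleness(edges, {"fact:pyproject"})
breadth, depth = propagation_summary(stale)
```

In this example a change to one fact reaches three characteristics across three levels. Consistent with the definition above, the two numbers are a prompt for review rather than a score.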
## Extraction Philosophy

Deterministic scanning should remain useful without LLM support. Optional LLM
assistance is used as a comparison and acceleration layer: when model-assisted
expectations reveal missing concepts, the deterministic scanner and heuristics
should be improved over time. This creates a feedback loop where repository
inspection, manual curation, and optional model output co-evolve.
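One simple way to picture the comparison step is a set difference between what the deterministic scanner found and what a model expected to see. This is a sketch under assumptions: concepts are represented as plain strings, `scanner_gaps` is a hypothetical helper, and the case-insensitive matching is an illustrative choice.

```python
from typing import Iterable, List


def scanner_gaps(
    deterministic: Iterable[str],
    model_expected: Iterable[str],
) -> List[str]:
    """List concepts the model expected that the scanner did not produce.

    Each gap is a hint for improving heuristics, not an approved claim.
    """
    found = {concept.strip().casefold() for concept in deterministic}
    expected = {concept.strip().casefold() for concept in model_expected}
    return sorted(expected - found)


gaps = scanner_gaps(
    deterministic=["CLI commands", "HTTP routes"],
    model_expected=["cli commands", "HTTP routes", "export formats"],
)
```

Here the only gap is `export formats`, which would be a candidate for a new deterministic heuristic. If no model output is available, the expected set is empty and the function returns no gaps, so the scanner's results stand on their own.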