Repository Scoping Terminology

Repository Scoping turns repositories into reviewable, source-linked orientation maps. The goal is not to infer every possible product story automatically; it is to give humans and trusted agents a durable structure for understanding what a repository is for and how that claim is supported.

Product Identity

  • Repository Scoping is the product and UI name.
  • repo-scoping is the managed repository slug, Git remote identity, and State Hub repository identity.
  • repo_registry, REPO_REGISTRY_, and var/repo-registry.sqlite3 are retained compatibility names in code and local configuration.
  • Repository Ability Registry and repo-registry are historical names from before the scope-oriented rename.

Characteristic Model

A characteristic is any curated statement about a repository at one of the main abstraction levels. The preferred orientation is a mostly tree-shaped model:

Scope -> Ability -> Capability -> Feature -> Evidence -> Observed fact

Real repositories are messier than a perfect tree. Evidence may therefore refer to observed facts or to lower-level characteristics. Same-level references are allowed when useful, but they also signal that the hierarchy may need manual normalization.
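As a minimal sketch of the model above, the characteristic levels can be treated as an ordered enumeration and each evidence link classified by where it points. The names `Level`, `EvidenceLink`, `classify_link`, and `suggests_normalization` are hypothetical and not taken from the repo-scoping code. The treatment of upward references as a review signal is also an assumption, since the text only discusses same-level ones.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    """Characteristic levels, ordered from most abstract to most concrete."""
    SCOPE = 0
    ABILITY = 1
    CAPABILITY = 2
    FEATURE = 3


@dataclass(frozen=True)
class EvidenceLink:
    """Hypothetical evidence link from a characteristic to its support."""
    subject_level: Level
    # None means the target is an observed fact, not a characteristic.
    target_level: Optional[Level]


def classify_link(link: EvidenceLink) -> str:
    """Describe how an evidence link fits the mostly tree-shaped model."""
    if link.target_level is None:
        return "fact"
    if link.target_level > link.subject_level:
        return "lower-level"   # the expected tree direction
    if link.target_level == link.subject_level:
        return "same-level"    # allowed, but a normalization signal
    return "upward"            # assumption: not covered by the text, so flagged


def suggests_normalization(link: EvidenceLink) -> bool:
    """Flag links that hint the hierarchy may need manual normalization."""
    return classify_link(link) in {"same-level", "upward"}
```

For example, an ability supported by a capability classifies as `lower-level`, while a capability supported by another capability classifies as `same-level` and would be surfaced for review.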

Terms

  • Scope: the one root characteristic describing what the repository is about and where it is relevant.
  • Ability: a high-level useful outcome the repository can provide.
  • Capability: a more concrete thing the repository can do in support of an ability.
  • Feature: a user-facing, API-facing, backend, UI, or operational behavior that contributes to a capability.
  • Evidence: a support link for a characteristic. Evidence can point to observed facts or to lower-level characteristics.
  • Observed fact: deterministic scanner output such as files, manifests, languages, tests, APIs, routes, commands, or documentation references.
  • Intent: a design-time statement of expected repository utility. INTENT.md is the preferred file for this. It can guide candidate generation because it describes why the repository should exist.
  • Derived scope: a current-state statement of what the repository is understood to provide. SCOPE.md is the preferred file for this. It is generated or curated from evidence and approved characteristics, so it should not be used as ordinary evidence for rebuilding those same characteristics.
  • Intent bootstrap: a one-time migration that creates INTENT.md from an existing SCOPE.md when no intent file exists. The generated file carries a provenance note and should be reviewed as design intent.
  • Source role: provenance metadata on a fact or content chunk, such as intent_summary, derived_scope, product_documentation, implementation_source, dependency_declaration, configuration, ci_tooling, test_evidence, or agent_guidance.
  • Utility relationship: metadata describing how a fact relates to repository utility. Only owned, facade, and explicit adapter relationships should be promoted directly into provided capabilities or trusted auto-approval.
  • Owned capability: utility the repository provides through its own product behavior, source, interface, or documented intent.
  • Facade capability: utility intentionally exposed through this repository even though important work is delegated elsewhere. Public wrapper APIs, CLI commands, or product documentation should make the facade role explicit.
  • Adapter capability: utility that connects callers to another implementation through repository-owned adapter code. Generic use of the word adapter is not enough; the adapter needs source-linked evidence for the capability being exposed.
  • Consumer/configuration relationship: evidence that the repository uses or configures something, such as an environment variable or client dependency, without itself providing that utility.
  • Dependency relationship: evidence from manifests, imports, lockfiles, or package metadata. Dependencies belong in evidence or "How It Fits" context unless a curator promotes them.
  • Tooling-context relationship: build, CI, release, or agent-operating context. Tooling can explain how the repository is worked on, but should not define product capabilities by itself.
  • Mention relationship: ambient text that names a provider, framework, sibling repo, or product concept without showing that the repository provides it.
  • Candidate: proposed characteristic or evidence from deterministic heuristics or optional LLM assistance. Candidates are review inputs, not registry truth.
  • Approved: curated registry truth that appears in ability maps, search, exports, and SCOPE generation.
  • Rejected: a candidate judged false or irrelevant. Rejected entries are hidden by default but retained for audit and recovery.
  • Rebuild from scratch: an explicit operation that regenerates candidates from current source after approved characteristics have become polluted or stale. Dry-run first; confirmed rebuilds preserve audit history.
  • Supersede: mark prior approved characteristics as replaced by a rebuild or review correction. Superseded entries explain historical registry state rather than disappearing.
  • Classification: a main type plus optional additional attributes that help users filter and orient without forcing every item into a single rigid box.
  • Dependency: a directed edge showing that one fact or characteristic affects another. Edges record type, strength, source, ownership, and whether the edge stays within the same layer.
  • Staleness: a freshness state assigned when an upstream dependency changes and a downstream characteristic may no longer be current.
  • Recalculation: an automated refresh of deterministic or mixed derived content after upstream changes. Curator-owned claims should be reviewed before the new value becomes approved registry truth.
  • Propagation rate: the breadth and depth of downstream impact from changed inputs. High propagation can indicate rapid discovery, weak normalization, or brittle conceptual boundaries; it is a signal for review, not a score by itself.
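The promotion rule in the utility relationship entries above can be sketched as a small gate. This is an illustration only: the enum members mirror the relationship names in this glossary, but their string values, the `can_promote_directly` function, and the `has_source_linked_evidence` flag are hypothetical and may not match the real code.

```python
from enum import Enum


class UtilityRelationship(Enum):
    """Relationship kinds named in the glossary; string values are assumed."""
    OWNED = "owned"
    FACADE = "facade"
    ADAPTER = "adapter"
    CONSUMER_CONFIGURATION = "consumer_configuration"
    DEPENDENCY = "dependency"
    TOOLING_CONTEXT = "tooling_context"
    MENTION = "mention"


# Only these kinds may be promoted directly into provided capabilities.
DIRECTLY_PROMOTABLE = frozenset({
    UtilityRelationship.OWNED,
    UtilityRelationship.FACADE,
    UtilityRelationship.ADAPTER,
})


def can_promote_directly(
    relationship: UtilityRelationship,
    has_source_linked_evidence: bool = False,
) -> bool:
    """Return True if a fact may feed a provided capability without a curator.

    An adapter counts only when it is explicit, modeled here as having
    source-linked evidence for the capability being exposed.
    """
    if relationship is UtilityRelationship.ADAPTER:
        return has_source_linked_evidence
    return relationship in DIRECTLY_PROMOTABLE
```

Under this sketch a dependency, tooling, or mention relationship never promotes on its own. It stays as evidence or context until a curator decides otherwise.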
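The dependency, staleness, and propagation rate entries above describe a graph walk. The sketch below assumes the edges are available as a plain mapping from an upstream node to its downstream nodes. The function names and the node identifiers in the usage note are hypothetical. It marks everything reachable from the changed inputs as stale and reports breadth and depth as a review signal, not a score.

```python
from collections import deque
from typing import Dict, Iterable, List


def propagate_staleness(
    edges: Dict[str, List[str]],
    changed: Iterable[str],
) -> Dict[str, int]:
    """Return each stale downstream node with its distance from a change.

    `edges` maps an upstream node to the nodes that depend on it.
    The changed inputs themselves are not reported as stale.
    """
    changed = list(changed)
    seen = set(changed)
    depths: Dict[str, int] = {}
    queue = deque((node, 0) for node in changed)
    while queue:
        node, depth = queue.popleft()
        for downstream in edges.get(node, ()):
            if downstream not in seen:
                seen.add(downstream)
                depths[downstream] = depth + 1
                queue.append((downstream, depth + 1))
    return depths


def propagation_summary(stale_depths: Dict[str, int]) -> Dict[str, int]:
    """Summarize impact: breadth is the stale count, depth the longest reach."""
    return {
        "breadth": len(stale_depths),
        "depth": max(stale_depths.values(), default=0),
    }
```

With edges `{"fact:routes": ["feature:http-api"], "feature:http-api": ["capability:serve-maps"]}` and `changed=["fact:routes"]`, both downstream nodes become stale, giving a breadth of 2 and a depth of 2. Breadth-first traversal is used so that each node records its shortest distance from a changed input.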

Extraction Philosophy

Deterministic scanning should remain useful without LLM support. Optional LLM assistance is used as a comparison and acceleration layer: when model-assisted expectations reveal missing concepts, the deterministic scanner and heuristics should be improved over time. This creates a feedback loop where repository inspection, manual curation, and optional model output co-evolve.
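One way to picture the comparison step in this feedback loop is a set difference between what the deterministic scanner surfaced and what a model-assisted pass expected. The function name `missing_concepts` and the idea of comparing normalized concept labels are assumptions for illustration. The real comparison may work on richer structures than strings.

```python
from typing import Iterable, List


def missing_concepts(
    scanner_concepts: Iterable[str],
    model_expected: Iterable[str],
) -> List[str]:
    """List concepts the model expected that the scanner did not surface.

    Labels are compared case-insensitively with surrounding whitespace
    removed. The result is a review input for improving heuristics,
    not registry truth.
    """
    def normalize(label: str) -> str:
        return label.strip().lower()

    found = {normalize(label) for label in scanner_concepts}
    expected = {normalize(label) for label in model_expected}
    return sorted(expected - found)
```

Each label returned points at a place where the deterministic scanner or its heuristics might be extended, which keeps the scanner useful when no model is available.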