config-atlas/specs/ProductRequirementsDocument.md

# config-atlas — Product Requirements Document

- Status: Draft v0.1
- Date: 2026-06-26
- Owner: config-atlas initiative
- Primary integration standard: reuse-surface federation (capability registry model)
- Terminology alignment: InfoTechCanon-compatible; extends InfoTechCanon only where a
  configuration surface requires precision ITC does not yet provide. See
  `../docs/canon-mapping.md` (planned) and `../docs/ecosystem-boundaries.md`.
- Companion artifacts: `ArchitectureBlueprint.md`, `../INTENT.md`,
  `../research/configuration-control-plane.md`.
- Relevant workplan: `ATLAS-WP-0002`.

> This PRD defines *what* config-atlas must achieve and under which constraints. It
> is implementation-independent; the *how* lives in `ArchitectureBlueprint.md`.

---

## 1. Product summary

config-atlas is the **read-first, cross-kind configuration map and evidence layer**
for fast-moving, multi-repo, multi-tenant software landscapes. It treats each
**configuration surface** — a bounded, named place where configuration is defined,
read, or overridden — as a first-class, registry-backed entry with ownership,
scope, validation hooks, and source links.

The product answers four questions an operator or agent cannot answer today without
tribal knowledge:

1. What configuration exists for a repo, capability, or deployment context?
2. Who owns it and where is the source of truth?
3. What are the safe defaults and precedence rules?
4. Which other surfaces depend on, override, or are affected by it?

config-atlas is **not** where configuration lives and **not** a runtime engine. It
is where the distributed configuration surface becomes visible, explainable,
governable, and safe to change. (`../INTENT.md`, `../wiki/ProductVision.md`)

---

## 2. Problem statement

Configuration is **distributed control information**: the live mechanism that
changes how systems behave, often faster and with less ceremony than a code deploy.
As cloud-native scale grew, configuration became the dominant operational failure
mode — a disproportionate share of large 2024–2026 incidents trace to a
configuration change, not a code defect (CrowdStrike, AT&T, Cloudflare, Azure;
`../research/configuration-control-plane.md` §2).

Yet configuration knowledge is scattered across repos, manifests, environment
variables, feature-flag platforms, policy files, secret managers, and operator
runbooks. Teams and agents rediscover the same surfaces repeatedly and cannot
reason confidently about defaults, precedence, or ownership. Existing tools manage
the configuration *they own*; few **discover** configuration across tools, and
fewer can resolve and explain the **effective** value that actually applies.

The product thesis: **map the territory before governing it.** A company must first
*see* its configuration surface — discover it, classify it by kind and scope,
attribute ownership, and attach evidence — before any safe-change ambition is
credible.

---

## 3. Goals

### G1 — Discoverable configuration surface
The product shall make every configuration surface that matters to reuse or
operations discoverable from a single, source-linked registry.

### G2 — Effective-configuration explainability
The product shall make it possible to explain, for a key, which layer won, what it
overrode, which validating schema applied, and who owns it — without reading live
values.

### G3 — Ownership and scope clarity
The product shall attribute every surface to an owner and a scope, resolving
ownership against `domain-tree` rather than inventing a private org model.

### G4 — Map before control
The product shall deliver read-first configuration intelligence (discover,
classify, attribute, explain) and shall not require write access to any production
configuration system.

### G5 — Ecosystem reuse over reinvention
The product shall reuse sister-repo capabilities — `reuse-surface` (schema,
validation, federation), `repo-scoping` (scanning/candidate/approval),
`info-tech-canon` (vocabulary), the State Hub (graph/evidence) — rather than
duplicating them (`../docs/ecosystem-boundaries.md`).

### G6 — No secret exposure
The product shall never store secret values; secrets appear only as references.

### G7 — Deterministic, explainable merge semantics
The product shall represent layer precedence and merge rules explicitly so that a
winning value is always attributable to a declared rule, never to hidden
last-writer-wins behavior.

### G8 — Federation compatibility
The product shall participate in `reuse-surface` federation as a typed registry
peer, with a reserved id namespace, so config surfaces interoperate with the
broader capability surface without colliding.

---

## 4. Non-goals

The product shall **not**:

1. Build a runtime configuration **resolver, delivery engine, or control plane** —
   resolution/delivery/control are delegated downstream (`ArchitectureBlueprint.md`
   §1; `../research/configuration-control-plane.md` §5).
2. Own the **runtime resolution or control of feature availability**, including
   feature resolvers and kill switches — that is `feature-control`'s plane; config
   surfaces of kind `feature-flag` link to it and never re-derive it.
3. Store **secret values** or live, environment-specific configuration values
   (OpenBao / railiance-platform own values).
4. Become a **second source of truth** for configuration values; entries point at
   canonical sources.
5. **Replace** sister repos: `info-tech-canon` (vocabulary), `repo-scoping`
   (scanning), `domain-tree` (placement/ownership identity), `reuse-surface`
   (registry/federation), `state-hub` (graph/identity store), `repo-seed`
   (template).
6. Define the configuration **vocabulary** itself — it maps to InfoTechCanon.

---

## 5. Canon Alignment & Terminology

config-atlas conforms to InfoTechCanon (ITC) where possible and consumes, rather
than redefines, established concepts. The authoritative mapping is
`../docs/canon-mapping.md` (planned, mirroring `feature-control`'s pattern). Summary of
ownership boundaries:

| Concept area | Source | config-atlas relationship |
|---|---|---|
| Policy, decision, evidence, control | ITC-GOV | **Consume** — reuse governance vocabulary for evidence/audit |
| Schema, data contract, classification | ITC-DATA | **Consume** — surface `schema`/`security_class` reference these |
| Delivery flow, mutability, environments | ITC-DEVSECOPS | **Consume** — `mutability` class derives from delivery stages |
| Environment, deployment, service, repository | ITC-LAND | **Consume** — scope/source identifiers |
| Actor / agent / ownership identity | ITC-ORG via `domain-tree` | **Reference** — `owner` resolves to domain-tree bindings |
| Feature availability, evaluation scope | `feature-control` (`EvaluationScope`) | **Align** — share one scope vocabulary; link, do not re-derive |
| **Configuration surface** entry | config-atlas | **Own** |
| **Layering order** (L0–L9) over the shared scope vocabulary | config-atlas | **Own** (an ordering, not new scope names) |
| The **cross-kind effective-config map** | config-atlas | **Own** |

Terminology rule (per ITC "import concepts instead of redefining them"): the L0–L9
layer model is expressed as an **ordering over** the shared ITC/feature-control
scope vocabulary, not a competing set of scope names. New terms genuinely original
to config-atlas — *configuration surface*, *effective-config path* — are proposed
to ITC as extensions via the canon mapping.

---

## 6. Users & stakeholders

Agents are first-class consumers, not an afterthought.

| Stakeholder | Needs |
|---|---|
| Platform engineer | Find what configures a system, its defaults, precedence, and owner without reading every repo |
| SRE / incident commander | During an incident, see which surfaces affect a service and what recently changed |
| Security / compliance owner | Audit configuration ownership, secret references, and change evidence across the company |
| Tenant / installation admin | Understand which settings are tenant-overridable vs non-overridable guardrails |
| Product owner | See entitlements and feature surfaces as part of one configuration picture |
| Coding agent | Orient on a repo's configuration surface from markdown/YAML without bespoke tooling |
| Architect | Reason about cross-repo configuration relationships, drift, and blast radius |
| Configuration-surface owner | Declare, document, and maintain the surfaces they are accountable for |

---

## 7. Conceptual model

### 7.1 Entities

| Entity | Meaning | Canon mapping |
|---|---|---|
| Configuration surface | A bounded, named place where config is defined/read/overridden | **Owned** (proposed ITC extension) |
| Kind | Class of surface: app-config, deploy-config, secret-ref, feature-flag, policy, tenant-config, infra-state, runtime-override | ITC-DATA / ITC-GOV / ITC-LAND |
| Scope / layer | Dimension where a value may be set (company, environment, tenant, …) | ITC-LAND + feature-control `EvaluationScope` |
| Effective configuration | The resolved value that actually applies for a context | **Owned** (resolution delegated; *path* owned) |
| Source | A canonical file/API contributing a value at a given layer role | ITC-LAND / ITC-DEVSECOPS |
| Merge semantics | Declared rule for combining layer contributions | **Owned** |
| Mutability class | build / deploy / startup / hot / per-request / emergency | ITC-DEVSECOPS |
| Evidence | last-seen, change log, drift, who/what/why/when | ITC-GOV.Evidence |
| Relationship / edge | consumed_by, overrides, depends_on_secret, related_to | ITC-GOV + State Hub graph |

### 7.2 Layering order and merge rules

The effective configuration is composed from ordered scopes (from
`../wiki/ConfigLayering.md` and `ArchitectureBlueprint.md` §3):

```text
L0 vendor/product defaults   L5 installation/deployment overlay
L1 company baseline          L6 tenant/customer/community overlay
L2 platform/domain baseline  L7 group/role overlay
L3 environment overlay       L8 user/agent/workload overlay
L4 region/zone/cluster       L9 emergency/runtime override
```

"More specific wins" by default: a higher-numbered layer overrides a lower-numbered
one. Broader layers may declare **non-overridable guardrails** that more specific
layers cannot override. Merge rules are explicit, never implicit:

```text
scalar     more specific layer replaces earlier value
object/map deep merge by key
array/list replace by default; keyed merge only if declared
null       not deletion unless tombstone semantics are defined
secret     never merged into normal config
policy     restrictive rule wins unless explicitly delegated
```
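
To make the rules concrete, here is a minimal Python sketch of the scalar, object/map, array/list, and null rows. It is illustrative only: config-atlas does not resolve values, and the function name `merge_layers` and the sample data are assumptions, not part of the blueprint. Secret and policy handling, keyed array merge, and tombstones are deliberately left out.

```python
from typing import Any


def merge_layers(base: Any, overlay: Any) -> Any:
    """Combine a less specific value (base) with a more specific one (overlay)."""
    # null: not a deletion (no tombstone semantics assumed), so base survives.
    if overlay is None:
        return base
    # object/map: deep merge by key.
    if isinstance(base, dict) and isinstance(overlay, dict):
        merged = dict(base)
        for key, value in overlay.items():
            merged[key] = merge_layers(base.get(key), value)
        return merged
    # scalar and array/list: the more specific layer replaces the earlier value.
    return overlay


# Hypothetical layer contributions, for illustration only.
company = {"batch": {"size": 500, "retries": 3}, "regions": ["eu", "us"]}
environment = {"batch": {"size": 1000, "retries": None}, "regions": ["eu"]}
effective = merge_layers(company, environment)
```

Here `effective` keeps `retries: 3` (the null did not delete it), takes `size: 1000` from the more specific layer, and replaces the `regions` list wholesale.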

---

## 8. Functional requirements

### FR-1 — Configuration surface registry
**Requirement:** The product shall provide a markdown/YAML registry of
configuration-surface entries, each with a stable id.
**Details:**
- Entry id namespace `surface.<domain>.<system>.<name>`.
- Modeled as a typed sibling of the `reuse-surface` capability entry.
- One file per surface plus a YAML index, mirroring `registry/`.
**Acceptance criteria:**
- A new surface can be added as a single reviewable file + index row.
- Each entry has a unique, stable id validated in CI.
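
A CI check for the id rule could look roughly like the following sketch. The segment grammar (lowercase letters, digits, hyphens) and the name `check_ids` are assumptions; the PRD fixes only the `surface.<domain>.<system>.<name>` shape and the uniqueness requirement.

```python
import re
from collections import Counter

# Assumed grammar: "surface" followed by exactly three dot-separated segments,
# each made of lowercase letters, digits, and single hyphens.
SURFACE_ID = re.compile(r"^surface(\.[a-z0-9]+(-[a-z0-9]+)*){3}$")


def check_ids(ids: list[str]) -> list[str]:
    """Return one error message per malformed or duplicated id."""
    errors = [f"malformed id: {i}" for i in ids if not SURFACE_ID.match(i)]
    errors += [f"duplicate id: {i}" for i, n in Counter(ids).items() if n > 1]
    return errors


errors = check_ids([
    "surface.platform.mail.rate-limit",
    "surface.platform.mail.rate-limit",  # duplicate
    "surface.mail",                      # too few segments
])
```

An empty result would let the change proceed; any message would fail the build.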

### FR-2 — Kind taxonomy
**Requirement:** Every surface shall declare a `kind` from a closed taxonomy.
**Details:**
- Kinds: `app-config`, `deploy-config`, `secret-ref`, `feature-flag`, `policy`,
  `tenant-config`, `infra-state`, `runtime-override`.
- `kind` drives kind-separation: secrets, flags, and infra-state are never treated
  as ordinary config.
**Acceptance criteria:**
- An entry with an unknown `kind` fails validation.
- Reports can filter and group surfaces by `kind`.

### FR-3 — Scope / layer model
**Requirement:** Each surface shall declare which layers may set it and a default
layer, using the shared scope vocabulary.
**Details:**
- `scope.allowed_layers` is a subset of the L0–L9 ordering.
- Layer names align with ITC / feature-control `EvaluationScope`; no new scope
  names are introduced.
**Acceptance criteria:**
- A surface can declare, e.g., `allowed_layers: [company, environment, tenant]`.
- An override proposed at a disallowed layer is flagged.
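
The second criterion can be sketched as a small check. The short layer names for L0 to L9 are assumptions except `company`, `environment`, `installation`, and `tenant`, which appear in this document's examples; `flag_override` is a hypothetical helper name.

```python
from typing import Optional

# Assumed short names for L0..L9, in precedence order (least to most specific).
LAYER_ORDER = ["vendor", "company", "platform", "environment", "region",
               "installation", "tenant", "group", "user", "emergency"]


def flag_override(scope: dict, proposed_layer: str) -> Optional[str]:
    """Return a finding if an override targets a layer the surface does not allow."""
    if proposed_layer not in LAYER_ORDER:
        return f"unknown layer: {proposed_layer}"
    if proposed_layer not in scope["allowed_layers"]:
        return f"override at disallowed layer: {proposed_layer}"
    return None


scope = {"allowed_layers": ["company", "environment", "tenant"],
         "default_layer": "company"}
finding = flag_override(scope, "user")
```

With this scope, an override proposed at `tenant` passes, while one at `user` is flagged.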

### FR-4 — Source linking without values
**Requirement:** Each surface shall reference its canonical sources by location and
layer role, and shall not inline live or secret values.
**Details:**
- `sources[]` carries `repo`, `path`/endpoint, and `role` (the contributed layer).
- No value fields exist in the schema.
**Acceptance criteria:**
- An entry can record two or more sources, each with a distinct layer role.
- CI rejects any entry that embeds a literal configuration value or secret.
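
One way to enforce the "no value fields" rule is an allow-list on each `sources[]` item, sketched below. The allow-list mirrors the fields named in the Details above; the function name and the `value` key in the sample are hypothetical. This is a structural check only and would not detect a secret pasted into a free-text field.

```python
# Fields the PRD names for a source: repo, path or endpoint, and role.
ALLOWED_SOURCE_KEYS = {"repo", "path", "endpoint", "role"}


def source_violations(entry: dict) -> list[str]:
    """Report any source item carrying keys outside the allow-list."""
    violations = []
    for index, source in enumerate(entry.get("sources", [])):
        extra = sorted(set(source) - ALLOWED_SOURCE_KEYS)
        if extra:
            violations.append(f"sources[{index}] has disallowed keys: {extra}")
    return violations


entry = {
    "sources": [
        {"repo": "railiance-platform", "path": "config/mail/delivery.yaml",
         "role": "company-baseline"},
        # Hypothetical bad item: it inlines a value instead of pointing at a source.
        {"repo": "railiance-platform", "path": "environments/prod.yaml",
         "role": "environment-overlay", "value": 1000},
    ]
}
violations = source_violations(entry)
```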

### FR-5 — Effective-config explain rendering
**Requirement:** The product shall render an effective-config *path* for a key from
its layered source links, statically, without reading live values.
**Details:**
- Output names the winning source layer, what it overrode, the validating schema,
  and the owner (the `config explain` shape in `../wiki/ConfigLayering.md`).
- Resolution of *actual values* is out of scope; only the path is owned.
**Acceptance criteria:**
- Given a surface with ordered sources, the product emits an ordered override path
  with owner and validator references.
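
A static rendering of the path might be sketched as follows. Two assumptions are made and should be checked against `ArchitectureBlueprint.md`: the short layer names for L0 to L9, and the convention that a source `role` such as `company-baseline` begins with its layer name. Guardrails are ignored, and no values are read.

```python
# Assumed short names for L0..L9, least to most specific.
LAYER_ORDER = ["vendor", "company", "platform", "environment", "region",
               "installation", "tenant", "group", "user", "emergency"]


def layer_of(source: dict) -> str:
    # Assumption: the role's first hyphen-separated token is the layer name.
    return source["role"].split("-", 1)[0]


def describe(source: dict) -> str:
    return f'{source["repo"]}:{source["path"]} ({source["role"]})'


def explain_path(entry: dict) -> dict:
    """Order declared sources by layer; the most specific one is the winner."""
    ordered = sorted(entry["sources"], key=lambda s: LAYER_ORDER.index(layer_of(s)))
    *overridden, winner = ordered
    return {
        "winner": describe(winner),
        "overrode": [describe(s) for s in overridden],
        "validator": entry["schema"]["validator"],
        "owner": entry["owner"],
    }


entry = {
    "owner": "platform-delivery",
    "schema": {"validator": "schemas/mail-delivery.schema.json"},
    "sources": [
        {"repo": "railiance-platform", "path": "environments/prod.yaml",
         "role": "environment-overlay"},
        {"repo": "railiance-platform", "path": "config/mail/delivery.yaml",
         "role": "company-baseline"},
    ],
}
path = explain_path(entry)
```

For this entry the environment overlay is reported as the winner over the company baseline, together with the validator and owner references.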

### FR-6 — Ownership resolution
**Requirement:** Every surface shall have an `owner`, resolved against `domain-tree`
bindings rather than a private ownership model.
**Details:**
- `owner` references a team/agent identity, not a person.
- Placement/relevance defers to domain-tree primary/secondary bindings.
**Acceptance criteria:**
- An entry without an owner fails validation.
- Owner references resolve to known domain-tree identities (or are flagged unknown).

### FR-7 — Relationship / edge model
**Requirement:** The product shall record cross-surface relationships and contribute
them as config-typed edges to the State Hub graph.
**Details:**
- Relations: `consumed_by`, `overrides`, `depends_on_secret` (reference only),
  `related_to`.
- config-atlas owns the config semantics of each edge; the State Hub stores topology.
**Acceptance criteria:**
- A surface can declare consumers and secret dependencies by reference.
- Declared edges are expressible to the State Hub without duplicating its store.
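
As a sketch of what "expressible" could mean, an entry's relations can be flattened into plain triples that any graph store could ingest. The triple shape, the helper name `to_edges`, and the sample id are assumptions; how the State Hub actually accepts edges is not specified here.

```python
RELATIONS = ("consumed_by", "overrides", "depends_on_secret", "related_to")


def to_edges(entry: dict) -> list[tuple[str, str, str]]:
    """Flatten relations into (surface id, relation, target reference) triples."""
    edges = []
    for relation, targets in entry.get("relations", {}).items():
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        # Targets stay references; secret dependencies are never dereferenced.
        edges.extend((entry["id"], relation, target) for target in targets)
    return edges


entry = {
    "id": "surface.platform.mail.batch-size",  # hypothetical id
    "relations": {
        "consumed_by": ["service.mail-gateway"],
        "overrides": [],
        "depends_on_secret": [],
        "related_to": ["surface.platform.mail.rate-limit"],
    },
}
edges = to_edges(entry)
```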

### FR-8 — Read-only discovery connectors
**Requirement:** The product shall support read-only connectors that emit *candidate*
surface entries for human/agent review, reusing `repo-scoping`'s
scanner→candidate→approval workflow.
**Details:**
- Connectors are stateless and never write live systems or auto-merge.
- Candidate source is `repo-scoping` observed facts where available, with config-kind
  classification added on top.
- Pipeline: `connector → candidate YAML → PR → validate → merge`.
**Acceptance criteria:**
- A connector run produces candidate entries that enter via PR review.
- No connector mutates any source system.
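
The connector step of the pipeline can be pictured as a pure function from an observed fact to a draft candidate. The shape of a `repo-scoping` fact is not defined in this document, so every field on `fact` below is hypothetical, as is the helper name `to_candidate`.

```python
def to_candidate(fact: dict, kind: str, observed_on: str) -> dict:
    """Build a draft candidate entry from one observed fact; performs no I/O."""
    return {
        "id": f"surface.{fact['domain']}.{fact['system']}.{fact['name']}",
        "kind": kind,  # config-kind classification added on top of the fact
        "status": "draft",
        "sources": [{"repo": fact["repo"], "path": fact["path"], "role": fact["role"]}],
        "evidence": {"last_seen": observed_on,
                     "discovery_method": "connector:repo-scoping"},
    }


# Hypothetical observed fact, for illustration only.
fact = {"domain": "platform", "system": "mail", "name": "batch-size",
        "repo": "railiance-platform", "path": "config/mail/delivery.yaml",
        "role": "company-baseline"}
candidate = to_candidate(fact, kind="app-config", observed_on="2026-06-26")
```

The candidate carries no `owner`; in this sketch the reviewer supplies it in the PR, so validation would block a merge until that is done.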

### FR-9 — Validation
**Requirement:** Every entry shall be schema-validated in CI via
`reuse-surface validate` plus a surface-entry schema (JSON Schema or CUE).
**Details:**
- Validation covers id uniqueness, kind, scope, owner presence, and absence of
  values/secrets.
- `git diff --check` runs on every change to catch whitespace errors and leftover
  conflict markers.
**Acceptance criteria:**
- A malformed entry blocks merge.
- CI passes on a well-formed seed entry.
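
Alongside the schema, the semantic checks could be collected in one function whose non-empty result blocks the merge. The sketch below covers kind, owner presence, and declared layers only; the function name is hypothetical, and it does not stand in for `reuse-surface validate` or the JSON Schema / CUE validation.

```python
KINDS = {"app-config", "deploy-config", "secret-ref", "feature-flag",
         "policy", "tenant-config", "infra-state", "runtime-override"}


def validate_entry(entry: dict) -> list[str]:
    """Collect every finding rather than stopping at the first one."""
    errors = []
    if entry.get("kind") not in KINDS:
        errors.append(f"unknown kind: {entry.get('kind')!r}")
    if not entry.get("owner"):
        errors.append("missing owner")
    if not entry.get("scope", {}).get("allowed_layers"):
        errors.append("scope.allowed_layers must not be empty")
    return errors


good = {"kind": "app-config", "owner": "platform-delivery",
        "scope": {"allowed_layers": ["company", "environment"]}}
bad = {"kind": "cache", "scope": {"allowed_layers": []}}
```

A CI wrapper would exit non-zero whenever `validate_entry` returns any message.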

### FR-10 — Federation as a typed sibling
**Requirement:** The product shall federate under `reuse-surface` as a registry peer
with a reserved `surface.*` id namespace.
**Details:**
- The configuration-surface entry is a typed sibling of the capability entry, not a
  new federation mechanism.
- The `surface.*` namespace is reserved in the reuse-surface federation roster.
**Acceptance criteria:**
- config-atlas entries are discoverable through reuse-surface federation.
- No id collision occurs between capability and surface registries.

### FR-11 — Evidence and audit
**Requirement:** Each surface shall carry discovery and change evidence.
**Details:**
- `evidence.last_seen`, `discovery_method`, and a change-log reference (PR or State
  Hub progress event).
- Supports answering who/what/why/when and "is this still used?".
**Acceptance criteria:**
- An entry records when it was last observed and by which method.
- A change to an entry is traceable to a PR or progress event.
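
The "is this still used?" question can be approximated from `evidence.last_seen` alone. The 90-day threshold and the helper name `is_stale` are assumptions chosen for illustration; the PRD does not define a staleness window.

```python
from datetime import date, timedelta


def is_stale(last_seen: str, today: date, max_age_days: int = 90) -> bool:
    """True if the surface has not been observed within the allowed window."""
    return today - date.fromisoformat(last_seen) > timedelta(days=max_age_days)


fresh = is_stale("2026-06-26", today=date(2026, 7, 1))
stale = is_stale("2026-06-26", today=date(2026, 12, 1))
```

Passing `today` explicitly keeps the check deterministic in CI and in tests.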

### FR-12 — Feature-flag delegation
**Requirement:** Surfaces of kind `feature-flag` shall link to the authoritative
`feature-control` key and shall not duplicate its rules, resolver, or kill switches.
**Details:**
- `sources[]` points at the feature-control key; config-atlas records classification,
  ownership, and relationships only.
**Acceptance criteria:**
- A `feature-flag` surface references a feature-control key.
- config-atlas contains no runtime flag-evaluation logic.

---

## 9. Non-functional requirements

### NFR-1 — Markdown- and agent-legible
Entries shall be markdown/YAML, diffable, and parseable by agents without bespoke
tooling.

### NFR-2 — Source-linked, never authoritative
The registry shall reference canonical sources and never become a second source of
truth for configuration values.

### NFR-3 — Read-first, no live values
The product shall function without read/write access to live values; it stores
metadata and references only.

### NFR-4 — Never stores secrets
No secret value shall ever be stored; secrets appear only as references.

### NFR-5 — Deterministic and explainable
Every winning value shall be attributable to a declared precedence/merge rule;
hidden last-writer-wins behavior is prohibited.

### NFR-6 — Low-friction contribution
Adding or updating a surface shall require only a single reviewable PR validated in
CI.

### NFR-7 — Federation compatible
Entry schema and ids shall remain compatible with `reuse-surface` federation and
validation.

### NFR-8 — Boundary-respecting
The product shall not implement capabilities owned by sister repos
(`../docs/ecosystem-boundaries.md`); overlaps are resolved by reference, not
reimplementation.

---

## 10. Data model / repository structure

Surface-entry shape (from `ArchitectureBlueprint.md` §3; live values are
intentionally absent, and `schema.default` is the declared default, not a resolved
value):

```yaml
id: surface.<domain>.<system>.<name>
name: Mail delivery batch sizing
kind: app-config | deploy-config | secret-ref | feature-flag |
      policy | tenant-config | infra-state | runtime-override
summary: Controls max batch size for outbound mail delivery.
owner: platform-delivery                 # resolves to domain-tree identity
status: draft | active | deprecated
scope:
  allowed_layers: [company, environment, installation, tenant]
  default_layer: company
mutability: hot                          # build | deploy | startup | hot | per-request | emergency
security_class: operational              # operational | sensitive | secret-ref | policy
schema:
  type: integer
  default: 500
  minimum: 1
  maximum: 5000
  validator: schemas/mail-delivery.schema.json
sources:
  - { repo: railiance-platform, path: config/mail/delivery.yaml, role: company-baseline }
  - { repo: railiance-platform, path: environments/prod.yaml,    role: environment-overlay }
relations:
  consumed_by: [service.mail-gateway]
  overrides: []
  depends_on_secret: []                  # references only
  related_to: [surface.platform.mail.rate-limit]
evidence:
  last_seen: '2026-06-26'
  discovery_method: connector:repo-scoping | manual
  change_log_ref: <PR or State Hub progress event>
```

Repository layout extends the existing `registry/`:

```text
registry/
  surfaces/        # per-surface markdown+yaml entries (surface.*)
  indexes/         # surfaces.yaml index (+ existing capabilities.yaml)
schemas/           # surface-entry JSON Schema / CUE (Phase 0)
```

---

## 11. MVP proposal

### 11.1 MVP scope
- Surface-entry schema (the Canon) and the L0–L9 + merge-rule model as a
  machine-checkable doc.
- 10–20 hand-authored entries for the highest-value Coulomb surfaces.
- CI validation (`reuse-surface validate` + schema + `git diff --check`).
- Replacement of the inherited `repo-template` registry artifact (`ATLAS-WP-0002`).

### 11.2 MVP non-scope
- Connectors, effective-config rendering, graph push, federation rollout.
- Any runtime resolution, delivery, or control.

### 11.3 MVP success criteria
A human or agent can, from the repo alone:
- find what configures a given high-value system, its owner, and where it lives;
- see which layers may set a key and which sources contribute;
- trust that entries are schema-valid and contain no values or secrets.

---

## 12. Roadmap

Maps to `ArchitectureBlueprint.md` §6:

- **Phase 0 — Canon (days):** surface-entry schema + scope/precedence/merge model;
  replace inherited template artifact. *Exit:* one real entry validates in CI.
- **Phase 1 — Seed by hand (1–2 weeks):** 10–20 entries; CI validation live.
- **Phase 2 — First connectors (2–4 weeks):** reuse `repo-scoping` facts; candidate-PR
  workflow; surface stale/unowned config.
- **Phase 3 — Explain & graph (4+ weeks):** render `config explain`; push config-typed
  edges to the State Hub.
- **Deferred (out of scope):** live resolution, controlled change, approval
  workflows, rollout/rollback — owned by downstream systems.

---

## 13. Risks & mitigations

| Risk | Impact | Mitigation |
|---|---|---|
| Scope creep into a runtime resolver / kill switches | Collision with `feature-control`; boundary erosion | Hard non-goal (FR-12; §4, item 2); `feature-flag` links out, no eval logic |
| Becoming a second source of truth for values | Drift, stale data, trust loss | No value fields (FR-4, NFR-2); source-linked only |
| Rebuilding discovery instead of reusing repo-scoping | Duplicated, divergent scanners | Connectors consume repo-scoping facts (FR-8; ecosystem-boundaries §2.4) |
| Id-namespace collision with reuse-surface | Federation conflicts | Reserve `surface.*` namespace (FR-10) |
| Inventing a third scope taxonomy | The "integration by interpretation" that ITC exists to prevent | Express L0–L9 as an ordering over shared vocab (§5) |
| Canon drift from InfoTechCanon | Terms diverge from the ecosystem | `../docs/canon-mapping.md`; consume-don't-redefine (§5) |

---

## 14. Open questions

From `ArchitectureBlueprint.md` §7:

1. What is the minimum viable connector set to prove cross-tool effective-config
   resolution end to end?
2. Can the entry schema carry enough provenance to render a full `config explain`
   without becoming a value source of truth?
3. What is the canonical edge set for the configuration knowledge graph, and does it
   reuse the State Hub's relationship model?
4. CUE vs JSON Schema for entry validation — does order-independent merge justify the
   toolchain cost? (`../research/configuration-control-plane.md` §3.3)
5. Should "agent/model configuration" be a named scope class now, given the
   LaunchDarkly AI-config trajectory?

---

## 15. Formal standards & authoritative sources

config-atlas has no single governing standard; it derives legitimacy from adjacent
standards and the ecosystem canon. Full citations in
`../research/sources.md`.

- **InfoTechCanon** — internal semantic canon; the vocabulary config-atlas maps to.
- **OpenFeature** — vendor-neutral feature-flag standard; the integration boundary
  with `feature-control`.
- **JSON Schema** — declarative structure/constraint validation for entry schemas.
- **CUE** — order-independent unification for deterministic, explainable merge
  (`../research/configuration-control-plane.md` §3.3).
- **The Twelve-Factor App (Config)** — separate config from code; kind separation.
- **Kustomize / Helm / NixOS** — the base+overlay layering pattern config-atlas maps.
- **InfoQ — Configuration as a Control Plane** — the category framing (problem,
  blast-radius/rollback safety patterns).

---

## 16. Related concepts

Condensed from `../wiki/CompetitiveLandscape.md`:

- **Configuration-as-data** (ConfigHub / KRM) — config as authoritative graph-shaped
  data; config-atlas is discovery-first and cross-tool.
- **GitOps desired vs effective state** (Argo CD / Flux) — GitOps owns "desired
  state"; config-atlas adds the "effective state" narrative.
- **Feature management** (LaunchDarkly / Unleash / OpenFeature) — one config *kind*;
  delegated to `feature-control`.
- **Policy-as-code** (OPA / Kyverno / Checkov) — validation backends; config-atlas is
  the context/evidence layer around them.
- **CMDB / SSPM** (ServiceNow / CoreView / AppOmni) — assets and SaaS posture;
  config-atlas models layered behavioral config and integrates rather than replaces.

---

## Appendix: orientation map (descriptive, not prescriptive)

How config-atlas relates to adjacent product categories. Entry points for deeper
exploration, not competing definitions.

| Category | Core question it answers | config-atlas stance |
|---|---|---|
| Feature management | Can we change behavior safely at runtime? | Map flags as one kind; **integrate** (`feature-control`) |
| GitOps / IaC | Is desired state declared and reconciled? | Add effective-state map; **complement** |
| Secrets management | Are sensitive values protected? | Reference dependencies; **never store values** |
| Policy-as-code | Is this change allowed? | Provide context/evidence; **integrate as backend** |
| CMDB / developer portal | What assets/services exist and who owns them? | Enrich with config scope/ownership; **integrate** |
| SSPM | Is SaaS config secure? | Treat SaaS config as part of the surface; **integrate** |
| Config-as-data store | Where should config live authoritatively? | **Not** a store; the map/evidence layer over stores |

---

## Closing — guiding principle

> config-atlas is not where all configuration must live. It is where configuration
> becomes visible, explainable, governable, and safe to change.