docs: mirror Gitea wiki and add config control plane research

Mirror the five Gitea wiki pages into wiki/ (Home, ProductVision,
BrandFrame, ConfigLayering, CompetitiveLandscape) as a verbatim in-repo
copy.

Add research/ digest on configuration layering and the configuration
control plane: the resolution/merge model, the 2024-2026 config-outage
case, adjacent tool families (config-as-data, GitOps drift, feature
flags + AI config, secrets, policy-as-code, CMDB/portals/SSPM), a
reference architecture, and an annotated bibliography of 17 sources.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 19:28:33 +02:00
parent 7078eaf596
commit 6d6f99d5ea
8 changed files with 1077 additions and 0 deletions

research/README.md Normal file

@@ -0,0 +1,31 @@
# research/
Deep-research notes backing **ConfigAtlas** — the configuration control plane for
discovering, mapping, explaining, and governing the living configuration surface
of fast-moving companies.
This directory is the *evidence layer* for the product thesis. It is not part of
the configuration surface registry (`registry/`); it is the reasoning and sourcing
behind the registry's design choices.
## Contents
| File | What it is |
|------|------------|
| [`configuration-control-plane.md`](configuration-control-plane.md) | Main digest: what the configuration control plane is, why it matters, the layering and resolution model, adjacent topics, and where ConfigAtlas fits. |
| [`sources.md`](sources.md) | Annotated bibliography — every cited source with a one-line note on why it matters. |
## Related repo material
- [`../wiki/ConfigLayering.md`](../wiki/ConfigLayering.md) — the layering primer (scope model, precedence, merge rules, mutability classes).
- [`../wiki/CompetitiveLandscape.md`](../wiki/CompetitiveLandscape.md) — adjacent tool families and white space.
- [`../wiki/ProductVision.md`](../wiki/ProductVision.md) / [`../wiki/BrandFrame.md`](../wiki/BrandFrame.md) — product framing.
## Method
Synthesized 2026-06-26 from the repo's own wiki material plus web research across
vendor docs, primary-source engineering writing (Brian Grant / KRM, InfoQ, CUE),
and 2024–2026 outage retrospectives. Sources captured in `sources.md`; claims tied
to live behavior of tools are dated because this category is moving fast.


@@ -0,0 +1,244 @@
# Configuration Layering and the Configuration Control Plane — Research Digest
> Compiled 2026-06-26. Numbered references resolve in [`sources.md`](sources.md).
> This digest deepens the repo's own [ConfigLayering primer](../wiki/ConfigLayering.md)
> and [CompetitiveLandscape](../wiki/CompetitiveLandscape.md) with primary sources
> and the surrounding technical context.
---
## 1. The thesis in one paragraph
Configuration stopped being static data a long time ago. It is now *distributed
control information*: the live mechanism that changes how production systems
behave, in real time, often faster and with less ceremony than a code deploy. As
cloud-native scale grew, the industry independently converged on treating
configuration as a **control plane** — something that needs staged rollout,
blast-radius containment, dependency-aware validation, and automated rollback,
exactly like the deployment systems it sits beside [1]. **ConfigAtlas** bets that
before companies can *control* that surface safely, they first need to *see* it:
discover where configuration lives, classify it by kind and scope, resolve the
effective value, and attach ownership and evidence. Map the territory, then govern
it.
---
## 2. Why this matters now: configuration is the dominant failure mode
The strongest argument for a configuration control plane is the outage record. A
disproportionate share of large 2024–2026 incidents trace to a configuration
change rather than a code defect [4][5]:
- **CrowdStrike (Jul 2024)** — a faulty Falcon *sensor configuration* update
blue-screened Windows hosts worldwide; estimated ~$5.4B impact to Fortune 500
firms alone. A content/config push, not a binary release [5].
- **AT&T Mobility (Feb 2024)** — an equipment *configuration error* took down
~125M devices for 12+ hours, blocking ~92M calls including 25,000 to 911 [5].
- **Cloudflare (Nov 2025)** — a global outage taking down X, ChatGPT, Spotify and
others, triggered by a software bug *exposed by a configuration change* [5].
- **Azure Front Door (Nov 2025) / Azure networking (2025)** — a control-plane
defect and a networking *configuration change* produced multi-hour to ~50-hour
degradations across services [4][7].
ThousandEyes' 2024 internet-outage analysis names configuration change as a
leading, recurring cause [4]. The lesson the hyperscalers drew is not "stop
changing config" — it is "make unsafe configuration changes progressively harder
to express, deploy, or overlook" [1]. That sentence is essentially the ConfigAtlas
mission restated as a safety property.
---
## 3. Configuration layering — the resolution model
Layering is the practice of composing one **effective configuration** from
multiple ordered scopes. The repo's primer [internal] gives the canonical stack;
the research backs *why* each design choice is non-negotiable.
### 3.1 The scope stack
```
L0 vendor/product defaults
L1 company baseline
L2 platform/domain baseline
L3 environment overlay (dev/test/stage/prod)
L4 region/zone/cluster overlay
L5 installation/deployment overlay
L6 tenant/customer/community overlay
L7 group/role overlay
L8 user/agent/workload overlay
L9 emergency/runtime override
```
"More specific wins" is the default, but **broader layers (lower-numbered, higher
in the hierarchy) may declare non-overridable guardrails** (a security baseline a
tenant cannot loosen). This is
the same base+overlay pattern behind Kubernetes Kustomize, Helm value precedence,
and NixOS modules [8][9] — the industry already agrees on the shape; what is
missing is a cross-tool *view* of it.
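
The precedence rule and the guardrail exception can be sketched as follows. This is a minimal illustration under simplifying assumptions: each layer is a flat dict, and a guardrail is modeled as a key that a broader layer marks as locked. The names `Layer`, `locked`, and `resolve` are hypothetical and do not come from ConfigAtlas or any cited tool.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    name: str                                  # e.g. "L1 company baseline"
    values: dict
    locked: set = field(default_factory=set)   # keys narrower layers may not override


def resolve(layers):
    """Resolve layers ordered broadest -> most specific.

    More specific layers win, except for keys a broader layer has locked.
    Returns (effective values, rejected override attempts).
    """
    effective, locked_by, rejected = {}, {}, []
    for layer in layers:
        for key, value in layer.values.items():
            if key in locked_by:
                # A broader layer declared this key non-overridable.
                rejected.append((layer.name, key, locked_by[key]))
                continue
            effective[key] = value
        for key in layer.locked:
            locked_by.setdefault(key, layer.name)
    return effective, rejected


layers = [
    Layer("L0 vendor defaults", {"tls_min": "1.0", "page_size": 50}),
    Layer("L1 company baseline", {"tls_min": "1.2"}, locked={"tls_min"}),
    Layer("L6 tenant overlay", {"tls_min": "1.0", "page_size": 200}),
]
effective, rejected = resolve(layers)
print(effective)
print(rejected)
```

In this sketch the tenant overlay can raise `page_size` but its attempt to loosen `tls_min` is recorded as a rejected override rather than silently applied.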
### 3.2 The effective configuration is the only thing that's real
A file or a flag is partial evidence. The value that actually applies to a given
system/tenant/request is the resolved result of every relevant layer. The central
product capability — and the line between a config *database* and a config
*control plane* — is answering: **what value applies here, which layer won, what
did it override, which policy constrained it, and who is affected** [internal,
CompetitiveLandscape §"Effective configuration resolution"].
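
A hedged sketch of what an "explain" answer might carry for a single key, assuming layers are supplied as ordered `(name, values)` pairs. The function name `explain` and the field names in the returned record are illustrative assumptions; the policy and "who is affected" parts of the question are deliberately left out.

```python
def explain(key, layers):
    """layers: list of (layer_name, values), ordered broadest -> most specific."""
    trail = [(name, values[key]) for name, values in layers if key in values]
    if not trail:
        return None  # no layer sets this key
    winner, value = trail[-1]
    return {
        "key": key,
        "effective_value": value,
        "winning_layer": winner,
        "overridden": trail[:-1],  # earlier values the winner replaced
    }


layers = [
    ("L0 vendor defaults", {"request_timeout_s": 30}),
    ("L3 environment: prod", {"request_timeout_s": 10}),
    ("L6 tenant: acme", {"request_timeout_s": 20}),
]
report = explain("request_timeout_s", layers)
print(report)
```

The point of the record is that the effective value never travels without its provenance: the winning layer and everything it displaced are part of the answer.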
### 3.3 Merge semantics are where layering quietly fails
Vague merge behavior is the most dangerous part of layering. Define it explicitly:
```
scalar more specific layer replaces earlier value
object/map deep merge by key
array/list replace by default; keyed merge only if declared
null not deletion unless tombstone semantics are defined
secret never merged into normal config
policy restrictive rule wins unless explicitly delegated
```
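
The scalar, map, list, and null rows of the table above can be sketched in a few lines. This is an illustration, not a reference implementation: `TOMBSTONE` is a hypothetical deletion marker, `None` is assumed to mean "no opinion" (one possible reading of "not deletion"), and the secret and policy rows are out of scope here.

```python
TOMBSTONE = object()  # hypothetical explicit-deletion marker


def merge(base, overlay):
    """Merge a more specific overlay onto a base mapping."""
    result = dict(base)
    for key, value in overlay.items():
        if value is TOMBSTONE:
            result.pop(key, None)            # deletion only when explicitly requested
        elif value is None:
            continue                         # null is not deletion; keep the base value
        elif isinstance(value, dict):
            current = result.get(key)
            # object/map: deep merge by key
            result[key] = merge(current if isinstance(current, dict) else {}, value)
        else:
            result[key] = value              # scalars and lists are replaced wholesale
    return result


base = {"limits": {"cpu": "1", "mem": "1Gi"}, "hosts": ["a", "b"], "debug": True, "legacy": "x"}
overlay = {"limits": {"cpu": "2"}, "hosts": ["c"], "debug": None, "legacy": TOMBSTONE}
merged = merge(base, overlay)
print(merged)
```

Keyed list merge is intentionally absent: under the stated default, a list in the overlay replaces the base list unless a schema declares otherwise.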
The schema/validation choice matters here. **JSON Schema** validates structure and
constraints but keeps schema and data separate. **CUE** unifies types and values
in a single lattice where merge (`&`) is commutative, associative, and idempotent
— so the resolved result is *order-independent*, and the same definition both
validates data and reduces boilerplate [2][3]. By contrast Jsonnet's `+` mixin
composition is order-dependent (right-hand side wins on scalar conflicts) [2].
For a control plane whose whole value proposition is a *deterministic, explainable*
effective value, order-independent merge is a meaningful property, not a detail.
Notably, CUE Labs now ships **CUE Hub**, explicitly branded "the Configuration
Control Plane" — independent validation that the category name is forming [6].
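
The difference between order-independent and order-dependent merge can be shown with a toy model. The `unify` function below is not CUE's implementation; it is a simplified stand-in that treats disagreement as an error, which is roughly how unification surfaces conflicts. `override` mimics right-hand-side-wins composition. Both names are assumptions made for this sketch.

```python
class Conflict(Exception):
    """Raised when two inputs disagree; a stand-in for a unification failure."""


def unify(a, b):
    """Toy order-independent merge: equal values combine, unequal ones conflict."""
    out = dict(a)
    for key, value in b.items():
        if key in out and isinstance(out[key], dict) and isinstance(value, dict):
            out[key] = unify(out[key], value)
        elif key in out and out[key] != value:
            raise Conflict(f"{key}: {out[key]!r} vs {value!r}")
        else:
            out[key] = value
    return out


def override(a, b):
    """Order-dependent merge: the right-hand side wins on conflicts."""
    return {**a, **b}


x = {"replicas": 3, "region": "eu"}
y = {"replicas": 3, "tier": "gold"}
z = {"replicas": 5}

assert unify(x, y) == unify(y, x)          # commutative
assert unify(x, x) == x                    # idempotent
assert override(x, z) != override(z, x)    # order changes the result
```

With `unify`, combining `x` and `z` raises `Conflict` in either order, so a disagreement is surfaced instead of being resolved by whichever input happened to come last.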
### 3.4 Mutability classes prevent the worst failure mode
Every key should declare how it can change: `build-time`, `deploy-time`,
`startup-time`, `hot-reloadable`, `per-request`, `emergency`. The recurring
failure is treating dangerous structural config like a harmless flag — exactly the
CrowdStrike-shaped risk where a "content update" had deploy-grade blast radius [5].
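
One way to make the mutability declaration enforceable is a gate in front of the live-push path. The sketch below assumes a hypothetical in-memory `REGISTRY` and an assumed policy for which classes may be pushed live; the key names are invented for illustration.

```python
from enum import Enum


class Mutability(Enum):
    BUILD_TIME = "build-time"
    DEPLOY_TIME = "deploy-time"
    STARTUP_TIME = "startup-time"
    HOT_RELOADABLE = "hot-reloadable"
    PER_REQUEST = "per-request"
    EMERGENCY = "emergency"


# Hypothetical registry: key -> declared mutability class.
REGISTRY = {
    "parser.rule_format": Mutability.DEPLOY_TIME,
    "ui.banner_text": Mutability.HOT_RELOADABLE,
}

# Assumed policy: only these classes may change without a staged deploy.
LIVE_PUSH_ALLOWED = {
    Mutability.HOT_RELOADABLE,
    Mutability.PER_REQUEST,
    Mutability.EMERGENCY,
}


def can_live_push(key):
    """Reject live pushes for undeclared keys and for structural config."""
    return REGISTRY.get(key) in LIVE_PUSH_ALLOWED


print(can_live_push("ui.banner_text"))       # harmless flag-like key
print(can_live_push("parser.rule_format"))   # structural key, needs a staged deploy
```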
---
## 4. The adjacent topics (the converging market)
The control plane is not one product; it is a convergence of tool families.
ConfigAtlas's stance is **integrate and map, don't replace** [internal,
CompetitiveLandscape]. Summary of each adjacency and the research behind it:
### 4.1 Configuration-as-Data (the closest intellectual neighbor)
Brian Grant — creator of the Kubernetes Resource Model (KRM), now CTO of ConfigHub
— argues configuration should be *data*, authoritative and stored like data, with
code that operates on it kept separate [10][11]. ConfigHub stores each variant in
fully-rendered "WET" form (no templates/variables/generators), versioned with
metadata, and — because KRM *is* the API representation — can update config *from*
live state, mitigating drift bidirectionally [10][12]. This is the strongest
direct competitor and the sharpest articulation of "config is graph-shaped
operational data, not files." **ConfigAtlas differentiation:** discovery-first and
cross-tool — map config that already lives in many systems, rather than asking
everyone to move into one store.
### 4.2 GitOps / IaC — desired state and drift
Argo CD and Flux continuously reconcile live cluster state against Git-declared
desired state; any divergence is *drift*, flagged or auto-corrected on a sync loop
[13]. Terraform/OpenTofu do the same for infrastructure lifecycle. This camp owns
the "desired state" narrative. **ConfigAtlas complements it with the "effective
state" narrative:** GitOps tells you what you *intended* to deploy; ConfigAtlas
tells you which scopes contributed, what actually applies, who owns it, and what's
risky to change [internal].
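
Drift in its simplest form is a diff between declared and observed state. The sketch below works on flat dicts and is only meant to illustrate the concept; it is not how Argo CD or Flux compute diffs, and `ABSENT` and `drift` are names chosen for this example.

```python
ABSENT = "<absent>"  # placeholder for a key that exists on only one side


def drift(desired, live):
    """Return {key: (desired_value, live_value)} for every key that differs."""
    report = {}
    for key in sorted(desired.keys() | live.keys()):
        want, have = desired.get(key, ABSENT), live.get(key, ABSENT)
        if want != have:
            report[key] = (want, have)
    return report


desired = {"replicas": 3, "image": "api:1.4"}
live = {"replicas": 5, "image": "api:1.4", "debug": True}
print(drift(desired, live))
```

A drift report of this kind says what differs from intent; it does not say which layer produced the live value or who owns it, which is the gap the "effective state" view is meant to fill.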
### 4.3 Feature flags / runtime control — and the AI-era expansion
Feature management (LaunchDarkly, Unleash, Flagsmith, OpenFeature as the
vendor-neutral standard) owns live behavior change and **progressive delivery**:
ring-based rollout (internal → 1–5% canary → 10–25% beta → 100%), deterministic
cohorts for blast-radius containment, and kill switches / circuit breakers that
auto-deactivate on SLO breach [14][15]. The frontier is **AI configuration**:
LaunchDarkly's AI Configs / AgentControl move prompts, model selection, and tool
access out of code into runtime config that propagates in <200ms, with guarded
rollouts that auto-revert when eval metrics (accuracy, toxicity) drop [16][17].
This validates the core ConfigAtlas claim — the *kinds* of configuration keep
multiplying (now: agent behavior), so a map that spans kinds is increasingly
valuable. **ConfigAtlas treats flags as one scope class among many**, not the
whole plane [internal].
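
Deterministic cohorts are commonly built by hashing a stable identifier into a bucket, so that widening a rollout only adds units and never reshuffles them. The sketch below illustrates that idea with SHA-256; it is an assumption-laden example, not the bucketing algorithm of any vendor named above, and `bucket` and `is_enabled` are hypothetical names.

```python
import hashlib


def bucket(flag, unit_id):
    """Map (flag, unit) to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(f"{flag}:{unit_id}".encode()).hexdigest()
    return int(digest[:8], 16) % 100


def is_enabled(flag, unit_id, rollout_percent, kill_switch=False):
    """A unit is in the cohort when its bucket falls below the rollout percentage."""
    if kill_switch:
        return False  # kill switch overrides any rollout percentage
    return bucket(flag, unit_id) < rollout_percent


users = [f"user-{i}" for i in range(200)]
canary = [u for u in users if is_enabled("new-checkout", u, 5)]
beta = [u for u in users if is_enabled("new-checkout", u, 25)]
print(len(canary), len(beta))
```

Because the bucket depends only on the flag name and the unit identifier, every unit in the canary ring is also in the beta ring, which keeps the blast radius of each step a strict extension of the previous one.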
### 4.4 Secrets management — adjacent but kept separate
This family spans Vault, OpenBao, Infisical, and Doppler, plus SOPS and External
Secrets for the GitOps path. Secrets differ in sensitivity, lifecycle, and blast radius and must
never be merged into ordinary config [internal]. **ConfigAtlas stores references
and dependencies, never values** — which config depends on which secret, where
it's injected, what's affected if it rotates.
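
The "references, never values" rule can be sketched as a small data model. Everything here is hypothetical: the `SecretRef` type, the store name, the paths, and the config keys are invented to illustrate the shape of the dependency lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecretRef:
    store: str   # e.g. "vault"
    path: str    # location in the store, never the secret value itself


# Hypothetical inventory: config key -> secret it depends on.
DEPENDENCIES = {
    "billing.db_url": SecretRef("vault", "kv/billing/db"),
    "billing.report_job": SecretRef("vault", "kv/billing/db"),
    "search.api_key": SecretRef("vault", "kv/search/api"),
}


def affected_by_rotation(ref):
    """List config keys that depend on the given secret reference."""
    return sorted(key for key, dep in DEPENDENCIES.items() if dep == ref)


print(affected_by_rotation(SecretRef("vault", "kv/billing/db")))
```

The map answers the rotation question without the secret value ever entering the inventory.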
### 4.5 Policy-as-code — the guardrail backend
OPA, Kyverno, and Checkov answer "is this change allowed?" across K8s, CI/CD, IaC, and
more [internal]. They are ideal *validation backends* for a control plane but
don't model provenance, ownership, or effective behavior. **ConfigAtlas is the
context and evidence layer around them** — which policy applies, at which scope,
and why.
### 4.6 CMDB / developer portals / SSPM — the enterprise gravity wells
CMDBs (ServiceNow et al.) model assets and services; developer portals (Backstage,
Port, Cortex, OpsLevel) model ownership; SSPM tools (CoreView, AppOmni) model SaaS
posture drift [internal]. None model the layered behavioral config surface with
effective-value resolution. **ConfigAtlas integrates** — enriching catalogs and
portals rather than displacing them; a Backstage/Port plugin is a plausible
adoption path.
---
## 5. Reference architecture for a configuration control plane
Synthesizing the layering primer with the control-plane framing [1][internal]:
```
Config Canon vocabulary + schema (what a key means)
Config Registry every key: owner, type, allowed scopes, lifecycle, mutability, security class
Config Resolver deterministic layer ordering -> effective value (the "explain" engine)
Config Policy allowed values + allowed overrides (OPA/Kyverno/CUE backends)
Config Delivery env vars / ConfigMaps / sidecar / SDK / API lookup
Config Evidence snapshots, who/what/why/when, drift, rollout, rollback
```
The InfoQ framing adds three forward-looking elements that map directly onto this:
**reconciler-first control planes** (resolution as a continuous loop, à la GitOps),
**configuration knowledge graphs** (the `key → service → deployment → tenant →
feature → policy → secret → owner → incident` graph), and **AI-assisted decision
support** (surfacing blast radius and risk before a human approves a change) [1].
The knowledge-graph element is precisely ConfigAtlas's differentiator.
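
A minimal sketch of how such a graph could answer a blast-radius question, assuming edges point from a changed thing to what it can affect. The node names and the `EDGES` adjacency map are invented for illustration; a real edge set is one of the open questions in section 7.

```python
from collections import deque

# Hypothetical edges; "A -> B" reads "a change to A can affect B".
EDGES = {
    "key:checkout.timeout": ["service:checkout"],
    "service:checkout": ["deployment:checkout-prod-eu", "deployment:checkout-prod-us"],
    "deployment:checkout-prod-eu": ["tenant:acme"],
    "deployment:checkout-prod-us": ["tenant:globex"],
}


def blast_radius(start):
    """Breadth-first walk returning every node reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(sorted(blast_radius("key:checkout.timeout")))
```

Surfacing this set before approval is the kind of decision support the InfoQ framing describes.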
Guiding rule from the primer: **put config as close as possible to its owner, but
as high as necessary for consistency** — defaults with the product, guardrails
high and central, tenant prefs low, secrets outside, flags in the runtime plane,
infra state in GitOps.
---
## 6. The wedge and the white space
The defensible opening is **read-first configuration intelligence**, not
write-first control [internal, CompetitiveLandscape]. The category name
("Configuration Control Plane") is emerging and not yet owned — InfoQ frames it as
a pattern [1], CUE markets a product under the exact phrase [6], ConfigHub attacks
the same instinct from the data angle [10]. None yet own the **companywide living
configuration surface**: cross-tool discovery, effective-value resolution,
organizational scope/ownership governance, blast-radius/dependency intelligence,
and change evidence.
Sharpest positioning [internal]:
> **ConfigAtlas is not where all configuration must live. It is where
> configuration becomes visible, explainable, governable, and safe to change.**
---
## 7. Open questions to drive the next research pass
1. **Discovery connectors** — what is the minimum viable set of ingestion sources
(Git, K8s, Terraform state, a feature-flag platform, a secret manager) to
prove cross-tool effective-config resolution end to end?
2. **Effective-value provenance schema** — can the registry's entry schema carry
enough to render a full `config explain` (source layer, overrides, validating
schema, owner) without becoming a second source of truth for values?
3. **Graph model** — what is the canonical edge set for the configuration
knowledge graph, and does it reuse the State Hub's existing relationship model?
4. **CUE vs JSON Schema** for atlas entry validation — does order-independent
merge buy enough to justify the toolchain cost over JSON Schema? [2][3]
5. **AI-config as a first-class scope** — given the LaunchDarkly trajectory [16],
should "agent/model configuration" be a named scope class in the L-stack now?

research/sources.md Normal file

@@ -0,0 +1,99 @@
# Sources — Configuration Control Plane research
Annotated bibliography for [`configuration-control-plane.md`](configuration-control-plane.md).
Captured 2026-06-26. "internal" citations refer to this repo's own
[`wiki/ConfigLayering.md`](../wiki/ConfigLayering.md) and
[`wiki/CompetitiveLandscape.md`](../wiki/CompetitiveLandscape.md), which already
carry their own source lists.
## Category framing
1. **Configuration as a Control Plane: Designing for Safety and Reliability at Scale** — InfoQ.
The anchor source. Argues hyperscalers independently converged on the same safety
patterns (staged rollout, blast-radius containment, dependency-aware validation,
automated rollback) and names the emerging tech: reconciler-first control planes,
configuration knowledge graphs, AI-assisted decision support.
https://www.infoq.com/articles/configuration-control-plane/
6. **CUE Hub: the Configuration Control Plane** — CUE Labs.
Independent use of the exact category phrase; a vendor branding a product as
"the configuration control plane." Evidence the category name is forming.
https://cue.dev/blog/announcing-cue-labs/
## Layering, schema, and merge semantics
2. **Config Wars — Chapter 3: CUE** — Miru's Blog (Vedant Nair).
Comparative analysis of CUE vs JSON Schema vs Jsonnet merge semantics;
establishes CUE's commutative/associative/idempotent unification and Jsonnet's
order-dependent mixin composition.
https://mirurobotics.substack.com/p/config-wars-chapter-3-cue
3. **Data Validation use case** — CUE official docs.
Primary source: CUE merges schema and data; one definition both validates and
templates.
https://cuelang.org/docs/concept/data-validation-use-case/
8. **Declarative Management of Kubernetes Objects Using Kustomize** — Kubernetes docs.
Canonical base/overlay layering pattern.
https://kubernetes.io/docs/tasks/manage-kubernetes-objects/kustomization/
9. **Store config in the environment** — The Twelve-Factor App.
Foundational "separate config from code" principle underpinning the kind-separation.
https://12factor.net/config
## Configuration-as-data
10. **Introducing ConfigHub** — Brian Grant, ITNEXT.
Closest direct competitor; "configuration as authoritative data," WET rendered
config, versioned units, live-state reconciliation.
https://itnext.io/introducing-confighub-b127736641c5
11. **What is Configuration as Data?** — Brian Grant, ITNEXT.
Primary articulation of CaD vs IaC; data is authoritative, code operates on it separately.
https://itnext.io/what-is-configuration-as-data-210b0c4be324
12. **Configuration as Data** — ConfigHub docs.
Product-doc treatment of the same concept, incl. updating config from live state.
https://docs.confighub.com/background/config-as-data/
## GitOps / drift / desired vs effective state
13. **GitOps Prescription: Curing the Configuration Drift Epidemic** — BridgePhase.
Desired-state vs live-state reconciliation, drift detection/self-healing with
Argo CD and Flux.
https://bridgephase.com/insights/drift-detection/
## Feature flags, progressive delivery, AI-era config
14. **Kill switches vs progressive delivery** — Unleash.
Ring-based rollout, blast-radius containment, kill switch / circuit-breaker patterns.
https://www.getunleash.io/blog/kill-switch-vs-progressive-delivery
15. **7 Advanced Feature Flagging Best Practices for 2025** — OpsMoon.
Progressive delivery cohorts, SLO-triggered automated rollback.
https://opsmoon.com/blog/feature-flagging-best-practices/
16. **AI Configs is now GA: Runtime control for prompts and models** — LaunchDarkly.
Prompts/model selection as runtime config; <200ms propagation; guarded rollouts
that auto-revert on eval-metric regression.
https://launchdarkly.com/blog/ai-configs-ga-runtime-control-prompts-models/
17. **LaunchDarkly launches runtime control layer for the agentic AI era** — SiliconANGLE.
Independent coverage of AgentControl; runtime control of AI agents without redeploy.
https://siliconangle.com/2026/05/19/launchdarkly-launches-runtime-control-layer-agentic-ai-era/
## Outages — why configuration safety matters
4. **Configuration Change Trouble & Other 2024 Outage Trends** — ThousandEyes.
Names configuration change as a leading recurring outage cause.
https://www.thousandeyes.com/blog/internet-report-configuration-change-outages
5. **8 major IT disasters of 2024** — CIO.
CrowdStrike Falcon config update, AT&T equipment config error, McDonald's POS
third-party config change.
https://www.cio.com/article/3624552/8-major-it-disasters-of-2024.html
7. **Azure Front Door Outage: How a Single Control-Plane Defect Exposed Architectural Fragility** — InfoQ.
Control-plane defect as outage cause; reinforces the control-plane safety thesis.
https://www.infoq.com/news/2025/11/azure-afd-control-plane-failure/

wiki/BrandFrame.md Normal file

@@ -0,0 +1,7 @@
ConfigAtlas
The Configuration Control Plane for discovering, mapping, and governing
the living configuration surface of fast-moving companies.
Reveal every configuration scope, understand every override,
and govern change safely across systems, teams, tenants, and environments.


@@ -0,0 +1,378 @@
## Competitive landscape: Configuration Control Plane
As of June 26, 2026, “Configuration Control Plane” looks like an emerging category, not yet a mature analyst-defined software segment. The problem is recognized, though: modern configuration is increasingly treated as a live control surface that changes production behavior, affects reliability, and needs staged rollout, policy enforcement, rollback, blast-radius control, and explainability.
- https://www.infoq.com/articles/configuration-control-plane
For ConfigAtlas, the competition is therefore not one category. It is a converging market made from several adjacent tool families.
## 1. Direct and near-direct competitors
These are closest to the product idea.
| Player | What they do | Relevance to ConfigAtlas |
| --- | --- | --- |
| ConfigHub | Treats configuration as authoritative data, not generated files. It emphasizes API-based config reads/writes, versioned config units, WET “fully rendered” config, validation, policy checks, and live-state reconciliation. (ITNEXT) | Very close conceptual competitor. Strongest direct watch item. More focused on configuration-as-data and deployment operations than companywide discovery/governance. |
| Configu | Open-source / cloud “Configuration-as-Code” platform for managing application configuration across environments, with validation, dependency checks, integrations, secrets/feature flag awareness, and automation across storage systems. (configu.com) | Directly relevant for application config and ConfigOps. Less obviously positioned around organizational scope discovery, ownership graphs, or effective-config intelligence. |
| Pulumi ESC | Manages hierarchical environments, secrets, and configuration; supports composing environments, secret management, dynamic values from providers, and use from apps or Pulumi IaC. (pulumi) | Strong in environment/secrets/config composition. More developer/IaC-oriented than enterprise-wide configuration cartography. |
| Humanitec + Score | Humanitec's Platform Orchestrator generates deployment configuration from Score workload definitions; Score aims to provide platform-agnostic workload configuration and reduce environment inconsistency. (Humanitec) | Competes where the problem is “how do workloads get configured consistently?” Less focused on discovering existing scattered config and overlapping responsibilities. |
| Crossplane | A framework for building cloud-native control planes and declarative platform APIs. (docs.crossplane.io) | Not a config intelligence product, but a powerful “build your own control plane” substrate. Potential integration or infrastructure-layer competitor. |

---

# Highest-risk competitors

## 1. ConfigHub

ConfigHub is the most dangerous direct competitor because it has a very similar category instinct: configuration as structured data, API-addressable, versioned, queryable, validated, and operationally safer than template-driven Git workflows. ([ITNEXT](https://itnext.io/introducing-confighub-b127736641c5))

**ConfigAtlas differentiation:** go broader and more discovery-first: organizational config cartography, existing-tool ingestion, ownership and scope graph, unknown-unknown discovery, and effective-config explanation.

## 2. ServiceNow / CMDB ecosystem

Large enterprises may assume this belongs in ServiceNow or another CMDB. ServiceNow defines CMDB around CIs and relationships across infrastructure and services. ([ServiceNow](https://www.servicenow.com/products/it-operations-management/what-is-cmdb.html?utm_source=chatgpt.com))

**ConfigAtlas differentiation:** CMDBs know assets; ConfigAtlas knows layered behavioral control information. Integrate rather than replace.

## 3. LaunchDarkly / feature management platforms

Feature management platforms already own runtime behavior changes and progressive delivery. LaunchDarkly explicitly markets runtime control, progressive release, automated rollback, AI agent control, and cost/performance optimization for AI workloads. ([LaunchDarkly](https://launchdarkly.com/?utm_source=chatgpt.com))

**ConfigAtlas differentiation:** treat feature flags as one class of configuration scope among many, not the whole control plane.

## 4. Humanitec / platform engineering stack

Humanitec/Score is strong where the buyer wants standardized workload configuration and developer self-service. ([Humanitec](https://developer.humanitec.com/app-humanitec-io/docs/platform-orchestrator/overview/?utm_source=chatgpt.com))

**ConfigAtlas differentiation:** discover and govern config across the company, including legacy and already-existing config, not only platform-generated workload config.

## 5. CoreView/AppOmni/SSPM tools

They validate that SaaS configuration drift and tenant resilience are becoming board-level concerns, especially in Microsoft 365 and SaaS-heavy companies. ([coreview.com](https://www.coreview.com/configuration-manager?utm_source=chatgpt.com))

**ConfigAtlas differentiation:** become the broader cross-domain configuration map, while SSPM remains a specialized security-posture input.

---

# Suggested wedge for ConfigAtlas

The best initial wedge is **read-first configuration intelligence**, not write-first control.

Start with:

```text
discover config sources
classify config by kind and scope
build ownership graph
detect duplicates and conflicts
show effective config paths
surface unknown owners and risky overrides
generate audit/evidence reports
integrate with existing tools
```

Only later add:

```text
controlled changes
approval workflows
policy enforcement
safe rollout
rollback orchestration
runtime override management
```

That reduces adoption friction. Companies are more willing to connect a discovery and evidence layer than to hand over control of production configuration on day one.

---

# My overall assessment

The market is **real but fragmented**. The exact phrase **Configuration Control Plane** is not yet fully owned, which is good. The strongest adjacent categories are already crowded, but none of them fully cover the **companywide living configuration surface**.

**ConfigAtlas has a credible opening if it becomes the map, resolver, and evidence layer across existing systems.**

The sharpest positioning:

> **ConfigAtlas is not where all configuration must live. It is where configuration becomes visible, explainable, governable, and safe to change.**

## Competitive landscape: Configuration Control Plane
As of **June 26, 2026**, “Configuration Control Plane” looks like an **emerging category**, not yet a mature analyst-defined software segment. The problem is recognized, though: modern configuration is increasingly treated as a live control surface that changes production behavior, affects reliability, and needs staged rollout, policy enforcement, rollback, blast-radius control, and explainability. ([[InfoQ](https://www.infoq.com/articles/configuration-control-plane/)][1])
For **ConfigAtlas**, the competition is therefore not one category. It is a **converging market** made from several adjacent tool families.
---
# 1. Direct and near-direct competitors
These are closest to the product idea.
| Player | What they do | Relevance to ConfigAtlas |
| --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **ConfigHub** | Treats configuration as authoritative data, not generated files. It emphasizes API-based config reads/writes, versioned config units, WET “fully rendered” config, validation, policy checks, and live-state reconciliation. ([[ITNEXT](https://itnext.io/introducing-confighub-b127736641c5)][2]) | Very close conceptual competitor. Strongest direct watch item. More focused on configuration-as-data and deployment operations than companywide discovery/governance. |
| **Configu** | Open-source / cloud “Configuration-as-Code” platform for managing application configuration across environments, with validation, dependency checks, integrations, secrets/feature flag awareness, and automation across storage systems. ([[configu.com](https://configu.com/)][3]) | Directly relevant for application config and ConfigOps. Less obviously positioned around organizational scope discovery, ownership graphs, or effective-config intelligence. |
| **Pulumi ESC** | Manages hierarchical environments, secrets, and configuration; supports composing environments, secret management, dynamic values from providers, and use from apps or Pulumi IaC. ([[pulumi](https://www.pulumi.com/docs/esc/environments/?utm_source=chatgpt.com)][4]) | Strong in environment/secrets/config composition. More developer/IaC-oriented than enterprise-wide configuration cartography. |
| **Humanitec + Score** | Humanitec's Platform Orchestrator generates deployment configuration from Score workload definitions; Score aims to provide platform-agnostic workload configuration and reduce environment inconsistency. ([[Humanitec](https://developer.humanitec.com/app-humanitec-io/docs/platform-orchestrator/overview/?utm_source=chatgpt.com)][5]) | Competes where the problem is “how do workloads get configured consistently?” Less focused on discovering existing scattered config and overlapping responsibilities. |
| **Crossplane** | A framework for building cloud-native control planes and declarative platform APIs. ([[docs.crossplane.io](https://docs.crossplane.io/latest/whats-crossplane/?utm_source=chatgpt.com)][6]) | Not a config intelligence product, but a powerful “build your own control plane” substrate. Potential integration or infrastructure-layer competitor. |
**Interpretation:**
The nearest direct threat is **ConfigHub**, because it attacks the same philosophical pain: configuration is graph-shaped operational data, not just files and variables. **Configu** is also close, especially for application configuration and configuration-as-code workflows. **Pulumi ESC** is close around hierarchical environment config and secrets. **Humanitec/Score** is close around workload deployment configuration.
ConfigAtlas should avoid sounding like “another place to store config.” The stronger wedge is:
> **Discover, map, explain, and govern configuration across existing tools before trying to replace them.**
---
# 2. Runtime configuration and feature flag platforms
This is the most mature adjacent category. These tools already own a lot of “live behavior control.”
| Segment | Examples | Strength | Gap vs ConfigAtlas |
| ------------------------------ | -------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------ |
| Enterprise feature management | [LaunchDarkly](https://launchdarkly.com/?utm_source=chatgpt.com), Harness/Split, Optimizely, Statsig, DevCycle | Runtime flags, targeting, progressive delivery, experimentation, rollback, observability. LaunchDarkly now positions itself around runtime control for code and AI-era software. ([LaunchDarkly][7]) | They govern flags well, but usually not the whole company configuration surface. |
| Open-source feature management | [Unleash](https://www.getunleash.io/?utm_source=chatgpt.com), Flagsmith, GrowthBook | Self-hosting, flag governance, remote config, segmentation, experimentation. ([Unleash][8]) | Strong for feature exposure, weaker for infrastructure, secrets, SaaS tenants, policy, and ownership graphs. |
| Standards layer | OpenFeature | Vendor-neutral feature flag API that helps avoid SDK lock-in and supports multiple backends. ([[openfeature.dev](https://openfeature.dev/?utm_source=chatgpt.com)][9]) | Important integration target, not a full control plane. |
| Cloud-native dynamic config    | AWS AppConfig, Azure App Configuration, Firebase Remote Config | Dynamic config, feature flags, validation, targeted rollout, app behavior changes without redeploy. ([[AWS Documentation](https://docs.aws.amazon.com/appconfig/latest/userguide/what-is-appconfig.html?utm_source=chatgpt.com)][10]) | Powerful inside a cloud/app ecosystem, but not cross-company config cartography. |
**Strategic conclusion:**
Feature flag platforms are not just competitors; they are **required integrations**. ConfigAtlas should not replace LaunchDarkly, Unleash, Flagsmith, AWS AppConfig, or Azure App Configuration. It should inventory them, classify flags/configs by scope and owner, detect stale flags, connect them to services and tenants, and explain how runtime behavior is actually controlled.
---
# 3. GitOps, IaC, and deployment configuration
This is where a lot of current config ownership already lives.
| Segment | Examples | Strength | Gap vs ConfigAtlas |
| ---------------------------- | --------------------------- | ---------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| GitOps CD | Argo CD, Flux | Git as desired-state source of truth; continuous reconciliation for Kubernetes. ([[argo-cd.readthedocs.io](https://argo-cd.readthedocs.io/?utm_source=chatgpt.com)][11]) | Good at deployment state, weak at cross-tool discovery and semantic config ownership. |
| IaC | Terraform, OpenTofu, Pulumi | Declarative infrastructure lifecycle, versioning, repeatable provisioning. ([[HashiCorp Developer](https://developer.hashicorp.com/terraform/tutorials/aws-get-started/infrastructure-as-code?utm_source=chatgpt.com)][12]) | Great for managed infrastructure, but not enough for runtime config, feature flags, SaaS config, manual drift, or business ownership. |
| IaC orchestration/governance | Spacelift, env0 | Drift detection, policy, RBAC, automation around Terraform/OpenTofu/Pulumi workflows. ([[docs.spacelift.io](https://docs.spacelift.io/concepts/stack/drift-detection?utm_source=chatgpt.com)][13]) | Often stack/workspace-centric, not companywide config intelligence. |
| Cloud asset/IaC drift | [Firefly](https://www.firefly.ai/get-firefly?utm_source=chatgpt.com) | Cloud asset visibility, codification, drift detection, remediation PRs, policy. ([Firefly][14]) | Strong for cloud resources and IaC drift; less focused on application/tenant/user/feature configuration layers. |
| Guardrailed cloud config | Resourcely | Blueprints and guardrails for secure-by-default Terraform/OpenTofu infrastructure configuration. ([[GlobeNewswire](https://www.globenewswire.com/news-release/2024/10/16/2964281/0/en/resourcely-reinvents-infrastructure-devops-with-configuration-platform-for-scaling-hashicorp-s-terraform-and-opentofu-and-launches-new-free-tier.html?utm_source=chatgpt.com)][15]) | Strong “paved road” IaC config, but narrower than full config surface discovery. |
**Strategic conclusion:**
This market owns the “desired state” narrative. ConfigAtlas should complement it with the “effective configuration” narrative:
> GitOps tells you what you intended to deploy. ConfigAtlas tells you which configuration scopes exist, what actually applies, who owns it, what conflicts, and what changes are risky.
---
# 4. Secrets management
Secrets are configuration-adjacent but must remain separate.
| Examples | Strength | Gap vs ConfigAtlas |
| -------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| HashiCorp Vault, OpenBao, Infisical, Doppler | Centralized secrets, identity-based access, rotation, audit, certificates, keys, developer workflows. ([[HashiCorp \| An IBM Company](https://www.hashicorp.com/en/products/vault?utm_source=chatgpt.com)][16]) | They manage sensitive values, but not the broader configuration topology, ownership model, effective config, or non-secret runtime behavior. |
**Strategic conclusion:**
ConfigAtlas should **never try to become the secret vault**. It should store metadata and references: which config depends on which secret, who owns the dependency, where it is injected, which environments or tenants are affected, and whether the secret lifecycle is safe.
---
# 5. Policy-as-code and configuration guardrails
These tools enforce rules, but usually do not discover the whole map.
| Examples | Strength | Gap vs ConfigAtlas |
| ----------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Open Policy Agent, Kyverno, Checkov | Policy enforcement across Kubernetes, CI/CD, microservices, IaC, Dockerfiles, Helm, Terraform, and more. ([[openpolicyagent.org](https://openpolicyagent.org/docs?utm_source=chatgpt.com)][17]) | They answer “is this allowed?” but not always “where did this config come from, who owns it, what overrides it, what depends on it, and what effective behavior results?” |
**Strategic conclusion:**
OPA/Kyverno/Checkov are ideal **policy backends** or validation integrations for ConfigAtlas. ConfigAtlas should become the higher-level context and evidence layer around them.
---
# 6. CMDB, ITSM, and asset discovery
This is the enterprise incumbent category with the biggest installed-base gravity.
| Examples | Strength | Gap vs ConfigAtlas |
| -------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [ServiceNow](https://www.servicenow.com/products/it-operations-management/what-is-cmdb.html?utm_source=chatgpt.com) CMDB, BMC Helix CMDB, OpenText Universal Discovery and CMDB, Device42 | Configuration items, IT assets, service relationships, infrastructure discovery, dependency mapping, ITSM/change workflows. ([ServiceNow][18]) | CMDBs model assets and services, but often do not model modern layered application configuration, feature flags, tenant overrides, Helm values, GitOps overlays, runtime flags, or effective config resolution well. |
**Strategic conclusion:**
CMDB is budget competition and integration territory. The positioning should be:
> ConfigAtlas is not another CMDB. It is the configuration intelligence layer that enriches CMDB/service catalogs with live configuration scope, override, ownership, and evidence data.
---
# 7. Internal developer portals and service catalogs
These tools are natural homes for ownership and maturity information.
| Examples | Strength | Gap vs ConfigAtlas |
| --------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| Backstage, Port, Cortex, OpsLevel | Service catalogs, ownership metadata, scorecards, standards, self-service workflows, production-readiness tracking. ([[backstage.io](https://backstage.io/docs/features/software-catalog/?utm_source=chatgpt.com)][19]) | They know “which service exists and who owns it,” but not necessarily the full layered config surface or effective config resolution. |
**Strategic conclusion:**
Developer portals should be distribution surfaces for ConfigAtlas insights. A Backstage/Port/OpsLevel plugin could be a strong adoption path.
---
# 8. SaaS tenant and security posture management
This is a very interesting adjacent space because SaaS platforms are full of hidden configuration.
| Examples | Strength | Gap vs ConfigAtlas |
| ----------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| CoreView, AppOmni, SSPM tools | SaaS configuration monitoring, tenant posture, drift, security misconfiguration, Microsoft 365/SaaS governance. CoreView specifically markets Microsoft 365 configuration drift, audit, backup, and restore; AppOmni describes SSPM as continuously monitoring SaaS app configuration and usage. ([[coreview.com](https://www.coreview.com/configuration-manager?utm_source=chatgpt.com)][20]) | Usually security/posture focused and SaaS-specific, not a general configuration control plane across product, infra, runtime, tenants, and organizational scopes. |
**Strategic conclusion:**
This validates the “companywide config surface” idea beyond DevOps. SaaS tenant config, identity config, collaboration settings, and policy settings are all part of the same configuration fabric.
---
# Competitive white space
The white space for **ConfigAtlas** is not “store config better.” Several tools already do that.
The stronger unsolved space is:
## 1. Configuration discovery across all scopes
Most tools manage the config they own. Few discover config across:
```text
repos
CI/CD variables
Kubernetes ConfigMaps / Secrets references
Helm values
Terraform/OpenTofu variables
cloud parameter stores
feature flag platforms
secret managers
SaaS tenant settings
policy engines
developer portals
manual runtime overrides
tenant/customer/admin settings
```
This is the **Atlas** opportunity: map the territory before controlling it.
## 2. Effective configuration resolution
Many tools show declared values. Fewer answer:
```text
What value actually applies here?
Which layer won?
What did it override?
Which policy constrained it?
Which tenant/user/environment is affected?
Which service behavior changes?
```
This is the difference between a config database and a **Configuration Control Plane**.
## 3. Scope and responsibility governance
Current tools usually model technical ownership. ConfigAtlas can model organizational reality:
```text
company baseline
security guardrail
platform default
environment overlay
regional override
installation setting
tenant entitlement
customer preference
group rule
user/agent-specific behavior
emergency override
```
That scope model is central to the ConfigAtlas product vision.
## 4. Dependency and blast-radius intelligence
Config changes often affect more than their local file or service. InfoQ's 2026 framing emphasizes staged rollout, blast-radius containment, validation, and rollback as emerging common safety patterns for configuration at scale. ([[InfoQ](https://www.infoq.com/articles/configuration-control-plane/)][1])
ConfigAtlas can differentiate by building a configuration graph:
```text
config key -> service -> deployment -> tenant -> feature -> policy -> secret -> owner -> incident history
```
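As a minimal sketch of what “blast radius” could mean on such a graph, the example below walks outward from one config key and lists everything reachable from it. The node names and the `EDGES` adjacency map are hypothetical illustration data, not a ConfigAtlas schema; a real graph would be built from discovered sources.

```python
from collections import deque

# Hypothetical edges: "X": [Y, ...] means a change to X can affect each Y.
EDGES = {
    "key:mail.delivery.max_batch_size": ["service:mailer"],
    "service:mailer": ["deployment:mailer-prod"],
    "deployment:mailer-prod": ["tenant:acme", "tenant:globex"],
    "tenant:acme": ["feature:bulk-send"],
}


def blast_radius(start, edges):
    """Return every node reachable from start, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


affected = blast_radius("key:mail.delivery.max_batch_size", EDGES)
```

Breadth-first order lists the nearest dependents first, which is a reasonable default for a review screen; the traversal itself does not weigh how severe each dependency is.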
## 5. Evidence, audit, and change explainability
The enterprise buyer will care less about “cool config storage” and more about:
```text
Who changed this?
Why?
Which systems consumed it?
Was it approved?
Was it validated?
What broke?
How do we roll back?
Is this still used?
```
That bridges platform engineering, SRE, security, compliance, and IT governance.
---
# Competitive positioning for ConfigAtlas
I would position it like this:
> **ConfigAtlas is the Configuration Control Plane for discovering, mapping, explaining, and governing the configuration surface of fast-moving companies. It integrates with existing GitOps, IaC, feature flag, secret, policy, CMDB, and developer portal tools to reveal effective configuration, ownership, overrides, dependencies, drift, and change risk.**
The crucial phrase is **integrates with existing tools**. That avoids direct displacement battles too early.
## Category distinction
| Existing category | Core question | ConfigAtlas question |
| ------------------ | ----------------------------------------- | --------------------------------------------------------------------- |
| Feature flags | Can we change behavior safely at runtime? | Which runtime controls exist, who owns them, and what do they affect? |
| GitOps/IaC | Is desired state declared and reconciled? | What config scopes contributed to the effective state? |
| Secrets management | Are sensitive values protected? | Which configuration depends on which secrets and where? |
| Policy-as-code | Is this change allowed? | Which policy applies, why, and at which scope? |
| CMDB | What assets and services exist? | What configuration controls their behavior? |
| Developer portal | Who owns this service? | Who owns each config scope and override path? |
| SSPM | Is SaaS configuration secure? | How does SaaS config fit into the companywide config surface? |
---
# Highest-risk competitors
## 1. ConfigHub
ConfigHub is the most dangerous direct competitor because it has a very similar category instinct: configuration as structured data, API-addressable, versioned, queryable, validated, and operationally safer than template-driven Git workflows. ([[ITNEXT](https://itnext.io/introducing-confighub-b127736641c5)][2])
**ConfigAtlas differentiation:** go broader and more discovery-first: organizational config cartography, existing-tool ingestion, ownership and scope graph, unknown-unknown discovery, and effective-config explanation.
## 2. ServiceNow / CMDB ecosystem
Large enterprises may assume this belongs in ServiceNow or another CMDB. ServiceNow defines CMDB around CIs and relationships across infrastructure and services. ([[ServiceNow](https://www.servicenow.com/products/it-operations-management/what-is-cmdb.html?utm_source=chatgpt.com)][18])
**ConfigAtlas differentiation:** CMDBs know assets; ConfigAtlas knows layered behavioral control information. Integrate rather than replace.
## 3. LaunchDarkly / feature management platforms
Feature management platforms already own runtime behavior changes and progressive delivery. LaunchDarkly explicitly markets runtime control, progressive release, automated rollback, AI agent control, and cost/performance optimization for AI workloads. ([[LaunchDarkly](https://launchdarkly.com/?utm_source=chatgpt.com)][7])
**ConfigAtlas differentiation:** treat feature flags as one class of configuration scope among many, not the whole control plane.
## 4. Humanitec / platform engineering stack
Humanitec/Score is strong where the buyer wants standardized workload configuration and developer self-service. ([[Humanitec](https://developer.humanitec.com/app-humanitec-io/docs/platform-orchestrator/overview/?utm_source=chatgpt.com)][5])
**ConfigAtlas differentiation:** discover and govern config across the company, including legacy and already-existing config, not only platform-generated workload config.
## 5. CoreView/AppOmni/SSPM tools
They validate that SaaS configuration drift and tenant resilience are becoming board-level concerns, especially in Microsoft 365 and SaaS-heavy companies. ([[coreview.com](https://www.coreview.com/configuration-manager?utm_source=chatgpt.com)][20])
**ConfigAtlas differentiation:** become the broader cross-domain configuration map, while SSPM remains a specialized security-posture input.
---
# Suggested wedge for ConfigAtlas
The best initial wedge is **read-first configuration intelligence**, not write-first control.
Start with:
```text
discover config sources
classify config by kind and scope
build ownership graph
detect duplicates and conflicts
show effective config paths
surface unknown owners and risky overrides
generate audit/evidence reports
integrate with existing tools
```
Only later add:
```text
controlled changes
approval workflows
policy enforcement
safe rollout
rollback orchestration
runtime override management
```
That reduces adoption friction. Companies are more willing to connect a discovery and evidence layer than to hand over control of production configuration on day one.
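The “detect duplicates and conflicts” step in the read-first list can be illustrated with a small sketch. It assumes each discovered source has already been flattened into a `key -> value` mapping; the source names and keys are hypothetical examples.

```python
from collections import defaultdict


def find_conflicts(sources):
    """Return keys that are set to different values in different sources.

    sources maps a source name to a flat dict of key -> value.
    """
    seen = defaultdict(dict)
    for source, values in sources.items():
        for key, value in values.items():
            seen[key][source] = value
    return {
        key: by_source
        for key, by_source in seen.items()
        # repr() lets unhashable values such as lists be compared too.
        if len({repr(v) for v in by_source.values()}) > 1
    }


# Hypothetical discovered sources.
SOURCES = {
    "helm/values.yaml": {"log.level": "info", "replicas": 3},
    "ci/variables": {"log.level": "debug"},
    "terraform/vars": {"replicas": 3},
}

conflicts = find_conflicts(SOURCES)
```

Here `replicas` is a duplicate (same value in two places) and `log.level` is a conflict (different values). Only the conflict is reported; listing pure duplicates would be a one-line variation.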
---
# Overall assessment
The market is **real but fragmented**. The exact phrase **Configuration Control Plane** is not yet fully owned, which is good. The strongest adjacent categories are already crowded, but none of them fully covers the **companywide living configuration surface**.
**ConfigAtlas has a credible opening if it becomes the map, resolver, and evidence layer across existing systems.**
Positioning guidance:
> **ConfigAtlas is not where all configuration must live. It is where configuration becomes visible, explainable, governable, and safe to change.**
[1]: https://www.infoq.com/articles/configuration-control-plane/ "Configuration as a Control Plane: Designing for Safety and Reliability at Scale - InfoQ"
[2]: https://itnext.io/introducing-confighub-b127736641c5 "Introducing ConfigHub. Why ConfigHub manages configuration as… | by Brian Grant | ITNEXT"
[3]: https://configu.com/ "Configuration Management Reimagined - Configu"
[4]: https://www.pulumi.com/docs/esc/environments/?utm_source=chatgpt.com "Pulumi ESC Environments"
[5]: https://developer.humanitec.com/app-humanitec-io/docs/platform-orchestrator/overview/?utm_source=chatgpt.com "Platform Orchestrator: Overview"
[6]: https://docs.crossplane.io/latest/whats-crossplane/?utm_source=chatgpt.com "What's Crossplane? · Crossplane v2.3"
[7]: https://launchdarkly.com/?utm_source=chatgpt.com "LaunchDarkly: Runtime Control for AI-Era Software | Feature ..."
[8]: https://www.getunleash.io/?utm_source=chatgpt.com "Feature Management Platform / Feature Flags for Large ..."
[9]: https://openfeature.dev/?utm_source=chatgpt.com "OpenFeature"
[10]: https://docs.aws.amazon.com/appconfig/latest/userguide/what-is-appconfig.html?utm_source=chatgpt.com "What is AWS AppConfig? - AWS AppConfig"
[11]: https://argo-cd.readthedocs.io/?utm_source=chatgpt.com "Argo CD - Declarative GitOps CD for Kubernetes"
[12]: https://developer.hashicorp.com/terraform/tutorials/aws-get-started/infrastructure-as-code?utm_source=chatgpt.com "What is Infrastructure as Code with Terraform?"
[13]: https://docs.spacelift.io/concepts/stack/drift-detection?utm_source=chatgpt.com "Drift detection"
[14]: https://www.firefly.ai/get-firefly?utm_source=chatgpt.com "Manage Your Cloud with Infrastructure-as-Code - Firefly"
[15]: https://www.globenewswire.com/news-release/2024/10/16/2964281/0/en/resourcely-reinvents-infrastructure-devops-with-configuration-platform-for-scaling-hashicorp-s-terraform-and-opentofu-and-launches-new-free-tier.html?utm_source=chatgpt.com "Resourcely Reinvents Infrastructure DevOps With"
[16]: https://www.hashicorp.com/en/products/vault?utm_source=chatgpt.com "HashiCorp Vault | Identity-based secrets management"
[17]: https://openpolicyagent.org/docs?utm_source=chatgpt.com "Open Policy Agent (OPA)"
[18]: https://www.servicenow.com/products/it-operations-management/what-is-cmdb.html?utm_source=chatgpt.com "What is a configuration management database (CMDB)?"
[19]: https://backstage.io/docs/features/software-catalog/?utm_source=chatgpt.com "Backstage Software Catalog and Developer Platform"
[20]: https://www.coreview.com/configuration-manager?utm_source=chatgpt.com "CoreView Configuration Manager For Microsoft"

314
wiki/ConfigLayering.md Normal file

@@ -0,0 +1,314 @@
# Introduction to ConfigLayering
**ConfigLayering** is the practice of composing the effective configuration of a system from multiple ordered configuration scopes. Instead of assuming that configuration lives in one file, one database table, one environment, or one operations team, ConfigLayering recognizes that real companies accumulate configuration across many places: application defaults, infrastructure code, deployment environments, security policies, tenant settings, feature flags, secrets, operational overrides, user preferences, and emergency controls.
The key insight is that configuration is not merely data. Configuration is distributed control information. It determines how systems behave, which capabilities are available, who may access what, which limits apply, which integrations are active, how risks are constrained, and how business rules become executable. In a fast-moving company, this control information is rarely cleanly centralized. It is layered across systems, teams, tools, vendors, environments, tenants, and responsibilities.
ConfigLayering therefore asks a central question:
> How is the final, effective configuration of a system produced from all relevant scopes, and can that result be discovered, explained, validated, governed, and safely changed?
A simple layering model may start with product defaults, then add company baselines, platform or domain settings, environment-specific overlays, regional or cluster-specific settings, installation-specific settings, tenant or customer settings, role or group settings, user or agent settings, and finally temporary runtime or incident overrides. Each layer contributes values, constraints, defaults, or policies. More specific layers may override broader layers, but some higher-level layers may define non-overridable guardrails.
This makes ConfigLayering both a technical and organizational discipline. Technically, it requires clear precedence, schema validation, merge rules, runtime delivery, secrets handling, feature-flag separation, policy enforcement, rollback, and observability. Organizationally, it requires ownership, scope boundaries, change authority, evidence, auditability, and conflict resolution.
The most important concept is the **effective configuration**: the final configuration that actually applies to a given system, environment, tenant, user, request, or operational situation. Individual files or settings are only partial evidence. The effective configuration is the resolved result of all relevant layers. Without visibility into this result, organizations can know what was configured somewhere, but not what is actually in force.
This has several implications.
First, configuration needs a scope model. A company must know whether a setting belongs to the product, the platform, an environment, a region, an installation, a tenant, a group, a user, an agent, or an emergency override. Without a scope model, configuration becomes a mixture of local decisions and inherited assumptions.
Second, configuration needs an ownership model. Every meaningful key should have an owner, a purpose, a lifecycle, and a change process. A setting without an owner becomes operational debt. A setting with multiple implicit owners becomes a conflict surface.
Third, configuration needs explicit precedence and merge semantics. It must be clear which layer wins, whether objects are deep-merged, whether lists are replaced or merged by key, whether null means “unset” or “delete,” and whether a value may be overridden at all.
Fourth, configuration needs separation by kind. Ordinary runtime configuration, infrastructure desired state, secrets, feature flags, policies, tenant entitlements, and emergency controls should not be treated as the same thing. They differ in sensitivity, lifecycle, blast radius, mutability, and governance requirements.
Fifth, configuration needs evidence. For every effective value, it should be possible to answer: where did this value come from, what did it override, who owns it, when was it changed, why was it changed, which systems consume it, and how can it be rolled back?
Best practice is to treat ConfigLayering as a governed configuration supply chain. Configuration should be declared as close as possible to its natural owner, but as high as necessary for consistency and control. Product defaults belong with the product. Company baselines and security guardrails belong in controlled central layers. Environment and deployment settings belong in platform or operations layers. Tenant settings belong in tenant-governed scopes. Secrets belong in dedicated secret management. Feature flags belong in runtime control infrastructure. Emergency overrides require strong audit and expiry.
A practical ConfigLayering standard should define:
1. A canonical configuration registry for all known keys.
2. A scope model describing where configuration may exist.
3. A precedence model describing which layers override others.
4. A schema model describing valid types, ranges, defaults, and constraints.
5. A policy model describing what may never be overridden.
6. A secrets model keeping sensitive values outside ordinary configuration.
7. A feature-control model for runtime behavior switches.
8. An evidence model for audit, rollback, drift detection, and effective-config explanation.
9. A lifecycle model for deprecating, replacing, and migrating configuration keys.
10. An observability model for inspecting redacted effective configuration safely.
The goal is not to centralize every setting into one giant configuration database. That would create a different kind of fragility. The goal is to make the distributed configuration surface of the company discoverable, explainable, governable, and safe to evolve.
In this sense, ConfigLayering is the foundation of a Configuration Control Plane. It turns scattered configuration from an unmanaged source of operational risk into a visible, structured, and auditable company capability.
## What config layering is
**Config layering** means building the final, effective configuration for a system from multiple ordered scopes. Each layer contributes defaults, constraints, or overrides. The result should be deterministic and explainable.
A simple model:
```text
product defaults
< company baseline
< domain/platform baseline
< environment: dev/stage/prod
< region / datacenter / cluster
< installation / deployment
< tenant / customer / community
< group / role
< user / agent
< temporary runtime / incident override
```
“More specific wins” is the normal rule, but company or security layers may define **non-overridable guardrails**.
This is the same basic pattern behind Kubernetes Kustomize bases/overlays, Helm values precedence, NixOS modules, and many application configuration frameworks: start with a reusable base, apply increasingly specific overlays, and produce one effective configuration. Kubernetes documents Kustomize in terms of bases and overlays, Helm explicitly defines override precedence for values, and NixOS uses a modular declarative system configuration model. ([[Kubernetes](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/kustomization/?utm_source=chatgpt.com)][1])
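A minimal sketch of “more specific wins, except for guardrails” follows. The shortened `LAYER_ORDER`, the `guardrails` mapping (key to the layer that pins it), and the sample keys are all assumptions made for illustration; none of them is a defined ConfigAtlas interface.

```python
# Shortened, hypothetical layer order: least specific first.
LAYER_ORDER = ["product", "company", "environment", "tenant", "override"]


def resolve(layers, guardrails):
    """Resolve layers in order and refuse overrides of pinned keys.

    layers:     layer name -> {key: value}
    guardrails: key -> name of the layer that pins it; layers that are
                more specific than that layer may not change the key.
    Returns (effective, rejected), where rejected lists the refused
    (layer, key, value) attempts so they can be reported.
    """
    effective = {}
    rejected = []
    for index, name in enumerate(LAYER_ORDER):
        for key, value in layers.get(name, {}).items():
            pinned_at = guardrails.get(key)
            if pinned_at is not None and index > LAYER_ORDER.index(pinned_at):
                rejected.append((name, key, value))
                continue
            effective[key] = value
    return effective, rejected


layers = {
    "product": {"tls.min_version": "1.0", "mail.batch": 500},
    "company": {"tls.min_version": "1.2"},
    "tenant": {"tls.min_version": "1.0", "mail.batch": 1000},
}
effective, rejected = resolve(layers, {"tls.min_version": "company"})
```

The tenant layer wins for `mail.batch`, but its attempt to lower `tls.min_version` is recorded in `rejected` instead of being applied. Returning the refused attempts rather than dropping them silently keeps the result explainable.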
## The most important distinction
Do **not** treat all config as one thing. A companywide config strategy should separate at least these categories:
1. **Code defaults**: safe defaults shipped with the app.
2. **Deployment config**: environment, endpoints, resource limits, region, cluster, runtime mode.
3. **Secrets**: passwords, tokens, keys, certificates.
4. **Feature flags**: runtime behavior switches and experiments.
5. **Policy config**: access rules, compliance constraints, guardrails.
6. **Tenant/customer config**: entitlements, limits, preferences, routing.
7. **Operational overrides**: incident switches, kill switches, temporary throttles.
8. **Infrastructure desired state**: machines, networks, Kubernetes objects, IAM, storage.
The Twelve-Factor App principle is still useful here: configuration should be separated from code, commonly exposed through the environment at runtime. In Kubernetes, ConfigMaps are explicitly meant to decouple environment-specific configuration from container images, while Secrets are a separate object type for sensitive values. ([[12factor.net](https://12factor.net/config?utm_source=chatgpt.com)][2])
## Best-practice architecture
For systemwide or companywide config management, I would use this structure:
```text
Config Registry
- canonical keys
- descriptions
- owners
- schema
- allowed scopes
- mutability class
- security classification
- default value policy
Config Sources
- Git repos for declarative desired state
- secret manager for secrets
- runtime config / feature flag service for dynamic behavior
- tenant/admin UI for allowed business-level settings
Config Resolution Engine
- deterministic layer ordering
- validation
- conflict detection
- policy enforcement
- effective-config rendering
Config Distribution
- env vars
- generated files
- Kubernetes ConfigMaps / Secrets
- sidecar / agent
- SDK lookup
- API lookup
Config Evidence
- audit log
- effective config snapshots
- who changed what, when, why
- rollback points
- drift detection
```
The core idea: **Git for declarative desired state, a secret manager for secrets, a feature/config service for dynamic runtime behavior, and policy-as-code for guardrails.** Argo CD describes the GitOps model as automating desired application states in target environments; Terraform similarly uses human-readable declarative configuration files for infrastructure lifecycle management. ([[argo-cd.readthedocs.io](https://argo-cd.readthedocs.io/?utm_source=chatgpt.com)][3])
## Recommended layer model
A practical companywide layering standard could look like this:
```text
L0 vendor/product defaults
L1 company baseline
L2 platform/domain baseline
L3 environment overlay: dev, test, stage, prod
L4 region/zone/cluster overlay
L5 installation/deployment overlay
L6 tenant/customer/community overlay
L7 group/role overlay
L8 user/agent/workload overlay
L9 emergency/runtime override
```
Each config key should declare which layers may override it. For example:
```yaml
key: mail.delivery.max_batch_size
type: integer
default: 500
allowed_layers:
- company
- environment
- installation
- tenant
minimum: 1
maximum: 5000
hot_reloadable: true
owner: platform-delivery
security_class: operational
```
That gives you a companywide contract: teams know what the key means, who owns it, where it may be changed, and what values are legal.
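To show how such a contract could be enforced, here is a minimal sketch that checks a proposed override against the registry entry above. `ENTRY` mirrors the YAML as a plain dict so the example needs no YAML parser; `check_override` is a hypothetical helper, and only the integer type is handled.

```python
# The registry entry from the YAML example, as a plain dict.
ENTRY = {
    "key": "mail.delivery.max_batch_size",
    "type": "integer",
    "default": 500,
    "allowed_layers": ["company", "environment", "installation", "tenant"],
    "minimum": 1,
    "maximum": 5000,
}


def check_override(entry, layer, value):
    """Return a list of violations; an empty list means the override is legal."""
    problems = []
    if layer not in entry["allowed_layers"]:
        problems.append(f"layer {layer!r} may not override {entry['key']}")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        problems.append(f"expected integer, got {type(value).__name__}")
        return problems
    if not entry["minimum"] <= value <= entry["maximum"]:
        problems.append(
            f"{value} is outside [{entry['minimum']}, {entry['maximum']}]"
        )
    return problems
```

For example, `check_override(ENTRY, "tenant", 1000)` returns an empty list, while an override from a `user` layer or a value of `9000` each produces one violation.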
## Merge rules matter a lot
The most dangerous part of config layering is vague merge behavior. Define it explicitly.
Good default rules:
```text
scalar: more specific layer replaces earlier value
object/map: deep merge by key
array/list: replace by default, unless keyed merge is explicitly declared
null: not deletion unless tombstone semantics are defined
secret: never merged into normal config
policy: restrictive rule wins unless explicitly delegated
```
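The scalar, map, list, and null rules above can be sketched in a few lines. This is one possible reading, not a definitive implementation: it assumes layers are plain nested dictionaries, that `None` means "no opinion", and that deletion requires a hypothetical `TOMBSTONE` sentinel. The secret and policy rules are out of scope here.

```python
from typing import Any

# Hypothetical sentinel: in this sketch it is the only way to delete a key.
TOMBSTONE = object()


def merge(base: Any, overlay: Any) -> Any:
    """Merge a more specific layer (overlay) onto a less specific one (base)."""
    if overlay is None:
        # null is not deletion: read it as "no opinion" and keep the base value.
        return base
    if isinstance(overlay, dict):
        # Maps merge key by key; a non-map base is replaced by the overlay map.
        result = dict(base) if isinstance(base, dict) else {}
        for key, value in overlay.items():
            if value is TOMBSTONE:
                result.pop(key, None)  # explicit, declared deletion
                continue
            merged = merge(result.get(key), value)
            if merged is not None:
                result[key] = merged
        return result
    # Scalars and lists: the more specific layer replaces the value wholesale.
    return overlay


company = {"mail": {"max_batch_size": 800, "retries": 3, "relays": ["a", "b"]}}
prod = {"mail": {"max_batch_size": 1000, "retries": None, "relays": ["c"]}}
print(merge(company, prod))
# {'mail': {'max_batch_size': 1000, 'retries': 3, 'relays': ['c']}}
```

The list is replaced rather than concatenated or merged by position, which keeps the result predictable when layers disagree about length or order.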
Avoid hidden “last writer wins” behavior. Every effective value should be explainable:
```text
config explain mail.delivery.max_batch_size
effective value: 1000
source: tenants/acme/prod.yaml
overrides:
- defaults/product.yaml: 500
- baselines/company.yaml: 800
- environments/prod.yaml: 1000
validated by: schemas/mail-delivery.schema.json
owner: platform-delivery
```
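A rough sketch of how a resolver could produce this kind of explanation, assuming flat dotted keys and an ordered list of layers from least to most specific. The `explain` function and the `LAYERS` data are illustrative assumptions that mirror the file names in the output above.

```python
from typing import Any, Optional

Layer = tuple[str, dict[str, Any]]

# Ordered from least to most specific (hypothetical contents).
LAYERS: list[Layer] = [
    ("defaults/product.yaml", {"mail.delivery.max_batch_size": 500}),
    ("baselines/company.yaml", {"mail.delivery.max_batch_size": 800}),
    ("environments/prod.yaml", {"mail.delivery.max_batch_size": 1000}),
    ("tenants/acme/prod.yaml", {"mail.delivery.max_batch_size": 1000}),
]


def explain(key: str, layers: list[Layer]) -> Optional[dict[str, Any]]:
    """Return the winning value for key plus every less specific value it overrode."""
    chain = [(source, values[key]) for source, values in layers if key in values]
    if not chain:
        return None  # no layer sets this key
    source, value = chain[-1]
    return {
        "key": key,
        "effective_value": value,
        "source": source,
        "overrides": chain[:-1],
    }


report = explain("mail.delivery.max_batch_size", LAYERS)
print(report["effective_value"], "from", report["source"])
for source, value in report["overrides"]:
    print(f"  - {source}: {value}")
```

Because the full chain is kept rather than only the final value, the same data can answer both "what is the value" and "what would it be if this layer were removed".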
JSON Schema and CUE are both useful for typed validation. JSON Schema is a declarative language for defining and validating JSON structure and constraints; CUE is designed for validating data, writing schemas, and ensuring that configuration aligns with policies. ([json-schema.org][4])
## Mutability classes
Every key should have a mutability class:
```text
build-time       requires rebuild
deploy-time      requires redeploy
startup-time     requires process restart
hot-reloadable   can reload safely while running
per-request      can vary by tenant/user/request
emergency        can override quickly with strong audit
```
This prevents a common failure mode: teams treat dangerous structural config like a harmless feature flag. Feature flags are excellent for changing application behavior without redeploying code, and OpenFeature provides a vendor-neutral abstraction for that pattern. AWS AppConfig and Azure App Configuration are examples of managed services that support dynamic configuration and feature flags with safer rollout patterns. ([openfeature.dev][5])
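One way to make the classes enforceable is to attach them to each key and refuse runtime changes to keys that need a rebuild, redeploy, or restart. The sketch below is an assumption about how that check might look; `Mutability`, `RUNTIME_SAFE`, and `plan_change` are hypothetical names.

```python
from enum import Enum


class Mutability(Enum):
    BUILD_TIME = "build-time"
    DEPLOY_TIME = "deploy-time"
    STARTUP_TIME = "startup-time"
    HOT_RELOADABLE = "hot-reloadable"
    PER_REQUEST = "per-request"
    EMERGENCY = "emergency"


# What a change to a key of each class requires before it takes effect.
REQUIRED_ACTION = {
    Mutability.BUILD_TIME: "rebuild",
    Mutability.DEPLOY_TIME: "redeploy",
    Mutability.STARTUP_TIME: "restart",
    Mutability.HOT_RELOADABLE: "reload",
    Mutability.PER_REQUEST: "none",
    Mutability.EMERGENCY: "audited-override",
}

# Classes that may be changed through a runtime config or flag service.
RUNTIME_SAFE = {Mutability.HOT_RELOADABLE, Mutability.PER_REQUEST, Mutability.EMERGENCY}


def plan_change(key: str, mutability: Mutability, via_runtime_service: bool) -> str:
    """Return the action a change needs, or refuse an unsafe runtime change."""
    if via_runtime_service and mutability not in RUNTIME_SAFE:
        raise ValueError(
            f"{key} is {mutability.value}; it cannot be changed like a feature flag"
        )
    return REQUIRED_ACTION[mutability]


print(plan_change("mail.delivery.max_batch_size", Mutability.HOT_RELOADABLE, True))
```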
## Secrets must be separate
Secrets should not live in ordinary config files. That includes files that are encrypted but casually handled, unless their lifecycle is deliberately designed.
Best practice:
```text
normal config: Git / config registry / ConfigMap
secrets: OpenBao, Vault, cloud secret manager, SOPS, External Secrets
injection: identity-based, least privilege, short-lived where possible
audit: access and rotation evidence
```
SOPS supports encrypted YAML, JSON, ENV, INI, and binary files with KMS/age/PGP-style backends, while External Secrets Operator synchronizes secrets from external APIs into Kubernetes Secrets. ([getsops.io][6])
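The separation can be checked mechanically if the key registry records a security class per key. The sketch below assumes a hypothetical `secret` class value alongside the `operational` value shown earlier, and a hypothetical key `mail.smtp.password`; it flags any ordinary config layer that tries to set a secret key directly.

```python
def secret_keys_in_layer(layer: dict[str, object], security_class: dict[str, str]) -> list[str]:
    """Return keys in an ordinary config layer that the registry classifies as secret."""
    return sorted(key for key in layer if security_class.get(key) == "secret")


# Hypothetical registry excerpt: key -> security class.
SECURITY_CLASS = {
    "mail.delivery.max_batch_size": "operational",
    "mail.smtp.password": "secret",
}

layer = {"mail.delivery.max_batch_size": 1000, "mail.smtp.password": "hunter2"}
violations = secret_keys_in_layer(layer, SECURITY_CLASS)
print(violations)  # keys that belong in the secret manager instead
```

A check like this only catches keys the registry already knows about, so it complements secret scanning rather than replacing it.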
## Companywide best practices
The strongest practices are these:
**1. Treat config as a governed product.**
Each key needs a name, owner, description, type, allowed scope, default, lifecycle, validation, and deprecation path.
**2. Prefer declarative config over imperative scripts.**
For infrastructure and system state, use desired-state tools: Terraform/OpenTofu, Ansible, NixOS, Puppet, Kubernetes manifests, Helm, Kustomize, Argo CD, or Flux depending on the layer. Ansible playbooks are explicitly repeatable and source-controllable, and Puppet-style configuration management is built around desired state. ([Ansible documentation][7])
**3. Make the effective config observable.**
Every service should be able to expose a redacted effective-config view: version, source layers, schema version, feature flags, and active policy set. This is essential for debugging.
**4. Validate before rollout.**
Use schema validation, policy-as-code, static checks, config unit tests, and environment simulation. OPA is a general-purpose policy engine usable across microservices, Kubernetes, CI/CD, API gateways, and more. ([openpolicyagent.org][8])
**5. Use progressive rollout for risky runtime config.**
Feature flags, rate limits, routing, and model/provider selection should support staged rollout, canary, percentage rollout, tenant allowlists, health checks, and fast rollback.
**6. Keep global config small.**
Companywide config should define defaults and guardrails, not become a giant mutable dictionary. The more global a config key is, the higher the blast radius.
**7. Separate ownership from override rights.**
A tenant admin may change tenant preferences. A platform team may change platform limits. Security may own non-overridable guardrails. Product may own entitlements. Finance may own pricing parameters.
**8. Record evidence.**
For every config change: who changed it, what changed, why, approval link, rollout scope, affected services, previous value, new value, rollback path.
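The evidence fields listed in practice 8 map naturally onto a small record type. The following is a sketch under assumptions: `ConfigChange`, `missing_evidence`, and `to_log_line` are hypothetical names, and the example values (email address, approval URL, service names) are placeholders.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ConfigChange:
    """One immutable evidence record per config change."""
    key: str
    previous_value: object
    new_value: object
    changed_by: str
    reason: str
    approval_link: str
    rollout_scope: str
    affected_services: tuple[str, ...]
    rollback_path: str
    changed_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


# Fields that must be non-empty before a change is accepted.
REQUIRED = (
    "changed_by", "reason", "approval_link",
    "rollout_scope", "affected_services", "rollback_path",
)


def missing_evidence(change: ConfigChange) -> list[str]:
    """Return the names of required evidence fields that were left empty."""
    return [name for name in REQUIRED if not getattr(change, name)]


def to_log_line(change: ConfigChange) -> str:
    """Serialize the record as one JSON line for an append-only audit log."""
    return json.dumps(asdict(change), sort_keys=True)


change = ConfigChange(
    key="mail.delivery.max_batch_size",
    previous_value=800,
    new_value=1000,
    changed_by="alice@example.com",
    reason="reduce queue latency in prod",
    approval_link="https://example.com/approvals/123",
    rollout_scope="environment:prod",
    affected_services=("mail-delivery",),
    rollback_path="revert to 800",
)
print(missing_evidence(change))
print(to_log_line(change))
```

Rejecting a change when `missing_evidence` is non-empty turns the practice from a convention into a gate.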
## Anti-patterns to avoid
```text
one giant companywide YAML file
manual console changes not mirrored anywhere
secrets mixed with normal config
environment-specific if/else logic in application code
untyped stringly-typed config
arrays merged by position
feature flags that live forever
global kill switches without ownership
tenant-specific config copied across files
no way to explain the winning value
no rollback path
```
The worst variant is “centralized chaos”: everything is technically in one place, but nobody knows who owns a key, what it means, which systems consume it, or whether changing it is safe.
## A good companywide target state
For your kind of multi-repo, multi-tenant, platform-oriented work, I would frame the target as a **Configuration Control Plane**:
```text
Config Canon
  defines the vocabulary and schema
Config Registry
  catalogs every key, owner, type, scope, lifecycle
Config Resolver
  renders effective config from layered sources
Config Policy
  validates allowed values and allowed overrides
Config Delivery
  pushes or exposes config to systems
Config Evidence
  records snapshots, changes, drift, rollout, rollback
```
The guiding rule:
> Put config as close as possible to the owner, but as high as necessary for consistency.
For example, company security baselines belong high. Tenant preferences belong low. Secrets belong outside normal config. Feature flags belong in a runtime control plane. Infrastructure desired state belongs in GitOps/IaC. Application defaults belong with the code.
[1]: https://kubernetes.io/docs/tasks/manage-kubernetes-objects/kustomization/?utm_source=chatgpt.com "Declarative Management of Kubernetes Objects Using ..."
[2]: https://12factor.net/config?utm_source=chatgpt.com "Store config in the environment"
[3]: https://argo-cd.readthedocs.io/?utm_source=chatgpt.com "Argo CD - Declarative GitOps CD for Kubernetes"
[4]: https://json-schema.org/docs?utm_source=chatgpt.com "Docs"
[5]: https://openfeature.dev/?utm_source=chatgpt.com "OpenFeature"
[6]: https://getsops.io/docs/?utm_source=chatgpt.com "SOPS: Secrets OPerationS"
[7]: https://docs.ansible.com/projects/ansible/latest/playbook_guide/playbooks_intro.html?utm_source=chatgpt.com "Ansible playbooks — Ansible Community Documentation"
[8]: https://openpolicyagent.org/docs?utm_source=chatgpt.com "Open Policy Agent (OPA)"

1
wiki/Home.md Normal file

@@ -0,0 +1 @@
Configuration Control Plane for discovering, mapping, and governing the living configuration surface of fast-moving companies.

3
wiki/ProductVision.md Normal file

@@ -0,0 +1,3 @@
Enable organizations to understand and control the living configuration fabric of their company — from code defaults and infrastructure state to tenant settings, policies, secrets, feature flags, and emergency overrides — so that configuration becomes a managed capability rather than an operational risk.
From unknown unknowns to governed effective configuration.