---
id: ATLAS-WP-0003
type: workplan
title: "Discovery connectors"
domain: infotech
repo: config-atlas
status: finished
owner: codex
topic_slug: custodian
created: "2026-06-26"
updated: "2026-06-27"
state_hub_workstream_id: "e4400d9c-021a-4e44-8e9b-719f94a9561a"
---

# Discovery connectors

Grow the configuration-surface registry from **automated discovery** instead of only hand authoring, by reusing `repo-scoping`'s scanner → candidate → approval machinery rather than building bespoke scanners. This is **Phase 2** of [`specs/ArchitectureBlueprint.md`](../specs/ArchitectureBlueprint.md) §6 and realizes PRD FR-8 ([`specs/ProductRequirementsDocument.md`](../specs/ProductRequirementsDocument.md)). Connectors are **read-only and stateless**: they emit *candidate* surface entries for human/agent PR review and never write any source system or auto-merge ([`docs/ecosystem-boundaries.md`](../docs/ecosystem-boundaries.md) §2.4).

**Depends on:** ATLAS-WP-0002 (surface schema, registry layout, CI gate). The candidate → approval → registry-truth state machine and source-linked evidence model are reused from `repo-scoping`, not reinvented.

**Exit condition:** at least one connector produces candidate surfaces that enter via PR, validate against the schema, and can be promoted to entries; stale/unowned surfaces are reported. No connector mutates a source system.

**Sequencing:** T01 (contract) first; T02–T04 (connectors) depend on T01 and may proceed in parallel; T05 (stale/unowned) depends on having connector-produced data.

## Define the connector contract and candidate workflow

```task
id: ATLAS-WP-0003-T01
status: done
priority: high
state_hub_task_id: "e7b03e49-7e49-4629-ada1-facdf596569b"
```

Result 2026-06-27: Added `tools/connector_base.py` (validates + writes candidates, refuses to overwrite promoted entries, never stores values) and the contract in `docs/discovery-connectors.md` (read-only/stateless, candidate->PR->promote, `status: candidate` + provenance).
Added `candidate` to the schema status enum; `candidates/` is gitignored and excluded from the gate.

Specify the read-only connector contract and the candidate lifecycle. Define the candidate entry format (a surface entry with `status: candidate` + provenance) and its location (`registry/surfaces/candidates/`), and the `connector → candidate YAML → PR → validate → promote/merge` flow. Reuse `repo-scoping`'s `Scope→…→Evidence→Fact` provenance and candidate→approval model (`~/repo-scoping/docs/scope-md-spec.md`, `src/repo_scoping/candidate_graph/`). Document in `docs/discovery-connectors.md`.

- **Acceptance:** a documented contract (inputs, candidate output shape, promotion rules); connectors are specified as stateless and non-mutating; the candidate schema reuses the surface-entry schema with a `candidate` status + provenance.

## repo-scoping fact ingestion connector

```task
id: ATLAS-WP-0003-T02
status: done
priority: high
state_hub_task_id: "2447547b-1776-4225-af4f-f73680ccb2df"
```

Result 2026-06-27: Added `tools/connector_reposcoping.py` (+ `make connect-reposcoping`). Consumes repo-scoping facts (`--facts` file or `REPO_SCOPING_URL`), filters config facts, emits schema-valid candidates; degrades gracefully when the API is down. Verified on synthetic facts (2 config candidates, non-config skipped).

Build the connector that consumes `repo-scoping` observed facts/evidence as input and emits candidate configuration surfaces, adding only the config-kind and layer classification on top (ecosystem-boundaries §2.4 option a). Map repo-scoping facts about config files/env/params to `surface.*` candidates with `kind`, `scope`, and `sources[]` populated.

- **Acceptance:** running the connector against a registered repo's repo-scoping facts emits valid candidate surfaces (schema-valid) with source links; zero writes to repo-scoping or the scanned repo.
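
The fact-to-candidate mapping above can be sketched roughly as follows. This is an illustrative sketch, not the code in `tools/connector_reposcoping.py`: the fact fields (`id`, `type`, `name`, `path`), the fact-type labels, the kind names, and the helpers `slugify` and `facts_to_candidates` are all assumptions made for the example. Only `status: candidate`, `kind`, `scope`, `sources[]`, the `surface.*` prefix, and provenance come from this workplan.

```python
import re

# Assumed mapping from a repo-scoping fact type to a surface kind.
# Both the fact-type labels and the kind names are hypothetical.
KIND_BY_FACT_TYPE = {
    "config-file": "config-file",
    "env-var": "env-var",
    "param": "parameter",
}


def slugify(name: str) -> str:
    """Lowercase a name and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def facts_to_candidates(facts: list[dict], repo: str) -> list[dict]:
    """Turn config-related facts into candidate surface entries.

    Pure function: it reads the given facts and returns new dicts, so it
    cannot write to repo-scoping or to the scanned repository.
    """
    candidates = []
    for fact in facts:
        kind = KIND_BY_FACT_TYPE.get(fact.get("type"))
        if kind is None:
            continue  # non-config facts are skipped
        candidates.append(
            {
                "id": f"surface.{repo}.{slugify(fact['name'])}",
                "status": "candidate",
                "kind": kind,
                "scope": repo,
                "sources": [{"path": fact["path"]}],
                "provenance": {"connector": "reposcoping", "fact_id": fact["id"]},
            }
        )
    return candidates


sample_facts = [
    {"id": "f1", "type": "env-var", "name": "DATABASE_URL", "path": ".env.example"},
    {"id": "f2", "type": "config-file", "name": "app config", "path": "config/app.yaml"},
    {"id": "f3", "type": "dependency", "name": "requests", "path": "pyproject.toml"},
]
candidates = facts_to_candidates(sample_facts, repo="demo")
```

Keeping the mapping a pure function makes the read-only guarantee easy to review: any file output stays in a separate writer step such as the one `tools/connector_base.py` provides.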
## git-config deterministic scanner

```task
id: ATLAS-WP-0003-T03
status: done
priority: medium
state_hub_task_id: "ddfb8eaf-46b4-4b15-9719-b167538c15fb"
```

Result 2026-06-27: Added `tools/connector_gitconfig.py` (+ `make connect-gitconfig`). Deterministic scan for `*.env.example` / `values*.yaml` / `config*.yaml`; real `.env` -> `secret-ref` (no value read). Verified on `~/state-hub`: 4 real candidates including a Helm `values.yaml` and a secret-ref `.env`.

Build a deterministic scanner over repository config surfaces (env files, YAML/TOML config, Kubernetes ConfigMap/Secret *references*, and Helm `values*.yaml` overlays), emitting candidate entries with inferred `kind` and layer `role` per source. Secret references become `secret-ref` candidates (reference only, never values).

- **Acceptance:** scanning a sample repo emits candidates classified by kind, with Helm overlays mapped to layer roles; secret references carry no values.

## feature-control flag connector

```task
id: ATLAS-WP-0003-T04
status: done
priority: medium
state_hub_task_id: "9e2f5893-7b98-4ca6-89d7-94d093d6bd4b"
```

Result 2026-06-27: Added `tools/connector_featurecontrol.py` (+ `make connect-featurecontrol`). Emits `feature-flag` surfaces linking to feature-control keys (role: feature-control-key, openfeature endpoint) with no eval logic (FR-12); degrades when no key registry exists. Verified on synthetic keys.

Build a connector that inventories `feature-control` keys and emits `feature-flag` surfaces that **link** to the feature-control key (`sources[].role: feature-control-key`) and contain no evaluation logic (PRD FR-12; reinforces the delegation boundary). Surface stale flags as a signal.

- **Acceptance:** feature-control keys appear as `feature-flag` candidate surfaces linking to the authoritative key; config-atlas holds no flag-evaluation logic.
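
A minimal sketch of the link-only idea behind the feature-control connector, under stated assumptions: the function name `keys_to_flag_surfaces`, the `ref` and `endpoint` field names, and the `surface.flag.` id prefix are hypothetical and may not match `tools/connector_featurecontrol.py`. The `feature-flag` kind and the `feature-control-key` source role are taken from the task description.

```python
def keys_to_flag_surfaces(keys, endpoint=None):
    """Map feature-control keys to candidate `feature-flag` surfaces.

    Each surface only points at its key. Nothing here reads or evaluates
    a flag value, which keeps evaluation logic out of config-atlas.
    """
    surfaces = []
    for key in sorted(set(keys)):  # de-duplicate and keep output deterministic
        source = {"role": "feature-control-key", "ref": key}
        if endpoint is not None:
            source["endpoint"] = endpoint  # assumed field for an OpenFeature endpoint
        surfaces.append(
            {
                "id": f"surface.flag.{key}",
                "kind": "feature-flag",
                "status": "candidate",
                "sources": [source],
            }
        )
    return surfaces


# An empty or missing key registry simply yields no candidates.
flags = keys_to_flag_surfaces(["checkout.new-ui", "search.beta", "checkout.new-ui"])
no_flags = keys_to_flag_surfaces([])
```

Sorting the de-duplicated keys keeps repeated runs byte-for-byte comparable, which helps when candidates are reviewed as a PR diff.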
## Stale and unowned surface detection

```task
id: ATLAS-WP-0003-T05
status: done
priority: medium
state_hub_task_id: "ddcf070c-a863-47df-8c99-61c1980a8d18"
```

Result 2026-06-27: Added `tools/registry_health.py` (+ `make registry-health`). Reports unowned (missing/unresolvable owner vs reuse-surface roster as domain-tree stand-in) and stale (`evidence.last_seen`) surfaces. Verified: 4 promoted surfaces, all owned and fresh.

Add a report that flags surfaces with no resolvable `owner` (against domain-tree) and surfaces whose sources were not seen in the latest scan (stale/drift signal), using `evidence.last_seen`. Wire it into the validation tooling (`tools/`).

- **Acceptance:** the report lists unowned and stale surfaces; an unowned or unseen-since-N-days surface is surfaced for review.

---

After workplan or registry file updates, sync from `~/state-hub`:

```bash
make fix-consistency REPO=config-atlas
```
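
For reference, the stale/unowned check from the last task can be illustrated with a short sketch. It is not the code in `tools/registry_health.py`: the function `health_report`, the `max_age_days` threshold of 30, and the ISO date format for `evidence.last_seen` are assumptions for the example, and the owner roster is passed in as a plain set.

```python
from datetime import date, timedelta


def health_report(surfaces, known_owners, today, max_age_days=30):
    """Return ids of unowned and stale surfaces.

    unowned: `owner` is missing or not in the roster.
    stale:   `evidence.last_seen` is missing or older than the cutoff.
    """
    cutoff = today - timedelta(days=max_age_days)
    unowned, stale = [], []
    for surface in surfaces:
        if surface.get("owner") not in known_owners:
            unowned.append(surface["id"])
        last_seen = (surface.get("evidence") or {}).get("last_seen")
        if last_seen is None or date.fromisoformat(last_seen) < cutoff:
            stale.append(surface["id"])
    return {"unowned": unowned, "stale": stale}


report = health_report(
    surfaces=[
        {"id": "surface.a", "owner": "team-a", "evidence": {"last_seen": "2026-06-26"}},
        {"id": "surface.b", "evidence": {"last_seen": "2026-06-26"}},
        {"id": "surface.c", "owner": "team-a", "evidence": {"last_seen": "2026-01-01"}},
    ],
    known_owners={"team-a"},
    today=date(2026, 6, 27),
)
```

Passing `today` explicitly rather than calling `date.today()` inside the function keeps the report reproducible in tests and CI.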