---
id: STATE-WP-0065
type: workplan
title: "Repo-anchored classification spine (CUST-WP-0050 implementation)"
domain: custodian
repo: state-hub
status: ready
owner: custodian
topic_slug: custodian
created: "2026-06-22"
updated: "2026-06-22"
state_hub_workstream_id: "8dc7d106-11e2-41df-b512-89ed69d2a65f"
---

# STATE-WP-0065 — Repo-anchored classification spine

**Origin / driver:** `the-custodian/workplans/CUST-WP-0050-repo-classification-registration-redesign.md`
(custodian coordination workplan; owns the canon standard and the portfolio
decisions D1/D1a + ADR-005). This workplan is the **state-hub-local
implementation** of CUST-WP-0050 Phase 3–4, re-homed per that plan's repo
boundary note and ADR-001.

## Inputs already in place (from CUST-WP-0050)

- **Standard + allowed-values:**
  `the-custodian/canon/standards/repo-classification-standard_v1.0.md` and
  `…/repo-classification.allowed.yaml` (5 categories incl. `tooling`, 14 market
  domains, business_stake/mechanics enums, capability families).
- **Validator:** `the-custodian/tools/validate_repo_classification.py`
  (self-test PASS).
- **Fixtures:** all 11 custodian-domain repos carry committed, human-reviewed
  `.repo-classification.yaml` (`classified_by: human`). Use these to build and
  test against; no dependency on classifying the full Gitea inventory first.
- **Decisions:** D1 (repo is the primary anchor; market-domain derived from
  classification; topic demoted/retired). ADR-005 (cross-repo workplans anchor
  to a dedicated project repo, retired on completion).

## Goal

Replace the Hub's ad-hoc 14-domain spine with the repo-anchored classification
model: the **repo** (plus its committed `.repo-classification.yaml`) becomes the
primary anchor for workplans; market-domain is **derived** from classification;
`workstream` is renamed to `workplan`. The Hub stays fully rebuildable from
repo-owned files (ADR-001).

## Phases / Tasks

### P1 — Spine migration (single Alembic revision)

```task
id: STATE-WP-0065-T01
status: todo
priority: high
```

One coordinated migration (merges CUST-WP-0050 T04 + T05 + T10; they rewrite the
same spine, so a single window avoids migrating rows two or three times):

- Replace `domains` table contents with the **14 fixed market domains**
  (infotech, financials, communication, consumer, health, industrials, energy,
  utilities, materials, realestate, crypto, agents, space, government).
- Add classification storage to `managed_repos`: `category`, primary
  `domain_id`, `secondary_domains[]`, `capability_tags[]`, `business_stake[]`,
  `business_mechanics[]`, plus provenance (`classified_at`, `classified_by`,
  `standard_version`).
- Make `workstreams.repo_id` the **required** anchor; demote `topic_id` to
  nullable (optional cross-repo tag). Promote `RepoGoal` to the goal primitive;
  reduce `DomainGoal` to a thin rollup keyed by the 14 market domains.
- **Rename `workstream` → `workplan`** across the table, FKs
  (workstream_dependency, repo_goal, task.workstream_id, progress_event,
  decision, etc.), models, and schemas.
- **Data migration:**
  - backfill `repo_id` on every workstream (no orphans);
  - map old 14 domains → new model per standard §15;
  - assign existing 57 registered repos a **derived** classification
    (`classified_by: migration`) so nothing orphans;
  - resolve discrepancies: reconcile `markitect-project`↔`markitect-main`,
    retire phantom `railiance-bootstrap`/`railiance-hosts` (or relink), collapse
    the `vergabe_teilnahme` duplicate, decide `personhood` disposition.

Done when forward migration applies with zero orphaned coordination records, a
dry-run report is reviewed, and a tested downgrade path exists.

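The "zero orphaned coordination records" criterion above could be checked in the dry-run report with a query along these lines. This is a minimal sketch, not the Hub's actual tooling: it uses an in-memory SQLite database as a stand-in, the `workstreams.repo_id` and `managed_repos` names come from this plan, and the `id`, `name`, and `title` columns as well as the `orphan_report` helper are hypothetical.

```python
import sqlite3


def orphan_report(conn: sqlite3.Connection) -> dict:
    """Report workstreams whose repo_id is NULL or matches no managed_repos row."""
    cur = conn.cursor()
    total = cur.execute("SELECT COUNT(*) FROM workstreams").fetchone()[0]
    orphans = cur.execute(
        """
        SELECT w.id
        FROM workstreams AS w
        LEFT JOIN managed_repos AS r ON r.id = w.repo_id
        WHERE w.repo_id IS NULL OR r.id IS NULL
        ORDER BY w.id
        """
    ).fetchall()
    return {
        "total": total,
        "orphaned_ids": [row[0] for row in orphans],
        "ok": not orphans,  # True only when every workstream is anchored
    }


def demo() -> dict:
    """Build a tiny stand-in schema (column names are assumptions) and run the check."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE managed_repos (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE workstreams (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            repo_id INTEGER  -- still nullable before the backfill
        );
        INSERT INTO managed_repos (id, name) VALUES (1, 'state-hub'), (2, 'the-custodian');
        INSERT INTO workstreams (id, title, repo_id) VALUES
            (10, 'anchored', 1),
            (11, 'no repo yet', NULL),
            (12, 'dangling repo', 99);
        """
    )
    return orphan_report(conn)


if __name__ == "__main__":
    print(demo())
```

Running the same check before and after the backfill gives a simple gate: the migration would only be considered ready when `ok` is true against a copy of the production data.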
### P2 — API / MCP / validation surface

```task
id: STATE-WP-0065-T02
status: todo
priority: high
```

- Enforce the T01 allowed-values at the API boundary (import the canon
  `repo-classification.allowed.yaml`); reject invalid
  category/domain/tags/stake/mechanics.
- Rename `workstream` → `workplan` across REST routes, Pydantic schemas, MCP
  tools/resources, and `mcp_server/TOOLS.md`; provide compatibility/redirect
  notes for existing tool callers.
- Expose list/filter-by-classification (category, domain, capability, business
  stake) in MCP/REST.

Done when the API/MCP surface uses "workplan", validates classification, and
tests are green.

### P3 — Auto-registration tooling

```task
id: STATE-WP-0065-T03
status: todo
priority: high
```

Build an idempotent `register-from-classification` capability (Make target +
script + MCP tool): given a repo (local path or Gitea API), read
`.repo-classification.yaml`, validate against T01, and upsert the
`managed_repo` with the full classification. Support a **bulk** run and
reclassification of existing rows. Reuse the k3s/Gitea access path (Gitea in k3s
on coulombcore via `kubectl port-forward svc/gitea-http`).

Done when one command registers/reclassifies every repo with a valid file and
emits a report of registered / updated / skipped / invalid.

### P4 — Surfaces & cutover

```task
id: STATE-WP-0065-T04
status: todo
priority: medium
```

- Dashboard: navigate by category / domain / capability / business-stake.
- Consistency checker: add a rule flagging registered repos lacking a valid
  `.repo-classification.yaml`; update existing C-rules coupled to the old domain
  model to avoid mass false positives.
- Update orientation docs that assume the old domain model (`SCOPE.md`,
  `README.md`, `.claude/rules/*` in the-custodian and state-hub).
- Cutover: switch orientation/registration tooling to the new model end-to-end,
  archive the old domain semantics, run
  `make fix-consistency REPO=the-custodian`.

Done when an end-to-end pass (classify → auto-register → dashboard view) is
verified and the old ad-hoc domain model is retired.

## Sequencing

P1 → P2 (P2 depends on the renamed/extended schema). P3 depends on P2's
validation + classification columns. P4 last (consuming surfaces + cutover).
Lazy reclassification of already-registered repos (CUST-WP-0050 T07) runs via P3
as committed `.repo-classification.yaml` files appear, not as an upfront batch.
Classifying + registering the remaining Gitea inventory is a **post-cutover**
task under the new model (CUST-WP-0050 T11).

## Risks

- **Breaking-migration blast radius:** `domain_id`/`topic_id` thread through ~11
  models and ~12 routers/schemas; `workstream` through ~14 routers + MCP server +
  TOOLS.md. Mitigate with reviewed dry-run + tested downgrade (P1) and merging
  the three spine changes into one window.
- **Consistency-checker coupling:** existing C-rules assume the current domain
  model; update alongside (P4).
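
As an illustration of the idempotent upsert and the registered / updated / skipped / invalid report described under P3, the flow might look roughly like the sketch below. It is a sketch under stated assumptions, not the planned implementation: it works on already-parsed classification dicts instead of reading YAML, stores rows in a plain dict instead of `managed_repos`, takes the allowed categories as a parameter because only `tooling` is named in this plan, and assumes the file keys are `category` and `domain`. Treating both "no file" and "unchanged" as skipped is also an assumption, and `validate`, `register_all`, and `Report` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# The 14 fixed market domains listed under P1.
MARKET_DOMAINS = frozenset({
    "infotech", "financials", "communication", "consumer", "health",
    "industrials", "energy", "utilities", "materials", "realestate",
    "crypto", "agents", "space", "government",
})


@dataclass
class Report:
    registered: list[str] = field(default_factory=list)
    updated: list[str] = field(default_factory=list)
    skipped: list[str] = field(default_factory=list)
    invalid: list[str] = field(default_factory=list)


def validate(doc: dict, allowed_categories: set[str]) -> list[str]:
    """Return validation errors; the key names `category`/`domain` are assumptions."""
    errors = []
    if doc.get("category") not in allowed_categories:
        errors.append(f"unknown category: {doc.get('category')!r}")
    if doc.get("domain") not in MARKET_DOMAINS:
        errors.append(f"unknown domain: {doc.get('domain')!r}")
    for extra in doc.get("secondary_domains", []):
        if extra not in MARKET_DOMAINS:
            errors.append(f"unknown secondary domain: {extra!r}")
    return errors


def register_all(
    registry: dict[str, dict],
    parsed: dict[str, dict | None],
    allowed_categories: set[str],
) -> Report:
    """Upsert each repo's classification; running twice with the same input changes nothing."""
    report = Report()
    for repo, doc in sorted(parsed.items()):
        if doc is None:  # repo has no classification file yet
            report.skipped.append(repo)
        elif validate(doc, allowed_categories):
            report.invalid.append(repo)
        elif repo not in registry:
            registry[repo] = dict(doc)
            report.registered.append(repo)
        elif registry[repo] != doc:
            registry[repo] = dict(doc)
            report.updated.append(repo)
        else:  # already registered with identical classification
            report.skipped.append(repo)
    return report
```

Because the comparison is against the stored classification, a bulk run can be repeated safely as new `.repo-classification.yaml` files appear; only repos whose files changed would show up under `updated`.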