Add STATE-WP-0065: repo-anchored classification spine (CUST-WP-0050 impl)

Re-homed implementation of CUST-WP-0050 Phase 3-4. P1 merges the schema
redesign, data migration, and workstream->workplan rename into one Alembic
window; P2 API/MCP/validation; P3 auto-registration; P4 surfaces & cutover.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 11:52:55 +02:00
parent c4c38e1697
commit 1620701ae4

@@ -0,0 +1,154 @@
---
id: STATE-WP-0065
type: workplan
title: "Repo-anchored classification spine (CUST-WP-0050 implementation)"
domain: custodian
repo: state-hub
status: ready
owner: custodian
topic_slug: custodian
created: "2026-06-22"
updated: "2026-06-22"
state_hub_workstream_id: "8dc7d106-11e2-41df-b512-89ed69d2a65f"
---
# STATE-WP-0065 — Repo-anchored classification spine
**Origin / driver:** `the-custodian/workplans/CUST-WP-0050-repo-classification-registration-redesign.md`
(custodian coordination workplan; owns the canon standard and the portfolio
decisions D1/D1a + ADR-005). This workplan is the **state-hub-local
implementation** of CUST-WP-0050 Phase 3-4, re-homed per that plan's repo
boundary note and ADR-001.
## Inputs already in place (from CUST-WP-0050)
- **Standard + allowed-values:** `the-custodian/canon/standards/repo-classification-standard_v1.0.md`
and `…/repo-classification.allowed.yaml` (5 categories incl. `tooling`, 14
market domains, business_stake/mechanics enums, capability families).
- **Validator:** `the-custodian/tools/validate_repo_classification.py` (self-test PASS).
- **Fixtures:** all 11 custodian-domain repos carry committed, human-reviewed
`.repo-classification.yaml` (`classified_by: human`). Use these to build and
test against; no dependency on classifying the full Gitea inventory first.
- **Decisions:** D1 (repo is the primary anchor; market-domain derived from
classification; topic demoted/retired). ADR-005 (cross-repo workplans anchor to
a dedicated project repo, retired on completion).
## Goal
Replace the Hub's ad-hoc 14-domain spine with the repo-anchored classification
model: the **repo** (plus its committed `.repo-classification.yaml`) becomes the
primary anchor for workplans; market-domain is **derived** from classification;
`workstream` is renamed to `workplan`. The Hub stays fully rebuildable from
repo-owned files (ADR-001).
## Phases / Tasks
### P1 — Spine migration (single Alembic revision)
```task
id: STATE-WP-0065-T01
status: todo
priority: high
```
One coordinated migration (merges CUST-WP-0050 T04 + T05 + T10 — they rewrite the
same spine, so a single window avoids migrating rows two or three times):
- Replace `domains` table contents with the **14 fixed market domains**
(infotech, financials, communication, consumer, health, industrials, energy,
utilities, materials, realestate, crypto, agents, space, government).
- Add classification storage to `managed_repos`: `category`, primary `domain_id`,
`secondary_domains[]`, `capability_tags[]`, `business_stake[]`,
`business_mechanics[]`, plus provenance (`classified_at`, `classified_by`,
`standard_version`).
- Make `workstreams.repo_id` the **required** anchor; demote `topic_id` to
nullable (optional cross-repo tag). Promote `RepoGoal` to the goal primitive;
reduce `DomainGoal` to a thin rollup keyed by the 14 market domains.
- **Rename `workstream` → `workplan`** across the table, FKs (workstream_dependency,
repo_goal, task.workstream_id, progress_event, decision, etc.), models, and
schemas.
- **Data migration:** backfill `repo_id` on every workstream (no orphans); map old
14 domains → new model per standard §15; assign existing 57 registered repos a
**derived** classification (`classified_by: migration`) so nothing orphans;
resolve discrepancies: reconcile `markitect-project` with `markitect-main`, retire
phantom `railiance-bootstrap`/`railiance-hosts` (or relink), collapse the
`vergabe_teilnahme` duplicate, decide `personhood` disposition.
Done when forward migration applies with zero orphaned coordination records, a
dry-run report is reviewed, and a tested downgrade path exists.
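
The "zero orphaned coordination records" criterion can be checked mechanically as part of the dry-run report. The following is a minimal sketch, not the Hub's actual tooling: it uses an in-memory SQLite database, and the post-rename table name `workplans`, the column layout, and the helper names `orphan_report` and `demo` are assumptions made for illustration.

```python
import sqlite3


def orphan_report(conn: sqlite3.Connection) -> dict:
    """Report workplans whose repo anchor is missing or dangling.

    Assumes (hypothetically) a `workplans` table with a `repo_id` column
    pointing at `managed_repos.id`.
    """
    total = conn.execute("SELECT COUNT(*) FROM workplans").fetchone()[0]
    orphans = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM workplans "
            "WHERE repo_id IS NULL "
            "   OR repo_id NOT IN (SELECT id FROM managed_repos) "
            "ORDER BY id"
        )
    ]
    return {"total": total, "orphans": orphans}


def demo() -> dict:
    """Build a tiny fixture with one anchored and two orphaned workplans."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE managed_repos (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE workplans (id TEXT PRIMARY KEY, repo_id INTEGER);
        INSERT INTO managed_repos VALUES (1, 'state-hub');
        INSERT INTO workplans VALUES ('WP-A', 1), ('WP-B', NULL), ('WP-C', 99);
        """
    )
    return orphan_report(conn)


if __name__ == "__main__":
    print(demo())
```

A migration gate could refuse to proceed while `orphans` is non-empty; the same query shape would apply to the other renamed FK tables.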
### P2 — API / MCP / validation surface
```task
id: STATE-WP-0065-T02
status: todo
priority: high
```
- Enforce the T01 allowed-values at the API boundary (import the canon
`repo-classification.allowed.yaml`); reject invalid category/domain/tags/stake/
mechanics.
- Rename `workstream` → `workplan` across REST routes, Pydantic schemas, MCP
tools/resources, and `mcp_server/TOOLS.md`; provide compatibility/redirect notes
for existing tool callers.
- Expose list/filter-by-classification (category, domain, capability, business
stake) in MCP/REST.
Done when the API/MCP surface uses "workplan", validates classification, and tests
are green.
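
To illustrate the allowed-values check at the API boundary, here is a hedged sketch in plain Python. The 14 market domains are taken from P1 above; everything else is an assumption: the field names inside `.repo-classification.yaml`, the shape of the `allowed` mapping, and the function name `validate_classification` are hypothetical, and only the one category named in this plan (`tooling`) is listed. The real values live in `repo-classification.allowed.yaml`.

```python
MARKET_DOMAINS = frozenset({
    "infotech", "financials", "communication", "consumer", "health",
    "industrials", "energy", "utilities", "materials", "realestate",
    "crypto", "agents", "space", "government",
})

# Hypothetical field names; the canon standard defines the real ones.
SCALAR_FIELDS = ("category", "domain")
LIST_FIELDS = (
    "secondary_domains", "capability_tags",
    "business_stake", "business_mechanics",
)


def validate_classification(doc: dict, allowed: dict) -> list[str]:
    """Return a list of error strings; an empty list means the doc is valid.

    Only fields that have an entry in `allowed` are checked, so a partial
    allowed-values mapping can be used while the full one is being wired in.
    """
    errors = []
    for field in SCALAR_FIELDS:
        if field in allowed and doc.get(field) not in allowed[field]:
            errors.append(f"{field}: {doc.get(field)!r} is not an allowed value")
    for field in LIST_FIELDS:
        if field not in allowed:
            continue
        for value in doc.get(field) or []:
            if value not in allowed[field]:
                errors.append(f"{field}: {value!r} is not an allowed value")
    return errors


# Partial mapping for the example; the other four categories are omitted
# because this plan does not name them.
ALLOWED_EXAMPLE = {
    "category": frozenset({"tooling"}),
    "domain": MARKET_DOMAINS,
    "secondary_domains": MARKET_DOMAINS,
}

if __name__ == "__main__":
    doc = {"category": "tooling", "domain": "retail",
           "secondary_domains": ["energy", "moon"]}
    for error in validate_classification(doc, ALLOWED_EXAMPLE):
        print(error)
```

Returning a list of errors rather than raising on the first one lets a REST handler or MCP tool report every invalid field in a single rejection.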
### P3 — Auto-registration tooling
```task
id: STATE-WP-0065-T03
status: todo
priority: high
```
Build an idempotent `register-from-classification` capability (Make target +
script + MCP tool): given a repo (local path or Gitea API), read
`.repo-classification.yaml`, validate against T01, and upsert the `managed_repo`
with the full classification. Support a **bulk** run and reclassification of
existing rows. Reuse the k3s/Gitea access path (Gitea in k3s on coulombcore via
`kubectl port-forward svc/gitea-http`).
Done when one command registers/reclassifies every repo with a valid file and
emits a report of registered / updated / skipped / invalid.
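
The idempotent upsert and its four-bucket report can be sketched as follows. This is an illustration under stated assumptions, not the planned script: the registry is a plain dict standing in for `managed_repos`, `files` maps a repo name to its parsed classification (or `None` when no file exists), and the name `register_all` plus the reading of "skipped" as "no file or unchanged" are hypothetical.

```python
from typing import Callable, Optional


def register_all(
    registry: dict,
    files: dict,
    validate: Callable[[dict], list],
) -> dict:
    """Upsert every repo with a valid classification and report the outcome.

    `files[repo]` is the parsed classification, or None when the repo has no
    `.repo-classification.yaml`. `validate` returns a list of errors.
    """
    report = {"registered": [], "updated": [], "skipped": [], "invalid": []}
    for repo in sorted(files):
        doc: Optional[dict] = files[repo]
        if doc is None:
            report["skipped"].append(repo)      # nothing to register from
        elif validate(doc):
            report["invalid"].append(repo)      # registry left untouched
        elif repo not in registry:
            registry[repo] = dict(doc)
            report["registered"].append(repo)
        elif registry[repo] != doc:
            registry[repo] = dict(doc)          # reclassification
            report["updated"].append(repo)
        else:
            report["skipped"].append(repo)      # already up to date
    return report


if __name__ == "__main__":
    def only_tooling(doc: dict) -> list:
        # Stand-in validator for the example.
        return [] if doc.get("category") == "tooling" else ["bad category"]

    registry: dict = {}
    files = {"a": {"category": "tooling"}, "b": None, "c": {"category": "bogus"}}
    print(register_all(registry, files, only_tooling))
    print(register_all(registry, files, only_tooling))  # second run changes nothing
```

Because the second run finds the stored row equal to the file, it writes nothing, which is the property that makes a bulk run safe to repeat.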
### P4 — Surfaces & cutover
```task
id: STATE-WP-0065-T04
status: todo
priority: medium
```
- Dashboard: navigate by category / domain / capability / business-stake.
- Consistency checker: add a rule flagging registered repos lacking a valid
`.repo-classification.yaml`; update existing C-rules coupled to the old domain
model to avoid mass false positives.
- Update orientation docs that assume the old domain model (`SCOPE.md`,
`README.md`, `.claude/rules/*` in the-custodian and state-hub).
- Cutover: switch orientation/registration tooling to the new model end-to-end,
archive the old domain semantics, run `make fix-consistency REPO=the-custodian`.
Done when an end-to-end pass (classify → auto-register → dashboard view) is
verified and the old ad-hoc domain model is retired.
## Sequencing
P1 → P2 (P2 depends on the renamed/extended schema). P3 depends on P2's
validation + classification columns. P4 last (consuming surfaces + cutover).
Lazy reclassification of already-registered repos (CUST-WP-0050 T07) runs via P3
as committed `.repo-classification.yaml` files appear — not an upfront batch.
Classifying + registering the remaining Gitea inventory is a **post-cutover** task
under the new model (CUST-WP-0050 T11).
## Risks
- **Breaking-migration blast radius:** `domain_id`/`topic_id` thread through ~11
models and ~12 routers/schemas; `workstream` through ~14 routers + MCP server +
TOOLS.md. Mitigate with reviewed dry-run + tested downgrade (P1) and merging the
three spine changes into one window.
- **Consistency-checker coupling:** existing C-rules assume the current domain
model; update alongside (P4).