state-hub/workplans/STATE-WP-0065-repo-anchored-classification-spine.md
tegwick 0949d4c0d8 feat(classification-spine): implement STATE-WP-0065 repo-anchored model
Replace the ad-hoc coordination-domain spine with the Repo Classification
Standard: 14 market domains, classification columns on managed_repos, and
workplans anchored by repo_id (topic_id optional).

- Add Alembic migration d8e9f0a1b2c3 with data backfill and workstream→workplan rename
- Add api/classification.py validation and register-from-classification tooling
- Expose workplan-first REST/MCP surface with legacy workstream aliases
- Add C-24 consistency rule and legacy domain frontmatter mapping
- Update dashboard repos page with category/capability/stake filters
- Update orientation docs; mark STATE-WP-0065 finished
2026-06-22 13:52:13 +02:00

---
id: STATE-WP-0065
type: workplan
title: "Repo-anchored classification spine (CUST-WP-0050 implementation)"
domain: custodian
repo: state-hub
status: finished
owner: custodian
topic_slug: custodian
created: "2026-06-22"
updated: "2026-06-22"
started: "2026-06-22"
finished: "2026-06-22"
state_hub_workstream_id: "8dc7d106-11e2-41df-b512-89ed69d2a65f"
---
# STATE-WP-0065 — Repo-anchored classification spine
**Origin / driver:** `the-custodian/workplans/CUST-WP-0050-repo-classification-registration-redesign.md`
(custodian coordination workplan; owns the canon standard and the portfolio
decisions D1/D1a + ADR-005). This workplan is the **state-hub-local
implementation** of CUST-WP-0050 Phases 3-4, re-homed per that plan's repo
boundary note and ADR-001.
## Inputs already in place (from CUST-WP-0050)
- **Standard + allowed-values:** `the-custodian/canon/standards/repo-classification-standard_v1.0.md`
and `…/repo-classification.allowed.yaml` (5 categories incl. `tooling`, 14
market domains, business_stake/mechanics enums, capability families).
- **Validator:** `the-custodian/tools/validate_repo_classification.py` (self-test PASS).
- **Fixtures:** all 11 custodian-domain repos carry committed, human-reviewed
`.repo-classification.yaml` (`classified_by: human`). Use these to build and
test against; no dependency on classifying the full Gitea inventory first.
- **Decisions:** D1 (repo is the primary anchor; market-domain derived from
classification; topic demoted/retired). ADR-005 (cross-repo workplans anchor to
a dedicated project repo, retired on completion).
## Goal
Replace the Hub's ad-hoc 14-domain spine with the repo-anchored classification
model: the **repo** (plus its committed `.repo-classification.yaml`) becomes the
primary anchor for workplans; market-domain is **derived** from classification;
`workstream` is renamed to `workplan`. The Hub stays fully rebuildable from
repo-owned files (ADR-001).
## Phases / Tasks
### P1 — Spine migration (single Alembic revision)
```task
id: STATE-WP-0065-T01
status: done
priority: high
state_hub_task_id: "14cf65f1-e5af-4905-8de4-bc8986ef078e"
```
One coordinated migration (merges CUST-WP-0050 T04 + T05 + T10 — they rewrite the
same spine, so a single window avoids migrating rows two or three times):
- Replace `domains` table contents with the **14 fixed market domains**
(infotech, financials, communication, consumer, health, industrials, energy,
utilities, materials, realestate, crypto, agents, space, government).
- Add classification storage to `managed_repos`: `category`, primary `domain_id`,
`secondary_domains[]`, `capability_tags[]`, `business_stake[]`,
`business_mechanics[]`, plus provenance (`classified_at`, `classified_by`,
`standard_version`).
- Make `workstreams.repo_id` the **required** anchor; demote `topic_id` to
nullable (optional cross-repo tag). Promote `RepoGoal` to the goal primitive;
reduce `DomainGoal` to a thin rollup keyed by the 14 market domains.
- **Rename `workstream` → `workplan`** across the table, FKs (workstream_dependency,
repo_goal, task.workstream_id, progress_event, decision, etc.), models, and
schemas.
- **Data migration:** backfill `repo_id` on every workstream (no orphans); map old
14 domains → new model per standard §15; assign existing 57 registered repos a
**derived** classification (`classified_by: migration`) so nothing orphans;
resolve discrepancies: reconcile `markitect-project` → `markitect-main`, retire
phantom `railiance-bootstrap`/`railiance-hosts` (or relink), collapse the
`vergabe_teilnahme` duplicate, decide `personhood` disposition.

Done when forward migration applies with zero orphaned coordination records, a
dry-run report is reviewed, and a tested downgrade path exists.
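The "zero orphans" gate and the dry-run report can be illustrated with a minimal sketch. It uses the standard-library `sqlite3` module rather than Alembic, a toy three-column `workstreams` table, and a hypothetical `topic_to_repo` mapping as the backfill source; none of these reflect the real schema or revision `d8e9f0a1b2c3`.

```python
import sqlite3


def orphaned_after_backfill(conn, topic_to_repo):
    """Dry run: ids of workstreams that would still lack repo_id after the backfill."""
    rows = conn.execute(
        "SELECT id, topic_id FROM workstreams WHERE repo_id IS NULL ORDER BY id"
    ).fetchall()
    return [ws_id for ws_id, topic_id in rows if topic_id not in topic_to_repo]


def apply_migration(conn, topic_to_repo):
    """Backfill repo_id, then rename workstreams to workplans; refuse if any row would orphan."""
    orphans = orphaned_after_backfill(conn, topic_to_repo)
    if orphans:
        raise RuntimeError(f"would orphan workstreams: {orphans}")
    for topic_id, repo_id in topic_to_repo.items():
        conn.execute(
            "UPDATE workstreams SET repo_id = ? WHERE repo_id IS NULL AND topic_id = ?",
            (repo_id, topic_id),
        )
    conn.execute("ALTER TABLE workstreams RENAME TO workplans")
    conn.commit()


# Toy data (assumed): one row already anchored, two that need the backfill.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE workstreams (id TEXT PRIMARY KEY, topic_id TEXT, repo_id TEXT)")
conn.executemany(
    "INSERT INTO workstreams VALUES (?, ?, ?)",
    [("ws-1", "custodian", None), ("ws-2", "custodian", "state-hub"), ("ws-3", "unmapped", None)],
)

# An incomplete mapping shows up in the dry-run report before anything is changed.
dry_run = orphaned_after_backfill(conn, {"custodian": "the-custodian"})
print(dry_run)  # ['ws-3']

apply_migration(conn, {"custodian": "the-custodian", "unmapped": "state-hub"})
remaining = conn.execute("SELECT COUNT(*) FROM workplans WHERE repo_id IS NULL").fetchone()[0]
print(remaining)  # 0
```

Running the dry run first keeps the destructive rename behind a reviewed report, which is the ordering the "Done when" criterion asks for.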
### P2 — API / MCP / validation surface
```task
id: STATE-WP-0065-T02
status: done
priority: high
state_hub_task_id: "d3afcae1-d47e-42f1-bad8-1de4bd1f126a"
```
- Enforce the T01 allowed-values at the API boundary (import the canon
`repo-classification.allowed.yaml`); reject invalid category/domain/tags/stake/
mechanics.
- Rename `workstream` → `workplan` across REST routes, Pydantic schemas, MCP
tools/resources, and `mcp_server/TOOLS.md`; provide compatibility/redirect notes
for existing tool callers.
- Expose list/filter-by-classification (category, domain, capability, business
stake) in MCP/REST.

Done when the API/MCP surface uses "workplan", validates classification, and tests
are green.
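Boundary validation against the allowed values might look like the sketch below. The 14 market domains are taken from P1; the document keys (`category`, `domain`, `secondary_domains`) and the function name are assumptions, and only the `tooling` category is listed because the other four are defined in the canon file and not reproduced here. The remaining enums (capability tags, business stake, business mechanics) would follow the same pattern.

```python
MARKET_DOMAINS = frozenset({
    "infotech", "financials", "communication", "consumer", "health",
    "industrials", "energy", "utilities", "materials", "realestate",
    "crypto", "agents", "space", "government",
})

# Assumption: a real implementation would load these sets from
# repo-classification.allowed.yaml instead of hard-coding them.
ALLOWED = {"category": frozenset({"tooling"}), "domain": MARKET_DOMAINS}


def validate_classification(doc, allowed=ALLOWED):
    """Return a list of human-readable errors; an empty list means the document is valid."""
    errors = []
    category = doc.get("category")
    if category not in allowed["category"]:
        errors.append(f"category: {category!r} is not an allowed value")
    domain = doc.get("domain")
    if domain not in allowed["domain"]:
        errors.append(f"domain: {domain!r} is not an allowed value")
    for value in doc.get("secondary_domains", []):
        if value not in allowed["domain"]:
            errors.append(f"secondary_domains: {value!r} is not an allowed value")
    return errors


good = validate_classification(
    {"category": "tooling", "domain": "infotech", "secondary_domains": ["agents"]}
)
bad = validate_classification(
    {"category": "tooling", "domain": "custodian", "secondary_domains": ["space", "retail"]}
)
print(good)  # []
print(bad)
```

Returning every error at once, rather than raising on the first, lets an API handler reject a request with a complete list of the invalid fields.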
### P3 — Auto-registration tooling
```task
id: STATE-WP-0065-T03
status: done
priority: high
state_hub_task_id: "bab90f0c-238e-4f43-b34c-a8cdd8faf0e6"
```
Build an idempotent `register-from-classification` capability (Make target +
script + MCP tool): given a repo (local path or Gitea API), read
`.repo-classification.yaml`, validate against T01, and upsert the `managed_repo`
with the full classification. Support a **bulk** run and reclassification of
existing rows. Reuse the k3s/Gitea access path (Gitea in k3s on coulombcore via
`kubectl port-forward svc/gitea-http`).

Done when one command registers/reclassifies every repo with a valid file and
emits a report of registered / updated / skipped / invalid.
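The idempotent upsert and its four-way report can be sketched as follows. This is not the actual tool: the registry is an in-memory dict, classifications arrive already parsed (YAML loading and Gitea access are omitted), `broken-repo` is a made-up name, and treating both "no file" and "unchanged" as `skipped` is an assumption about what that bucket means.

```python
def register_from_classification(registry, candidates, is_valid):
    """Upsert each candidate into the registry and count the outcomes.

    registry:   repo name -> stored classification (mutated in place)
    candidates: repo name -> parsed classification, or None when no file exists
    is_valid:   callable returning True for an acceptable classification
    """
    report = {"registered": 0, "updated": 0, "skipped": 0, "invalid": 0}
    for name, classification in candidates.items():
        if classification is None:
            report["skipped"] += 1          # no .repo-classification.yaml yet
        elif not is_valid(classification):
            report["invalid"] += 1          # never written to the registry
        elif name not in registry:
            registry[name] = dict(classification)
            report["registered"] += 1
        elif registry[name] != classification:
            registry[name] = dict(classification)
            report["updated"] += 1          # reclassification of an existing row
        else:
            report["skipped"] += 1          # already up to date
    return report


def only_tooling(classification):
    # Stand-in validator for the demo; real validation uses the allowed-values file.
    return classification.get("category") == "tooling"


registry = {}
candidates = {
    "state-hub": {"category": "tooling", "domain": "infotech"},
    "the-custodian": {"category": "tooling", "domain": "infotech"},
    "markitect-main": None,
    "broken-repo": {"category": "unknown"},
}

first = register_from_classification(registry, candidates, only_tooling)
second = register_from_classification(registry, candidates, only_tooling)
print(first)   # {'registered': 2, 'updated': 0, 'skipped': 1, 'invalid': 1}
print(second)  # {'registered': 0, 'updated': 0, 'skipped': 3, 'invalid': 1}
```

The second run changes nothing in the registry, which is the property that makes a bulk run safe to repeat.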
### P4 — Surfaces & cutover
```task
id: STATE-WP-0065-T04
status: done
priority: medium
state_hub_task_id: "67c54009-823d-466b-beaf-f27351c279f4"
```
- Dashboard: navigate by category / domain / capability / business-stake.
- Consistency checker: add a rule flagging registered repos lacking a valid
`.repo-classification.yaml`; update existing C-rules coupled to the old domain
model to avoid mass false positives.
- Update orientation docs that assume the old domain model (`SCOPE.md`,
`README.md`, `.claude/rules/*` in the-custodian and state-hub).
- Cutover: switch orientation/registration tooling to the new model end-to-end,
archive the old domain semantics, run `make fix-consistency REPO=the-custodian`.

Done when an end-to-end pass (classify → auto-register → dashboard view) is
verified and the old ad-hoc domain model is retired.
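A consistency rule of the kind described in the checker bullet could be approximated like this. The function name and the directory layout are assumptions, and the check is deliberately shallow: it flags a repo whose `.repo-classification.yaml` is absent or empty, and leaves content validation to the validator.

```python
import tempfile
from pathlib import Path

CLASSIFICATION_FILE = ".repo-classification.yaml"


def repos_lacking_classification(repo_roots):
    """Return the sorted names of repos whose classification file is absent or empty."""
    flagged = []
    for name, root in sorted(repo_roots.items()):
        path = Path(root) / CLASSIFICATION_FILE
        if not path.is_file() or not path.read_text(encoding="utf-8").strip():
            flagged.append(name)
    return flagged


# Demo with two throwaway checkouts: one classified, one not.
with tempfile.TemporaryDirectory() as tmp:
    classified = Path(tmp) / "state-hub"
    unclassified = Path(tmp) / "markitect-main"
    classified.mkdir()
    unclassified.mkdir()
    (classified / CLASSIFICATION_FILE).write_text("category: tooling\n", encoding="utf-8")
    flagged = repos_lacking_classification(
        {"state-hub": classified, "markitect-main": unclassified}
    )

print(flagged)  # ['markitect-main']
```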
## Sequencing
P1 → P2 (P2 depends on the renamed/extended schema). P3 depends on P2's
validation + classification columns. P4 last (consuming surfaces + cutover).
Lazy reclassification of already-registered repos (CUST-WP-0050 T07) runs via P3
as committed `.repo-classification.yaml` files appear — not an upfront batch.
Classifying + registering the remaining Gitea inventory is a **post-cutover** task
under the new model (CUST-WP-0050 T11).
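The phase ordering above can be checked mechanically. The sketch below encodes the stated dependencies as a graph and asks the standard-library `graphlib` module for a valid execution order; modelling "P4 last" as a dependency on P3 is an assumption.

```python
from graphlib import TopologicalSorter

# Each phase maps to the set of phases it depends on.
depends_on = {
    "P1": set(),
    "P2": {"P1"},   # needs the renamed/extended schema
    "P3": {"P2"},   # needs validation + classification columns
    "P4": {"P3"},   # consuming surfaces + cutover come last
}

order = list(TopologicalSorter(depends_on).static_order())
print(order)  # ['P1', 'P2', 'P3', 'P4']
```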
## Risks
- **Breaking-migration blast radius:** `domain_id`/`topic_id` thread through ~11
models and ~12 routers/schemas; `workstream` through ~14 routers + MCP server +
TOOLS.md. Mitigate with reviewed dry-run + tested downgrade (P1) and merging the
three spine changes into one window.
- **Consistency-checker coupling:** existing C-rules assume the current domain
model; update alongside (P4).