state-hub/workplans/STATE-WP-0065-repo-anchored-classification-spine.md
tegwick 1620701ae4 Add STATE-WP-0065: repo-anchored classification spine (CUST-WP-0050 impl)
Re-homed implementation of CUST-WP-0050 Phase 3-4. P1 merges the schema
redesign, data migration, and workstream->workplan rename into one Alembic
window; P2 API/MCP/validation; P3 auto-registration; P4 surfaces & cutover.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 11:52:55 +02:00

---
id: STATE-WP-0065
type: workplan
title: "Repo-anchored classification spine (CUST-WP-0050 implementation)"
domain: custodian
repo: state-hub
status: ready
owner: custodian
topic_slug: custodian
created: "2026-06-22"
updated: "2026-06-22"
state_hub_workstream_id: "8dc7d106-11e2-41df-b512-89ed69d2a65f"
---
# STATE-WP-0065 — Repo-anchored classification spine
**Origin / driver:** `the-custodian/workplans/CUST-WP-0050-repo-classification-registration-redesign.md`
(custodian coordination workplan; owns the canon standard and the portfolio
decisions D1/D1a + ADR-005). This workplan is the **state-hub-local
implementation** of CUST-WP-0050 Phase 3-4, re-homed per that plan's repo
boundary note and ADR-001.
## Inputs already in place (from CUST-WP-0050)
- **Standard + allowed-values:** `the-custodian/canon/standards/repo-classification-standard_v1.0.md`
and `…/repo-classification.allowed.yaml` (5 categories incl. `tooling`, 14
market domains, business_stake/mechanics enums, capability families).
- **Validator:** `the-custodian/tools/validate_repo_classification.py` (self-test PASS).
- **Fixtures:** all 11 custodian-domain repos carry committed, human-reviewed
`.repo-classification.yaml` (`classified_by: human`). Use these to build and
test against; no dependency on classifying the full Gitea inventory first.
- **Decisions:** D1 (repo is the primary anchor; market-domain derived from
classification; topic demoted/retired). ADR-005 (cross-repo workplans anchor to
a dedicated project repo, retired on completion).
## Goal
Replace the Hub's ad-hoc 14-domain spine with the repo-anchored classification
model: the **repo** (plus its committed `.repo-classification.yaml`) becomes the
primary anchor for workplans; market-domain is **derived** from classification;
`workstream` is renamed to `workplan`. The Hub stays fully rebuildable from
repo-owned files (ADR-001).
## Phases / Tasks
### P1 — Spine migration (single Alembic revision)
```task
id: STATE-WP-0065-T01
status: todo
priority: high
```
One coordinated migration (merges CUST-WP-0050 T04 + T05 + T10 — they rewrite the
same spine, so a single window avoids migrating rows two or three times):
- Replace `domains` table contents with the **14 fixed market domains**
(infotech, financials, communication, consumer, health, industrials, energy,
utilities, materials, realestate, crypto, agents, space, government).
- Add classification storage to `managed_repos`: `category`, primary `domain_id`,
`secondary_domains[]`, `capability_tags[]`, `business_stake[]`,
`business_mechanics[]`, plus provenance (`classified_at`, `classified_by`,
`standard_version`).
- Make `workstreams.repo_id` the **required** anchor; demote `topic_id` to
nullable (optional cross-repo tag). Promote `RepoGoal` to the goal primitive;
reduce `DomainGoal` to a thin rollup keyed by the 14 market domains.
- **Rename `workstream` → `workplan`** across the table, FKs (workstream_dependency,
repo_goal, task.workstream_id, progress_event, decision, etc.), models, and
schemas.
- **Data migration:** backfill `repo_id` on every workstream (no orphans); map old
14 domains → new model per standard §15; assign existing 57 registered repos a
**derived** classification (`classified_by: migration`) so nothing orphans;
resolve discrepancies: reconcile `markitect-project` with `markitect-main`, retire
phantom `railiance-bootstrap`/`railiance-hosts` (or relink), collapse the
`vergabe_teilnahme` duplicate, decide `personhood` disposition.
Done when forward migration applies with zero orphaned coordination records, a
dry-run report is reviewed, and a tested downgrade path exists.
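
The "zero orphaned coordination records" condition can be checked before `repo_id` becomes required. The following is a minimal sketch of such a dry-run check, not the migration itself. The table names `workstreams` and `managed_repos` and the `repo_id` column come from this plan; the `id` primary key, the use of SQLite, and the demo rows are assumptions made only so the example runs on its own.

```python
import sqlite3


def orphan_report(conn: sqlite3.Connection) -> dict:
    """Count workstream rows that a required repo_id anchor would orphan."""
    cur = conn.cursor()
    total = cur.execute("SELECT COUNT(*) FROM workstreams").fetchone()[0]
    # A row counts as orphaned when repo_id is NULL or matches no managed_repos row.
    orphaned = cur.execute(
        """
        SELECT COUNT(*)
        FROM workstreams AS w
        LEFT JOIN managed_repos AS r ON r.id = w.repo_id
        WHERE r.id IS NULL
        """
    ).fetchone()[0]
    return {"total": total, "orphaned": orphaned}


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with hypothetical rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE managed_repos (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE workstreams (id INTEGER PRIMARY KEY, title TEXT, repo_id INTEGER);
        INSERT INTO managed_repos VALUES (1, 'state-hub');
        INSERT INTO workstreams VALUES
            (1, 'anchored', 1),
            (2, 'unanchored', NULL),
            (3, 'dangling', 99);
        """
    )
    return conn


if __name__ == "__main__":
    print(orphan_report(demo_connection()))
```

A non-zero `orphaned` count in the dry-run report would indicate that the backfill step is incomplete and the forward migration should not yet be applied.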
### P2 — API / MCP / validation surface
```task
id: STATE-WP-0065-T02
status: todo
priority: high
```
- Enforce the T01 allowed-values at the API boundary (import the canon
`repo-classification.allowed.yaml`); reject invalid category/domain/tags/stake/
mechanics.
- Rename `workstream` → `workplan` across REST routes, Pydantic schemas, MCP
tools/resources, and `mcp_server/TOOLS.md`; provide compatibility/redirect notes
for existing tool callers.
- Expose list/filter-by-classification (category, domain, capability, business
stake) in MCP/REST.
Done when the API/MCP surface uses "workplan", validates classification, and tests
are green.
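
As an illustration of allowed-values enforcement, here is a minimal sketch that assumes the canon `repo-classification.allowed.yaml` has already been parsed into a plain dict. The 14 domain slugs are taken from P1; the dict keys (`categories`, `domains`, ...), the record field name `domain`, and the function name are hypothetical and may not match the real file or schemas. Only `tooling` is listed as a category because it is the only one named in this plan.

```python
# The 14 fixed market domains listed in P1.
MARKET_DOMAINS = frozenset({
    "infotech", "financials", "communication", "consumer", "health",
    "industrials", "energy", "utilities", "materials", "realestate",
    "crypto", "agents", "space", "government",
})

# Hypothetical shape of the parsed allowed-values file. Enum sets that this
# plan does not spell out are left empty here rather than guessed.
EXAMPLE_ALLOWED = {
    "categories": frozenset({"tooling"}),  # 4 further categories not shown
    "domains": MARKET_DOMAINS,
    "capability_tags": frozenset(),
    "business_stake": frozenset(),
    "business_mechanics": frozenset(),
}

# Record field -> key in the allowed-values dict (both hypothetical names).
SCALAR_FIELDS = {"category": "categories", "domain": "domains"}
LIST_FIELDS = {
    "secondary_domains": "domains",
    "capability_tags": "capability_tags",
    "business_stake": "business_stake",
    "business_mechanics": "business_mechanics",
}


def validate_classification(record: dict, allowed: dict) -> list:
    """Return one error message per value that is not in the allowed sets."""
    errors = []
    for field, key in SCALAR_FIELDS.items():
        value = record.get(field)
        if value not in allowed[key]:
            errors.append(f"{field}: {value!r} is not an allowed value")
    for field, key in LIST_FIELDS.items():
        for value in record.get(field, []):
            if value not in allowed[key]:
                errors.append(f"{field}: {value!r} is not an allowed value")
    return errors


if __name__ == "__main__":
    candidate = {"category": "tooling", "domain": "retail"}
    print(validate_classification(candidate, EXAMPLE_ALLOWED))
```

An API handler could reject the request whenever the returned list is non-empty and echo the messages back to the caller.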
### P3 — Auto-registration tooling
```task
id: STATE-WP-0065-T03
status: todo
priority: high
```
Build an idempotent `register-from-classification` capability (Make target +
script + MCP tool): given a repo (local path or Gitea API), read
`.repo-classification.yaml`, validate against T01, and upsert the `managed_repo`
with the full classification. Support a **bulk** run and reclassification of
existing rows. Reuse the k3s/Gitea access path (Gitea in k3s on coulombcore via
`kubectl port-forward svc/gitea-http`).
Done when one command registers/reclassifies every repo with a valid file and
emits a report of registered / updated / skipped / invalid.
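
The idempotency requirement and the four report buckets can be sketched as follows. This is an assumption-laden illustration, not the planned tool: the registry is an in-memory dict standing in for `managed_repos`, `register_all` and `is_valid` are hypothetical names, and the meaning of "skipped" (no classification file, or a stored classification that is already identical) is an interpretation this plan does not spell out.

```python
def register_all(registry: dict, candidates: dict, is_valid) -> dict:
    """Upsert classifications into the registry and tally the outcomes.

    registry:   repo name -> stored classification (stand-in for managed_repos)
    candidates: repo name -> parsed classification, or None if no file exists
    is_valid:   callable returning True when a classification passes validation
    """
    report = {"registered": 0, "updated": 0, "skipped": 0, "invalid": 0}
    for repo, classification in candidates.items():
        if classification is None:
            report["skipped"] += 1  # assumption: no file means skip
        elif not is_valid(classification):
            report["invalid"] += 1
        elif repo not in registry:
            registry[repo] = dict(classification)
            report["registered"] += 1
        elif registry[repo] != classification:
            registry[repo] = dict(classification)  # reclassify existing row
            report["updated"] += 1
        else:
            report["skipped"] += 1  # assumption: unchanged means skip
    return report


if __name__ == "__main__":
    registry = {}
    candidates = {"state-hub": {"category": "tooling"}, "no-file-repo": None}
    accept_tooling = lambda c: c.get("category") == "tooling"
    print(register_all(registry, candidates, accept_tooling))
    # A second run with the same inputs leaves the registry unchanged.
    print(register_all(registry, candidates, accept_tooling))
```

Running the same bulk command twice should report only skipped and invalid entries on the second pass, which is one way to test idempotency.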
### P4 — Surfaces & cutover
```task
id: STATE-WP-0065-T04
status: todo
priority: medium
```
- Dashboard: navigate by category / domain / capability / business-stake.
- Consistency checker: add a rule flagging registered repos lacking a valid
`.repo-classification.yaml`; update existing C-rules coupled to the old domain
model to avoid mass false positives.
- Update orientation docs that assume the old domain model (`SCOPE.md`,
`README.md`, `.claude/rules/*` in the-custodian and state-hub).
- Cutover: switch orientation/registration tooling to the new model end-to-end,
archive the old domain semantics, run `make fix-consistency REPO=the-custodian`.
Done when an end-to-end pass (classify → auto-register → dashboard view) is
verified and the old ad-hoc domain model is retired.
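
The new consistency rule could start from a presence check like the sketch below. It assumes each registered repo has a local checkout and only tests that the file exists; validating its contents would be a further step. The function name and the `checkouts` mapping are hypothetical, and the existing C-rule framework is not shown.

```python
from pathlib import Path

CLASSIFICATION_FILE = ".repo-classification.yaml"


def repos_missing_classification(checkouts: dict) -> list:
    """Return the sorted names of repos whose checkout lacks the file.

    checkouts: repo name -> Path of a local checkout (hypothetical input)
    """
    return sorted(
        name
        for name, root in checkouts.items()
        if not (Path(root) / CLASSIFICATION_FILE).is_file()
    )
```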
## Sequencing
P1 → P2 (P2 depends on the renamed/extended schema). P3 depends on P2's
validation + classification columns. P4 last (consuming surfaces + cutover).
Lazy reclassification of already-registered repos (CUST-WP-0050 T07) runs via P3
as committed `.repo-classification.yaml` files appear — not an upfront batch.
Classifying + registering the remaining Gitea inventory is a **post-cutover** task
under the new model (CUST-WP-0050 T11).
## Risks
- **Breaking-migration blast radius:** `domain_id`/`topic_id` thread through ~11
models and ~12 routers/schemas; `workstream` through ~14 routers + MCP server +
TOOLS.md. Mitigate with reviewed dry-run + tested downgrade (P1) and merging the
three spine changes into one window.
- **Consistency-checker coupling:** existing C-rules assume the current domain
model; update alongside (P4).