the-custodian/workplans/CUST-WP-0006-gems-state-hub.md
tegwick 9d8bc4a8e6 chore(CUST-WP-0006): mark workplan done — all deliverables complete
DoD passed: TypeRegistry.md, SWOT.md, CUST-WP-0007 workplan all present
and committed. No automated tests (pure analysis/planning workstream, no code).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 21:46:17 +01:00

---
id: CUST-WP-0006
type: workplan
title: GEMS Analysis & State-Hub Migration Planning
domain: custodian
status: done
owner: custodian
topic_slug: the-custodian
created: 2026-03-02
updated: 2026-03-12
state_hub_workstream_id: "7ce13282-d534-492a-8d42-b3a134028823"
---
# CUST-WP-0006 — GEMS Analysis & State-Hub Migration Planning
## Purpose
Apply the Generic Entity Modelling System (GEMS) — documented in
`wiki/GenericEntityModellingSystem.md` — to the state-hub data store.
This workplan covers:
1. A domain-specific GEMS implementation skeleton (type registry + hierarchy)
2. An audit of current model inconsistencies against GEMS principles
3. Escalation of all non-trivial structural decisions
4. A SWOT analysis of migrating the state-hub data store to the GEMS model
5. (Deferred) A refined migration workplan, once key decisions are resolved
Companion documents:
- `wiki/GEMS-StateHub-TypeRegistry.md` — canonical type registry
- `wiki/GEMS-StateHub-SWOT.md` — SWOT analysis
---
## Phase 1 — GEMS Skeleton Definition
### Task T01: Map current state-hub entities to GEMS types
```task
id: T01
status: done
priority: high
assignee: custodian
state_hub_task_id: "b771f327-128e-4923-a7ad-2080b4e49eb9"
```
**Deliverable:** `wiki/GEMS-StateHub-TypeRegistry.md`
See companion document. Summary:
| Current Table | GEMS Kind | GEMS Primary | Notes |
|---|---|---|---|
| `Domain` | Complex | Ecosystem (implicit root) | 6 canonical domains |
| `Topic` | Complex | Domain | Focus area / active project |
| `ManagedRepo` | Complex | Domain | Managed git repo |
| `Workstream` | Complex | **Repository** (currently Topic) | Work package — ADR-001 mismatch |
| `SBOMSnapshot` | Complex | Repository | Does not yet exist as an entity |
| `Task` | Atom | Workstream | ✓ correct |
| `Decision` | Atom | Repository (currently Topic or Workstream) | Dual-attach ambiguity |
| `TechnicalDebt` | Atom | Repository (currently domain: str) | String FK inconsistency |
| `ExtensionPoint` | Atom | Repository (currently domain: str) | String FK inconsistency |
| `Contribution` | Atom | Repository (no domain FK) | No domain affiliation |
| `ProgressEvent` | Atom | Workstream (or Topic) | Multi-attach ambiguity |
| `SBOMEntry` | Atom | SBOMSnapshot (currently ManagedRepo) | No container |
| `WorkstreamDependency` | Relation | Domain | Flat junction table |
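
The target-state hierarchy in the table can also be written down as data and checked mechanically. The following is a minimal sketch, not the registry format used in the companion document: the `GemsType` class and both helper functions are hypothetical, `ProgressEvent` is attached to Workstream only (the table leaves Topic open as an alternative), and the rule that every primary must be a Complex is an assumption read off the table rather than a quoted GEMS rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GemsType:
    name: str
    kind: str                # "Complex", "Atom", or "Relation"
    primary: Optional[str]   # name of the primary container; None for the root


# Target-state rows transcribed from the summary table (GEMS names, not table names).
REGISTRY = {t.name: t for t in [
    GemsType("Ecosystem", "Complex", None),          # implicit root
    GemsType("Domain", "Complex", "Ecosystem"),
    GemsType("Topic", "Complex", "Domain"),
    GemsType("Repository", "Complex", "Domain"),     # current table: ManagedRepo
    GemsType("Workstream", "Complex", "Repository"),
    GemsType("SBOMSnapshot", "Complex", "Repository"),
    GemsType("Task", "Atom", "Workstream"),
    GemsType("Decision", "Atom", "Repository"),
    GemsType("TechnicalDebt", "Atom", "Repository"),
    GemsType("ExtensionPoint", "Atom", "Repository"),
    GemsType("Contribution", "Atom", "Repository"),
    GemsType("ProgressEvent", "Atom", "Workstream"),  # assumption: Workstream, not Topic
    GemsType("SBOMEntry", "Atom", "SBOMSnapshot"),
    GemsType("WorkstreamDependency", "Relation", "Domain"),
]}


def validate(registry):
    """Return a list of problems; an empty list means the hierarchy is consistent."""
    problems = []
    if not any(t.primary is None and t.kind == "Complex" for t in registry.values()):
        problems.append("no root Complex declared")
    for t in registry.values():
        if t.primary is None:
            continue
        parent = registry.get(t.primary)
        if parent is None:
            problems.append(f"{t.name}: unknown primary {t.primary}")
        elif parent.kind != "Complex":
            problems.append(f"{t.name}: primary {parent.name} is not a Complex")
    return problems


def path_to_root(registry, name):
    """Follow primary links from a type up to the root, rejecting cycles."""
    path = [name]
    while registry[path[-1]].primary is not None:
        parent = registry[path[-1]].primary
        if parent in path:
            raise ValueError(f"cycle through {parent}")
        path.append(parent)
    return path
```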
---
### Task T02: Inconsistency audit — current model vs GEMS
```task
id: T02
status: done
priority: high
assignee: custodian
state_hub_task_id: "2be639ea-a19e-4c80-bcde-c3da06ec5a49"
```
**Identified inconsistencies:**
**I-1 — String domain field (high severity)**
`ExtensionPoint.domain` and `TechnicalDebt.domain` are `String(50)` columns, not FKs
to `domains.id`. The `rename_domain` API patches these manually via string updates, so
there is no referential integrity. Dashboard filtering silently returns empty results
when slugs drift.
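
The drift described in I-1 can be reproduced, and audited for, with a small in-memory SQLite sketch. The table names `technical_debt` and `technical_debt_fk`, the `slug` column, and the sample rows are assumptions made for illustration; the real schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE domains (id INTEGER PRIMARY KEY, slug TEXT NOT NULL UNIQUE);

-- Current shape: the domain is a free-form string, so nothing checks it.
CREATE TABLE technical_debt (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    domain VARCHAR(50) NOT NULL
);

-- Target shape: the domain is a real foreign key.
CREATE TABLE technical_debt_fk (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    domain_id INTEGER NOT NULL REFERENCES domains(id)
);

INSERT INTO domains (id, slug) VALUES (1, 'custodian');
INSERT INTO technical_debt (id, title, domain) VALUES
    (1, 'stale cache', 'custodian'),
    (2, 'old exporter', 'custodain');  -- drifted slug, accepted silently
""")


def orphaned_domain_strings(conn):
    """Rows whose domain string matches no domains.slug (a pre-migration audit)."""
    return conn.execute("""
        SELECT t.id, t.domain
        FROM technical_debt AS t
        LEFT JOIN domains AS d ON d.slug = t.domain
        WHERE d.id IS NULL
        ORDER BY t.id
    """).fetchall()


def fk_rejects_unknown_domain(conn):
    """With a real FK, a row pointing at a missing domain is refused."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO technical_debt_fk (title, domain_id) VALUES (?, ?)",
                ("bad row", 99),
            )
        return False
    except sqlite3.IntegrityError:
        return True
```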
**I-2 — Workstream primary container is Topic, not Repository (critical severity)**
GEMS §7 places `Workstream.primary = Repository`. ADR-001 states that workplans
(the file backing a workstream) must originate in a repository. However, the current
schema has `Workstream.topic_id NOT NULL` — Topic is the enforced primary container.
This is an ADR-001 violation embedded in the schema itself. There is currently no
`repo_id` on Workstream.
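
One possible way to introduce `repo_id` without breaking existing rows is an expand-then-backfill sequence, sketched here against in-memory SQLite. This is an illustration under stated assumptions, not the planned migration: the table names, the sample data, and the backfill heuristic (use the repository linked through `ManagedRepo.topic_id` only when a topic has exactly one) are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (id INTEGER PRIMARY KEY, slug TEXT NOT NULL);
CREATE TABLE managed_repos (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    topic_id INTEGER REFERENCES topics(id)           -- nullable, as in I-4
);
CREATE TABLE workstreams (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    topic_id INTEGER NOT NULL REFERENCES topics(id)  -- current enforced primary
);
INSERT INTO topics VALUES (1, 'the-custodian'), (2, 'shared-infra');
INSERT INTO managed_repos VALUES
    (10, 'the-custodian', 1), (20, 'infra-a', 2), (21, 'infra-b', 2);
INSERT INTO workstreams VALUES (100, 'GEMS analysis', 1), (101, 'Infra cleanup', 2);
""")

# Step 1 (expand): add the new column as nullable so existing rows stay valid.
conn.execute(
    "ALTER TABLE workstreams ADD COLUMN repo_id INTEGER REFERENCES managed_repos(id)"
)

# Step 2 (backfill): fill repo_id only where the topic maps to exactly one repo.
conn.execute("""
    UPDATE workstreams
    SET repo_id = (
        SELECT MIN(r.id) FROM managed_repos AS r
        WHERE r.topic_id = workstreams.topic_id
    )
    WHERE repo_id IS NULL
      AND (SELECT COUNT(*) FROM managed_repos AS r
           WHERE r.topic_id = workstreams.topic_id) = 1
""")
conn.commit()


def unresolved_workstreams(conn):
    """Workstreams that still need a manually assigned repository."""
    return [row[0] for row in conn.execute(
        "SELECT id FROM workstreams WHERE repo_id IS NULL ORDER BY id"
    )]
```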
**I-3 — Decision dual attachment without clear hierarchy (medium severity)**
`Decision` has both `topic_id` and `workstream_id` FKs, with a CHECK constraint
requiring at least one. GEMS requires exactly one primary attachment. The current model
allows ambiguous "where does this Decision live" answers.
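
The gap between "at least one" and "exactly one" can be shown with two CHECK constraints side by side. This sketch uses in-memory SQLite, where a boolean expression evaluates to 0 or 1 and can therefore be summed; other databases need a different spelling. The table names are hypothetical, and the sketch illustrates only the exactly-one rule, not the target design in which a Decision attaches to a Repository.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Current rule: at least one attachment, so both may be set at once.
CREATE TABLE decisions_current (
    id INTEGER PRIMARY KEY,
    topic_id INTEGER,
    workstream_id INTEGER,
    CHECK (topic_id IS NOT NULL OR workstream_id IS NOT NULL)
);

-- Stricter rule: exactly one attachment.
CREATE TABLE decisions_strict (
    id INTEGER PRIMARY KEY,
    topic_id INTEGER,
    workstream_id INTEGER,
    CHECK ((topic_id IS NOT NULL) + (workstream_id IS NOT NULL) = 1)
);
""")


def accepts(table, topic_id, workstream_id):
    """Return True if the table's CHECK constraint admits this attachment pair."""
    if table not in ("decisions_current", "decisions_strict"):
        raise ValueError(f"unknown table: {table}")
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                f"INSERT INTO {table} (topic_id, workstream_id) VALUES (?, ?)",
                (topic_id, workstream_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```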
**I-4 — ManagedRepo has a nullable `topic_id` FK (low-medium severity)**
`ManagedRepo.topic_id` is a nullable FK to `topics`. In GEMS, Repository is a Complex
whose primary is Domain, not Topic. The `topic_id` column on `ManagedRepo` suggests a
second, conflicting hierarchy path.
**I-5 — No SBOMSnapshot container entity (medium severity)**
SBOM entries are flat rows tagged with `repo_id` and `snapshot_at`. GEMS §7 defines
`SBOM (Complex, primary=Repository)` as an organizer. Without this container, it is
impossible to query "all packages in snapshot X" as a first-class concept, or to model
snapshot-to-snapshot diffs.
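
To make the missing container concrete, the sketch below groups flat SBOM rows by `(repo_id, snapshot_at)` into snapshot objects and diffs two of them. Only `repo_id` and `snapshot_at` come from the description above; the `package` and `version` fields, the function names, and the sample rows are assumptions for illustration.

```python
from collections import defaultdict

# Flat rows as they exist today: (repo_id, snapshot_at, package, version).
# The sample data is invented for illustration.
ROWS = [
    (10, "2026-03-01T00:00:00", "requests", "2.31.0"),
    (10, "2026-03-01T00:00:00", "sqlalchemy", "2.0.25"),
    (10, "2026-03-10T00:00:00", "requests", "2.32.0"),
    (10, "2026-03-10T00:00:00", "alembic", "1.13.1"),
]


def group_into_snapshots(rows):
    """Build one {package: version} mapping per (repo_id, snapshot_at) pair."""
    snapshots = defaultdict(dict)
    for repo_id, snapshot_at, package, version in rows:
        snapshots[(repo_id, snapshot_at)][package] = version
    return dict(snapshots)


def diff_snapshots(old, new):
    """Compare two snapshots: packages added, removed, or changed in version."""
    return {
        "added": {p: v for p, v in new.items() if p not in old},
        "removed": {p: v for p, v in old.items() if p not in new},
        "changed": {
            p: (old[p], new[p])
            for p in old.keys() & new.keys()
            if old[p] != new[p]
        },
    }


snapshots = group_into_snapshots(ROWS)
delta = diff_snapshots(
    snapshots[(10, "2026-03-01T00:00:00")],
    snapshots[(10, "2026-03-10T00:00:00")],
)
```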
**I-6 — Contribution has no domain or repository FK (medium severity)**
`Contribution` has only optional `related_topic_id` and `related_workstream_id`. There
is no direct link to Domain or Repository, making domain-scoped contribution queries
fragile.
**I-7 — No Ecosystem root entity (low severity)**
GEMS §3.2 requires at least one root Complex. The current model has no explicit root:
Domain is the de facto root but is not declared as such. This matters when you want
to express cross-domain relations or system-level policies.
**I-8 — WorkstreamDependency as flat junction table (low severity)**
GEMS §4 defines Relations as first-class entities whose primary is a Complex. The
current `WorkstreamDependency` is a flat table. For the current usage this works, but
it makes contextual queries (e.g. "all dependencies within domain X") less uniform.
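
The following sketch shows what "all dependencies within domain X" costs with a flat junction table: the domain has to be derived for both ends of every dependency. The column names (`workstream_id`, `depends_on_id`, `topics.domain_id`) and the sample rows are assumptions; the query walks the current Workstream → Topic → Domain path.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE domains (id INTEGER PRIMARY KEY, slug TEXT NOT NULL UNIQUE);
CREATE TABLE topics (
    id INTEGER PRIMARY KEY,
    domain_id INTEGER NOT NULL REFERENCES domains(id)
);
CREATE TABLE workstreams (
    id INTEGER PRIMARY KEY,
    topic_id INTEGER NOT NULL REFERENCES topics(id)
);
CREATE TABLE workstream_dependencies (
    workstream_id INTEGER NOT NULL REFERENCES workstreams(id),
    depends_on_id INTEGER NOT NULL REFERENCES workstreams(id),
    PRIMARY KEY (workstream_id, depends_on_id)
);
INSERT INTO domains VALUES (1, 'custodian'), (2, 'other');
INSERT INTO topics VALUES (1, 1), (2, 2);
INSERT INTO workstreams VALUES (100, 1), (101, 1), (200, 2);
INSERT INTO workstream_dependencies VALUES (101, 100), (200, 100);
""")


def dependencies_within_domain(conn, slug):
    """Dependencies whose two workstreams both resolve to the given domain."""
    return conn.execute("""
        SELECT dep.workstream_id, dep.depends_on_id
        FROM workstream_dependencies AS dep
        JOIN workstreams AS a ON a.id = dep.workstream_id
        JOIN topics AS ta ON ta.id = a.topic_id
        JOIN workstreams AS b ON b.id = dep.depends_on_id
        JOIN topics AS tb ON tb.id = b.topic_id
        JOIN domains AS d ON d.id = ta.domain_id AND d.id = tb.domain_id
        WHERE d.slug = ?
        ORDER BY dep.workstream_id, dep.depends_on_id
    """, (slug,)).fetchall()
```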
---
## Phase 2 — Decision Escalation
### Task T03: Register non-trivial decisions in state-hub
```task
id: T03
status: done
priority: critical
assignee: custodian
state_hub_task_id: "da639706-3b14-42b8-92de-b9de84dbb2be"
```
Six decisions were escalated (see state-hub records):
- DEC-GEMS-001: GEMS implementation architecture (typed tables vs. generic entity model)
- DEC-GEMS-002: Workstream primary container — Topic vs. Repository
- DEC-GEMS-003: Domain string → FK migration for ExtensionPoint and TechnicalDebt
- DEC-GEMS-004: SBOMSnapshot container entity
- DEC-GEMS-005: Ecosystem root entity
- DEC-GEMS-006: WorkstreamDependency as first-class Relation entity
---
## Phase 3 — SWOT Analysis
### Task T04: Produce SWOT analysis document
```task
id: T04
status: done
priority: high
assignee: custodian
state_hub_task_id: "96dd379a-02b4-4629-8db7-3bef15b9639d"
```
**Deliverable:** `wiki/GEMS-StateHub-SWOT.md`
See companion document for the full analysis. Summary verdict:
The migration is **worth pursuing incrementally** (Pattern C from GEMS §9). The most
impactful and least risky first moves are, in order:
1. Fix I-1: migrate domain string → FK on ExtensionPoint and TechnicalDebt (low risk, high consistency gain)
2. Fix I-2: add `repo_id` to Workstream (medium risk, fixes ADR-001 alignment)
3. Add SBOMSnapshot container (medium risk, enables snapshot diffing)
Full generic entity table architecture (Option A) is deferred until after the typed-table
alignment is stable and validated.
---
## Phase 4 — Migration Workplan Refinement (deferred)
### Task T05: Write detailed migration workplan (CUST-WP-0007)
```task
id: T05
status: done
priority: high
assignee: custodian
state_hub_task_id: "032649fb-2d21-44b7-9735-346405168d8e"
```
Once the six decisions are resolved, produce `workplans/CUST-WP-0007-gems-migration.md`
covering:
- Schema migrations (Alembic versions)
- Data backfill scripts
- API router changes
- MCP tool changes
- Dashboard updates
- ADR-001 workplan file format updates