---
id: CUST-WP-0006
type: workplan
title: GEMS Analysis & State-Hub Migration Planning
domain: custodian
status: done
owner: custodian
topic_slug: the-custodian
created: 2026-03-02
updated: 2026-03-12
state_hub_workstream_id: "7ce13282-d534-492a-8d42-b3a134028823"
---

# CUST-WP-0006 — GEMS Analysis & State-Hub Migration Planning

## Purpose

Apply the Generic Entity Modelling System (GEMS) — documented in `wiki/GenericEntityModellingSystem.md` — to the state-hub data store. This workplan covers:

1. A domain-specific GEMS implementation skeleton (type registry + hierarchy)
2. An audit of current model inconsistencies against GEMS principles
3. Escalation of all non-trivial structural decisions
4. A SWOT analysis of migrating the state-hub data store to the GEMS model
5. (Deferred) A refined migration workplan, once key decisions are resolved

Companion documents:

- `wiki/GEMS-StateHub-TypeRegistry.md` — canonical type registry
- `wiki/GEMS-StateHub-SWOT.md` — SWOT analysis

---

## Phase 1 — GEMS Skeleton Definition

### Task T01: Map current state-hub entities to GEMS types

```task
id: T01
status: done
priority: high
assignee: custodian
state_hub_task_id: "b771f327-128e-4923-a7ad-2080b4e49eb9"
```

**Deliverable:** `wiki/GEMS-StateHub-TypeRegistry.md`

See companion document.
Summary:

| Current Table | GEMS Kind | GEMS Primary | Notes |
|---|---|---|---|
| `Domain` | Complex | Ecosystem (implicit root) | 6 canonical domains |
| `Topic` | Complex | Domain | Focus area / active project |
| `ManagedRepo` | Complex | Domain | Managed git repo |
| `Workstream` | Complex | **Repository** (currently Topic) | Work package — ADR-001 mismatch |
| `SBOMSnapshot` | Complex | Repository | Does not yet exist as an entity |
| `Task` | Atom | Workstream | ✓ correct |
| `Decision` | Atom | Repository (currently Topic or Workstream) | Dual-attach ambiguity |
| `TechnicalDebt` | Atom | Repository (currently domain: str) | String FK inconsistency |
| `ExtensionPoint` | Atom | Repository (currently domain: str) | String FK inconsistency |
| `Contribution` | Atom | Repository (no domain FK) | No domain affiliation |
| `ProgressEvent` | Atom | Workstream (or Topic) | Multi-attach ambiguity |
| `SBOMEntry` | Atom | SBOMSnapshot (currently ManagedRepo) | No container |
| `WorkstreamDependency` | Relation | Domain | Flat junction table |

---

### Task T02: Inconsistency audit — current model vs GEMS

```task
id: T02
status: done
priority: high
assignee: custodian
state_hub_task_id: "2be639ea-a19e-4c80-bcde-c3da06ec5a49"
```

**Identified inconsistencies:**

**I-1 — String domain field (high severity)**
`ExtensionPoint.domain` and `TechnicalDebt.domain` are `String(50)` columns, not FKs to `domains.id`. The `rename_domain` API patches these manually via string updates — there is no referential integrity. Dashboard filtering silently returns empty results when slugs drift.

**I-2 — Workstream primary container is Topic, not Repository (critical severity)**
GEMS §7 places `Workstream.primary = Repository`. ADR-001 states that workplans (the file backing a workstream) must originate in a repository. However, the current schema has `Workstream.topic_id NOT NULL` — Topic is the enforced primary container. This is an ADR-001 violation embedded in the schema itself.
There is currently no `repo_id` on Workstream.

**I-3 — Decision dual attachment without clear hierarchy (medium severity)**
`Decision` has both `topic_id` and `workstream_id` FKs, with a CHECK constraint requiring at least one. GEMS requires exactly one primary attachment. The current model allows ambiguous "where does this Decision live" answers.

**I-4 — ManagedRepo has a nullable `topic_id` FK (low-medium severity)**
`ManagedRepo.topic_id` is a nullable FK to `topics`. In GEMS, Repository is a Complex whose primary is Domain, not Topic. The topic_id on Repo suggests a second, conflicting hierarchy path.

**I-5 — No SBOMSnapshot container entity (medium severity)**
SBOM entries are flat rows tagged with `repo_id` and `snapshot_at`. GEMS §7 defines `SBOM (Complex, primary=Repository)` as an organizer. Without this container, it is impossible to query "all packages in snapshot X" as a first-class concept, or to model snapshot-to-snapshot diffs.

**I-6 — Contribution has no domain or repository FK (medium severity)**
`Contribution` has only optional `related_topic_id` and `related_workstream_id`. There is no direct link to Domain or Repository, making domain-scoped contribution queries fragile.

**I-7 — No Ecosystem root entity (low severity)**
GEMS §3.2 requires at least one root Complex. The current model has no explicit root — Domain is the de-facto root but is not declared as such. This matters when you want to express cross-domain relations or system-level policies.

**I-8 — WorkstreamDependency as flat junction table (low severity)**
GEMS §4 defines Relations as first-class entities whose primary is a Complex. The current `WorkstreamDependency` is a flat table. For the current usage this works, but it makes contextual queries (e.g. "all dependencies within domain X") less uniform.
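To make the I-3 distinction concrete, the following is a minimal sketch, not the state-hub schema. It uses an in-memory SQLite database and two hypothetical tables (`decisions_at_least_one`, `decisions_exactly_one`) to contrast the "at least one" CHECK described above with an "exactly one" variant. The table names, the helper `try_insert`, and the use of SQLite are all assumptions for illustration; the eventual primary container for `Decision` is a separate, escalated decision.

```python
import sqlite3

# Hypothetical tables for illustration only; names are not from the state-hub schema.
SCHEMA = """
CREATE TABLE decisions_at_least_one (
    id INTEGER PRIMARY KEY,
    topic_id INTEGER,
    workstream_id INTEGER,
    -- Shape described in I-3: at least one attachment must be present.
    CHECK (topic_id IS NOT NULL OR workstream_id IS NOT NULL)
);
CREATE TABLE decisions_exactly_one (
    id INTEGER PRIMARY KEY,
    topic_id INTEGER,
    workstream_id INTEGER,
    -- Stricter variant: each IS NOT NULL yields 0 or 1, so the sum must be 1.
    CHECK ((topic_id IS NOT NULL) + (workstream_id IS NOT NULL) = 1)
);
"""


def try_insert(conn, table, topic_id, workstream_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            f"INSERT INTO {table} (topic_id, workstream_id) VALUES (?, ?)",
            (topic_id, workstream_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# A Decision attached to both a topic and a workstream.
both_loose = try_insert(conn, "decisions_at_least_one", 1, 2)
both_strict = try_insert(conn, "decisions_exactly_one", 1, 2)

# A Decision attached to a workstream only.
single_strict = try_insert(conn, "decisions_exactly_one", None, 2)

# A Decision attached to nothing.
none_loose = try_insert(conn, "decisions_at_least_one", None, None)
none_strict = try_insert(conn, "decisions_exactly_one", None, None)
```

Under the looser constraint the dual-attached row is accepted, which is the ambiguity I-3 describes; the stricter constraint rejects it while still accepting a single attachment.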
---

## Phase 2 — Decision Escalation

### Task T03: Register non-trivial decisions in state-hub

```task
id: T03
status: done
priority: critical
assignee: custodian
state_hub_task_id: "da639706-3b14-42b8-92de-b9de84dbb2be"
```

Six decisions were escalated (see state-hub records):

- DEC-GEMS-001: GEMS implementation architecture (typed tables vs. generic entity model)
- DEC-GEMS-002: Workstream primary container — Topic vs. Repository
- DEC-GEMS-003: Domain string → FK migration for ExtensionPoint and TechnicalDebt
- DEC-GEMS-004: SBOMSnapshot container entity
- DEC-GEMS-005: Ecosystem root entity
- DEC-GEMS-006: WorkstreamDependency as first-class Relation entity

---

## Phase 3 — SWOT Analysis

### Task T04: Produce SWOT analysis document

```task
id: T04
status: done
priority: high
assignee: custodian
state_hub_task_id: "96dd379a-02b4-4629-8db7-3bef15b9639d"
```

**Deliverable:** `wiki/GEMS-StateHub-SWOT.md`

See companion document for the full analysis. Summary verdict:

The migration is **worth pursuing incrementally** (Pattern C from GEMS §9). The most impactful and least risky first moves are, in order:

1. Fix I-1: migrate domain string → FK on EP/TD (low risk, high consistency gain)
2. Fix I-2: add `repo_id` to Workstream (medium risk, fixes ADR-001 alignment)
3. Add SBOMSnapshot container (medium risk, enables snapshot diffing)

Full generic entity table architecture (Option A) is deferred until after the typed-table alignment is stable and validated.

---

## Phase 4 — Migration Workplan Refinement (deferred)

### Task T05: Write detailed migration workplan (CUST-WP-0007)

```task
id: T05
status: done
priority: high
assignee: custodian
state_hub_task_id: "032649fb-2d21-44b7-9735-346405168d8e"
```

Once the six decisions are resolved, produce `workplans/CUST-WP-0007-gems-migration.md` covering:

- Schema migrations (Alembic versions)
- Data backfill scripts
- API router changes
- MCP tool changes
- Dashboard updates
- ADR-001 workplan file format updates
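As an illustration of what a data backfill script for the I-1 fix (domain string → FK) might need to do, here is a minimal sketch in plain Python. It is not the CUST-WP-0007 implementation: the names `BackfillPlan` and `plan_domain_backfill`, the `(row_id, slug)` input shape, and the whitespace/case normalization are all assumptions. The point it shows is that a backfill should report slugs that match no domain rather than silently dropping them, since I-1 notes that slugs can drift.

```python
from dataclasses import dataclass, field


@dataclass
class BackfillPlan:
    """Outcome of a dry-run backfill (hypothetical structure)."""
    resolved: dict = field(default_factory=dict)    # row id -> domain id
    unresolved: dict = field(default_factory=dict)  # row id -> original slug


def plan_domain_backfill(rows, slug_to_id):
    """Map each row's free-text domain slug to a domain id.

    rows: iterable of (row_id, domain_slug) pairs, e.g. read from the
          TechnicalDebt or ExtensionPoint table before migration.
    slug_to_id: mapping of canonical domain slug -> domain id.

    Assumption: slugs are compared after stripping whitespace and
    lowercasing. Whether that normalization is appropriate depends on
    how the real data has drifted.
    """
    plan = BackfillPlan()
    for row_id, slug in rows:
        key = (slug or "").strip().lower()
        if key in slug_to_id:
            plan.resolved[row_id] = slug_to_id[key]
        else:
            # Keep the original value so a human can review the mismatch.
            plan.unresolved[row_id] = slug
    return plan


# Example dry run with made-up rows; 'old-slug' stands in for a drifted value.
example_rows = [(1, "custodian"), (2, " Custodian "), (3, "old-slug")]
example_plan = plan_domain_backfill(example_rows, {"custodian": 10})
```

Running the plan as a dry run first, and only adding the NOT NULL foreign key once `unresolved` is empty, keeps the schema migration and the data cleanup as separate, reviewable steps.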