Implements CUST-WP-0007. Resolves inconsistencies I-1, I-2, I-5, I-6
identified in the GEMS audit (GenericEntityModellingSystem.md).
Pass 1 (e1f2a3b4c5d6): domain_id FK on extension_points and
technical_debt (replaces raw string column); repo_id FK on contributions.
Fixes domain-filtering bugs in EP/TD dashboard pages.
Pass 2 (f2a3b4c5d6e7): repo_id nullable FK on workstreams, aligning
the GEMS primary attachment with ADR-001 (repo > topic). Dashboard
pages updated to prefer repo->domain over topic->domain.
Pass 3 (a3b4c5d6e7f8): SBOMSnapshot container entity (GEMS Complex
between Repository and SBOMEntry). Ingest is now additive — each call
creates a new snapshot; history is retained. List/report endpoints
filter to latest snapshot per repo via _latest_snapshot_ids_subquery().
New endpoints: GET /sbom/snapshots/, GET /sbom/snapshots/{id}/.
Dashboard gains a Snapshot History section.
Also adds GEMS analysis artefacts: wiki/GEMS-StateHub-TypeRegistry.md,
wiki/GEMS-StateHub-SWOT.md, workplans/CUST-WP-0006 (analysis),
workplans/CUST-WP-0007 (migration, now completed).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Generic entity modeling system
A domain-agnostic data modeling system for organizing “entities under management” in a rigorous, flexible, and extensible way.
Goals
- Rigorous: clear invariants, predictable querying, safe evolution.
- Flexible: new entity types and new relations without migrations that rewrite everything.
- Extensible: supports multiple domains, sub-domains, and incremental adoption over existing data.
1. Core concepts
1.1 Entity
An Entity is the atomic unit of identity and lifecycle.
Entity fields (conceptual)
- id (immutable unique identifier)
- kind ∈ {Atom, Complex, Relation}
- type (domain-specific type name, e.g. Task, Repository, Customer)
- payload (type-specific attributes; ideally versioned)
- attachments (ordered list of entity references)
- meta (timestamps, version, permissions, provenance)
Entity kinds
- Atom: primary facts / content objects.
- Complex: organizational containers and structure owners (hierarchy, collections, indexes, contexts).
- Relation: first-class edge object that encodes a relationship between entities; owned by a Complex.
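The conceptual fields above could be sketched as a small Python record. This is a minimal illustration, not the system's actual schema: the class names, the use of plain string ids as entity references, and the `primary` helper are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Optional


class Kind(Enum):
    ATOM = "Atom"
    COMPLEX = "Complex"
    RELATION = "Relation"


@dataclass
class Entity:
    id: str                      # unique identifier; treated as immutable by convention
    kind: Kind
    type: str                    # domain-specific type name, e.g. "Task"
    payload: dict[str, Any] = field(default_factory=dict)
    attachments: list[str] = field(default_factory=list)  # ordered entity ids (assumed ref format)
    meta: dict[str, Any] = field(default_factory=dict)

    @property
    def primary(self) -> Optional[str]:
        """Id of the primary attachment (attachments[0]), or None if there is none."""
        return self.attachments[0] if self.attachments else None


# Hypothetical ids, for illustration only.
repo = Entity("repo:42", Kind.COMPLEX, "Repository", attachments=["domain:1"])
task = Entity("task:a", Kind.ATOM, "Task", {"title": "Define API"}, ["repo:42"])
```

Keeping `attachments` as an ordered list of ids, rather than a separate parent field, means the primary attachment is always derived from position 0.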
2. Attachments
2.1 Attachment list
Every entity has an ordered list:
- attachments: [EntityRef]
- attachments[0] is the Primary Attachment.
Derived notion: “Part-of”
If an entity’s primary attachment is an Atom, then the entity is a Part of that Atom.
This is a classification derived from data, not a separate stored relation.
2.2 Attachment roles (recommended)
To avoid ambiguity and allow validation, each attachment can optionally have a role label.
Conceptually:
attachment = { targetId, position, role }
Common roles:
- primary (implicit by position 0)
- index (entity appears in this complex for navigation/search)
- provenance (source reference)
- tag (classification)
- context (additional scope)
Ordering remains canonical; roles improve clarity and constraints.
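One way to keep ordering canonical while still carrying roles is to derive positions from list order and reserve the primary role for position 0. The sketch below assumes a hypothetical `Attachment` record and a `build_attachments` helper; neither name comes from the system itself.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attachment:
    target_id: str
    position: int
    role: Optional[str] = None   # e.g. "index", "provenance", "tag", "context"


def build_attachments(targets: list[tuple[str, Optional[str]]]) -> list[Attachment]:
    """Assign positions from list order; position 0 always gets the role 'primary'."""
    result = []
    for position, (target_id, role) in enumerate(targets):
        if position == 0:
            if role not in (None, "primary"):
                raise ValueError("position 0 is the primary attachment")
            role = "primary"
        elif role == "primary":
            raise ValueError("only position 0 may carry the 'primary' role")
        result.append(Attachment(target_id, position, role))
    return result


# Hypothetical ids: a task living in a repository and indexed by a workstream.
atts = build_attachments([("repo:42", None), ("ws:7", "index")])
```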
3. Hierarchy and layering
3.1 Primary chain
The Primary Chain of an entity is obtained by repeatedly following attachments[0].
Invariant (recommended): the primary chain must be acyclic.
This yields a robust layering model:
- Every entity “lives in” a context (a Complex), or is “part of” an Atom.
- You can always answer: “Where does this belong?” by walking the primary chain.
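Walking the primary chain and checking the acyclicity invariant can be done in one pass. The following is a minimal sketch that assumes the primary attachments are available as a simple id-to-id mapping (`primary_of`, a hypothetical name), with None marking a root.

```python
from __future__ import annotations

from typing import Optional


def primary_chain(entity_id: str, primary_of: dict[str, Optional[str]]) -> list[str]:
    """Return the ids reached by repeatedly following attachments[0].

    primary_of maps an entity id to its primary attachment id (None for a root).
    Raises ValueError if the chain loops back on itself.
    """
    chain: list[str] = []
    seen = {entity_id}
    current = primary_of[entity_id]
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in primary chain at {current!r}")
        seen.add(current)
        chain.append(current)
        current = primary_of[current]
    return chain


# Hypothetical ids, for illustration only.
primary_of = {
    "eco:1": None,
    "domain:1": "eco:1",
    "repo:42": "domain:1",
    "task:a": "repo:42",
}
print(primary_chain("task:a", primary_of))  # ['repo:42', 'domain:1', 'eco:1']
```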
3.2 Roots and scopes
A system should define at least one root Complex (e.g. Ecosystem, Workspace, Tenant).
All managed entities must be reachable from a root by following primary attachments.
4. Relations as first-class entities
4.1 Relation entity
A Relation is an entity whose purpose is to define a connection among other entities.
Key rule
- A Relation’s primary attachment MUST be a Complex. That Complex is the relation-space (the context that “owns” the relationship).
This avoids “atoms knowing” relation details: atoms remain content, complexes and relations hold structure.
4.2 Relation endpoints convention
To make relations queryable and consistent, standardize attachment slots:
- attachments[0] = contextComplex (primary; relation-space)
- attachments[1] = fromEndpoint
- attachments[2] = toEndpoint
- attachments[3..] = optional extra endpoints (evidence, via, stakeholder, etc.)
Relation semantics live in:
- type (e.g. DependsOn, Implements, References)
- and/or payload fields like { relType: "...", strength: ..., rationale: ... }
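The slot convention can be wrapped in named accessors so that callers do not index into the attachment list directly. This is a sketch under the assumption that a relation needs at least a context, a from-endpoint, and a to-endpoint; the `Relation` class and its property names are illustrative.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Relation:
    id: str
    type: str                       # e.g. "DependsOn"
    attachments: list[str]          # [context, from, to, *extras]
    payload: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if len(self.attachments) < 3:
            raise ValueError("a relation needs context, from and to attachments")

    @property
    def context(self) -> str:
        return self.attachments[0]

    @property
    def from_endpoint(self) -> str:
        return self.attachments[1]

    @property
    def to_endpoint(self) -> str:
        return self.attachments[2]

    @property
    def extras(self) -> list[str]:
        return self.attachments[3:]


# Hypothetical ids; "doc:9" stands in for an extra evidence endpoint.
rel = Relation("rel:1", "DependsOn", ["repo:42", "task:a", "task:b", "doc:9"],
               {"rationale": "API contract needed first"})
```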
5. Type system and constraints
5.1 Entity Type Registry
Maintain a registry of types describing:
- kind: Atom/Complex/Relation
- allowed primary attachment kinds/types
- allowed secondary attachment kinds/types
- payload schema (optional but recommended)
- indexing / query defaults
Example (conceptual):
- Task: kind=Atom, primary must be Repository (Complex)
- Repository: kind=Complex, primary must be Domain (Complex)
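A registry like this can start as a plain in-memory mapping. The sketch below encodes the two example types plus Workstream as an allowed secondary attachment for Task; `TypeSpec`, `REGISTRY`, and `primary_allowed` are hypothetical names, and payload schemas and indexing defaults are left out.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TypeSpec:
    kind: str                                   # "Atom", "Complex" or "Relation"
    primary_types: frozenset[str]               # allowed types for attachments[0]
    secondary_types: frozenset[str] = frozenset()


REGISTRY = {
    "Task": TypeSpec("Atom", frozenset({"Repository"}), frozenset({"Workstream"})),
    "Repository": TypeSpec("Complex", frozenset({"Domain"})),
}


def primary_allowed(entity_type: str, primary_type: str) -> bool:
    """True if primary_type is a permitted primary attachment for entity_type."""
    return primary_type in REGISTRY[entity_type].primary_types
```

Because the registry is data rather than code, adding a type is a registry change and does not require touching the validation logic.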
5.2 Validation invariants (recommended minimum)
- Exactly one primary attachment (position 0).
- Primary chain must be acyclic.
- Primary attachment kind/type constraints must match the registry.
- Context-consistency constraints for organizer complexes:
  - if Task has a secondary attachment to Workstream, then Task.primary == Workstream.primary (same repository).
- Relation constraints:
  - primary must be Complex
  - endpoint types must match relation type definition
  - relation context must match endpoint context rules (usually same repo/domain)
These constraints give rigor without hard-coding a single domain model.
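A validator for some of these invariants might look like the sketch below. It assumes a simplified entity record, a hypothetical `ALLOWED_PRIMARY` table standing in for the registry, and an in-memory `store`; it covers the primary-type, relation-primary, and context-consistency checks, and leaves acyclicity and endpoint typing out.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    id: str
    kind: str                      # "Atom", "Complex" or "Relation"
    type: str
    attachments: list[str] = field(default_factory=list)


# Hypothetical registry excerpt: entity type -> allowed primary attachment types.
ALLOWED_PRIMARY = {
    "Task": {"Repository"},
    "Workstream": {"Repository"},
    "Repository": {"Domain"},
    "DependsOn": {"Repository"},
}


def validate(entity: Entity, store: dict[str, Entity]) -> list[str]:
    """Return the violated invariants as messages (an empty list means valid)."""
    if not entity.attachments:
        return ["missing primary attachment"]
    errors: list[str] = []
    primary = store[entity.attachments[0]]
    if primary.type not in ALLOWED_PRIMARY.get(entity.type, set()):
        errors.append(f"{entity.type} may not have primary {primary.type}")
    if entity.kind == "Relation" and primary.kind != "Complex":
        errors.append("relation primary must be a Complex")
    # Context consistency: a Task's Workstream must share the Task's primary.
    for ref in entity.attachments[1:]:
        target = store[ref]
        if entity.type == "Task" and target.type == "Workstream":
            if target.attachments[:1] != entity.attachments[:1]:
                errors.append(f"{target.id} belongs to a different repository")
    return errors


# Hypothetical data: two repositories, each with one workstream.
store = {
    "domain:1": Entity("domain:1", "Complex", "Domain"),
    "repo:42": Entity("repo:42", "Complex", "Repository", ["domain:1"]),
    "repo:43": Entity("repo:43", "Complex", "Repository", ["domain:1"]),
    "ws:7": Entity("ws:7", "Complex", "Workstream", ["repo:42"]),
    "ws:8": Entity("ws:8", "Complex", "Workstream", ["repo:43"]),
}
good = Entity("task:a", "Atom", "Task", ["repo:42", "ws:7"])
bad = Entity("task:b", "Atom", "Task", ["repo:42", "ws:8"])
```

Returning messages instead of raising lets a caller report every violation for an entity in one pass.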
6. Query model (domain-agnostic)
These queries exist in any domain:
6.1 Locate context
- context(entity) = walk the primary chain to the root, or to the nearest scope boundary (e.g. nearest Domain/Workspace).
6.2 Membership
- Members of a Complex: all entities with primary == complexId.
6.3 Parts of an Atom
- Parts of an Atom: all entities with primary == atomId.
6.4 Relations in a relation-space
- Relations owned by a Complex: all Relation entities with primary == complexId.
6.5 Neighborhood (graph view)
- For entity X: find all relations in the same relation-space where X appears as endpoint.
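Membership, parts, and owned relations are all the same lookup on the primary attachment, optionally filtered by kind; the neighborhood query adds an endpoint check. The sketch below assumes an in-memory `store` and two hypothetical helpers, `with_primary` and `neighborhood`; a real system would run these as indexed queries.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entity:
    id: str
    kind: str                      # "Atom", "Complex" or "Relation"
    attachments: list[str] = field(default_factory=list)


def with_primary(store: dict[str, Entity], owner_id: str,
                 kind: Optional[str] = None) -> list[str]:
    """Ids of entities whose primary attachment is owner_id, optionally by kind."""
    return sorted(
        e.id for e in store.values()
        if e.attachments[:1] == [owner_id] and (kind is None or e.kind == kind)
    )


def neighborhood(store: dict[str, Entity], entity_id: str, space_id: str) -> list[str]:
    """Ids of relations in relation-space space_id that have entity_id as an endpoint."""
    return sorted(
        e.id for e in store.values()
        if e.kind == "Relation"
        and e.attachments[:1] == [space_id]
        and entity_id in e.attachments[1:]
    )


# Hypothetical data: two tasks in one repository, linked by one relation.
store = {
    "repo:42": Entity("repo:42", "Complex", ["domain:1"]),
    "task:a": Entity("task:a", "Atom", ["repo:42"]),
    "task:b": Entity("task:b", "Atom", ["repo:42"]),
    "rel:1": Entity("rel:1", "Relation", ["repo:42", "task:a", "task:b"]),
}
print(with_primary(store, "repo:42"))              # ['rel:1', 'task:a', 'task:b']
print(with_primary(store, "repo:42", "Relation"))  # ['rel:1']
print(neighborhood(store, "task:b", "repo:42"))    # ['rel:1']
```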
7. Example domain: Ecosystem → Domain → Repository → Workstreams/SBOMs
This section makes the system concrete using the Ecosystem, Domain, Repository, Workstream, and SBOM types.
7.1 Complexes
- Ecosystem (Complex, root)
- Domain (Complex, primary = Ecosystem)
- Repository (Complex, primary = Domain)
- Workstream (Complex, primary = Repository): organizes work items
- SBOM (Complex, primary = Repository): organizes dependencies
7.2 Atoms
- Decision (Atom, primary = Repository)
- Task (Atom, primary = Repository)
- TechDebt (Atom, primary = Repository)
- Extend (Atom, primary = Repository)
- Dependency (Atom, primary = Repository)
7.3 Organizing via secondary attachments
- A Task in a Workstream: Task.attachments = [Repo42, Workstream7]
- A Dependency in an SBOM: Dependency.attachments = [Repo42, Sbom3]
Atoms remain ignorant of how the workstream orders tasks; the workstream can store structure.
7.4 Relation examples
Task → Task dependency (repo-scoped)
Relation type: DependsOn (Relation)
- DependsOn.attachments = [Repo42, TaskA, TaskB]
- payload: { critical: true, reason: "API contract needed first" }
Decision influences tasks (repo-scoped)
Relation type: Motivates (Relation)
Motivates.attachments = [Repo42, Decision9, TaskA]
Dependency graph inside an SBOM (sbom-scoped)
Relation type: Requires (Relation)
- Requires.attachments = [Sbom3, DependencyX, DependencyY]
- payload: { scope: "runtime" }
This cleanly separates:
- planning relations (Repo relation-space)
- supply-chain relations (SBOM relation-space)
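Because the relation-space is always attachments[0], separating the two kinds of relations is a simple grouping. A minimal sketch using the three example relations above, represented here as plain (type, attachments) tuples for brevity:

```python
from collections import defaultdict

# The three example relations from this section.
relations = [
    ("DependsOn", ["Repo42", "TaskA", "TaskB"]),
    ("Motivates", ["Repo42", "Decision9", "TaskA"]),
    ("Requires", ["Sbom3", "DependencyX", "DependencyY"]),
]

# Group relation types by their relation-space (attachments[0]).
by_space = defaultdict(list)
for rel_type, attachments in relations:
    by_space[attachments[0]].append(rel_type)

print(dict(by_space))  # {'Repo42': ['DependsOn', 'Motivates'], 'Sbom3': ['Requires']}
```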
8. Applying the modeling system to a new domain
You can apply this to any domain by following a small method.
8.1 Step-by-step method
Step 1 — Choose a root Complex
Pick the top-level scope:
- Workspace, Tenant, Organization, Ecosystem, etc.
Step 2 — Identify “containers” vs “content”
- Containers become Complexes (projects, folders, accounts, repositories, case files).
- Content objects become Atoms (documents, customers, invoices, tickets, assets).
Rule of thumb:
- If it organizes others or defines a scope, it’s a Complex.
- If it’s a “thing” with intrinsic content/lifecycle, it’s an Atom.
Step 3 — Define the primary hierarchy (layering)
Decide what “belongs to what” as the default place where entities live. Example pattern:
Atom.primary = nearest containing Complex
Step 4 — Define organizer complexes (optional)
Introduce complexes like Workstream, Board, Collection, SBOM, Timeline that provide structure.
Use secondary attachments from atoms to these complexes.
Step 5 — Define relation-spaces
Choose where relations live:
- typically in the “owning” complex (project/repo/case)
- sometimes in a specialized complex (SBOM, timeline, graph)
Step 6 — Create a Type Registry + constraints
For each type, specify:
- kind
- required primary attachment type(s)
- optional secondary attachment types
- allowed relation endpoints (if relation type)
Step 7 — Migrate incrementally
Start with primary attachments and identity first. Add organizer complexes and relations later without breaking identity.
9. Applying it to an existing domain with pre-existing entities
The key is to wrap existing entities as Entities in this system without rewriting them all at once.
9.1 Integration patterns
Pattern A — “Entity wrapper” over existing tables/documents
- Keep existing storage unchanged.
- Create an Entity record that references external storage:
  - payload contains { externalType, externalId, sourceSystem }
- Attachments, relations, and organization are managed in the new layer.
This is the safest “overlay” approach.
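A wrapper record for Pattern A could be built as in the sketch below. The `wrap_external` function, the `sourceSystem:externalType:externalId` id format, and the example source name are assumptions for illustration; only the three payload keys come from the pattern description.

```python
from __future__ import annotations

from typing import Any


def wrap_external(source_system: str, external_type: str, external_id: str,
                  primary_id: str, kind: str = "Atom") -> dict[str, Any]:
    """Build an overlay entity that points at a record kept in legacy storage."""
    return {
        # Assumed id scheme: namespace the legacy id so it stays stable and unique.
        "id": f"{source_system}:{external_type}:{external_id}",
        "kind": kind,
        "type": external_type,
        "payload": {
            "externalType": external_type,
            "externalId": external_id,
            "sourceSystem": source_system,
        },
        "attachments": [primary_id],   # coarse primary context to start with
    }


# Hypothetical legacy system and ids.
ticket = wrap_external("legacy_crm", "Ticket", "1234", "workspace:1")
print(ticket["id"])  # legacy_crm:Ticket:1234
```

The legacy record is never copied; only its coordinates are stored, so the overlay can be dropped or rebuilt without touching the source system.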
Pattern B — “Dual write” for new objects
- New entities are created in the new model as the source of truth.
- Optionally mirrored into legacy storage for compatibility.
Pattern C — “Progressive normalization”
- Start overlay-style.
- Gradually move the most valuable types (e.g., Tasks, Decisions) into native entities.
- Leave rarely touched legacy objects wrapped indefinitely.
9.2 Migration steps for existing data
- Assign stable IDs
  - If legacy IDs exist, reuse them with a namespace prefix.
- Create root complexes
  - e.g. one Ecosystem or per-tenant Workspace.
- Attach existing entities to a primary context
  - even if initially coarse (everything attaches to one domain/project).
- Introduce finer complexes
  - split into domains, repos/projects later by moving primary attachments.
- Add relations incrementally
  - create relation entities for the relationships you query most.
- Backfill organizer complexes
  - workstreams, boards, SBOMs, etc., via secondary attachments.
Because relations and organization are additive, you can evolve structure without breaking identity.
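Moving an entity from a coarse context to a finer one only changes position 0 of its attachment list; the id and the secondary attachments stay as they are. The sketch below assumes attachments are held in a plain id-to-list mapping and that `move_primary` is a hypothetical helper; it refuses a move that would put the entity into its own primary chain.

```python
from __future__ import annotations


def move_primary(attachments_by_id: dict[str, list[str]],
                 entity_id: str, new_primary: str) -> None:
    """Replace attachments[0] of entity_id, keeping secondary attachments."""
    # Walk up from the new primary; reaching entity_id would mean a cycle.
    seen: set[str] = set()
    current: str | None = new_primary
    while current is not None and current not in seen:
        if current == entity_id:
            raise ValueError("move would create a cycle in the primary chain")
        seen.add(current)
        chain = attachments_by_id.get(current, [])
        current = chain[0] if chain else None
    existing = attachments_by_id[entity_id]
    attachments_by_id[entity_id] = [new_primary, *existing[1:]]


# Hypothetical data: a task first attached to a catch-all domain.
atts = {
    "eco:1": [],
    "domain:all": ["eco:1"],
    "repo:42": ["domain:all"],
    "task:a": ["domain:all", "ws:7"],
}
move_primary(atts, "task:a", "repo:42")
print(atts["task:a"])  # ['repo:42', 'ws:7']
```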
10. What this system buys you
- A uniform modeling surface across domains.
- A clean separation of content (atoms) from structure (complexes + relations).
- Multiple overlapping organizations via secondary attachments without duplication.
- First-class relationships with auditability and contextual ownership.
- Incremental adoption over legacy systems.
Extension Points
A possible next step is to turn this document into a compact “spec” format (like a small RFC) plus a concrete Type Registry table for the example domain in section 7, including recommended relation types and constraints.