## Generic entity modeling system

A domain-agnostic data modeling system for organizing “entities under management” in a rigorous, flexible, and extensible way.

### Goals

* **Rigorous**: clear invariants, predictable querying, safe evolution.
* **Flexible**: new entity types and new relations without migrations that rewrite everything.
* **Extensible**: supports multiple domains, sub-domains, and incremental adoption over existing data.

---

# 1. Core concepts

## 1.1 Entity

An **Entity** is the atomic unit of identity and lifecycle.

**Entity fields (conceptual)**

* `id` (immutable unique identifier)
* `kind` ∈ {`Atom`, `Complex`, `Relation`}
* `type` (domain-specific type name, e.g. `Task`, `Repository`, `Customer`)
* `payload` (type-specific attributes; ideally versioned)
* `attachments` (ordered list of entity references)
* `meta` (timestamps, version, permissions, provenance)

### Entity kinds

* **Atom**: primary facts / content objects.
* **Complex**: organizational containers and structure owners (hierarchy, collections, indexes, contexts).
* **Relation**: first-class edge object that encodes a relationship between entities; owned by a Complex.

---

# 2. Attachments

## 2.1 Attachment list

Every entity has an ordered list:

* `attachments: [EntityRef]`
* `attachments[0]` is the **Primary Attachment**.

### Derived notion: “Part-of”

If an entity’s primary attachment is an **Atom**, then the entity is a **Part** of that Atom. This is a *classification* derived from data, not a separate stored relation.

## 2.2 Attachment roles (recommended)

To avoid ambiguity and allow validation, each attachment can optionally have a role label. Conceptually:

* `attachment = { targetId, position, role }`

Common roles:

* `primary` (implicit by position 0)
* `index` (entity appears in this complex for navigation/search)
* `provenance` (source reference)
* `tag` (classification)
* `context` (additional scope)

Ordering remains canonical; roles improve clarity and constraints.

---

# 3. Hierarchy and layering

## 3.1 Primary chain

The **Primary Chain** of an entity is obtained by repeatedly following `attachments[0]`.

**Invariant (recommended):** the primary chain must be **acyclic**.

This yields a robust layering model:

* Every entity “lives in” a context (a Complex), or is “part of” an Atom.
* You can always answer “Where does this belong?” by walking the primary chain.

## 3.2 Roots and scopes

A system should define at least one **root Complex** (e.g. `Ecosystem`, `Workspace`, `Tenant`). Every managed entity must reach a root by following primary attachments.

---

# 4. Relations as first-class entities

## 4.1 Relation entity

A **Relation** is an entity whose purpose is to define a connection among other entities.

**Key rule**

* A Relation’s **primary attachment MUST be a Complex**. That Complex is the **relation-space** (the context that “owns” the relationship).

This avoids “atoms knowing” relation details: atoms remain content, while complexes and relations hold structure.

## 4.2 Relation endpoints convention

To make relations queryable and consistent, standardize attachment slots:

* `attachments[0] = contextComplex` (primary; relation-space)
* `attachments[1] = fromEndpoint`
* `attachments[2] = toEndpoint`
* `attachments[3..] = optional extra endpoints` (evidence, via, stakeholder, etc.)

Relation semantics live in:

* `type` (e.g. `DependsOn`, `Implements`, `References`)
* and/or payload fields like `{ relType: "...", strength: ..., rationale: ... }`

---

# 5. Type system and constraints

## 5.1 Entity Type Registry

Maintain a registry of types describing:

* `kind`: Atom/Complex/Relation
* allowed primary attachment kinds/types
* allowed secondary attachment kinds/types
* payload schema (optional but recommended)
* indexing / query defaults

Example (conceptual):

* `Task`: kind=Atom, primary must be `Repository` (Complex)
* `Repository`: kind=Complex, primary must be `Domain` (Complex)

## 5.2 Validation invariants (recommended minimum)

1. **Exactly one primary attachment** (position 0) for every non-root entity; a root Complex has none.
2. **Primary chain must be acyclic**.
3. **Primary attachment kind/type constraints** must match the registry.
4. **Context-consistency constraints** for organizer complexes:
   * if `Task` has a secondary attachment to `Workstream`, then `Task.primary == Workstream.primary` (same repository).
5. **Relation constraints**:
   * primary must be Complex
   * endpoint types must match relation type definition
   * relation context must match endpoint context rules (usually same repo/domain)

These constraints give rigor without hard-coding a single domain model.

---

# 6. Query model (domain-agnostic)

These queries exist in any domain:

## 6.1 Locate context

* `context(entity)` = walk primary chain to root, or to the nearest scope boundary (e.g. nearest Domain/Workspace).

## 6.2 Membership

* Members of a Complex: all entities with `primary == complexId`.

## 6.3 Parts of an Atom

* Parts of an Atom: all entities with `primary == atomId`.

## 6.4 Relations in a relation-space

* Relations owned by a Complex: all Relation entities with `primary == complexId`.

## 6.5 Neighborhood (graph view)

* For entity X: find all relations in the same relation-space where X appears as an endpoint.

---

# 7. Example domain: Ecosystem → Domain → Repository → Workstreams/SBOMs

This section makes the system concrete with an example set of types.
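
Before turning to concrete types, the core model and the queries from section 6 can be sketched in a few lines of Python. This is a minimal in-memory sketch rather than a reference implementation: `Entity`, `Store`, and every method name are assumptions made for illustration, and attachment roles and `meta` are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Entity:
    id: str
    kind: str    # "Atom", "Complex", or "Relation"
    type: str    # domain type name, e.g. "Task"
    attachments: List[str] = field(default_factory=list)  # ordered entity ids
    payload: dict = field(default_factory=dict)

    @property
    def primary(self) -> Optional[str]:
        """attachments[0], or None for a root Complex."""
        return self.attachments[0] if self.attachments else None


class Store:
    """Hypothetical in-memory stand-in for the real storage layer."""

    def __init__(self) -> None:
        self.entities: Dict[str, Entity] = {}

    def add(self, entity: Entity) -> Entity:
        self.entities[entity.id] = entity
        return entity

    def primary_chain(self, entity_id: str) -> List[str]:
        """Follow attachments[0] up to a root; raise if the chain loops."""
        chain, seen = [], {entity_id}
        current = self.entities[entity_id].primary
        while current is not None:
            if current in seen:
                raise ValueError(f"primary chain of {entity_id} is cyclic")
            seen.add(current)
            chain.append(current)
            current = self.entities[current].primary
        return chain

    def context(self, entity_id: str, boundary_type: str) -> Optional[str]:
        """6.1: nearest ancestor on the primary chain with the given type."""
        for ancestor in self.primary_chain(entity_id):
            if self.entities[ancestor].type == boundary_type:
                return ancestor
        return None

    def children(self, parent_id: str) -> List[str]:
        """6.2 / 6.3: entities whose primary attachment is parent_id."""
        return [e.id for e in self.entities.values() if e.primary == parent_id]

    def relations_in(self, complex_id: str) -> List[str]:
        """6.4: Relation entities owned by the given Complex."""
        return [e.id for e in self.entities.values()
                if e.kind == "Relation" and e.primary == complex_id]

    def neighborhood(self, entity_id: str, space_id: str) -> List[str]:
        """6.5: relations in one relation-space with entity_id as an endpoint."""
        return [r for r in self.relations_in(space_id)
                if entity_id in self.entities[r].attachments[1:]]


# Small usage example with made-up ids.
store = Store()
store.add(Entity("eco", "Complex", "Ecosystem"))
store.add(Entity("dom", "Complex", "Domain", ["eco"]))
store.add(Entity("repo42", "Complex", "Repository", ["dom"]))
store.add(Entity("taskA", "Atom", "Task", ["repo42"]))
store.add(Entity("taskB", "Atom", "Task", ["repo42"]))
store.add(Entity("dep1", "Relation", "DependsOn", ["repo42", "taskA", "taskB"]))

print(store.primary_chain("taskA"))           # ['repo42', 'dom', 'eco']
print(store.neighborhood("taskB", "repo42"))  # ['dep1']
```

Every query is a linear scan here for brevity; a real store would presumably index entities by primary attachment and by endpoint.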

## 7.1 Complexes

* `Ecosystem` (Complex, root)
* `Domain` (Complex, primary = Ecosystem)
* `Repository` (Complex, primary = Domain)
* `Workstream` (Complex, primary = Repository): organizes work items
* `SBOM` (Complex, primary = Repository): organizes dependencies

## 7.2 Atoms

* `Decision` (Atom, primary = Repository)
* `Task` (Atom, primary = Repository)
* `TechDebt` (Atom, primary = Repository)
* `Extend` (Atom, primary = Repository)
* `Dependency` (Atom, primary = Repository)

## 7.3 Organizing via secondary attachments

* A Task in a Workstream:
  * `Task.attachments = [Repo42, Workstream7]`
* A Dependency in an SBOM:
  * `Dependency.attachments = [Repo42, Sbom3]`

Atoms remain ignorant of *how* the workstream orders tasks; the workstream can store structure.

## 7.4 Relation examples

### Task → Task dependency (repo-scoped)

Relation type: `DependsOn` (Relation)

* `DependsOn.attachments = [Repo42, TaskA, TaskB]`
* payload: `{ critical: true, reason: "API contract needed first" }`

### Decision influences tasks (repo-scoped)

Relation type: `Motivates` (Relation)

* `Motivates.attachments = [Repo42, Decision9, TaskA]`

### Dependency graph inside an SBOM (sbom-scoped)

Relation type: `Requires` (Relation)

* `Requires.attachments = [Sbom3, DependencyX, DependencyY]`
* payload: `{ scope: "runtime" }`

This cleanly separates:

* planning relations (Repo relation-space)
* supply-chain relations (SBOM relation-space)

---

# 8. Applying the modeling system to a new domain

You can apply this to any domain by following a small method.

## 8.1 Step-by-step method

### Step 1: Choose a root Complex

Pick the top-level scope:

* `Workspace`, `Tenant`, `Organization`, `Ecosystem`, etc.

### Step 2: Identify “containers” vs “content”

* Containers become **Complexes** (projects, folders, accounts, repositories, case files).
* Content objects become **Atoms** (documents, customers, invoices, tickets, assets).

Rule of thumb:

* If it *organizes* others or defines a scope, it’s a Complex.
* If it’s a “thing” with intrinsic content/lifecycle, it’s an Atom.

### Step 3: Define the primary hierarchy (layering)

Decide what “belongs to what” as the default place where entities live.

Example pattern:

* `Atom.primary = nearest containing Complex`

### Step 4: Define organizer complexes (optional)

Introduce complexes like `Workstream`, `Board`, `Collection`, `SBOM`, `Timeline` that provide structure. Use **secondary attachments** from atoms to these complexes.

### Step 5: Define relation-spaces

Choose where relations live:

* typically in the “owning” complex (project/repo/case)
* sometimes in a specialized complex (SBOM, timeline, graph)

### Step 6: Create a Type Registry + constraints

For each type, specify:

* kind
* required primary attachment type(s)
* optional secondary attachment types
* allowed relation endpoints (if relation type)

### Step 7: Migrate incrementally

Start with primary attachments and identity first. Add organizer complexes and relations later without breaking identity.

---

# 9. Applying it to an existing domain with pre-existing entities

The key is to **wrap** existing entities as Entities in this system without rewriting them all at once.

## 9.1 Integration patterns

### Pattern A: “Entity wrapper” over existing tables/documents

* Keep existing storage unchanged.
* Create an `Entity` record that references external storage:
  * payload contains `{ externalType, externalId, sourceSystem }`
* Attachments, relations, and organization are managed in the new layer.

This is the safest “overlay” approach.

### Pattern B: “Dual write” for new objects

* New entities are created in the new model as the source of truth.
* Optionally mirrored into legacy storage for compatibility.

### Pattern C: “Progressive normalization”

* Start overlay-style.
* Gradually move the most valuable types (e.g., Tasks, Decisions) into native entities.
* Leave rarely touched legacy objects wrapped indefinitely.

## 9.2 Migration steps for existing data

1. **Assign stable IDs**
   * If legacy IDs exist, reuse them with a namespace prefix.
2. **Create root complexes**
   * e.g. one `Ecosystem` or per-tenant `Workspace`.
3. **Attach existing entities to a primary context**
   * even if initially coarse (everything attaches to one domain/project).
4. **Introduce finer complexes**
   * split into domains, repos/projects later by moving primary attachments.
5. **Add relations incrementally**
   * create relation entities for the relationships you query most.
6. **Backfill organizer complexes**
   * workstreams, boards, SBOMs, etc., via secondary attachments.

Because relations and organization are additive, you can evolve structure without breaking identity.

---

# 10. What this system buys you

* A **uniform modeling surface** across domains.
* A **clean separation** of content (atoms) from structure (complexes + relations).
* **Multiple overlapping organizations** via secondary attachments without duplication.
* **First-class relationships** with auditability and contextual ownership.
* **Incremental adoption** over legacy systems.

## Extension Points

This system could be turned into a compact “spec” format (like a small RFC), plus a concrete “Type Registry” table for the example domain (including recommended relation types and constraints).
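
The “Type Registry” table mentioned above could start out as a small Python mapping plus a validator. The sketch below is one possible shape, not a settled format: `TypeSpec`, `REGISTRY`, `Entities`, and `validate` are hypothetical names, and the registry entries are a subset of the example types from section 7. It checks the primary-attachment rules, the allowed secondary and endpoint types, and the context-consistency rule for Atoms from section 5.2. Acyclicity of the primary chain and context rules for relation endpoints are deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TypeSpec:
    kind: str                        # "Atom", "Complex", or "Relation"
    primary: Optional[str]           # required type of attachments[0]; None for a root
    secondary: Tuple[str, ...] = ()  # allowed types for attachments[1:]


# Assumed registry contents, taken from the example domain.
REGISTRY: Dict[str, TypeSpec] = {
    "Ecosystem": TypeSpec("Complex", None),
    "Domain": TypeSpec("Complex", "Ecosystem"),
    "Repository": TypeSpec("Complex", "Domain"),
    "Workstream": TypeSpec("Complex", "Repository"),
    "SBOM": TypeSpec("Complex", "Repository"),
    "Task": TypeSpec("Atom", "Repository", ("Workstream",)),
    "Dependency": TypeSpec("Atom", "Repository", ("SBOM",)),
    "DependsOn": TypeSpec("Relation", "Repository", ("Task",)),
    "Requires": TypeSpec("Relation", "SBOM", ("Dependency",)),
}

# id -> (type name, ordered attachment ids)
Entities = Dict[str, Tuple[str, List[str]]]


def validate(entity_id: str, entities: Entities) -> List[str]:
    """Return the violated rules for one entity (empty list means valid)."""
    type_name, attachments = entities[entity_id]
    spec = REGISTRY.get(type_name)
    if spec is None:
        return [f"unknown type {type_name}"]
    if spec.primary is None:
        return ["root must not have attachments"] if attachments else []
    if not attachments:
        return ["missing primary attachment"]
    errors = []
    primary_type = entities[attachments[0]][0]
    if primary_type != spec.primary:
        errors.append(f"primary must be {spec.primary}, got {primary_type}")
    if spec.kind == "Relation" and len(attachments) < 3:
        errors.append("relation needs a from and a to endpoint")
    for target in attachments[1:]:
        target_type, target_attachments = entities[target]
        if target_type not in spec.secondary:
            errors.append(f"{target_type} not allowed as secondary attachment")
        elif spec.kind == "Atom" and target_attachments[:1] != attachments[:1]:
            # Context consistency: Atom and organizer must share a primary.
            errors.append(f"{target} lives in a different context")
    return errors


# Made-up sample data: taskC is attached to a workstream of another repository.
entities: Entities = {
    "eco": ("Ecosystem", []),
    "dom": ("Domain", ["eco"]),
    "repo42": ("Repository", ["dom"]),
    "repo43": ("Repository", ["dom"]),
    "ws7": ("Workstream", ["repo42"]),
    "taskA": ("Task", ["repo42", "ws7"]),
    "taskB": ("Task", ["repo42"]),
    "taskC": ("Task", ["repo43", "ws7"]),
    "dep1": ("DependsOn", ["repo42", "taskA", "taskB"]),
}

print(validate("taskA", entities))  # []
print(validate("taskC", entities))  # ['ws7 lives in a different context']
```

Returning a list of messages instead of raising on the first failure keeps the validator usable during incremental migration, where a report of every violation in legacy data is often more useful than a hard stop.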