generated from coulomb/repo-seed
# ResearchSeed.md

# identity-canon Research Seed

This file captures the initial research seeding information for `identity-canon`.

The research goal is to distill a canonical terminology and conceptual data model for identity, user, organization, community, tenant, and relationship management in complex systems that are multi-tenant, multi-vendor, multi-community, and multi-user capable.

The model should support enterprises with sub-organizations, social communities, social-media follower graphs, single users, family entities, spontaneous interest groups, bots, service accounts, AI agents, and weak/strong synonymity between identity records.

## Initial Framing

The project should not start from a simple `user` table or from a classic `users + groups + roles` IAM schema.

A more robust canonical core is a graph of:

- actors;
- identities;
- accounts;
- identifiers;
- profiles;
- personas;
- scopes;
- tenants;
- organizations;
- communities;
- families/households;
- memberships;
- relationships;
- credentials;
- claims;
- evidence;
- synonymity assertions.

Classic IAM systems, social networks, enterprise directories, family accounts, communities, vendors, customers, and spontaneous groups can then be modeled as specializations or patterns over that graph.

## Important Research Domains

## 1. Identity Provisioning and Directory Models

Important sources:

- SCIM 2.0: RFC 7643 and RFC 7644;
- LDAP and inetOrgPerson: RFC 4519 and RFC 2798;
- Keycloak Organizations;
- ZITADEL organizations and projects;
- Ory Kratos and Keto.

Research focus:

- provisioning semantics;
- users and groups;
- organization/member terminology;
- directory assumptions;
- account lifecycle;
- separation between identity management and authorization.

SCIM is especially important as a provisioning baseline because it defines platform-neutral schemas and protocol operations for user and group resources.

LDAP and inetOrgPerson remain important because lightweight IAM stacks and enterprise systems still inherit LDAP-style person, organizational unit, and group terminology.

Keycloak and ZITADEL provide live multi-tenant IAM product vocabularies. Ory is useful because it separates identity management from authorization.
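
As a concrete reference point for the SCIM baseline, the following sketch projects a local account record onto a minimal SCIM 2.0 core User resource. The schema URN and the `externalId`, `userName`, `emails`, and `active` attributes come from RFC 7643. The `LocalAccount` type, its field names, and the sample values are assumptions made for this illustration and are not part of any existing `identity-canon` design.

```python
from dataclasses import dataclass

# Core User schema URN defined by RFC 7643.
SCIM_USER_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:User"


@dataclass(frozen=True)
class LocalAccount:
    """Hypothetical system-local account record (illustrative only)."""
    account_id: str
    login_name: str
    email: str
    enabled: bool


def to_scim_user(account: LocalAccount) -> dict:
    """Project a local account onto a minimal SCIM core User resource."""
    return {
        "schemas": [SCIM_USER_SCHEMA],
        "externalId": account.account_id,
        "userName": account.login_name,
        "emails": [{"value": account.email, "primary": True}],
        "active": account.enabled,
    }


scim_user = to_scim_user(
    LocalAccount("acc-001", "bernd", "bernd@example.com", True)
)
```

The SCIM resource describes only the account. In the canonical graph, the natural person operating that account would remain a separate node.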

## 2. Authentication and Federation

Important sources:

- OpenID Connect Core;
- SAML 2.0;
- NIST SP 800-63-4;
- OpenID Shared Signals, CAEP, and RISC.

Research focus:

- issuer and subject concepts;
- pairwise and public subject identifiers;
- authentication assurance;
- federation assurance;
- assertions and claims;
- risk and security event streams;
- account linking and pseudonymous identifiers.

OIDC is central because externally issued subject identifiers and pairwise identifiers directly affect synonymity and account-linking semantics.

SAML remains important for enterprise federation and assertion semantics.

NIST identity guidance is useful for separating identity proofing, authentication assurance, federation assurance, and lifecycle management.

Shared Signals, CAEP, and RISC suggest that canonical identity models should also anticipate dynamic security and lifecycle events.
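
To make the pairwise subject identifiers from this section concrete, here is a minimal sketch of one derivation that OpenID Connect Core lists as an example: `sub = SHA-256(sector_identifier || local_account_id || salt)`. The function name, the sample sector identifiers, and the salt value are assumptions for illustration. A real issuer would keep the salt secret and would likely use an unambiguous encoding rather than plain concatenation.

```python
import hashlib


def pairwise_sub(sector_identifier: str, local_account_id: str, salt: bytes) -> str:
    """Derive a pairwise subject as SHA-256(sector_identifier || local_account_id || salt)."""
    digest = hashlib.sha256()
    digest.update(sector_identifier.encode("utf-8"))
    digest.update(local_account_id.encode("utf-8"))
    digest.update(salt)
    return digest.hexdigest()


# Assumption: a salt known only to the issuer.
SALT = b"issuer-secret-salt"

# The same local account yields a different subject per relying party.
sub_rp_a = pairwise_sub("rp-a.example", "account-42", SALT)
sub_rp_b = pairwise_sub("rp-b.example", "account-42", SALT)
```

Because the two relying parties see different `sub` values for the same account, they cannot be linked by comparing identifiers. Any link between them has to be an explicit, scoped assertion.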

## 3. Social Graph and Community Models

Important sources:

- ActivityPub;
- FOAF;
- WebID;
- Solid profiles;
- Schema.org Person and Organization.

Research focus:

- actors;
- followers/following;
- public profiles;
- handles;
- accounts on federated servers;
- communities;
- groups;
- social relationships;
- semantic vocabularies for persons and organizations.

ActivityPub is especially relevant because it treats users as server-side actors with inboxes and outboxes. A person may have several actors across servers, which maps well to contextual identities and personas.

FOAF and Schema.org are useful because they distinguish persons, agents, organizations, groups, accounts, and membership-like properties.

WebID/Solid are useful for user-controlled profiles and decentralized identity-style profile discovery.
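
The point about one person holding several ActivityPub actors can be sketched as follows. The `@context` value and the `type`, `id`, `preferredUsername`, `inbox`, and `outbox` properties follow the ActivityPub actor object. The `/users/<handle>` URL layout, the server names, and the `personas_of` mapping are assumptions for illustration only.

```python
ACTIVITYSTREAMS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def make_actor(server: str, handle: str) -> dict:
    """Build a minimal ActivityPub actor document (URL layout is assumed)."""
    base = f"https://{server}/users/{handle}"
    return {
        "@context": ACTIVITYSTREAMS_CONTEXT,
        "type": "Person",
        "id": base,
        "preferredUsername": handle,
        "inbox": f"{base}/inbox",
        "outbox": f"{base}/outbox",
    }


# Two server-side actors that belong to the same natural person.
actors = [
    make_actor("social.example", "bernd"),
    make_actor("art.example", "b_sketches"),
]

# Hypothetical canonical-side mapping: one person, several contextual actors.
personas_of = {"person:bernd": [actor["id"] for actor in actors]}
```

Neither server needs to know about the other actor. The association lives only in the canonical model, where it can be scoped or kept private.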

## 4. Authorization and Relationship Semantics

Important sources:

- Google Zanzibar;
- OpenFGA;
- Cedar;
- AWS Verified Permissions;
- Cerbos.

Research focus:

- relationship-based authorization;
- subject-relation-object tuples;
- principals;
- resources;
- actions;
- context;
- roles vs permissions vs relationships;
- delegated administration.

Zanzibar/OpenFGA-style relationship tuples are especially close to what `identity-canon` needs for memberships, ownership, representation, delegation, family roles, community moderation, vendor/customer relationships, and tenant administration.

Cedar’s principal-action-resource-context distinction is useful for preserving orthogonality between identity, action, resource, and request context.
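
The subject-relation-object tuples discussed in this section can be illustrated with a small in-memory check. This is a simplified sketch, not the Zanzibar or OpenFGA API: the `RelationTuple` type, the identifier prefixes, and the single hard-coded rule ("a member of a collective inherits the collective's relations") are assumptions. Real systems express such rules in a configurable authorization model.

```python
from typing import NamedTuple


class RelationTuple(NamedTuple):
    subject: str   # e.g. "person:bernd" or a collective such as "org:binect"
    relation: str  # e.g. "member", "admin"
    obj: str       # e.g. "org:binect", "tenant:t1"


TUPLES = {
    RelationTuple("person:bernd", "member", "org:binect"),
    RelationTuple("org:binect", "admin", "tenant:t1"),
}


def check(subject, relation, obj, tuples=TUPLES, _seen=None):
    """Return True if subject holds relation on obj, directly or via membership."""
    seen = set() if _seen is None else _seen
    if subject in seen:  # guard against membership cycles
        return False
    seen.add(subject)
    if RelationTuple(subject, relation, obj) in tuples:
        return True
    # Indirect case: the subject is a member of a collective holding the relation.
    return any(
        check(t.obj, relation, obj, tuples, seen)
        for t in tuples
        if t.subject == subject and t.relation == "member"
    )


bernd_is_admin = check("person:bernd", "admin", "tenant:t1")
```

Here `bernd_is_admin` is derived from two stored facts rather than stored itself, which keeps membership and tenant administration as separate, independently revocable edges.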

## 5. Decentralized Identity and Verifiable Claims

Important sources:

- W3C DID Core;
- W3C Verifiable Credentials Data Model 2.0;
- OpenID for Verifiable Credentials.

Research focus:

- decentralized identifiers;
- DID subjects and controllers;
- verification methods;
- claims;
- issuers;
- holders;
- verifiers;
- presentations;
- portable identity claims;
- externally controlled identifiers.

DID and Verifiable Credentials are relevant when identity, membership, authorization, or representation claims are issued outside the platform.

The canonical model should distinguish claims from verified facts and should preserve issuer, evidence, scope, validity, and revocation state.
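
One way to read the distinction between claims and verified facts is that a claim is always stored, but it is only accepted while certain conditions hold. The sketch below assumes a hypothetical `Claim` record and an `is_accepted` rule (trusted issuer, evidence present, inside the validity window, not revoked). The field names and the acceptance rule are illustrative assumptions, not a settled design.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Claim:
    subject: str
    statement: str
    issuer: str
    scope: str
    evidence_ref: Optional[str]
    valid_from: datetime
    valid_until: Optional[datetime]
    revoked: bool = False


def is_accepted(claim: Claim, trusted_issuers: set, at: datetime) -> bool:
    """Treat a claim as a verified fact only while every condition holds."""
    if claim.revoked or claim.evidence_ref is None:
        return False
    if claim.issuer not in trusted_issuers:
        return False
    if at < claim.valid_from:
        return False
    return claim.valid_until is None or at < claim.valid_until


membership = Claim(
    subject="person:p",
    statement="member_of org:o",
    issuer="did:example:issuer",
    scope="tenant:t1",
    evidence_ref="vc:123",
    valid_from=datetime(2024, 1, 1, tzinfo=timezone.utc),
    valid_until=datetime(2026, 1, 1, tzinfo=timezone.utc),
)
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
accepted = is_accepted(membership, {"did:example:issuer"}, now)
```

The claim record itself never changes when trust changes. Only the evaluation result does, which preserves the audit trail.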

## 6. Entity Resolution, Synonymity, and Privacy

Important sources:

- deterministic matching;
- probabilistic matching;
- entity resolution and record linkage literature;
- GDPR pseudonymization and anonymization guidance.

Research focus:

- weak identity matches;
- strong identity links;
- scoped identity equivalence;
- operational account linking;
- legal identity links;
- privacy-preserving links;
- source and evidence;
- confidence;
- revocation;
- GDPR implications.

The model should avoid treating identity linkage as a destructive merge. Instead, synonymity should be modeled as an assertion with strength, scope, source, evidence, confidence, validity, and revocation state.
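
A minimal sketch of non-destructive linkage: records are never merged, and clusters are computed on demand from the assertions that are currently active. The `Assertion` type, the identifier strings, and the rule "only strong or authoritative assertions join a cluster" are assumptions for illustration. The sketch also assumes every `source` and `target` appears in `records`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    source: str
    target: str
    strength: str  # "weak" | "medium" | "strong" | "authoritative"
    revoked: bool = False


def resolve_clusters(records, assertions, accepted=("strong", "authoritative")):
    """Group record ids by active assertions without modifying any record."""
    parent = {record: record for record in records}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a in assertions:
        if not a.revoked and a.strength in accepted:
            parent[find(a.source)] = find(a.target)

    clusters = {}
    for record in records:
        clusters.setdefault(find(record), set()).add(record)
    return sorted(sorted(cluster) for cluster in clusters.values())


records = ["acct:a", "acct:b", "email:e"]
linked = resolve_clusters(records, [
    Assertion("acct:a", "acct:b", "strong"),
    Assertion("email:e", "acct:a", "weak"),
])
unlinked = resolve_clusters(records, [
    Assertion("acct:a", "acct:b", "strong", revoked=True),
])
```

Revoking the assertion splits the cluster again with no data loss, which a destructive merge could not offer.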

## Terminology Challenge

Many common terms are overloaded:

| Term | Common Meanings | Modeling Risk |
| --- | --- | --- |
| User | Human, account, login principal, profile, customer record, app user | Collapses person, account, and actor |
| Account | Login credential set, billing account, social media handle, tenant account | Collapses authentication and business relationship |
| Organization | Legal entity, tenant, department, team, community, vendor, customer | Collapses legal structure, membership scope, and operational boundary |
| Group | LDAP group, social group, permission group, family, team, community | Collapses social grouping and authorization grouping |
| Role | Job function, permission bundle, relationship label, social role | Collapses semantics, permissions, and responsibility |
| Identity | Real-world personhood, credentialed subject, account identity, profile | Collapses entity, claim, authenticator, and identifier |
| Principal | Human user, service account, agent, organization acting entity | Good for authorization, too narrow for social modeling |
| Tenant | Isolation boundary, customer organization, billing unit, realm | Collapses infrastructure boundary and social/legal actor |

The key design move is to stop using `user` as the root concept.

## Candidate Canonical Vocabulary

## Entity and Actor Layer

### Entity

Anything that can be referred to as a modeled thing: person, organization, family, community, bot, service, account, resource, project, domain, or device.

### Actor

An entity capable of intentional or delegated action in a system. Examples include human persons, organizations acting through representatives, AI agents, service accounts, and community bots.

### Natural Person

A human being. This should not be identical to `user`, because a person can have many accounts, profiles, personas, and relationships.

### Collective Actor

A group-like actor that can act collectively or be represented by members/admins. Subtypes include enterprise, department, family, community, interest group, vendor, customer tenant, and project team.

### Artificial Actor

A bot, service account, automation, coding agent, or autonomous agent.

## Identity and Account Layer

### Identity

A claim-bearing representation of an actor in a context. An actor can have multiple identities.

### Identifier

A value used to refer to an identity or entity: UUID, email address, username, OIDC subject, SAML NameID, DID, domain name, phone number, employee number.

### Account

A system-local operational identity used for login, profile, preferences, sessions, and credentials.

### Profile

A presentation surface of an identity or account. A profile may be public, private, tenant-local, app-local, community-local, or audience-specific.

### Persona

A deliberate contextual identity expression of an actor. Examples include private person, employee persona, admin persona, and pseudonymous community handle.

### Credential

Something used to authenticate or prove a claim: password, passkey, certificate, TOTP seed, recovery factor, verifiable credential, or domain ownership proof.

### Authenticator

The concrete authentication factor or mechanism bound to an account/subscriber.

## Scope and Tenancy Layer

### Scope

A bounded context in which identifiers, memberships, roles, policies, and profile data have meaning.

### Tenant

A scope with operational isolation and delegated administration. A tenant may be backed by an organization, family, community, individual, vendor, or platform unit.

### Realm / Identity Domain

A hard identity boundary with separate users, credentials, clients, policies, and lifecycle.

### Organization

A structured collective actor with governance, membership, and possibly sub-organizations. It may or may not be a legal entity.

### Legal Entity

An organization recognized by a jurisdiction. Not every organization, community, or team is a legal entity.

### Community

A collective actor primarily organized by shared interest, social graph, participation, or moderation rules rather than employment/legal hierarchy.

### Household / Family

A collective actor organized around family/household relationships, guardianship, shared resources, and dependent accounts.

### Spontaneous Group

A lightweight collective actor created ad hoc around temporary interest, event, project, or conversation.

Important distinction: tenant, organization, and community must not be synonyms. A tenant is an operational boundary. An organization, community, or family is a social/legal actor that may own or inhabit a tenant.

## Relationship Layer

### Relationship

A typed edge between entities, actors, accounts, scopes, resources, or other modeled concepts.

### Membership

A relationship where an actor participates in a collective actor or scope.

### Affiliation

A looser relationship indicating association without necessarily implying membership, authority, or access.

### Representation

A relationship where one actor can act on behalf of another.

### Delegation

A scoped, revocable grant of authority from one actor to another.

### Administration

A delegated authority to manage lifecycle, membership, policy, or resources in a scope.

### Ownership

A strong control or responsibility relationship over an entity, resource, or scope. This may require legal, operational, and data-control subtypes.

### Follower Relationship

A directional social relationship expressing subscription or attention, not necessarily trust, membership, or permission.

### Trust Relationship

A relationship where one actor accepts claims, credentials, or decisions from another actor under defined conditions.

## Role and Capability Layer

### Role

A named relationship pattern in a scope. Examples include member, owner, moderator, billing admin, guardian, employee, and vendor admin.

### Capability

An ability to perform an action, usually derived from roles, policies, relationships, credentials, or explicit grants.

### Permission

A concrete allowed action on a resource type or instance.

### Policy

A rule that derives permissions or capabilities from relationships, attributes, credentials, and context.

This prevents the classic collapse of role, group, permission bundle, and job title.
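
The separation of role, policy, and permission in this layer can be sketched with three distinct structures: role assignments are scoped relationships, the policy is a rule table, and permissions are derived rather than stored. The role names, permission strings, and the `permissions_for` helper are illustrative assumptions.

```python
from typing import NamedTuple


class RoleAssignment(NamedTuple):
    actor: str
    role: str
    scope: str


# Hypothetical policy: which permissions each role yields.
ROLE_POLICY = {
    "member": {"post:create"},
    "moderator": {"post:hide", "member:mute"},
    "billing_admin": {"invoice:read", "invoice:pay"},
}


def permissions_for(actor, scope, assignments, policy=ROLE_POLICY):
    """Derive permissions from the actor's roles in exactly one scope."""
    granted = set()
    for assignment in assignments:
        if assignment.actor == actor and assignment.scope == scope:
            granted |= policy.get(assignment.role, set())
    return granted


ASSIGNMENTS = [
    RoleAssignment("person:bernd", "member", "community:c1"),
    RoleAssignment("person:bernd", "moderator", "community:c1"),
    RoleAssignment("person:bernd", "billing_admin", "tenant:t1"),
]

in_community = permissions_for("person:bernd", "community:c1", ASSIGNMENTS)
```

Changing what a moderator may do means editing the policy table only. The role assignments, which describe relationships, stay untouched.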

## Synonymity and Identity Resolution Layer

### Strong Synonymity

Two identifiers, accounts, or identities are asserted to refer to the same underlying actor with high confidence and strong evidence.

Examples:

- same verified OIDC subject from the same issuer;
- account explicitly linked after re-authentication;
- verifiable credential bound to the same DID/controller.

### Weak Synonymity

Two records may refer to the same actor based on partial, contextual, or probabilistic evidence.

Examples:

- same email seen in imported CSV and social profile;
- matching name/domain;
- same account handle without explicit proof.

### Scoped Synonymity

Two identifiers are treated as equivalent only within a defined context.

Example:

- a pairwise OIDC subject mapped to a local account for one relying party.

### Operational Link

A system-level account link used for convenience, not necessarily a real-world identity assertion.

### Legal Identity Link

A stronger assertion that may support contracts, billing, employment, guardianship, or compliance.

### Privacy-Preserving Link

A link that enables continuity without exposing global identity.

Examples:

- pairwise identifiers;
- pseudonymous handles;
- tenant-local subjects.

## Synonymity Assertion Fields

A synonymity assertion should carry at least:

```text
source
target
relation_type: same_as | probably_same_as | linked_to | represents | controls | acts_for
strength: weak | medium | strong | authoritative
scope
evidence
issuer/source_system
created_at
valid_from / valid_until
revocation_state
privacy_classification
```
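
The field list above could be typed roughly as follows. The enum members mirror the `relation_type` and `strength` values in the list. Everything else is an assumption for illustration: the `"active"` and `"internal"` defaults, the string-typed references, and the `is_strong_link` helper.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum, IntEnum
from typing import Optional


class RelationType(Enum):
    SAME_AS = "same_as"
    PROBABLY_SAME_AS = "probably_same_as"
    LINKED_TO = "linked_to"
    REPRESENTS = "represents"
    CONTROLS = "controls"
    ACTS_FOR = "acts_for"


class Strength(IntEnum):
    # Ordered so that thresholds such as ">= STRONG" can be expressed.
    WEAK = 1
    MEDIUM = 2
    STRONG = 3
    AUTHORITATIVE = 4


@dataclass(frozen=True)
class SynonymityAssertion:
    source: str
    target: str
    relation_type: RelationType
    strength: Strength
    scope: str
    evidence: str
    issuer: str
    created_at: datetime
    valid_from: datetime
    valid_until: Optional[datetime] = None
    revocation_state: str = "active"          # assumed vocabulary
    privacy_classification: str = "internal"  # assumed vocabulary


def is_strong_link(assertion: SynonymityAssertion) -> bool:
    """True for unrevoked assertions at or above the STRONG threshold."""
    return (
        assertion.revocation_state == "active"
        and assertion.strength >= Strength.STRONG
    )


t0 = datetime(2025, 1, 1, tzinfo=timezone.utc)
oidc_link = SynonymityAssertion(
    source="oidc:sub123", target="account:u",
    relation_type=RelationType.SAME_AS, strength=Strength.STRONG,
    scope="rp:r", evidence="re-authentication", issuer="idp.example",
    created_at=t0, valid_from=t0,
)
```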

## Initial Conceptual Model Shape

```text
Entity
├─ Actor
│  ├─ NaturalPerson
│  ├─ CollectiveActor
│  │  ├─ Organization
│  │  ├─ LegalEntity
│  │  ├─ Community
│  │  ├─ FamilyOrHousehold
│  │  └─ SpontaneousGroup
│  └─ ArtificialActor
│     ├─ ServiceAccount
│     ├─ Bot
│     └─ Agent
├─ Account
├─ Profile
├─ Credential
├─ Resource
└─ Scope
   ├─ Tenant
   ├─ Realm
   ├─ OrganizationScope
   ├─ CommunityScope
   └─ ApplicationScope
```
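
Part of this tree can be mirrored as a class hierarchy to check that the taxonomy keeps tenants and actors apart. This is a sketch covering only a subset of the nodes. The constructor signatures, the `backed_by` attribute (taken from the wording "a tenant may be backed by an organization"), and the sample names are assumptions.

```python
class Entity:
    """Anything that can be referred to as a modeled thing."""

    def __init__(self, name: str) -> None:
        self.name = name


class Actor(Entity):
    """An entity capable of intentional or delegated action."""


class NaturalPerson(Actor):
    """A human being."""


class CollectiveActor(Actor):
    """A group-like actor."""


class Organization(CollectiveActor):
    """A structured collective actor."""


class LegalEntity(CollectiveActor):
    """An organization recognized by a jurisdiction."""


class Scope(Entity):
    """A bounded context in which statements have meaning."""


class Tenant(Scope):
    """A scope with operational isolation, backed by some actor."""

    def __init__(self, name: str, backed_by: Actor) -> None:
        super().__init__(name)
        self.backed_by = backed_by


binect = Organization("Binect")
tenant = Tenant("binect-tenant", backed_by=binect)
```

The tenant is a `Scope` and merely references the organization, so the two concepts cannot be used interchangeably by accident.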

## Initial Relationship Model Shape

```text
Relationship
  subject_entity_id
  relation_type
  object_entity_id
  scope_id
  source
  evidence_ref
  strength
  status
  valid_from
  valid_until
  metadata
```

## Example Statements the Model Should Express

```text
Bernd is member of Binect
Binect is sub-organization of Whynot GmbH
User account A is operated by Bernd
ActivityPub actor @x follows @y
Child account C is represented by guardian G
Vendor tenant V provides application App1
Customer tenant C consumes application App1
Service account S acts for organization O in scope T
OIDC subject sub123 is strongly linked to local account U in relying-party scope R
Email e@example.com is weakly linked to person P based on imported evidence
```
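
Several of these statements can be encoded as records that use a subset of the relationship fields (`subject_entity_id`, `relation_type`, `object_entity_id`, `scope_id`, `strength`, `status`). The identifier prefixes such as `person:` and `org:`, the relation names, the default values, and the `find` helper are assumptions chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Relationship:
    subject_entity_id: str
    relation_type: str
    object_entity_id: str
    scope_id: Optional[str] = None
    strength: str = "strong"
    status: str = "active"


STATEMENTS = [
    Relationship("person:bernd", "member_of", "org:binect"),
    Relationship("org:binect", "sub_organization_of", "org:whynot-gmbh"),
    Relationship("account:a", "operated_by", "person:bernd"),
    Relationship("actor:@x", "follows", "actor:@y"),
    Relationship("account:c", "represented_by", "person:g"),
    Relationship("service:s", "acts_for", "org:o", scope_id="tenant:t"),
    Relationship("oidc:sub123", "linked_to", "account:u", scope_id="rp:r"),
    Relationship("email:e@example.com", "linked_to", "person:p", strength="weak"),
]


def find(statements, *, subject=None, relation_type=None, strength=None):
    """Filter statements; a criterion left as None matches everything."""
    return [
        s for s in statements
        if (subject is None or s.subject_entity_id == subject)
        and (relation_type is None or s.relation_type == relation_type)
        and (strength is None or s.strength == strength)
    ]


weak_links = find(STATEMENTS, relation_type="linked_to", strength="weak")
```

Both the strong OIDC link and the weak email link share one `relation_type`. They differ only in `strength` and `scope_id`, which is the distinction the synonymity layer relies on.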

## CLI/UI Implications for Later Work

Although `identity-canon` is not an implementation repository, the model should later support convenient CLI/UI workflows such as:

```text
create-person
create-organization
create-community
create-family
create-spontaneous-group
create-tenant-for-actor
invite-member
link-account
claim-domain
assign-admin
delegate-authority
create-service-account
create-agent
add-follower-edge
assert-synonymity
review-synonymity
revoke-link
export-scim
sync-ldap
provision-keycloak
```

These workflows should remain downstream implementation concerns.

## Working Hypothesis

A strong canonical model can be based on five orthogonal primitives:

```text
Actor         who/what can act
Identity      how an actor is represented or claimed in a context
Scope         where a statement has meaning
Relationship  how modeled things are connected
Evidence      why a statement is trusted
```

From these, operational IAM concepts can be derived:

```text
User      = account/identity used by a natural person in a scope
Tenant    = operational scope with delegated administration
Group     = collective actor or membership set, depending on context
Role      = named relationship/policy pattern in a scope
Org       = structured collective actor
Community = participatory collective actor
Family    = household/kinship collective actor
```
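
As an illustration of the first derivation, "user" can be computed as a view over the graph instead of being stored as a root record. The lookup tables below stand in for actor kinds, scope placement, and `operated_by` relationships. Their names, contents, and the `users_in_scope` function are assumptions for this sketch.

```python
# Hypothetical graph fragments: entity kinds, account scopes, and operators.
KIND = {
    "person:bernd": "natural_person",
    "service:s": "service_account",
}
ACCOUNT_SCOPE = {
    "account:a": "tenant:t1",
    "account:svc": "tenant:t1",
}
OPERATED_BY = {
    "account:a": "person:bernd",
    "account:svc": "service:s",
}


def users_in_scope(scope: str) -> list:
    """Return accounts in the scope whose operator is a natural person."""
    return sorted(
        account
        for account, operator in OPERATED_BY.items()
        if ACCOUNT_SCOPE.get(account) == scope
        and KIND.get(operator) == "natural_person"
    )


tenant_users = users_in_scope("tenant:t1")
```

The service account lives in the same tenant but does not count as a user, because "user" is defined here by the operator's kind rather than by the existence of an account.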

## Research Direction

The next step is to populate the source-stack notes, extract terminology from each source, and create:

- terminology inventory;
- terminology conflict map;
- canonical glossary;
- concept cards;
- scenario tests;
- conceptual model;
- synonymity model;
- scope model;
- downstream recommendations.