# Postgres Durable Store Consumer Requirements
Status: requirements + store contract boundary

Date: 2026-06-15

Related workplan: USER-WP-0009
## Purpose
This document defines what `user-engine` needs from a durable Postgres-backed
store, from the consumer side. It intentionally does not design or implement
the Postgres provider. The expected direction is that an independent
NetKingdom infrastructure repository provides a tenant-aware,
security-integrated Postgres capability, and `user-engine` consumes that
capability through a durable store adapter.

The consumer-side contract is now represented in code by
`user_engine.ports.UserEngineStore`. The protocol is intentionally
adapter-neutral: it names the service behavior a durable store must satisfy
without adding a Postgres dependency or giving this repository ownership of
database provisioning.
## Consumer Story
As a `user-engine` consumer, I want the service to persist identity-domain
facts durably while keeping NetKingdom security, IAM, secrets, network, tenant
isolation, backup, and operational controls outside the user-engine domain
implementation.

The desired experience is:
```text
NetKingdom gives user-engine a scoped Postgres capability.
user-engine applies or verifies its own schema for its own tables.
service operations keep the same behavior as the isolated MVP.
mutations, audit records, and outbox events commit atomically.
tenant boundaries and security controls are enforced by both adapter logic and
the provided database capability.
```
## Ownership Boundary
### NetKingdom Postgres Provider Owns
- Database or cluster provisioning.
- Tenant isolation primitive, such as database-per-tenant, schema-per-tenant,
row-level security, or another accepted model.
- Roles, credentials, certificate material, TLS requirements, secret rotation,
and credential lease policy.
- Network reachability, firewall rules, service identity admission, and
runtime policy integration.
- Backup, restore, PITR, replication, retention, and disaster recovery
controls.
- Platform-level metrics, logs, traces, alert routing, and operational
runbooks for the database capability.
- Base security posture, hardening, encryption at rest, and administrative
access controls.
### user-engine Owns
- Domain table definitions for its own data.
- Schema version expectations and forward migrations for user-engine tables.
- Store adapter behavior that satisfies the public service contract.
- Transaction boundaries for user-engine mutations.
- Domain constraints, validation, and deterministic query behavior.
- Local audit and outbox table semantics when those records are persisted in
the user-engine store.
- Store conformance tests.
### External Systems Continue To Own
- Identity provider configuration, token issuance, credentials, MFA, and
sessions.
- Authorization policy decisions.
- Platform audit custody and long-term audit archive.
- Secrets authority and secret distribution.
- Organization or tenant authority beyond user-engine references.
## Functional Requirements
### Store Parity
The durable store must satisfy the same behavior currently exercised against
the isolated store:
- Persist users, accounts, tenant accounts, external identities, applications,
application bindings, catalogs, profile values, memberships, audit records,
and outbox events.
- Return stable records by the same logical keys used by
`UserEngineService`.
- Preserve `schema_version`, `ready`, and migration readiness semantics.
- Support the same service-level exceptions for not found, conflict,
validation, and authorization-denied flows.
### Store Protocol Boundary
`UserEngineService` consumes the `UserEngineStore` protocol rather than local
in-memory collections. A future Postgres adapter must provide:
- Schema readiness through `schema_version`, `ready`, and `migrate`.
- A `transaction` context that makes each mutating write unit atomic.
- Logical read/write methods for users, accounts, tenant accounts, external
identities, memberships, applications, bindings, catalogs, family
invitations, and profile values.
- Audit and outbox append/read methods that preserve write order.
- Adapter-neutral record counts for diagnostics and operability snapshots.

Concrete tables, SQL, connection pools, and row locks remain adapter details.
Service and domain code should not depend on Postgres-specific concepts.
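
As an illustration of the boundary described above, here is a minimal sketch of a structural protocol plus a toy store that satisfies it. Only `schema_version`, `ready`, `migrate`, and `transaction` are named in this document; `StoreBoundarySketch`, `TinyMemoryStore`, `put_user`, and `record_counts` are hypothetical names for this example and are not the real `user_engine.ports.UserEngineStore` surface.

```python
from contextlib import AbstractContextManager, contextmanager
from typing import Dict, Iterator, Protocol, runtime_checkable


@runtime_checkable
class StoreBoundarySketch(Protocol):
    """Hypothetical subset of an adapter boundary (not the real protocol)."""

    @property
    def schema_version(self) -> int: ...

    @property
    def ready(self) -> bool: ...

    def migrate(self) -> None: ...

    def transaction(self) -> AbstractContextManager: ...

    def record_counts(self) -> Dict[str, int]: ...


class TinyMemoryStore:
    """Toy in-memory store that satisfies the sketch structurally."""

    EXPECTED_VERSION = 1  # assumed latest schema version for this example

    def __init__(self) -> None:
        self._version = 0
        self._users: Dict[str, dict] = {}

    @property
    def schema_version(self) -> int:
        return self._version

    @property
    def ready(self) -> bool:
        return self._version == self.EXPECTED_VERSION

    def migrate(self) -> None:
        self._version = self.EXPECTED_VERSION

    @contextmanager
    def transaction(self) -> Iterator[None]:
        # Snapshot the state and restore it if the write unit raises.
        snapshot = dict(self._users)
        try:
            yield
        except Exception:
            self._users = snapshot
            raise

    def put_user(self, user_id: str, record: dict) -> None:
        self._users[user_id] = record

    def record_counts(self) -> Dict[str, int]:
        return {"users": len(self._users)}
```

A Postgres adapter would replace the snapshot with a real database transaction; the service would keep calling the same methods either way.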
### Identity And Account Constraints
- `(issuer, subject)` must uniquely identify one external identity link.
- An external identity must not be linked to two different users.
- A user must have at most one primary account record in the current model.
- Tenant account records must be unique by `(tenant, user_id)`.
- Deleted or disabled account states must remain inspectable for audit and
lifecycle decisions.
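
The first two constraints above can be sketched in memory as follows. This is an illustration only: `IdentityLinks` and its methods are hypothetical, and the `ConflictError` class here is a local stand-in for the service-level exception. A Postgres adapter would more likely rely on a unique constraint over `(issuer, subject)`.

```python
from typing import Dict, Optional, Tuple


class ConflictError(Exception):
    """Local stand-in for the service-level conflict exception."""


class IdentityLinks:
    """Hypothetical in-memory index of external identity links."""

    def __init__(self) -> None:
        # (issuer, subject) -> user_id; the dict key enforces uniqueness.
        self._by_identity: Dict[Tuple[str, str], str] = {}

    def link(self, issuer: str, subject: str, user_id: str) -> None:
        key = (issuer, subject)
        existing = self._by_identity.get(key)
        if existing is not None and existing != user_id:
            # The same external identity may not point at a second user.
            raise ConflictError(f"identity already linked: {issuer}/{subject}")
        self._by_identity[key] = user_id  # re-linking the same user is a no-op

    def resolve(self, issuer: str, subject: str) -> Optional[str]:
        return self._by_identity.get((issuer, subject))
```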
### Tenant Boundary Requirements
- Every tenant-scoped row must carry an explicit tenant identifier or be
reachable only through an explicit tenant-scoped relationship.
- Queries that resolve tenant-scoped data must require tenant context from the
service layer.
- The adapter must fail closed when tenant context is missing for tenant-bound
operations.
- The provider should make tenant isolation enforceable below application code,
for example through separate databases, schemas, RLS policies, or scoped
database roles.
- Platform-level access must be represented as an explicit NetKingdom security
capability, not as a default database superuser path.
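
A minimal sketch of the fail-closed rule on the adapter side, assuming a hypothetical `TenantContextRequired` exception and a toy membership table. It covers only the application-level check; provider-side isolation (RLS, schemas, scoped roles) would sit underneath it.

```python
from typing import Dict, List, Optional


class TenantContextRequired(Exception):
    """Hypothetical error raised when a tenant-bound call has no tenant."""


class TenantScopedMemberships:
    """Toy store where every row carries an explicit tenant identifier."""

    def __init__(self) -> None:
        self._rows: List[Dict[str, str]] = []

    def add(self, tenant: str, user_id: str, kind: str) -> None:
        if not tenant:
            raise TenantContextRequired("writes require a tenant")
        self._rows.append({"tenant": tenant, "user_id": user_id, "kind": kind})

    def list_for_user(self, tenant: Optional[str], user_id: str) -> List[Dict[str, str]]:
        # Fail closed: a missing tenant never widens the query to all tenants.
        if not tenant:
            raise TenantContextRequired("reads require a tenant")
        return [
            row for row in self._rows
            if row["tenant"] == tenant and row["user_id"] == user_id
        ]
```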
### Application And Catalog Requirements
- Application ids must be unique.
- Application bindings must be retrievable by application id.
- Active catalog namespace ownership must not move silently between
applications.
- Catalog versioning must preserve the existing rule that active definitions
cannot downgrade sensitivity or move versions backwards.
- Attribute lookup by key must remain deterministic and efficient enough for
projection generation.
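
The ownership and versioning rules above could be checked along these lines. The sensitivity levels and their ordering, the dictionary layout, and `CatalogRuleError` are assumptions made for this sketch; the document does not define them. Treating an equal version as acceptable is also an assumption.

```python
from typing import Dict

# Assumed sensitivity levels, ordered from least to most sensitive.
SENSITIVITY_RANK: Dict[str, int] = {
    "public": 0,
    "internal": 1,
    "sensitive": 2,
    "secret": 3,
}


class CatalogRuleError(Exception):
    """Hypothetical error for a rejected catalog update."""


def check_catalog_update(active: dict, proposed: dict) -> None:
    """Raise CatalogRuleError if `proposed` may not replace `active`."""
    if proposed["application_id"] != active["application_id"]:
        raise CatalogRuleError("namespace ownership may not move between applications")
    if proposed["version"] < active["version"]:
        raise CatalogRuleError("catalog version may not move backwards")
    for key, old_level in active["attributes"].items():
        new_level = proposed["attributes"].get(key)
        if new_level is None:
            continue  # attribute removal policy is outside this sketch
        if SENSITIVITY_RANK[new_level] < SENSITIVITY_RANK[old_level]:
            raise CatalogRuleError(f"sensitivity downgrade for attribute {key!r}")
```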
### Profile And Projection Requirements
- Profile values must be unique by user, attribute key, scope, and scope id.
- Effective profile resolution must remain deterministic across global,
tenant, application, and membership scopes.
- Sensitive and secret values must not leak through diagnostics or logs.
- Projection reads should avoid N+1 query patterns for common application
runtime and claims-enrichment use cases.
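
One way to keep effective profile resolution deterministic is to fix a scope precedence and apply values in that order. The sketch below assumes that more specific scopes override broader ones (global, then tenant, then application, then membership); that precedence, the row layout, and the function name are assumptions for illustration, not the service's documented rule.

```python
from typing import Dict, Iterable

# Assumed precedence: later entries override earlier ones.
SCOPE_PRECEDENCE = ("global", "tenant", "application", "membership")


def resolve_effective_profile(
    values: Iterable[dict], active_scopes: Dict[str, str]
) -> Dict[str, object]:
    """Merge one user's profile rows into one effective view.

    Each row has `attribute_key`, `scope`, `scope_id`, and `value`.
    `active_scopes` maps a scope name to the scope id in effect for the
    request; the global scope uses the empty string as its id.
    """
    rank = {scope: index for index, scope in enumerate(SCOPE_PRECEDENCE)}
    ordered = sorted(
        values,
        key=lambda row: (rank[row["scope"]], row["scope_id"], row["attribute_key"]),
    )
    effective: Dict[str, object] = {}
    for row in ordered:
        if active_scopes.get(row["scope"]) != row["scope_id"]:
            continue  # row belongs to a scope that is not active
        effective[row["attribute_key"]] = row["value"]
    return effective
```

Because rows are unique by user, attribute key, scope, and scope id, at most one row per key survives in each scope, so the result does not depend on input order.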
### Membership Requirements
- Memberships must be queryable by user and tenant.
- Membership facts must carry scope type, scope id, kind, source system,
owning system, and freshness version.
- Privileged memberships should remain traceable to audit records, evidence
references, or explicit evidence gaps.
- The store must support future revoke/update behavior without losing the
ability to inspect historical role changes.
### Audit And Outbox Requirements
- Mutating service operations must commit domain changes, local audit records,
and outbox events atomically.
- Authorization denials must be auditable without emitting outbox events.
- Audit records should be append-only from the service perspective.
- Outbox records must support pending reads and future claim/ack/retry
semantics.
- Outbox event ids must be stable delivery ids, and correlation ids must remain
queryable for cross-system tracing.
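
The pending-read and claim/ack/retry semantics mentioned above might look like this in a toy in-memory form. The status names, `OutboxSketch`, and its methods are hypothetical; a Postgres adapter would need row locking or an equivalent claim strategy that this sketch does not model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OutboxEvent:
    event_id: str        # stable delivery id
    correlation_id: str  # kept queryable for cross-system tracing
    payload: dict
    status: str = "pending"  # assumed states: pending, claimed, delivered
    attempts: int = 0


class OutboxSketch:
    def __init__(self) -> None:
        self._events: List[OutboxEvent] = []  # list order is write order

    def append(self, event_id: str, correlation_id: str, payload: dict) -> None:
        self._events.append(OutboxEvent(event_id, correlation_id, payload))

    def pending(self, limit: int = 100) -> List[OutboxEvent]:
        return [e for e in self._events if e.status == "pending"][:limit]

    def claim(self, limit: int = 10) -> List[OutboxEvent]:
        claimed = self.pending(limit)
        for event in claimed:
            event.status = "claimed"
            event.attempts += 1
        return claimed

    def ack(self, event_id: str) -> None:
        self._find(event_id).status = "delivered"

    def retry(self, event_id: str) -> None:
        self._find(event_id).status = "pending"

    def by_correlation(self, correlation_id: str) -> List[OutboxEvent]:
        return [e for e in self._events if e.correlation_id == correlation_id]

    def _find(self, event_id: str) -> OutboxEvent:
        for event in self._events:
            if event.event_id == event_id:
                return event
        raise KeyError(event_id)
```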
### Transaction Requirements
- Each public mutating service operation must run in one transaction.
- Failed validation or authorization must not partially write domain state.
- The store implementation must handle uniqueness races deterministically and
map them to `ConflictError` where appropriate.
- Migration and outbox claiming should use explicit locking strategies that do
not require consumers to understand Postgres internals.
- Authorization-denial audit records must persist without outbox events even
when the denied operation occurs inside a composed transaction that rolls
back domain writes.
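
The atomicity and conflict-mapping rules above can be demonstrated with the standard-library `sqlite3` module as a stand-in for Postgres. The table names, columns, and function names are assumptions for this sketch and do not reflect the bootstrap schema. Writing the denial audit record in its own transaction is one possible approach, not a confirmed design.

```python
import sqlite3
from typing import Dict


class ConflictError(Exception):
    """Local stand-in for the service-level conflict exception."""


def open_demo_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE external_identities (
            issuer TEXT NOT NULL, subject TEXT NOT NULL, user_id TEXT NOT NULL,
            PRIMARY KEY (issuer, subject)
        );
        CREATE TABLE audit_records (
            id INTEGER PRIMARY KEY, action TEXT NOT NULL, outcome TEXT NOT NULL
        );
        CREATE TABLE outbox_events (
            id INTEGER PRIMARY KEY, event_type TEXT NOT NULL
        );
        """
    )
    return conn


def link_identity(conn: sqlite3.Connection, issuer: str, subject: str, user_id: str) -> None:
    try:
        with conn:  # commits on success, rolls back every write on error
            conn.execute(
                "INSERT INTO audit_records (action, outcome) VALUES (?, ?)",
                ("link_identity", "ok"),
            )
            conn.execute(
                "INSERT INTO outbox_events (event_type) VALUES (?)",
                ("identity.linked",),
            )
            conn.execute(
                "INSERT INTO external_identities VALUES (?, ?, ?)",
                (issuer, subject, user_id),
            )
    except sqlite3.IntegrityError as exc:
        # The uniqueness violation is mapped to the service-level error.
        raise ConflictError("external identity already linked") from exc


def record_denial(conn: sqlite3.Connection, action: str) -> None:
    # Separate transaction: an audit row is written, no outbox event.
    with conn:
        conn.execute(
            "INSERT INTO audit_records (action, outcome) VALUES (?, ?)",
            (action, "denied"),
        )


def table_counts(conn: sqlite3.Connection) -> Dict[str, int]:
    tables = ("external_identities", "audit_records", "outbox_events")
    return {t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0] for t in tables}
```

In this sketch the identity insert comes last on purpose, so a uniqueness failure shows the earlier audit and outbox writes being rolled back with it.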
### Migration Requirements
- user-engine owns migrations for its own tables.
- Migrations must be forward-only unless an explicit rollback strategy is
accepted for a release.
- Readiness must report the expected schema version and actual store version.
- Startup behavior must distinguish "store unreachable", "migration required",
"migration in progress", and "ready".
- Destructive migrations require an explicit operator-controlled process.
- The provider may supply the database and schema container, but should not
need to know user-engine domain tables.
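
The four startup states listed above could be derived from a few observations, as in the following sketch. The function name, its parameters, and the report keys are hypothetical; how an adapter detects reachability or an in-progress migration is left open here.

```python
from enum import Enum
from typing import Optional


class StoreState(Enum):
    UNREACHABLE = "store unreachable"
    MIGRATION_REQUIRED = "migration required"
    MIGRATION_IN_PROGRESS = "migration in progress"
    READY = "ready"


def classify_store_state(
    reachable: bool,
    expected_version: int,
    actual_version: Optional[int],
    migration_lock_held: bool,
) -> dict:
    """Return a readiness report with expected and actual schema versions."""
    if not reachable:
        state = StoreState.UNREACHABLE
    elif migration_lock_held:
        state = StoreState.MIGRATION_IN_PROGRESS
    elif actual_version != expected_version:
        # Covers a missing version and a behind or ahead store alike; a
        # real adapter may want a separate state for "store is ahead".
        state = StoreState.MIGRATION_REQUIRED
    else:
        state = StoreState.READY
    return {
        "state": state.value,
        "ready": state is StoreState.READY,
        "expected_schema_version": expected_version,
        "actual_schema_version": actual_version if reachable else None,
    }
```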
## Security Requirements
- Database credentials must come through a NetKingdom secret or identity
mechanism, not as literal values in config files.
- Connections must require TLS when crossing process or host boundaries.
- Credentials should be scoped to the minimum database, schema, tenant, and
operations needed by user-engine.
- Logs, errors, readiness output, and diagnostics must not expose credentials,
connection strings, secret values, sensitive profile data, or full personal
records.
- The adapter must make tenant context explicit and auditable.
- The provider should expose enough security metadata for `identity_context`
evidence or gap references when privileged access or lifecycle work depends
on database-side controls.
## Operability Requirements
- Health checks should report whether the adapter can reach the store.
- Readiness checks should report schema compatibility and migration state.
- Diagnostics should include redacted connection target, schema version, last
migration, pending outbox count, and recent store error class.
- Metrics should cover connection failures, transaction failures, conflicts,
migration duration, query latency, and outbox backlog.
- Backup/restore expectations must be testable from the consumer side through
restore validation or equivalent provider evidence.
- Store failures should produce actionable errors without leaking sensitive
details.
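
A redacted diagnostics snapshot of the kind described above might be built like this. The snapshot keys and function names are assumptions; the example connection URL is made up. The redaction keeps only scheme, host, port, and database name, and reports the error class without its message.

```python
from typing import Optional
from urllib.parse import urlsplit


def redact_connection_target(dsn: str) -> str:
    """Drop user, password, and query parameters from a connection URL."""
    parts = urlsplit(dsn)
    host = parts.hostname or "unknown-host"
    port = f":{parts.port}" if parts.port else ""
    database = parts.path.lstrip("/") or "unknown-db"
    return f"{parts.scheme}://{host}{port}/{database}"


def diagnostics_snapshot(
    dsn: str,
    schema_version: int,
    last_migration: str,
    pending_outbox_count: int,
    last_error: Optional[BaseException] = None,
) -> dict:
    return {
        "connection_target": redact_connection_target(dsn),
        "schema_version": schema_version,
        "last_migration": last_migration,
        "pending_outbox_count": pending_outbox_count,
        # Only the class name: error messages can carry sensitive details.
        "recent_store_error_class": type(last_error).__name__ if last_error else None,
    }
```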
## Provider Interface Expectations
The future provider repository should be able to give user-engine:
- A logical store reference for the NetKingdom environment and tenant scope.
- A secret handle or service identity mechanism for credentials.
- TLS or certificate requirements.
- Tenant isolation metadata that the adapter can record in diagnostics.
- Migration permission policy for user-engine-owned tables.
- Backup and restore evidence or status references.
- Operational contact/runbook references.

`user-engine` should not require:
- Cluster administrator credentials.
- Knowledge of physical cluster topology.
- Direct control over backups, replication, firewall rules, or secret
rotation.
- Provider-specific SQL outside an adapter layer.
## Acceptance Tests For A Future Adapter
A future Postgres adapter should pass conformance tests for:
- Creating a user from verified identity claims and reading it through `me`.
- Preventing duplicate `(issuer, subject)` links across users.
- Creating tenant accounts and denying cross-tenant reads through the service
layer.
- Adding memberships and returning them in `identity_context`.
- Registering an application, publishing a catalog, writing profile values,
and producing application runtime and claims-enrichment projections.
- Redacting sensitive values in non-eligible projections.
- Rolling back all writes when a mutation fails after validation or
authorization.
- Persisting audit records and outbox events atomically with mutations.
- Reporting not-ready state when schema version is missing or incompatible.
- Recovering from restart without losing users, memberships, profiles, audit,
or outbox records.
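
Acceptance checks like these lend themselves to a factory-driven harness, where each check receives a fresh store from a caller-supplied factory. The sketch below shows that shape with a toy store and three checks. All names here are hypothetical, and this is not the API of `user_engine.testing.store_conformance`.

```python
from typing import Callable, Dict, Tuple


class DuplicateIdentity(Exception):
    """Hypothetical conflict error raised by the toy store."""


class ToyStore:
    EXPECTED_VERSION = 1  # assumed latest schema version

    def __init__(self) -> None:
        self.schema_version = 0
        self._links: Dict[Tuple[str, str], str] = {}

    @property
    def ready(self) -> bool:
        return self.schema_version == self.EXPECTED_VERSION

    def migrate(self) -> None:
        self.schema_version = self.EXPECTED_VERSION

    def link_identity(self, issuer: str, subject: str, user_id: str) -> None:
        owner = self._links.setdefault((issuer, subject), user_id)
        if owner != user_id:
            raise DuplicateIdentity((issuer, subject))


def check_not_ready_before_migration(store) -> None:
    assert not store.ready, "store reported ready without a schema"


def check_ready_after_migration(store) -> None:
    store.migrate()
    assert store.ready, "store not ready after migrate()"


def check_duplicate_identity_rejected(store) -> None:
    store.migrate()
    store.link_identity("iss", "sub", "user-a")
    try:
        store.link_identity("iss", "sub", "user-b")
    except DuplicateIdentity:
        return
    raise AssertionError("duplicate (issuer, subject) link was accepted")


CHECKS = (
    check_not_ready_before_migration,
    check_ready_after_migration,
    check_duplicate_identity_rejected,
)


def run_conformance(store_factory: Callable[[], object]) -> Dict[str, str]:
    """Run every check against a fresh store and collect the outcomes."""
    results: Dict[str, str] = {}
    for check in CHECKS:
        try:
            check(store_factory())
            results[check.__name__] = "pass"
        except AssertionError as exc:
            results[check.__name__] = f"fail: {exc}"
    return results
```

Calling `run_conformance(ToyStore)` exercises the toy store; an adapter author would pass a factory that returns their own store instead.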
## Open Questions For The Provider Repository
- Should NetKingdom use database-per-tenant, schema-per-tenant, RLS, or a
hybrid model for user-engine data?
- Who runs user-engine migrations in production: user-engine startup, a
deployment job, or a provider-controlled migration runner?
- How are credential leases issued, renewed, revoked, and audited?
- What backup unit maps to a family or organization dataspace: cluster,
database, schema, tenant row set, or application scope?
- What evidence references can the provider expose for backup status, restore
tests, encryption posture, and access reviews?
- How should local development emulate the provider without weakening the
production contract?
## First Implementation Follow-Ups
The first consumer-side follow-up is complete: `UserEngineStore` defines the
adapter boundary and the in-memory store acts as the reference implementation
for service-level behavior.

USER-WP-0016 adds the next consumer-side slice: `user_engine.migrations`
declares the ordered migration manifest and latest schema version,
`migrations/postgres/0001_user_engine_store.sql` defines a provider-facing
bootstrap schema, and `user_engine.testing.store_conformance` exposes a
reusable harness that future adapters can run with their own store factory.
The standard local suite runs that harness against `InMemoryUserEngineStore`.

Likely follow-up work:
- Add a Postgres adapter behind the existing store boundary.
- Add provider-backed conformance tests for locking, uniqueness races,
migration readiness, outbox claiming, redacted diagnostics, and restore
validation.
- Add conformance tests that run against both in-memory and Postgres stores.
- Integrate the adapter with the future NetKingdom Postgres provider repo.