diff --git a/INTENT.md b/INTENT.md
new file mode 100644
index 0000000..cf5b6e8
--- /dev/null
+++ b/INTENT.md
@@ -0,0 +1,356 @@
+# Audit Core Intent
+
+## Purpose
+
+Audit Core is the independent audit fabric for collecting, normalizing,
+retaining, searching, exporting, and proving the integrity of audit events
+across multi-tenant infrastructure and applications.
+
+It exists so platforms can treat audit as a first-class product capability
+rather than as an incidental log-forwarding detail. Audit Core should support
+small installations that only need durable audit retention, while also scaling
+toward enterprise and critical-infrastructure requirements such as tenant
+isolation, immutable archive, tamper evidence, searchable investigation
+workflows, retention policy, and controlled export.
+
+Audit Core must integrate cleanly with NetKingdom, Railiance, OpenBao,
+Kubernetes, identity providers, and application runtimes, but it must not depend
+on NetKingdom to exist or operate.
+
+## Problem
+
+Modern platforms generate audit-relevant events in many places:
+
+- identity systems;
+- secret managers;
+- orchestration layers;
+- administrative control surfaces;
+- workload runtimes;
+- tenant-facing applications;
+- policy engines;
+- backup, restore, and break-glass operations.
+
+If these events remain scattered across local pod logs, PVCs, application
+databases, shell histories, or ad hoc State Hub notes, they cannot satisfy
+serious operational, forensic, or compliance needs.
+
+The platform needs an audit system that can answer:
+
+- who did what, when, from where, and under which authority;
+- which tenant, scope, workload, system, or asset was affected;
+- whether the event stream is complete enough for the configured policy;
+- where the event is retained;
+- who can search or export it;
+- whether retained records were altered, deleted, or withheld;
+- how long each class of event must be kept;
+- whether a tenant has purchased or enabled each audit capability tier.
+
+## Independence
+
+Audit Core is a standalone product and repository.
+
+It must not import NetKingdom internals, assume NetKingdom deployment paths, or
+require NetKingdom identity in order to function. A small installation should be
+able to deploy Audit Core with local configuration, static tenants, and a basic
+OIDC provider.
+
+NetKingdom integration is important, but it must happen through public,
+documented contracts:
+
+- tenant and scope registration APIs;
+- issuer and identity mapping APIs;
+- audit event ingestion APIs;
+- retention and entitlement APIs;
+- setup and validation APIs;
+- export and evidence APIs;
+- optional adapters for NetKingdom IAM Profile claims.
+
+This keeps Audit Core reusable for non-NetKingdom platforms while still letting
+NetKingdom provide a polished setup and operating experience.
+
+## Product Shape
+
+Audit Core is not just a log store. It is an audit control plane and data plane.
+
+The control plane owns:
+
+- tenants, scopes, sources, streams, and audit policies;
+- retention profiles and cost tiers;
+- ingestion credentials and source registration;
+- search/export entitlements;
+- archive and tamper-evidence configuration;
+- setup validation and health reporting;
+- integration status for upstream platforms.
+
+The data plane owns:
+
+- event ingestion;
+- normalization to the Audit Core event envelope;
+- routing to hot search, immutable archive, and optional SIEM sinks;
+- buffering and retry behavior;
+- batch manifests and integrity proofs;
+- source-specific adapters and collectors.
+
+## Core Principles
+
+1. **Audit is a product boundary.** Audit Core owns audit semantics and
+   retention behavior. Other systems emit events; they do not define the whole
+   audit fabric.
+
+2. **Standalone first, integrated second.** NetKingdom should be an excellent
+   integration target, not a hidden prerequisite.
+
+3. **Tenant-aware by design.** Every event must be attributable to a tenant,
+   scope, source, or platform-control-plane context. Shared platform events are
+   never silently downgraded to tenant-owned optional audit.
+
+4. **Policy before plumbing.** A source is not "onboarded" merely because logs
+   arrive. It is onboarded when ownership, retention, access, export, and
+   evidence policy are declared and validated.
+
+5. **Immutable archive and hot search are separate concerns.** Hot search can
+   be short-lived and cost-tuned. Immutable archive is the evidence record.
+
+6. **Tamper evidence is explicit.** Audit Core should store signed or
+   hash-chained manifests for retained batches so operators can later prove
+   whether a record set was changed, omitted, or truncated.
+
+7. **Least-privilege access.** Tenants may access their own audit records
+   according to policy. Platform operators may access cross-tenant records only
+   through explicit privileged workflows.
+
+8. **Opt-in where legitimate, mandatory where necessary.** Tenants may choose
+   optional audit tiers for their own workloads. Platform-control-plane audit
+   for shared infrastructure is mandatory.
+
+9. **No secret dumping.** Audit events may contain sensitive metadata and
+   secret identifiers, but must avoid plaintext secrets, OTP seeds, private
+   keys, root tokens, unseal shares, and password material.
+
+10. **Failure must be visible.** Dropped, delayed, blocked, or degraded audit
+    streams are themselves audit and operations events.
+
+## Initial Scope
+
+Audit Core should initially cover:
+
+- a canonical audit event envelope;
+- source registration and tenant/scope mapping;
+- HTTP ingestion API;
+- batch file/archive writer;
+- hash-chain or signed manifest generation;
+- retention profile metadata;
+- source health and validation API;
+- local development deployment;
+- NetKingdom setup adapter;
+- OpenBao audit ingestion path;
+- Kubernetes audit or workload log ingestion path;
+- basic query/export API over archived batches or configured hot search.
+
+## Out Of Scope Initially
+
+Audit Core should not initially attempt to own:
+
+- every observability use case;
+- metrics and traces;
+- generic application logging;
+- incident response automation;
+- policy decision making;
+- tenant identity provisioning;
+- SIEM correlation rules beyond simple routing metadata;
+- replacing NetKingdom, Railiance, OpenBao, or application-specific audit
+  semantics.
+
+Those capabilities may integrate later, but the first product must stay focused
+on audit custody, integrity, retention, and integration contracts.
+
+## Architecture Fit
+
+Audit Core should sit beside identity, secret management, and platform services:
+
+```text
+Audit sources
+  OpenBao, Kubernetes, identity providers, policy engines, apps
+        |
+Collectors / adapters
+  Fluent Bit, Vector, OTel Collector, source-specific shippers
+        |
+Audit Core ingestion API
+  validation, normalization, tenant/scope mapping
+        |
+        +--> immutable archive
+        |      object storage, WORM/object lock, encrypted batches
+        |
+        +--> tamper evidence
+        |      signed manifests, hash chains, optional transparency log
+        |
+        +--> hot search
+        |      Loki, OpenSearch, or another configured backend
+        |
+        +--> optional SIEM
+               Wazuh, OpenSearch Security Analytics, or external SOC tools
+```
+
+Audit Core should prefer pluggable sinks. The default implementation can start
+with object storage for archive and one hot-search backend, but the contracts
+must allow future alternatives.
+
+## NetKingdom Integration
+
+NetKingdom should integrate with Audit Core through APIs and adapters.
+
+NetKingdom can provide:
+
+- tenant and scope metadata;
+- IAM Profile claim mappings;
+- operator and tenant-admin identity;
+- setup wizard integration;
+- entitlement selection;
+- audit policy approval;
+- status display in the NetKingdom control surface;
+- links to evidence, export, and validation views.
+
+Audit Core should provide NetKingdom:
+
+- a setup API that declares required configuration steps;
+- a validation API that reports readiness by source, tenant, and sink;
+- an ingestion credential bootstrap flow;
+- tenant/scope registration endpoints;
+- policy templates for common NetKingdom tiers;
+- OpenAPI specifications and machine-readable capability descriptors;
+- health and evidence endpoints that NetKingdom can surface without knowing
+  Audit Core internals.
+
+NetKingdom must not be required for Audit Core's internal authorization model.
+Audit Core may accept NetKingdom OIDC claims when configured, but should also
+support a generic OIDC provider and local development auth mode.
+
+## Tenant And Scope Model
+
+Audit Core should distinguish:
+
+- **platform scope**: shared infrastructure and control-plane audit that is
+  mandatory for the operator;
+- **tenant scope**: tenant-owned systems and workload audit;
+- **application scope**: a specific app, product, or service surface;
+- **security scope**: high-trust operations such as break-glass, restore,
+  key rotation, and privileged access;
+- **source scope**: a concrete emitter such as OpenBao, Kubernetes API server,
+  KeyCape, privacyIDEA, or a workload.
+
+Tenants should be able to select audit tiers for tenant-owned scopes. They
+should not be able to disable platform-scope records required to operate shared
+critical infrastructure safely.
+
+## Audit Capability Tiers
+
+Audit Core should support tiered operation:
+
+- **Platform Mandatory**: shared control-plane audit, always on.
+- **Archive Only**: immutable retained batches, limited query, lowest cost.
+- **Search**: archive plus hot-search backend for operational investigation.
+- **Security**: search plus alerting, detection rules, and SOC export.
+- **Dedicated**: stronger tenant isolation through dedicated buckets, indexes,
+  keys, or runtime instances.
+
+The tier model is both technical and commercial: it controls cost, retention,
+query capability, and operational responsibilities.
+
+## API Surface
+
+The first API surface should include:
+
+- `POST /v1/events` for normalized event ingestion;
+- `POST /v1/sources` for source registration;
+- `GET /v1/sources/{id}/status` for source health;
+- `POST /v1/tenants` and `POST /v1/scopes` for tenant/scope registration;
+- `PUT /v1/policies/{scope}` for retention and routing policy;
+- `GET /v1/readiness` for setup validation;
+- `GET /v1/evidence/batches` for retained batch inventory;
+- `GET /v1/evidence/batches/{id}/manifest` for integrity metadata;
+- `POST /v1/exports` for controlled export jobs;
+- `GET /v1/capabilities` for integration discovery.
+
+The API should be specified with OpenAPI and backed by conformance tests.
+
+## Event Envelope
+
+The canonical event envelope should carry at least:
+
+- event id;
+- source id and source type;
+- tenant id, scope id, and platform/tenant ownership class;
+- timestamp observed and timestamp emitted;
+- actor subject, issuer, groups, and authentication context where known;
+- action, resource, result, and reason;
+- correlation id and request id;
+- sensitivity classification;
+- retention class;
+- payload reference or normalized payload;
+- original event hash;
+- ingestion batch id;
+- schema version.
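
The envelope fields listed above could be modeled roughly as follows. This is a minimal Python sketch, not a published schema: the class `AuditEvent`, the function `normalize`, every field name, the default `schema_version` of `"0.1"`, the placeholder `sensitivity` value, and the raw-record keys (`time`, `subject`, `issuer`, `action`, and so on) are assumptions made for illustration only.

```python
from __future__ import annotations

import hashlib
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class AuditEvent:
    """Hypothetical envelope; names are illustrative, not a defined contract."""

    event_id: str
    source_id: str
    source_type: str
    tenant_id: str
    scope_id: str
    ownership_class: str          # "platform" or "tenant"
    observed_at: str              # ISO 8601 time the ingester saw the event
    emitted_at: str               # time reported by the source, if any
    actor: dict[str, Any]         # subject, issuer, groups, auth context
    action: str
    resource: str
    result: str
    reason: str | None
    correlation_id: str | None
    request_id: str | None
    sensitivity: str
    retention_class: str
    payload: dict[str, Any]       # normalized payload (or a reference to it)
    original_event_hash: str      # SHA-256 of the raw bytes as received
    ingestion_batch_id: str
    schema_version: str = "0.1"
    extensions: dict[str, Any] = field(default_factory=dict)


def normalize(raw: bytes, *, source_id: str, source_type: str, tenant_id: str,
              scope_id: str, ownership_class: str, batch_id: str,
              retention_class: str = "default") -> AuditEvent:
    """Wrap one raw JSON record from a source in the envelope."""
    record = json.loads(raw)
    return AuditEvent(
        event_id=str(uuid.uuid4()),
        source_id=source_id,
        source_type=source_type,
        tenant_id=tenant_id,
        scope_id=scope_id,
        ownership_class=ownership_class,
        observed_at=datetime.now(timezone.utc).isoformat(),
        emitted_at=record.get("time", ""),
        actor={"subject": record.get("subject"), "issuer": record.get("issuer")},
        action=record.get("action", "unknown"),
        resource=record.get("resource", "unknown"),
        result=record.get("result", "unknown"),
        reason=record.get("reason"),
        correlation_id=record.get("correlation_id"),
        request_id=record.get("request_id"),
        sensitivity="internal",   # placeholder classification
        retention_class=retention_class,
        payload=record,
        # Hash the bytes exactly as received, before any normalization.
        original_event_hash=hashlib.sha256(raw).hexdigest(),
        ingestion_batch_id=batch_id,
    )
```

Hashing the raw bytes before parsing keeps `original_event_hash` independent of how normalization later reshapes the record, and `asdict(event)` gives a plain dictionary that could be serialized for an ingestion request. A real normalizer would also need to redact secret material from `payload`, which this sketch does not attempt.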
+
+The envelope should permit source-specific extension fields without making
+every consumer understand every source.
+
+## Trust And Compliance Posture
+
+Audit Core should be suitable for environments that care about:
+
+- critical infrastructure operations;
+- privileged access review;
+- tenant isolation;
+- incident response;
+- restore and break-glass evidence;
+- tamper-evident retention;
+- export under legal, regulatory, or customer request;
+- operational cost control.
+
+It should be honest about guarantees. A searchable log backend is not the same
+as immutable evidence. A hash in a task note is not audit custody. A local PVC
+is not durable retention.
+
+## Early Success Criteria
+
+Audit Core is useful when:
+
+- OpenBao audit events can be shipped out of the OpenBao audit PVC;
+- events are normalized with tenant/scope/source metadata;
+- an immutable archive batch is written and has a verifiable manifest;
+- a readiness API can tell NetKingdom whether audit is configured enough for a
+  given tenant or platform gate;
+- a tenant or operator can see whether audit is enabled, degraded, or disabled
+  by policy;
+- local development works without NetKingdom;
+- NetKingdom integration works without special-case code inside Audit Core.
+
+## Relationship To Existing Projects
+
+- **NetKingdom** owns identity, tenant/scope semantics, setup UX, and policy
+  decisions for NetKingdom deployments.
+- **Audit Core** owns audit ingestion, normalization, retention routing,
+  evidence manifests, and audit APIs.
+- **Railiance Platform** may operate storage, hot-search, and infrastructure
+  backends used by Audit Core.
+- **OpenBao** emits secret-manager audit events; Audit Core does not replace
+  OpenBao audit devices.
+- **Application repositories** emit application audit events using the Audit
+  Core envelope or adapters.
+
+## Initial Direction
+
+Start small but choose the right spine:
+
+1. define the event envelope and API contract;
+2. implement local file/object archive batches with manifests;
+3. add OpenBao file-audit ingestion;
+4. expose readiness and validation APIs;
+5. integrate NetKingdom setup through the API;
+6. add hot-search sink support;
+7. add tenant tiers, retention policy, and export flows;
+8. add stronger tamper-evidence and optional SIEM integrations.
+
+The first version should be boring, inspectable, and trustworthy. Enterprise
+features should grow from clear contracts, not from hidden coupling to the first
+deployment environment.
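
Step 2 of the direction above, together with the "tamper evidence is explicit" principle, can be illustrated with a hash-chained batch manifest. The following is a minimal Python sketch under stated assumptions: `build_manifest`, `verify_chain`, `GENESIS`, and the manifest field names are hypothetical, SHA-256 over sorted-key JSON is one possible canonical form rather than a chosen design, and signing is left out entirely.

```python
from __future__ import annotations

import hashlib
import json

# Assumed sentinel for the first manifest in a chain.
GENESIS = "0" * 64


def _digest(manifest: dict) -> str:
    """SHA-256 over a canonical JSON form, excluding the hash field itself."""
    body = {k: v for k, v in manifest.items() if k != "manifest_hash"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_manifest(batch_id: str, records: list[bytes],
                   prev_manifest_hash: str = GENESIS) -> dict:
    """Describe one archived batch and link it to the previous manifest."""
    manifest = {
        "batch_id": batch_id,
        "record_count": len(records),
        "record_hashes": [hashlib.sha256(r).hexdigest() for r in records],
        "prev_manifest_hash": prev_manifest_hash,
    }
    manifest["manifest_hash"] = _digest(manifest)
    return manifest


def verify_chain(manifests: list[dict], batches: list[list[bytes]]) -> bool:
    """Return True only if every batch matches its manifest and the links hold."""
    if len(manifests) != len(batches):
        return False
    prev = GENESIS
    for manifest, records in zip(manifests, batches):
        if manifest["prev_manifest_hash"] != prev:
            return False  # a manifest was removed, reordered, or replaced
        if manifest["manifest_hash"] != _digest(manifest):
            return False  # the manifest body was edited after sealing
        if manifest["record_count"] != len(records):
            return False  # records were dropped from or added to the batch
        hashes = [hashlib.sha256(r).hexdigest() for r in records]
        if manifest["record_hashes"] != hashes:
            return False  # at least one record was altered
        prev = manifest["manifest_hash"]
    return True
```

A chain like this detects altered records, dropped records inside a batch, and manifests removed from the middle. It cannot by itself detect truncation of the newest batches, so the latest `manifest_hash` would need to be anchored somewhere independent, for example by signing it or publishing it to the optional transparency log shown in the architecture diagram.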