
tegwick 617b2420d3 docs(adr): establish three foundational ADRs for Event Bridge architecture

ADR-001: activity-core as org-wide Event Bridge — boundaries, NATS as
org infrastructure, state hub delegation, rules-core module-first,
issue-core adapter interface, capabilities domain assignment.

ADR-002: markdown-as-definition format for event types and
ActivityDefinitions — co-located intent/schema/logic/debugging,
publisher-declared governance with environment-configurable curator gate,
attribute type system, task template files.

ADR-003: Rule vs. Instruction model and expression DSL — sandboxed
Python-like AST evaluator for Rules, trusted-fields prompt injection
protection for Instructions, output schema enforcement, audit trail,
testing strategy, rules-core module boundary.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-14 16:48:42 +02:00

8.3 KiB



ACT-ADR-001: Activity-Core as Coulomb Org Event Bridge

Status

Accepted.

Context

The Coulomb organization's set of repositories, services, and deployments is growing beyond what a single person can coordinate manually. The state hub tracks cross-domain state but has no mechanism to automatically respond to it. Recurring maintenance (dependency scans, SBOM staleness checks, consistency audits) is implemented as bespoke cron jobs baked into individual services — scattered, hard to audit, and impossible to govern from a single vantage point.

Three forces drive the need for a dedicated orchestration layer:

Scale: as the repo count grows, manual coordination becomes the bottleneck.
Reactivity: org-level events (new repo registered, CVE published, deployment completed) should trigger coordinated responses without human intervention.
Separation of concerns: the state hub is a read model and should remain one. If it accumulated automation logic, it would become a God object.

Decision

activity-core is the org-wide Event Bridge for the Coulomb organization.

It has exactly three responsibilities:

Receive events — time-based (cron, one-off scheduled) and domain events (NATS, Gitea webhooks, state hub lifecycle signals).
Evaluate rules and instructions — given event payload and resolved context, determine what work must be created.
Emit task sets — publish structured task creation requests to issue-core.

It does not execute work. It does not track task lifecycle. It does not manage projects.
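The receive, evaluate, emit flow above can be sketched as follows. This is a minimal illustration, not the activity-core implementation: `Event`, `TaskRequest`, `Rule`, `handle`, and `scan_new_repo` are hypothetical names, and the `emit` callable stands in for whatever publishes to issue-core.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass(frozen=True)
class Event:
    """Hypothetical minimal event: a type name plus a payload."""
    type: str
    payload: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class TaskRequest:
    """Hypothetical task creation request destined for issue-core."""
    title: str
    source_event: str


# Assumption: a rule maps one event to zero or more task requests.
Rule = Callable[[Event], list[TaskRequest]]


def handle(
    event: Event,
    rules: list[Rule],
    emit: Callable[[list[TaskRequest]], None],
) -> list[TaskRequest]:
    """Evaluate every rule against the event and emit the resulting task set.

    Nothing is executed or tracked here; the function only decides what
    work should exist and hands it to `emit`.
    """
    tasks = [task for rule in rules for task in rule(event)]
    if tasks:
        emit(tasks)
    return tasks


def scan_new_repo(event: Event) -> list[TaskRequest]:
    """Example rule: request a dependency scan when a repo is registered."""
    if event.type != "org.repo.registered":
        return []
    repo = event.payload["repo"]
    return [TaskRequest(f"Run dependency scan for {repo}", event.type)]


emitted: list[TaskRequest] = []
result = handle(
    Event("org.repo.registered", {"repo": "activity-core"}),
    [scan_new_repo],
    emitted.extend,
)
```

Keeping `emit` as an injected callable mirrors the boundary stated above: the bridge decides what work to create, and everything after that belongs to issue-core.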

Boundary rules

Concern	Owner
Cross-org task scheduling and reactive automation	activity-core
Task lifecycle (create, assign, track, close)	issue-core
Project and initiative management (phased, completion-gated)	project-core (future)
Repository capability profiling	repo-scoping
Cross-domain coordination state	state hub
Execution of automatable tasks	Temporal workers (per-repo)

Event type registry

Event types are declared by publishers as markdown definition files (see ACT-ADR-002). Governance is publisher-declared by default: a publisher registers its event types by committing definition files to the event-types registry. In production environments, a curator gate can be enabled; when it is on, registry entries must be reviewed before the runtime accepts events of that type. The gate is a configuration flag per runtime scope (dev, staging, prod), not a hard-coded rule.
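One way to picture the curator gate is as a per-scope boolean consulted at acceptance time. The sketch below is an assumption about shape only: `RegistryEntry`, `CURATOR_GATE`, `accepts`, and the choice to treat unknown scopes as gated are all hypothetical and not taken from ACT-ADR-002.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryEntry:
    """Hypothetical registry record for one declared event type."""
    event_type: str
    reviewed: bool = False  # set once a curator has approved the definition


# Assumption: one flag per runtime scope; the values here are illustrative.
CURATOR_GATE = {"dev": False, "staging": False, "prod": True}


def accepts(event_type: str, scope: str, registry: dict[str, RegistryEntry]) -> bool:
    """Return True if the runtime in `scope` should accept this event type."""
    entry = registry.get(event_type)
    if entry is None:
        return False  # undeclared types are never accepted
    if CURATOR_GATE.get(scope, True):  # unknown scopes are treated as gated
        return entry.reviewed
    return True  # publisher-declared governance: declaration is enough


registry = {
    "org.repo.registered": RegistryEntry("org.repo.registered", reviewed=True),
    "org.decision.resolved": RegistryEntry("org.decision.resolved"),
}
```

With this shape, an unreviewed type such as `org.decision.resolved` flows in dev and staging but is held back in prod until a curator marks it reviewed.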

State hub relationship

The state hub delegates automation to activity-core rather than implementing it internally. Concretely:

Maintenance jobs currently baked into the state hub (consistency sync, SBOM staleness checks) are migrated to ActivityDefinitions in activity-core.
The state hub becomes a publisher of lifecycle events on NATS (org.workstream.created, org.decision.resolved, org.repo.registered, etc.).
The state hub does not subscribe to activity-core's output directly; it reads task state from issue-core when needed.

This preserves the state hub as a read model and makes activity-core the single home for automation policy.
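The lifecycle subjects listed above follow a dotted hierarchy, so a consumer can select them with NATS subject wildcards (`*` for exactly one token, `>` for one or more trailing tokens). The pure-Python matcher below only illustrates those matching rules; `subject_matches` is a hypothetical helper, and a real subscriber would pass the pattern to its NATS client instead.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Check a subject against a NATS-style pattern.

    '*' matches exactly one token; '>' matches one or more remaining
    tokens and is only valid as the final token of the pattern.
    """
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, token in enumerate(p_tokens):
        if token == ">":
            # Must be last in the pattern and have at least one token left to consume.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False  # pattern is longer than the subject
        if token != "*" and token != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)
```

For example, `org.>` would cover every state hub lifecycle event, while `org.*.created` would narrow the selection to creation events across entity kinds.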

rules-core: module-first

The rule and instruction evaluation engine starts as src/activity_core/rules/ — a module with a clean internal boundary (no imports from Temporal, Postgres, or FastAPI within the module). Extraction to a standalone rules-core repository happens when a second consumer (e.g. state hub governance, project-core) needs the engine. This follows the same discipline as the task-flow-engine extraction plan (CUST-TFE-SCOPE).
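The "no imports from Temporal, Postgres, or FastAPI" boundary could be enforced by a small static check run in the test suite. The sketch below is one possible approach, not an existing activity-core test; `forbidden_imports` is a hypothetical helper, and the package names in `FORBIDDEN` are assumptions (in particular, the ADR does not say which Postgres driver is in use).

```python
import ast

# Assumption: top-level package names for Temporal, FastAPI, and common
# Postgres drivers. Adjust to whatever the project actually depends on.
FORBIDDEN = {"temporalio", "fastapi", "psycopg", "psycopg2", "asyncpg", "sqlalchemy"}


def forbidden_imports(source: str) -> set[str]:
    """Return the forbidden top-level packages imported by `source`."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have module=None and are ignored here.
            names = [node.module or ""]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in FORBIDDEN:
                found.add(root)
    return found
```

A test could walk every `.py` file under `src/activity_core/rules/`, read its text, and assert that `forbidden_imports` returns an empty set, which keeps the extraction option cheap to verify.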

NATS as org infrastructure

NATS JetStream is promoted from an activity-core internal component to org-wide event bus infrastructure. It runs as a standalone service (not bundled in activity-core's docker-compose) with its own lifecycle. All services that publish or subscribe to org events do so via NATS streams.

issue-core integration

activity-core communicates with issue-core via a task emission adapter — an abstraction layer that, in the initial implementation, calls issue-core's REST API. The adapter interface is defined now; the transport can migrate to NATS subscription (issue-core subscribes to task.spawned events) once issue-core adds that capability. This avoids hardcoding REST coupling throughout the codebase.
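The adapter idea can be sketched as a `Protocol` with interchangeable implementations. Everything named below is hypothetical: the ADR fixes only that an adapter interface exists, not its method names, so `IssueSink`, `emit`, `TaskRequest`, and `InMemoryIssueSink` are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class TaskRequest:
    """Hypothetical task creation request."""
    title: str
    activity_id: str


class IssueSink(Protocol):
    """Assumed adapter interface: callers depend on this, not on a transport."""

    def emit(self, tasks: list[TaskRequest]) -> list[str]:
        """Submit tasks and return one issue-core reference per task."""
        ...


class InMemoryIssueSink:
    """Test double. A REST-backed or NATS-backed sink would expose the same method."""

    def __init__(self) -> None:
        self.received: list[TaskRequest] = []

    def emit(self, tasks: list[TaskRequest]) -> list[str]:
        start = len(self.received)
        self.received.extend(tasks)
        # Fabricated references, for tests only.
        return [f"mem-{start + i}" for i in range(len(tasks))]


def spawn(sink: IssueSink, tasks: list[TaskRequest]) -> list[str]:
    """Workflow-side code sees only the IssueSink interface."""
    return sink.emit(tasks)
```

Because `spawn` is typed against `IssueSink`, swapping the REST implementation for a NATS publisher later would not touch the calling code, which is the coupling the adapter is meant to avoid.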

Webhook receiver

A new HTTP endpoint within activity-core accepts inbound webhooks from Gitea (and later GitHub, other services). It normalises payloads to the canonical EventEnvelope format, validates against the event type registry, and publishes to NATS. This runs alongside the existing FastAPI api.py.
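The normalise, validate, publish sequence might look roughly like the sketch below, shown without the HTTP layer or the NATS client. The `EventEnvelope` fields, the `gitea.<event>` naming scheme, the `KNOWN_TYPES` stand-in for the registry, and the payload keys read from the webhook body are all assumptions for illustration; the canonical envelope is defined elsewhere.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass(frozen=True)
class EventEnvelope:
    """Hypothetical envelope shape; the real field set may differ."""
    type: str
    source: str
    occurred_at: str
    data: dict[str, Any]


# Stand-in for a lookup against the event type registry.
KNOWN_TYPES = {"gitea.push"}


def normalise_gitea(
    event_name: str,
    body: dict[str, Any],
    now: Optional[datetime] = None,
) -> EventEnvelope:
    """Map a Gitea webhook (event name plus JSON body) to an envelope.

    Raises ValueError when the resulting type is not registered, so the
    caller can reject the request before anything is published.
    """
    event_type = f"gitea.{event_name}"  # assumed naming scheme
    if event_type not in KNOWN_TYPES:
        raise ValueError(f"unregistered event type: {event_type}")
    timestamp = now or datetime.now(timezone.utc)
    return EventEnvelope(
        type=event_type,
        source="gitea",
        occurred_at=timestamp.isoformat(),
        data={
            "repo": body.get("repository", {}).get("full_name"),
            "ref": body.get("ref"),
        },
    )
```

In a FastAPI router, the handler would read the event name from the request headers, call a function like this, and publish the envelope to NATS only if no error was raised.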

Domain assignment

activity-core and issue-core are assigned to the capabilities domain — the same domain as repo-scoping. These are org-wide infrastructure tools that serve all domains equally, not artefacts of any single project or custodian's personal workflow. issue-core is explicitly disassociated from the markitect domain.

Trigger types

Three trigger types are supported:

Type	Description	Temporal mechanism
`cron`	Recurring schedule (5-field cron + timezone + misfire policy)	Temporal Schedule (implemented WP-0002)
`event`	React to a named event type on NATS	Temporal workflow started by Event Router
`scheduled`	One-off at a future datetime	Temporal Schedule with `remaining_actions: 1`

scheduled is a new trigger type added in WP-0003.
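The three trigger types could be modelled as a small tagged union parsed from a definition's trigger section. This is a sketch under assumptions: the key names (`expression`, `timezone`, `misfire_policy`, `event_type`, `run_at`) and the default values are hypothetical, and the mapping onto Temporal Schedules is left out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Union


@dataclass(frozen=True)
class CronTrigger:
    expression: str               # 5-field cron expression
    timezone: str = "UTC"         # assumed default
    misfire_policy: str = "skip"  # assumed value; real policy names may differ

    def __post_init__(self) -> None:
        if len(self.expression.split()) != 5:
            raise ValueError("cron expression must have exactly 5 fields")


@dataclass(frozen=True)
class EventTrigger:
    event_type: str               # named event type on NATS


@dataclass(frozen=True)
class ScheduledTrigger:
    run_at: datetime              # one-off future datetime


Trigger = Union[CronTrigger, EventTrigger, ScheduledTrigger]


def parse_trigger(raw: dict[str, Any]) -> Trigger:
    """Build a typed trigger from a raw mapping, rejecting unknown types."""
    kind = raw.get("type")
    if kind == "cron":
        return CronTrigger(
            raw["expression"],
            raw.get("timezone", "UTC"),
            raw.get("misfire_policy", "skip"),
        )
    if kind == "event":
        return EventTrigger(raw["event_type"])
    if kind == "scheduled":
        return ScheduledTrigger(datetime.fromisoformat(raw["run_at"]))
    raise ValueError(f"unsupported trigger type: {kind!r}")
```

Validating at parse time means a malformed definition fails when it is loaded rather than when a schedule is created.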

Consequences

Immediate

activity-core's INTENT.md and SCOPE.md are rewritten to reflect this architecture.
The task_instances Postgres table is reclassified as a spawn audit trail — it records the act of spawning (what was created, when, which issue-core reference) but is not the authoritative task record. Authoritative lifecycle state lives in issue-core.
A task emission adapter interface (src/activity_core/issue_sink.py) is introduced, and any direct Postgres writes to task_instances are replaced with calls through the adapter.
The TaskExecutorWorkflow stub from WP-0001 is replaced with the actual adapter call in WP-0003.

Medium term

State hub adds NATS publishing to its lifecycle operations.
Gitea webhook receiver added to activity-core as a new HTTP router.
Existing state hub maintenance crons are migrated to ActivityDefinitions.
issue-facade is renamed issue-core and re-registered under the capabilities domain.

Long term

rules-core extracted as a standalone package when a second consumer appears.
project-core created (depends on task-flow-engine extraction) for multi-phase initiative management — explicitly out of scope for activity-core.
NATS gets its own operational runbook and monitoring as org infrastructure.

Alternatives Considered

State hub absorbs activity-core functionality: rejected — turns the state hub into a God object, violates the read-model boundary, and makes automation logic impossible to test independently.

Per-repo automation (GitHub Actions style): rejected — cross-repo coordination requires a single vantage point that can see all repos; per-repo actions can't express org-level triggers or context.

Activity-core as a thin Temporal wrapper only: rejected — without the event type registry and rule model, it's just a scheduler. The governance and introspection properties are the point.

Separate rules-core from day one: rejected — premature extraction adds dependency management overhead before a second consumer exists. Module-first with a clean boundary costs nothing and preserves the extraction option.

References

ACT-ADR-002 — Event type and ActivityDefinition definition format
ACT-ADR-003 — Rule vs. Instruction model and DSL
CUST-ADR-001 — Workplans as repository artefacts (canon/architecture/)
CUST-TFE-SCOPE-2026-000001 — task-flow-engine extraction plan (canon/projects/)
activity-core INTENT.md (to be written)
activity-core WP-0003 (to be written)
