Align activity-core scope boundaries

2026-06-18 15:11:48 +02:00
parent 78eed5f942
commit 977a3bd97f
7 changed files with 530 additions and 56 deletions

SCOPE.md (175 lines changed)

@@ -1,7 +1,7 @@
 ---
 domain: capabilities
 repo: activity-core
-updated: "2026-06-03"
+updated: "2026-06-16"
 ---
 # SCOPE
@@ -16,7 +16,8 @@ updated: "2026-06-03"
 activity-core is the org-wide Event Bridge for the Coulomb organization — a
 rule-governed event loop that receives time-based and domain events, evaluates
 declarative rules and LLM instructions against current org context, and emits
-structured task sets to issue-core.
+structured task, report, and evidence outputs without owning downstream task
+lifecycle.
 ---
@@ -27,8 +28,11 @@ An `ActivityDefinition` (a markdown file checked into a repo) declares a trigger
 resolve before evaluation, and a set of rules and instructions that determine
 what tasks to create. When triggered, a durable Temporal workflow loads the
 definition, resolves context, evaluates the rule/instruction set, and emits task
-creation requests to issue-core. Everything is auditable: the spawn log records
-the triggering event, matched rule, and resulting task references.
+creation requests to issue-core or configured dry-run/audit sinks. Instructions
+may also emit validated reports, and selected context resolvers may emit compact
+non-secret evidence. Everything is auditable: the spawn log records the
+triggering event, matched rule/instruction metadata, model/prompt hash where
+applicable, and resulting task references.
 The two evaluation modes:
 - **Rule** — deterministic condition (sandboxed Python-like DSL) → fixed task
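Illustrative sketch, not code from this commit: the hunk above says the spawn log records the triggering event, the matched rule/instruction metadata, the model and prompt hash where applicable, and the resulting task references. One way such a record could look is shown below. The class name, field names, and the choice of SHA-256 are assumptions made for illustration; the source does not specify the actual schema or hash function.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


def prompt_hash(rendered_prompt: str) -> str:
    """Return a stable SHA-256 hex digest of the rendered prompt text."""
    return hashlib.sha256(rendered_prompt.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class SpawnRecord:
    # Hypothetical field names mirroring the items the spawn log is said to record.
    event_id: str                        # triggering event
    source_kind: str                     # "rule" or "instruction"
    source_id: str                       # matched rule or instruction id
    task_ref: Optional[str] = None       # None when nothing was created (dry-run)
    model: Optional[str] = None          # instructions only
    prompt_sha256: Optional[str] = None  # instructions only


def record_rule(event_id: str, rule_id: str, task_ref: Optional[str]) -> SpawnRecord:
    """Rules are deterministic, so no model or prompt hash is attached."""
    return SpawnRecord(event_id, "rule", rule_id, task_ref)


def record_instruction(event_id: str, instruction_id: str, model: str,
                       rendered_prompt: str, task_ref: Optional[str]) -> SpawnRecord:
    """Instructions carry the model name and a hash of the prompt, not the prompt itself."""
    return SpawnRecord(event_id, "instruction", instruction_id, task_ref,
                       model, prompt_hash(rendered_prompt))
```

Storing a hash rather than the prompt keeps the audit row compact while still letting two activations be compared for prompt drift.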
@@ -48,21 +52,33 @@ The two evaluation modes:
 attribute schemas, example payloads, and intent documentation.
 Curator-gating configurable per runtime environment.
 - **Trigger types**: 5-field cron with timezone and misfire policy; one-off
-scheduled datetime; event-type subscription via NATS.
+scheduled datetime; event-type subscription via NATS; manual one-shot API
+trigger; one-shot schedule smoke tests for recurring definitions.
 - **Context resolution adapters**: repo-scoping (repository capability queries),
-state hub (domain and workstream state), extensible for other sources.
+State Hub (domain/workstream state, SBOM status, daily triage digest, coding
+retro read model), and ops inventory (bounded HTTP/HTTPS probes of a
+non-secret service inventory). The adapter registry is extensible for other
+sources.
 - **Rule evaluator**: sandboxed AST walker for Python-like boolean expressions
 over event attributes and resolved context. Rule actions support safe
 `context.*` / `event.*` interpolation and explicit `for_each` per-item
 binding. No `exec()`.
 - **Instruction executor**: trusted-field prompt rendering, LLM call via
-llm-connect, structured output validation, optional curator review queue,
-and deterministic report sinks.
+llm-connect, structured output validation, bounded validation-failure
+artifacts for report instructions, review-required audit metadata, and
+deterministic report sinks. A real downstream review queue is not implemented
+in this repo.
 - **Task emission adapter**: abstraction over issue-core; current transport is
-REST; designed to migrate to NATS subscription without code changes.
+REST, with `ISSUE_SINK_TYPE=null` for dry-run/audit mode. It is designed to
+migrate to a durable issue-core-owned NATS command boundary when issue-core
+provides that contract.
 - **Report sinks**: instruction report outputs can be persisted to bounded
 local working memory and posted as State Hub progress events. These are
 reporting outputs, not task lifecycle ownership.
+- **Ops evidence sinks**: `ops-inventory` context sources can post compact
+non-secret `ops_inventory_probe` summaries to State Hub. Inter-Hub submission
+is present only as a gated/deferred sink result until operator-owned
+`OPS_HUB_KEY` custody and widget mapping are ready.
 - **Spawn audit log**: every task emission recorded with rule/instruction id,
 triggering event id, model and prompt hash (instructions), issue-core task ref.
 - **Webhook receiver**: HTTP endpoint normalising inbound Gitea/GitHub webhook
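Illustrative sketch, not the repo's evaluator: the "Rule evaluator" bullet in the hunk above describes a sandboxed AST walker for Python-like boolean expressions over event attributes and resolved context, with no `exec()`. A minimal version of that technique, using only the standard `ast` module, could look like this. The function names, the `RuleError` type, and the exact set of allowed operators are assumptions; the real DSL in `src/activity_core/rules/` may allow more or less.

```python
import ast
import operator

# Comparison operators the walker is willing to evaluate (an assumed allow-list).
_COMPARE = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.In: lambda a, b: a in b,
    ast.NotIn: lambda a, b: a not in b,
}


class RuleError(ValueError):
    """Raised for any name or syntax the walker does not explicitly allow."""


def evaluate_rule(expression: str, event: dict, context: dict) -> bool:
    """Parse the expression and walk its AST; nothing is ever exec'd or eval'd."""
    roots = {"event": event, "context": context}
    tree = ast.parse(expression, mode="eval")
    return bool(_walk(tree.body, roots))


def _walk(node, roots):
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        if node.id not in roots:
            raise RuleError(f"unknown name: {node.id}")
        return roots[node.id]
    if isinstance(node, ast.Attribute):
        # event.foo / context.bar resolve to dict keys, never to real attributes.
        base = _walk(node.value, roots)
        if not isinstance(base, dict) or node.attr not in base:
            raise RuleError(f"unknown attribute: {node.attr}")
        return base[node.attr]
    if isinstance(node, ast.BoolOp):
        values = (_walk(v, roots) for v in node.values)  # lazy, so it short-circuits
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
        return not _walk(node.operand, roots)
    if isinstance(node, ast.Compare):
        left = _walk(node.left, roots)
        for op, comparator in zip(node.ops, node.comparators):
            fn = _COMPARE.get(type(op))
            if fn is None:
                raise RuleError(f"unsupported operator: {type(op).__name__}")
            right = _walk(comparator, roots)
            if not fn(left, right):
                return False
            left = right
        return True
    # Calls, subscripts, lambdas, comprehensions, etc. are all rejected here.
    raise RuleError(f"unsupported syntax: {type(node).__name__}")
```

Because the walker is an allow-list, anything not handled explicitly (function calls, imports, dunder access) falls through to the final `RuleError`, which is the property that makes the approach safer than `eval()`.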
@@ -84,6 +100,14 @@ The two evaluation modes:
 coordinated changes belong to project-core (future).
 - **Execution of automatable tasks** — Temporal Activities that do real work
 (run a scan, apply a patch, call an API) live in per-repo workers, not here.
+- **General ops execution** — Kubernetes, SSH, tunnel, authenticated service
+checks, secret custody, OpenBao writes, and Inter-Hub widget/API-key
+provisioning belong to the owning operational repos and operator workflows.
+activity-core may record non-secret probe evidence; it must not become the ops
+control plane.
+- **Service inventory authority** — the Custodian inventory remains owned by
+the custodian/state-hub surface. activity-core may read a projected
+non-secret snapshot.
 - **Event broker hosting** — NATS JetStream is org infrastructure; activity-core
 consumes it but does not own its lifecycle.
 - **Temporal server hosting** — activity-core uses the Temporal SDK; the server
@@ -101,6 +125,9 @@ The two evaluation modes:
 structured tasks in the right repos."
 - You need one-off future task scheduling without a separate reminder system.
 - You want an auditable record of what triggered what and why.
+- You need a scheduled, non-secret evidence note proving that declared service
+endpoints or access paths were observed, without executing privileged ops
+commands.
 - You are replacing scattered bespoke cron jobs and manual coordination with
 a governed, observable automation layer.
@@ -117,29 +144,45 @@ The two evaluation modes:
 ## Current State
-- **Status**: active production-backed service. Foundation, triggers/ops,
-event bridge, Railiance deployment, and the production service workplans are
-complete. The stale March WP-0002 handoff note has been reconciled and
-archived.
+- **Status**: active production-backed service with two visible open gates:
+`ACTIVITY-WP-0006` still waits on three clean consecutive scheduled daily
+triage runs and calibration feedback, and `ACTIVITY-WP-0008` is blocked until
+Helix Forge publishes the upstream `coding_retro` read model needed to enable
+the Saturday schedule. `ACTIVITY-WP-0007` is finished: the bounded
+ops-inventory probe/evidence slice has live Railiance evidence.
 - **Implementation**: core is functional. `RunActivityWorkflow`,
-`TaskExecutorWorkflow` (stub), PostgreSQL schema, Temporal Schedules, NATS
-Event Router, FastAPI admin API, Prometheus metrics, event type registry,
-markdown ActivityDefinition parser/sync, rule evaluator, instruction
-executor, context resolvers, issue sink, report sinks, Kubernetes deployment,
-and operational runbook are all implemented.
-- **Operational proof**: the daily State Hub WSJF triage cutover has completed
-far enough that activity-core is now the trusted scheduled substrate for the
-routine report. Recent hardening fixed the State Hub SBOM resolver contract,
-made slow LLM activity timeouts configurable, and added safe rule action
-interpolation plus explicit `for_each` binding for per-repo SBOM staleness
-tasks.
-- **Stability**: construction risk has shifted to operational hardening risk.
-The full test suite passed on 2026-06-03 (`125 passed, 1 skipped`). The
-remaining work is mostly observability, status-canon adaptation, contract
-documentation, and broader production adoption rather than first
-implementation.
-- **Next**: `ACTIVITY-WP-0006` — post-triage operational hardening and scope
-alignment.
+`TaskExecutorWorkflow` (stub), PostgreSQL schema, Temporal Schedules and smoke
+schedules, NATS Event Router, FastAPI admin API, Prometheus metrics, event
+type registry, markdown ActivityDefinition parser/sync, rule evaluator,
+instruction executor, context resolvers, issue sink, report sinks, ops
+evidence sink, Kubernetes deployment, and operational runbook are all
+implemented.
+- **Current definitions**: `weekly-sbom-staleness` is enabled and demonstrates
+the deterministic rule/fan-out path. `weekly-coding-retro` is present and
+tested but intentionally disabled until live `coding_retro` evidence exists.
+Railiance projects the daily State Hub WSJF triage definition and the disabled
+ops-service-inventory probe definition from the runtime bundle.
+- **Operational proof**: the State Hub daily WSJF triage path has produced
+validated reports and working-memory notes, but the calibration gate is not
+closed. A 2026-06-16 recheck found State Hub `daily_triage` progress and
+working-memory `daily-triage-*` notes only through 2026-06-06, so there is not
+yet evidence for three clean consecutive scheduled runs after the June 7
+runtime projection failure. The ops inventory probe path has live fallback
+evidence in State Hub; Inter-Hub per-entity submission remains deferred.
+- **Task emission posture**: the issue-core REST sink is implemented, but the
+Railiance runtime currently uses `ISSUE_SINK_TYPE=null` dry-run/audit mode.
+Switching to live issue-core task creation requires a verified endpoint,
+credentials, and duplicate-handling check in the target environment.
+- **Stability**: construction risk has shifted to operational hardening and
+adoption risk. The last recorded full-suite pass in the workplans was
+2026-06-04 (`128 passed, 1 skipped`), with later targeted coverage added for
+ops inventory, ops evidence sinks, Railiance projection wiring, and weekly
+coding retro parsing/rule behavior.
+- **Next**: close `ACTIVITY-WP-0006-T03` with real scheduled-run calibration
+evidence; close `ACTIVITY-WP-0008-T03` once upstream `coding_retro` publication
+exists and the dry-run/duplicate check passes; decide when to move selected
+task/report/evidence sinks from dry-run or fallback mode to their intended
+live backends.
 ---
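Illustrative sketch, not code from this commit: the "Task emission posture" bullet above says the REST sink exists but Railiance runs with `ISSUE_SINK_TYPE=null` in dry-run/audit mode. A sink abstraction that makes that switch a configuration choice could be shaped roughly as follows. Only the `ISSUE_SINK_TYPE` variable and its `null` value come from the source; the `rest` value, the `ISSUE_CORE_URL` variable, the class names, and the default to the null sink are all assumptions.

```python
import os
from typing import Mapping, Optional, Protocol


class IssueSink(Protocol):
    def emit(self, task: dict) -> Optional[str]:
        """Emit one task; return a task reference, or None if nothing was created."""
        ...


class NullIssueSink:
    """Dry-run/audit sink: remembers what would have been sent, creates nothing."""

    def __init__(self) -> None:
        self.audit: list = []

    def emit(self, task: dict) -> Optional[str]:
        self.audit.append(task)
        return None


class RestIssueSink:
    """Placeholder for a live REST transport; the HTTP call is omitted here."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def emit(self, task: dict) -> Optional[str]:
        raise NotImplementedError("HTTP call intentionally left out of this sketch")


def build_issue_sink(env: Optional[Mapping[str, str]] = None) -> IssueSink:
    """Pick a sink from configuration; unknown values fail loudly."""
    env = os.environ if env is None else env
    sink_type = env.get("ISSUE_SINK_TYPE", "null")  # assumed default: safe dry-run
    if sink_type == "null":
        return NullIssueSink()
    if sink_type == "rest":  # assumed value name
        return RestIssueSink(env["ISSUE_CORE_URL"])  # hypothetical variable
    raise ValueError(f"unsupported ISSUE_SINK_TYPE: {sink_type!r}")
```

Keeping workflow code dependent only on `emit()` is what would let the transport change later (for example to a NATS command boundary) without touching callers.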
@@ -159,9 +202,9 @@ database, the project planner, or a general execution worker. The local
 workplan explicitly rehomes execution responsibility.
 One boundary nuance is now explicit: activity-core may post State Hub progress
-events as a configured report sink. That is acceptable because it records the
-result of an activity-core activation; it is not ownership of State Hub state,
-task lifecycle, or workstream planning.
+events as a configured report or evidence sink. That is acceptable because it
+records the result of an activity-core activation; it is not ownership of State
+Hub state, task lifecycle, or workstream planning.
 The main drift risk is convenience creep: adding direct task tracking,
 project-phase state, or bespoke operational scripts because the Temporal
@@ -169,27 +212,58 @@ substrate is already nearby. Future work should prefer declarative
 ActivityDefinitions, bounded context resolvers, and outbound adapters over
 new one-off control paths.
+## Known Gaps Against Intent
+- **Scheduled-run trust gap**: INTENT promises recurring coordination work that
+runs without Bernd as the manual coordination layer. The daily triage path is
+implemented, but its current calibration task still lacks three clean
+consecutive scheduled runs after the June 7 runtime failure. Until that closes,
+daily triage remains a production-backed capability with an evidence gap, not
+a fully proven standing substrate.
+- **Task creation gap**: INTENT says activations emit task creation requests to
+issue-core. The REST sink exists, but Railiance is still in `ISSUE_SINK_TYPE=null`
+mode. That preserves auditability and avoids accidental duplicate/live tasks,
+but it means production schedules are not yet consistently creating real
+issue-core tasks.
+- **Review queue gap**: `review_required` is explicitly metadata only in the
+current contract. No issue-core review queue integration exists here, so any
+future queue routing needs a downstream issue-core contract before high-impact
+instruction outputs rely on it.
+- **Evidence backend posture**: the State Hub fallback evidence path is the
+accepted current backend for `ops_inventory_probe`. Inter-Hub/ops-hub
+submission is deliberately deferred behind `OPS_HUB_KEY`, widget mapping, and
+operator approval, so per-entity ops evidence publication is future work.
+- **Execution-boundary residue**: `TaskExecutorWorkflow` is still registered as
+a stub that writes a done `task_instances` row. It should remain inert or be
+removed/re-homed before it attracts real execution work, because execution is
+explicitly outside activity-core's intent.
+- **API exposure posture**: the FastAPI surface stays ClusterIP-only for now.
+External ingress remains future work until an authenticated access policy is
+designed.
 ---
 ## How It Fits
 ```
-[NATS JetStream] ← publishers: state hub, Gitea webhooks, Temporal signals, cron
+[NATS JetStream] ← publishers: State Hub, Gitea webhooks, Temporal signals, cron
 [activity-core] ← event type registry, rule evaluator, instruction executor
 [activity-core] → [issue-core] → [repos/services]
-[activity-core] → [report sinks]
+[activity-core] → [report/evidence sinks] → [State Hub / working memory / future Inter-Hub]
 ```
 - **Upstream**: NATS (event bus), Temporal (durable workflow engine), PostgreSQL
-(definitions and audit log), repo-scoping (context adapter), state hub (context
+(definitions and audit log), repo-scoping (context adapter), State Hub (context
 adapter and event publisher).
-- **Downstream**: issue-core (task management) and configured report sinks.
+- **Downstream**: issue-core (task management) and configured report/evidence sinks.
 Agents and humans pick up tasks from issue-core and do the actual work.
+Railiance may use the null sink for dry-run/audit mode until live issue-core
+emission is approved.
 - **Coordinates with**: the state hub delegates maintenance automations to
 activity-core by publishing lifecycle events or by being resolved as context.
-activity-core may post progress events as report outputs, but it does not own
-State Hub task/workstream state.
+activity-core may post progress events as report/evidence outputs, but it
+does not own State Hub task/workstream state.
 ---
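Illustrative sketch, not code from this commit: the "Scheduled-run trust gap" entry above hinges on one check, namely whether there are three clean consecutive scheduled daily runs after the June 7 runtime failure. That check can be written down in a few lines. The function name and the idea of representing each clean run by its calendar date are assumptions; how the workplan actually collects and judges the evidence is not stated in the diff.

```python
from datetime import date, timedelta
from typing import Iterable


def has_clean_streak(clean_run_days: Iterable[date], after: date, required: int = 3) -> bool:
    """True if `required` consecutive calendar days with a clean run exist after `after`."""
    days = sorted({d for d in clean_run_days if d > after})
    streak = 0
    previous = None
    for day in days:
        if previous is not None and day - previous == timedelta(days=1):
            streak += 1
        else:
            streak = 1  # a gap (or the first day) restarts the count
        if streak >= required:
            return True
        previous = day
    return False


# Evidence only through 2026-06-06 cannot satisfy a gate that starts after June 7.
observed = [date(2026, 6, 4), date(2026, 6, 5), date(2026, 6, 6)]
gate_open = not has_clean_streak(observed, after=date(2026, 6, 7))
```

With the dates reported in the diff, `gate_open` stays true, which matches the stated conclusion that the calibration gate is not closed.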
@@ -203,6 +277,11 @@ new one-off control paths.
 by a sandboxed AST walker.
 - **Instruction** — LLM-evaluated task generation with trusted-field prompt
 interpolation and structured output schema enforcement.
+- **Report sink** — configured persistence for instruction reports, currently
+working-memory markdown notes and State Hub progress events.
+- **Evidence sink** — configured persistence for compact non-secret resolver
+evidence, currently State Hub progress for ops inventory probes; Inter-Hub is
+a deferred gated target.
 - **Event type** — a registered, schema-documented category of event (e.g.
 `org.repo.registered`). Publisher-declared; curator-gated per environment.
 - **Spawn audit trail** — activity-core's local record of what tasks were emitted,
@@ -219,8 +298,12 @@ new one-off control paths.
 - `issue-core` (formerly issue-facade) — downstream task management; receives
 all task emission from activity-core.
 - `repo-scoping` — context adapter for repository capability queries.
-- `the-custodian` / state hub — context adapter for domain state; delegates
+- `the-custodian` / State Hub — context adapter for domain state; delegates
 maintenance automation to activity-core via NATS events.
+- `llm-connect` — instruction execution backend for judgement-oriented reports
+such as daily State Hub WSJF triage.
+- `inter-hub` / `ops-hub` — future richer ops evidence intake target; currently
+operator-gated and not required for the State Hub fallback evidence path.
 - `rules-core` (future extraction) — the rule evaluator and instruction executor
 module, currently in `src/activity_core/rules/`.
 - `project-core` (future) — project and initiative management; will use
@@ -248,7 +331,10 @@ new one-off control paths.
 `src/activity_core/activities.py` (Temporal activities),
 `src/activity_core/event_router.py` (NATS → Temporal),
 `src/activity_core/schedule_manager.py` (Temporal Schedules),
-`src/activity_core/api.py` (FastAPI admin).
+`src/activity_core/api.py` (FastAPI admin),
+`src/activity_core/report_sinks.py` (instruction reports),
+`src/activity_core/ops_evidence_sinks.py` (ops evidence),
+and `src/activity_core/context_resolvers/` (external context adapters).
 - Definition files: `event-types/`, `activity-definitions/`, and `tasks/`.
 - Dev environment: `docker-compose.dev.yml` (Temporal + PostgreSQL + NATS).
 - Entry points: `uv run python -m activity_core.worker` (Temporal worker),
@@ -264,6 +350,7 @@ title: Durable event-triggered task factory
 description: >
 Org-wide Event Bridge that receives time-based and domain events, evaluates
 declarative rules and LLM instructions against current org context, and emits
-structured task sets to issue-core with a full spawn audit trail.
-keywords: [temporal, workflow, event-bridge, task, cron, event, rule, instruction, org-automation]
+structured task, report, and evidence outputs with a full spawn/report audit
+trail while leaving task lifecycle ownership downstream.
+keywords: [temporal, workflow, event-bridge, task, report, evidence, cron, event, rule, instruction, org-automation]
 ```