diff --git a/SCOPE.md b/SCOPE.md index 7f838f9..4c3850f 100644 --- a/SCOPE.md +++ b/SCOPE.md @@ -1,7 +1,7 @@ --- domain: capabilities repo: activity-core -updated: "2026-06-03" +updated: "2026-06-16" --- # SCOPE @@ -16,7 +16,8 @@ updated: "2026-06-03" activity-core is the org-wide Event Bridge for the Coulomb organization — a rule-governed event loop that receives time-based and domain events, evaluates declarative rules and LLM instructions against current org context, and emits -structured task sets to issue-core. +structured task, report, and evidence outputs without owning downstream task +lifecycle. --- @@ -27,8 +28,11 @@ An `ActivityDefinition` (a markdown file checked into a repo) declares a trigger resolve before evaluation, and a set of rules and instructions that determine what tasks to create. When triggered, a durable Temporal workflow loads the definition, resolves context, evaluates the rule/instruction set, and emits task -creation requests to issue-core. Everything is auditable: the spawn log records -the triggering event, matched rule, and resulting task references. +creation requests to issue-core or configured dry-run/audit sinks. Instructions +may also emit validated reports, and selected context resolvers may emit compact +non-secret evidence. Everything is auditable: the spawn log records the +triggering event, matched rule/instruction metadata, model/prompt hash where +applicable, and resulting task references. The two evaluation modes: - **Rule** — deterministic condition (sandboxed Python-like DSL) → fixed task @@ -48,21 +52,33 @@ The two evaluation modes: attribute schemas, example payloads, and intent documentation. Curator-gating configurable per runtime environment. - **Trigger types**: 5-field cron with timezone and misfire policy; one-off - scheduled datetime; event-type subscription via NATS. + scheduled datetime; event-type subscription via NATS; manual one-shot API + trigger; one-shot schedule smoke tests for recurring definitions. 
- **Context resolution adapters**: repo-scoping (repository capability queries), - state hub (domain and workstream state), extensible for other sources. + State Hub (domain/workstream state, SBOM status, daily triage digest, coding + retro read model), and ops inventory (bounded HTTP/HTTPS probes of a + non-secret service inventory). The adapter registry is extensible for other + sources. - **Rule evaluator**: sandboxed AST walker for Python-like boolean expressions over event attributes and resolved context. Rule actions support safe `context.*` / `event.*` interpolation and explicit `for_each` per-item binding. No `exec()`. - **Instruction executor**: trusted-field prompt rendering, LLM call via - llm-connect, structured output validation, optional curator review queue, - and deterministic report sinks. + llm-connect, structured output validation, bounded validation-failure + artifacts for report instructions, review-required audit metadata, and + deterministic report sinks. A real downstream review queue is not implemented + in this repo. - **Task emission adapter**: abstraction over issue-core; current transport is - REST; designed to migrate to NATS subscription without code changes. + REST, with `ISSUE_SINK_TYPE=null` for dry-run/audit mode. It is designed to + migrate to a durable issue-core-owned NATS command boundary when issue-core + provides that contract. - **Report sinks**: instruction report outputs can be persisted to bounded local working memory and posted as State Hub progress events. These are reporting outputs, not task lifecycle ownership. +- **Ops evidence sinks**: `ops-inventory` context sources can post compact + non-secret `ops_inventory_probe` summaries to State Hub. Inter-Hub submission + is present only as a gated/deferred sink result until operator-owned + `OPS_HUB_KEY` custody and widget mapping are ready. 
- **Spawn audit log**: every task emission recorded with rule/instruction id, triggering event id, model and prompt hash (instructions), issue-core task ref. - **Webhook receiver**: HTTP endpoint normalising inbound Gitea/GitHub webhook @@ -84,6 +100,14 @@ The two evaluation modes: coordinated changes belong to project-core (future). - **Execution of automatable tasks** — Temporal Activities that do real work (run a scan, apply a patch, call an API) live in per-repo workers, not here. +- **General ops execution** — Kubernetes, SSH, tunnel, authenticated service + checks, secret custody, OpenBao writes, and Inter-Hub widget/API-key + provisioning belong to the owning operational repos and operator workflows. + activity-core may record non-secret probe evidence; it must not become the ops + control plane. +- **Service inventory authority** — the Custodian inventory remains owned by + the custodian/state-hub surface. activity-core may read a projected + non-secret snapshot. - **Event broker hosting** — NATS JetStream is org infrastructure; activity-core consumes it but does not own its lifecycle. - **Temporal server hosting** — activity-core uses the Temporal SDK; the server @@ -101,6 +125,9 @@ The two evaluation modes: structured tasks in the right repos." - You need one-off future task scheduling without a separate reminder system. - You want an auditable record of what triggered what and why. +- You need a scheduled, non-secret evidence note proving that declared service + endpoints or access paths were observed, without executing privileged ops + commands. - You are replacing scattered bespoke cron jobs and manual coordination with a governed, observable automation layer. @@ -117,29 +144,45 @@ The two evaluation modes: ## Current State -- **Status**: active production-backed service. Foundation, triggers/ops, - event bridge, Railiance deployment, and the production service workplans are - complete. 
The stale March WP-0002 handoff note has been reconciled and - archived. +- **Status**: active production-backed service with two visible open gates: + `ACTIVITY-WP-0006` still waits on three clean consecutive scheduled daily + triage runs and calibration feedback, and `ACTIVITY-WP-0008` is blocked until + Helix Forge publishes the upstream `coding_retro` read model needed to enable + the Saturday schedule. `ACTIVITY-WP-0007` is finished: the bounded + ops-inventory probe/evidence slice has live Railiance evidence. - **Implementation**: core is functional. `RunActivityWorkflow`, - `TaskExecutorWorkflow` (stub), PostgreSQL schema, Temporal Schedules, NATS - Event Router, FastAPI admin API, Prometheus metrics, event type registry, - markdown ActivityDefinition parser/sync, rule evaluator, instruction - executor, context resolvers, issue sink, report sinks, Kubernetes deployment, - and operational runbook are all implemented. -- **Operational proof**: the daily State Hub WSJF triage cutover has completed - far enough that activity-core is now the trusted scheduled substrate for the - routine report. Recent hardening fixed the State Hub SBOM resolver contract, - made slow LLM activity timeouts configurable, and added safe rule action - interpolation plus explicit `for_each` binding for per-repo SBOM staleness - tasks. -- **Stability**: construction risk has shifted to operational hardening risk. - The full test suite passed on 2026-06-03 (`125 passed, 1 skipped`). The - remaining work is mostly observability, status-canon adaptation, contract - documentation, and broader production adoption rather than first - implementation. -- **Next**: `ACTIVITY-WP-0006` — post-triage operational hardening and scope - alignment. 
+ `TaskExecutorWorkflow` (stub), PostgreSQL schema, Temporal Schedules and smoke + schedules, NATS Event Router, FastAPI admin API, Prometheus metrics, event + type registry, markdown ActivityDefinition parser/sync, rule evaluator, + instruction executor, context resolvers, issue sink, report sinks, ops + evidence sink, Kubernetes deployment, and operational runbook are all + implemented. +- **Current definitions**: `weekly-sbom-staleness` is enabled and demonstrates + the deterministic rule/fan-out path. `weekly-coding-retro` is present and + tested but intentionally disabled until live `coding_retro` evidence exists. + Railiance projects the daily State Hub WSJF triage definition and the disabled + ops-service-inventory probe definition from the runtime bundle. +- **Operational proof**: the State Hub daily WSJF triage path has produced + validated reports and working-memory notes, but the calibration gate is not + closed. A 2026-06-16 recheck found State Hub `daily_triage` progress and + working-memory `daily-triage-*` notes only through 2026-06-06, so there is not + yet evidence for three clean consecutive scheduled runs after the June 7 + runtime projection failure. The ops inventory probe path has live fallback + evidence in State Hub; Inter-Hub per-entity submission remains deferred. +- **Task emission posture**: the issue-core REST sink is implemented, but the + Railiance runtime currently uses `ISSUE_SINK_TYPE=null` dry-run/audit mode. + Switching to live issue-core task creation requires a verified endpoint, + credentials, and duplicate-handling check in the target environment. +- **Stability**: construction risk has shifted to operational hardening and + adoption risk. The last recorded full-suite pass in the workplans was + 2026-06-04 (`128 passed, 1 skipped`), with later targeted coverage added for + ops inventory, ops evidence sinks, Railiance projection wiring, and weekly + coding retro parsing/rule behavior. 
+- **Next**: close `ACTIVITY-WP-0006-T03` with real scheduled-run calibration + evidence; close `ACTIVITY-WP-0008-T03` once upstream `coding_retro` publication + exists and the dry-run/duplicate check passes; decide when to move selected + task/report/evidence sinks from dry-run or fallback mode to their intended + live backends. --- @@ -159,9 +202,9 @@ database, the project planner, or a general execution worker. The local workplan explicitly rehomes execution responsibility. One boundary nuance is now explicit: activity-core may post State Hub progress -events as a configured report sink. That is acceptable because it records the -result of an activity-core activation; it is not ownership of State Hub state, -task lifecycle, or workstream planning. +events as a configured report or evidence sink. That is acceptable because it +records the result of an activity-core activation; it is not ownership of State +Hub state, task lifecycle, or workstream planning. The main drift risk is convenience creep: adding direct task tracking, project-phase state, or bespoke operational scripts because the Temporal @@ -169,27 +212,58 @@ substrate is already nearby. Future work should prefer declarative ActivityDefinitions, bounded context resolvers, and outbound adapters over new one-off control paths. +## Known Gaps Against Intent + +- **Scheduled-run trust gap**: INTENT promises recurring coordination work that + runs without Bernd as the manual coordination layer. The daily triage path is + implemented, but its current calibration task still lacks three clean + consecutive scheduled runs after the June 7 runtime failure. Until that closes, + daily triage remains a production-backed capability with an evidence gap, not + a fully proven standing substrate. +- **Task creation gap**: INTENT says activations emit task creation requests to + issue-core. The REST sink exists, but Railiance is still in `ISSUE_SINK_TYPE=null` + mode. 
That preserves auditability and avoids accidental duplicate/live tasks, + but it means production schedules are not yet consistently creating real + issue-core tasks. +- **Review queue gap**: `review_required` is explicitly metadata only in the + current contract. No issue-core review queue integration exists here, so any + future queue routing needs a downstream issue-core contract before high-impact + instruction outputs rely on it. +- **Evidence backend posture**: the State Hub fallback evidence path is the + accepted current backend for `ops_inventory_probe`. Inter-Hub/ops-hub + submission is deliberately deferred behind `OPS_HUB_KEY`, widget mapping, and + operator approval, so per-entity ops evidence publication is future work. +- **Execution-boundary residue**: `TaskExecutorWorkflow` is still registered as + a stub that writes a done `task_instances` row. It should remain inert or be + removed/re-homed before it attracts real execution work, because execution is + explicitly outside activity-core's intent. +- **API exposure posture**: the FastAPI surface stays ClusterIP-only for now. + External ingress remains future work until an authenticated access policy is + designed. + --- ## How It Fits ``` -[NATS JetStream] ← publishers: state hub, Gitea webhooks, Temporal signals, cron +[NATS JetStream] ← publishers: State Hub, Gitea webhooks, Temporal signals, cron ↓ [activity-core] ← event type registry, rule evaluator, instruction executor [activity-core] → [issue-core] → [repos/services] -[activity-core] → [report sinks] +[activity-core] → [report/evidence sinks] → [State Hub / working memory / future Inter-Hub] ``` - **Upstream**: NATS (event bus), Temporal (durable workflow engine), PostgreSQL - (definitions and audit log), repo-scoping (context adapter), state hub (context + (definitions and audit log), repo-scoping (context adapter), State Hub (context adapter and event publisher). -- **Downstream**: issue-core (task management) and configured report sinks. 
+- **Downstream**: issue-core (task management) and configured report/evidence sinks. Agents and humans pick up tasks from issue-core and do the actual work. + Railiance may use the null sink for dry-run/audit mode until live issue-core + emission is approved. - **Coordinates with**: the state hub delegates maintenance automations to activity-core by publishing lifecycle events or by being resolved as context. - activity-core may post progress events as report outputs, but it does not own - State Hub task/workstream state. + activity-core may post progress events as report/evidence outputs, but it + does not own State Hub task/workstream state. --- @@ -203,6 +277,11 @@ new one-off control paths. by a sandboxed AST walker. - **Instruction** — LLM-evaluated task generation with trusted-field prompt interpolation and structured output schema enforcement. +- **Report sink** — configured persistence for instruction reports, currently + working-memory markdown notes and State Hub progress events. +- **Evidence sink** — configured persistence for compact non-secret resolver + evidence, currently State Hub progress for ops inventory probes; Inter-Hub is + a deferred gated target. - **Event type** — a registered, schema-documented category of event (e.g. `org.repo.registered`). Publisher-declared; curator-gated per environment. - **Spawn audit trail** — activity-core's local record of what tasks were emitted, @@ -219,8 +298,12 @@ new one-off control paths. - `issue-core` (formerly issue-facade) — downstream task management; receives all task emission from activity-core. - `repo-scoping` — context adapter for repository capability queries. -- `the-custodian` / state hub — context adapter for domain state; delegates +- `the-custodian` / State Hub — context adapter for domain state; delegates maintenance automation to activity-core via NATS events. +- `llm-connect` — instruction execution backend for judgement-oriented reports + such as daily State Hub WSJF triage. 
+- `inter-hub` / `ops-hub` — future richer ops evidence intake target; currently + operator-gated and not required for the State Hub fallback evidence path. - `rules-core` (future extraction) — the rule evaluator and instruction executor module, currently in `src/activity_core/rules/`. - `project-core` (future) — project and initiative management; will use @@ -248,7 +331,10 @@ new one-off control paths. `src/activity_core/activities.py` (Temporal activities), `src/activity_core/event_router.py` (NATS → Temporal), `src/activity_core/schedule_manager.py` (Temporal Schedules), - `src/activity_core/api.py` (FastAPI admin). + `src/activity_core/api.py` (FastAPI admin), + `src/activity_core/report_sinks.py` (instruction reports), + `src/activity_core/ops_evidence_sinks.py` (ops evidence), + and `src/activity_core/context_resolvers/` (external context adapters). - Definition files: `event-types/`, `activity-definitions/`, and `tasks/`. - Dev environment: `docker-compose.dev.yml` (Temporal + PostgreSQL + NATS). - Entry points: `uv run python -m activity_core.worker` (Temporal worker), @@ -264,6 +350,7 @@ title: Durable event-triggered task factory description: > Org-wide Event Bridge that receives time-based and domain events, evaluates declarative rules and LLM instructions against current org context, and emits - structured task sets to issue-core with a full spawn audit trail. -keywords: [temporal, workflow, event-bridge, task, cron, event, rule, instruction, org-automation] + structured task, report, and evidence outputs with a full spawn/report audit + trail while leaving task lifecycle ownership downstream. 
+keywords: [temporal, workflow, event-bridge, task, report, evidence, cron, event, rule, instruction, org-automation] ``` diff --git a/docs/adr/adr-003-rule-instruction-model.md b/docs/adr/adr-003-rule-instruction-model.md index 5b3750f..5a5505b 100644 --- a/docs/adr/adr-003-rule-instruction-model.md +++ b/docs/adr/adr-003-rule-instruction-model.md @@ -216,11 +216,21 @@ it. The output schema must define `List[TaskSpec]` or a compatible envelope. #### `review_required: true` -When set, the instruction's proposed task list is written to a **pending review -queue** in issue-core rather than directly created. A human or curator agent -reviews and approves/rejects before tasks are materialised. This is the default -for instructions that create high-impact tasks (cross-repo changes, security -responses, production operations). +When set today, the instruction's task/report output is marked with +`review_required=true` in activity-core audit metadata. For report-producing +instructions, this flag is also persisted in configured report sinks so an +operator can distinguish validated-but-review-worthy output from routine +output. + +activity-core does **not** currently route proposed tasks to a pending review +queue. That queue must be owned by issue-core, because issue-core owns task +lifecycle state. Until issue-core exposes a review contract, `review_required` +is metadata only; it must not be treated as evidence that live task creation was +held for approval. + +Future issue-core review integration may use the same field, but that change +must update the issue sink contract and tests before any ActivityDefinition +relies on queue routing. #### Evaluation semantics @@ -286,7 +296,8 @@ This boundary makes future extraction to `rules-core` a packaging exercise, not tasks" behaviour is replaced by explicit rule blocks. - A new `RuleEvaluator` class (AST walker) is added to `src/activity_core/rules/`. 
- A new `InstructionExecutor` class handles prompt rendering, LLM call, output - validation, and review queue routing. + validation, and review-required audit metadata. Pending review queue routing + remains a future issue-core integration. - Integration tests for rule evaluation use fixture JSON; no running Temporal required. - The `task_spawn_log` table is added to the Postgres schema (new Alembic migration). - ActivityDefinition files that omit both `rules` and `instructions` are valid diff --git a/docs/conventions.md b/docs/conventions.md index 304f727..5b0011e 100644 --- a/docs/conventions.md +++ b/docs/conventions.md @@ -18,7 +18,7 @@ extension point `af654abb`). | Queue name | Registered workers | |---|---| | `orchestrator-tq` | `RunActivityWorkflow` and all its activities (`load_activity_definition`, `resolve_context`, `log_run`) | -| `task-execution-tq` | `TaskExecutorWorkflow` and all concrete task type workflows | +| `task-execution-tq` | `TaskExecutorWorkflow` compatibility stub only; real execution belongs in per-repo workers | **Rule:** a workflow and its activities must be registered on the same task queue. Cross-queue activity calls require an explicit `task_queue` argument on @@ -60,6 +60,12 @@ A single process may run workers for multiple task queues, but each `Worker` instance is bound to one task queue. Use separate `Worker` instances for `orchestrator-tq` and `task-execution-tq`. +`TaskExecutorWorkflow` is not a production execution surface for activity-core. +It exists only as a compatibility/idempotency stub that writes a synthetic +`task_instances` row in older tests and dev flows. Do not add concrete task +execution logic here; execution ownership belongs to per-repo workers or a +future execution-owned repo/workplan. 
+ --- ## Search attributes diff --git a/history/2026-06-16-intent-gap-analysis.md b/history/2026-06-16-intent-gap-analysis.md new file mode 100644 index 0000000..53dfa45 --- /dev/null +++ b/history/2026-06-16-intent-gap-analysis.md @@ -0,0 +1,118 @@ +--- +type: history +title: "activity-core INTENT gap analysis" +date: "2026-06-16" +author: codex +repo: activity-core +related_workplan: ACTIVITY-WP-0009 +--- + +# activity-core INTENT Gap Analysis - 2026-06-16 + +## Context + +This note preserves the findings from a repository review against `INTENT.md`. +The review refreshed `SCOPE.md` for the current repo state and identified the +remaining gaps between the intended Event Bridge boundary and the implemented / +deployed surface. + +Files and surfaces reviewed: + +- `INTENT.md` +- `SCOPE.md` +- `src/activity_core/` +- `activity-definitions/` +- `docs/runbook.md` +- `docs/issue-core-emission-boundary.md` +- `k8s/railiance/` +- `workplans/ACTIVITY-WP-0006-post-triage-operational-hardening.md` +- `workplans/ACTIVITY-WP-0007-ops-inventory-probe-runner.md` +- `workplans/ACTIVITY-WP-0008-weekly-coding-retro.md` + +## Summary + +activity-core matches the core INTENT boundary in shape: it owns trigger +durability, context resolution, rule/instruction evaluation, outbound +task/report/evidence emission, and local audit records. It still must avoid +owning task lifecycle, project state, privileged ops execution, or service +inventory authority. + +The current implementation has grown a useful bounded report/evidence surface: +instruction reports can write working-memory notes and State Hub progress, and +`ops-inventory` context sources can emit compact non-secret +`ops_inventory_probe` summaries. This is still consistent with INTENT as long as +those outputs remain records of activity-core activations rather than an +authoritative task, project, or ops control plane. + +## Findings + +### 1. 
Scheduled-run trust gap + +`INTENT.md` expects recurring coordination work to run without Bernd as the +manual coordination layer. The daily State Hub WSJF triage path is implemented +and has produced validated reports, but `ACTIVITY-WP-0006-T03` still lacks +three clean consecutive scheduled runs after the June 7 runtime projection +failure. + +Current evidence as of 2026-06-16: + +- State Hub `daily_triage` progress only shows activity-core entries through + 2026-06-06. +- `/home/worsch/the-custodian/memory/working` only has `daily-triage-*` notes + for 2026-06-02 through 2026-06-06. + +Impact: daily triage is production-backed, but not yet fully proven as a +standing substrate. + +### 2. Live task creation gap + +`INTENT.md` says each activation emits task creation requests to issue-core and +records only the spawn audit trail. The REST issue sink exists, but Railiance is +currently configured with `ISSUE_SINK_TYPE=null`, so production runs record +synthetic audit references instead of consistently creating live issue-core +tasks. + +Impact: the task emission boundary is implemented but not yet broadly proven in +the production deployment. + +### 3. Review queue gap + +The original ADR text described `review_required` as routing instruction output +to a pending review queue. Current code records `review_required` in +report/spawn metadata but does not integrate with an issue-core review queue. + +Impact: current behavior is safe as metadata. As of the ACTIVITY-WP-0009 +implementation pass, ADR-003 and SCOPE.md have been aligned to that behavior. + +### 4. Evidence backend gap + +The State Hub fallback evidence path works for `ops_inventory_probe`, and +`ACTIVITY-WP-0007` has live Railiance evidence. Inter-Hub / ops-hub submission +is intentionally deferred behind operator-owned `OPS_HUB_KEY` custody, widget +mapping, and approval. 
+ +Impact: activity-core can preserve non-secret continuity evidence, but richer +per-entity ops evidence publication is not yet live. + +### 5. Execution-boundary residue + +`TaskExecutorWorkflow` remains registered as a stub that persists a done +`task_instances` row. INTENT explicitly says activity-core must not execute the +work or track lifecycle state. + +Impact: low immediate risk because the workflow is inert, but it is an attractive +wrong hook for future execution creep. + +### 6. API exposure gap + +The FastAPI admin surface is useful for internal CRUD and manual triggers. +Railiance docs keep it as ClusterIP until an authenticated ingress and access +policy are chosen. + +Impact: operationally acceptable for now, but production access posture remains +an explicit decision. + +## Follow-up + +`workplans/ACTIVITY-WP-0009-intent-gap-closure.md` was created to turn these +findings into tracked closure work. diff --git a/src/activity_core/workflows.py b/src/activity_core/workflows.py index 3c3aa08..8e804a6 100644 --- a/src/activity_core/workflows.py +++ b/src/activity_core/workflows.py @@ -209,11 +209,12 @@ class RunActivityWorkflow: @workflow.defn class TaskExecutorWorkflow: - """Child workflow that executes one concrete task instance. + """Compatibility stub for legacy task-instance workflows. - Stub behaviour: persists a task_instances row with status=done and - returns immediately. Real task execution logic replaces this in a - later workstream. + This is not a production execution surface for activity-core. It persists a + task_instances row with status=done and returns immediately so legacy/dev + flows keep their idempotency behavior. Real task execution belongs in + per-repo workers or a future execution-owned repo/workplan, not here. task_id is derived deterministically from the workflow's own ID so persist_task_instance retries remain idempotent. 
@@ -221,7 +222,7 @@ class TaskExecutorWorkflow:
 
     @workflow.run
     async def run(self, run_id: str, task_type: str, params: dict) -> dict:
-        # Derive a stable task_id from this workflow's own ID.
+        # Keep the stub idempotent without implying task lifecycle ownership.
         task_id = str(
             uuid.uuid5(uuid.NAMESPACE_URL, workflow.info().workflow_id)
         )
diff --git a/tests/test_issue_sink.py b/tests/test_issue_sink.py
index ae9f3e4..9a6c35e 100644
--- a/tests/test_issue_sink.py
+++ b/tests/test_issue_sink.py
@@ -70,6 +70,7 @@ def test_issue_core_rest_sink_posts_task_contract(monkeypatch) -> None:
             "timeout": 10.0,
         }
     ]
+    assert "review_required" not in posts[0]["json"]
 
 
 @pytest.mark.asyncio
diff --git a/workplans/ACTIVITY-WP-0009-intent-gap-closure.md b/workplans/ACTIVITY-WP-0009-intent-gap-closure.md
new file mode 100644
index 0000000..1eb0686
--- /dev/null
+++ b/workplans/ACTIVITY-WP-0009-intent-gap-closure.md
@@ -0,0 +1,250 @@
+---
+id: ACTIVITY-WP-0009
+type: workplan
+title: "Intent gap closure"
+domain: custodian
+repo: activity-core
+status: blocked
+owner: codex
+topic_slug: custodian
+created: "2026-06-16"
+updated: "2026-06-18"
+state_hub_workstream_id: "d64cfbba-6da7-4737-afb9-866afa0e9cda"
+---
+
+# ACTIVITY-WP-0009 - Intent gap closure
+
+## Context
+
+The 2026-06-16 review of activity-core against `INTENT.md` found that the repo
+matches the intended Event Bridge shape, but several production and contract
+gaps remain before the implementation fully satisfies the operational promise:
+
+- recurring scheduled work must be trusted without manual coordination
+- live task creation must be proven through issue-core, not only null-sink audit
+- `review_required` semantics must either be implemented or documented as
+  metadata only
+- ops evidence must either remain explicitly fallback-first or activate the
+  Inter-Hub / ops-hub backend behind operator-owned secrets
+- the `TaskExecutorWorkflow` stub must not become a back door into execution
+  ownership
+- the internal
FastAPI surface needs an explicit production access decision + +The preserved analysis lives in: + +`history/2026-06-16-intent-gap-analysis.md` + +## Close Daily Triage Scheduled-Run Trust Gap + +```task +id: ACTIVITY-WP-0009-T01 +status: wait +priority: high +state_hub_task_id: "7012e4fd-2530-49b7-9c2f-1d949809a144" +``` + +Close the scheduled-run trust gap identified in `ACTIVITY-WP-0006-T03`. + +Acceptance criteria: + +- activity-core has three clean consecutive scheduled daily State Hub WSJF + triage runs after the June 7 runtime projection failure +- each run has matching Temporal workflow history, `activity_runs` row, State + Hub `daily_triage` progress, and working-memory report note +- calibration feedback is recorded in State Hub +- `ACTIVITY-WP-0006-T03` can move from `wait` to `done` + +Current wait reason: as of 2026-06-16, State Hub `daily_triage` progress and +working-memory `daily-triage-*` notes only show activity-core evidence through +2026-06-06. + +2026-06-18 update: activity-core now consumes the verified in-cluster +llm-connect Service URL in `k8s/railiance/20-runtime.yaml`: +`LLM_CONNECT_URL=http://llm-connect.activity-core.svc.cluster.local:8080` with +`LLM_CONNECT_TIMEOUT_SECONDS=300`. This removes the activity-core repo-side URL +gap. Closure still waits on the operator-owned provider Secret for llm-connect, +a schema-valid fixture smoke, and three clean scheduled daily triage runs with +matching State Hub and working-memory evidence. + +2026-06-18 follow-up: State Hub message +`6a098e1e-65de-4309-ab4a-446aba2f3587` reports that the llm-connect side is now +complete: the provider Secret has a populated key count and the in-namespace +fixture smoke passed. The remaining work is the activity-core / Railiance +runtime reconciliation and daily-triage evidence collection path captured in +`ACTIVITY-WP-0010`. 
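The acceptance criteria above reduce to a small check over collected evidence. The following is a hedged sketch only: the evidence labels in `REQUIRED_EVIDENCE`, both function names, and the 2026-06-07 cutoff date are assumptions made for illustration, not identifiers or tooling from activity-core or State Hub.

```python
from datetime import date, timedelta

# Hypothetical labels for the four evidence kinds each run must have.
REQUIRED_EVIDENCE = frozenset(
    {
        "temporal_history",
        "activity_runs_row",
        "state_hub_progress",
        "working_memory_note",
    }
)


def clean_days(evidence: dict[date, set[str]]) -> set[date]:
    """Keep only the days on which every required evidence kind is present."""
    return {day for day, kinds in evidence.items() if REQUIRED_EVIDENCE <= kinds}


def has_consecutive_clean_runs(
    days: set[date], after: date, required: int = 3
) -> bool:
    """Return True if `required` back-to-back clean days exist after `after`."""
    streak = 0
    previous = None
    for day in sorted(d for d in days if d > after):
        # Extend the streak only when this day directly follows the previous one.
        if previous is not None and day - previous == timedelta(days=1):
            streak += 1
        else:
            streak = 1
        if streak >= required:
            return True
        previous = day
    return False


# Evidence that stops on 2026-06-06 cannot satisfy a gate that starts after
# the assumed 2026-06-07 failure date.
through_june_6 = {date(2026, 6, d): set(REQUIRED_EVIDENCE) for d in range(2, 7)}
gate_closed = has_consecutive_clean_runs(
    clean_days(through_june_6), after=date(2026, 6, 7)
)
assert gate_closed is False
```

A day missing any one evidence kind breaks the streak, which matches the requirement that each run has all four matching records, not just a report note.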
+ +## Promote Issue-Core Task Emission Safely + +```task +id: ACTIVITY-WP-0009-T02 +status: wait +priority: high +state_hub_task_id: "3854677b-32b4-43f8-a6ca-5a2b25a08dd9" +``` + +Move selected production-safe definitions from `ISSUE_SINK_TYPE=null` audit mode +toward real issue-core task creation. + +Acceptance criteria: + +- issue-core endpoint, credentials, and duplicate-handling posture are approved + for the target environment +- one known-safe definition is run first in null-sink mode and its task specs are + reviewed +- the same definition creates exactly the expected issue-core task(s) through + `IssueCoreRestSink` +- `task_spawn_log` records the real returned task references +- rollback to null-sink mode is documented + +Current wait reason: production Railiance currently uses null-sink audit mode; +live issue-core credentials/access and duplicate-handling are not yet verified +for this repo. + +## Resolve Review-Required Contract Drift + +```task +id: ACTIVITY-WP-0009-T03 +status: done +priority: medium +state_hub_task_id: "1eafe5e4-8412-4104-a417-933efe8e7bbd" +``` + +Resolve the mismatch between ADR language and current code for +`review_required`. + +Options: + +- implement an issue-core-owned pending review queue contract and route + `review_required=true` instruction outputs there, or +- update ADR/docs to state that `review_required` is currently audit/report + metadata only + +Acceptance criteria: + +- `docs/adr/adr-003-rule-instruction-model.md`, `SCOPE.md`, and tests describe + the same behavior +- no ActivityDefinition implies a review queue exists unless that downstream + contract is live +- report/spawn metadata remains available for operator review either way + +2026-06-16: Completed by aligning ADR-003 with the implemented behavior: +`review_required` is audit/report metadata only until issue-core owns a pending +review queue contract. 
+`SCOPE.md` already had the same boundary, and
+`tests/test_issue_sink.py` now asserts the REST issue sink does not send a
+`review_required` field as though a review queue existed.
+
+## Decide And Gate Ops Evidence Backend
+
+```task
+id: ACTIVITY-WP-0009-T04
+status: done
+priority: medium
+state_hub_task_id: "61300966-c119-4ebf-af89-a6c50df93ac8"
+```
+
+Decide whether the `ops-inventory` evidence path should remain State Hub
+fallback-first for now or activate Inter-Hub / ops-hub submission.
+
+Acceptance criteria:
+
+- the decision is recorded in State Hub and the relevant docs/workplans
+- if fallback-first remains the chosen mode, docs explicitly say State Hub
+  `ops_inventory_probe` progress is the accepted closure path
+- if Inter-Hub is activated, `OPS_HUB_KEY` is provisioned outside Git, widget /
+  capability mapping is configured, and live submission is tested without
+  printing or storing secrets
+
+2026-06-16: Completed the current posture decision. State Hub decision
+`7c235bbb-ee6f-4c3e-b1dd-74717eac9082` records that State Hub
+`ops_inventory_probe` progress is the accepted live evidence backend for now.
+Inter-Hub / ops-hub per-entity submission remains future work gated on
+operator-owned `OPS_HUB_KEY` custody, widget mapping, and production intake
+smoke tests. `docs/runbook.md` documents the fallback-first posture.
+
+## Remove Or Rehome TaskExecutor Stub Risk
+
+```task
+id: ACTIVITY-WP-0009-T05
+status: done
+priority: medium
+state_hub_task_id: "fbe3e822-1a7c-4fe6-8251-cc8a782b9516"
+```
+
+Reduce the chance that `TaskExecutorWorkflow` attracts real execution work
+inside activity-core.
+
+Acceptance criteria:
+
+- decide whether the stub should stay registered, be removed, or be moved to an
+  execution-owned repo/workplan
+- if it stays, docs and comments explicitly mark it as non-production and
+  outside the activity-core ownership boundary
+- no production ActivityDefinition or workflow path depends on `task_instances`
+  as task lifecycle state
+
+2026-06-16: Completed by deciding to keep `TaskExecutorWorkflow` registered only
+as a compatibility/idempotency stub. `src/activity_core/workflows.py` and
+`docs/conventions.md` now mark it as non-production and outside activity-core's
+execution boundary. No production ActivityDefinition uses `task_instances` for
+task lifecycle state.
+
+## Decide FastAPI Production Access Posture
+
+```task
+id: ACTIVITY-WP-0009-T06
+status: done
+priority: medium
+state_hub_task_id: "99e1e301-296b-4f78-8843-2a39e59ecd7d"
+```
+
+Choose and document the production access posture for the FastAPI admin surface.
+
+Acceptance criteria:
+
+- operator decides whether the API remains ClusterIP-only or receives an
+  authenticated ingress
+- if ingress is chosen, hostname, auth layer, allowed users/agents, and audit
+  expectations are documented before exposure
+- runbook and Railiance deployment docs match the chosen posture
+
+2026-06-16: Completed the current access posture decision. State Hub decision
+`9ffaf7a9-227a-4e39-92e3-cd93d8cda1f2` records that the FastAPI admin surface
+remains ClusterIP-only until a separate authenticated ingress/access-policy work
+item chooses hostname, auth layer, allowed users/agents, and audit expectations.
+`docs/runbook.md` and `k8s/railiance/README.md` now agree on this posture.
+
+## Completion Criteria
+
+- The historical findings are preserved under `history/`.
+- `SCOPE.md`, ADRs, workplans, and implementation agree on activity-core's
+  boundary.
+- Daily scheduled triage has real consecutive-run calibration evidence.
+- At least one production-safe task creation path is proven against issue-core,
+  or null-sink mode is explicitly accepted as the current production posture.
+- Ops evidence backend posture is explicit and tested in the chosen mode.
+- No registered workflow or API path invites activity-core to own execution,
+  task lifecycle, project state, or privileged ops control.
+
+## Implementation Pass - 2026-06-16
+
+Agent-actionable closure is complete for T03, T04, T05, and T06.
+
+Remaining waits:
+
+- T01 waits on real scheduled daily triage run evidence.
+- T02 waits on issue-core production endpoint/credentials and duplicate-handling
+  approval.
+
+Verification:
+
+```bash
+.venv/bin/pytest tests/test_issue_sink.py tests/rules/test_executor.py -k "review_required or issue_core_rest_sink"
+```
+
+Result: 3 passed, 24 deselected.
+
+After this workplan is synced by the custodian operator, run from `~/state-hub`:
+
+```bash
+make fix-consistency REPO=activity-core
+```
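For readers without the repo at hand, the `review_required` assertion that the pytest command in the Verification step selects might look roughly like this. It is an illustrative sketch only: `build_issue_payload` is a hypothetical stand-in for whatever builds the request body behind `IssueCoreRestSink`, and the task-spec fields are assumptions, not the contents of `tests/test_issue_sink.py`.

```python
# Hypothetical stand-in for the payload builder behind IssueCoreRestSink;
# the real function name and task-spec fields may differ.
def build_issue_payload(task_spec: dict) -> dict:
    """Copy a task spec into a REST payload, dropping audit-only metadata."""
    audit_only = {"review_required"}
    return {key: value for key, value in task_spec.items() if key not in audit_only}


def test_rest_payload_omits_review_required():
    # Made-up task spec used purely for illustration.
    spec = {"title": "Example task", "review_required": True}
    payload = build_issue_payload(spec)
    # The field stays audit/report metadata and is never sent downstream.
    assert "review_required" not in payload
    assert payload["title"] == "Example task"
```

Keeping the audit-only keys in one set makes the boundary easy to see: if issue-core later owns a pending review queue contract, removing `review_required` from that set would be the single change that starts forwarding it.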