activity-core/workplans/ACTIVITY-WP-0009-intent-gap-closure.md


---
id: ACTIVITY-WP-0009
type: workplan
title: "Intent gap closure"
domain: custodian
repo: activity-core
status: blocked
owner: codex
topic_slug: custodian
created: "2026-06-16"
updated: "2026-06-18"
state_hub_workstream_id: "d64cfbba-6da7-4737-afb9-866afa0e9cda"
---
# ACTIVITY-WP-0009 - Intent gap closure
## Context
The 2026-06-16 review of activity-core against `INTENT.md` found that the repo
matches the intended Event Bridge shape, but several production and contract
gaps remain before the implementation fully satisfies the operational promise:
- recurring scheduled work must be trusted without manual coordination
- live task creation must be proven through issue-core, not only null-sink audit
- `review_required` semantics must either be implemented or documented as
metadata only
- ops evidence must either remain explicitly fallback-first or activate the
Inter-Hub / ops-hub backend behind operator-owned secrets
- the `TaskExecutorWorkflow` stub must not become a back door into execution
ownership
- the internal FastAPI surface needs an explicit production access decision

The preserved analysis lives in `history/2026-06-16-intent-gap-analysis.md`.
## Close Daily Triage Scheduled-Run Trust Gap
```task
id: ACTIVITY-WP-0009-T01
status: wait
priority: high
state_hub_task_id: "7012e4fd-2530-49b7-9c2f-1d949809a144"
```
Close the scheduled-run trust gap identified in `ACTIVITY-WP-0006-T03`.
Acceptance criteria:
- activity-core has three clean consecutive scheduled daily State Hub WSJF
triage runs after the June 7 runtime projection failure
- each run has matching Temporal workflow history, `activity_runs` row, State
Hub `daily_triage` progress, and working-memory report note
- calibration feedback is recorded in State Hub
- `ACTIVITY-WP-0006-T03` can move from `wait` to `done`
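
The evidence-matching criteria above lend themselves to a mechanical check. The following is a minimal sketch, not activity-core code: it assumes each scheduled run has already been summarized as a date plus the set of evidence kinds found for it. `RunEvidence`, the evidence labels, and `has_clean_streak` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import FrozenSet, Iterable

# Hypothetical labels for the four evidence sources named in the criteria.
REQUIRED_EVIDENCE = frozenset({
    "temporal_history",
    "activity_runs_row",
    "state_hub_progress",
    "working_memory_note",
})


@dataclass(frozen=True)
class RunEvidence:
    run_date: date
    evidence: FrozenSet[str]

    @property
    def is_clean(self) -> bool:
        # A run counts only if every required evidence kind is present.
        return REQUIRED_EVIDENCE <= self.evidence


def has_clean_streak(runs: Iterable[RunEvidence], after: date, length: int = 3) -> bool:
    """Return True if `length` clean runs fall on consecutive days after `after`."""
    clean_days = sorted({r.run_date for r in runs if r.is_clean and r.run_date > after})
    streak = 0
    previous = None
    for day in clean_days:
        if previous is not None and day - previous == timedelta(days=1):
            streak += 1
        else:
            # First clean day, or a gap: the streak restarts here.
            streak = 1
        if streak >= length:
            return True
        previous = day
    return False
```

A day with any missing evidence kind never enters the clean set, so it breaks the streak the same way a skipped day does.
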

Current wait reason: as of 2026-06-16, State Hub `daily_triage` progress and
working-memory `daily-triage-*` notes show activity-core evidence only through
2026-06-06.

2026-06-18 update: activity-core now consumes the verified in-cluster
llm-connect Service URL in `k8s/railiance/20-runtime.yaml`:
`LLM_CONNECT_URL=http://llm-connect.activity-core.svc.cluster.local:8080` with
`LLM_CONNECT_TIMEOUT_SECONDS=300`. This removes the activity-core repo-side URL
gap. Closure still waits on the operator-owned provider Secret for llm-connect,
a schema-valid fixture smoke, and three clean scheduled daily triage runs with
matching State Hub and working-memory evidence.

2026-06-18 follow-up: State Hub message
`6a098e1e-65de-4309-ab4a-446aba2f3587` reports that the llm-connect side is now
complete: the provider Secret has a populated key count and the in-namespace
fixture smoke passed. The remaining work is the activity-core / Railiance
runtime reconciliation and daily-triage evidence collection path captured in
`ACTIVITY-WP-0010`.
## Promote Issue-Core Task Emission Safely
```task
id: ACTIVITY-WP-0009-T02
status: wait
priority: high
state_hub_task_id: "3854677b-32b4-43f8-a6ca-5a2b25a08dd9"
```
Move selected production-safe definitions from `ISSUE_SINK_TYPE=null` audit mode
toward real issue-core task creation.
Acceptance criteria:
- issue-core endpoint, credentials, and duplicate-handling posture are approved
for the target environment
- one known-safe definition is run first in null-sink mode and its task specs are
reviewed
- the same definition creates exactly the expected issue-core task(s) through
`IssueCoreRestSink`
- `task_spawn_log` records the real returned task references
- rollback to null-sink mode is documented
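
The "exactly the expected task(s)" criterion can be checked by comparing what was reviewed in null-sink mode with what issue-core actually created. The sketch below is a minimal illustration that assumes each task can be identified by a stable string key such as its title; the real identity key depends on the approved duplicate-handling posture. `emission_mismatches` is a hypothetical helper, not part of `IssueCoreRestSink`.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def emission_mismatches(
    audited_keys: Iterable[str],
    created_keys: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Compare task keys reviewed in null-sink mode with tasks created live.

    Returns (missing, unexpected). Both lists are empty only when the live
    run created exactly the audited tasks, with no extras and no duplicates.
    """
    expected = Counter(audited_keys)
    actual = Counter(created_keys)
    # Counter subtraction keeps only positive counts, so duplicates surface
    # as repeated entries in `unexpected`.
    missing = sorted((expected - actual).elements())
    unexpected = sorted((actual - expected).elements())
    return missing, unexpected
```

Using multiset comparison rather than set comparison means a task created twice is reported as unexpected instead of being silently accepted.
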

Current wait reason: production Railiance currently uses null-sink audit mode;
live issue-core credentials/access and duplicate-handling are not yet verified
for this repo.
## Resolve Review-Required Contract Drift
```task
id: ACTIVITY-WP-0009-T03
status: done
priority: medium
state_hub_task_id: "1eafe5e4-8412-4104-a417-933efe8e7bbd"
```
Resolve the mismatch between ADR language and current code for
`review_required`.
Options:
- implement an issue-core-owned pending review queue contract and route
`review_required=true` instruction outputs there, or
- update ADR/docs to state that `review_required` is currently audit/report
metadata only

Acceptance criteria:
- `docs/adr/adr-003-rule-instruction-model.md`, `SCOPE.md`, and tests describe
the same behavior
- no ActivityDefinition implies a review queue exists unless that downstream
contract is live
- report/spawn metadata remains available for operator review either way

2026-06-16: Completed by aligning ADR-003 with the implemented behavior:
`review_required` is audit/report metadata only until issue-core owns a pending
review queue contract. `SCOPE.md` already had the same boundary, and
`tests/test_issue_sink.py` now asserts the REST issue sink does not send a
`review_required` field as though a review queue existed.
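
One way to picture the metadata-only boundary: the field travels with the audit/report record but is stripped before anything is sent to issue-core. The sketch below is illustrative only; `split_task_spec`, `AUDIT_ONLY_FIELDS`, and the example field names are hypothetical and do not describe the actual payload shape of the REST issue sink.

```python
from typing import Any, Dict, Tuple

# Hypothetical set of fields that stay in audit/report metadata only.
AUDIT_ONLY_FIELDS = frozenset({"review_required"})


def split_task_spec(task_spec: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a task spec into (outbound payload, audit/report metadata)."""
    payload = {k: v for k, v in task_spec.items() if k not in AUDIT_ONLY_FIELDS}
    audit = {k: v for k, v in task_spec.items() if k in AUDIT_ONLY_FIELDS}
    return payload, audit
```

Under this split, operator-facing reports can still show `review_required`, while the downstream request carries no field that implies a review queue exists.
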
## Decide And Gate Ops Evidence Backend
```task
id: ACTIVITY-WP-0009-T04
status: done
priority: medium
state_hub_task_id: "61300966-c119-4ebf-af89-a6c50df93ac8"
```
Decide whether the `ops-inventory` evidence path should remain State Hub
fallback-first for now or activate Inter-Hub / ops-hub submission.
Acceptance criteria:
- the decision is recorded in State Hub and the relevant docs/workplans
- if fallback-first remains the chosen mode, docs explicitly say State Hub
`ops_inventory_probe` progress is the accepted closure path
- if Inter-Hub is activated, `OPS_HUB_KEY` is provisioned outside Git, widget /
capability mapping is configured, and live submission is tested without
printing or storing secrets

2026-06-16: Completed the current posture decision. State Hub decision
`7c235bbb-ee6f-4c3e-b1dd-74717eac9082` records that State Hub
`ops_inventory_probe` progress is the accepted live evidence backend for now.
Inter-Hub / ops-hub per-entity submission remains future work gated on
operator-owned `OPS_HUB_KEY` custody, widget mapping, and production intake
smoke tests. `docs/runbook.md` documents the fallback-first posture.
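
The fallback-first gating can be sketched as a small selection function. This is a hedged illustration, not the repo's implementation: `OPS_EVIDENCE_BACKEND` and `select_ops_evidence_backend` are hypothetical names, and only `OPS_HUB_KEY` comes from this workplan. The function checks that the key is present without returning or logging its value.

```python
import os
from typing import Mapping, Optional

STATE_HUB = "state-hub"
INTER_HUB = "inter-hub"


def select_ops_evidence_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the ops evidence backend, defaulting to the State Hub fallback."""
    env = os.environ if env is None else env
    # Hypothetical opt-in switch; Inter-Hub is never chosen implicitly.
    requested = env.get("OPS_EVIDENCE_BACKEND", STATE_HUB)
    # Only test for presence; the secret value itself is never exposed.
    key_present = bool(env.get("OPS_HUB_KEY", "").strip())
    if requested == INTER_HUB and key_present:
        return INTER_HUB
    return STATE_HUB
```

Requiring both an explicit opt-in and a provisioned key keeps the State Hub path as the default even if the secret is mounted early.
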
## Remove Or Rehome TaskExecutor Stub Risk
```task
id: ACTIVITY-WP-0009-T05
status: done
priority: medium
state_hub_task_id: "fbe3e822-1a7c-4fe6-8251-cc8a782b9516"
```
Reduce the chance that `TaskExecutorWorkflow` attracts real execution work
inside activity-core.
Acceptance criteria:
- decide whether the stub should stay registered, be removed, or be moved to an
execution-owned repo/workplan
- if it stays, docs and comments explicitly mark it as non-production and
outside the activity-core ownership boundary
- no production ActivityDefinition or workflow path depends on `task_instances`
as task lifecycle state

2026-06-16: Completed by deciding to keep `TaskExecutorWorkflow` registered only
as a compatibility/idempotency stub. `src/activity_core/workflows.py` and
`docs/conventions.md` now mark it as non-production and outside activity-core's
execution boundary. No production ActivityDefinition uses `task_instances` for
task lifecycle state.
## Decide FastAPI Production Access Posture
```task
id: ACTIVITY-WP-0009-T06
status: done
priority: medium
state_hub_task_id: "99e1e301-296b-4f78-8843-2a39e59ecd7d"
```
Choose and document the production access posture for the FastAPI admin surface.
Acceptance criteria:
- operator decides whether the API remains ClusterIP-only or receives an
authenticated ingress
- if ingress is chosen, hostname, auth layer, allowed users/agents, and audit
expectations are documented before exposure
- runbook and Railiance deployment docs match the chosen posture

2026-06-16: Completed the current access posture decision. State Hub decision
`9ffaf7a9-227a-4e39-92e3-cd93d8cda1f2` records that the FastAPI admin surface
remains ClusterIP-only until a separate authenticated ingress/access-policy work
item chooses hostname, auth layer, allowed users/agents, and audit expectations.
`docs/runbook.md` and `k8s/railiance/README.md` now agree on this posture.
## Completion Criteria
- The historical findings are preserved under `history/`.
- `SCOPE.md`, ADRs, workplans, and implementation agree on activity-core's
boundary.
- Daily scheduled triage has real consecutive-run calibration evidence.
- At least one production-safe task creation path is proven against issue-core,
or null-sink mode is explicitly accepted as the current production posture.
- Ops evidence backend posture is explicit and tested in the chosen mode.
- No registered workflow or API path invites activity-core to own execution,
task lifecycle, project state, or privileged ops control.
## Implementation Pass - 2026-06-16
Agent-actionable closure is complete for T03, T04, T05, and T06.
Remaining waits:
- T01 waits on real scheduled daily triage run evidence.
- T02 waits on issue-core production endpoint/credentials and duplicate-handling
approval.

Verification:
```bash
.venv/bin/pytest tests/test_issue_sink.py tests/rules/test_executor.py -k "review_required or issue_core_rest_sink"
```
Result: 3 passed, 24 deselected.

After this workplan is synced by the custodian operator, run from `~/state-hub`:
```bash
make fix-consistency REPO=activity-core
```