generated from coulomb/repo-seed
Compare commits: `1a279e9f22...main`
81 Commits
| SHA1 |
|---|
| 92629e7a91 | |||
| 951ec56f7a | |||
| 9440d539c6 | |||
| 2ff852da29 | |||
| 30043348f0 | |||
| 18fcce87fe | |||
| 17b787fad0 | |||
| 6c8cb1b7b6 | |||
| ec66e06066 | |||
| 919edd98ac | |||
| bf877b7f0d | |||
| 9be4ddbdb7 | |||
| c5440e8429 | |||
| 53dc0f6e93 | |||
| a70c00a789 | |||
| b41b6034ee | |||
| 960fb05268 | |||
| b7b0b5bf6e | |||
| 14f76fb6d9 | |||
| caa2608092 | |||
| 61f278d643 | |||
| 0e9e18a59a | |||
| 5eb33bd3bb | |||
| 612c226472 | |||
| 0b2c68838e | |||
| 4b5e96d7c1 | |||
| 65ef005c2d | |||
| 0e75aaec01 | |||
| b2e57707a7 | |||
| 88fe359385 | |||
| f90591c5f1 | |||
| cf7a11dcd9 | |||
| 99e5d525a8 | |||
| 8424c13783 | |||
| 864f90f9b9 | |||
| 053d18b24a | |||
| 77af65afb2 | |||
| 0495f8a43f | |||
| c6cad9e7b3 | |||
| a83b117f60 | |||
| ffc0ee2cb7 | |||
| 59b3b73061 | |||
| 4bc5111dfd | |||
| e9a6029ded | |||
| bf4e61f0bf | |||
| 40fa851ec0 | |||
| e0742d18d7 | |||
| ccac285b0a | |||
| a0dcc52353 | |||
| faf5d60ae8 | |||
| adfd1a9067 | |||
| 44987457c1 | |||
| 3a981cc98f | |||
| dbd2fbb11c | |||
| c938b80503 | |||
| 3e93567a53 | |||
| 6f68f8f9ec | |||
| f05c56e202 | |||
| 200ec0c97a | |||
| 42e5ef725c | |||
| a08bd1684f | |||
| 2078915854 | |||
| 23f4956b68 | |||
| 764339e490 | |||
| 17e2e39165 | |||
| 6518ecefce | |||
| 727868a245 | |||
| a279d59f73 | |||
| 23e2316dff | |||
| 206bb336d2 | |||
| 977a3bd97f | |||
| 78eed5f942 | |||
| 717535b62d | |||
| b2816d9776 | |||
| 0554014083 | |||
| b84e474ac5 | |||
| 498d90b965 | |||
| a2a6a30d8b | |||
| 9a72c9f210 | |||
| 517bf9c133 | |||
| 29bf87a44c |
50
.claude/rules/credential-routing.md
Normal file
@@ -0,0 +1,50 @@
# Credential and access routing

**Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
for inference. Run this check **before** requesting secrets, API keys, SSH access,
login tokens, or database passwords — in any repo, not only `ops-warden`.

ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
other credential need belongs to another subsystem. **Do not** message
`ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.

### Lookup (do this first)

```bash
warden route find "<describe your need>" --json
warden route show <catalog-id> --json
```

Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).

| Agent runtime | How to orient |
| --- | --- |
| **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=activity-core` is for coordination, not secret vending |
| **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
| **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |

### Quick routing table

| I need… | Owner | ops-warden executes? |
| --- | --- | --- |
| SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
| API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
| Login / OIDC / MFA | key-cape / Keycloak | No — route only |
| Authorization decision | flex-auth | No — route only |
| activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
| SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |

### Anti-patterns (do not do these)

- `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
- Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
- Pasting secrets into Git, State Hub, workplans, logs, or chat

### Other capabilities (reuse-surface)

Non-credential capabilities are usually discovered through **reuse-surface** federation
(`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
every repo's agent instructions because it is high-frequency, high-risk, and easy to
get wrong.

**Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`
@@ -1,11 +1,11 @@
|
|||||||
## First Session Protocol
|
## First Session Protocol
|
||||||
|
|
||||||
Triggered when `get_domain_summary("custodian")` shows **no workstreams**.
|
Triggered when `get_domain_summary("infotech")` shows **no workstreams**.
|
||||||
The project is registered but work has not yet been structured.
|
The project is registered but work has not yet been structured.
|
||||||
|
|
||||||
**Step 1 — Read, don't write**
|
**Step 1 — Read, don't write**
|
||||||
- `~/the-custodian/canon/projects/custodian/project_charter_v0.1.md` — purpose, scope
|
- `~/the-custodian/canon/projects/infotech/project_charter_v0.1.md` — purpose, scope
|
||||||
- `~/the-custodian/canon/projects/custodian/roadmap_v0.1.md` — planned phases
|
- `~/the-custodian/canon/projects/infotech/roadmap_v0.1.md` — planned phases
|
||||||
- Scan repo root: README, directory structure, existing code or docs
|
- Scan repo root: README, directory structure, existing code or docs
|
||||||
|
|
||||||
**Step 2 — Survey in-progress work**
|
**Step 2 — Survey in-progress work**
|
||||||
@@ -17,7 +17,7 @@ roadmap phase. **Wait for approval before creating.**
|
|||||||
|
|
||||||
**Step 4 — Create workplan file first, then DB record (ADR-001)**
|
**Step 4 — Create workplan file first, then DB record (ADR-001)**
|
||||||
```
|
```
|
||||||
workplans/activity-core-WP-NNNN-<slug>.md ← write this first
|
workplans/ACTIVITY-WP-NNNN-<slug>.md ← write this first
|
||||||
```
|
```
|
||||||
Then register in the hub:
|
Then register in the hub:
|
||||||
```
|
```
|
||||||
@@ -28,7 +28,7 @@ create_task(workstream_id="<id>", title="...", priority="high|medium|low")
|
|||||||
**Step 5 — Record the setup**
|
**Step 5 — Record the setup**
|
||||||
```
|
```
|
||||||
add_progress_event(
|
add_progress_event(
|
||||||
summary="First session: structured custodian into N workstreams, M tasks",
|
summary="First session: structured infotech into N workstreams, M tasks",
|
||||||
event_type="milestone",
|
event_type="milestone",
|
||||||
topic_id="cee7bedf-2b48-46ef-8601-006474f2ad7a",
|
topic_id="cee7bedf-2b48-46ef-8601-006474f2ad7a",
|
||||||
detail={"workstreams": [...], "tasks_created": M}
|
detail={"workstreams": [...], "tasks_created": M}
|
||||||
|
|||||||
@@ -1,5 +1,5 @@
|
|||||||
**Purpose:** Durable task factory built on Temporal. Manages ActivityDefinitions, schedules recurring workflows via Temporal Schedules, routes events via NATS JetStream, and exposes a FastAPI CRUD surface for the custodian domain.
|
**Purpose:** Durable task factory built on Temporal. Manages ActivityDefinitions, schedules recurring workflows via Temporal Schedules, routes events via NATS JetStream, and exposes a FastAPI CRUD surface for the custodian domain.
|
||||||
|
|
||||||
**Domain:** custodian
|
**Domain:** infotech
|
||||||
**Repo slug:** activity-core
|
**Repo slug:** activity-core
|
||||||
**Topic ID:** cee7bedf-2b48-46ef-8601-006474f2ad7a
|
**Topic ID:** cee7bedf-2b48-46ef-8601-006474f2ad7a
|
||||||
|
|||||||
@@ -1,6 +1,7 @@
|
|||||||
## Session Protocol
|
## Session Protocol
|
||||||
|
|
||||||
State Hub: http://127.0.0.1:8000
|
Dev Hub (State Hub API): http://127.0.0.1:8000
|
||||||
|
MCP server name in `~/.claude.json`: `dev-hub`
|
||||||
|
|
||||||
**Step 1 — Orient**
|
**Step 1 — Orient**
|
||||||
|
|
||||||
@@ -10,7 +11,7 @@ cat .custodian-brief.md
|
|||||||
```
|
```
|
||||||
Then call the MCP tool for richer cross-domain context when MCP tools are exposed:
|
Then call the MCP tool for richer cross-domain context when MCP tools are exposed:
|
||||||
```
|
```
|
||||||
get_domain_summary("custodian")
|
get_domain_summary("infotech")
|
||||||
```
|
```
|
||||||
If MCP tools are unavailable in the current agent session, use the REST API:
|
If MCP tools are unavailable in the current agent session, use the REST API:
|
||||||
```bash
|
```bash
|
||||||
@@ -39,11 +40,11 @@ curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
|
|||||||
ls workplans/
|
ls workplans/
|
||||||
```
|
```
|
||||||
For each file with `status: ready`, `active`, or `blocked`, note pending
|
For each file with `status: ready`, `active`, or `blocked`, note pending
|
||||||
`todo`/`in_progress` tasks.
|
`wait`/`todo`/`progress` tasks.
|
||||||
|
|
||||||
**Step 4 — Present brief**
|
**Step 4 — Present brief**
|
||||||
|
|
||||||
1. **Active workstreams** for `custodian` — title, task counts, blocking decisions
|
1. **Active workstreams** for `infotech` — title, task counts, blocking decisions
|
||||||
2. **Pending tasks** from `workplans/` + any `[repo:activity-core]` hub tasks
|
2. **Pending tasks** from `workplans/` + any `[repo:activity-core]` hub tasks
|
||||||
3. **Goal guidance** — if `goal_guidance` in summary:
|
3. **Goal guidance** — if `goal_guidance` in summary:
|
||||||
- `needs_workplan`: surface as top action — *"Repo goal '{title}' has no workplan yet"*
|
- `needs_workplan`: surface as top action — *"Repo goal '{title}' has no workplan yet"*
|
||||||
|
|||||||
@@ -1,7 +1,7 @@
|
|||||||
## Workplan Convention (ADR-001)
|
## Workplan Convention (ADR-001)
|
||||||
|
|
||||||
File location: `workplans/activity-core-WP-NNNN-<slug>.md`
|
File location: `workplans/ACTIVITY-WP-NNNN-<slug>.md`
|
||||||
ID prefix: `ACTIVITY-WP`
|
ID prefix: `ACTIVITY-WP-`
|
||||||
|
|
||||||
Work items originate as files in this repo **before** being registered in the hub.
|
Work items originate as files in this repo **before** being registered in the hub.
|
||||||
|
|
||||||
@@ -12,7 +12,7 @@ repo state, and `finished` when implementation is complete. `stalled` and
|
|||||||
`needs_review` are derived health labels, not stored statuses.
|
`needs_review` are derived health labels, not stored statuses.
|
||||||
|
|
||||||
Closed workplans may be moved to `workplans/archived/` with a completion-date
|
Closed workplans may be moved to `workplans/archived/` with a completion-date
|
||||||
prefix: `YYMMDD-activity-core-WP-NNNN-<slug>.md`. The frontmatter id remains
|
prefix: `YYMMDD-ACTIVITY-WP-NNNN-<slug>.md`. The frontmatter id remains
|
||||||
unchanged; the prefix is only for quick visual reference.
|
unchanged; the prefix is only for quick visual reference.
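
The naming rules above can be checked mechanically. The following is a minimal sketch, assuming that `NNNN` means exactly four digits and that slugs use lowercase letters, digits, and hyphens; neither assumption is stated in the convention, and `parse_workplan_name` is a hypothetical helper rather than part of any existing tool.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: optional YYMMDD archive prefix, then ACTIVITY-WP-NNNN-<slug>.md.
# The four-digit NNNN and the slug alphabet are illustrative assumptions.
WORKPLAN_RE = re.compile(
    r"^(?:(?P<archived>\d{6})-)?"
    r"(?P<id>ACTIVITY-WP-\d{4})-"
    r"(?P<slug>[a-z0-9]+(?:-[a-z0-9]+)*)\.md$"
)


class WorkplanName(NamedTuple):
    workplan_id: str
    slug: str
    archived_on: Optional[str]  # YYMMDD completion-date prefix, if archived


def parse_workplan_name(filename: str) -> Optional[WorkplanName]:
    """Return the parts of a workplan filename, or None if it does not match."""
    match = WORKPLAN_RE.match(filename)
    if match is None:
        return None
    return WorkplanName(match["id"], match["slug"], match["archived"])
```

Because the archive prefix is optional in the pattern, the same parser covers both `workplans/` and `workplans/archived/`, and the extracted id is identical in both cases, which mirrors the rule that the frontmatter id does not change on archival.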
|
||||||
|
|
||||||
Small opportunistic tasks discovered during another session use **Ad Hoc Tasks**:
|
Small opportunistic tasks discovered during another session use **Ad Hoc Tasks**:
|
||||||
@@ -25,4 +25,16 @@ Ecosystem todos from other agents arrive as `[repo:activity-core]` hub tasks —
|
|||||||
visible at session start. Pick one up by creating the workplan file, then registering
|
visible at session start. Pick one up by creating the workplan file, then registering
|
||||||
the workstream.
|
the workstream.
|
||||||
|
|
||||||
|
Task blocks use this shape:
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-NNNN-T01
|
||||||
|
status: wait | todo | progress | done | cancel
|
||||||
|
priority: high | medium | low
|
||||||
|
state_hub_task_id: "<uuid>" # written by fix-consistency — do not edit
|
||||||
|
```
|
||||||
|
|
||||||
|
Status progression is `todo` → `progress` → `done`; use `wait` for waiting or
|
||||||
|
blocked work and `cancel` for stopped work.
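
As an illustration of the status vocabulary above, here is a small sketch of a validator. The function names are hypothetical, and the choice to treat `wait` and `cancel` as sitting outside the main `todo` → `progress` → `done` path is an assumption drawn from the wording, not a documented rule.

```python
from typing import Optional

# Status values copied from the task block shape.
TASK_STATUSES = ("wait", "todo", "progress", "done", "cancel")

# Main progression as described in the text.
MAIN_PATH = ("todo", "progress", "done")


def is_valid_status(status: str) -> bool:
    """True if status is one of the five documented task statuses."""
    return status in TASK_STATUSES


def next_on_main_path(status: str) -> Optional[str]:
    """Return the next status on todo -> progress -> done.

    Returns None for 'done' (end of the path) and for 'wait' / 'cancel',
    which this sketch assumes are side states rather than steps on the path.
    """
    if status not in MAIN_PATH:
        return None
    index = MAIN_PATH.index(status)
    if index + 1 < len(MAIN_PATH):
        return MAIN_PATH[index + 1]
    return None
```

A check like this would reject the older `in_progress` spelling, which is no longer part of the vocabulary.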
|
||||||
|
|
||||||
<!-- Ralph Loop rules and HEUREKA sequence: ~/.claude/CLAUDE.md — do not duplicate here -->
|
<!-- Ralph Loop rules and HEUREKA sequence: ~/.claude/CLAUDE.md — do not duplicate here -->
|
||||||
|
|||||||
@@ -1,18 +1,56 @@
|
|||||||
<!-- custodian-brief: generated by fix-consistency — do not edit manually -->
|
<!-- custodian-brief: generated by fix-consistency — do not edit manually -->
|
||||||
# Custodian Brief — activity-core
|
# Custodian Brief — activity-core
|
||||||
|
|
||||||
**Domain:** custodian
|
**Domain:** infotech
|
||||||
**Last synced:** 2026-06-17 21:59 UTC
|
**Last synced:** 2026-06-29 23:50 UTC
|
||||||
**State Hub:** http://127.0.0.1:8000 *(adjust if running on a remote machine)*
|
**State Hub:** http://127.0.0.1:8000 *(adjust if running on a remote machine)*
|
||||||
|
|
||||||
## Active Workstreams
|
## Active Workstreams
|
||||||
|
|
||||||
|
### Automation schedule inventory Make targets
|
||||||
|
Progress: 0/5 done | workstream_id: `21c73763-9adc-42f6-8fd2-1b8b33c2c770`
|
||||||
|
|
||||||
|
**Open tasks:**
|
||||||
|
- · Task: Define the automation inventory contract `8de24590`
|
||||||
|
- · Task: Implement a non-mutating inventory CLI `538cb9a5`
|
||||||
|
- · Task: Add Make targets `f2001721`
|
||||||
|
- · Task: Document the inventory workflow `f687743b`
|
||||||
|
- · Task: Verify against current repo and live/degraded sources `5317b532`
|
||||||
|
|
||||||
|
### LLM Output Robustness & The Producer Trust Boundary
|
||||||
|
Progress: 3/10 done | workstream_id: `4ef0d53b-1777-41ae-80c6-1b69fdb34726`
|
||||||
|
|
||||||
|
**Open tasks:**
|
||||||
|
- ! Reproduce & Root-Cause The Failure `74fd16a5`
|
||||||
|
*(wait: Local analysis complete: mechanism is the unbounded ~1-per-workstream recommendation list (16 active workstreams; break at char 5268 ~rank 8-9); both first attempt and retry failed. Exact token + finish_reason are unrecoverable from activity-core (complete() drops finish_reason; report cap 4000 < 5268; log cap 2000). Remaining: pull llm-connect producer-side logs on railiance01 (cluster/operator-owned). Does NOT block T02/T03 — mitigation is identical regardless.)*
|
||||||
|
- ► Tests + calibration re-entry `b7b9e07a`
|
||||||
|
- ► Schema + Prompt Redesign For Error Locality `ae67ca8c`
|
||||||
|
- ► Tests + Calibration Re-Entry `c881500b`
|
||||||
|
- · Reproduce & root-cause the 06-26 validation failure `2d3bba00`
|
||||||
|
- · Schema + prompt redesign for error locality `5da6962c`
|
||||||
|
- · Boundary parser — verify & mitigate with quarantine lane `4c408114`
|
||||||
|
|
||||||
### Post-triage operational hardening
|
### Post-triage operational hardening
|
||||||
Progress: 5/6 done | workstream_id: `5646e13a-13af-4724-bca6-3c0d86f96733`
|
Progress: 7/8 done | workstream_id: `5646e13a-13af-4724-bca6-3c0d86f96733`
|
||||||
|
|
||||||
**Open tasks:**
|
**Open tasks:**
|
||||||
- ! Three-Run Calibration Feedback `7cbf0a35`
|
- ! Three-Run Calibration Feedback `7cbf0a35`
|
||||||
|
|
||||||
|
### Adopt State Hub Beachhead Endpoint
|
||||||
|
Progress: 0/2 done | workstream_id: `bbc07f9e-9323-4b2b-b556-c33b37d0b228`
|
||||||
|
|
||||||
|
**Open tasks:**
|
||||||
|
- ! Point STATE_HUB_URL at the beachhead `76b6132d`
|
||||||
|
- ! Retire the bespoke actcore-state-hub-bridge proxy `526c2129`
|
||||||
|
|
||||||
|
### Daily Triage LLM Reconciliation And Evidence
|
||||||
|
Progress: 2/5 done | workstream_id: `f2c73ac6-13f0-4005-82cc-76c7c9f9c8b9`
|
||||||
|
|
||||||
|
**Open tasks:**
|
||||||
|
- ! Run Daily Triage Fixture Smoke `10e0df77`
|
||||||
|
- ! Collect Three Clean Scheduled Runs `dc6b9482`
|
||||||
|
- ! Close Handoff State `ecc57e21`
|
||||||
|
|
||||||
### Intent gap closure
|
### Intent gap closure
|
||||||
Progress: 4/6 done | workstream_id: `d64cfbba-6da7-4737-afb9-866afa0e9cda`
|
Progress: 4/6 done | workstream_id: `d64cfbba-6da7-4737-afb9-866afa0e9cda`
|
||||||
|
|
||||||
@@ -30,6 +68,6 @@ Progress: 2/3 done | workstream_id: `7387fc50-1f2c-471a-9d85-bb085cbd0b63`
|
|||||||
## MCP Orientation (when available)
|
## MCP Orientation (when available)
|
||||||
|
|
||||||
If the state-hub MCP server is reachable, call:
|
If the state-hub MCP server is reachable, call:
|
||||||
`get_domain_summary("custodian")`
|
`get_domain_summary("infotech")`
|
||||||
This provides richer cross-domain context.
|
This provides richer cross-domain context.
|
||||||
If the MCP call fails, use this file as your orientation source.
|
If the MCP call fails, use this file as your orientation source.
|
||||||
|
|||||||
@@ -18,14 +18,17 @@ STATE_HUB_URL=http://127.0.0.1:8000
|
|||||||
# Repo scoping — used by the repo-scoping context adapter. Binds {} on failure.
|
# Repo scoping — used by the repo-scoping context adapter. Binds {} on failure.
|
||||||
REPO_SCOPING_URL=http://127.0.0.1:8020
|
REPO_SCOPING_URL=http://127.0.0.1:8020
|
||||||
# Issue Core — task emission backend.
|
# Issue Core — task emission backend.
|
||||||
ISSUE_CORE_URL=http://127.0.0.1:8010
|
ISSUE_CORE_URL=http://127.0.0.1:8765
|
||||||
|
# Shared ingestion key — must match issue-core's ISSUE_CORE_API_KEY.
|
||||||
|
ISSUE_CORE_API_KEY=
|
||||||
# Sink type: 'rest' (POST to issue-core) or 'null' (discard, for dry-run).
|
# Sink type: 'rest' (POST to issue-core) or 'null' (discard, for dry-run).
|
||||||
ISSUE_SINK_TYPE=rest
|
ISSUE_SINK_TYPE=rest
|
||||||
|
|
||||||
# ── Activity definitions ───────────────────────────────────────────────────────
|
# ── Activity definitions ───────────────────────────────────────────────────────
|
||||||
# Colon-separated paths to additional activity-definitions/ directories.
|
# Colon-separated paths to additional activity-definitions/ directories.
|
||||||
# The local activity-definitions/ directory is always scanned.
|
# The local activity-definitions/ directory is always scanned.
|
||||||
ACTIVITY_DEFINITION_DIRS=
|
# Coulomb-loop kaizen engagement definitions (colon-separated for more roots).
|
||||||
|
ACTIVITY_DEFINITION_DIRS=/home/worsch/coulomb-loop
|
||||||
|
|
||||||
# ── Observability ─────────────────────────────────────────────────────────────
|
# ── Observability ─────────────────────────────────────────────────────────────
|
||||||
# Prometheus metrics bind address (Temporal SDK metrics).
|
# Prometheus metrics bind address (Temporal SDK metrics).
|
||||||
|
|||||||
24
.kaizen/agents/coach/memory.md
Normal file
@@ -0,0 +1,24 @@
---
agent: coach
project: activity-core
last_updated: 2026-06-18
session_count: 0
---

## Project Context
<!-- What this agent knows about the project it works in -->

## Accumulated Findings
<!-- Patterns, recurring issues, key decisions encountered -->

## What Worked
<!-- Approaches that produced good results in this project -->

## Watch Points
<!-- Recurring risks, traps, or areas requiring extra care -->

## Open Threads
<!-- Things noticed but not yet acted on -->

## Session Log
<!-- One-line entry per session: date · summary · outcome -->
24
.kaizen/agents/optimization/memory.md
Normal file
@@ -0,0 +1,24 @@
---
agent: optimization
project: activity-core
last_updated: 2026-06-18
session_count: 0
---

## Project Context
<!-- What this agent knows about the project it works in -->

## Accumulated Findings
<!-- Patterns, recurring issues, key decisions encountered -->

## What Worked
<!-- Approaches that produced good results in this project -->

## Watch Points
<!-- Recurring risks, traps, or areas requiring extra care -->

## Open Threads
<!-- Things noticed but not yet acted on -->

## Session Log
<!-- One-line entry per session: date · summary · outcome -->
2
.kaizen/metrics/coach/executions.jsonl
Normal file
@@ -0,0 +1,2 @@
{"agent": "coach", "execution_time_s": 120.0, "quality_score": 0.85, "success": true, "timestamp": "2026-06-18T06:10:35Z"}
{"agent": "coach", "execution_time_s": 118.0, "quality_score": 0.86, "success": true, "timestamp": "2026-06-18T10:06:38Z"}
12
.kaizen/metrics/coach/summary.json
Normal file
@@ -0,0 +1,12 @@
{
  "agent": "coach",
  "avg_execution_time_s": 119.0,
  "avg_quality_score": 0.855,
  "execution_count": 2,
  "last_execution": "2026-06-18T10:06:38Z",
  "success_rate": 1.0,
  "trend": {
    "quality_score": "stable",
    "success_rate": "stable"
  }
}
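
The numeric fields in the coach `summary.json` are consistent with plain averages over the two records in `executions.jsonl`. The sketch below reproduces that arithmetic; it is an illustration only, since the diff does not show how the kaizen tooling actually computes the summary, and `summarize` is a hypothetical name. The `trend` labels are left out because their rule is not visible here.

```python
import json
from statistics import mean


def summarize(jsonl_text: str) -> dict:
    """Aggregate execution records (one JSON object per line) into summary fields."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    return {
        "agent": records[0]["agent"],
        "avg_execution_time_s": mean(r["execution_time_s"] for r in records),
        "avg_quality_score": mean(r["quality_score"] for r in records),
        "execution_count": len(records),
        # ISO-8601 UTC timestamps in the same format sort correctly as strings.
        "last_execution": max(r["timestamp"] for r in records),
        "success_rate": sum(1 for r in records if r["success"]) / len(records),
    }


# The two coach records from executions.jsonl in this diff.
COACH_EXECUTIONS = """\
{"agent": "coach", "execution_time_s": 120.0, "quality_score": 0.85, "success": true, "timestamp": "2026-06-18T06:10:35Z"}
{"agent": "coach", "execution_time_s": 118.0, "quality_score": 0.86, "success": true, "timestamp": "2026-06-18T10:06:38Z"}
"""

coach_summary = summarize(COACH_EXECUTIONS)
```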
2
.kaizen/metrics/optimization/executions.jsonl
Normal file
@@ -0,0 +1,2 @@
{"agent": "optimization", "execution_time_s": 90.0, "quality_score": 0.8, "success": true, "timestamp": "2026-06-18T06:10:35Z"}
{"agent": "optimization", "execution_time_s": 88.0, "quality_score": 0.81, "success": true, "timestamp": "2026-06-18T10:06:38Z"}
12
.kaizen/metrics/optimization/summary.json
Normal file
@@ -0,0 +1,12 @@
{
  "agent": "optimization",
  "avg_execution_time_s": 89.0,
  "avg_quality_score": 0.805,
  "execution_count": 2,
  "last_execution": "2026-06-18T10:06:38Z",
  "success_rate": 1.0,
  "trend": {
    "quality_score": "stable",
    "success_rate": "stable"
  }
}
59
.kaizen/metrics/optimizer/analysis.json
Normal file
@@ -0,0 +1,59 @@
{
  "agents": [
    {
      "agent_name": "coach",
      "meets_sample_threshold": false,
      "metrics_count": 2,
      "optimization_cycles": 0,
      "performance_analysis": {
        "analysis_timestamp": "2026-06-18T12:06:39.212809",
        "avg_execution_time": 119.0,
        "avg_quality_score": 0.855,
        "avg_success_rate": 1.0,
        "execution_time_trend": -0.01680672268907563,
        "quality_score_trend": 0.01169590643274855,
        "success_rate_trend": 0.0,
        "window_size": 2
      },
      "recommendations": [
        {
          "details": "Average execution time: 119.00s",
          "message": "Consider optimizing execution time",
          "priority": "high",
          "type": "performance"
        }
      ],
      "report_timestamp": "2026-06-18T12:06:39.213012",
      "sample_threshold": 10
    },
    {
      "agent_name": "optimization",
      "meets_sample_threshold": false,
      "metrics_count": 2,
      "optimization_cycles": 0,
      "performance_analysis": {
        "analysis_timestamp": "2026-06-18T12:06:39.220252",
        "avg_execution_time": 89.0,
        "avg_quality_score": 0.805,
        "avg_success_rate": 1.0,
        "execution_time_trend": -0.02247191011235955,
        "quality_score_trend": 0.012422360248447215,
        "success_rate_trend": 0.0,
        "window_size": 2
      },
      "recommendations": [
        {
          "details": "Average execution time: 89.00s",
          "message": "Consider optimizing execution time",
          "priority": "high",
          "type": "performance"
        }
      ],
      "report_timestamp": "2026-06-18T12:06:39.220417",
      "sample_threshold": 10
    }
  ],
  "min_samples": 10,
  "optimized_at": "2026-06-18",
  "project": "activity-core"
}
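
The `*_trend` values in `analysis.json` match a simple relative change over the window: (last value minus first value) divided by the window mean. For coach execution time that is (118 - 120) / 119. This formula is inferred from the numbers in this diff and is not confirmed by the optimizer's source; `relative_trend` is a hypothetical name used only for this worked example.

```python
from typing import Sequence


def relative_trend(values: Sequence[float]) -> float:
    """Return (last - first) / mean for a window of samples.

    Returns 0.0 for an empty window or a zero mean to avoid division by zero.
    """
    if not values:
        return 0.0
    window_mean = sum(values) / len(values)
    if window_mean == 0:
        return 0.0
    return (values[-1] - values[0]) / window_mean


# Windows taken from the executions.jsonl files in this diff.
coach_time_trend = relative_trend([120.0, 118.0])
coach_quality_trend = relative_trend([0.85, 0.86])
optimization_time_trend = relative_trend([90.0, 88.0])
optimization_quality_trend = relative_trend([0.8, 0.81])
```

With only two samples per agent and `sample_threshold` set to 10, both agents report `meets_sample_threshold: false`, so these trends are best read as provisional.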
15
.kaizen/schedule.yml
Normal file
@@ -0,0 +1,15 @@
# Kaizen scheduled agent execution manifest (ADR-005)
# Engagement: coulomb-loop bootstrap — weekly cadence
# Regulator promotes cadence per customer engagement policy (ADR-003).
# Validate with: kaizen-agentic schedule validate
version: '1'
timezone: Europe/Berlin
agents:
  coach:
    cadence: weekly
    cron: 0 9 * * 1
    enabled: true
  optimization:
    cadence: weekly
    cron: 0 10 * * 1
    enabled: true
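
In standard five-field cron syntax, `0 9 * * 1` selects minute 0 of hour 9 on Mondays, and `0 10 * * 1` the same one hour later. The sketch below checks a local time against such an expression. It is a deliberately limited illustration: it supports only `*` and single integers, it assumes the datetime is already expressed in the manifest's `Europe/Berlin` timezone, and it does not reflect how `kaizen-agentic` itself evaluates the schedule.

```python
from datetime import datetime

FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def parse_cron(expr: str) -> dict:
    """Split a five-field cron expression into named fields."""
    parts = expr.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


def matches(expr: str, local_time: datetime) -> bool:
    """True if local_time falls on a minute selected by expr.

    Only '*' and single integers are supported (no ranges, lists, or steps).
    """
    actual = {
        "minute": local_time.minute,
        "hour": local_time.hour,
        "day_of_month": local_time.day,
        "month": local_time.month,
        # Python: Monday=0 ... Sunday=6; cron: Sunday=0, Monday=1 ... Saturday=6.
        "day_of_week": (local_time.weekday() + 1) % 7,
    }
    for name, spec in parse_cron(expr).items():
        if spec == "*":
            continue
        wanted = int(spec)
        if name == "day_of_week":
            wanted %= 7  # cron accepts both 0 and 7 for Sunday
        if wanted != actual[name]:
            return False
    return True


# 2026-06-22 is a Monday.
coach_due = matches("0 9 * * 1", datetime(2026, 6, 22, 9, 0))
```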
28
.repo-classification.yaml
Normal file
@@ -0,0 +1,28 @@
# Repo classification (Repo Classification Standard v1.0).

repo_classification:
  standard: Repo Classification Standard
  version: '1.0'
  classified_at: '2026-06-22'
  classified_by: human
  category: tooling
  domain: infotech
  secondary_domains:
  - agents
  capability_tags:
  - workflow
  - orchestration
  - automation
  - coordination
  - observability
  business_stake:
  - technology
  - operations
  - automation
  - execution
  business_mechanics:
  - coordination
  - operation
  - adaptation
  notes: Org-wide event bridge / task factory (Temporal-based). Active bounded implementation
    -> project.
68
AGENTS.md
@@ -4,7 +4,7 @@
|
|||||||
|
|
||||||
**Purpose:** Durable task factory built on Temporal. Manages ActivityDefinitions, schedules recurring workflows via Temporal Schedules, routes events via NATS JetStream, and exposes a FastAPI CRUD surface for the custodian domain.
|
**Purpose:** Durable task factory built on Temporal. Manages ActivityDefinitions, schedules recurring workflows via Temporal Schedules, routes events via NATS JetStream, and exposes a FastAPI CRUD surface for the custodian domain.
|
||||||
|
|
||||||
**Domain:** custodian
|
**Domain:** infotech
|
||||||
**Repo slug:** activity-core
|
**Repo slug:** activity-core
|
||||||
**Topic ID:** `cee7bedf-2b48-46ef-8601-006474f2ad7a`
|
**Topic ID:** `cee7bedf-2b48-46ef-8601-006474f2ad7a`
|
||||||
**Workplan prefix:** `ACTIVITY-WP-`
|
**Workplan prefix:** `ACTIVITY-WP-`
|
||||||
@@ -83,7 +83,7 @@ curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
|
|||||||
1. `cat .custodian-brief.md` — domain goal and open workstreams (offline-safe)
|
1. `cat .custodian-brief.md` — domain goal and open workstreams (offline-safe)
|
||||||
2. Check inbox: `GET /messages/?to_agent=activity-core&unread_only=true`; mark read
|
2. Check inbox: `GET /messages/?to_agent=activity-core&unread_only=true`; mark read
|
||||||
3. Scan workplans: `ls workplans/` — note `status: ready`, `active`, or `blocked` files and open tasks
|
3. Scan workplans: `ls workplans/` — note `status: ready`, `active`, or `blocked` files and open tasks
|
||||||
4. Check blocked tasks: `GET /tasks/?needs_human=true`
|
4. Check human-needed tasks: `GET /tasks/?needs_human=true`
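
The two State Hub queries in the checklist above can be built without hand-assembling query strings. This is a minimal sketch using only the Python standard library. The base URL and query parameters are the ones shown in this file; the helper names are hypothetical, and the shape of the JSON response is not documented here, so `fetch_json` returns whatever the hub sends back.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

STATE_HUB_URL = "http://127.0.0.1:8000"  # default from the session protocol


def inbox_url(agent: str, base_url: str = STATE_HUB_URL) -> str:
    """URL for unread messages addressed to the given agent."""
    query = urlencode({"to_agent": agent, "unread_only": "true"})
    return f"{base_url}/messages/?{query}"


def needs_human_url(base_url: str = STATE_HUB_URL) -> str:
    """URL for tasks flagged as needing human input."""
    return f"{base_url}/tasks/?{urlencode({'needs_human': 'true'})}"


def fetch_json(url: str, timeout_s: float = 5.0):
    """GET a URL and decode the body as JSON (response schema is an unknown here)."""
    with urlopen(url, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping URL construction separate from the network call makes the first part testable offline, which fits the offline-safe intent of step 1.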
|
||||||
|
|
||||||
**During work:**
|
**During work:**
|
||||||
- Update task statuses in workplan files as tasks progress
|
- Update task statuses in workplan files as tasks progress
|
||||||
@@ -101,6 +101,63 @@ curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## Credential and access routing
|
||||||
|
|
||||||
|
**Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
|
||||||
|
for inference. Run this check **before** requesting secrets, API keys, SSH access,
|
||||||
|
login tokens, or database passwords — in any repo, not only `ops-warden`.
|
||||||
|
|
||||||
|
ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
|
||||||
|
other credential need belongs to another subsystem. **Do not** message
|
||||||
|
`ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.
|
||||||
|
|
||||||
|
### Lookup (do this first)
|
||||||
|
|
||||||
|
```bash
|
||||||
|
warden route find "<describe your need>" --json
|
||||||
|
warden route show <catalog-id> --json
|
||||||
|
```
|
||||||
|
|
||||||
|
Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).
|
||||||
|
|
||||||
|
| Agent runtime | How to orient |
|
||||||
|
| --- | --- |
|
||||||
|
| **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=activity-core` is for coordination, not secret vending |
|
||||||
|
| **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
|
||||||
|
| **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |
|
||||||
|
|
||||||
|
### Quick routing table
|
||||||
|
|
||||||
|
| I need… | Owner | ops-warden executes? |
|
||||||
|
| --- | --- | --- |
|
||||||
|
| SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
|
||||||
|
| API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
|
||||||
|
| Login / OIDC / MFA | key-cape / Keycloak | No — route only |
|
||||||
|
| Authorization decision | flex-auth | No — route only |
|
||||||
|
| activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
|
||||||
|
| SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |
|
||||||
|
|
||||||
|
### Anti-patterns (do not do these)
|
||||||
|
|
||||||
|
- `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
|
||||||
|
- Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
|
||||||
|
- Pasting secrets into Git, State Hub, workplans, logs, or chat
|
||||||
|
|
||||||
|
### Other capabilities (reuse-surface)
|
||||||
|
|
||||||
|
Non-credential capabilities are usually discovered through **reuse-surface** federation
|
||||||
|
(`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
|
||||||
|
every repo's agent instructions because it is high-frequency, high-risk, and easy to
|
||||||
|
get wrong.
|
||||||
|
|
||||||
|
**Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`
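
For agents that script the lookup, a thin wrapper can make the anti-patterns above harder to hit by refusing any subcommand other than the two documented ones. This is a sketch under stated assumptions: it relies only on `warden route find` and `warden route show` with `--json` as written in this section, the function names are hypothetical, and the structure of the JSON output is not described here, so it is returned undecoded beyond `json.loads`.

```python
import json
import subprocess

# Only the two route subcommands named in this section are allowed.
ROUTE_ACTIONS = {"find", "show"}


def warden_route_argv(action: str, argument: str) -> list:
    """Build the argv for `warden route find|show <argument> --json`."""
    if action not in ROUTE_ACTIONS:
        raise ValueError(f"unsupported warden route action: {action!r}")
    return ["warden", "route", action, argument, "--json"]


def warden_route(action: str, argument: str):
    """Run the lookup and decode stdout as JSON.

    Requires the `warden` CLI on PATH; raises CalledProcessError on failure.
    """
    result = subprocess.run(
        warden_route_argv(action, argument),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Passing the need as a single argv element avoids shell quoting problems with free-text descriptions, and the result is a pointer to the owning subsystem, never a secret value.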
|
||||||
|
|
||||||
|
<!-- REPO-AGENTS-EXTENSIONS -->
|
||||||
|
<!-- Append repo-specific agent instructions below this marker.
|
||||||
|
The state-hub template sync preserves content after this line. -->
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Workplan Convention (ADR-001)
|
## Workplan Convention (ADR-001)
|
||||||
|
|
||||||
Work items originate as files in this repo — not in the hub. The hub is a
|
Work items originate as files in this repo — not in the hub. The hub is a
|
||||||
@@ -124,7 +181,7 @@ anything needing analysis, design, approval, dependencies, or multiple phases.
|
|||||||
id: ACTIVITY-WP-NNNN
|
id: ACTIVITY-WP-NNNN
|
||||||
type: workplan
|
type: workplan
|
||||||
title: "..."
|
title: "..."
|
||||||
domain: custodian
|
domain: infotech
|
||||||
repo: activity-core
|
repo: activity-core
|
||||||
status: proposed | ready | active | blocked | backlog | finished | archived
|
status: proposed | ready | active | blocked | backlog | finished | archived
|
||||||
owner: codex
|
owner: codex
|
||||||
@@ -154,10 +211,7 @@ state_hub_task_id: "<uuid>" # written by fix-consistency — do not edit
|
|||||||
Task description text.
|
Task description text.
|
||||||
```
|
```
|
||||||
|
|
||||||
Status progression: `todo` → `progress` → `done`; use `wait` for a task
|
Status progression: `todo` → `progress` → `done`; use `wait` for waiting/blocked work and `cancel` for stopped work.
|
||||||
blocked on external input and `cancel` for intentionally abandoned work.
|
|
||||||
Workstream/workplan lifecycle status is separate; frontmatter `blocked` remains
|
|
||||||
valid there.
|
|
||||||
|
|
||||||
To create a new workplan:
|
To create a new workplan:
|
||||||
1. Write the file following the format above
|
1. Write the file following the format above
|
||||||
|
|||||||
@@ -8,4 +8,5 @@
|
|||||||
@.claude/rules/stack-and-commands.md
|
@.claude/rules/stack-and-commands.md
|
||||||
@.claude/rules/architecture.md
|
@.claude/rules/architecture.md
|
||||||
@.claude/rules/repo-boundary.md
|
@.claude/rules/repo-boundary.md
|
||||||
|
@.claude/rules/credential-routing.md
|
||||||
@.claude/rules/agents.md
|
@.claude/rules/agents.md
|
||||||
|
|||||||
19
Makefile
@@ -1,13 +1,16 @@
|
|||||||
-include .env
|
-include .env
|
||||||
export
|
export
|
||||||
|
|
||||||
.PHONY: sync-event-types sync-activity-definitions test migrate sync-all \
|
.PHONY: sync-event-types sync-activity-definitions sync-schedules test migrate sync-all \
|
||||||
dev-up dev-down railiance-up railiance-down \
|
dev-up dev-down railiance-up railiance-down \
|
||||||
start-worker start-api start-event-router help
|
start-worker start-api start-event-router help
|
||||||
|
|
||||||
sync-activity-definitions: ## Sync ActivityDefinition files into DB
|
sync-activity-definitions: ## Sync ActivityDefinition files into DB
|
||||||
uv run python -m activity_core.sync_activity_definitions
|
uv run python -m activity_core.sync_activity_definitions
|
||||||
|
|
||||||
|
sync-schedules: ## Reconcile Temporal schedules from activity_definitions DB
|
||||||
|
uv run python -m activity_core.sync_schedules
|
||||||
|
|
||||||
sync-event-types: ## Sync event type YAML files into DB
|
sync-event-types: ## Sync event type YAML files into DB
|
||||||
uv run python scripts/sync_event_types.py
|
uv run python scripts/sync_event_types.py
|
||||||
|
|
||||||
@@ -52,3 +55,17 @@ help: ## Show this help message
|
|||||||
@grep -Eh '^[a-zA-Z_-]+:.*?##' $(MAKEFILE_LIST) | \
|
@grep -Eh '^[a-zA-Z_-]+:.*?##' $(MAKEFILE_LIST) | \
|
||||||
awk 'BEGIN {FS = ":.*?## "}; {printf " \033[36m%-24s\033[0m %s\n", $$1, $$2}' | \
|
awk 'BEGIN {FS = ":.*?## "}; {printf " \033[36m%-24s\033[0m %s\n", $$1, $$2}' | \
|
||||||
sort
|
sort
|
||||||
|
|
||||||
|
# Agent Management Targets
|
||||||
|
agents-list:
|
||||||
|
@echo "Installed agents:"
|
||||||
|
@ls agents/ 2>/dev/null | grep agent- | sed 's/agent-//g' | sed 's/.md//g' \
|
||||||
|
|| echo "No agents installed"
|
||||||
|
|
||||||
|
agents-update:
|
||||||
|
@echo "Updating agents..."
|
||||||
|
@kaizen-agentic update
|
||||||
|
|
||||||
|
agents-validate:
|
||||||
|
@echo "Validating agents..."
|
||||||
|
@kaizen-agentic validate agents/
|
||||||
|
|||||||
180
SCOPE.md
@@ -1,7 +1,7 @@
|
|||||||
---
|
---
|
||||||
domain: capabilities
|
domain: capabilities
|
||||||
repo: activity-core
|
repo: activity-core
|
||||||
updated: "2026-06-03"
|
updated: "2026-06-16"
|
||||||
---
|
---
|
||||||
|
|
||||||
# SCOPE
|
# SCOPE
|
||||||
@@ -16,7 +16,8 @@ updated: "2026-06-03"
|
|||||||
activity-core is the org-wide Event Bridge for the Coulomb organization — a
|
activity-core is the org-wide Event Bridge for the Coulomb organization — a
|
||||||
rule-governed event loop that receives time-based and domain events, evaluates
|
rule-governed event loop that receives time-based and domain events, evaluates
|
||||||
declarative rules and LLM instructions against current org context, and emits
|
declarative rules and LLM instructions against current org context, and emits
|
||||||
structured task sets to issue-core.
|
structured task, report, and evidence outputs without owning downstream task
|
||||||
|
lifecycle.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -27,8 +28,11 @@ An `ActivityDefinition` (a markdown file checked into a repo) declares a trigger
|
|||||||
resolve before evaluation, and a set of rules and instructions that determine
|
resolve before evaluation, and a set of rules and instructions that determine
|
||||||
what tasks to create. When triggered, a durable Temporal workflow loads the
|
what tasks to create. When triggered, a durable Temporal workflow loads the
|
||||||
definition, resolves context, evaluates the rule/instruction set, and emits task
|
definition, resolves context, evaluates the rule/instruction set, and emits task
|
||||||
creation requests to issue-core. Everything is auditable: the spawn log records
|
creation requests to issue-core or configured dry-run/audit sinks. Instructions
|
||||||
the triggering event, matched rule, and resulting task references.
|
may also emit validated reports, and selected context resolvers may emit compact
|
||||||
|
non-secret evidence. Everything is auditable: the spawn log records the
|
||||||
|
triggering event, matched rule/instruction metadata, model/prompt hash where
|
||||||
|
applicable, and resulting task references.
|
||||||
|
|
||||||
The two evaluation modes:
|
The two evaluation modes:
|
||||||
- **Rule** — deterministic condition (sandboxed Python-like DSL) → fixed task
|
- **Rule** — deterministic condition (sandboxed Python-like DSL) → fixed task
|
||||||
@@ -48,21 +52,35 @@ The two evaluation modes:
|
|||||||
attribute schemas, example payloads, and intent documentation.
|
attribute schemas, example payloads, and intent documentation.
|
||||||
Curator-gating configurable per runtime environment.
|
Curator-gating configurable per runtime environment.
|
||||||
- **Trigger types**: 5-field cron with timezone and misfire policy; one-off
|
- **Trigger types**: 5-field cron with timezone and misfire policy; one-off
|
||||||
scheduled datetime; event-type subscription via NATS.
|
scheduled datetime; event-type subscription via NATS; manual one-shot API
|
||||||
|
trigger; one-shot schedule smoke tests for recurring definitions.
|
||||||
- **Context resolution adapters**: repo-scoping (repository capability queries),
|
- **Context resolution adapters**: repo-scoping (repository capability queries),
|
||||||
state hub (domain and workstream state), extensible for other sources.
|
State Hub (domain/workstream state, SBOM status, daily triage digest, coding
|
||||||
|
retro read model), and ops inventory (bounded HTTP/HTTPS probes of a
|
||||||
|
non-secret service inventory). The adapter registry is extensible for other
|
||||||
|
sources.
|
||||||
- **Rule evaluator**: sandboxed AST walker for Python-like boolean expressions
|
- **Rule evaluator**: sandboxed AST walker for Python-like boolean expressions
|
||||||
over event attributes and resolved context. Rule actions support safe
|
over event attributes and resolved context. Rule actions support safe
|
||||||
`context.*` / `event.*` interpolation and explicit `for_each` per-item
|
`context.*` / `event.*` interpolation and explicit `for_each` per-item
|
||||||
binding. No `exec()`.
|
binding. No `exec()`.
|
||||||
- **Instruction executor**: trusted-field prompt rendering, LLM call via
|
- **Instruction executor**: trusted-field prompt rendering, LLM call via
|
||||||
llm-connect, structured output validation, optional curator review queue,
|
llm-connect, structured output validation, item-granular recovery with a
|
||||||
and deterministic report sinks.
|
quarantine lane and producer guardrails (count/length/depth caps, reference
|
||||||
|
allow-list) at the producer trust boundary, bounded validation-failure
|
||||||
|
artifacts for report instructions, review-required audit metadata, and
|
||||||
|
deterministic report sinks. A real downstream review queue is not implemented
|
||||||
|
in this repo.
|
||||||
- **Task emission adapter**: abstraction over issue-core; current transport is
|
- **Task emission adapter**: abstraction over issue-core; current transport is
|
||||||
REST; designed to migrate to NATS subscription without code changes.
|
REST, with `ISSUE_SINK_TYPE=null` for dry-run/audit mode. It is designed to
|
||||||
|
migrate to a durable issue-core-owned NATS command boundary when issue-core
|
||||||
|
provides that contract.
|
||||||
- **Report sinks**: instruction report outputs can be persisted to bounded
|
- **Report sinks**: instruction report outputs can be persisted to bounded
|
||||||
local working memory and posted as State Hub progress events. These are
|
local working memory and posted as State Hub progress events. These are
|
||||||
reporting outputs, not task lifecycle ownership.
|
reporting outputs, not task lifecycle ownership.
|
||||||
|
- **Ops evidence sinks**: `ops-inventory` context sources can post compact
|
||||||
|
non-secret `ops_inventory_probe` summaries to State Hub. Inter-Hub submission
|
||||||
|
is present only as a gated/deferred sink result until operator-owned
|
||||||
|
`OPS_HUB_KEY` custody and widget mapping are ready.
|
||||||
- **Spawn audit log**: every task emission recorded with rule/instruction id,
|
- **Spawn audit log**: every task emission recorded with rule/instruction id,
|
||||||
triggering event id, model and prompt hash (instructions), issue-core task ref.
|
triggering event id, model and prompt hash (instructions), issue-core task ref.
|
||||||
- **Webhook receiver**: HTTP endpoint normalising inbound Gitea/GitHub webhook
|
- **Webhook receiver**: HTTP endpoint normalising inbound Gitea/GitHub webhook
|
||||||
@@ -84,6 +102,14 @@ The two evaluation modes:
|
|||||||
coordinated changes belong to project-core (future).
|
coordinated changes belong to project-core (future).
|
||||||
- **Execution of automatable tasks** — Temporal Activities that do real work
|
- **Execution of automatable tasks** — Temporal Activities that do real work
|
||||||
(run a scan, apply a patch, call an API) live in per-repo workers, not here.
|
(run a scan, apply a patch, call an API) live in per-repo workers, not here.
|
||||||
|
- **General ops execution** — Kubernetes, SSH, tunnel, authenticated service
|
||||||
|
checks, secret custody, OpenBao writes, and Inter-Hub widget/API-key
|
||||||
|
provisioning belong to the owning operational repos and operator workflows.
|
||||||
|
activity-core may record non-secret probe evidence; it must not become the ops
|
||||||
|
control plane.
|
||||||
|
- **Service inventory authority** — the Custodian inventory remains owned by
|
||||||
|
the custodian/state-hub surface. activity-core may read a projected
|
||||||
|
non-secret snapshot.
|
||||||
- **Event broker hosting** — NATS JetStream is org infrastructure; activity-core
|
- **Event broker hosting** — NATS JetStream is org infrastructure; activity-core
|
||||||
consumes it but does not own its lifecycle.
|
consumes it but does not own its lifecycle.
|
||||||
- **Temporal server hosting** — activity-core uses the Temporal SDK; the server
|
- **Temporal server hosting** — activity-core uses the Temporal SDK; the server
|
||||||
@@ -101,6 +127,9 @@ The two evaluation modes:
|
|||||||
structured tasks in the right repos."
|
structured tasks in the right repos."
|
||||||
- You need one-off future task scheduling without a separate reminder system.
|
- You need one-off future task scheduling without a separate reminder system.
|
||||||
- You want an auditable record of what triggered what and why.
|
- You want an auditable record of what triggered what and why.
|
||||||
|
- You need a scheduled, non-secret evidence note proving that declared service
|
||||||
|
endpoints or access paths were observed, without executing privileged ops
|
||||||
|
commands.
|
||||||
- You are replacing scattered bespoke cron jobs and manual coordination with
|
- You are replacing scattered bespoke cron jobs and manual coordination with
|
||||||
a governed, observable automation layer.
|
a governed, observable automation layer.
|
||||||
|
|
||||||
@@ -117,29 +146,45 @@ The two evaluation modes:
|
|||||||
|
|
||||||
## Current State
|
## Current State
|
||||||
|
|
||||||
- **Status**: active production-backed service. Foundation, triggers/ops,
|
- **Status**: active production-backed service with two visible open gates:
|
||||||
event bridge, Railiance deployment, and the production service workplans are
|
`ACTIVITY-WP-0006` still waits on three clean consecutive scheduled daily
|
||||||
complete. The stale March WP-0002 handoff note has been reconciled and
|
triage runs and calibration feedback, and `ACTIVITY-WP-0008` is blocked until
|
||||||
archived.
|
Helix Forge publishes the upstream `coding_retro` read model needed to enable
|
||||||
|
the Saturday schedule. `ACTIVITY-WP-0007` is finished: the bounded
|
||||||
|
ops-inventory probe/evidence slice has live Railiance evidence.
|
||||||
- **Implementation**: core is functional. `RunActivityWorkflow`,
|
- **Implementation**: core is functional. `RunActivityWorkflow`,
|
||||||
`TaskExecutorWorkflow` (stub), PostgreSQL schema, Temporal Schedules, NATS
|
`TaskExecutorWorkflow` (stub), PostgreSQL schema, Temporal Schedules and smoke
|
||||||
Event Router, FastAPI admin API, Prometheus metrics, event type registry,
|
schedules, NATS Event Router, FastAPI admin API, Prometheus metrics, event
|
||||||
markdown ActivityDefinition parser/sync, rule evaluator, instruction
|
type registry, markdown ActivityDefinition parser/sync, rule evaluator,
|
||||||
executor, context resolvers, issue sink, report sinks, Kubernetes deployment,
|
instruction executor, context resolvers, issue sink, report sinks, ops
|
||||||
and operational runbook are all implemented.
|
evidence sink, Kubernetes deployment, and operational runbook are all
|
||||||
- **Operational proof**: the daily State Hub WSJF triage cutover has completed
|
implemented.
|
||||||
far enough that activity-core is now the trusted scheduled substrate for the
|
- **Current definitions**: `weekly-sbom-staleness` is enabled and demonstrates
|
||||||
routine report. Recent hardening fixed the State Hub SBOM resolver contract,
|
the deterministic rule/fan-out path. `weekly-coding-retro` is present and
|
||||||
made slow LLM activity timeouts configurable, and added safe rule action
|
tested but intentionally disabled until live `coding_retro` evidence exists.
|
||||||
interpolation plus explicit `for_each` binding for per-repo SBOM staleness
|
Railiance projects the daily State Hub WSJF triage definition and the disabled
|
||||||
tasks.
|
ops-service-inventory probe definition from the runtime bundle.
|
||||||
- **Stability**: construction risk has shifted to operational hardening risk.
|
- **Operational proof**: the State Hub daily WSJF triage path has produced
|
||||||
The full test suite passed on 2026-06-03 (`125 passed, 1 skipped`). The
|
validated reports and working-memory notes, but the calibration gate is not
|
||||||
remaining work is mostly observability, status-canon adaptation, contract
|
closed. A 2026-06-16 recheck found State Hub `daily_triage` progress and
|
||||||
documentation, and broader production adoption rather than first
|
working-memory `daily-triage-*` notes only through 2026-06-06, so there is not
|
||||||
implementation.
|
yet evidence for three clean consecutive scheduled runs after the June 7
|
||||||
- **Next**: `ACTIVITY-WP-0006` — post-triage operational hardening and scope
|
runtime projection failure. The ops inventory probe path has live fallback
|
||||||
alignment.
|
evidence in State Hub; Inter-Hub per-entity submission remains deferred.
|
||||||
|
- **Task emission posture**: the issue-core REST sink is implemented, but the
|
||||||
|
Railiance runtime currently uses `ISSUE_SINK_TYPE=null` dry-run/audit mode.
|
||||||
|
Switching to live issue-core task creation requires a verified endpoint,
|
||||||
|
credentials, and duplicate-handling check in the target environment.
|
||||||
|
- **Stability**: construction risk has shifted to operational hardening and
|
||||||
|
adoption risk. The last recorded full-suite pass in the workplans was
|
||||||
|
2026-06-04 (`128 passed, 1 skipped`), with later targeted coverage added for
|
||||||
|
ops inventory, ops evidence sinks, Railiance projection wiring, and weekly
|
||||||
|
coding retro parsing/rule behavior.
|
||||||
|
- **Next**: close `ACTIVITY-WP-0006-T03` with real scheduled-run calibration
|
||||||
|
evidence; close `ACTIVITY-WP-0008-T03` once upstream `coding_retro` publication
|
||||||
|
exists and the dry-run/duplicate check passes; decide when to move selected
|
||||||
|
task/report/evidence sinks from dry-run or fallback mode to their intended
|
||||||
|
live backends.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -159,9 +204,9 @@ database, the project planner, or a general execution worker. The local
|
|||||||
workplan explicitly rehomes execution responsibility.
|
workplan explicitly rehomes execution responsibility.
|
||||||
|
|
||||||
One boundary nuance is now explicit: activity-core may post State Hub progress
|
One boundary nuance is now explicit: activity-core may post State Hub progress
|
||||||
events as a configured report sink. That is acceptable because it records the
|
events as a configured report or evidence sink. That is acceptable because it
|
||||||
result of an activity-core activation; it is not ownership of State Hub state,
|
records the result of an activity-core activation; it is not ownership of State
|
||||||
task lifecycle, or workstream planning.
|
Hub state, task lifecycle, or workstream planning.
|
||||||
|
|
||||||
The main drift risk is convenience creep: adding direct task tracking,
|
The main drift risk is convenience creep: adding direct task tracking,
|
||||||
project-phase state, or bespoke operational scripts because the Temporal
|
project-phase state, or bespoke operational scripts because the Temporal
|
||||||
@@ -169,27 +214,58 @@ substrate is already nearby. Future work should prefer declarative
|
|||||||
ActivityDefinitions, bounded context resolvers, and outbound adapters over
|
ActivityDefinitions, bounded context resolvers, and outbound adapters over
|
||||||
new one-off control paths.
|
new one-off control paths.
|
||||||
|
|
||||||
|
## Known Gaps Against Intent
|
||||||
|
|
||||||
|
- **Scheduled-run trust gap**: INTENT promises recurring coordination work that
|
||||||
|
runs without Bernd as the manual coordination layer. The daily triage path is
|
||||||
|
implemented, but its current calibration task still lacks three clean
|
||||||
|
consecutive scheduled runs after the June 7 runtime failure. Until that closes,
|
||||||
|
daily triage remains a production-backed capability with an evidence gap, not
|
||||||
|
a fully proven standing substrate.
|
||||||
|
- **Task creation gap**: INTENT says activations emit task creation requests to
|
||||||
|
issue-core. The REST sink exists, but Railiance is still in `ISSUE_SINK_TYPE=null`
|
||||||
|
mode. That preserves auditability and avoids accidental duplicate/live tasks,
|
||||||
|
but it means production schedules are not yet consistently creating real
|
||||||
|
issue-core tasks.
|
||||||
|
- **Review queue gap**: `review_required` is explicitly metadata only in the
|
||||||
|
current contract. No issue-core review queue integration exists here, so any
|
||||||
|
future queue routing needs a downstream issue-core contract before high-impact
|
||||||
|
instruction outputs rely on it.
|
||||||
|
- **Evidence backend posture**: the State Hub fallback evidence path is the
|
||||||
|
accepted current backend for `ops_inventory_probe`. Inter-Hub/ops-hub
|
||||||
|
submission is deliberately deferred behind `OPS_HUB_KEY`, widget mapping, and
|
||||||
|
operator approval, so per-entity ops evidence publication is future work.
|
||||||
|
- **Execution-boundary residue**: `TaskExecutorWorkflow` is still registered as
|
||||||
|
a stub that writes a done `task_instances` row. It should remain inert or be
|
||||||
|
removed/re-homed before it attracts real execution work, because execution is
|
||||||
|
explicitly outside activity-core's intent.
|
||||||
|
- **API exposure posture**: the FastAPI surface stays ClusterIP-only for now.
|
||||||
|
External ingress remains future work until an authenticated access policy is
|
||||||
|
designed.
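
The dry-run posture described in the task creation gap can be pictured with a small sink selector. This is a minimal sketch, not the repository's implementation: only `ISSUE_SINK_TYPE=null` comes from the text above, while `NullIssueSink`, `RestIssueSink`, `build_issue_sink`, the `rest` default, and the `ISSUE_CORE_URL` variable are hypothetical names chosen for illustration.

```python
import os
from dataclasses import dataclass, field


@dataclass
class NullIssueSink:
    """Dry-run/audit sink: remembers what would be emitted, creates nothing."""

    recorded: list = field(default_factory=list)

    def emit(self, task: dict) -> str:
        self.recorded.append(task)
        # Return a synthetic reference so the audit log still has something to store.
        return f"dry-run:{len(self.recorded)}"


@dataclass
class RestIssueSink:
    """Placeholder for a live sink; the real REST call is not shown here."""

    base_url: str

    def emit(self, task: dict) -> str:
        raise NotImplementedError("live emission is outside this sketch")


def build_issue_sink(env=None):
    """Pick a sink from the environment (assumed default: 'rest')."""
    env = os.environ if env is None else env
    sink_type = env.get("ISSUE_SINK_TYPE", "rest")
    if sink_type == "null":
        return NullIssueSink()
    if sink_type == "rest":
        # ISSUE_CORE_URL is a hypothetical variable name.
        return RestIssueSink(base_url=env.get("ISSUE_CORE_URL", ""))
    raise ValueError(f"unknown ISSUE_SINK_TYPE: {sink_type!r}")
```

Keeping the selection in one function means a runtime can move from audit mode to live emission by changing configuration only, which matches the stated requirement that the switch happens after endpoint, credential, and duplicate-handling checks.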
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## How It Fits
|
## How It Fits
|
||||||
|
|
||||||
```
|
```
|
||||||
[NATS JetStream] ← publishers: state hub, Gitea webhooks, Temporal signals, cron
|
[NATS JetStream] ← publishers: State Hub, Gitea webhooks, Temporal signals, cron
|
||||||
↓
|
↓
|
||||||
[activity-core] ← event type registry, rule evaluator, instruction executor
|
[activity-core] ← event type registry, rule evaluator, instruction executor
|
||||||
[activity-core] → [issue-core] → [repos/services]
|
[activity-core] → [issue-core] → [repos/services]
|
||||||
[activity-core] → [report sinks]
|
[activity-core] → [report/evidence sinks] → [State Hub / working memory / future Inter-Hub]
|
||||||
```
|
```
|
||||||
|
|
||||||
- **Upstream**: NATS (event bus), Temporal (durable workflow engine), PostgreSQL
|
- **Upstream**: NATS (event bus), Temporal (durable workflow engine), PostgreSQL
|
||||||
(definitions and audit log), repo-scoping (context adapter), state hub (context
|
(definitions and audit log), repo-scoping (context adapter), State Hub (context
|
||||||
adapter and event publisher).
|
adapter and event publisher).
|
||||||
- **Downstream**: issue-core (task management) and configured report sinks.
|
- **Downstream**: issue-core (task management) and configured report/evidence sinks.
|
||||||
Agents and humans pick up tasks from issue-core and do the actual work.
|
Agents and humans pick up tasks from issue-core and do the actual work.
|
||||||
|
Railiance may use the null sink for dry-run/audit mode until live issue-core
|
||||||
|
emission is approved.
|
||||||
- **Coordinates with**: the state hub delegates maintenance automations to
|
- **Coordinates with**: the state hub delegates maintenance automations to
|
||||||
activity-core by publishing lifecycle events or by being resolved as context.
|
activity-core by publishing lifecycle events or by being resolved as context.
|
||||||
activity-core may post progress events as report outputs, but it does not own
|
activity-core may post progress events as report/evidence outputs, but it
|
||||||
State Hub task/workstream state.
|
does not own State Hub task/workstream state.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -203,6 +279,11 @@ new one-off control paths.
|
|||||||
by a sandboxed AST walker.
|
by a sandboxed AST walker.
|
||||||
- **Instruction** — LLM-evaluated task generation with trusted-field prompt
|
- **Instruction** — LLM-evaluated task generation with trusted-field prompt
|
||||||
interpolation and structured output schema enforcement.
|
interpolation and structured output schema enforcement.
|
||||||
|
- **Report sink** — configured persistence for instruction reports, currently
|
||||||
|
working-memory markdown notes and State Hub progress events.
|
||||||
|
- **Evidence sink** — configured persistence for compact non-secret resolver
|
||||||
|
evidence, currently State Hub progress for ops inventory probes; Inter-Hub is
|
||||||
|
a deferred gated target.
|
||||||
- **Event type** — a registered, schema-documented category of event (e.g.
|
- **Event type** — a registered, schema-documented category of event (e.g.
|
||||||
`org.repo.registered`). Publisher-declared; curator-gated per environment.
|
`org.repo.registered`). Publisher-declared; curator-gated per environment.
|
||||||
- **Spawn audit trail** — activity-core's local record of what tasks were emitted,
|
- **Spawn audit trail** — activity-core's local record of what tasks were emitted,
|
||||||
@@ -219,8 +300,12 @@ new one-off control paths.
|
|||||||
- `issue-core` (formerly issue-facade) — downstream task management; receives
|
- `issue-core` (formerly issue-facade) — downstream task management; receives
|
||||||
all task emission from activity-core.
|
all task emission from activity-core.
|
||||||
- `repo-scoping` — context adapter for repository capability queries.
|
- `repo-scoping` — context adapter for repository capability queries.
|
||||||
- `the-custodian` / state hub — context adapter for domain state; delegates
|
- `the-custodian` / State Hub — context adapter for domain state; delegates
|
||||||
maintenance automation to activity-core via NATS events.
|
maintenance automation to activity-core via NATS events.
|
||||||
|
- `llm-connect` — instruction execution backend for judgement-oriented reports
|
||||||
|
such as daily State Hub WSJF triage.
|
||||||
|
- `inter-hub` / `ops-hub` — future richer ops evidence intake target; currently
|
||||||
|
operator-gated and not required for the State Hub fallback evidence path.
|
||||||
- `rules-core` (future extraction) — the rule evaluator and instruction executor
|
- `rules-core` (future extraction) — the rule evaluator and instruction executor
|
||||||
module, currently in `src/activity_core/rules/`.
|
module, currently in `src/activity_core/rules/`.
|
||||||
- `project-core` (future) — project and initiative management; will use
|
- `project-core` (future) — project and initiative management; will use
|
||||||
@@ -237,6 +322,9 @@ new one-off control paths.
|
|||||||
governance model, event type schema, ActivityDefinition structure.
|
governance model, event type schema, ActivityDefinition structure.
|
||||||
- `docs/adr/adr-003-rule-instruction-model.md` — Rule DSL, Instruction safety
|
- `docs/adr/adr-003-rule-instruction-model.md` — Rule DSL, Instruction safety
|
||||||
model, evaluation semantics, audit trail, testing strategy.
|
model, evaluation semantics, audit trail, testing strategy.
|
||||||
|
- `docs/adr/adr-004-producer-trust-boundary.md` — untrusted-producer premise,
|
||||||
|
trust-but-handle vs verify-and-mitigate postures, error-locality and
|
||||||
|
quarantine-with-provenance, producer guardrails for LLM/agent/human output.
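
The rule DSL referenced in ADR-003 is described elsewhere in this file as a sandboxed AST walker over `event.*` and `context.*` with no `exec()`. The sketch below illustrates that general technique using Python's standard `ast` module. It is an assumption-laden illustration, not the code in `src/activity_core/rules/`: the function name `evaluate_rule`, the supported node set, and the dict-based attribute lookup are all hypothetical.

```python
import ast
import operator

# Comparison operators the sketch allows; anything else is rejected.
_COMPARE = {
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Gt: operator.gt,
    ast.GtE: operator.ge,
}


def evaluate_rule(expression: str, event: dict, context: dict) -> bool:
    """Evaluate a boolean expression by walking its AST; never calls eval/exec."""
    roots = {"event": event, "context": context}

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BoolOp):
            # Generators keep the usual short-circuit behaviour of and/or.
            if isinstance(node.op, ast.And):
                return all(walk(v) for v in node.values)
            return any(walk(v) for v in node.values)
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            return not walk(node.operand)
        if isinstance(node, ast.Compare):
            left = walk(node.left)
            for op, comparator in zip(node.ops, node.comparators):
                fn = _COMPARE.get(type(op))
                if fn is None:
                    raise ValueError(f"disallowed operator: {type(op).__name__}")
                right = walk(comparator)
                if not fn(left, right):
                    return False
                left = right
            return True
        if isinstance(node, ast.Constant):
            if node.value is None or isinstance(node.value, (str, int, float, bool)):
                return node.value
            raise ValueError("disallowed constant")
        if isinstance(node, ast.Name):
            if node.id in roots:
                return roots[node.id]
            raise ValueError(f"unknown name: {node.id}")
        if isinstance(node, ast.Attribute):
            base = walk(node.value)
            # Attribute access is mapped onto dict keys only, never getattr().
            if isinstance(base, dict) and node.attr in base:
                return base[node.attr]
            raise ValueError(f"unknown attribute: {node.attr}")
        # Calls, subscripts, lambdas, comprehensions, etc. all land here.
        raise ValueError(f"disallowed syntax: {type(node).__name__}")

    return bool(walk(ast.parse(expression, mode="eval")))
```

For example, `evaluate_rule("event.repo == 'activity-core' and context.sbom.age_days > 30", {"repo": "activity-core"}, {"sbom": {"age_days": 45}})` evaluates the comparison against the supplied dicts, while an expression containing a function call is rejected with `ValueError` because `ast.Call` is not on the allow-list.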
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -248,7 +336,10 @@ new one-off control paths.
|
|||||||
`src/activity_core/activities.py` (Temporal activities),
|
`src/activity_core/activities.py` (Temporal activities),
|
||||||
`src/activity_core/event_router.py` (NATS → Temporal),
|
`src/activity_core/event_router.py` (NATS → Temporal),
|
||||||
`src/activity_core/schedule_manager.py` (Temporal Schedules),
|
`src/activity_core/schedule_manager.py` (Temporal Schedules),
|
||||||
`src/activity_core/api.py` (FastAPI admin).
|
`src/activity_core/api.py` (FastAPI admin),
|
||||||
|
`src/activity_core/report_sinks.py` (instruction reports),
|
||||||
|
`src/activity_core/ops_evidence_sinks.py` (ops evidence),
|
||||||
|
and `src/activity_core/context_resolvers/` (external context adapters).
|
||||||
- Definition files: `event-types/`, `activity-definitions/`, and `tasks/`.
|
- Definition files: `event-types/`, `activity-definitions/`, and `tasks/`.
|
||||||
- Dev environment: `docker-compose.dev.yml` (Temporal + PostgreSQL + NATS).
|
- Dev environment: `docker-compose.dev.yml` (Temporal + PostgreSQL + NATS).
|
||||||
- Entry points: `uv run python -m activity_core.worker` (Temporal worker),
|
- Entry points: `uv run python -m activity_core.worker` (Temporal worker),
|
||||||
@@ -264,6 +355,7 @@ title: Durable event-triggered task factory
|
|||||||
description: >
|
description: >
|
||||||
Org-wide Event Bridge that receives time-based and domain events, evaluates
|
Org-wide Event Bridge that receives time-based and domain events, evaluates
|
||||||
declarative rules and LLM instructions against current org context, and emits
|
declarative rules and LLM instructions against current org context, and emits
|
||||||
structured task sets to issue-core with a full spawn audit trail.
|
structured task, report, and evidence outputs with a full spawn/report audit
|
||||||
keywords: [temporal, workflow, event-bridge, task, cron, event, rule, instruction, org-automation]
|
trail while leaving task lifecycle ownership downstream.
|
||||||
|
keywords: [temporal, workflow, event-bridge, task, report, evidence, cron, event, rule, instruction, org-automation]
|
||||||
```
|
```
|
||||||
|
|||||||
184
agents/agent-coach.md
Normal file
@@ -0,0 +1,184 @@
|
|||||||
|
---
|
||||||
|
name: coach
|
||||||
|
description: Coaching meta-agent that reads all agent memories in a project and synthesises cross-agent briefs and new-agent orientations
|
||||||
|
category: meta
|
||||||
|
memory: enabled
|
||||||
|
---
|
||||||
|
|
||||||
|
# Coach Agent
|
||||||
|
|
||||||
|
## Role
|
||||||
|
|
||||||
|
You are the **kaizen-agentic Coach** — a meta-agent that observes, synthesises,
|
||||||
|
and advises. You do not perform domain work (coding, testing, infrastructure).
|
||||||
|
Your sole purpose is to read across the accumulated memories of all agents in a
|
||||||
|
project and produce useful, targeted briefs.
|
||||||
|
|
||||||
|
You are invoked via:
|
||||||
|
```
|
||||||
|
kaizen-agentic memory brief <agent-name>
|
||||||
|
```
|
||||||
|
|
||||||
|
Or directly by the operator: *"Coach, brief the sys-medic agent on this project"*
|
||||||
|
or *"Coach, what patterns have you observed across all agents?"*
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## What You Do
|
||||||
|
|
||||||
|
### 1. Cross-Agent Synthesis
|
||||||
|
|
||||||
|
Read all `.kaizen/agents/*/memory.md` files in the current project. Identify:
|
||||||
|
|
||||||
|
- **Shared patterns**: themes that appear across multiple agents
|
||||||
|
(e.g. "three agents flagged missing test coverage as a risk")
|
||||||
|
- **Cross-domain risks**: signals in one agent's memory that should inform
|
||||||
|
another (e.g. infrastructure instability flagged by sys-medic → tdd-workflow
|
||||||
|
should account for flaky environments)
|
||||||
|
- **Resource or architectural signals**: recurring mentions of specific files,
|
||||||
|
modules, services, or systems across agents
|
||||||
|
- **Contradictions or gaps**: where agents hold conflicting assumptions or where
|
||||||
|
no agent has coverage
|
||||||
|
|
||||||
|
### 2. New-Agent Orientation
|
||||||
|
|
||||||
|
When asked to brief a specific agent about to be deployed for the first time:
|
||||||
|
|
||||||
|
1. Read all existing agent memories in the project
|
||||||
|
2. Filter for what is relevant to the incoming agent's domain
|
||||||
|
3. Produce a targeted orientation brief covering:
|
||||||
|
- **Project context**: what kind of project this is, key constraints
|
||||||
|
- **What to know first**: the most important facts for this agent
|
||||||
|
- **Watch points**: risks or pitfalls flagged by other agents that are relevant
|
||||||
|
- **What has worked**: successful approaches in adjacent domains
|
||||||
|
- **Open threads**: unresolved items from other agents that may interact with
|
||||||
|
this agent's work
|
||||||
|
|
||||||
|
### 3. Fleet Health Overview
|
||||||
|
|
||||||
|
When asked for a fleet overview:
|
||||||
|
|
||||||
|
- Summarise the health of the agent fleet: which agents are active, stale, or
|
||||||
|
missing from the project
|
||||||
|
- Flag agents with high `session_count` and still-open `## Open Threads`
|
||||||
|
- Identify agents whose memories suggest overlapping concerns
|
||||||
|
- Recommend whether any memory files should be reviewed or reset
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## How to Read Agent Memory Files
|
||||||
|
|
||||||
|
Memory files live at `.kaizen/agents/<name>/memory.md` relative to the project
|
||||||
|
root. Each follows ADR-002 structure:
|
||||||
|
|
||||||
|
```
|
||||||
|
## Project Context ← agent's understanding of the project
|
||||||
|
## Accumulated Findings ← patterns and recurring issues
|
||||||
|
## What Worked ← validated approaches
|
||||||
|
## Watch Points ← risks and traps
|
||||||
|
## Open Threads ← unresolved items
|
||||||
|
## Session Log ← chronological session summaries
|
||||||
|
```
|
||||||
|
|
||||||
|
When synthesising, weight `## Watch Points` and `## Open Threads` most heavily —
|
||||||
|
these are the signals most likely to be actionable for another agent.
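
A minimal sketch of how the two heavily weighted sections could be pulled out of each memory file. The helper names `split_sections` and `priority_signals` are hypothetical and not part of kaizen-agentic; the sketch assumes section headings appear exactly as in the ADR-002 layout above and that no fenced block inside a memory file contains a line starting with `## `.

```python
from pathlib import Path


def split_sections(markdown_text):
    """Map each '## Heading' to the text beneath it, up to the next '## ' heading."""
    sections = {}
    current = None
    for line in markdown_text.splitlines():
        if line.startswith("## "):  # '### ' sub-headings do not match this prefix
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


def priority_signals(project_root):
    """Collect Watch Points and Open Threads per agent (hypothetical helper)."""
    signals = {}
    pattern = ".kaizen/agents/*/memory.md"
    for memory_file in sorted(Path(project_root).glob(pattern)):
        sections = split_sections(memory_file.read_text(encoding="utf-8"))
        signals[memory_file.parent.name] = {
            "watch_points": sections.get("Watch Points", ""),
            "open_threads": sections.get("Open Threads", ""),
        }
    return signals
```

The result is keyed by agent directory name, so a missing section shows up as an empty string rather than an error. That keeps a partially filled memory file visible in the synthesis.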
|
||||||
|
|
||||||
|
### Project metrics (ADR-004)
|
||||||
|
|
||||||
|
Quantitative performance data lives at `.kaizen/metrics/<agent>/summary.json`.
|
||||||
|
`kaizen-agentic memory brief <agent>` includes a `## Performance Summary` block
|
||||||
|
when metrics exist.
|
||||||
|
|
||||||
|
When synthesising orientations:
|
||||||
|
|
||||||
|
- Combine qualitative memory with quantitative trends (success rate, quality,
|
||||||
|
execution time, trend arrows)
|
||||||
|
- Flag agents with declining success rate or quality trends
|
||||||
|
- Cross-reference metrics with `## Watch Points` — do metrics confirm or
|
||||||
|
contradict qualitative findings?
|
||||||
|
- Note when an agent has memory but no metrics (incomplete session-close protocol)
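
The last check in this list (memory present, metrics absent) can be done from the file layout alone. The sketch below uses only the two paths named in this section; the function name `agents_missing_metrics` is hypothetical, and the sketch does not inspect the contents of `summary.json`, whose schema is not described here.

```python
from pathlib import Path


def agents_missing_metrics(project_root):
    """Return agent names that have memory.md but no metrics summary.json."""
    root = Path(project_root)
    missing = []
    for memory_file in sorted(root.glob(".kaizen/agents/*/memory.md")):
        agent = memory_file.parent.name
        summary = root / ".kaizen" / "metrics" / agent / "summary.json"
        if not summary.is_file():
            missing.append(agent)
    return missing
```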
|
||||||
|
|
||||||
|
Fleet optimizer output at `.kaizen/metrics/optimizer/analysis.json` provides
|
||||||
|
project-wide analysis from `kaizen-agentic metrics optimize`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Output Format
|
||||||
|
|
||||||
|
### Cross-agent brief
|
||||||
|
|
||||||
|
```
|
||||||
|
## Cross-Agent Brief — <project name>
|
||||||
|
Generated: <date>
|
||||||
|
Agents with memory: <list>
|
||||||
|
|
||||||
|
### Shared Patterns
|
||||||
|
<bullet list of themes appearing across ≥2 agents>
|
||||||
|
|
||||||
|
### Cross-Domain Risks
|
||||||
|
<risks from one domain relevant to others>
|
||||||
|
|
||||||
|
### Open Threads (fleet-wide)
|
||||||
|
<unresolved items that span or affect multiple agents>
|
||||||
|
|
||||||
|
### Fleet Health
|
||||||
|
<which agents are active/stale, any concerning signals>
|
||||||
|
```
|
||||||
|
|
||||||
|
### New-agent orientation
|
||||||
|
|
||||||
|
```
|
||||||
|
## Orientation Brief for: <agent-name>
|
||||||
|
Project: <project name>
|
||||||
|
Generated: <date>
|
||||||
|
Sources: <which agent memories were read>
|
||||||
|
|
||||||
|
### Performance Summary
|
||||||
|
<from .kaizen/metrics/<agent>/ when available — success rate, quality, trends>
|
||||||
|
|
||||||
|
### What to Know First
|
||||||
|
<3–5 most important facts for this agent>
|
||||||
|
|
||||||
|
### Watch Points
|
||||||
|
<risks relevant to this agent's domain>
|
||||||
|
|
||||||
|
### What Has Worked
|
||||||
|
<approaches validated by other agents that apply here>
|
||||||
|
|
||||||
|
### Open Threads You May Encounter
|
||||||
|
<items from other agents that may intersect with your work>
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Behaviour Boundaries
|
||||||
|
|
||||||
|
- **Do not** modify agent memory files
|
||||||
|
- **Do not** perform any domain-specific work (coding, testing, diagnosis)
|
||||||
|
- **Do not** make decisions — synthesise and advise only
|
||||||
|
- **If no memories exist**: say so clearly and offer to help initialise them
|
||||||
|
- **If asked about a specific agent not present**: note the gap
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Coach's Own Memory
|
||||||
|
|
||||||
|
The coach maintains `.kaizen/agents/coach/memory.md` covering:
|
||||||
|
|
||||||
|
- Fleet-level patterns observed over time
|
||||||
|
- How the agent population in this project has evolved
|
||||||
|
- Meta-observations about how well the memory convention is being followed
|
||||||
|
- Recurring gaps or blind spots in the agent fleet
|
||||||
|
|
||||||
|
### Session Start
|
||||||
|
|
||||||
|
1. Check for `.kaizen/agents/coach/memory.md`.
|
||||||
|
2. If present, read it — prior fleet observations provide context for the current synthesis.
|
||||||
|
3. Scan `.kaizen/agents/*/memory.md` to build the current fleet picture.
|
||||||
|
|
||||||
|
### Session Close
|
||||||
|
|
||||||
|
1. Update `## Accumulated Findings` with new fleet-level patterns.
|
||||||
|
2. Note any new agents added or memory files reset.
|
||||||
|
3. Append one line to `## Session Log`: `YYYY-MM-DD · <brief requested for> · <key finding>`.
|
||||||
|
4. Bump `last_updated` and `session_count`.
|
||||||
191
agents/agent-optimization.md
Normal file
@@ -0,0 +1,191 @@
|
|||||||
|
---
|
||||||
|
name: optimization
|
||||||
|
description: Meta-agent that analyzes and optimizes other Claude Code subagents based on their performance data, usage patterns, and effectiveness metrics. Use PROACTIVELY for agent ecosystem improvement.
|
||||||
|
model: inherit
|
||||||
|
category: meta
|
||||||
|
memory: enabled
|
||||||
|
---
|
||||||
|
|
||||||
|
# Kaizen Optimizer - Agent Performance Meta-Optimizer
|
||||||
|
|
||||||
|
## Purpose
|
||||||
|
|
||||||
|
Meta-agent that analyzes and optimizes other Claude Code subagents based on their performance data, usage patterns, and effectiveness metrics. Continuously improves the agent ecosystem by identifying patterns that correlate with success or failure, and proposing data-driven refinements to agent specifications.
|
||||||
|
|
||||||
|
## When to Use This Agent
|
||||||
|
|
||||||
|
Use the kaizen-optimizer agent when you need:
|
||||||
|
|
||||||
|
- Analysis of subagent performance and effectiveness
|
||||||
|
- Optimization recommendations for existing agents
|
||||||
|
- Agent specification improvements based on usage data
|
||||||
|
- Performance pattern identification across agent invocations
|
||||||
|
- Agent ecosystem health assessment
|
||||||
|
- Continuous improvement of the agent framework
|
||||||
|
|
||||||
|
### Trigger Patterns
|
||||||
|
|
||||||
|
1. **Scheduled Reviews**: Regular analysis of agent performance (weekly/monthly)
|
||||||
|
2. **Performance Degradation**: When agent success rates drop below thresholds
|
||||||
|
3. **New Agent Evaluation**: After deploying new agents to assess effectiveness
|
||||||
|
4. **Usage Pattern Changes**: When agent usage patterns shift significantly
|
||||||
|
5. **Explicit Optimization Requests**: Direct requests for agent improvement analysis
|
||||||
|
|
||||||
|
### Example Usage Scenarios
|
||||||
|
|
||||||
|
1. **Post-Project Analysis**: "Analyze how well our agents performed during Issue #15 implementation and suggest improvements"
|
||||||
|
2. **Agent Performance Review**: "Review the effectiveness of tddai-assistant over the last 30 days and recommend optimizations"
|
||||||
|
3. **Ecosystem Optimization**: "Identify which agents are underperforming and suggest specification improvements"
|
||||||
|
4. **Success Pattern Analysis**: "Analyze successful agent chains and recommend best practices"
|
||||||
|
|
||||||
|
## Agent Capabilities
|
||||||
|
|
||||||
|
### Performance Analysis
|
||||||
|
- **Success Rate Analysis**: Track agent task completion and success metrics
|
||||||
|
- **Usage Pattern Recognition**: Identify how agents are being used effectively
|
||||||
|
- **Failure Mode Analysis**: Categorize and analyze agent failure patterns
|
||||||
|
- **Response Quality Assessment**: Evaluate the quality of agent outputs
|
||||||
|
|
||||||
|
### Optimization Recommendations
|
||||||
|
- **Specification Refinements**: Suggest improvements to agent descriptions and capabilities
|
||||||
|
- **Trigger Pattern Optimization**: Refine when and how agents should be invoked
|
||||||
|
- **Chain Optimization**: Recommend better agent collaboration patterns
|
||||||
|
- **Scope Adjustments**: Identify agents that are too broad or too narrow in scope
|
||||||
|
|
||||||
|
### Meta-Learning
|
||||||
|
- **Pattern Detection**: Identify successful agent behaviors and specifications
|
||||||
|
- **Correlation Analysis**: Find relationships between agent characteristics and performance
|
||||||
|
- **Best Practice Extraction**: Distill successful patterns into reusable guidelines
|
||||||
|
- **Evolution Tracking**: Monitor how agent improvements affect performance over time
|
||||||
|
|
||||||
|
## Analysis Framework
|
||||||
|
|
||||||
|
### Data Collection Focus
|
||||||
|
Since this agent operates within Claude Code's environment, analysis is based on:
|
||||||
|
|
||||||
|
- **Conversation Context**: Agent invocation patterns and outcomes within sessions
|
||||||
|
- **User Feedback Patterns**: Implicit success signals from user interactions
|
||||||
|
- **Task Completion Rates**: Whether agents successfully complete their assigned tasks
|
||||||
|
- **Agent Specification Quality**: How well specifications match actual usage
|
||||||
|
|
||||||
|
### Performance Metrics
|
||||||
|
- **Invocation Success**: How often agents complete tasks as intended
|
||||||
|
- **User Satisfaction Indicators**: Continued usage, follow-up requests, task completion
|
||||||
|
- **Agent Utilization**: Which agents are used most/least and why
|
||||||
|
- **Chain Effectiveness**: Success rates of multi-agent workflows
|
||||||
|
|
||||||
|
## Optimization Strategies
|
||||||
|
|
||||||
|
### Specification Enhancement
|
||||||
|
- **Clarity Improvements**: Make agent purposes and capabilities clearer
|
||||||
|
- **Scope Refinement**: Adjust agent boundaries for better effectiveness
|
||||||
|
- **Example Enhancement**: Add better usage examples and scenarios
|
||||||
|
- **Integration Guidance**: Improve agent-to-agent collaboration descriptions
|
||||||
|
|
||||||
|
### Performance Improvement
|
||||||
|
- **Trigger Optimization**: Refine when agents should be automatically suggested
|
||||||
|
- **Capability Matching**: Ensure agent capabilities match user needs
|
||||||
|
- **Redundancy Reduction**: Identify and resolve agent overlap issues
|
||||||
|
- **Gap Identification**: Find missing capabilities in the agent ecosystem
|
||||||
|
|
||||||
|
## Integration with Agent Ecosystem
|
||||||
|
|
||||||
|
### Analyzes All Agents
|
||||||
|
- **general-purpose**: Assess effectiveness for research and multi-step tasks
|
||||||
|
- **tddai-assistant**: Evaluate TDD workflow support and methodology adherence
|
||||||
|
- **project-assistant**: Review project management and milestone tracking performance
|
||||||
|
- **claude-expert**: Analyze documentation and feature explanation effectiveness
|
||||||
|
- **statusline-setup**: Assess configuration task success rates
|
||||||
|
- **output-style-setup**: Evaluate creative task completion effectiveness
|
||||||
|
|
||||||
|
### Collaborative Analysis
|
||||||
|
Works with other agents to gather performance data:
|
||||||
|
- Uses **general-purpose** for complex analysis tasks
|
||||||
|
- Coordinates with **project-assistant** for milestone-based performance tracking
|
||||||
|
- Leverages **claude-expert** for framework knowledge and best practices
|
||||||
|
|
||||||
|
## Expected Outputs
|
||||||
|
|
||||||
|
### Performance Analysis Reports
|
||||||
|
- Agent effectiveness rankings with supporting evidence
|
||||||
|
- Usage pattern analysis and trend identification
|
||||||
|
- Success/failure correlation analysis
|
||||||
|
- Performance bottleneck identification
|
||||||
|
|
||||||
|
### Optimization Recommendations
|
||||||
|
- Specific agent specification improvements
|
||||||
|
- Trigger pattern refinements
|
||||||
|
- Agent chain optimization suggestions
|
||||||
|
- New agent capability recommendations
|
||||||
|
|
||||||
|
### Implementation Guidance
|
||||||
|
- Prioritized improvement roadmap
|
||||||
|
- Specification update templates
|
||||||
|
- A/B testing suggestions for agent improvements
|
||||||
|
- Rollback strategies for failed optimizations
|
||||||
|
|
||||||
|
## Best Practices for Usage
|
||||||
|
|
||||||
|
### Provide Performance Context
|
||||||
|
- Share specific agent interactions that were particularly effective or ineffective
|
||||||
|
- Describe user experience challenges with current agents
|
||||||
|
- Include examples of successful and unsuccessful agent chains
|
||||||
|
- Specify performance concerns or optimization goals
|
||||||
|
|
||||||
|
### Be Specific About Scope
|
||||||
|
- Focus on particular agents or agent categories for analysis
|
||||||
|
- Define time windows for performance analysis
|
||||||
|
- Specify success criteria for optimization efforts
|
||||||
|
- Clarify whether analysis should be broad ecosystem or targeted
|
||||||
|
|
||||||
|
### Implementation Approach
|
||||||
|
- Request prioritized recommendations based on impact vs. effort
|
||||||
|
- Ask for specific specification changes rather than general advice
|
||||||
|
- Seek rollback plans for proposed optimizations
|
||||||
|
- Request measurable success criteria for improvements
|
||||||
|
|
||||||
|
## Quality Standards
|
||||||
|
|
||||||
|
### Analysis Rigor
|
||||||
|
- Evidence-based recommendations supported by usage patterns
|
||||||
|
- Consideration of trade-offs between different optimization approaches
|
||||||
|
- Realistic improvement expectations and timelines
|
||||||
|
- Acknowledgment of limitations in available performance data
|
||||||
|
|
||||||
|
### Recommendation Quality
|
||||||
|
- Specific, actionable changes to agent specifications
|
||||||
|
- Clear success criteria for measuring improvement effectiveness
|
||||||
|
- Integration considerations for agent ecosystem harmony
|
||||||
|
- Risk assessment for proposed changes
|
||||||
|
|
||||||
|
## Integration Notes
|
||||||
|
|
||||||
|
This agent operates within Claude Code's conversation context and focuses on:
|
||||||
|
|
||||||
|
- **Qualitative Analysis**: Since detailed metrics aren't available, focuses on behavioral patterns and user interaction quality
|
||||||
|
- **Specification Optimization**: Improving agent descriptions, examples, and usage guidance
|
||||||
|
- **Ecosystem Balance**: Ensuring agents complement rather than compete with each other
|
||||||
|
- **Practical Improvements**: Recommendations that can be implemented through specification updates
|
||||||
|
|
||||||
|
The agent serves as the continuous improvement engine for the subagent ecosystem, ensuring agents evolve to better serve user needs and project requirements.
|
||||||
|
|
||||||
|
## Session Start
|
||||||
|
|
||||||
|
1. Check for `.kaizen/agents/optimization/memory.md` in the project root.
|
||||||
|
2. If present, read it before beginning analysis.
|
||||||
|
3. Review `.kaizen/metrics/optimizer/analysis.json` if it exists for the latest fleet report.
|
||||||
|
|
||||||
|
## Session Close
|
||||||
|
|
||||||
|
1. When analysis completes, note key findings in `## Accumulated Findings`.
|
||||||
|
2. Append one line to `## Session Log`: `YYYY-MM-DD · <agents reviewed> · <outcome>`.
|
||||||
|
3. Bump `last_updated` and increment `session_count`.
|
||||||
|
4. Persist quantitative analysis via CLI (ADR-004):
|
||||||
|
|
||||||
|
```bash
|
||||||
|
kaizen-agentic metrics optimize [agent-name]
|
||||||
|
```
|
||||||
|
|
||||||
|
Run without an agent name to analyze all agents with project metrics. Requires
|
||||||
|
≥10 execution records per agent for actionable recommendations (see
|
||||||
|
`wiki/AgentKaizenOptimizer.md`).
|
||||||
@@ -216,11 +216,21 @@ it. The output schema must define `List[TaskSpec]` or a compatible envelope.
|
|||||||
|
|
||||||
#### `review_required: true`
|
#### `review_required: true`
|
||||||
|
|
||||||
When set, the instruction's proposed task list is written to a **pending review
|
As currently implemented, when set, the instruction's task/report output is marked with
|
||||||
queue** in issue-core rather than directly created. A human or curator agent
|
`review_required=true` in activity-core audit metadata. For report-producing
|
||||||
reviews and approves/rejects before tasks are materialised. This is the default
|
instructions, this flag is also persisted in configured report sinks so an
|
||||||
for instructions that create high-impact tasks (cross-repo changes, security
|
operator can distinguish validated-but-review-worthy output from routine
|
||||||
responses, production operations).
|
output.
|
||||||
|
|
||||||
|
activity-core does **not** currently route proposed tasks to a pending review
|
||||||
|
queue. That queue must be owned by issue-core, because issue-core owns task
|
||||||
|
lifecycle state. Until issue-core exposes a review contract, `review_required`
|
||||||
|
is metadata only; it must not be treated as evidence that live task creation was
|
||||||
|
held for approval.
|
||||||
|
|
||||||
|
Future issue-core review integration may use the same field, but that change
|
||||||
|
must update the issue sink contract and tests before any ActivityDefinition
|
||||||
|
relies on queue routing.
|
||||||
|
|
||||||
#### Evaluation semantics
|
#### Evaluation semantics
|
||||||
|
|
||||||
@@ -286,7 +296,8 @@ This boundary makes future extraction to `rules-core` a packaging exercise, not
|
|||||||
tasks" behaviour is replaced by explicit rule blocks.
|
tasks" behaviour is replaced by explicit rule blocks.
|
||||||
- A new `RuleEvaluator` class (AST walker) is added to `src/activity_core/rules/`.
|
- A new `RuleEvaluator` class (AST walker) is added to `src/activity_core/rules/`.
|
||||||
- A new `InstructionExecutor` class handles prompt rendering, LLM call, output
|
- A new `InstructionExecutor` class handles prompt rendering, LLM call, output
|
||||||
validation, and review queue routing.
|
validation, and review-required audit metadata. Pending review queue routing
|
||||||
|
remains a future issue-core integration.
|
||||||
- Integration tests for rule evaluation use fixture JSON; no running Temporal required.
|
- Integration tests for rule evaluation use fixture JSON; no running Temporal required.
|
||||||
- The `task_spawn_log` table is added to the Postgres schema (new Alembic migration).
|
- The `task_spawn_log` table is added to the Postgres schema (new Alembic migration).
|
||||||
- ActivityDefinition files that omit both `rules` and `instructions` are valid
|
- ActivityDefinition files that omit both `rules` and `instructions` are valid
|
||||||
|
|||||||
156
docs/adr/adr-004-producer-trust-boundary.md
Normal file
@@ -0,0 +1,156 @@
|
|||||||
|
---
|
||||||
|
id: ACT-ADR-004
|
||||||
|
type: architecture-decision-record
|
||||||
|
title: "The Producer Trust Boundary — Guardrails and Error-Correction for Untrusted Output"
|
||||||
|
status: accepted
|
||||||
|
decided_by: Bernd Worsch
|
||||||
|
date: "2026-06-26"
|
||||||
|
scope: cross-repo
|
||||||
|
affects:
|
||||||
|
- activity-core
|
||||||
|
- rules-core (future extraction)
|
||||||
|
tags: ["architecture", "llm", "safety", "validation", "guardrails", "trust-boundary", "resilience"]
|
||||||
|
---
|
||||||
|
|
||||||
|
# ACT-ADR-004: The Producer Trust Boundary
|
||||||
|
|
||||||
|
## Status
|
||||||
|
|
||||||
|
Accepted.
|
||||||
|
|
||||||
|
## Context
|
||||||
|
|
||||||
|
On 2026-06-26 the scheduled daily WSJF triage instruction fired on time, called
|
||||||
|
llm-connect successfully, and produced a long ranked recommendation list — but
|
||||||
|
the JSON broke at char 5268 (~rank 8–9 of ~16), failing schema validation. Because
|
||||||
|
the report was validated and consumed as a single monolithic JSON document, one
|
||||||
|
malformed delimiter discarded the **entire** run, including the 7 perfectly good
|
||||||
|
recommendations the model had already emitted. The scheduling and runtime layers
|
||||||
|
were healthy; the failure was entirely at the seam where free-form model output
|
||||||
|
meets a strict consumer.
|
||||||
|
|
||||||
|
This is not a one-off bug; it is a recurring class of failure. activity-core has a **trust
|
||||||
|
boundary** wherever generative or human-authored output meets strict deterministic
|
||||||
|
consumers: the JSON Schema validator, the task emitter, and any classic compute
|
||||||
|
pipeline downstream. The producers on the other side of that boundary — **LLMs,
|
||||||
|
agents, and humans** — are all *untrusted producers*. Their output may be:
|
||||||
|
|
||||||
|
- **erroneous** — hallucination, truncation at a token limit, drift, type slips,
|
||||||
|
typos, a missing delimiter; or
|
||||||
|
- **malicious** — prompt injection, crafted payloads, or oversized / deeply-nested
|
||||||
|
structures intended to exhaust or confuse the consumer.
|
||||||
|
|
||||||
|
The pre-existing design treated producer output optimistically: parse the whole
|
||||||
|
document, validate the whole document, and on any failure discard the whole
|
||||||
|
document (preserving only a bounded diagnostic preview). That gives **zero error
|
||||||
|
locality** — the blast radius of any single defect is the entire activation.
|
||||||
|
|
||||||
|
## Decision
|
||||||
|
|
||||||
|
Treat the producer→consumer seam as an explicit, adversarial **trust boundary**,
|
||||||
|
and place guardrails plus error-correction tooling *at that boundary* rather than
|
||||||
|
letting raw producer output flow into deterministic consumers.
|
||||||
|
|
||||||
|
### Two non-fail-fast postures
|
||||||
|
|
||||||
|
When hard-failing on a problem is undesirable, there are two sound strategies, and
|
||||||
|
they **compose**:
|
||||||
|
|
||||||
|
- **A) Trust but handle exceptions** (optimistic / reactive). Consume the output
|
||||||
|
as-is; on exception, catch → repair → retry, or quarantine. Cheap on the happy
|
||||||
|
path; blast radius depends entirely on how granular the catch is. Best when
|
||||||
|
failures are rare and locally recoverable. Risk: failures surface late, possibly
|
||||||
|
after partial side effects.
|
||||||
|
- **B) Verify and mitigate** (defensive / proactive). Validate, sanitize, clamp,
|
||||||
|
and normalize the output to a known-good shape *before* it enters the pipeline —
|
||||||
|
drop bad items, coerce types, bound sizes/depth, allow-list references — so the
|
||||||
|
consumer only ever sees clean input. Higher upfront cost, smaller blast radius,
|
||||||
|
no partial side effects. Best when failures are common or consequences are high.
|
||||||
|
|
||||||
|
### Governing principles
|
||||||
|
|
||||||
|
1. **Push verification to the boundary; keep the interior strict.** Apply posture
|
||||||
|
**B** at the producer→consumer boundary; keep posture **A** for residual
|
||||||
|
exceptions inside the verified core. Never relax the interior schema to absorb
|
||||||
|
producer sloppiness.
|
||||||
|
2. **Make error locality match the unit of work.** One bad recommendation must
|
||||||
|
cost one recommendation, not the whole report. Structuring the payload so each
|
||||||
|
item is independently parseable and validatable is the highest-leverage change.
|
||||||
|
3. **Quarantine, never silently drop.** Invalid units are preserved as bounded,
|
||||||
|
provenance-tagged artifacts (`index`, `error`, `raw` snippet, `reason`) so they
|
||||||
|
can be debugged or replayed. Degraded-but-usable is reported distinctly from
|
||||||
|
total loss.
|
||||||
|
4. **Both human and agent input get the same rigor.** Guardrails are
|
||||||
|
producer-agnostic: the same count / length / depth caps and reference
|
||||||
|
allow-lists apply whether the producer is an LLM, an agent, or a human.
|
||||||
|
|
||||||
|
### What this means concretely in activity-core
|
||||||
|
|
||||||
|
Implemented in `src/activity_core/rules/executor.py`:
|
||||||
|
|
||||||
|
- **Strict-structure-only schema.** The daily-triage output schema is strict on
|
||||||
|
per-item *structure* (`required [rank, candidate, action, why]`, typed `wsjf`)
|
||||||
|
and carries `maxItems` as a producer *hint* — never as a hard whole-document
|
||||||
|
reject, which would reproduce the very blast-radius failure (ACT-ADR-002 governs
|
||||||
|
the schema format; `schemas/daily-triage-report.json`).
|
||||||
|
- **Item-granular recovery (posture B).** When whole-document parse + one retry
|
||||||
|
fail, `_resilient_report` recovers individually-parseable recommendation objects
|
||||||
|
via a brace/quote-aware scanner (`_extract_object_spans`) that works for both
|
||||||
|
pretty-printed and NDJSON output, attempts a best-effort `_try_repair` on a
|
||||||
|
truncated tail, validates each recovered object against the item schema, and
|
||||||
|
keeps the valid ones. Survivors are emitted with `output_validated=true`,
|
||||||
|
`partial=true`, and `review_required=true`.
|
||||||
|
- **Producer guardrails (`_partition_items`, applied on both the recovery and the
|
||||||
|
happy path).** Per recommendation: structural type → schema → structural caps
|
||||||
|
(`_MAX_DEPTH`, `_MAX_STRING_LEN`) → reference allow-list → count cap (top-N by
|
||||||
|
`maxItems`). The first failing check quarantines the item with provenance and a
|
||||||
|
`reason` (`malformed` / `schema` / `guardrail` / `allow_list` / `over_limit`).
|
||||||
|
- **Reference allow-list.** A recommendation whose `candidate` is not in the set of
|
||||||
|
known ids is quarantined. The set is sourced from resolved context
|
||||||
|
(`context["known_candidates"]`, via `_allow_list_from_context`); the check is
|
||||||
|
inert until a context resolver populates it, so the capability ships now and
|
||||||
|
activates with a one-line resolver change.
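
To illustrate the item-granular recovery described above, here is a minimal sketch of a brace/quote-aware scanner. It is not the code in `executor.py`: the names `extract_object_spans` and `recover_objects` and the `object_depth` parameter are assumptions made for this example. The idea is to track string and escape state so that braces inside string values are ignored, and to emit only objects that were closed before the output broke.

```python
import json


def extract_object_spans(text, object_depth=1):
    """Return (start, end) pairs for complete {...} objects opened at object_depth.

    object_depth=1 suits NDJSON (one object per line); object_depth=2 suits a
    wrapper such as {"recommendations": [{...}, {...}]}.
    """
    spans = []
    depth = 0
    start = None
    in_string = False
    escaped = False
    for pos, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False      # the character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
            if depth == object_depth:
                start = pos
        elif ch == "}" and depth > 0:
            if depth == object_depth and start is not None:
                spans.append((start, pos + 1))
                start = None
            depth -= 1
    return spans


def recover_objects(text, object_depth=1):
    """Parse every complete span; spans that still fail to parse are skipped here."""
    recovered = []
    for start, end in extract_object_spans(text, object_depth):
        try:
            recovered.append(json.loads(text[start:end]))
        except json.JSONDecodeError:
            continue  # a real implementation would quarantine this span instead
    return recovered
```

One limitation of this sketch: if the defect is an unclosed string in the middle of the document, the quote tracking stays inverted from that point on, so later objects may be missed. Objects completed before the defect are still recovered.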
|
||||||
|
|
||||||
|
### Where each posture sits
|
||||||
|
|
||||||
|
| Layer | Posture | Mechanism |
|
||||||
|
|-------|---------|-----------|
|
||||||
|
| Schema / contract | B | strict per-item structure; `maxItems` as hint |
|
||||||
|
| Whole-document parse | A | tolerant parse + single retry |
|
||||||
|
| Failed parse | B | item-granular recovery + repair + quarantine |
|
||||||
|
| Per-item screening | B | schema + depth/length caps + allow-list + count cap |
|
||||||
|
| Emitted report | — | `partial` / `quarantined_*` provenance; never silent |
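
The per-item screening row can be sketched as an ordered chain of checks where the first failure decides the quarantine reason. This is an illustration under stated assumptions, not the `_partition_items` implementation: the default cap values, the required-key check standing in for full schema validation, the shape of the quarantine record, and "keep the first N in order" as the count-cap policy are all assumed for the example.

```python
REQUIRED_KEYS = ("rank", "candidate", "action", "why")  # per-item fields named in the ADR
RAW_SNIPPET_LIMIT = 200  # assumed bound on the quarantined raw snippet


def nesting_depth(value):
    """Depth of nested dicts/lists; a scalar has depth 0."""
    if isinstance(value, dict):
        return 1 + max((nesting_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((nesting_depth(v) for v in value), default=0)
    return 0


def longest_string(value):
    """Length of the longest string value found anywhere inside value."""
    if isinstance(value, str):
        return len(value)
    if isinstance(value, dict):
        return max((longest_string(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return max((longest_string(v) for v in value), default=0)
    return 0


def first_failure(item, max_depth, max_string_len, allow_list):
    """Return the reason for the first failing check, or None if the item passes."""
    if not isinstance(item, dict):
        return "malformed"
    if any(key not in item for key in REQUIRED_KEYS):
        return "schema"
    if nesting_depth(item) > max_depth or longest_string(item) > max_string_len:
        return "guardrail"
    candidate = item["candidate"]
    # allow_list=None models the "inert until populated" behaviour.
    if allow_list is not None and (not isinstance(candidate, str) or candidate not in allow_list):
        return "allow_list"
    return None


def partition_items(items, max_items, max_depth=4, max_string_len=2000, allow_list=None):
    """Split items into kept and quarantined, preserving index and reason."""
    kept, quarantined = [], []
    for index, item in enumerate(items):
        reason = first_failure(item, max_depth, max_string_len, allow_list)
        if reason is None and len(kept) >= max_items:
            reason = "over_limit"
        if reason is None:
            kept.append(item)
        else:
            quarantined.append(
                {"index": index, "reason": reason, "raw": repr(item)[:RAW_SNIPPET_LIMIT]}
            )
    return kept, quarantined
```

Because the count cap is applied last, an invalid item never consumes one of the `max_items` slots.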
|
||||||
|
|
||||||
|
## Consequences
|
||||||
|
|
||||||
|
- A single malformed or oversized item no longer discards an entire activation;
|
||||||
|
the daily-triage run that failed on 2026-06-26 would now deliver its 7 valid
|
||||||
|
recommendations and quarantine the broken tail.
|
||||||
|
- Reports gain a `partial` / `quarantined_*` vocabulary; downstream report sinks
|
||||||
|
and reviewers can distinguish degraded-but-usable from total loss.
|
||||||
|
- Guardrail thresholds (`_MAX_DEPTH`, `_MAX_STRING_LEN`, `maxItems`, the
|
||||||
|
allow-list) are policy knobs that will need tuning; they are intentionally
|
||||||
|
conservative defaults, not a finished calibration.
|
||||||
|
- **Known retention gap (follow-on):** `LLMConnectClient.complete()` still returns
|
||||||
|
only `content`, discarding `finish_reason`/`usage`, and the total-loss artifact
|
||||||
|
caps raw output below realistic break points. Capturing those signals so
|
||||||
|
failures stay debuggable is tracked as a retention fix, not closed by this ADR.
|
||||||
|
|
||||||
|
## Alternatives considered
|
||||||
|
|
||||||
|
- **Hard-enforce `maxItems` in the validator.** Rejected: a hard reject of an
|
||||||
|
over-count document reproduces the whole-document blast radius. Mitigation (keep
|
||||||
|
top-N, quarantine the rest) is preferred.
|
||||||
|
- **Relax the schema to accept anything.** Rejected: violates principle 1; pushes
|
||||||
|
malformed data into downstream consumers.
|
||||||
|
- **Retry-until-valid only (pure posture A).** Rejected as the sole strategy: the
|
||||||
|
2026-06-26 failure recurred across both the initial attempt and the retry, so
|
||||||
|
retry alone does not bound the blast radius.
|
||||||
|
|
||||||
|
## References
|
||||||
|
|
||||||
|
- ACT-ADR-002 — markdown-as-definition format and output schema governance.
|
||||||
|
- ACT-ADR-003 — Rule vs. Instruction model; the Instruction prompt-injection
|
||||||
|
surface this boundary complements on the output side.
|
||||||
|
- `workplans/ACTIVITY-WP-0016-llm-output-robustness-trust-boundary.md` — the
|
||||||
|
implementing workplan.
|
||||||
@@ -18,7 +18,7 @@ extension point `af654abb`).
|
|||||||
| Queue name | Registered workers |
|
| Queue name | Registered workers |
|
||||||
|---|---|
|
|---|---|
|
||||||
| `orchestrator-tq` | `RunActivityWorkflow` and all its activities (`load_activity_definition`, `resolve_context`, `log_run`) |
|
| `orchestrator-tq` | `RunActivityWorkflow` and all its activities (`load_activity_definition`, `resolve_context`, `log_run`) |
|
||||||
| `task-execution-tq` | `TaskExecutorWorkflow` and all concrete task type workflows |
|
| `task-execution-tq` | `TaskExecutorWorkflow` compatibility stub only; real execution belongs in per-repo workers |
|
||||||
|
|
||||||
**Rule:** a workflow and its activities must be registered on the same task queue.
|
**Rule:** a workflow and its activities must be registered on the same task queue.
|
||||||
Cross-queue activity calls require an explicit `task_queue` argument on
|
Cross-queue activity calls require an explicit `task_queue` argument on
|
||||||
@@ -60,6 +60,12 @@ A single process may run workers for multiple task queues, but each `Worker`
|
|||||||
instance is bound to one task queue. Use separate `Worker` instances for
|
instance is bound to one task queue. Use separate `Worker` instances for
|
||||||
`orchestrator-tq` and `task-execution-tq`.
|
`orchestrator-tq` and `task-execution-tq`.
|
||||||
|
|
||||||
|
`TaskExecutorWorkflow` is not a production execution surface for activity-core.
|
||||||
|
It exists only as a compatibility/idempotency stub that writes a synthetic
|
||||||
|
`task_instances` row in older tests and dev flows. Do not add concrete task
|
||||||
|
execution logic here; execution ownership belongs to per-repo workers or a
|
||||||
|
future execution-owned repo/workplan.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Search attributes
|
## Search attributes
|
||||||
|
|||||||
@@ -11,7 +11,9 @@ The current authoritative boundary is the issue-core REST API:
|
|||||||
POST {ISSUE_CORE_URL}/issues/
|
POST {ISSUE_CORE_URL}/issues/
|
||||||
```
|
```
|
||||||
|
|
||||||
`IssueCoreRestSink` sends this payload:
|
`IssueCoreRestSink` authenticates with the shared `ISSUE_CORE_API_KEY` env var
|
||||||
|
(same value as the issue-core server) via `Authorization: Bearer <key>` and
|
||||||
|
sends this payload:
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
@@ -52,7 +54,7 @@ task reference before it can replace `IssueCoreRestSink`.
|
|||||||
|
|
||||||
Weekly SBOM staleness is safe to evaluate in dry-run mode because the rule
|
Weekly SBOM staleness is safe to evaluate in dry-run mode because the rule
|
||||||
contract is deterministic and tested. Do not enable it against the real REST sink
|
contract is deterministic and tested. Do not enable it against the real REST sink
|
||||||
until issue-core credentials, endpoint reachability, and duplicate-handling are
|
until `ISSUE_CORE_API_KEY`, endpoint reachability, and duplicate-handling are
|
||||||
verified in the target environment.
|
verified in the target environment.
|
||||||
|
|
||||||
## Verification
|
## Verification
|
||||||
|
|||||||
163
docs/runbook.md
@@ -116,7 +116,58 @@ asyncio.run(publish())
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Syncing schedules manually
|
## Syncing definitions and schedules manually
|
||||||
|
|
||||||
|
When the API is running, prefer the admin sync endpoint for definition or
schedule changes. It refreshes file-backed ActivityDefinitions and reconciles
Temporal Schedules without restarting the worker:

```bash
curl -s -X POST \
  'http://localhost:8010/admin/sync?definitions=true&schedules=true'
```

The response reports:

- `definitions.synced`
- `event_types.synced`
- `schedules.upserted`
- `schedules.paused`
- `schedules.deleted_orphans`
- bounded `errors[]`
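
As a rough illustration of reading that response, the sketch below flattens the
reported counters into one log line. It assumes the dotted names above map to
nested JSON objects (for example `{"schedules": {"upserted": 3}}`) and that a
missing section can be read as zero. `summarize_sync` is a hypothetical helper,
not part of activity-core.

```python
from typing import Any

# Counter paths taken from the list above; the nested layout is an assumption.
COUNTERS = [
    ("definitions", "synced"),
    ("event_types", "synced"),
    ("schedules", "upserted"),
    ("schedules", "paused"),
    ("schedules", "deleted_orphans"),
]


def summarize_sync(response: dict[str, Any]) -> str:
    """Flatten an /admin/sync response into 'section.field=value' pairs."""
    parts = []
    for section, field in COUNTERS:
        # Treat a missing or null section as "nothing happened".
        value = (response.get(section) or {}).get(field, 0)
        parts.append(f"{section}.{field}={value}")
    parts.append(f"errors={len(response.get('errors') or [])}")
    return " ".join(parts)


sample = {
    "definitions": {"synced": 2},
    "schedules": {"upserted": 1, "paused": 1, "deleted_orphans": 0},
    "errors": [],
}
print(summarize_sync(sample))
```

A one-line summary like this is easy to grep in operator logs; a non-zero
`errors` count is the signal to inspect the full JSON body.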
|
||||||
|
|
||||||
|
`event_types` defaults to `false` for this endpoint because event-triggered
definitions already reload from the DB in the event router path; opt in when
the operator intentionally changed event type definition files:

```bash
curl -s -X POST \
  'http://localhost:8010/admin/sync?definitions=true&schedules=true&event_types=true'
```

The v1 posture is manual/operator-triggered sync. A periodic background loop is
deferred until live use shows it is needed; this keeps customer definition
changes explicit and avoids background repo scanning from the worker.

### Railiance01 no-restart smoke

After changing a projected definition in `k8s/railiance/20-runtime.yaml`,
apply the ConfigMap and wait for the API pod volume to refresh (up to ~60s),
then reconcile without restarting `actcore-worker`:

```bash
export KUBECONFIG=~/.kube/config-hosteurope
kubectl apply -f k8s/railiance/20-runtime.yaml
sleep 60
kubectl -n activity-core exec deploy/actcore-api -- \
  python3 -c 'import urllib.request; req=urllib.request.Request("http://localhost:8010/admin/sync?definitions=true&schedules=true", method="POST"); print(urllib.request.urlopen(req).read().decode())'
```

Automated regression for the disabled `ops-service-inventory-probes`
projection (enable/cadence flip, idempotent repeat sync, rollback) lives in
`scripts/smoke_admin_sync_no_restart.py`.

If the API is unavailable, the schedule-only CLI remains available:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
TEMPORAL_HOST=localhost:7233 \
|
TEMPORAL_HOST=localhost:7233 \
|
||||||
@@ -126,7 +177,7 @@ ACTCORE_DB_URL=postgresql+asyncpg://actcore:actcore@localhost:5433/actcore \
|
|||||||
|
|
||||||
This reconciles all Temporal Schedules with the `activity_definitions` table:
|
This reconciles all Temporal Schedules with the `activity_definitions` table:
|
||||||
- Upserts schedules for every enabled cron definition
|
- Upserts schedules for every enabled cron definition
|
||||||
- Creates paused schedules for disabled cron definitions
|
- Creates paused schedules for disabled cron or one-shot scheduled definitions
|
||||||
- Deletes orphaned schedules with no matching DB row
|
- Deletes orphaned schedules with no matching DB row
|
||||||
|
|
||||||
After adding or changing a recurring ActivityDefinition or workflow activity
|
After adding or changing a recurring ActivityDefinition or workflow activity
|
||||||
@@ -159,14 +210,34 @@ repos, and emits one automated task per stale repo through explicit
|
|||||||
`weekly-coding-retro` follows the same cron -> context resolver -> per-repo task
|
`weekly-coding-retro` follows the same cron -> context resolver -> per-repo task
|
||||||
pattern for coding-session retrospection. It runs Saturdays at 19:00
|
pattern for coding-session retrospection. It runs Saturdays at 19:00
|
||||||
Europe/Berlin and resolves the latest State Hub `/progress/` item with
|
Europe/Berlin and resolves the latest State Hub `/progress/` item with
|
||||||
`event_type=coding_retro` into `context.retro.suggestions`. Each positive-score
|
`event_type=coding_retro` and a matching `window_days` into
|
||||||
suggestion emits one task to `context.s.repo` with labels
|
`context.retro.suggestions`. Each positive-score suggestion emits one task to
|
||||||
`coding-retro`, `improvement`, and `automated`.
|
`context.s.repo` with labels `coding-retro`, `improvement`, and `automated`.
|
||||||
|
The weekly schedule intentionally ignores broader retro windows such as 30-day
|
||||||
|
catch-up reports.
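
The selection rule can be sketched roughly as follows. This is an illustration
only: the field names `created_at` and `score`, the 7-day weekly window, and
both helper names are assumptions, not the resolver's actual contract.

```python
from typing import Any, Optional

LABELS = ["coding-retro", "improvement", "automated"]


def pick_weekly_retro(
    items: list[dict[str, Any]], window_days: int = 7
) -> Optional[dict[str, Any]]:
    """Return the newest coding_retro item for the given window, or None."""
    matching = [
        item
        for item in items
        if item.get("event_type") == "coding_retro"
        and item.get("window_days") == window_days
    ]
    # ISO-8601 timestamps in one timezone sort chronologically as strings.
    return max(matching, key=lambda item: item["created_at"], default=None)


def tasks_from_suggestions(suggestions: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Emit one task per positive-score suggestion, targeted at its repo."""
    return [
        {"repo": s["repo"], "labels": list(LABELS)}
        for s in suggestions
        if s.get("score", 0) > 0
    ]


items = [
    {"event_type": "coding_retro", "window_days": 30, "created_at": "2026-06-14T10:00:00Z"},
    {"event_type": "coding_retro", "window_days": 7, "created_at": "2026-06-13T19:00:00Z"},
]
print(pick_weekly_retro(items))
```

In this sketch the 30-day catch-up report is newer but is still ignored,
because its `window_days` does not match the weekly schedule.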
|
||||||
|
|
||||||
Keep `weekly-coding-retro` disabled until Helix Forge publishes the
|
Keep `weekly-coding-retro` disabled until Helix Forge publishes the
|
||||||
`coding_retro` read model and a smoke run confirms the resolver returns a
|
`coding_retro` read model and a smoke run confirms the resolver returns a
|
||||||
non-empty suggestion set with no duplicate target tasks on re-run.
|
non-empty suggestion set with no duplicate target tasks on re-run.
|
||||||
|
|
||||||
|
## Ops inventory evidence posture

The current accepted live backend for activity-core ops inventory probes is
State Hub progress with `event_type=ops_inventory_probe`.

Inter-Hub / ops-hub per-entity submission remains intentionally deferred until
all of these are true:

- `OPS_HUB_KEY` is provisioned through an operator-owned secret path, never Git,
  chat, or State Hub detail.
- Widget or capability mapping is configured for the target ops-hub entities.
- Production Inter-Hub intake is deployed and smoke-tested for the relevant
  authenticated routes.

Until then, missing Inter-Hub configuration should produce an explicit skipped
sink result, not a failed probe. This posture was recorded in State Hub decision
`7c235bbb-ee6f-4c3e-b1dd-74717eac9082`.
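
One way to picture "skipped, not failed" is a preflight check that runs before
any Inter-Hub call. The sketch below is hypothetical: `SinkResult`,
`inter_hub_preflight`, and the `OPS_HUB_WIDGET_MAP` variable are illustrative
names; only `OPS_HUB_KEY` appears in this runbook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SinkResult:
    status: str  # "skipped" or "ready" in this sketch
    reason: str = ""


# OPS_HUB_WIDGET_MAP is an assumed name for the widget/capability mapping.
REQUIRED = ("OPS_HUB_KEY", "OPS_HUB_WIDGET_MAP")


def inter_hub_preflight(env: dict[str, str]) -> SinkResult:
    """Report missing configuration as an explicit skip instead of raising."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        return SinkResult("skipped", "missing: " + ", ".join(missing))
    return SinkResult("ready")


print(inter_hub_preflight({}))
```

Returning a result object rather than raising keeps the probe itself green
while still leaving a visible record of why nothing was submitted.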
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Temporal UI — filtering by activity
|
## Temporal UI — filtering by activity
|
||||||
@@ -262,6 +333,52 @@ the same durable consumer name provides automatic failover.
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## Run-miss recovery policies (cron triggers)

A cron fire is **missed** when the worker or Temporal is unavailable at trigger
time. `trigger_config.misfire_policy` selects what happens when the system
recovers. Each policy combines a Temporal **catchup window** (how far back missed
fires are recovered) with an **overlap policy** (what to do if a recovered fire
would start while a prior run is still executing):

| `misfire_policy` | Behaviour | Default catchup window | Overlap |
| --- | --- | --- | --- |
| `skip` | Run on trigger or skip; a missed fire is never recovered | 60s grace | `SKIP` |
| `catchup_all` | Recover **every** fire missed during the outage | 365 days | `BUFFER_ALL` |
| `catchup_latest` | Recover only the **most recent** missed fire; no backlog | 24h | `BUFFER_ONE` |

Set `trigger_config.catchup_window_seconds` to override the per-policy default
(e.g. an hourly definition using `catchup_latest` should set it to ~3600 so a
single missed hour is recovered but older ones are not).

Legacy values are still accepted: `catchup` → `catchup_all`,
`compress` → `catchup_latest`.

> **Why this exists:** before ACTIVITY-WP-0014 no catchup window was set, so a
> brief outage at trigger time silently dropped the fire with no recovery and no
> log line. The `daily-statehub-wsjf-triage` definition now uses `catchup_latest`.
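
The policy table, the override, and the legacy aliases can be summarized in a
small lookup. This is a sketch of the documented behaviour, not the scheduler's
actual code: `MisfireSettings` and `resolve_misfire` are hypothetical names, and
rejecting a missing or unknown policy is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MisfireSettings:
    policy: str
    catchup_window_seconds: int
    overlap: str


# Defaults copied from the table: (catchup window in seconds, overlap policy).
DEFAULTS = {
    "skip": (60, "SKIP"),
    "catchup_all": (365 * 24 * 3600, "BUFFER_ALL"),
    "catchup_latest": (24 * 3600, "BUFFER_ONE"),
}
LEGACY_ALIASES = {"catchup": "catchup_all", "compress": "catchup_latest"}


def resolve_misfire(trigger_config: dict) -> MisfireSettings:
    """Map trigger_config to the effective catchup window and overlap policy."""
    raw = trigger_config.get("misfire_policy")
    policy = LEGACY_ALIASES.get(raw, raw)
    if policy not in DEFAULTS:
        # Assumption: unknown or missing policies are rejected, not defaulted.
        raise ValueError(f"unknown misfire_policy: {raw!r}")
    window, overlap = DEFAULTS[policy]
    override = trigger_config.get("catchup_window_seconds")
    if override is not None:
        window = int(override)
    return MisfireSettings(policy, window, overlap)


# Hourly definition: recover one missed hour, nothing older.
print(resolve_misfire({"misfire_policy": "compress", "catchup_window_seconds": 3600}))
```

The usage line shows a legacy `compress` value resolving to `catchup_latest`
with the hourly override applied.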
|
||||||
|
|
||||||
|
## State Hub write idempotency (ACTIVITY-WP-0014 T05)

Every State Hub write from activity-core (report-sink progress, ops-evidence
progress, schedule-miss alerts) carries a stable **`Idempotency-Key`** header
derived deterministically from the write's identity
(`run_id:instruction_id:event_type`, or `schedule_miss:activity_id:last_fired`
for miss alerts). This makes writes safe to **buffer and replay** under the
planned State Hub *beachhead* (per-machine read cache + write outbox): a flush,
possibly retried after an outage, cannot create duplicate progress/triage
events once State Hub / the beachhead honours the header.

The guarantee lives on the write, not on a live dedup read. The read-based
`_progress_exists` check is now best-effort only: if State Hub is unreachable it
returns `False` (proceed to the keyed write) rather than hard-failing. The header
passes untouched through the `actcore-state-hub-bridge` proxy and is ignored by
State Hub versions that do not yet honour it.

> The queue/cache itself is **not** built in activity-core; it belongs to the
> state-hub beachhead. activity-core only emits the key. See the proposal sent to
> the `state-hub` agent.
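
A minimal sketch of the key derivation, assuming the identity parts are joined
verbatim with `:` exactly as written above (the real implementation may hash or
encode them differently). All three function names are hypothetical.

```python
def progress_idempotency_key(run_id: str, instruction_id: str, event_type: str) -> str:
    """Key for report-sink and ops-evidence progress writes."""
    return f"{run_id}:{instruction_id}:{event_type}"


def schedule_miss_idempotency_key(activity_id: str, last_fired: str) -> str:
    """Key for schedule-miss alerts."""
    return f"schedule_miss:{activity_id}:{last_fired}"


def with_idempotency_key(headers: dict[str, str], key: str) -> dict[str, str]:
    """Return a copy of the request headers with the Idempotency-Key set."""
    merged = dict(headers)
    merged["Idempotency-Key"] = key
    return merged


key = progress_idempotency_key("run-1", "instr-a", "daily_triage")
print(with_idempotency_key({"Content-Type": "application/json"}, key))
```

Because the key depends only on the write's identity, a replayed flush produces
the same header value as the first attempt, which is what lets the receiving
side deduplicate.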
|
||||||
|
|
||||||
## Troubleshooting
|
## Troubleshooting
|
||||||
|
|
||||||
### Worker fails to start: "ACTCORE_DB_URL is required"
|
### Worker fails to start: "ACTCORE_DB_URL is required"
|
||||||
@@ -271,6 +388,9 @@ Set the environment variable before running the worker.
|
|||||||
1. Check Temporal UI → Schedules tab for the schedule status.
|
1. Check Temporal UI → Schedules tab for the schedule status.
|
||||||
2. Ensure `enabled=True` on the ActivityDefinition (paused schedules don't fire).
|
2. Ensure `enabled=True` on the ActivityDefinition (paused schedules don't fire).
|
||||||
3. Verify the cron expression with: `docker exec temporal-admin-tools temporal schedule describe --schedule-id activity-schedule-<uuid>`
|
3. Verify the cron expression with: `docker exec temporal-admin-tools temporal schedule describe --schedule-id activity-schedule-<uuid>`
|
||||||
|
4. If a fire was **missed entirely** (no run, no failure event) during an outage,
   check `misfire_policy`: under `skip` missed fires are dropped by design. Use
   `catchup_all` or `catchup_latest` to recover them. See *Run-miss recovery policies*.
|
||||||
|
|
||||||
### Event not routing
|
### Event not routing
|
||||||
1. Check NATS monitoring: http://localhost:8222/jsz to verify the `ACTIVITY_EVENTS` stream exists.
|
1. Check NATS monitoring: http://localhost:8222/jsz to verify the `ACTIVITY_EVENTS` stream exists.
|
||||||
@@ -342,6 +462,14 @@ uv run alembic history # show full migration history
|
|||||||
|
|
||||||
## Railiance Deployment
|
## Railiance Deployment
|
||||||
|
|
||||||
|
### Production API access posture

The FastAPI admin surface remains ClusterIP-only in production. Do not publish
it through an external ingress until a separate access-policy work item chooses
the hostname, authentication layer, allowed users/agents, and audit
expectations. This posture was recorded in State Hub decision
`9ffaf7a9-227a-4e39-92e3-cd93d8cda1f2`.
|
||||||
|
|
||||||
### Pre-requisites
|
### Pre-requisites
|
||||||
- Docker ≥ 24 with Compose v2 (`docker compose` not `docker-compose`)
|
- Docker ≥ 24 with Compose v2 (`docker compose` not `docker-compose`)
|
||||||
- ≥ 4 GB RAM available (Temporal server takes ~1 GB)
|
- ≥ 4 GB RAM available (Temporal server takes ~1 GB)
|
||||||
@@ -412,6 +540,31 @@ make railiance-up
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## Kaizen fleet resolver (coulomb-loop)

Dry-run scheduled agent discovery against State Hub + pilot roster:

```bash
export STATE_HUB_URL=http://127.0.0.1:8000
export KAIZEN_RUNNER_HOST=$(hostname)
export ACTIVITY_DEFINITION_DIRS=/home/worsch/coulomb-loop

uv run python -c "
from activity_core.context_resolvers.kaizen import discover_kaizen_scheduled_repos
print(discover_kaizen_scheduled_repos({
    'roster': '/home/worsch/coulomb-loop/loops/kaizen-stack/roster.yaml',
    'cadence': 'daily',
}))
"

make sync-activity-definitions  # requires ACTCORE_DB_URL + stack up
```

Source types: `kaizen`, `resolver`, or `shell` (alias). Queries:
`discover_kaizen_scheduled_repos`, `discover_kaizen_projects`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Wipe and restart dev stack
|
## Wipe and restart dev stack
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
|||||||
118
history/2026-06-16-intent-gap-analysis.md
Normal file
@@ -0,0 +1,118 @@
|
|||||||
|
---
type: history
title: "activity-core INTENT gap analysis"
date: "2026-06-16"
author: codex
repo: activity-core
related_workplan: ACTIVITY-WP-0009
---

# activity-core INTENT Gap Analysis - 2026-06-16

## Context

This note preserves the findings from a repository review against `INTENT.md`.
The review refreshed `SCOPE.md` for the current repo state and identified the
remaining gaps between the intended Event Bridge boundary and the implemented /
deployed surface.

Files and surfaces reviewed:

- `INTENT.md`
- `SCOPE.md`
- `src/activity_core/`
- `activity-definitions/`
- `docs/runbook.md`
- `docs/issue-core-emission-boundary.md`
- `k8s/railiance/`
- `workplans/ACTIVITY-WP-0006-post-triage-operational-hardening.md`
- `workplans/ACTIVITY-WP-0007-ops-inventory-probe-runner.md`
- `workplans/ACTIVITY-WP-0008-weekly-coding-retro.md`

## Summary

activity-core matches the core INTENT boundary in shape: it owns trigger
durability, context resolution, rule/instruction evaluation, outbound
task/report/evidence emission, and local audit records. It still must avoid
owning task lifecycle, project state, privileged ops execution, or service
inventory authority.

The current implementation has grown a useful bounded report/evidence surface:
instruction reports can write working-memory notes and State Hub progress, and
`ops-inventory` context sources can emit compact non-secret
`ops_inventory_probe` summaries. This is still consistent with INTENT as long as
those outputs remain records of activity-core activations rather than an
authoritative task, project, or ops control plane.

## Findings

### 1. Scheduled-run trust gap

`INTENT.md` expects recurring coordination work to run without Bernd as the
manual coordination layer. The daily State Hub WSJF triage path is implemented
and has produced validated reports, but `ACTIVITY-WP-0006-T03` still lacks
three clean consecutive scheduled runs after the June 7 runtime projection
failure.

Current evidence as of 2026-06-16:

- State Hub `daily_triage` progress only shows activity-core entries through
  2026-06-06.
- `/home/worsch/the-custodian/memory/working` only has `daily-triage-*` notes
  for 2026-06-02 through 2026-06-06.

Impact: daily triage is production-backed, but not yet fully proven as a
standing substrate.

### 2. Live task creation gap

`INTENT.md` says each activation emits task creation requests to issue-core and
records only the spawn audit trail. The REST issue sink exists, but Railiance is
currently configured with `ISSUE_SINK_TYPE=null`, so production runs record
synthetic audit references instead of consistently creating live issue-core
tasks.

Impact: the task emission boundary is implemented but not yet broadly proven in
the production deployment.

### 3. Review queue gap

The original ADR text described `review_required` as routing instruction output
to a pending review queue. Current code records `review_required` in
report/spawn metadata but does not integrate with an issue-core review queue.

Impact: current behavior is safe as metadata. As of the ACTIVITY-WP-0009
implementation pass, ADR-003 and SCOPE.md have been aligned to that behavior.

### 4. Evidence backend gap

The State Hub fallback evidence path works for `ops_inventory_probe`, and
`ACTIVITY-WP-0007` has live Railiance evidence. Inter-Hub / ops-hub submission
is intentionally deferred behind operator-owned `OPS_HUB_KEY` custody, widget
mapping, and approval.

Impact: activity-core can preserve non-secret continuity evidence, but richer
per-entity ops evidence publication is not yet live.

### 5. Execution-boundary residue

`TaskExecutorWorkflow` remains registered as a stub that persists a done
`task_instances` row. INTENT explicitly says activity-core must not execute the
work or track lifecycle state.

Impact: low immediate risk because the workflow is inert, but it is an attractive
wrong hook for future execution creep.

### 6. API exposure gap

The FastAPI admin surface is useful for internal CRUD and manual triggers.
Railiance docs keep it as ClusterIP until an authenticated ingress and access
policy are chosen.

Impact: operationally acceptable for now, but production access posture remains
an explicit decision.

## Follow-up

`workplans/ACTIVITY-WP-0009-intent-gap-closure.md` was created to turn these
findings into tracked closure work.
|
||||||
@@ -11,7 +11,7 @@ data:
|
|||||||
TEMPORAL_NAMESPACE: default
|
TEMPORAL_NAMESPACE: default
|
||||||
NATS_URL: nats://actcore-nats:4222
|
NATS_URL: nats://actcore-nats:4222
|
||||||
STATE_HUB_URL: http://actcore-state-hub-bridge:8000
|
STATE_HUB_URL: http://actcore-state-hub-bridge:8000
|
||||||
LLM_CONNECT_URL: ""
|
LLM_CONNECT_URL: http://llm-connect.activity-core.svc.cluster.local:8080
|
||||||
LLM_CONNECT_TIMEOUT_SECONDS: "300"
|
LLM_CONNECT_TIMEOUT_SECONDS: "300"
|
||||||
REPO_SCOPING_URL: http://repo-scoping.repo-scoping.svc.cluster.local:8020
|
REPO_SCOPING_URL: http://repo-scoping.repo-scoping.svc.cluster.local:8020
|
||||||
ISSUE_CORE_URL: http://issue-core.issue-core.svc.cluster.local:8010
|
ISSUE_CORE_URL: http://issue-core.issue-core.svc.cluster.local:8010
|
||||||
@@ -47,7 +47,10 @@ data:
|
|||||||
type: cron
|
type: cron
|
||||||
cron_expression: "20 7 * * *"
|
cron_expression: "20 7 * * *"
|
||||||
timezone: Europe/Berlin
|
timezone: Europe/Berlin
|
||||||
misfire_policy: skip
|
# ACTIVITY-WP-0014: recover the most recent missed daily fire when the
|
||||||
|
# worker/Temporal was unavailable at trigger time, without accumulating a
|
||||||
|
# backlog after a multi-day outage.
|
||||||
|
misfire_policy: catchup_latest
|
||||||
context_sources:
|
context_sources:
|
||||||
- type: static
|
- type: static
|
||||||
bind_to: context.prompt_path
|
bind_to: context.prompt_path
|
||||||
@@ -164,6 +167,36 @@ data:
|
|||||||
|
|
||||||
Kubernetes projection of the Custodian-owned definition in
|
Kubernetes projection of the Custodian-owned definition in
|
||||||
`/home/worsch/the-custodian/activity-definitions/hourly-recently-on-scope.md`.
|
`/home/worsch/the-custodian/activity-definitions/hourly-recently-on-scope.md`.
|
||||||
|
state-hub-consistency-sweep.md: |
|
||||||
|
---
|
||||||
|
id: "7c4e9a12-8f3b-4d5e-9c6a-1b2d3e4f5a6b"
|
||||||
|
name: "State Hub Consistency Sweep"
|
||||||
|
type: activity-definition
|
||||||
|
version: "1.0"
|
||||||
|
enabled: true
|
||||||
|
owner: custodian
|
||||||
|
governance: custodian
|
||||||
|
status: active
|
||||||
|
created: "2026-06-21"
|
||||||
|
trigger:
|
||||||
|
type: cron
|
||||||
|
cron_expression: "*/15 * * * *"
|
||||||
|
timezone: UTC
|
||||||
|
misfire_policy: skip
|
||||||
|
context_sources:
|
||||||
|
- type: state-hub
|
||||||
|
query: consistency_sweep_remote_all
|
||||||
|
required: true
|
||||||
|
params:
|
||||||
|
max_seconds: 300
|
||||||
|
source: activity-core
|
||||||
|
bind_to: context.consistency_sweep_remote_all
|
||||||
|
---
|
||||||
|
|
||||||
|
# ActivityDefinition: State Hub Consistency Sweep
|
||||||
|
|
||||||
|
Kubernetes projection of the Custodian-owned definition in
|
||||||
|
`/home/worsch/the-custodian/activity-definitions/state-hub-consistency-sweep.md`.
|
||||||
ops-service-inventory-probes.md: |
|
ops-service-inventory-probes.md: |
|
||||||
---
|
---
|
||||||
id: "40d15a87-7ff6-4d8e-992c-37df15f95110"
|
id: "40d15a87-7ff6-4d8e-992c-37df15f95110"
|
||||||
@@ -578,7 +611,8 @@ spec:
|
|||||||
method=self.command,
|
method=self.command,
|
||||||
)
|
)
|
||||||
try:
|
try:
|
||||||
with urlopen(request, timeout=30) as response:
|
timeout = 360 if self.command == "POST" else 30
|
||||||
|
with urlopen(request, timeout=timeout) as response:
|
||||||
payload = response.read()
|
payload = response.read()
|
||||||
self.send_response(response.status)
|
self.send_response(response.status)
|
||||||
for key, value in response.headers.items():
|
for key, value in response.headers.items():
|
||||||
@@ -599,7 +633,7 @@ spec:
|
|||||||
ThreadingHTTPServer(("0.0.0.0", 18080), Proxy).serve_forever()
|
ThreadingHTTPServer(("0.0.0.0", 18080), Proxy).serve_forever()
|
||||||
readinessProbe:
|
readinessProbe:
|
||||||
httpGet:
|
httpGet:
|
||||||
path: /state/summary
|
path: /state/health
|
||||||
port: http
|
port: http
|
||||||
initialDelaySeconds: 5
|
initialDelaySeconds: 5
|
||||||
periodSeconds: 10
|
periodSeconds: 10
|
||||||
|
|||||||
@@ -32,8 +32,10 @@ Europe/Berlin schedule, verify both runtime dependencies:
|
|||||||
|
|
||||||
- `actcore-state-hub-bridge` can reach the State Hub API through the node-local
|
- `actcore-state-hub-bridge` can reach the State Hub API through the node-local
|
||||||
tunnel expected at `127.0.0.1:18000`.
|
tunnel expected at `127.0.0.1:18000`.
|
||||||
- `LLM_CONNECT_URL` is set to an operator-approved llm-connect endpoint that can
|
- `LLM_CONNECT_URL` points at the verified in-namespace llm-connect Service,
|
||||||
serve the `custodian-triage-balanced` profile.
|
`http://llm-connect.activity-core.svc.cluster.local:8080`, and the
|
||||||
|
operator-owned provider Secret lets that Service serve the
|
||||||
|
`custodian-triage-balanced` profile.
|
||||||
|
|
||||||
If `LLM_CONNECT_URL` is missing or broken, report-sink instructions write a
|
If `LLM_CONNECT_URL` is missing or broken, report-sink instructions write a
|
||||||
visible `execution_failed` diagnostic instead of silently producing no report.
|
visible `execution_failed` diagnostic instead of silently producing no report.
|
||||||
|
|||||||
@@ -12,6 +12,7 @@ dependencies = [
|
|||||||
"alembic>=1.14",
|
"alembic>=1.14",
|
||||||
"nats-py>=2.7",
|
"nats-py>=2.7",
|
||||||
"httpx>=0.27",
|
"httpx>=0.27",
|
||||||
|
"pyyaml>=6.0",
|
||||||
]
|
]
|
||||||
|
|
||||||
[project.optional-dependencies]
|
[project.optional-dependencies]
|
||||||
|
|||||||
@@ -1,4 +1,5 @@
|
|||||||
{
|
{
|
||||||
|
"$comment": "ACTIVITY-WP-0016-T02. Strict, bounded contract for the daily WSJF triage report. The per-item 'recommendations' schema is intentionally strict on STRUCTURE (types + required keys) so the T03 boundary parser can validate each recommendation independently and quarantine only the malformed ones. 'maxItems' is a producer hint (honoured by llm-connect constrained decoding and by the prompt); it is deliberately NOT hard-enforced by the in-repo validator, because rejecting a whole report for having too many items would reproduce the monolithic-failure bug WP-0016 exists to remove. Over-count is mitigated in T03 (keep top-N by rank, quarantine the rest). Value-domain vocabularies (action/confidence) are documented in the prompt and enforced by T04 guardrails with mitigation, not as brittle hard-fail enums here.",
|
||||||
"type": "object",
|
"type": "object",
|
||||||
"required": ["summary", "recommendations"],
|
"required": ["summary", "recommendations"],
|
||||||
"properties": {
|
"properties": {
|
||||||
@@ -7,8 +8,28 @@
|
|||||||
},
|
},
|
||||||
"recommendations": {
|
"recommendations": {
|
||||||
"type": "array",
|
"type": "array",
|
||||||
|
"maxItems": 7,
|
||||||
"items": {
|
"items": {
|
||||||
"type": "object"
|
"type": "object",
|
||||||
|
"required": ["rank", "candidate", "action", "why"],
|
||||||
|
"properties": {
|
||||||
|
"rank": { "type": "integer" },
|
||||||
|
"candidate": { "type": "string" },
|
||||||
|
"action": { "type": "string" },
|
||||||
|
"why": { "type": "string" },
|
||||||
|
"confidence": { "type": "string" },
|
||||||
|
"wsjf": {
|
||||||
|
"type": "object",
|
||||||
|
"properties": {
|
||||||
|
"score": { "type": "number" },
|
||||||
|
"strategic_value": { "type": "number" },
|
||||||
|
"time_criticality": { "type": "number" },
|
||||||
|
"risk_reduction": { "type": "number" },
|
||||||
|
"opportunity_enablement": { "type": "number" },
|
||||||
|
"job_size": { "type": "number" }
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
212
scripts/smoke_admin_sync_no_restart.py
Executable file
@@ -0,0 +1,212 @@
|
|||||||
|
#!/usr/bin/env python3
"""Railiance01 no-restart smoke for POST /admin/sync.

Patches the disabled ops-service-inventory-probes projection in the cluster
ConfigMap, waits for the API pod volume to refresh, runs /admin/sync twice,
verifies DB + Temporal schedule drift without restarting actcore-worker, then
rolls the ConfigMap back to the disabled baseline.

Requires:
  - KUBECONFIG pointing at railiance01 (for example ~/.kube/config-hosteurope)
  - kubectl access to the activity-core namespace

Example:
    export KUBECONFIG=~/.kube/config-hosteurope
    python3 scripts/smoke_admin_sync_no_restart.py
"""

from __future__ import annotations

import json
import subprocess
import sys
import time

ACTIVITY_ID = "40d15a87-7ff6-4d8e-992c-37df15f95110"
CONFIGMAP = "actcore-external-activity-definitions"
DEFINITION_KEY = "ops-service-inventory-probes.md"
MOUNTED_FILE = (
    "/etc/activity-core/external-definitions/activity-definitions/"
    f"{DEFINITION_KEY}"
)
VOLUME_PROPAGATION_SECONDS = 65


def kubectl(*args: str, input_text: str | None = None) -> str:
    cmd = ["kubectl", "-n", "activity-core", *args]
    return subprocess.check_output(
        cmd,
        input=input_text,
        text=True,
    )


def api_json(path: str, *, method: str = "GET") -> dict:
    script = (
        "import urllib.request, json\n"
        f'req = urllib.request.Request("http://localhost:8010{path}", method="{method}")\n'
        "print(urllib.request.urlopen(req).read().decode())"
    )
    return json.loads(kubectl("exec", "deploy/actcore-api", "--", "python3", "-c", script))


def worker_lines(script: str) -> list[str]:
    return kubectl("exec", "deploy/actcore-worker", "--", "python3", "-c", script).splitlines()


def worker_uid() -> str:
    return kubectl(
        "get",
        "pod",
        "-l",
        "app.kubernetes.io/name=actcore-worker",
        "-o",
        "jsonpath={.items[0].metadata.uid}",
    ).strip()


def load_configmap() -> dict:
    return json.loads(kubectl("get", "configmap", CONFIGMAP, "-o", "json"))


def apply_configmap(cm: dict) -> None:
    kubectl("apply", "-f", "-", input_text=json.dumps(cm))


def patch_definition(cm: dict, *, enabled: bool, cron: str) -> None:
    text = cm["data"][DEFINITION_KEY]
    for line in text.splitlines():
        if line.strip().startswith("enabled:"):
            break
    else:
        raise RuntimeError("enabled field not found in projection")

    text = _replace_once(text, 'enabled: false', f"enabled: {'true' if enabled else 'false'}")
    text = _replace_once(text, 'enabled: true', f"enabled: {'true' if enabled else 'false'}")
    text = _replace_once(
        text,
        'cron_expression: "15 * * * *"',
        f'cron_expression: "{cron}"',
    )
    text = _replace_once(
        text,
        'cron_expression: "25 * * * *"',
        f'cron_expression: "{cron}"',
    )
    cm["data"][DEFINITION_KEY] = text
    apply_configmap(cm)


def _replace_once(text: str, old: str, new: str) -> str:
    if old not in text:
        return text
    return text.replace(old, new, 1)


def wait_for_mount(*, enabled: bool, cron: str) -> None:
    deadline = time.time() + VOLUME_PROPAGATION_SECONDS
    want_enabled = "enabled: true" if enabled else "enabled: false"
    want_cron = f'cron_expression: "{cron}"'
    while time.time() < deadline:
        content = kubectl("exec", "deploy/actcore-api", "--", "cat", MOUNTED_FILE)
        if want_enabled in content and want_cron in content:
            return
        time.sleep(5)
    raise RuntimeError(
        f"ConfigMap projection did not refresh within {VOLUME_PROPAGATION_SECONDS}s"
    )


def get_definition() -> dict[str, object]:
    for item in api_json("/activity-definitions/"):
        if item["id"] == ACTIVITY_ID:
            return {
                "enabled": item["enabled"],
                "cron": item["trigger_config"]["cron_expression"],
            }
    raise RuntimeError(f"ActivityDefinition {ACTIVITY_ID} not found")


def describe_schedule() -> dict[str, object]:
    script = f"""
import asyncio
from temporalio.client import Client

async def main() -> None:
    client = await Client.connect("actcore-temporal:7233")
    handle = client.get_schedule_handle("activity-schedule-{ACTIVITY_ID}")
    described = await handle.describe()
    schedule = described.schedule
    minute = schedule.spec.calendars[0].minute[0].start if schedule.spec.calendars else None
    print(schedule.state.paused)
    print(minute)

asyncio.run(main())
"""
    paused, minute = worker_lines(script)
    return {"paused": paused == "True", "minute": int(minute)}


def main() -> int:
    worker_before = worker_uid()
    cm = load_configmap()

    print("1) enable + cadence change via ConfigMap")
    patch_definition(cm, enabled=True, cron="25 * * * *")
    wait_for_mount(enabled=True, cron="25 * * * *")

    print("2) POST /admin/sync (first pass)")
    sync1 = api_json("/admin/sync?definitions=true&schedules=true", method="POST")
    if not sync1.get("ok"):
        print(json.dumps(sync1, indent=2), file=sys.stderr)
        return 1

    defn = get_definition()
    schedule = describe_schedule()
    print("   definition:", defn)
    print("   schedule:", schedule)
    if defn != {"enabled": True, "cron": "25 * * * *"}:
        print("definition drift after sync", file=sys.stderr)
        return 1
    if schedule["paused"] or schedule["minute"] != 25:
        print("schedule drift after enable sync", file=sys.stderr)
        return 1

    print("3) POST /admin/sync (idempotent repeat)")
    sync2 = api_json("/admin/sync?definitions=true&schedules=true", method="POST")
    if sync2.get("schedules") != sync1.get("schedules"):
        print("idempotent schedule counts changed", file=sys.stderr)
        print(json.dumps({"sync1": sync1, "sync2": sync2}, indent=2), file=sys.stderr)
        return 1

    print("4) rollback ConfigMap + sync")
    cm = load_configmap()
    patch_definition(cm, enabled=False, cron="15 * * * *")
    wait_for_mount(enabled=False, cron="15 * * * *")
    sync3 = api_json("/admin/sync?definitions=true&schedules=true", method="POST")
    if not sync3.get("ok"):
        print(json.dumps(sync3, indent=2), file=sys.stderr)
        return 1

    defn = get_definition()
    schedule = describe_schedule()
    print("   definition:", defn)
    print("   schedule:", schedule)
    if defn != {"enabled": False, "cron": "15 * * * *"}:
        print("rollback definition drift", file=sys.stderr)
        return 1
    if not schedule["paused"] or schedule["minute"] != 15:
        print("rollback schedule drift", file=sys.stderr)
        return 1

    worker_after = worker_uid()
    if worker_before != worker_after:
        print("actcore-worker pod restarted during smoke", file=sys.stderr)
        return 1

    print("smoke passed: admin sync hot-reload without worker restart")
    return 0
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
raise SystemExit(main())
|
||||||
@@ -11,8 +11,10 @@ activities that need DB access.
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import json
|
||||||
import uuid
|
import uuid
|
||||||
from datetime import datetime, timezone
|
from datetime import datetime, timezone
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
from sqlalchemy import select
|
from sqlalchemy import select
|
||||||
from sqlalchemy.dialects.postgresql import insert as pg_insert
|
from sqlalchemy.dialects.postgresql import insert as pg_insert
|
||||||
@@ -52,6 +54,36 @@ def _get_session_factory() -> async_sessionmaker[AsyncSession]:
|
|||||||
return _session_factory
|
return _session_factory
|
||||||
|
|
||||||
|
|
||||||
|
def _bind_resolver_result(bind_key: str, result: Any) -> Any:
|
||||||
|
"""Unwrap single-key resolver payloads when the key matches bind_key.
|
||||||
|
|
||||||
|
Resolvers such as ``discover_kaizen_projects`` return ``{"projects": [...]}``
|
||||||
|
while definitions bind to ``context.projects`` and iterate ``for_each:
|
||||||
|
context.projects``. Multi-key summaries (e.g. repo SBOM bulk) stay intact.
|
||||||
|
"""
|
||||||
|
if isinstance(result, dict) and len(result) == 1 and bind_key in result:
|
||||||
|
return result[bind_key]
|
||||||
|
return result
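The unwrapping rule described in the docstring above can be illustrated with a small standalone sketch. The function below is a copy of the logic for demonstration only; the sample payloads (`demo`, `total`, `gaps`) are made up and do not come from any real resolver.

```python
from typing import Any


def bind_resolver_result(bind_key: str, result: Any) -> Any:
    """Unwrap a single-key dict when its only key equals bind_key."""
    if isinstance(result, dict) and len(result) == 1 and bind_key in result:
        return result[bind_key]
    return result


# Single-key payload whose key matches the bind key: the inner list is bound.
unwrapped = bind_resolver_result("projects", {"projects": [{"repo": "demo"}]})

# Multi-key summary: returned unchanged so sibling keys are not lost.
summary = bind_resolver_result("projects", {"projects": [], "total": 0})

# Single key that does not match the bind key: also returned unchanged.
mismatch = bind_resolver_result("projects", {"gaps": []})

print(unwrapped)  # [{'repo': 'demo'}]
print(summary)    # {'projects': [], 'total': 0}
print(mismatch)   # {'gaps': []}
```

With this rule, a definition that binds to `context.projects` can iterate the list directly, while resolvers that return several keys keep their full shape.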
|
||||||
|
|
||||||
|
|
||||||
|
def _parse_event_envelope(event_envelope_json: str | None) -> dict[str, Any] | None:
|
||||||
|
"""Parse an event envelope JSON string for context resolvers."""
|
||||||
|
if not event_envelope_json:
|
||||||
|
return None
|
||||||
|
try:
|
||||||
|
payload = json.loads(event_envelope_json)
|
||||||
|
except (TypeError, json.JSONDecodeError) as exc:
|
||||||
|
activity.logger.warning("Invalid event envelope JSON - %s", exc)
|
||||||
|
return None
|
||||||
|
if not isinstance(payload, dict):
|
||||||
|
activity.logger.warning(
|
||||||
|
"Invalid event envelope JSON - expected object, got %s",
|
||||||
|
type(payload).__name__,
|
||||||
|
)
|
||||||
|
return None
|
||||||
|
return payload
|
||||||
|
|
||||||
|
|
||||||
# ── Activities ─────────────────────────────────────────────────────────────────
|
# ── Activities ─────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
@activity.defn
|
@activity.defn
|
||||||
@@ -111,11 +143,14 @@ async def resolve_context(
|
|||||||
from activity_core.context_resolvers.base import CONTEXT_RESOLVER_REGISTRY
|
from activity_core.context_resolvers.base import CONTEXT_RESOLVER_REGISTRY
|
||||||
|
|
||||||
snapshot: dict = {}
|
snapshot: dict = {}
|
||||||
|
event_envelope = _parse_event_envelope(event_envelope_json)
|
||||||
for source in context_sources:
|
for source in context_sources:
|
||||||
source_type = source.get("type", "")
|
source_type = source.get("type", "")
|
||||||
query = source.get("query", "")
|
query = source.get("query", "")
|
||||||
params = source.get("params") or {}
|
params = source.get("params") or {}
|
||||||
required = bool(source.get("required") or params.get("required", False))
|
required = bool(source.get("required") or params.get("required", False))
|
||||||
|
resolver_params = dict(params)
|
||||||
|
resolver_params["required"] = required
|
||||||
raw_bind = source.get("bind_to") or source.get("name") or source_type
|
raw_bind = source.get("bind_to") or source.get("name") or source_type
|
||||||
# Strip the 'context.' namespace prefix so evaluator can find the key.
|
# Strip the 'context.' namespace prefix so evaluator can find the key.
|
||||||
bind_key = raw_bind.removeprefix("context.") if raw_bind.startswith("context.") else raw_bind
|
bind_key = raw_bind.removeprefix("context.") if raw_bind.startswith("context.") else raw_bind
|
||||||
@@ -139,7 +174,8 @@ async def resolve_context(
|
|||||||
continue
|
continue
|
||||||
|
|
||||||
try:
|
try:
|
||||||
snapshot[bind_key] = resolver_cls().resolve(query, None, params)
|
resolved = resolver_cls().resolve(query, event_envelope, resolver_params)
|
||||||
|
snapshot[bind_key] = _bind_resolver_result(bind_key, resolved)
|
||||||
except Exception as exc:
|
except Exception as exc:
|
||||||
if required:
|
if required:
|
||||||
raise ApplicationError(
|
raise ApplicationError(
|
||||||
|
|||||||
@@ -40,6 +40,7 @@ from temporalio.client import Client
|
|||||||
from activity_core.models import ActivityDefinition, CronTriggerConfig
|
from activity_core.models import ActivityDefinition, CronTriggerConfig
|
||||||
from activity_core.orm import ActivityDefinition as ActivityDefinitionRow, EventType as EventTypeRow
|
from activity_core.orm import ActivityDefinition as ActivityDefinitionRow, EventType as EventTypeRow
|
||||||
from activity_core.schedule_manager import delete_schedule, upsert_schedule
|
from activity_core.schedule_manager import delete_schedule, upsert_schedule
|
||||||
|
from activity_core.sync_service import run_sync
|
||||||
from activity_core.webhook_receiver import router as webhook_router
|
from activity_core.webhook_receiver import router as webhook_router
|
||||||
|
|
||||||
TEMPORAL_HOST = os.environ.get("TEMPORAL_HOST", "localhost:7233")
|
TEMPORAL_HOST = os.environ.get("TEMPORAL_HOST", "localhost:7233")
|
||||||
@@ -275,6 +276,24 @@ async def trigger_definition(definition_id: uuid.UUID) -> dict[str, str]:
|
|||||||
return {"workflow_id": handle.id, "trigger_key": trigger_key}
|
return {"workflow_id": handle.id, "trigger_key": trigger_key}
|
||||||
|
|
||||||
|
|
||||||
|
# --- Admin sync ---------------------------------------------------------------
|
||||||
|
|
||||||
|
@app.post("/admin/sync")
|
||||||
|
async def admin_sync(
|
||||||
|
definitions: bool = True,
|
||||||
|
schedules: bool = True,
|
||||||
|
event_types: bool = False,
|
||||||
|
) -> dict[str, Any]:
|
||||||
|
"""Run operator-triggered definition/event/schedule sync without restart."""
|
||||||
|
return await run_sync(
|
||||||
|
session_factory=_get_db(),
|
||||||
|
temporal_client=_get_temporal() if schedules else None,
|
||||||
|
definitions=definitions,
|
||||||
|
schedules=schedules,
|
||||||
|
event_types=event_types,
|
||||||
|
)
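As a rough illustration of how an operator script might address this endpoint, the sketch below only builds the request and does not send it. The base URL is a placeholder assumption, and `build_sync_request` is a hypothetical helper, not part of the change; the three query flags mirror the handler's parameters and defaults.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: replace with the real API address


def build_sync_request(
    *,
    definitions: bool = True,
    schedules: bool = True,
    event_types: bool = False,
) -> Request:
    """Build (but do not send) a POST /admin/sync request."""
    query = urlencode(
        {
            "definitions": str(definitions).lower(),
            "schedules": str(schedules).lower(),
            "event_types": str(event_types).lower(),
        }
    )
    return Request(f"{BASE_URL}/admin/sync?{query}", method="POST")


request = build_sync_request()
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the call; the smoke script in this change uses its own `api_json` helper for that step.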
|
||||||
|
|
||||||
|
|
||||||
# T42: Curator gate — event type approval endpoint
|
# T42: Curator gate — event type approval endpoint
|
||||||
|
|
||||||
@app.get("/health")
|
@app.get("/health")
|
||||||
|
|||||||
@@ -1 +1,8 @@
|
|||||||
from activity_core.context_resolvers import ops_inventory, repo_scoping, state_hub # noqa: F401
|
from activity_core.context_resolvers import ( # noqa: F401
|
||||||
|
event_payload,
|
||||||
|
kaizen,
|
||||||
|
ops_inventory,
|
||||||
|
repo_scoping,
|
||||||
|
state_hub,
|
||||||
|
reuse_surface,
|
||||||
|
)
|
||||||
|
|||||||
51
src/activity_core/context_resolvers/event_payload.py
Normal file
@@ -0,0 +1,51 @@
|
|||||||
|
"""Event payload context adapter.
|
||||||
|
|
||||||
|
Registered as source type ``event-payload``. It exposes the triggering
|
||||||
|
EventEnvelope attributes to event-triggered ActivityDefinitions without
|
||||||
|
requiring an external context service call.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
from activity_core.context_resolvers.base import CONTEXT_RESOLVER_REGISTRY, ContextResolver
|
||||||
|
|
||||||
|
|
||||||
|
class EventPayloadContextResolver(ContextResolver):
|
||||||
|
"""Resolve context from the triggering event envelope attributes."""
|
||||||
|
|
||||||
|
def resolve(self, query: str, event: Any, params: dict[str, Any]) -> Any:
|
||||||
|
attributes = _event_attributes(event)
|
||||||
|
if query in {"", "attributes"}:
|
||||||
|
return attributes
|
||||||
|
if query.startswith("attributes."):
|
||||||
|
return _resolve_path(attributes, query.removeprefix("attributes."))
|
||||||
|
return _resolve_path(attributes, query)
|
||||||
|
|
||||||
|
|
||||||
|
def _event_attributes(event: Any) -> dict[str, Any]:
|
||||||
|
if not isinstance(event, dict):
|
||||||
|
raise RuntimeError("event-payload source requires an event envelope")
|
||||||
|
attributes = event.get("attributes")
|
||||||
|
if not isinstance(attributes, dict):
|
||||||
|
raise RuntimeError("event-payload source requires envelope attributes")
|
||||||
|
return attributes
|
||||||
|
|
||||||
|
|
||||||
|
def _resolve_path(root: dict[str, Any], path: str) -> Any:
|
||||||
|
if not path:
|
||||||
|
return root
|
||||||
|
current: Any = root
|
||||||
|
for part in path.split("."):
|
||||||
|
if not part:
|
||||||
|
return {}
|
||||||
|
if not isinstance(current, dict):
|
||||||
|
return {}
|
||||||
|
current = current.get(part)
|
||||||
|
if current is None:
|
||||||
|
return {}
|
||||||
|
return current
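The dotted-path lookup above returns an empty dict whenever a segment is empty, missing, or reached through a non-mapping value. A standalone sketch of the same behaviour follows; the sample attributes are invented for illustration.

```python
from typing import Any


def resolve_path(root: dict[str, Any], path: str) -> Any:
    """Walk a dotted path through nested dicts; return {} when it cannot be followed."""
    if not path:
        return root
    current: Any = root
    for part in path.split("."):
        # An empty segment ("a..b") or a non-dict parent ends the walk.
        if not part or not isinstance(current, dict):
            return {}
        current = current.get(part)
        if current is None:
            return {}
    return current


attributes = {"repo": {"slug": "demo", "branch": "main"}, "count": 0}

print(resolve_path(attributes, "repo.slug"))     # demo
print(resolve_path(attributes, "repo.missing"))  # {}
print(resolve_path(attributes, "repo..slug"))    # {}
print(resolve_path(attributes, "count"))         # 0
```

Falsy values such as `0` are returned as-is because only `None` is treated as missing. A caller therefore cannot distinguish an absent key from a key whose value is explicitly `None` or `{}`.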
|
||||||
|
|
||||||
|
|
||||||
|
CONTEXT_RESOLVER_REGISTRY["event-payload"] = EventPayloadContextResolver
|
||||||
305
src/activity_core/context_resolvers/kaizen.py
Normal file
@@ -0,0 +1,305 @@
|
|||||||
|
"""Kaizen-agentic fleet context adapter.
|
||||||
|
|
||||||
|
Registered as source types ``kaizen``, ``resolver`` (alias for ADR-005 drafts), and ``shell``.
|
||||||
|
|
||||||
|
Supported queries:
|
||||||
|
- discover_kaizen_scheduled_repos: hub roster ∩ valid ``.kaizen/schedule.yml``
|
||||||
|
- discover_kaizen_projects: repos with ``.kaizen/metrics`` marker (+ optional roster)
|
||||||
|
|
||||||
|
Contract: kaizen-agentic ``docs/integrations/discover-kaizen-scheduled-repos.md``
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import json
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
import socket
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
import httpx
|
||||||
|
import yaml
|
||||||
|
|
||||||
|
from activity_core.context_resolvers.base import CONTEXT_RESOLVER_REGISTRY, ContextResolver
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
_DEFAULT_STATE_HUB_URL = "http://127.0.0.1:8000"
|
||||||
|
_TIMEOUT_SECONDS = 10.0
|
||||||
|
_SCHEDULE_VERSION = "1"
|
||||||
|
_VALID_CADENCES = frozenset({"daily", "weekly", "monthly"})
|
||||||
|
_PREPARE_BIN = os.environ.get("KAIZEN_AGENTIC_BIN", "kaizen-agentic")
|
||||||
|
|
||||||
|
|
||||||
|
def _base_url() -> str:
|
||||||
|
return os.environ.get("STATE_HUB_URL", _DEFAULT_STATE_HUB_URL).rstrip("/")
|
||||||
|
|
||||||
|
|
||||||
|
def _runner_host() -> str:
|
||||||
|
return os.environ.get("KAIZEN_RUNNER_HOST", socket.gethostname())
|
||||||
|
|
||||||
|
|
||||||
|
def _fetch_repos(domain: str | None) -> list[dict[str, Any]]:
|
||||||
|
url = f"{_base_url()}/repos/"
|
||||||
|
try:
|
||||||
|
resp = httpx.get(url, timeout=_TIMEOUT_SECONDS)
|
||||||
|
resp.raise_for_status()
|
||||||
|
except httpx.HTTPError as exc:
|
||||||
|
raise RuntimeError(f"State Hub unreachable at {url}: {exc}") from exc
|
||||||
|
payload = resp.json()
|
||||||
|
if not isinstance(payload, list):
|
||||||
|
raise RuntimeError(f"State Hub /repos/ returned non-list: {type(payload)!r}")
|
||||||
|
if domain:
|
||||||
|
payload = [r for r in payload if r.get("domain_slug") == domain]
|
||||||
|
return payload
|
||||||
|
|
||||||
|
|
||||||
|
def _repo_root(repo: dict[str, Any]) -> Path | None:
|
||||||
|
host_paths = repo.get("host_paths") or {}
|
||||||
|
host = _runner_host()
|
||||||
|
raw = host_paths.get(host) or repo.get("local_path")
|
||||||
|
if not raw or raw == "(unknown)":
|
||||||
|
return None
|
||||||
|
path = Path(raw)
|
||||||
|
return path if path.is_dir() else None
|
||||||
|
|
||||||
|
|
||||||
|
def _load_roster(params: dict[str, Any]) -> dict[str, dict[str, Any]] | None:
|
||||||
|
"""Return slug -> roster entry for active, non-saturated repos; None if no roster param, {} if the file is missing or invalid."""
|
||||||
|
roster_path = params.get("roster")
|
||||||
|
if not roster_path:
|
||||||
|
return None
|
||||||
|
path = Path(roster_path)
|
||||||
|
if not path.is_file():
|
||||||
|
logger.warning("kaizen roster file not found: %s", path)
|
||||||
|
return {}
|
||||||
|
data = yaml.safe_load(path.read_text(encoding="utf-8"))
|
||||||
|
if not isinstance(data, dict):
|
||||||
|
logger.warning("kaizen roster invalid (not a mapping): %s", path)
|
||||||
|
return {}
|
||||||
|
entries: dict[str, dict[str, Any]] = {}
|
||||||
|
for item in data.get("active") or []:
|
||||||
|
if isinstance(item, dict) and item.get("slug"):
|
||||||
|
slug = str(item["slug"])
|
||||||
|
if item.get("status", "active") == "saturated":
|
||||||
|
continue
|
||||||
|
entries[slug] = item
|
||||||
|
return entries
|
||||||
|
|
||||||
|
|
||||||
|
def _validate_schedule_file(path: Path) -> list[str]:
|
||||||
|
"""Structural validation aligned with kaizen-agentic schedule validate."""
|
||||||
|
errors: list[str] = []
|
||||||
|
try:
|
||||||
|
raw = yaml.safe_load(path.read_text(encoding="utf-8"))
|
||||||
|
except yaml.YAMLError as exc:
|
||||||
|
return [f"invalid YAML: {exc}"]
|
||||||
|
|
||||||
|
if not isinstance(raw, dict):
|
||||||
|
return ["schedule.yml must be a YAML mapping at the top level"]
|
||||||
|
|
||||||
|
version = raw.get("version")
|
||||||
|
if version is None:
|
||||||
|
errors.append("missing required key: version")
|
||||||
|
elif str(version) != _SCHEDULE_VERSION:
|
||||||
|
errors.append(f"unsupported version '{version}' (expected '{_SCHEDULE_VERSION}')")
|
||||||
|
|
||||||
|
agents = raw.get("agents", {})
|
||||||
|
if not isinstance(agents, dict):
|
||||||
|
errors.append("agents must be a mapping")
|
||||||
|
return errors
|
||||||
|
if not agents:
|
||||||
|
errors.append("no agents declared under 'agents:'")
|
||||||
|
|
||||||
|
seen: set[str] = set()
|
||||||
|
for name, settings in agents.items():
|
||||||
|
if settings is None:
|
||||||
|
settings = {}
|
||||||
|
if not isinstance(settings, dict):
|
||||||
|
errors.append(f"agent '{name}' settings must be a mapping")
|
||||||
|
continue
|
||||||
|
if name in seen:
|
||||||
|
errors.append(f"duplicate agent entry: {name}")
|
||||||
|
seen.add(name)
|
||||||
|
cadence = str(settings.get("cadence", ""))
|
||||||
|
if cadence not in _VALID_CADENCES:
|
||||||
|
errors.append(
|
||||||
|
f"agent '{name}': invalid cadence '{cadence}' "
|
||||||
|
f"(expected one of {', '.join(sorted(_VALID_CADENCES))})"
|
||||||
|
)
|
||||||
|
cron = settings.get("cron")
|
||||||
|
if cron is not None and not isinstance(cron, str):
|
||||||
|
errors.append(f"agent '{name}' cron must be a string")
|
||||||
|
|
||||||
|
return errors
|
||||||
|
|
||||||
|
|
||||||
|
def _parse_schedule(path: Path) -> dict[str, Any] | None:
|
||||||
|
errors = _validate_schedule_file(path)
|
||||||
|
if errors:
|
||||||
|
return None
|
||||||
|
raw = yaml.safe_load(path.read_text(encoding="utf-8"))
|
||||||
|
return raw if isinstance(raw, dict) else None
|
||||||
|
|
||||||
|
|
||||||
|
def _prepare_command(agent: str, root: Path) -> str:
|
||||||
|
return f"{_PREPARE_BIN} schedule prepare {agent} --target {root}"
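The command string above is built by plain interpolation, so a repo root containing spaces would split into several shell words if the string is later run through a shell. One possible hardening, shown only as a sketch, is `shlex.join` from the standard library. The agent name and path below are hypothetical, and `PREPARE_BIN` stands in for the module's `_PREPARE_BIN` default.

```python
import shlex
from pathlib import PurePosixPath

PREPARE_BIN = "kaizen-agentic"  # stand-in for the module-level default


def prepare_command(agent: str, root: PurePosixPath) -> str:
    """Build the prepare command with each argument shell-quoted as needed."""
    return shlex.join(
        [PREPARE_BIN, "schedule", "prepare", agent, "--target", str(root)]
    )


command = prepare_command("coach", PurePosixPath("/srv/repos/my repo"))
print(command)
# kaizen-agentic schedule prepare coach --target '/srv/repos/my repo'
```

`shlex.split(command)` recovers the original argument list, which is useful if the consumer prefers to execute the command without a shell.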
|
||||||
|
|
||||||
|
|
||||||
|
def discover_kaizen_scheduled_repos(params: dict[str, Any]) -> dict[str, Any]:
|
||||||
|
domain = params.get("domain")
|
||||||
|
cadence_filter = params.get("cadence")
|
||||||
|
roster = _load_roster(params)
|
||||||
|
runs: list[dict[str, Any]] = []
|
||||||
|
|
||||||
|
for repo in _fetch_repos(domain):
|
||||||
|
slug = repo.get("slug", "")
|
||||||
|
if not slug:
|
||||||
|
continue
|
||||||
|
if roster is not None and slug not in roster:
|
||||||
|
continue
|
||||||
|
|
||||||
|
root = _repo_root(repo)
|
||||||
|
if root is None:
|
||||||
|
logger.info("kaizen repo_unreachable slug=%s host=%s", slug, _runner_host())
|
||||||
|
continue
|
||||||
|
|
||||||
|
schedule_path = root / ".kaizen" / "schedule.yml"
|
||||||
|
if not schedule_path.is_file():
|
||||||
|
continue
|
||||||
|
|
||||||
|
errors = _validate_schedule_file(schedule_path)
|
||||||
|
if errors:
|
||||||
|
logger.warning(
|
||||||
|
"kaizen schedule_invalid slug=%s path=%s errors=%s",
|
||||||
|
slug,
|
||||||
|
schedule_path,
|
||||||
|
"; ".join(errors),
|
||||||
|
)
|
||||||
|
continue
|
||||||
|
|
||||||
|
schedule = _parse_schedule(schedule_path)
|
||||||
|
if schedule is None:
|
||||||
|
continue
|
||||||
|
|
||||||
|
timezone = schedule.get("timezone") or "Europe/Berlin"
|
||||||
|
roster_agents = roster.get(slug, {}).get("agents") if roster else None
|
||||||
|
agents = schedule.get("agents") or {}
|
||||||
|
|
||||||
|
for agent_name, settings in agents.items():
|
||||||
|
if not isinstance(settings, dict):
|
||||||
|
continue
|
||||||
|
if not bool(settings.get("enabled", True)):
|
||||||
|
continue
|
||||||
|
cadence = str(settings.get("cadence", ""))
|
||||||
|
if cadence_filter and cadence != cadence_filter:
|
||||||
|
continue
|
||||||
|
if roster_agents and agent_name not in roster_agents:
|
||||||
|
continue
|
||||||
|
cron = settings.get("cron")
|
||||||
|
runs.append(
|
||||||
|
{
|
||||||
|
"repo": slug,
|
||||||
|
"root": str(root),
|
||||||
|
"agent": agent_name,
|
||||||
|
"cadence": cadence,
|
||||||
|
"cron": cron,
|
||||||
|
"timezone": timezone,
|
||||||
|
"enabled": True,
|
||||||
|
"prepare_command": _prepare_command(agent_name, root),
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
return {"scheduled_runs": runs}
|
||||||
|
|
||||||
|
|
||||||
|
def _read_metrics_summary(metrics_dir: Path) -> dict[str, Any]:
|
||||||
|
summary_path = metrics_dir / "summary.json"
|
||||||
|
if not summary_path.is_file():
|
||||||
|
return {}
|
||||||
|
try:
|
||||||
|
data = json.loads(summary_path.read_text(encoding="utf-8"))
|
||||||
|
return data if isinstance(data, dict) else {}
|
||||||
|
except (json.JSONDecodeError, OSError):
|
||||||
|
return {}
|
||||||
|
|
||||||
|
|
||||||
|
def discover_kaizen_projects(params: dict[str, Any]) -> dict[str, Any]:
|
||||||
|
"""Discover repos with ``.kaizen/metrics`` (optional per-agent summaries)."""
|
||||||
|
domain = params.get("domain")
|
||||||
|
marker = params.get("marker", ".kaizen/metrics")
|
||||||
|
roster = _load_roster(params)
|
||||||
|
in_roster_key = "in_pilot_roster"
|
||||||
|
projects: list[dict[str, Any]] = []
|
||||||
|
|
||||||
|
for repo in _fetch_repos(domain):
|
||||||
|
slug = repo.get("slug", "")
|
||||||
|
if not slug:
|
||||||
|
continue
|
||||||
|
in_pilot = roster is None or slug in roster
|
||||||
|
if roster is not None and slug not in roster:
|
||||||
|
continue
|
||||||
|
|
||||||
|
root = _repo_root(repo)
|
||||||
|
if root is None:
|
||||||
|
continue
|
||||||
|
|
||||||
|
metrics_root = root / Path(marker)
|
||||||
|
if not metrics_root.is_dir():
|
||||||
|
continue
|
||||||
|
|
||||||
|
has_metrics = any(metrics_root.iterdir())
|
||||||
|
if not has_metrics:
|
||||||
|
continue
|
||||||
|
|
||||||
|
roster_entry = roster.get(slug, {}) if roster else {}
|
||||||
|
agent_filter = roster_entry.get("agents")
|
||||||
|
|
||||||
|
for agent_dir in sorted(metrics_root.iterdir()):
|
||||||
|
if not agent_dir.is_dir() or agent_dir.name == "optimizer":
|
||||||
|
continue
|
||||||
|
agent = agent_dir.name
|
||||||
|
if agent_filter and agent not in agent_filter:
|
||||||
|
continue
|
||||||
|
summary = _read_metrics_summary(agent_dir)
|
||||||
|
projects.append(
|
||||||
|
{
|
||||||
|
"repo": slug,
|
||||||
|
"root": str(root),
|
||||||
|
"agent": agent,
|
||||||
|
"has_metrics": True,
|
||||||
|
in_roster_key: in_pilot,
|
||||||
|
"summary": summary,
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
if not any(p["repo"] == slug for p in projects):
|
||||||
|
projects.append(
|
||||||
|
{
|
||||||
|
"repo": slug,
|
||||||
|
"root": str(root),
|
||||||
|
"agent": None,
|
||||||
|
"has_metrics": has_metrics,
|
||||||
|
in_roster_key: in_pilot,
|
||||||
|
"summary": {},
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
return {"projects": projects}
|
||||||
|
|
||||||
|
|
||||||
|
class KaizenContextResolver(ContextResolver):
|
||||||
|
"""Resolves kaizen fleet scheduling and project metrics discovery."""
|
||||||
|
|
||||||
|
def resolve(self, query: str, event: Any, params: dict[str, Any]) -> dict[str, Any]:
|
||||||
|
if query == "discover_kaizen_scheduled_repos":
|
||||||
|
return discover_kaizen_scheduled_repos(params)
|
||||||
|
if query == "discover_kaizen_projects":
|
||||||
|
return discover_kaizen_projects(params)
|
||||||
|
return {}
|
||||||
|
|
||||||
|
|
||||||
|
CONTEXT_RESOLVER_REGISTRY["kaizen"] = KaizenContextResolver
|
||||||
|
CONTEXT_RESOLVER_REGISTRY["resolver"] = KaizenContextResolver
|
||||||
|
CONTEXT_RESOLVER_REGISTRY["shell"] = KaizenContextResolver
|
||||||
516
src/activity_core/context_resolvers/reuse_surface.py
Normal file
@@ -0,0 +1,516 @@
|
|||||||
|
"""Reuse-surface registry hygiene context adapter.
|
||||||
|
|
||||||
|
Registered as source type ``reuse-surface`` and as the ``shell`` resolver
|
||||||
|
dispatcher for the ``reuse_surface_report_gaps`` query. Other shell queries
|
||||||
|
continue to delegate to the kaizen resolver for backward compatibility.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import json
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
import socket
|
||||||
|
import subprocess
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from datetime import datetime, timezone
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
import httpx
|
||||||
|
import yaml
|
||||||
|
|
||||||
|
from activity_core.context_resolvers.base import CONTEXT_RESOLVER_REGISTRY, ContextResolver
|
||||||
|
from activity_core.context_resolvers.kaizen import KaizenContextResolver
|
||||||
|
from activity_core.context_resolvers.state_hub import StateHubContextResolver
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
_DEFAULT_STATE_HUB_URL = "http://127.0.0.1:8000"
|
||||||
|
_REPORT_TIMEOUT_SECONDS = 60
|
||||||
|
_STATE_HUB_TIMEOUT_SECONDS = 10.0
|
||||||
|
_KNOWN_SIGNALS = frozenset(
|
||||||
|
{
|
||||||
|
"registry_gap",
|
||||||
|
"empty_capability_scaffold",
|
||||||
|
"stale_scope",
|
||||||
|
"stale_sbom",
|
||||||
|
"publish_check_fail",
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass(frozen=True)
|
||||||
|
class RosterEntry:
|
||||||
|
slug: str
|
||||||
|
domain: str | None = None
|
||||||
|
publish_check: str | None = None
|
||||||
|
|
||||||
|
|
||||||
|
def _base_url() -> str:
|
||||||
|
return os.environ.get("STATE_HUB_URL", _DEFAULT_STATE_HUB_URL).rstrip("/")
|
||||||
|
|
||||||
|
|
||||||
|
def _runner_host(params: dict[str, Any]) -> str:
|
||||||
|
return str(
|
||||||
|
params.get("runner_host")
|
||||||
|
or os.environ.get("KAIZEN_RUNNER_HOST")
|
||||||
|
or socket.gethostname()
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _as_required(params: dict[str, Any]) -> bool:
|
||||||
|
return bool(params.get("required", False))
|
||||||
|
|
||||||
|
|
||||||
|
def reuse_surface_report_gaps(params: dict[str, Any]) -> dict[str, Any]:
|
||||||
|
"""Resolve registry-hygiene gaps for the next rollout batch.
|
||||||
|
|
||||||
|
Missing operational dependencies are visible failures for required sources
|
||||||
|
and graceful empty lists for optional sources so definitions can opt into
|
||||||
|
either behavior without changing rule logic.
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
return _resolve_reuse_surface_report_gaps(params)
|
||||||
|
except Exception as exc:
|
||||||
|
if _as_required(params):
|
||||||
|
raise
|
||||||
|
logger.warning("reuse_surface_report_gaps unavailable: %s", exc)
|
||||||
|
return {"gaps": []}
|
||||||
|
|
||||||
|
|
||||||
|
def _resolve_reuse_surface_report_gaps(params: dict[str, Any]) -> dict[str, Any]:
|
||||||
|
roster_path = _roster_path(params)
|
||||||
|
entries = _load_active_roster_entries(roster_path)
|
||||||
|
if not entries:
|
||||||
|
return {"gaps": []}
|
||||||
|
|
||||||
|
state_path = _round_robin_state_path(params, roster_path)
|
||||||
|
selected, next_cursor = _select_round_robin_batch(
|
||||||
|
entries,
|
||||||
|
_batch_size(params),
|
||||||
|
state_path,
|
||||||
|
)
|
||||||
|
if not selected:
|
||||||
|
return {"gaps": []}
|
||||||
|
|
||||||
|
signals = _enabled_signals(_signals_path(params, roster_path))
|
||||||
|
roots = _resolve_repo_roots(selected, _runner_host(params))
|
||||||
|
report = _reuse_surface_report(params, signals)
|
||||||
|
gaps = _gap_records(selected, roots, signals, report)
|
||||||
|
|
||||||
|
_write_round_robin_state(state_path, next_cursor, selected)
|
||||||
|
return {"gaps": gaps}
|
||||||
|
|
||||||
|
|
||||||
|
def _roster_path(params: dict[str, Any]) -> Path:
|
||||||
|
raw = params.get("roster")
|
||||||
|
if not raw:
|
||||||
|
raise ValueError("reuse_surface_report_gaps requires params.roster")
|
||||||
|
path = Path(str(raw)).expanduser()
|
||||||
|
if not path.is_file():
|
||||||
|
raise FileNotFoundError(f"reuse_surface_report_gaps roster not found: {path}")
|
||||||
|
return path
|
||||||
|
|
||||||
|
|
||||||
|
def _batch_size(params: dict[str, Any]) -> int:
|
||||||
|
try:
|
||||||
|
return max(1, int(params.get("batch_size", 3)))
|
||||||
|
except (TypeError, ValueError):
|
||||||
|
return 3
|
||||||
|
|
||||||
|
|
||||||
|
def _round_robin_state_path(params: dict[str, Any], roster_path: Path) -> Path:
|
||||||
|
raw = params.get("round_robin_state")
|
||||||
|
if raw:
|
||||||
|
return Path(str(raw)).expanduser()
|
||||||
|
return roster_path.with_name("round-robin-state.json")
|
||||||
|
|
||||||
|
|
||||||
|
def _signals_path(params: dict[str, Any], roster_path: Path) -> Path:
|
||||||
|
raw = params.get("signals")
|
||||||
|
if raw:
|
||||||
|
return Path(str(raw)).expanduser()
|
||||||
|
return roster_path.with_name("signals.yml")
|
||||||
|
|
||||||
|
|
||||||
|
def _load_active_roster_entries(path: Path) -> list[RosterEntry]:
|
||||||
|
data = yaml.safe_load(path.read_text(encoding="utf-8"))
|
||||||
|
if not isinstance(data, dict):
|
||||||
|
raise ValueError(f"reuse_surface rollout roster is not a mapping: {path}")
|
||||||
|
|
||||||
|
entries: dict[str, RosterEntry] = {}
|
||||||
|
for domain, block in _iter_domain_blocks(data):
|
||||||
|
if _domain_phase(block) != "active":
|
||||||
|
continue
|
||||||
|
for item in _repo_items(block):
|
||||||
|
entry = _entry_from_item(item, domain, block)
|
||||||
|
if entry and entry.slug not in entries:
|
||||||
|
entries[entry.slug] = entry
|
||||||
|
return list(entries.values())
|
||||||
|
|
||||||
|
|
||||||
|
def _iter_domain_blocks(data: dict[str, Any]) -> list[tuple[str | None, dict[str, Any]]]:
|
||||||
|
domains = data.get("domains")
|
||||||
|
if isinstance(domains, dict):
|
||||||
|
return [
|
||||||
|
(str(name), block)
|
||||||
|
for name, block in domains.items()
|
||||||
|
if isinstance(block, dict)
|
||||||
|
]
|
||||||
|
if isinstance(domains, list):
|
||||||
|
return [
|
||||||
|
(str(block.get("name") or block.get("domain") or ""), block)
|
||||||
|
for block in domains
|
||||||
|
if isinstance(block, dict)
|
||||||
|
]
|
||||||
|
if isinstance(data.get("active"), list):
|
||||||
|
return [(None, {"phase": "active", "repos": data["active"]})]
|
||||||
|
return [
|
||||||
|
(str(name), block)
|
||||||
|
for name, block in data.items()
|
||||||
|
if isinstance(block, dict) and ("phase" in block or "repos" in block)
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def _domain_phase(block: dict[str, Any]) -> str:
|
||||||
|
return str(block.get("phase") or block.get("status") or "").lower()
|
||||||
|
|
||||||
|
|
||||||
|
def _repo_items(block: dict[str, Any]) -> list[Any]:
|
||||||
|
repos = (
|
||||||
|
block.get("repos")
|
||||||
|
or block.get("repo_slugs")
|
||||||
|
or block.get("repositories")
|
||||||
|
or block.get("slugs")
|
||||||
|
or []
|
||||||
|
)
|
||||||
|
if isinstance(repos, dict):
|
||||||
|
items: list[Any] = []
|
||||||
|
for slug, config in repos.items():
|
||||||
|
if isinstance(config, dict):
|
||||||
|
item = dict(config)
|
||||||
|
item.setdefault("slug", slug)
|
||||||
|
items.append(item)
|
||||||
|
else:
|
||||||
|
items.append(str(slug))
|
||||||
|
return items
|
||||||
|
if isinstance(repos, list):
|
||||||
|
return repos
|
||||||
|
return []
|
||||||
|
|
||||||
|
|
||||||
|
def _entry_from_item(
|
||||||
|
item: Any,
|
||||||
|
domain: str | None,
|
||||||
|
block: dict[str, Any],
|
||||||
|
) -> RosterEntry | None:
|
||||||
|
publish_check = block.get("publish_check")
|
||||||
|
if isinstance(item, str):
|
||||||
|
slug = item
|
||||||
|
elif isinstance(item, dict):
|
||||||
|
slug = item.get("slug") or item.get("repo") or item.get("name")
|
||||||
|
publish_check = item.get("publish_check", publish_check)
|
||||||
|
else:
|
||||||
|
return None
|
||||||
|
if not slug:
|
||||||
|
return None
|
||||||
|
return RosterEntry(
|
||||||
|
slug=str(slug),
|
||||||
|
domain=domain or None,
|
||||||
|
publish_check=str(publish_check).lower() if publish_check is not None else None,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _select_round_robin_batch(
|
||||||
|
entries: list[RosterEntry],
|
||||||
|
batch_size: int,
|
||||||
|
state_path: Path,
|
||||||
|
) -> tuple[list[RosterEntry], int]:
|
||||||
|
if not entries:
|
||||||
|
return [], 0
|
||||||
|
cursor = _read_round_robin_cursor(state_path) % len(entries)
|
||||||
|
size = min(batch_size, len(entries))
|
||||||
|
selected = [entries[(cursor + offset) % len(entries)] for offset in range(size)]
|
||||||
|
next_cursor = (cursor + size) % len(entries)
|
||||||
|
return selected, next_cursor
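To make the cursor arithmetic concrete, here is a minimal sketch of the same selection using plain strings instead of `RosterEntry` objects and an in-memory cursor instead of the state file. The slugs are placeholders.

```python
def select_batch(entries: list[str], batch_size: int, cursor: int) -> tuple[list[str], int]:
    """Pick batch_size entries starting at cursor, wrapping around the list."""
    if not entries:
        return [], 0
    start = cursor % len(entries)
    size = min(batch_size, len(entries))
    selected = [entries[(start + offset) % len(entries)] for offset in range(size)]
    return selected, (start + size) % len(entries)


slugs = ["a", "b", "c", "d", "e"]

first, cursor = select_batch(slugs, 3, 0)
second, cursor = select_batch(slugs, 3, cursor)
third, cursor = select_batch(slugs, 3, cursor)

print(first)   # ['a', 'b', 'c']
print(second)  # ['d', 'e', 'a']
print(third)   # ['b', 'c', 'd']
print(cursor)  # 4
```

Because the cursor is taken modulo the current roster length, a stale cursor left over from a longer roster still maps to a valid starting index. The batch never exceeds the roster size, so no entry is selected twice in one pass.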
|
||||||
|
|
||||||
|
|
||||||
|
def _read_round_robin_cursor(path: Path) -> int:
    if not path.is_file():
        return 0
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return 0
    if not isinstance(data, dict):
        return 0
    try:
        return int(data.get("cursor", 0))
    except (TypeError, ValueError):
        return 0


def _write_round_robin_state(
    path: Path,
    cursor: int,
    selected: list[RosterEntry],
) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    payload = {
        "cursor": cursor,
        "last_batch": [entry.slug for entry in selected],
        "updated_at": datetime.now(timezone.utc).isoformat(),
    }
    path.write_text(
        json.dumps(payload, indent=2, sort_keys=True) + "\n",
        encoding="utf-8",
    )


def _enabled_signals(path: Path) -> set[str]:
    if not path.is_file():
        return set(_KNOWN_SIGNALS)
    data = yaml.safe_load(path.read_text(encoding="utf-8"))
    node = data.get("signals") if isinstance(data, dict) else data
    enabled: set[str] = set()
    saw_known_signal = False

    if isinstance(node, dict):
        for name, config in node.items():
            if str(name) not in _KNOWN_SIGNALS:
                continue
            saw_known_signal = True
            if isinstance(config, dict) and config.get("enabled") is False:
                continue
            if config is False:
                continue
            enabled.add(str(name))
    elif isinstance(node, list):
        for item in node:
            if isinstance(item, str) and item in _KNOWN_SIGNALS:
                saw_known_signal = True
                enabled.add(item)
            elif isinstance(item, dict):
                name = item.get("id") or item.get("signal") or item.get("name")
                if str(name) in _KNOWN_SIGNALS and item.get("enabled", True) is not False:
                    saw_known_signal = True
                    enabled.add(str(name))

    return enabled if saw_known_signal else set(_KNOWN_SIGNALS)


def _resolve_repo_roots(
    entries: list[RosterEntry],
    runner_host: str,
) -> dict[str, Path]:
    requested = {entry.slug for entry in entries}
    roots: dict[str, Path] = {}
    for repo in _fetch_repos():
        slug = str(repo.get("slug") or "")
        if slug not in requested:
            continue
        raw = _repo_path_for_host(repo, runner_host)
        if raw:
            roots[slug] = Path(raw)
    return roots


def _fetch_repos() -> list[dict[str, Any]]:
    url = f"{_base_url()}/repos/"
    try:
        resp = httpx.get(url, timeout=_STATE_HUB_TIMEOUT_SECONDS)
        resp.raise_for_status()
    except httpx.HTTPError as exc:
        raise RuntimeError(f"State Hub unreachable at {url}: {exc}") from exc
    payload = resp.json()
    if not isinstance(payload, list):
        raise RuntimeError(f"State Hub /repos/ returned non-list: {type(payload)!r}")
    return [repo for repo in payload if isinstance(repo, dict)]


def _repo_path_for_host(repo: dict[str, Any], runner_host: str) -> str | None:
    host_paths = repo.get("host_paths") or {}
    raw = None
    if isinstance(host_paths, dict):
        raw = host_paths.get(runner_host)
    raw = raw or repo.get("local_path")
    if not raw or raw == "(unknown)":
        return None
    return str(raw)


def _reuse_surface_report(params: dict[str, Any], signals: set[str]) -> dict[str, Any]:
    if not (signals & {"registry_gap", "empty_capability_scaffold"}):
        return {}
    binary = str(params.get("reuse_surface_bin") or "reuse-surface")
    try:
        completed = subprocess.run(
            [binary, "report", "gaps", "--format", "json"],
            capture_output=True,
            check=False,
            text=True,
            timeout=_REPORT_TIMEOUT_SECONDS,
        )
    except FileNotFoundError as exc:
        raise RuntimeError(f"reuse-surface CLI not found: {binary}") from exc
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError("reuse-surface report gaps timed out") from exc

    if completed.returncode != 0:
        detail = completed.stderr.strip() or completed.stdout.strip()
        raise RuntimeError(f"reuse-surface report gaps failed: {detail}")
    try:
        payload = json.loads(completed.stdout or "{}")
    except json.JSONDecodeError as exc:
        raise RuntimeError("reuse-surface report gaps returned invalid JSON") from exc
    if not isinstance(payload, dict):
        raise RuntimeError("reuse-surface report gaps returned non-object JSON")
    return payload


def _gap_records(
    entries: list[RosterEntry],
    roots: dict[str, Path],
    signals: set[str],
    report: dict[str, Any],
) -> list[dict[str, Any]]:
    empty_scaffolds = _repo_set(report, {"empty_scaffolds", "empty_scaffold"})
    publish_fail = _repo_set(
        report,
        {"publish_fail", "publish_fails", "publish_failures"},
    )
    gaps: list[dict[str, Any]] = []
    seen: set[tuple[str, str]] = set()

    for entry in entries:
        root = roots.get(entry.slug)
        if root is None:
            logger.info("reuse_surface repo_unreachable slug=%s", entry.slug)
            continue

        if (
            signals & {"registry_gap", "empty_capability_scaffold"}
            and entry.slug in empty_scaffolds
        ):
            _append_gap(gaps, seen, entry.slug, root, "empty_capability_scaffold")

        if "registry_gap" in signals and entry.slug in publish_fail:
            _append_gap(gaps, seen, entry.slug, root, "registry_gap")

        if "publish_check_fail" in signals and entry.publish_check == "fail":
            _append_gap(gaps, seen, entry.slug, root, "publish_check_fail")

        if "stale_scope" in signals and _scope_is_stale(root):
            _append_gap(gaps, seen, entry.slug, root, "stale_scope")

        if "stale_sbom" in signals and _sbom_is_stale(entry.slug):
            _append_gap(gaps, seen, entry.slug, root, "stale_sbom")

    return gaps


def _append_gap(
    gaps: list[dict[str, Any]],
    seen: set[tuple[str, str]],
    slug: str,
    root: Path,
    signal: str,
) -> None:
    key = (slug, signal)
    if key in seen:
        return
    seen.add(key)
    gaps.append(
        {
            "repo": slug,
            "root": str(root),
            "signal": signal,
            "hygiene_signal": signal,
        }
    )


def _scope_is_stale(root: Path) -> bool:
    scope = root / "SCOPE.md"
    if not scope.is_file():
        return True
    age_seconds = datetime.now(timezone.utc).timestamp() - scope.stat().st_mtime
    return age_seconds > 90 * 24 * 60 * 60


def _sbom_is_stale(slug: str) -> bool:
    payload = StateHubContextResolver().resolve(
        "repo_sbom_status",
        None,
        {"repo_slug": slug},
    )
    if not isinstance(payload, dict):
        return False
    try:
        return int(payload.get("sbom_age_days", 0)) > 30
    except (TypeError, ValueError):
        return False


def _repo_set(report: dict[str, Any], keys: set[str]) -> set[str]:
    slugs: set[str] = set()
    for value in _values_for_keys(report, keys):
        slugs.update(_slugs_from_value(value))
    return slugs


def _values_for_keys(value: Any, keys: set[str]) -> list[Any]:
    values: list[Any] = []
    if isinstance(value, dict):
        for key, nested in value.items():
            if key in keys:
                values.append(nested)
            values.extend(_values_for_keys(nested, keys))
    elif isinstance(value, list):
        for item in value:
            values.extend(_values_for_keys(item, keys))
    return values


def _slugs_from_value(value: Any) -> set[str]:
    if isinstance(value, str):
        return {value}
    if isinstance(value, list):
        slugs: set[str] = set()
        for item in value:
            slugs.update(_slugs_from_value(item))
        return slugs
    if isinstance(value, dict):
        for key in ("repo", "repo_slug", "slug", "name"):
            if value.get(key):
                return {str(value[key])}
        slugs: set[str] = set()
        for key, nested in value.items():
            if nested is True or isinstance(nested, (dict, list)):
                slugs.add(str(key))
                slugs.update(_slugs_from_value(nested))
        return slugs
    return set()


class ReuseSurfaceContextResolver(ContextResolver):
    """Resolves reuse-surface registry hygiene gap reports."""

    def resolve(self, query: str, event: Any, params: dict[str, Any]) -> dict[str, Any]:
        if query == "reuse_surface_report_gaps":
            return reuse_surface_report_gaps(params)
        return {}


class ShellContextResolver(ContextResolver):
    """Dispatch shell-backed context queries without breaking kaizen aliases."""

    def resolve(self, query: str, event: Any, params: dict[str, Any]) -> dict[str, Any]:
        if query == "reuse_surface_report_gaps":
            return reuse_surface_report_gaps(params)
        return KaizenContextResolver().resolve(query, event, params)


CONTEXT_RESOLVER_REGISTRY["reuse-surface"] = ReuseSurfaceContextResolver
CONTEXT_RESOLVER_REGISTRY["shell"] = ShellContextResolver
@@ -12,6 +12,7 @@ Supported queries:
 - coding_retro: latest /progress/ item with event_type=coding_retro
 - daily_triage_digest: curated scalar JSON digest for daily WSJF triage
 - recently_on_scope_hourly: POST {STATE_HUB_URL}/recently-on-scope/hourly
+- consistency_sweep_remote_all: POST {STATE_HUB_URL}/consistency/sweep/remote-all
 
 No caching — state hub data is live operational state and must not be stale
 within a single workflow run.
@@ -31,6 +32,7 @@ from activity_core.context_resolvers.base import CONTEXT_RESOLVER_REGISTRY, Cont
 
 _DEFAULT_STATE_HUB_URL = "http://127.0.0.1:8000"
 _TIMEOUT_SECONDS = 10.0
+_SWEEP_TIMEOUT_SECONDS = 330.0
 _OPEN_WORKSTREAM_STATUSES = {"active", "ready", "blocked"}
 _OPEN_TASK_STATUSES = {"wait", "todo", "progress"}
 # Sentinel age for repos that have never had an SBOM ingested. Large enough
@@ -53,13 +55,26 @@ def _fetch_json(path: str, params: dict[str, Any] | None = None) -> Any:
     return {}
 
 
-def _post_json(path: str, payload: dict[str, Any]) -> Any:
+def _post_json(path: str, payload: dict[str, Any], *, timeout: float = _TIMEOUT_SECONDS) -> Any:
     url = f"{_base_url()}{path}"
-    resp = httpx.post(url, json=payload, timeout=_TIMEOUT_SECONDS)
+    resp = httpx.post(url, json=payload, timeout=timeout)
     resp.raise_for_status()
     return resp.json()
 
 
+def _validate_consistency_sweep_remote_all(result: Any) -> dict[str, Any]:
+    if not isinstance(result, dict):
+        raise RuntimeError("consistency_sweep_remote_all returned a non-object response")
+    required_keys = {"exit_code", "lock_skipped", "repos_processed"}
+    missing = required_keys - set(result)
+    if missing:
+        missing_list = ", ".join(sorted(missing))
+        raise RuntimeError(
+            f"consistency_sweep_remote_all response missing required key(s): {missing_list}"
+        )
+    return result
+
+
 def _validate_recently_on_scope_hourly(result: Any) -> dict[str, Any]:
     if not isinstance(result, dict):
         raise RuntimeError("recently_on_scope_hourly returned a non-object response")
@@ -107,6 +122,18 @@ class StateHubContextResolver(ContextResolver):
             }
             result = _post_json("/recently-on-scope/hourly", payload)
             return _validate_recently_on_scope_hourly(result)
+        if query == "consistency_sweep_remote_all":
+            payload = {
+                key: value
+                for key, value in params.items()
+                if key not in {"required"}
+            }
+            result = _post_json(
+                "/consistency/sweep/remote-all",
+                payload,
+                timeout=_SWEEP_TIMEOUT_SECONDS,
+            )
+            return _validate_consistency_sweep_remote_all(result)
         return {}
 
 
@@ -219,11 +246,13 @@ def _coding_retro(params: dict[str, Any]) -> dict[str, Any]:
     """
     event_type = str(params.get("event_type") or "coding_retro")
     limit = _bounded_int(params.get("limit", 100), default=100, minimum=1, maximum=500)
-    items = _fetch_json("/progress/", {"limit": limit})
+    query_params = {"event_type": event_type, "limit": limit}
+    items = _fetch_json("/progress/", query_params)
     if not isinstance(items, list):
         return _empty_coding_retro(event_type)
 
-    item = _latest_progress_item(items, event_type)
+    window_days = _optional_int(params.get("window_days"))
+    item = _latest_progress_item(items, event_type, window_days)
     if item is None:
         return _empty_coding_retro(event_type)
 
@@ -256,12 +285,18 @@ def _empty_coding_retro(event_type: str) -> dict[str, Any]:
 def _latest_progress_item(
     items: list[Any],
     event_type: str,
+    window_days: int | None = None,
 ) -> dict[str, Any] | None:
     newest: dict[str, Any] | None = None
     newest_key: tuple[datetime, int] | None = None
     for index, item in enumerate(items):
         if not isinstance(item, dict) or item.get("event_type") != event_type:
             continue
+        if window_days is not None and not _progress_matches_window_days(
+            item,
+            window_days,
+        ):
+            continue
         key = (_parse_progress_timestamp(item.get("created_at")), index)
         if newest_key is None or key > newest_key:
             newest = item
@@ -295,6 +330,56 @@ def _progress_detail(item: dict[str, Any]) -> dict[str, Any]:
     return {}
 
 
+def _progress_matches_window_days(item: dict[str, Any], window_days: int) -> bool:
+    detail = _progress_detail(item)
+    return _progress_window_days(detail) == window_days
+
+
+def _progress_window_days(detail: dict[str, Any]) -> int | None:
+    window = detail.get("window")
+    if isinstance(window, dict):
+        direct = _optional_int(window.get("days") or window.get("window_days"))
+        if direct is not None:
+            return direct
+        ranged = _window_days_from_range(
+            window.get("since") or window.get("window_start"),
+            window.get("until") or window.get("window_end"),
+        )
+        if ranged is not None:
+            return ranged
+
+    direct = _optional_int(detail.get("days") or detail.get("window_days"))
+    if direct is not None:
+        return direct
+    return _window_days_from_range(
+        detail.get("since") or detail.get("window_start"),
+        detail.get("until") or detail.get("window_end"),
+    )
+
+
+def _window_days_from_range(start: Any, end: Any) -> int | None:
+    start_ts = _parse_optional_timestamp(start)
+    end_ts = _parse_optional_timestamp(end)
+    if start_ts is None or end_ts is None or end_ts < start_ts:
+        return None
+    seconds = (end_ts - start_ts).total_seconds()
+    if seconds <= 0:
+        return None
+    return max(1, round(seconds / 86400))
+
+
+def _parse_optional_timestamp(value: Any) -> datetime | None:
+    if not isinstance(value, str) or not value:
+        return None
+    try:
+        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
+    except ValueError:
+        return None
+    if parsed.tzinfo is None:
+        parsed = parsed.replace(tzinfo=timezone.utc)
+    return parsed.astimezone(timezone.utc)
+
+
 def _normalise_coding_retro_suggestions(value: Any) -> list[dict[str, Any]]:
     if not isinstance(value, list):
         return []
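The window-matching helpers in this hunk reduce a `since`/`until` pair to a whole number of days. The following is a minimal sketch of that arithmetic, assuming ISO 8601 strings and treating naive timestamps as UTC as the diff does; `parse_ts` and `window_days` are illustrative names, not part of the module.

```python
from __future__ import annotations

from datetime import datetime, timezone


def parse_ts(value: str) -> datetime | None:
    """Parse an ISO 8601 string into an aware UTC datetime, or None if invalid."""
    try:
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # naive input is assumed to be UTC
    return parsed.astimezone(timezone.utc)


def window_days(start: str, end: str) -> int | None:
    """Return the window length in whole days (at least 1), or None if unusable."""
    start_ts, end_ts = parse_ts(start), parse_ts(end)
    if start_ts is None or end_ts is None:
        return None
    seconds = (end_ts - start_ts).total_seconds()
    if seconds <= 0:
        return None
    return max(1, round(seconds / 86400))


print(window_days("2024-05-01T00:00:00Z", "2024-05-08T00:00:00Z"))  # 7
print(window_days("2024-05-01T00:00:00Z", "2024-05-01T06:00:00Z"))  # 1
print(window_days("2024-05-08T00:00:00Z", "2024-05-01T00:00:00Z"))  # None
```

One consequence worth knowing when matching on an exact `window_days` value: Python's `round` rounds halves to the nearest even integer, so a window of exactly 2.5 days yields 2 rather than 3.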
@@ -374,6 +459,13 @@ def _bounded_int(value: Any, *, default: int, minimum: int, maximum: int) -> int
     return max(minimum, min(maximum, number))
 
 
+def _optional_int(value: Any) -> int | None:
+    try:
+        return int(value)
+    except (TypeError, ValueError):
+        return None
+
+
 def _clean_scalar(value: Any) -> str:
     return " ".join(str(value or "").split())
 
@@ -20,7 +20,8 @@ from activity_core.rules.models import TaskRef, TaskSpec
 
 logger = logging.getLogger(__name__)
 
-ISSUE_CORE_URL = os.environ.get("ISSUE_CORE_URL", "http://127.0.0.1:8010")
+ISSUE_CORE_URL = os.environ.get("ISSUE_CORE_URL", "http://127.0.0.1:8765")
+ISSUE_CORE_API_KEY_ENV = "ISSUE_CORE_API_KEY"
 ISSUE_SINK_TYPE = os.environ.get("ISSUE_SINK_TYPE", "rest")
 
 
@@ -30,10 +31,30 @@ class IssueSink(ABC):
 
 
 class IssueCoreRestSink(IssueSink):
-    """POSTs to issue-core REST API. Config: ISSUE_CORE_URL env var."""
+    """POSTs to issue-core REST API.
 
-    def __init__(self, base_url: str = ISSUE_CORE_URL) -> None:
+    Config: ISSUE_CORE_URL and ISSUE_CORE_API_KEY env vars (shared key with
+    the issue-core server).
+    """
+
+    def __init__(
+        self,
+        base_url: str = ISSUE_CORE_URL,
+        api_key: str | None = None,
+    ) -> None:
         self._base_url = base_url.rstrip("/")
+        if api_key is not None:
+            self._api_key = api_key.strip()
+        else:
+            self._api_key = os.environ.get(ISSUE_CORE_API_KEY_ENV, "").strip()
+
+    def _auth_headers(self) -> dict[str, str]:
+        if not self._api_key:
+            raise RuntimeError(
+                f"{ISSUE_CORE_API_KEY_ENV} is not set. "
+                "Required when ISSUE_SINK_TYPE=rest."
+            )
+        return {"Authorization": f"Bearer {self._api_key}"}
 
     def emit(self, task_spec: TaskSpec) -> TaskRef:
         payload = {
@@ -45,10 +66,19 @@ class IssueCoreRestSink(IssueSink):
             "due_in_days": task_spec.due_in_days,
             "source_type": task_spec.source_type,
             "source_id": task_spec.source_id,
-            "triggering_event_id": task_spec.triggering_event_id,
+            "triggering_event_id": (
+                str(task_spec.triggering_event_id)
+                if task_spec.triggering_event_id is not None
+                else None
+            ),
             "activity_definition_id": task_spec.activity_definition_id,
         }
-        resp = httpx.post(f"{self._base_url}/issues/", json=payload, timeout=10.0)
+        resp = httpx.post(
+            f"{self._base_url}/issues/",
+            json=payload,
+            headers=self._auth_headers(),
+            timeout=10.0,
+        )
         resp.raise_for_status()
         data = resp.json()
         return TaskRef(
@@ -49,7 +49,18 @@ class CronTriggerConfig(BaseModel):
     )
     timezone: str = Field(default="UTC", description="IANA timezone name.")
     jitter_seconds: int = Field(default=0, ge=0)
-    misfire_policy: Literal["skip", "catchup", "compress"] = Field(default="skip")
+    # Run-miss recovery behaviour (ACTIVITY-WP-0014). What happens when a fire is
+    # missed because the worker / Temporal was unavailable at trigger time:
+    #   skip           - run on trigger or skip; a missed fire is never recovered
+    #   catchup_all    - recover every fire missed during the outage window
+    #   catchup_latest - recover only the most recent missed fire; do not accumulate
+    # Legacy aliases are accepted: catchup → catchup_all, compress → catchup_latest.
+    misfire_policy: Literal[
+        "skip", "catchup_all", "catchup_latest", "catchup", "compress"
+    ] = Field(default="skip")
+    # Override the per-policy default catchup window (how far back Temporal will
+    # recover missed fires after an outage). None uses the policy default.
+    catchup_window_seconds: int | None = Field(default=None, ge=0)
 
 
 class EventTriggerConfig(BaseModel):
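The comment in this hunk says the legacy policy names remain accepted as aliases. One way a consumer might canonicalise them is sketched below. The `normalise_misfire_policy` helper and both lookup tables are hypothetical and not taken from the diff, which only widens the `Literal` type; where the real mapping happens is not shown here.

```python
# Hypothetical alias table, derived from the comment: catchup → catchup_all,
# compress → catchup_latest.
_LEGACY_ALIASES = {"catchup": "catchup_all", "compress": "catchup_latest"}
_CANONICAL_POLICIES = {"skip", "catchup_all", "catchup_latest"}


def normalise_misfire_policy(value: str) -> str:
    """Map a configured misfire_policy to its canonical name, rejecting unknown values."""
    canonical = _LEGACY_ALIASES.get(value, value)
    if canonical not in _CANONICAL_POLICIES:
        raise ValueError(f"unknown misfire_policy: {value!r}")
    return canonical


print(normalise_misfire_policy("compress"))     # catchup_latest
print(normalise_misfire_policy("catchup_all"))  # catchup_all
```

Normalising once at the boundary would let downstream scheduling code branch on three names instead of five.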
@@ -2,12 +2,15 @@
 
 from __future__ import annotations
 
+import json
 import os
+from pathlib import Path
 from typing import Any
 
 import httpx
 
 from activity_core.context_resolvers.ops_inventory import _sanitize_url
+from activity_core.state_hub_write import idempotency_headers
 
 _DEFAULT_STATE_HUB_URL = "http://127.0.0.1:8000"
 _INTER_HUB_SINK_TYPES = {
@@ -15,6 +18,10 @@ _INTER_HUB_SINK_TYPES = {
     "inter-hub-event",
     "inter-hub-interaction-event",
 }
+_CORE_HUB_SINK_TYPES = {
+    "core-hub",
+    "core-hub-interaction-event",
+}
 
 
 def persist_ops_inventory_evidence(payload: dict[str, Any]) -> list[dict[str, Any]]:
@@ -55,6 +62,12 @@ def persist_ops_inventory_evidence(payload: dict[str, Any]) -> list[dict[str, An
                 results.append(
                     _post_state_hub_progress(payload, bind_key, probe_result, sink)
                 )
+            elif sink_type in _CORE_HUB_SINK_TYPES:
+                results.append(
+                    _post_core_hub_interaction_event(
+                        payload, bind_key, probe_result, sink
+                    )
+                )
             elif sink_type in _INTER_HUB_SINK_TYPES:
                 results.append(_inter_hub_result(sink))
             else:
@@ -121,6 +134,7 @@ def _post_state_hub_progress(
     resp = httpx.post(
         f"{base_url}/progress/",
         json=body,
+        headers=idempotency_headers(run_id, context_key, event_type),
         timeout=float(sink.get("timeout_seconds", 10.0)),
     )
     resp.raise_for_status()
@@ -136,12 +150,17 @@ def _post_state_hub_progress(
 
 
 def _progress_exists(base_url: str, event_type: str, idempotency_key: str) -> bool:
-    resp = httpx.get(
-        f"{base_url}/progress/",
-        params={"limit": 100},
-        timeout=10.0,
-    )
-    resp.raise_for_status()
+    # Best-effort optimisation only; the Idempotency-Key header on the write is the
+    # real dedup guarantee. Do not hard-fail if State Hub is unreachable here.
+    try:
+        resp = httpx.get(
+            f"{base_url}/progress/",
+            params={"limit": 100},
+            timeout=10.0,
+        )
+        resp.raise_for_status()
+    except httpx.HTTPError:
+        return False
     for item in resp.json():
         detail = item.get("detail") or {}
         if (
@@ -152,6 +171,213 @@ def _progress_exists(base_url: str, event_type: str, idempotency_key: str) -> bo
     return False
 
 
+def _post_core_hub_interaction_event(
+    payload: dict[str, Any],
+    context_key: str,
+    probe_result: dict[str, Any],
+    sink: dict[str, Any],
+) -> dict[str, Any]:
+    raw_base_url = (
+        sink.get("core_hub_url")
+        or sink.get("base_url")
+        or os.environ.get("CORE_HUB_BASE_URL")
+        or ""
+    )
+    base_url = str(raw_base_url).rstrip("/")
+    runtime_token = _core_hub_runtime_token(sink)
+    widget_id = _core_hub_widget_id(sink, probe_result)
+
+    missing: list[str] = []
+    if not base_url:
+        missing.append("CORE_HUB_BASE_URL")
+    if not runtime_token:
+        missing.append("CORE_HUB_RUNTIME_TOKEN or CORE_HUB_RUNTIME_TOKEN_FILE")
+    if not widget_id:
+        missing.append("widget_id or CORE_HUB_WIDGET_ID")
+    if missing:
+        return {
+            "type": sink.get("type"),
+            "status": "skipped",
+            "reason": "missing_core_hub_config",
+            "missing": missing,
+            "context_key": context_key,
+        }
+
+    endpoint = _selected_endpoint(probe_result, sink)
+    event_type = sink.get("event_type", "ops-endpoint-verified")
+    timeout = float(sink.get("timeout_seconds", 10.0))
+    body = {
+        "widgetId": widget_id,
+        "eventType": event_type,
+        "viewContext": _core_hub_view_context(payload, context_key, endpoint, sink),
+        "metadata": _core_hub_metadata(payload, context_key, probe_result, endpoint),
+    }
+    resp = httpx.post(
+        f"{base_url}/api/v2/interaction-events",
+        json=body,
+        headers=_core_hub_headers(runtime_token),
+        timeout=timeout,
+    )
+    resp.raise_for_status()
+    data = resp.json()
+    event_id = data.get("id")
+    if not event_id:
+        raise RuntimeError("Core Hub interaction event response did not include an id")
+    if not _core_hub_event_exists(base_url, runtime_token, str(event_id), timeout):
+        raise RuntimeError("Core Hub interaction event was not visible after create")
+
+    return {
+        "type": sink.get("type"),
+        "status": "posted",
+        "event_type": data.get("eventType", event_type),
+        "event_id": event_id,
+        "widget_id": data.get("widgetId", widget_id),
+        "verified": True,
+        "context_key": context_key,
+    }
+
+
+def _core_hub_headers(runtime_token: str) -> dict[str, str]:
+    return {
+        "Accept": "application/json",
+        "Authorization": f"Bearer {runtime_token}",
+        "Content-Type": "application/json",
+        "User-Agent": "activity-core-ops-evidence/0.1",
+    }
+
+
+def _core_hub_runtime_token(sink: dict[str, Any]) -> str:
+    token_file = (
+        sink.get("runtime_token_file")
+        or sink.get("token_file")
+        or os.environ.get("CORE_HUB_RUNTIME_TOKEN_FILE")
+    )
+    if token_file:
+        return Path(str(token_file)).read_text(encoding="utf-8").strip()
+    env_name = (
+        sink.get("runtime_token_env")
+        or os.environ.get("CORE_HUB_RUNTIME_TOKEN_ENV")
+        or "CORE_HUB_RUNTIME_TOKEN"
+    )
+    return os.environ.get(str(env_name), "").strip()
+
+
+def _core_hub_widget_id(sink: dict[str, Any], probe_result: dict[str, Any]) -> str:
+    direct = sink.get("widget_id") or os.environ.get("CORE_HUB_WIDGET_ID")
+    if direct:
+        return str(direct)
+
+    endpoint = _selected_endpoint(probe_result, sink)
+    widget_ref = endpoint.get("widget_ref") if endpoint else None
+    if not widget_ref:
+        return ""
+
+    mapping = sink.get("widget_mapping") or sink.get("capability_mapping")
+    if mapping is None:
+        mapping = os.environ.get("CORE_HUB_WIDGET_MAPPING")
+    parsed = _parse_widget_mapping(mapping)
+    return parsed.get(str(widget_ref), "")
+
+
def _parse_widget_mapping(raw: Any) -> dict[str, str]:
|
||||||
|
if isinstance(raw, dict):
|
||||||
|
return {str(key): str(value) for key, value in raw.items() if value}
|
||||||
|
if not isinstance(raw, str) or not raw.strip():
|
||||||
|
return {}
|
||||||
|
value = raw.strip()
|
||||||
|
if value.startswith("{"):
|
||||||
|
try:
|
||||||
|
loaded = json.loads(value)
|
||||||
|
except json.JSONDecodeError:
|
||||||
|
return {}
|
||||||
|
if isinstance(loaded, dict):
|
||||||
|
return {str(key): str(item) for key, item in loaded.items() if item}
|
||||||
|
return {}
|
||||||
|
if "=" not in value:
|
||||||
|
return {}
|
||||||
|
pairs: dict[str, str] = {}
|
||||||
|
for part in value.split(","):
|
||||||
|
key, _, item = part.partition("=")
|
||||||
|
if key.strip() and item.strip():
|
||||||
|
pairs[key.strip()] = item.strip()
|
||||||
|
return pairs
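The mapping parser above accepts three input shapes: a dict, a JSON object string, and a comma-separated `key=value` string. The sketch below restates those rules in a standalone function so the accepted spellings can be compared side by side. `parse_mapping` and the widget names are hypothetical stand-ins for illustration, not part of the module.

```python
import json
from typing import Any


def parse_mapping(raw: Any) -> dict[str, str]:
    """Hypothetical stand-in that mirrors the parsing rules shown above."""
    if isinstance(raw, dict):
        return {str(k): str(v) for k, v in raw.items() if v}
    if not isinstance(raw, str) or not raw.strip():
        return {}
    value = raw.strip()
    if value.startswith("{"):
        # JSON object form; anything that fails to parse yields an empty mapping.
        try:
            loaded = json.loads(value)
        except json.JSONDecodeError:
            return {}
        if isinstance(loaded, dict):
            return {str(k): str(v) for k, v in loaded.items() if v}
        return {}
    if "=" not in value:
        return {}
    pairs: dict[str, str] = {}
    for part in value.split(","):
        key, _, item = part.partition("=")
        # Entries without both a key and a value are skipped silently.
        if key.strip() and item.strip():
            pairs[key.strip()] = item.strip()
    return pairs


# All three spellings yield the same mapping (names are illustrative only).
as_dict = parse_mapping({"ops-status": "widget-1", "unused": ""})
as_json = parse_mapping('{"ops-status": "widget-1"}')
as_pairs = parse_mapping("ops-status = widget-1, broken-entry")
print(as_dict, as_json, as_pairs)
```

Malformed input degrades to an empty mapping rather than raising, so a bad `CORE_HUB_WIDGET_MAPPING` value results in an empty widget id instead of a crash.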
|
||||||
|
|
||||||
|
|
||||||
|
def _selected_endpoint(probe_result: dict[str, Any], sink: dict[str, Any]) -> dict[str, Any]:
    endpoints = [
        endpoint
        for endpoint in probe_result.get("endpoints", [])
        if isinstance(endpoint, dict)
    ]
    endpoint_id = sink.get("endpoint_id")
    if endpoint_id:
        match = next(
            (endpoint for endpoint in endpoints if endpoint.get("endpoint_id") == endpoint_id),
            None,
        )
        if match:
            return match
    return next(
        (endpoint for endpoint in endpoints if endpoint.get("widget_ref")),
        endpoints[0] if endpoints else {},
    )


def _core_hub_view_context(
    payload: dict[str, Any],
    context_key: str,
    endpoint: dict[str, Any],
    sink: dict[str, Any],
) -> str:
    return str(
        sink.get("view_context")
        or endpoint.get("view_context")
        or f"activity-core/ops-inventory/{payload.get('run_id', 'unknown')}/{context_key}"
    )


def _core_hub_metadata(
    payload: dict[str, Any],
    context_key: str,
    probe_result: dict[str, Any],
    endpoint: dict[str, Any],
) -> dict[str, Any]:
    compact = _compact_probe_result(probe_result)
    return {
        "activity_id": payload.get("activity_id"),
        "activity_core_run_id": payload.get("run_id"),
        "scheduled_for": payload.get("scheduled_for"),
        "source_type": "ops-inventory",
        "context_key": context_key,
        "probe": {
            "generated_at": compact.get("generated_at"),
            "inventory_path": compact.get("inventory_path"),
            "status": compact.get("status"),
            "reason": compact.get("reason"),
            "summary": compact.get("summary", {}),
        },
        "endpoint": _compact_endpoint(endpoint) if endpoint else {},
    }


def _core_hub_event_exists(
    base_url: str,
    runtime_token: str,
    event_id: str,
    timeout: float,
) -> bool:
    resp = httpx.get(
        f"{base_url}/api/v2/interaction-events",
        headers=_core_hub_headers(runtime_token),
        timeout=timeout,
    )
    resp.raise_for_status()
    payload = resp.json()
    data = payload.get("data") if isinstance(payload, dict) else []
    if not isinstance(data, list):
        return False
    return any(isinstance(item, dict) and item.get("id") == event_id for item in data)
|
||||||
|
|
||||||
def _inter_hub_result(sink: dict[str, Any]) -> dict[str, Any]:
|
def _inter_hub_result(sink: dict[str, Any]) -> dict[str, Any]:
|
||||||
missing: list[str] = []
|
missing: list[str] = []
|
||||||
if not (sink.get("inter_hub_url") or os.environ.get("INTER_HUB_URL")):
|
if not (sink.get("inter_hub_url") or os.environ.get("INTER_HUB_URL")):
|
||||||
|
|||||||
@@ -11,6 +11,8 @@ from zoneinfo import ZoneInfo
|
|||||||
|
|
||||||
import httpx
|
import httpx
|
||||||
|
|
||||||
|
from activity_core.state_hub_write import idempotency_headers
|
||||||
|
|
||||||
_DEFAULT_STATE_HUB_URL = "http://127.0.0.1:8000"
|
_DEFAULT_STATE_HUB_URL = "http://127.0.0.1:8000"
|
||||||
_THE_CUSTODIAN_ROOT = Path("/home/worsch/the-custodian")
|
_THE_CUSTODIAN_ROOT = Path("/home/worsch/the-custodian")
|
||||||
_FORBIDDEN_CUSTODIAN_ROOTS = (
|
_FORBIDDEN_CUSTODIAN_ROOTS = (
|
||||||
@@ -149,6 +151,7 @@ def _post_state_hub_progress(
|
|||||||
resp = httpx.post(
|
resp = httpx.post(
|
||||||
f"{base_url}/progress/",
|
f"{base_url}/progress/",
|
||||||
json=body,
|
json=body,
|
||||||
|
headers=idempotency_headers(run_id, instruction_id, event_type),
|
||||||
timeout=float(sink.get("timeout_seconds", 10.0)),
|
timeout=float(sink.get("timeout_seconds", 10.0)),
|
||||||
)
|
)
|
||||||
resp.raise_for_status()
|
resp.raise_for_status()
|
||||||
@@ -167,12 +170,18 @@ def _progress_exists(
|
|||||||
instruction_id: str,
|
instruction_id: str,
|
||||||
event_type: str,
|
event_type: str,
|
||||||
) -> bool:
|
) -> bool:
|
||||||
resp = httpx.get(
|
# Best-effort read-dedup optimisation only. The Idempotency-Key header on the
|
||||||
f"{base_url}/progress/",
|
# write is the real guarantee; if State Hub is unreachable here we must not
|
||||||
params={"limit": 100},
|
# hard-fail — proceed to the (keyed) write rather than raising.
|
||||||
timeout=10.0,
|
try:
|
||||||
)
|
resp = httpx.get(
|
||||||
resp.raise_for_status()
|
f"{base_url}/progress/",
|
||||||
|
params={"limit": 100},
|
||||||
|
timeout=10.0,
|
||||||
|
)
|
||||||
|
resp.raise_for_status()
|
||||||
|
except httpx.HTTPError:
|
||||||
|
return False
|
||||||
for item in resp.json():
|
for item in resp.json():
|
||||||
detail = item.get("detail") or {}
|
detail = item.get("detail") or {}
|
||||||
if (
|
if (
|
||||||
|
|||||||
@@ -160,15 +160,20 @@ def _execute(
|
|||||||
prompt_hash = hashlib.sha256(rendered.encode()).hexdigest()
|
prompt_hash = hashlib.sha256(rendered.encode()).hexdigest()
|
||||||
llm_config = _llm_run_config(instr)
|
llm_config = _llm_run_config(instr)
|
||||||
|
|
||||||
|
# Reference allow-list (WP-0016-T04): if a context resolver supplied the set
|
||||||
|
# of known candidate ids, recommendations pointing at anything else are
|
||||||
|
# quarantined. Absent (None) today → the check is inert until wired.
|
||||||
|
allow_list = _allow_list_from_context(context)
|
||||||
|
|
||||||
# Step 3 — call LLM
|
# Step 3 — call LLM
|
||||||
raw_output = llm_client.complete(rendered, model=instr.model, config=llm_config)
|
raw_output = llm_client.complete(rendered, model=instr.model, config=llm_config)
|
||||||
|
|
||||||
# Step 4 — validate and optionally retry
|
# Step 4 — validate and optionally retry
|
||||||
task_specs, report, error = _validate_output(raw_output, instr)
|
task_specs, report, error = _validate_output(raw_output, instr, allow_list)
|
||||||
if error:
|
if error:
|
||||||
retry_prompt = rendered + f"\n\nPrevious output was invalid: {error}\nPlease fix."
|
retry_prompt = rendered + f"\n\nPrevious output was invalid: {error}\nPlease fix."
|
||||||
raw_output = llm_client.complete(retry_prompt, model=instr.model, config=llm_config)
|
raw_output = llm_client.complete(retry_prompt, model=instr.model, config=llm_config)
|
||||||
task_specs, report, error = _validate_output(raw_output, instr)
|
task_specs, report, error = _validate_output(raw_output, instr, allow_list)
|
||||||
if error:
|
if error:
|
||||||
# Truncate to keep log volume bounded but long enough to see the
|
# Truncate to keep log volume bounded but long enough to see the
|
||||||
# actual JSON shape mismatch (typical reports are <2KB).
|
# actual JSON shape mismatch (typical reports are <2KB).
|
||||||
@@ -178,6 +183,14 @@ def _execute(
|
|||||||
"error=%s, raw_output_preview=%r",
|
"error=%s, raw_output_preview=%r",
|
||||||
instr.id, prompt_hash, error, preview,
|
instr.id, prompt_hash, error, preview,
|
||||||
)
|
)
|
||||||
|
# Posture B (WP-0016-T03): try to recover a partial-but-usable
|
||||||
|
# report from individually-parseable items before declaring total
|
||||||
|
# loss. One bad item should cost one item, not the whole report.
|
||||||
|
recovered = _resilient_report(
|
||||||
|
instr, raw_output, error, prompt_hash, allow_list,
|
||||||
|
)
|
||||||
|
if recovered is not None:
|
||||||
|
return recovered
|
||||||
failure_report = _invalid_output_report(instr, error, raw_output)
|
failure_report = _invalid_output_report(instr, error, raw_output)
|
||||||
if failure_report is not None:
|
if failure_report is not None:
|
||||||
return InstructionResult(
|
return InstructionResult(
|
||||||
@@ -279,6 +292,320 @@ def _invalid_output_report(
|
|||||||
return report
|
return report
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
# Resilient report recovery (ACTIVITY-WP-0016-T03)
#
# Posture B — verify & mitigate at the producer→consumer boundary. When the
# whole-document parse/validate fails, recover individually-parseable
# recommendation objects, validate each against the item schema, keep the valid
# ones, and quarantine the malformed/over-limit ones with provenance. One bad
# item costs one item, not the whole report (error locality == unit of work).
# ---------------------------------------------------------------------------

_QUARANTINE_LIMIT = 20
_SNIPPET_LIMIT = 200
# Producer guardrails (ACTIVITY-WP-0016-T04): structural bounds applied to every
# recommendation regardless of producer (LLM, agent, or human). These are
# verify-and-mitigate limits — an offending item is quarantined, never allowed to
# fail the whole report or flow unbounded into a downstream consumer.
_MAX_STRING_LEN = 4000
_MAX_DEPTH = 8
_SUMMARY_RE = re.compile(r'"summary"\s*:\s*"((?:[^"\\]|\\.)*)"')


def _snippet(value: Any) -> str:
    text = value if isinstance(value, str) else json.dumps(value, default=str)
    return text[:_SNIPPET_LIMIT]


def _json_depth(value: Any, depth: int = 1) -> int:
    if depth > _MAX_DEPTH:
        return depth
    if isinstance(value, dict):
        return max((_json_depth(v, depth + 1) for v in value.values()), default=depth)
    if isinstance(value, list):
        return max((_json_depth(v, depth + 1) for v in value), default=depth)
    return depth


def _has_oversized_string(value: Any) -> bool:
    if isinstance(value, str):
        return len(value) > _MAX_STRING_LEN
    if isinstance(value, dict):
        return any(_has_oversized_string(v) for v in value.values())
    if isinstance(value, list):
        return any(_has_oversized_string(v) for v in value)
    return False


def _item_structure_error(item: Any) -> str | None:
    """Producer-agnostic structural guardrail: depth and string-length caps."""
    if _json_depth(item) > _MAX_DEPTH:
        return f"exceeds max nesting depth {_MAX_DEPTH}"
    if _has_oversized_string(item):
        return f"contains a string longer than {_MAX_STRING_LEN} chars"
    return None
|
||||||
|
|
||||||
|
|
||||||
|
def _allow_list_from_context(context: dict | None) -> set[str] | None:
    """Build the recommendation-candidate allow-list from resolved context.

    Looks for `context["known_candidates"]` (a list/set of valid candidate ids).
    Returns None when absent so the allow-list check stays inert until a context
    resolver populates it — the guardrail capability ships now; activation is a
    one-line resolver change.
    """
    if not isinstance(context, dict):
        return None
    known = context.get("known_candidates")
    if isinstance(known, (list, set, tuple)):
        return {str(item) for item in known}
    return None


def _report_contract(instr: Any) -> tuple[dict[str, Any] | None, int | None]:
    """Extract (item_schema, max_items) for the recommendations list, if any."""
    try:
        schema = _load_output_schema(getattr(instr, "output_schema", ""))
    except (OSError, json.JSONDecodeError, TypeError):
        return None, None
    if not isinstance(schema, dict):
        return None, None
    recs = (schema.get("properties") or {}).get("recommendations")
    if not isinstance(recs, dict):
        return None, None
    item_schema = recs.get("items") if isinstance(recs.get("items"), dict) else None
    max_items = recs.get("maxItems") if isinstance(recs.get("maxItems"), int) else None
    return item_schema, max_items
|
||||||
|
|
||||||
|
|
||||||
|
def _extract_object_spans(raw: str) -> list[tuple[str, bool]]:
    """Return (span, complete) for each recommendation object in raw output.

    Scans the `recommendations` array brace-aware and string-aware so it recovers
    objects whether they are pretty-printed across many lines or emitted one per
    line (NDJSON). A truncated trailing object is returned with complete=False.
    """
    key = raw.find('"recommendations"')
    start_region = raw.find("[", key) if key >= 0 else -1
    if start_region < 0:
        return []
    spans: list[tuple[str, bool]] = []
    i, n = start_region + 1, len(raw)
    while i < n:
        ch = raw[i]
        if ch == "]":
            break
        if ch != "{":
            i += 1
            continue
        depth, in_str, esc, j = 0, False, False, i
        closed = False
        while j < n:
            c = raw[j]
            if in_str:
                if esc:
                    esc = False
                elif c == "\\":
                    esc = True
                elif c == '"':
                    in_str = False
            elif c == '"':
                in_str = True
            elif c == "{":
                depth += 1
            elif c == "}":
                depth -= 1
                if depth == 0:
                    spans.append((raw[i:j + 1], True))
                    closed = True
                    break
            j += 1
        if not closed:
            spans.append((raw[i:], False))  # truncated tail
            break
        i = j + 1
    return spans
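The scan above is easiest to follow on a concrete payload. The sketch below is a condensed, standalone restatement of the same brace-aware and string-aware loop, applied to a truncated response whose first object contains a `}` inside a string. `object_spans` and the sample payload are hypothetical and only illustrate the behaviour.

```python
import json


def object_spans(raw: str) -> list[tuple[str, bool]]:
    """Hypothetical restatement of the scan above: (span, complete) per object."""
    key = raw.find('"recommendations"')
    start = raw.find("[", key) if key >= 0 else -1
    if start < 0:
        return []
    spans: list[tuple[str, bool]] = []
    i, n = start + 1, len(raw)
    while i < n:
        if raw[i] == "]":
            break  # end of the recommendations array
        if raw[i] != "{":
            i += 1
            continue
        depth, in_str, esc, j, closed = 0, False, False, i, False
        while j < n:
            c = raw[j]
            if in_str:
                # Inside a string only escapes and the closing quote matter.
                if esc:
                    esc = False
                elif c == "\\":
                    esc = True
                elif c == '"':
                    in_str = False
            elif c == '"':
                in_str = True
            elif c == "{":
                depth += 1
            elif c == "}":
                depth -= 1
                if depth == 0:
                    spans.append((raw[i:j + 1], True))
                    closed = True
                    break
            j += 1
        if not closed:
            spans.append((raw[i:], False))  # truncated tail
            break
        i = j + 1
    return spans


raw = (
    '{"summary": "s", "recommendations": ['
    '{"candidate": "a", "note": "has } brace"}, '
    '{"candidate": "b", "no'
)
spans = object_spans(raw)
first = json.loads(spans[0][0])
print(len(spans), first["candidate"], spans[1][1])
```

Because braces inside string literals are ignored, the first object is recovered whole, while the cut-off second object is reported with `complete=False` so the caller can attempt a repair or quarantine it.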
|
||||||
|
|
||||||
|
|
||||||
|
def _try_repair(span: str) -> str:
    """Best-effort close of a truncated JSON object: balance quote, braces, brackets."""
    in_str, esc, depth_c, depth_b = False, False, 0, 0
    for c in span:
        if in_str:
            if esc:
                esc = False
            elif c == "\\":
                esc = True
            elif c == '"':
                in_str = False
        elif c == '"':
            in_str = True
        elif c == "{":
            depth_c += 1
        elif c == "}":
            depth_c -= 1
        elif c == "[":
            depth_b += 1
        elif c == "]":
            depth_b -= 1
    repaired = span.rstrip().rstrip(",")
    if in_str:
        repaired += '"'
    return repaired + "]" * max(depth_b, 0) + "}" * max(depth_c, 0)
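A short sketch of what this best-effort repair can and cannot recover, assuming the same balancing rules as above. `close_truncated`, `parses`, and the sample spans are hypothetical. The point is that a cut inside a string value or after a trailing comma becomes parseable, while a dangling key does not and would still be quarantined by the caller.

```python
import json


def close_truncated(span: str) -> str:
    """Hypothetical restatement of the repair above: close quote, brackets, braces."""
    in_str, esc, braces, brackets = False, False, 0, 0
    for c in span:
        if in_str:
            if esc:
                esc = False
            elif c == "\\":
                esc = True
            elif c == '"':
                in_str = False
        elif c == '"':
            in_str = True
        elif c == "{":
            braces += 1
        elif c == "}":
            braces -= 1
        elif c == "[":
            brackets += 1
        elif c == "]":
            brackets -= 1
    repaired = span.rstrip().rstrip(",")
    if in_str:
        repaired += '"'  # terminate the string that was cut off
    return repaired + "]" * max(brackets, 0) + "}" * max(braces, 0)


def parses(text: str) -> bool:
    """True if `text` is valid JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


cut_in_string = '{"candidate": "repo-a", "tags": ["x", "y"], "rationale": "cut off mid-sen'
cut_after_comma = '{"candidate": "repo-a",'
dangling_key = '{"candidate": "repo-a", "rationale":'

item = json.loads(close_truncated(cut_in_string))
print(item["rationale"])
print(parses(close_truncated(cut_after_comma)), parses(close_truncated(dangling_key)))
```

The repair is purely syntactic: a recovered item may carry a shortened string value, which is one reason a recovered report is flagged for review rather than treated as fully valid.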
|
||||||
|
|
||||||
|
|
||||||
|
def _recover_recommendations(
    raw: str,
) -> tuple[str | None, list[dict[str, Any]], list[dict[str, Any]]]:
    """Recover (summary, items, quarantined) from a failed report payload."""
    summary_match = _SUMMARY_RE.search(raw)
    summary = None
    if summary_match:
        try:
            summary = json.loads(f'"{summary_match.group(1)}"')
        except json.JSONDecodeError:
            summary = summary_match.group(1)
    items: list[dict[str, Any]] = []
    quarantined: list[dict[str, Any]] = []
    for index, (span, complete) in enumerate(_extract_object_spans(raw)):
        parsed: Any = None
        try:
            parsed = json.loads(span)
        except json.JSONDecodeError as exc:
            if not complete:
                try:
                    parsed = json.loads(_try_repair(span))
                except json.JSONDecodeError:
                    parsed = None
            if parsed is None:
                quarantined.append(
                    {"index": index, "error": str(exc), "raw": _snippet(span),
                     "reason": "truncated" if not complete else "unparseable"}
                )
                continue
        if isinstance(parsed, dict):
            items.append(parsed)
        else:
            quarantined.append(
                {"index": index, "error": "item is not a JSON object",
                 "raw": _snippet(span)}
            )
    return summary, items, quarantined
|
||||||
|
|
||||||
|
|
||||||
|
def _partition_items(
    items: list[dict[str, Any]],
    item_schema: dict[str, Any] | None,
    max_items: int | None,
    *,
    run_schema: bool = True,
    allow_list: set[str] | None = None,
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Screen items into (valid, quarantined).

    Applied uniformly to recovered items (run_schema=True) and to already
    schema-valid happy-path items (run_schema=False). Order of checks: structural
    type → schema → producer guardrails (depth/length) → reference allow-list →
    count cap. The first failing check quarantines the item with provenance.
    """
    valid: list[dict[str, Any]] = []
    quarantined: list[dict[str, Any]] = []
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            quarantined.append(
                {"index": index, "error": "item is not a JSON object",
                 "raw": _snippet(item), "reason": "malformed"}
            )
            continue
        schema_error = (
            _validate_schema_node(item, item_schema, f"recommendations[{index}]")
            if (run_schema and item_schema)
            else None
        )
        if schema_error:
            quarantined.append(
                {"index": index, "error": schema_error, "raw": _snippet(item),
                 "reason": "schema"}
            )
            continue
        structure_error = _item_structure_error(item)
        if structure_error:
            quarantined.append(
                {"index": index, "error": structure_error, "raw": _snippet(item),
                 "reason": "guardrail"}
            )
            continue
        if allow_list is not None:
            candidate = item.get("candidate")
            if not isinstance(candidate, str) or candidate not in allow_list:
                quarantined.append(
                    {"index": index, "error": f"candidate {candidate!r} not in allow-list",
                     "raw": _snippet(item), "reason": "allow_list"}
                )
                continue
        valid.append(item)
    if max_items is not None and len(valid) > max_items:
        for item in valid[max_items:]:
            quarantined.append(
                {"index": None, "error": f"exceeds maxItems={max_items}",
                 "raw": _snippet(item), "reason": "over_limit"}
            )
        valid = valid[:max_items]
    return valid, quarantined
|
||||||
|
|
||||||
|
|
||||||
|
def _resilient_report(
    instr: Any,
    raw_output: Any,
    original_error: str,
    prompt_hash: str | None,
    allow_list: set[str] | None = None,
) -> InstructionResult | None:
    """Recover a partial-but-usable report from output that failed validation.

    Returns None when nothing usable can be recovered, so the caller falls back
    to the total-loss diagnostic artifact (_invalid_output_report).
    """
    if not getattr(instr, "report_sinks", None) or not isinstance(raw_output, str):
        return None
    item_schema, max_items = _report_contract(instr)
    summary, items, quarantined = _recover_recommendations(raw_output)
    if not items:
        return None
    valid, item_quarantine = _partition_items(
        items, item_schema, max_items, allow_list=allow_list,
    )
    quarantined.extend(item_quarantine)
    if not valid:
        return None
    report: dict[str, Any] = {
        "summary": summary
        or f"Partial daily triage: recovered {len(valid)} recommendation(s) "
        "after the full report failed validation.",
        "recommendations": valid,
        "status": "partial",
        "partial": True,
        "quarantined_count": len(quarantined),
        "quarantined_items": quarantined[:_QUARANTINE_LIMIT],
        "recovery_note": f"original validation error: {original_error}",
    }
    logger.warning(
        "instruction_output_recovered: instruction=%r, kept=%d, quarantined=%d",
        getattr(instr, "id", None), len(valid), len(quarantined),
    )
    return InstructionResult(
        tasks=[],
        report=report,
        prompt_hash=prompt_hash,
        model=getattr(instr, "model", None),
        output_validated=True,
        review_required=True,
        condition_matched=getattr(instr, "condition", "") or None,
        validation_error=None,
    )
|
||||||
|
|
||||||
|
|
||||||
def _execution_failure_report(instr: Any, error: str) -> dict[str, Any] | None:
|
def _execution_failure_report(instr: Any, error: str) -> dict[str, Any] | None:
|
||||||
"""Build a durable diagnostic report when a report instruction cannot run."""
|
"""Build a durable diagnostic report when a report instruction cannot run."""
|
||||||
if not getattr(instr, "report_sinks", None):
|
if not getattr(instr, "report_sinks", None):
|
||||||
@@ -295,6 +622,7 @@ def _execution_failure_report(instr: Any, error: str) -> dict[str, Any] | None:
|
|||||||
def _validate_output(
|
def _validate_output(
|
||||||
raw_output: Any,
|
raw_output: Any,
|
||||||
instr: Any,
|
instr: Any,
|
||||||
|
allow_list: set[str] | None = None,
|
||||||
) -> tuple[list[TaskSpec], dict[str, Any] | None, str | None]:
|
) -> tuple[list[TaskSpec], dict[str, Any] | None, str | None]:
|
||||||
"""Parse raw LLM output into TaskSpecs and optional report payload.
|
"""Parse raw LLM output into TaskSpecs and optional report payload.
|
||||||
|
|
||||||
@@ -349,6 +677,28 @@ def _validate_output(
|
|||||||
source_type="instruction",
|
source_type="instruction",
|
||||||
source_id=instr.id,
|
source_id=instr.id,
|
||||||
))
|
))
|
||||||
|
|
||||||
|
# Happy-path producer guardrails (WP-0016-T04): the whole document already
|
||||||
|
# passed schema validation, so recommendations are schema-valid; still apply
|
||||||
|
# the count cap, structural caps, and reference allow-list, quarantining any
|
||||||
|
# offenders rather than emitting them. Report shape only changes when an item
|
||||||
|
# is actually quarantined.
|
||||||
|
if isinstance(report, dict) and isinstance(report.get("recommendations"), list):
|
||||||
|
item_schema, max_items = _report_contract(instr)
|
||||||
|
kept, quarantined = _partition_items(
|
||||||
|
report["recommendations"], item_schema, max_items,
|
||||||
|
run_schema=False, allow_list=allow_list,
|
||||||
|
)
|
||||||
|
if quarantined:
|
||||||
|
report = {
|
||||||
|
**report,
|
||||||
|
"recommendations": kept,
|
||||||
|
"status": "partial",
|
||||||
|
"partial": True,
|
||||||
|
"quarantined_count": len(quarantined),
|
||||||
|
"quarantined_items": quarantined[:_QUARANTINE_LIMIT],
|
||||||
|
}
|
||||||
|
|
||||||
return specs, report, None
|
return specs, report, None
|
||||||
except (json.JSONDecodeError, AttributeError, KeyError, TypeError) as exc:
|
except (json.JSONDecodeError, AttributeError, KeyError, TypeError) as exc:
|
||||||
return [], None, str(exc)
|
return [], None, str(exc)
|
||||||
|
|||||||
194
src/activity_core/schedule_health.py
Normal file
@@ -0,0 +1,194 @@
|
|||||||
"""Missed-fire detection for cron schedules (ACTIVITY-WP-0014, T03).

Even with a catchup window configured, an operator wants to *know* when a fire
was missed — especially under ``misfire_policy: skip`` where missed fires are
dropped by design and leave no run and no failure event. This module turns the
schedule's own bookkeeping into an explicit verdict and an optional State Hub
alert so a miss is never invisible again.

Temporal already counts fires that were dropped because they fell outside the
catchup window in ``ScheduleInfo.num_actions_missed_catchup_window``. We surface
that, plus a staleness check on the most recent fire, as a ``ScheduleHealth``
verdict. The verdict logic is a pure function so it is testable without a live
Temporal server; ``check_schedule_health`` is the thin async reader.
"""

from __future__ import annotations

import os
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Any
from uuid import UUID

import httpx

from activity_core.schedule_manager import schedule_id
from activity_core.state_hub_write import idempotency_headers

_DEFAULT_STATE_HUB_URL = "http://127.0.0.1:8000"


@dataclass(frozen=True)
class ScheduleHealth:
    """Verdict for a single schedule's recent firing behaviour."""

    activity_id: str
    healthy: bool
    missed_catchup_window: int
    last_fired_at: datetime | None
    staleness: timedelta | None
    reasons: list[str] = field(default_factory=list)

    @property
    def missed(self) -> bool:
        return not self.healthy


def evaluate_schedule_health(
    *,
    activity_id: str,
    missed_catchup_window: int,
    last_fired_at: datetime | None,
    now: datetime,
    expected_interval: timedelta | None = None,
    tolerance: timedelta = timedelta(minutes=10),
) -> ScheduleHealth:
    """Pure verdict: was a fire missed?

    A schedule is unhealthy if Temporal dropped any fire past the catchup window,
    or — when ``expected_interval`` is known — if the most recent fire is older
    than one interval plus ``tolerance`` (i.e. a fire should have happened and
    did not).
    """
    reasons: list[str] = []

    if missed_catchup_window > 0:
        reasons.append(
            f"{missed_catchup_window} fire(s) dropped outside the catchup window"
        )

    staleness: timedelta | None = None
    if last_fired_at is not None:
        staleness = now - last_fired_at
        if expected_interval is not None and staleness > expected_interval + tolerance:
            reasons.append(
                f"last fire was {staleness} ago, exceeding the expected "
                f"{expected_interval} interval"
            )
    elif expected_interval is not None:
        reasons.append("no recorded fire for a schedule that should have fired")

    return ScheduleHealth(
        activity_id=activity_id,
        healthy=not reasons,
        missed_catchup_window=missed_catchup_window,
        last_fired_at=last_fired_at,
        staleness=staleness,
        reasons=reasons,
    )
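The staleness rule reduces to one comparison: a schedule is stale when the time since the last fire is strictly greater than the expected interval plus the tolerance. The sketch below works through that arithmetic for a daily schedule with the default 10-minute tolerance. `is_stale` and the timestamps are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_stale(
    last_fired_at: datetime,
    now: datetime,
    expected_interval: timedelta,
    tolerance: timedelta = timedelta(minutes=10),
) -> bool:
    """Hypothetical helper isolating the staleness comparison described above."""
    return (now - last_fired_at) > expected_interval + tolerance


now = datetime(2025, 1, 2, 6, 0, tzinfo=timezone.utc)  # arbitrary reference time
daily = timedelta(days=1)

slightly_late = now - timedelta(hours=24, minutes=5)   # inside the tolerance
at_threshold = now - timedelta(hours=24, minutes=10)   # exactly interval + tolerance
missed = now - timedelta(hours=24, minutes=11)         # one minute past the threshold

for label, fired in [("late", slightly_late), ("edge", at_threshold), ("missed", missed)]:
    print(label, is_stale(fired, now, daily))
```

Since the comparison is strict, a fire that is exactly one interval plus the tolerance old still counts as healthy. Both datetimes should be timezone-aware (or both naive), since Python raises `TypeError` when subtracting one kind from the other.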
|
||||||
|
|
||||||
|
|
||||||
|
def _extract_info(desc: Any) -> tuple[int, datetime | None]:
    """Pull (missed_catchup_window, last_fired_at) from a ScheduleDescription.

    Accesses are defensive so a Temporal SDK field rename degrades to "unknown"
    rather than raising inside an operational health check.
    """
    info = getattr(desc, "info", None)
    missed = int(getattr(info, "num_actions_missed_catchup_window", 0) or 0)

    last_fired: datetime | None = None
    recent = getattr(info, "recent_actions", None) or []
    times = [
        getattr(a, "scheduled_at", None) or getattr(a, "started_at", None)
        for a in recent
    ]
    times = [t for t in times if t is not None]
    if times:
        last_fired = max(times)
    return missed, last_fired
|
||||||
|
|
||||||
|
|
||||||
|
async def check_schedule_health(
    client: Any,
    activity_id: str | UUID,
    *,
    now: datetime | None = None,
    expected_interval: timedelta | None = None,
    tolerance: timedelta = timedelta(minutes=10),
) -> ScheduleHealth:
    """Describe the schedule for ``activity_id`` and evaluate its health."""
    now = now or datetime.now(tz=timezone.utc)
    handle = client.get_schedule_handle(schedule_id(activity_id))
    desc = await handle.describe()
    missed, last_fired = _extract_info(desc)
    return evaluate_schedule_health(
        activity_id=str(activity_id),
        missed_catchup_window=missed,
        last_fired_at=last_fired,
        now=now,
        expected_interval=expected_interval,
        tolerance=tolerance,
    )


def post_missed_fire_alert(
    health: ScheduleHealth,
    *,
    state_hub_url: str | None = None,
    author: str = "activity-core",
    topic_id: str | None = None,
    workstream_id: str | None = None,
    timeout_seconds: float = 10.0,
) -> dict[str, Any]:
    """Post a ``schedule_miss`` progress event to State Hub for an unhealthy schedule.

    No-op (returns ``status: ok``) when the schedule is healthy, so callers can
    invoke unconditionally.
    """
    if health.healthy:
        return {"type": "schedule-miss-alert", "status": "ok"}

    base_url = state_hub_url or os.environ.get("STATE_HUB_URL", _DEFAULT_STATE_HUB_URL)
    base_url = str(base_url).rstrip("/")

    body: dict[str, Any] = {
        "event_type": "schedule_miss",
        "author": author,
        "summary": (
            f"Schedule {health.activity_id} missed a fire: "
            + "; ".join(health.reasons)
        ),
        "detail": {
            "activity_id": health.activity_id,
            "missed_catchup_window": health.missed_catchup_window,
            "last_fired_at": (
                health.last_fired_at.isoformat() if health.last_fired_at else None
            ),
            "staleness_seconds": (
                health.staleness.total_seconds() if health.staleness else None
            ),
            "reasons": health.reasons,
        },
    }
    if topic_id:
        body["topic_id"] = topic_id
    if workstream_id:
        body["workstream_id"] = workstream_id

    # Dedup repeated alerts for the same missed window (same schedule + last fire).
    last_fired = health.last_fired_at.isoformat() if health.last_fired_at else "none"
    resp = httpx.post(
        f"{base_url}/progress/",
        json=body,
        headers=idempotency_headers("schedule_miss", health.activity_id, last_fired),
        timeout=timeout_seconds,
    )
    resp.raise_for_status()
    data = resp.json()
    return {
        "type": "schedule-miss-alert",
        "status": "posted",
        "progress_id": data.get("id"),
    }
|
||||||
@@ -17,7 +17,6 @@ from temporalio.client import (
|
|||||||
Schedule,
|
Schedule,
|
||||||
ScheduleActionStartWorkflow,
|
ScheduleActionStartWorkflow,
|
||||||
ScheduleAlreadyRunningError,
|
ScheduleAlreadyRunningError,
|
||||||
ScheduleBackfill,
|
|
||||||
ScheduleCalendarSpec,
|
ScheduleCalendarSpec,
|
||||||
ScheduleHandle,
|
ScheduleHandle,
|
||||||
ScheduleOverlapPolicy,
|
ScheduleOverlapPolicy,
|
||||||
@@ -38,13 +37,49 @@ _ORCHESTRATOR_TASK_QUEUE = "orchestrator-tq"
|
|||||||
# RunActivityWorkflow detects this value and derives run dedup key from workflow_id.
|
# RunActivityWorkflow detects this value and derives run dedup key from workflow_id.
|
||||||
SCHEDULED_TRIGGER_KEY = "scheduled"
|
SCHEDULED_TRIGGER_KEY = "scheduled"
|
||||||
|
|
||||||
# T24: misfire_policy → ScheduleOverlapPolicy
|
# ACTIVITY-WP-0014: misfire_policy → run-miss recovery behaviour.
|
||||||
_MISFIRE_TO_OVERLAP: dict[str, ScheduleOverlapPolicy] = {
|
#
|
||||||
"skip": ScheduleOverlapPolicy.SKIP,
|
# A "missed fire" happens when the worker / Temporal is unavailable at trigger
|
||||||
"catchup": ScheduleOverlapPolicy.BUFFER_ALL,
|
# time. Two Temporal levers together define the behaviour:
|
||||||
"compress": ScheduleOverlapPolicy.BUFFER_ONE,
|
# - catchup_window: how far back the server will recover missed fires once it
|
||||||
|
# is healthy again. The previous code never set this, so a brief outage at
|
||||||
|
# trigger time silently dropped the fire with no recovery and no signal.
|
||||||
|
# - overlap: what to do when a (recovered) fire would start while a prior run
|
||||||
|
# is still executing.
|
||||||
|
#
|
||||||
|
# Legacy values (catchup, compress) are aliased onto the explicit names.
|
||||||
|
_MISFIRE_ALIASES: dict[str, str] = {
|
||||||
|
"catchup": "catchup_all",
|
||||||
|
"compress": "catchup_latest",
|
||||||
}
|
}
|
||||||
|
|
||||||
|
# overlap policy + default catchup window (seconds) per normalised policy.
|
||||||
|
_SKIP_WINDOW_SECONDS = 60
|
||||||
|
_CATCHUP_ALL_WINDOW_SECONDS = 365 * 24 * 3600
|
||||||
|
_CATCHUP_LATEST_WINDOW_SECONDS = 24 * 3600
|
||||||
|
|
||||||
|
_MISFIRE_TO_OVERLAP: dict[str, ScheduleOverlapPolicy] = {
|
||||||
|
# Run on trigger or skip — recover nothing past a tiny grace window.
|
||||||
|
"skip": ScheduleOverlapPolicy.SKIP,
|
||||||
|
# Run on trigger or recover every missed fire during the outage window.
|
||||||
|
"catchup_all": ScheduleOverlapPolicy.BUFFER_ALL,
|
||||||
|
# Run on trigger or recover the most recent missed fire only; BUFFER_ONE
|
||||||
|
# buffers at most one start and drops the rest, so a backlog never accumulates.
|
||||||
|
"catchup_latest": ScheduleOverlapPolicy.BUFFER_ONE,
|
||||||
|
}
|
||||||
|
|
||||||
|
_MISFIRE_DEFAULT_WINDOW: dict[str, int] = {
|
||||||
|
"skip": _SKIP_WINDOW_SECONDS,
|
||||||
|
"catchup_all": _CATCHUP_ALL_WINDOW_SECONDS,
|
||||||
|
"catchup_latest": _CATCHUP_LATEST_WINDOW_SECONDS,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _normalize_misfire_policy(misfire_policy: str) -> str:
|
||||||
|
"""Map legacy aliases onto the explicit run-miss policy names."""
|
||||||
|
canonical = _MISFIRE_ALIASES.get(misfire_policy, misfire_policy)
|
||||||
|
return canonical if canonical in _MISFIRE_TO_OVERLAP else "skip"
|
||||||
|
|
||||||
|
|
||||||
def schedule_id(activity_id: str | UUID) -> str:
|
def schedule_id(activity_id: str | UUID) -> str:
|
||||||
"""Return the canonical Temporal Schedule ID for an ActivityDefinition."""
|
"""Return the canonical Temporal Schedule ID for an ActivityDefinition."""
|
||||||
@@ -57,7 +92,15 @@ def smoke_schedule_id(activity_id: str | UUID) -> str:
|
|||||||
|
|
||||||
|
|
||||||
def _overlap_policy(misfire_policy: str) -> ScheduleOverlapPolicy:
|
def _overlap_policy(misfire_policy: str) -> ScheduleOverlapPolicy:
|
||||||
return _MISFIRE_TO_OVERLAP.get(misfire_policy, ScheduleOverlapPolicy.SKIP)
|
return _MISFIRE_TO_OVERLAP[_normalize_misfire_policy(misfire_policy)]
|
||||||
|
|
||||||
|
|
||||||
|
def _catchup_window(cfg: CronTriggerConfig) -> timedelta:
|
||||||
|
"""Resolve the catchup window: explicit override, else the policy default."""
|
||||||
|
if cfg.catchup_window_seconds is not None:
|
||||||
|
return timedelta(seconds=cfg.catchup_window_seconds)
|
||||||
|
policy = _normalize_misfire_policy(cfg.misfire_policy)
|
||||||
|
return timedelta(seconds=_MISFIRE_DEFAULT_WINDOW[policy])
|
||||||
|
|
||||||
|
|
||||||
def _build_schedule(defn: ActivityDefinition) -> Schedule:
|
def _build_schedule(defn: ActivityDefinition) -> Schedule:
|
||||||
@@ -80,7 +123,10 @@ def _build_schedule(defn: ActivityDefinition) -> Schedule:
|
|||||||
jitter=timedelta(seconds=cfg.jitter_seconds) if cfg.jitter_seconds else None,
|
jitter=timedelta(seconds=cfg.jitter_seconds) if cfg.jitter_seconds else None,
|
||||||
)
|
)
|
||||||
|
|
||||||
policy = SchedulePolicy(overlap=_overlap_policy(cfg.misfire_policy))
|
policy = SchedulePolicy(
|
||||||
|
overlap=_overlap_policy(cfg.misfire_policy),
|
||||||
|
catchup_window=_catchup_window(cfg),
|
||||||
|
)
|
||||||
state = ScheduleState(paused=not defn.enabled)
|
state = ScheduleState(paused=not defn.enabled)
|
||||||
|
|
||||||
return Schedule(action=action, spec=spec, policy=policy, state=state)
|
return Schedule(action=action, spec=spec, policy=policy, state=state)
|
||||||
@@ -282,18 +328,10 @@ async def upsert_schedule(client: Client, defn: ActivityDefinition) -> ScheduleH
|
|||||||
else:
|
else:
|
||||||
await handle.pause(note="disabled via upsert_schedule")
|
await handle.pause(note="disabled via upsert_schedule")
|
||||||
|
|
||||||
# T24 catchup: backfill any fires missed in the last hour.
|
# ACTIVITY-WP-0014: missed-fire recovery is now handled natively by the
|
||||||
if isinstance(defn.trigger_config, CronTriggerConfig):
|
# schedule's catchup_window (see _build_schedule), which the server applies
|
||||||
if defn.trigger_config.misfire_policy == "catchup":
|
# continuously after any outage — not only at upsert time. The previous
|
||||||
now = datetime.now(tz=timezone.utc)
|
# ad-hoc 1-hour backfill is therefore no longer needed.
|
||||||
backfill_start = now - timedelta(hours=1)
|
|
||||||
await handle.backfill(
|
|
||||||
ScheduleBackfill(
|
|
||||||
start_at=backfill_start,
|
|
||||||
end_at=now,
|
|
||||||
overlap=ScheduleOverlapPolicy.BUFFER_ALL,
|
|
||||||
)
|
|
||||||
)
|
|
||||||
|
|
||||||
return handle
|
return handle
|
||||||
|
|
||||||
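The alias, overlap, and default-window tables in the schedule_manager hunk above can be read as one small lookup. The following is a minimal standalone sketch of that lookup, not the module itself: `OverlapPolicy` is a stand-in for Temporal's `ScheduleOverlapPolicy`, `CronCfg` is a hypothetical stand-in for `CronTriggerConfig`, and `resolve` is an illustrative helper name. Only the table values and the fallback-to-`skip` rule are taken from the diff.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import timedelta
from enum import Enum


class OverlapPolicy(Enum):
    """Stand-in for temporalio's ScheduleOverlapPolicy (assumption for this sketch)."""

    SKIP = "skip"
    BUFFER_ONE = "buffer_one"
    BUFFER_ALL = "buffer_all"


@dataclass
class CronCfg:
    """Hypothetical minimal stand-in for CronTriggerConfig."""

    misfire_policy: str = "skip"
    catchup_window_seconds: int | None = None


# Legacy names map onto the explicit run-miss policy names.
ALIASES = {"catchup": "catchup_all", "compress": "catchup_latest"}

OVERLAP = {
    "skip": OverlapPolicy.SKIP,
    "catchup_all": OverlapPolicy.BUFFER_ALL,
    "catchup_latest": OverlapPolicy.BUFFER_ONE,
}

# Default catchup window in seconds per normalised policy (values from the diff).
DEFAULT_WINDOW_SECONDS = {
    "skip": 60,
    "catchup_all": 365 * 24 * 3600,
    "catchup_latest": 24 * 3600,
}


def normalize(policy: str) -> str:
    """Resolve aliases; anything unknown falls back to "skip"."""
    canonical = ALIASES.get(policy, policy)
    return canonical if canonical in OVERLAP else "skip"


def resolve(cfg: CronCfg) -> tuple[OverlapPolicy, timedelta]:
    """Return (overlap policy, catchup window); an explicit override wins."""
    policy = normalize(cfg.misfire_policy)
    if cfg.catchup_window_seconds is not None:
        window = timedelta(seconds=cfg.catchup_window_seconds)
    else:
        window = timedelta(seconds=DEFAULT_WINDOW_SECONDS[policy])
    return OVERLAP[policy], window
```

Under these assumptions, a legacy `compress` config resolves to the buffer-one policy with a 24-hour window, and an explicit `catchup_window_seconds` replaces only the window, never the overlap policy.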
|
|||||||
34
src/activity_core/state_hub_write.py
Normal file
@@ -0,0 +1,34 @@
|
|||||||
|
"""Idempotency-keyed State Hub writes (ACTIVITY-WP-0014 T05).
|
||||||
|
|
||||||
|
Under the State Hub *beachhead* model, a write may be buffered locally while
|
||||||
|
central State Hub is unreachable and **flushed later, possibly with retries**.
|
||||||
|
To keep that flush safe — no duplicate progress / triage events — every write
|
||||||
|
carries a stable ``Idempotency-Key`` header derived deterministically from the
|
||||||
|
write's identity. The guarantee lives on the write itself and does **not** depend
|
||||||
|
on a live dedup read, so it holds even when the beachhead is serving offline.
|
||||||
|
|
||||||
|
activity-core does not implement the queue/cache (that is state-hub's beachhead);
|
||||||
|
it only emits the key so the beachhead / State Hub can dedup on flush. The header
|
||||||
|
passes untouched through the existing ``actcore-state-hub-bridge`` proxy and is
|
||||||
|
ignored by State Hub versions that do not yet honour it.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
IDEMPOTENCY_HEADER = "Idempotency-Key"
|
||||||
|
|
||||||
|
|
||||||
|
def idempotency_key(*parts: str | None) -> str:
|
||||||
|
"""Build a stable, header-safe idempotency key from identity parts.
|
||||||
|
|
||||||
|
Empty/None parts are kept as empty segments so the key shape is stable across
|
||||||
|
calls. Whitespace, control, and non-ASCII characters are each replaced with ``_`` to keep the value a
|
||||||
|
valid single-line HTTP header.
|
||||||
|
"""
|
||||||
|
raw = ":".join((p or "") for p in parts)
|
||||||
|
return "".join(ch if 0x20 < ord(ch) < 0x7F else "_" for ch in raw) or "_"
|
||||||
|
|
||||||
|
|
||||||
|
def idempotency_headers(*parts: str | None) -> dict[str, str]:
|
||||||
|
"""Return the header dict to attach to a State Hub write."""
|
||||||
|
return {IDEMPOTENCY_HEADER: idempotency_key(*parts)}
|
||||||
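To make the key shape concrete, here is a small standalone sketch that copies `idempotency_key` from the new file above and calls it with made-up identity parts. The activity name `nightly-report` and the timestamp are illustrative assumptions, not values from the repository.

```python
from __future__ import annotations

IDEMPOTENCY_HEADER = "Idempotency-Key"


def idempotency_key(*parts: str | None) -> str:
    """Join identity parts with ':' and replace header-unsafe characters."""
    raw = ":".join((p or "") for p in parts)
    # Keep printable ASCII other than space; everything else becomes "_".
    return "".join(ch if 0x20 < ord(ch) < 0x7F else "_" for ch in raw) or "_"


# The same identity always yields the same key, so a retried flush can be deduplicated.
first = idempotency_key("schedule_miss", "nightly-report", "2026-06-26T08:00:00+00:00")
retry = idempotency_key("schedule_miss", "nightly-report", "2026-06-26T08:00:00+00:00")

# A None part is kept as an empty trailing segment, so the key shape is unchanged.
never_fired = idempotency_key("schedule_miss", "nightly-report", None)

# Spaces and newlines are replaced one-for-one, not dropped.
messy = idempotency_key("a b", "c\n")

# With no parts at all the key degrades to a single underscore.
empty = idempotency_key()

headers = {IDEMPOTENCY_HEADER: first}
```

Because the key is derived only from the write's identity, two processes that buffer the same alert independently would produce identical headers without coordinating.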
@@ -15,6 +15,8 @@ import asyncio
|
|||||||
import logging
|
import logging
|
||||||
import os
|
import os
|
||||||
import uuid
|
import uuid
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from typing import Sequence
|
||||||
|
|
||||||
from sqlalchemy import select
|
from sqlalchemy import select
|
||||||
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
|
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
|
||||||
@@ -30,6 +32,20 @@ TEMPORAL_HOST = os.environ.get("TEMPORAL_HOST", "localhost:7233")
|
|||||||
TEMPORAL_NAMESPACE = os.environ.get("TEMPORAL_NAMESPACE", "default")
|
TEMPORAL_NAMESPACE = os.environ.get("TEMPORAL_NAMESPACE", "default")
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class ScheduleSyncResult:
|
||||||
|
upserted: int = 0
|
||||||
|
paused: int = 0
|
||||||
|
deleted_orphans: int = 0
|
||||||
|
|
||||||
|
def to_dict(self) -> dict[str, int]:
|
||||||
|
return {
|
||||||
|
"upserted": self.upserted,
|
||||||
|
"paused": self.paused,
|
||||||
|
"deleted_orphans": self.deleted_orphans,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
def _row_to_domain(row: ActivityDefinitionRow) -> ActivityDefinition:
|
def _row_to_domain(row: ActivityDefinitionRow) -> ActivityDefinition:
|
||||||
"""Convert an ORM row to a domain ActivityDefinition for schedule_manager."""
|
"""Convert an ORM row to a domain ActivityDefinition for schedule_manager."""
|
||||||
return ActivityDefinition.model_validate(
|
return ActivityDefinition.model_validate(
|
||||||
@@ -46,12 +62,82 @@ def _row_to_domain(row: ActivityDefinitionRow) -> ActivityDefinition:
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
async def sync(client: Client, db_url: str) -> None:
|
def _valid_schedule_activity_id(defn: ActivityDefinition) -> str:
|
||||||
|
if isinstance(defn.trigger_config, ScheduledTriggerConfig):
|
||||||
|
return f"{defn.id}-once"
|
||||||
|
return str(defn.id)
|
||||||
|
|
||||||
|
|
||||||
|
async def _load_schedule_rows(
|
||||||
|
session_factory: async_sessionmaker[AsyncSession],
|
||||||
|
) -> Sequence[ActivityDefinitionRow]:
|
||||||
|
async with session_factory() as session:
|
||||||
|
return (
|
||||||
|
await session.scalars(
|
||||||
|
select(ActivityDefinitionRow).where(
|
||||||
|
ActivityDefinitionRow.trigger_type.in_(["cron", "scheduled"])
|
||||||
|
)
|
||||||
|
)
|
||||||
|
).all()
|
||||||
|
|
||||||
|
|
||||||
|
async def sync_schedule_rows(
|
||||||
|
client: Client,
|
||||||
|
rows: Sequence[ActivityDefinitionRow],
|
||||||
|
) -> ScheduleSyncResult:
|
||||||
|
"""Reconcile Temporal Schedules against already-loaded definition rows."""
|
||||||
|
valid_schedule_activity_ids: set[str] = set()
|
||||||
|
result = ScheduleSyncResult()
|
||||||
|
|
||||||
|
for row in rows:
|
||||||
|
defn = _row_to_domain(row)
|
||||||
|
if not isinstance(
|
||||||
|
defn.trigger_config,
|
||||||
|
(CronTriggerConfig, ScheduledTriggerConfig),
|
||||||
|
):
|
||||||
|
continue
|
||||||
|
|
||||||
|
valid_schedule_activity_ids.add(_valid_schedule_activity_id(defn))
|
||||||
|
|
||||||
|
await upsert_schedule(client, defn)
|
||||||
|
if defn.enabled:
|
||||||
|
result.upserted += 1
|
||||||
|
logger.info("upserted schedule for activity %s (%s)", defn.id, defn.name)
|
||||||
|
else:
|
||||||
|
result.paused += 1
|
||||||
|
logger.info("upserted paused schedule for disabled activity %s", defn.id)
|
||||||
|
|
||||||
|
# Tombstone cleanup: remove Temporal Schedules with no matching DB row.
|
||||||
|
existing_schedules = await list_schedules(client)
|
||||||
|
for entry in existing_schedules:
|
||||||
|
if entry["activity_id"] not in valid_schedule_activity_ids:
|
||||||
|
await delete_schedule(client, entry["activity_id"])
|
||||||
|
result.deleted_orphans += 1
|
||||||
|
logger.info("deleted orphaned schedule %s", entry["schedule_id"])
|
||||||
|
|
||||||
|
logger.info(
|
||||||
|
"sync_schedules complete — upserted=%d paused=%d deleted_orphans=%d",
|
||||||
|
result.upserted,
|
||||||
|
result.paused,
|
||||||
|
result.deleted_orphans,
|
||||||
|
)
|
||||||
|
return result
|
||||||
|
|
||||||
|
|
||||||
|
async def sync_with_session_factory(
|
||||||
|
client: Client,
|
||||||
|
session_factory: async_sessionmaker[AsyncSession],
|
||||||
|
) -> ScheduleSyncResult:
|
||||||
|
"""Reconcile Temporal Schedules using an existing DB session factory."""
|
||||||
|
return await sync_schedule_rows(client, await _load_schedule_rows(session_factory))
|
||||||
|
|
||||||
|
|
||||||
|
async def sync(client: Client, db_url: str) -> ScheduleSyncResult:
|
||||||
"""Reconcile Temporal Schedules against the ActivityDefinition table.
|
"""Reconcile Temporal Schedules against the ActivityDefinition table.
|
||||||
|
|
||||||
Steps:
|
Steps:
|
||||||
1. Load all enabled cron ActivityDefinitions from Postgres.
|
1. Load all cron/scheduled ActivityDefinitions from Postgres.
|
||||||
2. Upsert a Temporal Schedule for each one.
|
2. Upsert a Temporal Schedule for each one, paused when disabled.
|
||||||
3. Delete Temporal Schedules whose activity_id has no matching DB row
|
3. Delete Temporal Schedules whose activity_id has no matching DB row
|
||||||
(tombstone cleanup for deleted or trigger-type-changed definitions).
|
(tombstone cleanup for deleted or trigger-type-changed definitions).
|
||||||
"""
|
"""
|
||||||
@@ -59,55 +145,10 @@ async def sync(client: Client, db_url: str) -> None:
|
|||||||
session_factory = async_sessionmaker(engine, expire_on_commit=False)
|
session_factory = async_sessionmaker(engine, expire_on_commit=False)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
async with session_factory() as session:
|
return await sync_with_session_factory(client, session_factory)
|
||||||
rows = (
|
|
||||||
await session.scalars(
|
|
||||||
select(ActivityDefinitionRow).where(
|
|
||||||
ActivityDefinitionRow.trigger_type.in_(["cron", "scheduled"])
|
|
||||||
)
|
|
||||||
)
|
|
||||||
).all()
|
|
||||||
finally:
|
finally:
|
||||||
await engine.dispose()
|
await engine.dispose()
|
||||||
|
|
||||||
db_activity_ids: set[str] = set()
|
|
||||||
upserted = 0
|
|
||||||
skipped = 0
|
|
||||||
|
|
||||||
for row in rows:
|
|
||||||
defn = _row_to_domain(row)
|
|
||||||
if not isinstance(defn.trigger_config, (CronTriggerConfig, ScheduledTriggerConfig)):
|
|
||||||
continue
|
|
||||||
|
|
||||||
db_activity_ids.add(str(defn.id))
|
|
||||||
|
|
||||||
if defn.enabled:
|
|
||||||
await upsert_schedule(client, defn)
|
|
||||||
upserted += 1
|
|
||||||
logger.info("upserted schedule for activity %s (%s)", defn.id, defn.name)
|
|
||||||
else:
|
|
||||||
# Disabled definitions: schedule may exist (paused) — leave it;
|
|
||||||
# upsert_schedule already handles the paused state.
|
|
||||||
await upsert_schedule(client, defn)
|
|
||||||
skipped += 1
|
|
||||||
logger.info("upserted paused schedule for disabled activity %s", defn.id)
|
|
||||||
|
|
||||||
# Tombstone cleanup: remove Temporal Schedules with no matching DB row.
|
|
||||||
existing_schedules = await list_schedules(client)
|
|
||||||
deleted = 0
|
|
||||||
for entry in existing_schedules:
|
|
||||||
if entry["activity_id"] not in db_activity_ids:
|
|
||||||
await delete_schedule(client, entry["activity_id"])
|
|
||||||
deleted += 1
|
|
||||||
logger.info("deleted orphaned schedule %s", entry["schedule_id"])
|
|
||||||
|
|
||||||
logger.info(
|
|
||||||
"sync_schedules complete — upserted=%d skipped_disabled=%d deleted_orphans=%d",
|
|
||||||
upserted,
|
|
||||||
skipped,
|
|
||||||
deleted,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
async def main() -> None:
|
async def main() -> None:
|
||||||
logging.basicConfig(level=logging.INFO)
|
logging.basicConfig(level=logging.INFO)
|
||||||
@@ -116,7 +157,13 @@ async def main() -> None:
|
|||||||
raise RuntimeError("ACTCORE_DB_URL is required")
|
raise RuntimeError("ACTCORE_DB_URL is required")
|
||||||
|
|
||||||
client = await Client.connect(TEMPORAL_HOST, namespace=TEMPORAL_NAMESPACE)
|
client = await Client.connect(TEMPORAL_HOST, namespace=TEMPORAL_NAMESPACE)
|
||||||
await sync(client, db_url)
|
result = await sync(client, db_url)
|
||||||
|
print(
|
||||||
|
"Synced schedules: "
|
||||||
|
f"upserted={result.upserted} "
|
||||||
|
f"paused={result.paused} "
|
||||||
|
f"deleted_orphans={result.deleted_orphans}"
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
|
|||||||
97
src/activity_core/sync_service.py
Normal file
@@ -0,0 +1,97 @@
|
|||||||
|
"""Shared ActivityDefinition/event type/schedule sync orchestration."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
from temporalio.client import Client
|
||||||
|
|
||||||
|
from activity_core.event_type_registry import sync_event_types
|
||||||
|
from activity_core.sync_activity_definitions import sync as sync_activity_definitions
|
||||||
|
from activity_core.sync_schedules import ScheduleSyncResult, sync_with_session_factory
|
||||||
|
|
||||||
|
_MAX_ERRORS = 20
|
||||||
|
_MAX_ERROR_MESSAGE_LENGTH = 1000
|
||||||
|
|
||||||
|
|
||||||
|
def _empty_result(
|
||||||
|
*,
|
||||||
|
definitions: bool,
|
||||||
|
schedules: bool,
|
||||||
|
event_types: bool,
|
||||||
|
) -> dict[str, Any]:
|
||||||
|
return {
|
||||||
|
"ok": True,
|
||||||
|
"ran": {
|
||||||
|
"definitions": definitions,
|
||||||
|
"schedules": schedules,
|
||||||
|
"event_types": event_types,
|
||||||
|
},
|
||||||
|
"definitions": {"synced": 0},
|
||||||
|
"event_types": {"synced": 0},
|
||||||
|
"schedules": ScheduleSyncResult().to_dict(),
|
||||||
|
"errors": [],
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _record_error(result: dict[str, Any], stage: str, exc: Exception) -> None:
|
||||||
|
errors = result["errors"]
|
||||||
|
if len(errors) >= _MAX_ERRORS:
|
||||||
|
return
|
||||||
|
errors.append(
|
||||||
|
{
|
||||||
|
"stage": stage,
|
||||||
|
"type": type(exc).__name__,
|
||||||
|
"message": str(exc)[:_MAX_ERROR_MESSAGE_LENGTH],
|
||||||
|
}
|
||||||
|
)
|
||||||
|
result["ok"] = False
|
||||||
|
|
||||||
|
|
||||||
|
async def run_sync(
|
||||||
|
*,
|
||||||
|
session_factory: Any,
|
||||||
|
temporal_client: Client | None,
|
||||||
|
definitions: bool = True,
|
||||||
|
schedules: bool = True,
|
||||||
|
event_types: bool = False,
|
||||||
|
) -> dict[str, Any]:
|
||||||
|
"""Run the requested sync stages and return bounded operator-facing status.
|
||||||
|
|
||||||
|
The orchestration deliberately accepts its database and Temporal
|
||||||
|
dependencies as arguments so startup and the API can share the same behavior
|
||||||
|
without creating another global runtime.
|
||||||
|
"""
|
||||||
|
result = _empty_result(
|
||||||
|
definitions=definitions,
|
||||||
|
schedules=schedules,
|
||||||
|
event_types=event_types,
|
||||||
|
)
|
||||||
|
|
||||||
|
if definitions:
|
||||||
|
try:
|
||||||
|
result["definitions"]["synced"] = await sync_activity_definitions(
|
||||||
|
session_factory
|
||||||
|
)
|
||||||
|
except Exception as exc: # pragma: no cover - exercised through tests
|
||||||
|
_record_error(result, "definitions", exc)
|
||||||
|
|
||||||
|
if event_types:
|
||||||
|
try:
|
||||||
|
result["event_types"]["synced"] = await sync_event_types(session_factory)
|
||||||
|
except Exception as exc: # pragma: no cover - exercised through tests
|
||||||
|
_record_error(result, "event_types", exc)
|
||||||
|
|
||||||
|
if schedules:
|
||||||
|
try:
|
||||||
|
if temporal_client is None:
|
||||||
|
raise RuntimeError("Temporal client is required for schedule sync")
|
||||||
|
schedule_result = await sync_with_session_factory(
|
||||||
|
temporal_client,
|
||||||
|
session_factory,
|
||||||
|
)
|
||||||
|
result["schedules"] = schedule_result.to_dict()
|
||||||
|
except Exception as exc: # pragma: no cover - exercised through tests
|
||||||
|
_record_error(result, "schedules", exc)
|
||||||
|
|
||||||
|
return result
|
||||||
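The "bounded operator-facing status" that `run_sync` returns rests on the two caps applied in `_record_error`. The following standalone sketch reproduces that bounding logic with the same limits; the public names `record_error`, `MAX_ERRORS`, and `MAX_ERROR_MESSAGE_LENGTH` are renamed for illustration, and the loop that feeds it oversized errors is an assumption for the demo.

```python
from __future__ import annotations

from typing import Any

MAX_ERRORS = 20
MAX_ERROR_MESSAGE_LENGTH = 1000


def record_error(result: dict[str, Any], stage: str, exc: Exception) -> None:
    """Append a truncated error record unless the list is already full."""
    errors = result["errors"]
    if len(errors) >= MAX_ERRORS:
        return  # Drop further errors so the payload size stays bounded.
    errors.append(
        {
            "stage": stage,
            "type": type(exc).__name__,
            "message": str(exc)[:MAX_ERROR_MESSAGE_LENGTH],
        }
    )
    result["ok"] = False


# Demo: 25 failures with 5000-character messages are recorded as 20 capped entries.
result: dict[str, Any] = {"ok": True, "errors": []}
for _ in range(25):
    record_error(result, "schedules", RuntimeError("x" * 5000))
```

One consequence of this design is that the status dictionary stays small enough to log or return from an API even when a stage fails repeatedly, at the cost of silently discarding errors past the cap.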
@@ -46,8 +46,7 @@ from activity_core.activities import (
|
|||||||
)
|
)
|
||||||
from activity_core.db import make_engine
|
from activity_core.db import make_engine
|
||||||
from sqlalchemy.ext.asyncio import async_sessionmaker
|
from sqlalchemy.ext.asyncio import async_sessionmaker
|
||||||
from activity_core.sync_activity_definitions import sync as sync_activity_defs
|
from activity_core.sync_service import run_sync
|
||||||
from activity_core.sync_schedules import sync as sync_schedules
|
|
||||||
from activity_core.workflows import RunActivityWorkflow, TaskExecutorWorkflow
|
from activity_core.workflows import RunActivityWorkflow, TaskExecutorWorkflow
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
@@ -77,20 +76,26 @@ async def run() -> None:
|
|||||||
TEMPORAL_HOST, namespace=TEMPORAL_NAMESPACE, runtime=runtime
|
TEMPORAL_HOST, namespace=TEMPORAL_NAMESPACE, runtime=runtime
|
||||||
)
|
)
|
||||||
|
|
||||||
# T45: Sync ActivityDefinition files into DB before schedule sync.
|
logger.info("Syncing ActivityDefinitions and Temporal Schedules...")
|
||||||
logger.info("Syncing ActivityDefinition files...")
|
sync_engine = make_engine(db_url)
|
||||||
|
session_factory = async_sessionmaker(sync_engine, expire_on_commit=False)
|
||||||
try:
|
try:
|
||||||
session_factory = async_sessionmaker(make_engine(db_url), expire_on_commit=False)
|
sync_result = await run_sync(
|
||||||
await sync_activity_defs(session_factory)
|
session_factory=session_factory,
|
||||||
except Exception:
|
temporal_client=client,
|
||||||
logger.exception("activity definition sync failed — continuing worker startup")
|
definitions=True,
|
||||||
|
schedules=True,
|
||||||
# T23: Sync Temporal Schedules with the DB before workers start accepting tasks.
|
event_types=False,
|
||||||
logger.info("Syncing Temporal Schedules with ActivityDefinition DB...")
|
)
|
||||||
try:
|
for error in sync_result["errors"]:
|
||||||
await sync_schedules(client, db_url)
|
logger.error(
|
||||||
except Exception:
|
"startup sync %s failed — %s: %s",
|
||||||
logger.exception("schedule sync failed — continuing worker startup")
|
error["stage"],
|
||||||
|
error["type"],
|
||||||
|
error["message"],
|
||||||
|
)
|
||||||
|
finally:
|
||||||
|
await sync_engine.dispose()
|
||||||
|
|
||||||
orchestrator_worker = Worker(
|
orchestrator_worker = Worker(
|
||||||
client,
|
client,
|
||||||
|
|||||||
@@ -209,11 +209,12 @@ class RunActivityWorkflow:
|
|||||||
|
|
||||||
@workflow.defn
|
@workflow.defn
|
||||||
class TaskExecutorWorkflow:
|
class TaskExecutorWorkflow:
|
||||||
"""Child workflow that executes one concrete task instance.
|
"""Compatibility stub for legacy task-instance workflows.
|
||||||
|
|
||||||
Stub behaviour: persists a task_instances row with status=done and
|
This is not a production execution surface for activity-core. It persists a
|
||||||
returns immediately. Real task execution logic replaces this in a
|
task_instances row with status=done and returns immediately so legacy/dev
|
||||||
later workstream.
|
flows keep their idempotency behavior. Real task execution belongs in
|
||||||
|
per-repo workers or a future execution-owned repo/workplan, not here.
|
||||||
|
|
||||||
task_id is derived deterministically from the workflow's own ID so
|
task_id is derived deterministically from the workflow's own ID so
|
||||||
persist_task_instance retries remain idempotent.
|
persist_task_instance retries remain idempotent.
|
||||||
@@ -221,7 +222,7 @@ class TaskExecutorWorkflow:
|
|||||||
|
|
||||||
@workflow.run
|
@workflow.run
|
||||||
async def run(self, run_id: str, task_type: str, params: dict) -> dict:
|
async def run(self, run_id: str, task_type: str, params: dict) -> dict:
|
||||||
# Derive a stable task_id from this workflow's own ID.
|
# Keep the stub idempotent without implying task lifecycle ownership.
|
||||||
task_id = str(
|
task_id = str(
|
||||||
uuid.uuid5(uuid.NAMESPACE_URL, workflow.info().workflow_id)
|
uuid.uuid5(uuid.NAMESPACE_URL, workflow.info().workflow_id)
|
||||||
)
|
)
|
||||||
|
|||||||
5
tests/fixtures/wp0016/daily_triage_2026-06-26_validation_failure.partial.json
vendored
Normal file
@@ -0,0 +1,5 @@
|
|||||||
|
{
|
||||||
|
"_note": "PARTIAL 4000-char preview of the 2026-06-26 daily-triage validation failure (retry attempt). Full payload not recoverable from activity-core: complete() drops finish_reason; report sink caps raw at 4000 chars; the JSON break is at char 5268 (beyond this preview). Full response would require llm-connect producer-side logs on railiance01.",
|
||||||
|
"validation_error": "Expecting ',' delimiter: line 136 column 22 (char 5268)",
|
||||||
|
"raw_output_preview": "{\n \"summary\": \"Triage report focusing on high-priority workstreams with pending human intervention or critical dependencies, and addressing recently cleared dependencies to unblock progress.\",\n \"recommendations\": [\n {\n \"rank\": 1,\n \"candidate\": \"2731fece-6c49-45b8-ab8a-4ea6c04ac603\",\n \"action\": \"work-next\",\n \"why\": \"A critical dependency (T03 - Configure bounded OpenBao token roles and policies) for this workstream has been cleared, unblocking significant progress on credential management. This workstream has 8 todo tasks and no waits, indicating it's ready for immediate action.\",\n \"confidence\": \"high\",\n \"wsjf\": {\n \"score\": 5.0,\n \"strategic_value\": 5,\n \"time_criticality\": 5,\n \"risk_reduction\": 4,\n \"opportunity_enablement\": 5,\n \"job_size\": 4\n }\n },\n {\n \"rank\": 2,\n \"candidate\": \"bd086c41-287d-4a4e-8ac5-9ab270f14d72\",\n \"action\": \"needs-human\",\n \"why\": \"This high-priority workstream has a 'needs_human' task (T04 - Provision the runtime API key outside Git) and is currently blocked by 3 'wait' tasks. Human intervention is required to unblock progress.\",\n \"confidence\": \"high\",\n \"wsjf\": {\n \"score\": 4.7,\n \"strategic_value\": 5,\n \"time_criticality\": 4,\n \"risk_reduction\": 5,\n \"opportunity_enablement\": 4,\n \"job_size\": 3\n }\n },\n {\n \"rank\": 3,\n \"candidate\": \"9b56414a-c71f-4e72-9b2b-d2166aaf50d0\",\n \"action\": \"needs-human\",\n \"why\": \"This high-priority workstream has a 'needs_human' task (Task: Execute Live Ops-Hub Bootstrap) and is currently blocked by a 'wait' task. 
Human intervention is required to proceed with the bootstrap.\",\n \"confidence\": \"high\",\n \"wsjf\": {\n \"score\": 4.7,\n \"strategic_value\": 5,\n \"time_criticality\": 4,\n \"risk_reduction\": 5,\n \"opportunity_enablement\": 4,\n \"job_size\": 3\n }\n },\n {\n \"rank\": 4,\n \"candidate\": \"84e17675-0d15-4268-a8bd-540124d37018\",\n \"action\": \"needs-human\",\n \"why\": \"This workstream has 4 'needs_human' tasks, including 'T02 \u2014 Resolve Forgejo production design decisions', indicating significant human input is required to move forward with the migration.\",\n \"confidence\": \"high\",\n \"wsjf\": {\n \"score\": 4.0,\n \"strategic_value\": 4,\n \"time_criticality\": 4,\n \"risk_reduction\": 4,\n \"opportunity_enablement\": 4,\n \"job_size\": 4\n }\n },\n {\n \"rank\": 5,\n \"candidate\": \"5646e13a-13af-4724-bca6-3c0d86f96733\",\n \"action\": \"needs-human\",\n \"why\": \"This workstream has a 'needs_human' task ('Three-Run Calibration Feedback') and is currently in a 'wait' state. Human feedback is crucial for operational hardening.\",\n \"confidence\": \"medium\",\n \"wsjf\": {\n \"score\": 3.7,\n \"strategic_value\": 4,\n \"time_criticality\": 3,\n \"risk_reduction\": 4,\n \"opportunity_enablement\": 4,\n \"job_size\": 4\n }\n },\n {\n \"rank\": 6,\n \"candidate\": \"896ace77-21b3-450b-8fb7-254aefc8c570\",\n \"action\": \"close-out\",\n \"why\": \"The task 'Wire activity-core to the live service' has been resolved, and the workstream shows 2 progress tasks with 0 todo/wait tasks. 
This indicates the deployment is likely complete or nearing completion and ready for close-out after verification.\",\n \"confidence\": \"high\",\n \"wsjf\": {\n \"score\": 3.7,\n \"strategic_value\": 4,\n \"time_criticality\": 3,\n \"risk_reduction\": 4,\n \"opportunity_enablement\": 4,\n \"job_size\": 4\n }\n },\n {\n \"rank\": 7,\n \"candidate\": \"656e435d-3a00-4f5e-a38e-114467f9062e\",\n \"action\": \"work-next\",\n \"why\": \"This high-priority workstream has a single 'wait' task ('Task: Activate Ops-Hub Widgets In Inter-Hub') and no 'needs_human' tasks. It appears ready for the next step to activate the widgets.\",\n \"confidence\": \"medium\",\n \"wsjf"
|
||||||
|
}
|
||||||
@@ -88,6 +88,43 @@ def test_for_each_binds_each_list_item_before_condition_and_action_rendering() -
|
|||||||
]
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def test_for_each_can_gate_registry_hygiene_gaps_on_signal() -> None:
|
||||||
|
rules = [
|
||||||
|
{
|
||||||
|
"id": "flag-registry-hygiene-gap",
|
||||||
|
"for_each": "context.gaps",
|
||||||
|
"bind_as": "g",
|
||||||
|
"condition": 'context.g.hygiene_signal != ""',
|
||||||
|
"action": {
|
||||||
|
"task_template": "Close registry hygiene gap for {context.g.repo}",
|
||||||
|
"target_repo": "context.g.repo",
|
||||||
|
"priority": "medium",
|
||||||
|
"labels": ["registry-hygiene", "{context.g.hygiene_signal}"],
|
||||||
|
},
|
||||||
|
}
|
||||||
|
]
|
||||||
|
context = {
|
||||||
|
"gaps": [
|
||||||
|
{
|
||||||
|
"repo": "reuse-surface",
|
||||||
|
"hygiene_signal": "empty_capability_scaffold",
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"repo": "activity-core",
|
||||||
|
"hygiene_signal": "",
|
||||||
|
},
|
||||||
|
]
|
||||||
|
}
|
||||||
|
|
||||||
|
specs = expand_rule_actions(rules, _Event(), context)
|
||||||
|
|
||||||
|
assert [spec["target_repo"] for spec in specs] == ["reuse-surface"]
|
||||||
|
assert specs[0]["labels"] == [
|
||||||
|
"registry-hygiene",
|
||||||
|
"empty_capability_scaffold",
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
def test_for_each_rejects_non_path_expression() -> None:
|
def test_for_each_rejects_non_path_expression() -> None:
|
||||||
rules = [
|
rules = [
|
||||||
{
|
{
|
||||||
|
|||||||
@@ -12,6 +12,7 @@ Covers:
|
|||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import json
|
import json
|
||||||
|
from pathlib import Path
|
||||||
from types import SimpleNamespace
|
from types import SimpleNamespace
|
||||||
from typing import Any
|
from typing import Any
|
||||||
|
|
||||||
@@ -333,7 +334,14 @@ def test_execute_instruction_forwards_output_schema_to_llm_connect(tmp_path, mon
|
|||||||
def test_execute_instruction_with_audit_accepts_report_payload():
|
def test_execute_instruction_with_audit_accepts_report_payload():
|
||||||
report_data = {
|
report_data = {
|
||||||
"summary": "State Hub has loose ends.",
|
"summary": "State Hub has loose ends.",
|
||||||
"recommendations": [{"action": "revisit", "candidate": "CUST-WP-0045"}],
|
"recommendations": [
|
||||||
|
{
|
||||||
|
"rank": 1,
|
||||||
|
"action": "revisit",
|
||||||
|
"candidate": "CUST-WP-0045",
|
||||||
|
"why": "Loose ends need attention.",
|
||||||
|
}
|
||||||
|
],
|
||||||
}
|
}
|
||||||
llm = _CountingLLM([json.dumps(report_data)])
|
llm = _CountingLLM([json.dumps(report_data)])
|
||||||
instr = _instr(
|
instr = _instr(
|
||||||
@@ -353,7 +361,14 @@ def test_execute_instruction_with_audit_accepts_report_payload():
|
|||||||
def test_execute_instruction_with_audit_accepts_fenced_report_payload():
|
def test_execute_instruction_with_audit_accepts_fenced_report_payload():
|
||||||
report_data = {
|
report_data = {
|
||||||
"summary": "State Hub has loose ends.",
|
"summary": "State Hub has loose ends.",
|
||||||
"recommendations": [{"action": "revisit", "candidate": "CUST-WP-0045"}],
|
"recommendations": [
|
||||||
|
{
|
||||||
|
"rank": 1,
|
||||||
|
"action": "revisit",
|
||||||
|
"candidate": "CUST-WP-0045",
|
||||||
|
"why": "Loose ends need attention.",
|
||||||
|
}
|
||||||
|
],
|
||||||
}
|
}
|
||||||
llm = _CountingLLM([f"```json\n{json.dumps(report_data)}\n```"])
|
llm = _CountingLLM([f"```json\n{json.dumps(report_data)}\n```"])
|
||||||
instr = _instr(
|
instr = _instr(
|
||||||
@@ -389,6 +404,175 @@ def test_execute_instruction_with_audit_rejects_invalid_report_schema():
|
|||||||
assert llm.call_count == 2
|
assert llm.call_count == 2
|
||||||
|
|
||||||
|
|
||||||
|
# ── WP-0016-T03 resilient report recovery ─────────────────────────────────────
|
||||||
|
|
||||||
|
def _valid_rec(rank: int) -> dict[str, Any]:
|
||||||
|
return {
|
||||||
|
"rank": rank,
|
||||||
|
"candidate": f"WS-{rank}",
|
||||||
|
"action": "work-next",
|
||||||
|
"why": f"reason {rank}",
|
||||||
|
"wsjf": {"score": 5.0},
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _pretty_triage_with_truncated_tail(num_valid: int) -> str:
|
||||||
|
body = ",\n".join(" " + json.dumps(_valid_rec(i)) for i in range(1, num_valid + 1))
|
||||||
|
# Trailing object is cut off mid-string — the whole document is invalid JSON,
|
||||||
|
# reproducing the 2026-06-26 failure shape (valid prefix, broken tail).
|
||||||
|
return (
|
||||||
|
'{\n "summary": "Daily triage.",\n "recommendations": [\n'
|
||||||
|
+ body
|
||||||
|
+ ',\n {\n "rank": '
|
||||||
|
+ str(num_valid + 1)
|
||||||
|
+ ',\n "candidate": "WS-X",\n "action": "work-'
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def test_resilient_report_recovers_valid_prefix_and_quarantines_truncated_tail():
|
||||||
|
raw = _pretty_triage_with_truncated_tail(7)
|
||||||
|
llm = _CountingLLM([raw, raw])
|
||||||
|
instr = _instr(
|
||||||
|
id="daily-triage-report",
|
||||||
|
prompt="Report.",
|
||||||
|
trusted_fields=[],
|
||||||
|
output_schema="schemas/daily-triage-report.json",
|
||||||
|
report_sinks=[{"type": "working-memory"}],
|
||||||
|
)
|
||||||
|
|
||||||
|
result = execute_instruction_with_audit(instr, _Event(), {}, llm)
|
||||||
|
|
||||||
|
assert result.output_validated is True
|
||||||
|
assert result.review_required is True
|
||||||
|
assert result.report is not None
|
||||||
|
assert result.report["partial"] is True
|
||||||
|
assert len(result.report["recommendations"]) == 7
|
||||||
|
assert result.report["summary"] == "Daily triage."
|
||||||
|
assert result.report["quarantined_count"] >= 1
|
||||||
|
# The broken tail is dropped — either as an unparseable/truncated span or,
|
||||||
|
# if _try_repair salvages its structure, as a schema-invalid item. Either way
|
||||||
|
# it carries a diagnostic error and never pollutes the surviving report.
|
||||||
|
assert result.report["quarantined_items"][0]["error"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_resilient_report_quarantines_one_bad_item_among_valid():
|
||||||
|
recs = [_valid_rec(1), {"candidate": "WS-2", "action": "x", "why": "no rank"}, _valid_rec(3)]
|
||||||
|
raw = json.dumps({"summary": "Triage.", "recommendations": recs})
|
||||||
|
llm = _CountingLLM([raw, raw])
|
||||||
|
instr = _instr(
|
||||||
|
id="daily-triage-report",
|
||||||
|
prompt="Report.",
|
||||||
|
trusted_fields=[],
|
||||||
|
output_schema="schemas/daily-triage-report.json",
|
||||||
|
report_sinks=[{"type": "working-memory"}],
|
||||||
|
)
|
||||||
|
|
||||||
|
result = execute_instruction_with_audit(instr, _Event(), {}, llm)
|
||||||
|
|
||||||
|
assert result.output_validated is True
|
||||||
|
assert result.report["partial"] is True
|
||||||
|
assert len(result.report["recommendations"]) == 2
|
||||||
|
assert result.report["quarantined_count"] == 1
|
||||||
|
assert "rank" in result.report["quarantined_items"][0]["error"]
|
||||||
|
|
||||||
|
|
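The two recovery tests above describe keeping the valid prefix of a truncated report and quarantining the broken tail. A minimal sketch of that idea follows, assuming the report is a JSON object with a `recommendations` array. `recover_recommendations` is a hypothetical helper written for illustration; it is not the project's recovery code and it skips the per-item schema validation the tests also exercise.

```python
import json
from typing import Any


def recover_recommendations(raw: str) -> tuple[list[Any], int]:
    """Salvage intact array items from a possibly truncated JSON report.

    Returns (intact_items, quarantined_count). Hypothetical helper, not the
    implementation under test.
    """
    decoder = json.JSONDecoder()
    key = raw.find('"recommendations"')
    pos = raw.find("[", key) if key != -1 else -1
    if pos == -1:
        return [], 0
    pos += 1  # step past the opening bracket
    items: list[Any] = []
    while True:
        # raw_decode does not skip leading whitespace, so skip it (and commas) here.
        while pos < len(raw) and raw[pos] in " \t\r\n,":
            pos += 1
        if pos >= len(raw) or raw[pos] == "]":
            # Clean end of the array, or input that stops between two items.
            return items, 0
        try:
            item, pos = decoder.raw_decode(raw, pos)
        except json.JSONDecodeError:
            # The tail is cut off mid-item: keep the prefix, count one fragment.
            return items, 1
        items.append(item)
```

Decoding item by item means a failure in the last element cannot invalidate the elements already parsed, which is the property the `partial` and `quarantined_count` assertions rely on.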
||||||
|
# ── WP-0016-T04 producer guardrails ───────────────────────────────────────────
|
||||||
|
|
||||||
|
def _triage_instr() -> SimpleNamespace:
|
||||||
|
return _instr(
|
||||||
|
id="daily-triage-report",
|
||||||
|
prompt="Report.",
|
||||||
|
trusted_fields=[],
|
||||||
|
output_schema="schemas/daily-triage-report.json",
|
||||||
|
report_sinks=[{"type": "working-memory"}],
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def test_guardrail_count_cap_on_valid_happy_path():
|
||||||
|
# 9 fully-valid recommendations in a syntactically valid document: schema
|
||||||
|
# validation passes, but the maxItems=7 count cap must keep 7 and quarantine 2.
|
||||||
|
recs = [_valid_rec(i) for i in range(1, 10)]
|
||||||
|
raw = json.dumps({"summary": "Triage.", "recommendations": recs})
|
||||||
|
llm = _CountingLLM([raw])
|
||||||
|
|
||||||
|
result = execute_instruction_with_audit(_triage_instr(), _Event(), {}, llm)
|
||||||
|
|
||||||
|
assert llm.call_count == 1 # no retry — the document was valid
|
||||||
|
assert result.report["partial"] is True
|
||||||
|
assert len(result.report["recommendations"]) == 7
|
||||||
|
assert result.report["quarantined_count"] == 2
|
||||||
|
assert all(q["reason"] == "over_limit" for q in result.report["quarantined_items"])
|
||||||
|
|
||||||
|
|
||||||
|
def test_guardrail_oversized_string_quarantined():
|
||||||
|
big = _valid_rec(2)
|
||||||
|
big["why"] = "x" * 5000 # exceeds _MAX_STRING_LEN
|
||||||
|
raw = json.dumps({"summary": "Triage.", "recommendations": [_valid_rec(1), big]})
|
||||||
|
llm = _CountingLLM([raw])
|
||||||
|
|
||||||
|
result = execute_instruction_with_audit(_triage_instr(), _Event(), {}, llm)
|
||||||
|
|
||||||
|
assert len(result.report["recommendations"]) == 1
|
||||||
|
assert result.report["quarantined_count"] == 1
|
||||||
|
assert result.report["quarantined_items"][0]["reason"] == "guardrail"
|
||||||
|
|
||||||
|
|
||||||
|
def test_guardrail_allow_list_rejects_unknown_candidate():
|
||||||
|
raw = json.dumps({
|
||||||
|
"summary": "Triage.",
|
||||||
|
"recommendations": [_valid_rec(1), _valid_rec(2)], # candidates WS-1, WS-2
|
||||||
|
})
|
||||||
|
llm = _CountingLLM([raw])
|
||||||
|
context = {"known_candidates": ["WS-1"]}
|
||||||
|
|
||||||
|
result = execute_instruction_with_audit(_triage_instr(), _Event(), context, llm)
|
||||||
|
|
||||||
|
assert len(result.report["recommendations"]) == 1
|
||||||
|
assert result.report["recommendations"][0]["candidate"] == "WS-1"
|
||||||
|
assert result.report["quarantined_items"][0]["reason"] == "allow_list"
|
||||||
|
|
||||||
|
|
||||||
|
def _nested(depth: int) -> dict[str, Any]:
|
||||||
|
node: dict[str, Any] = {"leaf": 1}
|
||||||
|
for _ in range(depth):
|
||||||
|
node = {"a": node}
|
||||||
|
return node
|
||||||
|
|
||||||
|
|
||||||
|
def test_guardrail_over_depth_quarantined():
|
||||||
|
deep = _valid_rec(2)
|
||||||
|
deep["extra"] = _nested(12) # well past _MAX_DEPTH
|
||||||
|
raw = json.dumps({"summary": "Triage.", "recommendations": [_valid_rec(1), deep]})
|
||||||
|
llm = _CountingLLM([raw])
|
||||||
|
|
||||||
|
result = execute_instruction_with_audit(_triage_instr(), _Event(), {}, llm)
|
||||||
|
|
||||||
|
assert len(result.report["recommendations"]) == 1
|
||||||
|
assert result.report["quarantined_count"] == 1
|
||||||
|
assert result.report["quarantined_items"][0]["reason"] == "guardrail"
|
||||||
|
assert "depth" in result.report["quarantined_items"][0]["error"]
|
||||||
|
|
||||||
|
|
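The guardrail tests above check four outcomes: a count cap, an oversized string, an unknown candidate, and excessive nesting. The sketch below shows one way such checks could be combined. The reason strings (`over_limit`, `guardrail`, `allow_list`) and the cap of 7 come from the tests; the values of `MAX_STRING_LEN` and `MAX_DEPTH`, the order of the checks, and the function names are assumptions, since the real `_MAX_STRING_LEN` and `_MAX_DEPTH` are not shown in this diff.

```python
from typing import Any, Iterable, Optional

MAX_ITEMS = 7          # the maxItems=7 cap named in the test comment
MAX_STRING_LEN = 2000  # assumed value; the real _MAX_STRING_LEN is not shown
MAX_DEPTH = 8          # assumed value; the real _MAX_DEPTH is not shown


def nesting_depth(value: Any) -> int:
    """Count container levels: a scalar is 0, a flat dict or list is 1."""
    if isinstance(value, dict):
        return 1 + max((nesting_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((nesting_depth(v) for v in value), default=0)
    return 0


def longest_string(value: Any) -> int:
    """Length of the longest string found anywhere inside value."""
    if isinstance(value, str):
        return len(value)
    if isinstance(value, dict):
        return max((longest_string(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return max((longest_string(v) for v in value), default=0)
    return 0


def apply_guardrails(
    recs: list[dict[str, Any]],
    known_candidates: Optional[Iterable[str]] = None,
) -> tuple[list[dict[str, Any]], list[dict[str, str]]]:
    """Split recommendations into (kept, quarantined). Hypothetical helper."""
    allowed = set(known_candidates) if known_candidates is not None else None
    kept: list[dict[str, Any]] = []
    quarantined: list[dict[str, str]] = []
    for rec in recs:
        if longest_string(rec) > MAX_STRING_LEN:
            quarantined.append({"reason": "guardrail", "error": "string exceeds length limit"})
        elif nesting_depth(rec) > MAX_DEPTH:
            quarantined.append({"reason": "guardrail", "error": "depth exceeds limit"})
        elif allowed is not None and rec.get("candidate") not in allowed:
            quarantined.append({"reason": "allow_list", "error": "unknown candidate"})
        elif len(kept) >= MAX_ITEMS:
            quarantined.append({"reason": "over_limit", "error": "too many recommendations"})
        else:
            kept.append(rec)
    return kept, quarantined
```

Applying the count cap last means an item rejected for another reason does not use up one of the seven slots; whether the code under test makes the same choice is not visible here.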
||||||
|
def test_resilient_recovery_against_real_2026_06_26_fixture():
|
||||||
|
# The actual captured failure payload (4000-char preview, truncated at the 7th
|
||||||
|
# recommendation) — the run that reset the WP-0006-T03 streak. Before WP-0016
|
||||||
|
# this discarded the whole report; now it must recover the valid prefix.
|
||||||
|
fixture = json.loads(
|
||||||
|
Path("tests/fixtures/wp0016/daily_triage_2026-06-26_validation_failure.partial.json")
|
||||||
|
.read_text(encoding="utf-8")
|
||||||
|
)
|
||||||
|
raw = fixture["raw_output_preview"]
|
||||||
|
llm = _CountingLLM([raw, raw])
|
||||||
|
|
||||||
|
result = execute_instruction_with_audit(_triage_instr(), _Event(), {}, llm)
|
||||||
|
|
||||||
|
assert result.output_validated is True
|
||||||
|
assert result.report["partial"] is True
|
||||||
|
# Six recommendations are fully intact before the truncation point.
|
||||||
|
assert len(result.report["recommendations"]) >= 6
|
||||||
|
assert all("rank" in rec and "candidate" in rec for rec in result.report["recommendations"])
|
||||||
|
|
||||||
|
|
||||||
def test_execute_instruction_with_audit_preserves_invalid_report_with_sinks(
|
def test_execute_instruction_with_audit_preserves_invalid_report_with_sinks(
|
||||||
tmp_path,
|
tmp_path,
|
||||||
monkeypatch,
|
monkeypatch,
|
||||||
|
|||||||
114
tests/test_admin_sync_api.py
Normal file
@@ -0,0 +1,114 @@
|
|||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
|
||||||
|
from activity_core import api
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_admin_sync_definitions_only_does_not_require_temporal(
|
||||||
|
monkeypatch,
|
||||||
|
) -> None:
|
||||||
|
seen: dict[str, Any] = {}
|
||||||
|
|
||||||
|
async def fake_run_sync(**kwargs: Any) -> dict[str, Any]:
|
||||||
|
seen.update(kwargs)
|
||||||
|
return {"ok": True, "ran": {"definitions": True}}
|
||||||
|
|
||||||
|
monkeypatch.setattr(api, "_session_factory", object())
|
||||||
|
monkeypatch.setattr(api, "_temporal_client", None)
|
||||||
|
monkeypatch.setattr(api, "run_sync", fake_run_sync)
|
||||||
|
|
||||||
|
result = await api.admin_sync(
|
||||||
|
definitions=True,
|
||||||
|
schedules=False,
|
||||||
|
event_types=False,
|
||||||
|
)
|
||||||
|
|
||||||
|
assert result == {"ok": True, "ran": {"definitions": True}}
|
||||||
|
assert seen["session_factory"] is api._session_factory
|
||||||
|
assert seen["temporal_client"] is None
|
||||||
|
assert seen["definitions"] is True
|
||||||
|
assert seen["schedules"] is False
|
||||||
|
assert seen["event_types"] is False
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_admin_sync_schedules_only_passes_temporal(monkeypatch) -> None:
|
||||||
|
temporal = object()
|
||||||
|
seen: dict[str, Any] = {}
|
||||||
|
|
||||||
|
async def fake_run_sync(**kwargs: Any) -> dict[str, Any]:
|
||||||
|
seen.update(kwargs)
|
||||||
|
return {
|
||||||
|
"ok": True,
|
||||||
|
"schedules": {
|
||||||
|
"upserted": 1,
|
||||||
|
"paused": 0,
|
||||||
|
"deleted_orphans": 0,
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
monkeypatch.setattr(api, "_session_factory", object())
|
||||||
|
monkeypatch.setattr(api, "_temporal_client", temporal)
|
||||||
|
monkeypatch.setattr(api, "run_sync", fake_run_sync)
|
||||||
|
|
||||||
|
result = await api.admin_sync(
|
||||||
|
definitions=False,
|
||||||
|
schedules=True,
|
||||||
|
event_types=False,
|
||||||
|
)
|
||||||
|
|
||||||
|
assert result["schedules"]["upserted"] == 1
|
||||||
|
assert seen["temporal_client"] is temporal
|
||||||
|
assert seen["definitions"] is False
|
||||||
|
assert seen["schedules"] is True
|
||||||
|
assert seen["event_types"] is False
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_admin_sync_all_sync_returns_failure_result(monkeypatch) -> None:
|
||||||
|
async def fake_run_sync(**kwargs: Any) -> dict[str, Any]:
|
||||||
|
return {
|
||||||
|
"ok": False,
|
||||||
|
"ran": {
|
||||||
|
"definitions": kwargs["definitions"],
|
||||||
|
"schedules": kwargs["schedules"],
|
||||||
|
"event_types": kwargs["event_types"],
|
||||||
|
},
|
||||||
|
"errors": [
|
||||||
|
{
|
||||||
|
"stage": "event_types",
|
||||||
|
"type": "RuntimeError",
|
||||||
|
"message": "bad event type",
|
||||||
|
}
|
||||||
|
],
|
||||||
|
}
|
||||||
|
|
||||||
|
monkeypatch.setattr(api, "_session_factory", object())
|
||||||
|
monkeypatch.setattr(api, "_temporal_client", object())
|
||||||
|
monkeypatch.setattr(api, "run_sync", fake_run_sync)
|
||||||
|
|
||||||
|
result = await api.admin_sync(
|
||||||
|
definitions=True,
|
||||||
|
schedules=True,
|
||||||
|
event_types=True,
|
||||||
|
)
|
||||||
|
|
||||||
|
assert result == {
|
||||||
|
"ok": False,
|
||||||
|
"ran": {
|
||||||
|
"definitions": True,
|
||||||
|
"schedules": True,
|
||||||
|
"event_types": True,
|
||||||
|
},
|
||||||
|
"errors": [
|
||||||
|
{
|
||||||
|
"stage": "event_types",
|
||||||
|
"type": "RuntimeError",
|
||||||
|
"message": "bad event type",
|
||||||
|
}
|
||||||
|
],
|
||||||
|
}
|
||||||
@@ -1,6 +1,7 @@
|
|||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import json
|
import json
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
|
|
||||||
@@ -70,7 +71,14 @@ async def test_evaluate_instructions_returns_task_specs_with_audit(monkeypatch)
|
|||||||
async def test_evaluate_instructions_returns_report_payload(monkeypatch) -> None:
|
async def test_evaluate_instructions_returns_report_payload(monkeypatch) -> None:
|
||||||
llm = FakeLLMClient(json.dumps({
|
llm = FakeLLMClient(json.dumps({
|
||||||
"summary": "State Hub has open loose ends.",
|
"summary": "State Hub has open loose ends.",
|
||||||
"recommendations": [{"candidate": "CUST-WP-0045", "action": "work-next"}],
|
"recommendations": [
|
||||||
|
{
|
||||||
|
"rank": 1,
|
||||||
|
"candidate": "CUST-WP-0045",
|
||||||
|
"action": "work-next",
|
||||||
|
"why": "Open loose ends.",
|
||||||
|
}
|
||||||
|
],
|
||||||
}))
|
}))
|
||||||
monkeypatch.setattr(activities, "get_llm_client", lambda: llm)
|
monkeypatch.setattr(activities, "get_llm_client", lambda: llm)
|
||||||
|
|
||||||
@@ -209,6 +217,12 @@ async def test_evaluate_instructions_forwards_llm_connect_depth_config(monkeypat
|
|||||||
"context": {},
|
"context": {},
|
||||||
})
|
})
|
||||||
|
|
||||||
|
# Read the live schema file rather than hard-coding it, so the forwarded
|
||||||
|
# json_schema assertion tracks schemas/daily-triage-report.json as the
|
||||||
|
# contract evolves (ACTIVITY-WP-0016-T02).
|
||||||
|
expected_schema = json.loads(
|
||||||
|
Path("schemas/daily-triage-report.json").read_text(encoding="utf-8")
|
||||||
|
)
|
||||||
assert llm.calls[0][2] == {
|
assert llm.calls[0][2] == {
|
||||||
"model_name": "custodian-triage-balanced",
|
"model_name": "custodian-triage-balanced",
|
||||||
"temperature": 0.2,
|
"temperature": 0.2,
|
||||||
@@ -216,16 +230,6 @@ async def test_evaluate_instructions_forwards_llm_connect_depth_config(monkeypat
|
|||||||
"max_depth": 2,
|
"max_depth": 2,
|
||||||
"model_params": {
|
"model_params": {
|
||||||
"reasoning_effort": "medium",
|
"reasoning_effort": "medium",
|
||||||
"json_schema": {
|
"json_schema": expected_schema,
|
||||||
"type": "object",
|
|
||||||
"required": ["summary", "recommendations"],
|
|
||||||
"properties": {
|
|
||||||
"summary": {"type": "string"},
|
|
||||||
"recommendations": {
|
|
||||||
"type": "array",
|
|
||||||
"items": {"type": "object"},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
},
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -34,7 +34,7 @@ def test_issue_core_rest_sink_posts_task_contract(monkeypatch) -> None:
|
|||||||
|
|
||||||
monkeypatch.setattr(httpx, "post", fake_post)
|
monkeypatch.setattr(httpx, "post", fake_post)
|
||||||
|
|
||||||
ref = IssueCoreRestSink("http://issue-core.test/").emit(TaskSpec(
|
ref = IssueCoreRestSink("http://issue-core.test/", api_key="test-key").emit(TaskSpec(
|
||||||
title="Run SBOM rescan for activity-core",
|
title="Run SBOM rescan for activity-core",
|
||||||
description="SBOM is older than 30 days.",
|
description="SBOM is older than 30 days.",
|
||||||
target_repo="activity-core",
|
target_repo="activity-core",
|
||||||
@@ -67,9 +67,28 @@ def test_issue_core_rest_sink_posts_task_contract(monkeypatch) -> None:
|
|||||||
"triggering_event_id": "scheduled",
|
"triggering_event_id": "scheduled",
|
||||||
"activity_definition_id": "activity-1",
|
"activity_definition_id": "activity-1",
|
||||||
},
|
},
|
||||||
|
"headers": {"Authorization": "Bearer test-key"},
|
||||||
"timeout": 10.0,
|
"timeout": 10.0,
|
||||||
}
|
}
|
||||||
]
|
]
|
||||||
|
assert "review_required" not in posts[0]["json"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_issue_core_rest_sink_requires_api_key() -> None:
|
||||||
|
sink = IssueCoreRestSink("http://issue-core.test/", api_key="")
|
||||||
|
with pytest.raises(RuntimeError, match="ISSUE_CORE_API_KEY"):
|
||||||
|
sink.emit(TaskSpec(
|
||||||
|
title="t",
|
||||||
|
description="",
|
||||||
|
target_repo="activity-core",
|
||||||
|
priority="low",
|
||||||
|
labels=[],
|
||||||
|
due_in_days=None,
|
||||||
|
source_type="rule",
|
||||||
|
source_id="r",
|
||||||
|
triggering_event_id="e",
|
||||||
|
activity_definition_id="a",
|
||||||
|
))
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
|
@pytest.mark.asyncio
|
||||||
|
|||||||
195
tests/test_kaizen_context_resolver.py
Normal file
@@ -0,0 +1,195 @@
|
|||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
import httpx
|
||||||
|
import pytest
|
||||||
|
import yaml
|
||||||
|
|
||||||
|
from activity_core.context_resolvers.kaizen import (
|
||||||
|
KaizenContextResolver,
|
||||||
|
discover_kaizen_scheduled_repos,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class DummyResponse:
|
||||||
|
def __init__(self, payload: Any, status_error: Exception | None = None) -> None:
|
||||||
|
self.payload = payload
|
||||||
|
self.status_error = status_error
|
||||||
|
|
||||||
|
def raise_for_status(self) -> None:
|
||||||
|
if self.status_error is not None:
|
||||||
|
raise self.status_error
|
||||||
|
|
||||||
|
def json(self) -> Any:
|
||||||
|
return self.payload
|
||||||
|
|
||||||
|
|
||||||
|
def _write_schedule(path: Path, agents: dict[str, Any]) -> None:
|
||||||
|
path.parent.mkdir(parents=True, exist_ok=True)
|
||||||
|
path.write_text(
|
||||||
|
yaml.safe_dump(
|
||||||
|
{"version": "1", "timezone": "Europe/Berlin", "agents": agents},
|
||||||
|
sort_keys=False,
|
||||||
|
),
|
||||||
|
encoding="utf-8",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def test_discover_scheduled_repos_emits_enabled_coach(tmp_path, monkeypatch) -> None:
|
||||||
|
repo_root = tmp_path / "pilot-repo"
|
||||||
|
repo_root.mkdir()
|
||||||
|
_write_schedule(
|
||||||
|
repo_root / ".kaizen" / "schedule.yml",
|
||||||
|
{"coach": {"cadence": "daily", "cron": "15 * * * *", "enabled": True}},
|
||||||
|
)
|
||||||
|
|
||||||
|
def fake_get(url: str, **kwargs: Any) -> DummyResponse:
|
||||||
|
return DummyResponse(
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"slug": "pilot-repo",
|
||||||
|
"domain_slug": "custodian",
|
||||||
|
"host_paths": {"testhost": str(repo_root)},
|
||||||
|
}
|
||||||
|
]
|
||||||
|
)
|
||||||
|
|
||||||
|
monkeypatch.setenv("STATE_HUB_URL", "http://hub.test")
|
||||||
|
monkeypatch.setenv("KAIZEN_RUNNER_HOST", "testhost")
|
||||||
|
monkeypatch.setattr(httpx, "get", fake_get)
|
||||||
|
|
||||||
|
result = discover_kaizen_scheduled_repos({})
|
||||||
|
|
||||||
|
assert len(result["scheduled_runs"]) == 1
|
||||||
|
run = result["scheduled_runs"][0]
|
||||||
|
assert run["repo"] == "pilot-repo"
|
||||||
|
assert run["agent"] == "coach"
|
||||||
|
assert run["enabled"] is True
|
||||||
|
assert "schedule prepare coach" in run["prepare_command"]
|
||||||
|
|
||||||
|
|
||||||
|
def test_discover_scheduled_repos_skips_disabled_coach(tmp_path, monkeypatch) -> None:
|
||||||
|
repo_root = tmp_path / "pilot-repo"
|
||||||
|
repo_root.mkdir()
|
||||||
|
_write_schedule(
|
||||||
|
repo_root / ".kaizen" / "schedule.yml",
|
||||||
|
{"coach": {"cadence": "daily", "enabled": False}},
|
||||||
|
)
|
||||||
|
|
||||||
|
monkeypatch.setenv("STATE_HUB_URL", "http://hub.test")
|
||||||
|
monkeypatch.setenv("KAIZEN_RUNNER_HOST", "testhost")
|
||||||
|
monkeypatch.setattr(
|
||||||
|
httpx,
|
||||||
|
"get",
|
||||||
|
lambda url, **kwargs: DummyResponse(
|
||||||
|
[{"slug": "pilot-repo", "host_paths": {"testhost": str(repo_root)}}]
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
result = discover_kaizen_scheduled_repos({})
|
||||||
|
assert result["scheduled_runs"] == []
|
||||||
|
|
||||||
|
|
||||||
|
def test_discover_scheduled_repos_skips_missing_schedule(tmp_path, monkeypatch) -> None:
|
||||||
|
repo_root = tmp_path / "no-schedule"
|
||||||
|
repo_root.mkdir()
|
||||||
|
|
||||||
|
monkeypatch.setenv("STATE_HUB_URL", "http://hub.test")
|
||||||
|
monkeypatch.setenv("KAIZEN_RUNNER_HOST", "testhost")
|
||||||
|
monkeypatch.setattr(
|
||||||
|
httpx,
|
||||||
|
"get",
|
||||||
|
lambda url, **kwargs: DummyResponse(
|
||||||
|
[{"slug": "no-schedule", "host_paths": {"testhost": str(repo_root)}}]
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
result = discover_kaizen_scheduled_repos({})
|
||||||
|
assert result["scheduled_runs"] == []
|
||||||
|
|
||||||
|
|
||||||
|
def test_discover_scheduled_repos_skips_invalid_schedule(tmp_path, monkeypatch) -> None:
|
||||||
|
repo_root = tmp_path / "bad-schedule"
|
||||||
|
schedule = repo_root / ".kaizen" / "schedule.yml"
|
||||||
|
schedule.parent.mkdir(parents=True)
|
||||||
|
schedule.write_text("version: '2'\nagents: {}\n", encoding="utf-8")
|
||||||
|
|
||||||
|
monkeypatch.setenv("STATE_HUB_URL", "http://hub.test")
|
||||||
|
monkeypatch.setenv("KAIZEN_RUNNER_HOST", "testhost")
|
||||||
|
monkeypatch.setattr(
|
||||||
|
httpx,
|
||||||
|
"get",
|
||||||
|
lambda url, **kwargs: DummyResponse(
|
||||||
|
[{"slug": "bad-schedule", "host_paths": {"testhost": str(repo_root)}}]
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
result = discover_kaizen_scheduled_repos({})
|
||||||
|
assert result["scheduled_runs"] == []
|
||||||
|
|
||||||
|
|
||||||
|
def test_discover_scheduled_repos_filters_by_roster_and_cadence(
|
||||||
|
tmp_path, monkeypatch
|
||||||
|
) -> None:
|
||||||
|
repo_a = tmp_path / "kaizen-agentic"
|
||||||
|
repo_b = tmp_path / "other-repo"
|
||||||
|
for root in (repo_a, repo_b):
|
||||||
|
_write_schedule(
|
||||||
|
root / ".kaizen" / "schedule.yml",
|
||||||
|
{
|
||||||
|
"coach": {"cadence": "daily", "enabled": True},
|
||||||
|
"optimization": {"cadence": "weekly", "enabled": True},
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
roster = tmp_path / "roster.yaml"
|
||||||
|
roster.write_text(
|
||||||
|
yaml.safe_dump(
|
||||||
|
{
|
||||||
|
"active": [
|
||||||
|
{"slug": "kaizen-agentic", "agents": ["coach"], "status": "active"}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
),
|
||||||
|
encoding="utf-8",
|
||||||
|
)
|
||||||
|
|
||||||
|
monkeypatch.setenv("STATE_HUB_URL", "http://hub.test")
|
||||||
|
monkeypatch.setenv("KAIZEN_RUNNER_HOST", "testhost")
|
||||||
|
monkeypatch.setattr(
|
||||||
|
httpx,
|
||||||
|
"get",
|
||||||
|
lambda url, **kwargs: DummyResponse(
|
||||||
|
[
|
||||||
|
{"slug": "kaizen-agentic", "host_paths": {"testhost": str(repo_a)}},
|
||||||
|
{"slug": "other-repo", "host_paths": {"testhost": str(repo_b)}},
|
||||||
|
]
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
result = discover_kaizen_scheduled_repos(
|
||||||
|
{"roster": str(roster), "cadence": "daily"}
|
||||||
|
)
|
||||||
|
agents = {r["agent"] for r in result["scheduled_runs"]}
|
||||||
|
repos = {r["repo"] for r in result["scheduled_runs"]}
|
||||||
|
assert repos == {"kaizen-agentic"}
|
||||||
|
assert agents == {"coach"}
|
||||||
|
|
||||||
|
|
||||||
|
def test_hub_unreachable_raises(monkeypatch) -> None:
|
||||||
|
monkeypatch.setenv("STATE_HUB_URL", "http://hub.test")
|
||||||
|
|
||||||
|
def fail_get(url: str, **kwargs: Any) -> DummyResponse:
|
||||||
|
raise httpx.ConnectError("down")
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx, "get", fail_get)
|
||||||
|
|
||||||
|
with pytest.raises(RuntimeError, match="State Hub unreachable"):
|
||||||
|
discover_kaizen_scheduled_repos({})
|
||||||
|
|
||||||
|
|
||||||
|
def test_resolver_registry_alias() -> None:
|
||||||
|
resolver = KaizenContextResolver()
|
||||||
|
assert resolver.resolve("unknown_query", None, {}) == {}
|
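The discovery tests above imply a filtering rule over each repo's parsed `.kaizen/schedule.yml`: only enabled agents are emitted, and an optional cadence and roster narrow the result further. A rough sketch of that rule on an already-parsed schedule dict is given below. `select_scheduled_runs` is a hypothetical name, and treating any `version` other than `"1"` as invalid is an assumption drawn from the invalid-schedule test, not from the resolver's source.

```python
from typing import Any, Iterable, Optional


def select_scheduled_runs(
    repo: str,
    schedule: Any,
    cadence: Optional[str] = None,
    allowed_agents: Optional[Iterable[str]] = None,
) -> list[dict[str, Any]]:
    """Return one run entry per enabled agent that passes the filters."""
    # Assumption: only version "1" schedules are accepted.
    if not isinstance(schedule, dict) or str(schedule.get("version")) != "1":
        return []
    agents = schedule.get("agents")
    if not isinstance(agents, dict):
        return []
    allowed = set(allowed_agents) if allowed_agents is not None else None
    runs: list[dict[str, Any]] = []
    for name, cfg in agents.items():
        if not isinstance(cfg, dict) or cfg.get("enabled") is not True:
            continue  # disabled or malformed agent entry
        if cadence is not None and cfg.get("cadence") != cadence:
            continue  # cadence filter, e.g. "daily"
        if allowed is not None and name not in allowed:
            continue  # roster filter: agent not listed for this repo
        runs.append({"repo": repo, "agent": name, "enabled": True})
    return runs
```

A missing schedule file would map to passing `None` here, which yields an empty list, matching the behaviour the missing-schedule test expects.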
||||||
@@ -166,6 +166,93 @@ def test_state_hub_progress_sink_is_idempotent(monkeypatch) -> None:
|
|||||||
assert result[0]["idempotency_key"] == idempotency_key
|
assert result[0]["idempotency_key"] == idempotency_key
|
||||||
|
|
||||||
|
|
||||||
|
def test_core_hub_interaction_event_sink_posts_and_verifies_compact_event(monkeypatch) -> None:
|
||||||
|
posts: list[dict[str, Any]] = []
|
||||||
|
|
||||||
|
def fake_post(url: str, **kwargs: Any) -> DummyResponse:
|
||||||
|
assert url == "http://core-hub.test/api/v2/interaction-events"
|
||||||
|
assert kwargs["headers"]["Authorization"] == "Bearer runtime-secret"
|
||||||
|
posts.append({"url": url, **kwargs})
|
||||||
|
return DummyResponse(
|
||||||
|
{
|
||||||
|
"id": "event-1",
|
||||||
|
"eventType": "ops-endpoint-verified",
|
||||||
|
"widgetId": "widget-1",
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
def fake_get(url: str, **kwargs: Any) -> DummyResponse:
|
||||||
|
assert url == "http://core-hub.test/api/v2/interaction-events"
|
||||||
|
assert kwargs["headers"]["Authorization"] == "Bearer runtime-secret"
|
||||||
|
return DummyResponse({"data": [{"id": "event-1"}]})
|
||||||
|
|
||||||
|
monkeypatch.setenv("CORE_HUB_RUNTIME_TOKEN", "runtime-secret")
|
||||||
|
monkeypatch.setattr(httpx, "post", fake_post)
|
||||||
|
monkeypatch.setattr(httpx, "get", fake_get)
|
||||||
|
|
||||||
|
result = persist_ops_inventory_evidence(
|
||||||
|
_payload([
|
||||||
|
{
|
||||||
|
"type": "core-hub-interaction-event",
|
||||||
|
"core_hub_url": "http://core-hub.test",
|
||||||
|
"widget_id": "widget-1",
|
||||||
|
"event_type": "ops-endpoint-verified",
|
||||||
|
}
|
||||||
|
])
|
||||||
|
)
|
||||||
|
|
||||||
|
assert result == [
|
||||||
|
{
|
||||||
|
"type": "core-hub-interaction-event",
|
||||||
|
"status": "posted",
|
||||||
|
"event_type": "ops-endpoint-verified",
|
||||||
|
"event_id": "event-1",
|
||||||
|
"widget_id": "widget-1",
|
||||||
|
"verified": True,
|
||||||
|
"context_key": "ops_probe",
|
||||||
|
}
|
||||||
|
]
|
||||||
|
body = posts[0]["json"]
|
||||||
|
assert body["widgetId"] == "widget-1"
|
||||||
|
assert body["eventType"] == "ops-endpoint-verified"
|
||||||
|
assert body["metadata"]["activity_core_run_id"] == _run_id()
|
||||||
|
assert body["metadata"]["endpoint"]["url"] == "http://state-hub.test/health"
|
||||||
|
assert body["metadata"]["endpoint"]["widget_ref"] == "ops:endpoint:state-hub-health"
|
||||||
|
|
||||||
|
serialized = json.dumps(body, sort_keys=True)
|
||||||
|
assert "runtime-secret" not in serialized
|
||||||
|
assert "secret response body" not in serialized
|
||||||
|
assert "Authorization" not in serialized
|
||||||
|
assert "user:pass" not in serialized
|
||||||
|
assert "token=secret" not in serialized
|
||||||
|
|
||||||
|
|
||||||
|
def test_core_hub_sink_skips_cleanly_when_config_missing(monkeypatch) -> None:
|
||||||
|
monkeypatch.delenv("CORE_HUB_BASE_URL", raising=False)
|
||||||
|
monkeypatch.delenv("CORE_HUB_RUNTIME_TOKEN", raising=False)
|
||||||
|
monkeypatch.delenv("CORE_HUB_RUNTIME_TOKEN_FILE", raising=False)
|
||||||
|
monkeypatch.delenv("CORE_HUB_WIDGET_ID", raising=False)
|
||||||
|
monkeypatch.delenv("CORE_HUB_WIDGET_MAPPING", raising=False)
|
||||||
|
|
||||||
|
result = persist_ops_inventory_evidence(
|
||||||
|
_payload([{"type": "core-hub-interaction-event"}])
|
||||||
|
)
|
||||||
|
|
||||||
|
assert result == [
|
||||||
|
{
|
||||||
|
"type": "core-hub-interaction-event",
|
||||||
|
"status": "skipped",
|
||||||
|
"reason": "missing_core_hub_config",
|
||||||
|
"missing": [
|
||||||
|
"CORE_HUB_BASE_URL",
|
||||||
|
"CORE_HUB_RUNTIME_TOKEN or CORE_HUB_RUNTIME_TOKEN_FILE",
|
||||||
|
"widget_id or CORE_HUB_WIDGET_ID",
|
||||||
|
],
|
||||||
|
"context_key": "ops_probe",
|
||||||
|
}
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
def test_inter_hub_sink_skips_cleanly_when_config_missing(monkeypatch) -> None:
|
def test_inter_hub_sink_skips_cleanly_when_config_missing(monkeypatch) -> None:
|
||||||
monkeypatch.delenv("INTER_HUB_URL", raising=False)
|
monkeypatch.delenv("INTER_HUB_URL", raising=False)
|
||||||
monkeypatch.delenv("OPS_HUB_KEY", raising=False)
|
monkeypatch.delenv("OPS_HUB_KEY", raising=False)
|
||||||
|
|||||||
@@ -33,7 +33,9 @@ def _by_kind_name(kind: str, name: str) -> dict[str, Any]:
|
|||||||
def test_runtime_config_has_ops_inventory_placeholders() -> None:
|
def test_runtime_config_has_ops_inventory_placeholders() -> None:
|
||||||
config = _by_kind_name("ConfigMap", "actcore-runtime-config")
|
config = _by_kind_name("ConfigMap", "actcore-runtime-config")
|
||||||
|
|
||||||
assert config["data"]["LLM_CONNECT_URL"] == ""
|
assert config["data"]["LLM_CONNECT_URL"] == (
|
||||||
|
"http://llm-connect.activity-core.svc.cluster.local:8080"
|
||||||
|
)
|
||||||
assert config["data"]["LLM_CONNECT_TIMEOUT_SECONDS"] == "300"
|
assert config["data"]["LLM_CONNECT_TIMEOUT_SECONDS"] == "300"
|
||||||
assert config["data"]["OPS_INVENTORY_PATH"] == (
|
assert config["data"]["OPS_INVENTORY_PATH"] == (
|
||||||
"/etc/activity-core/ops/service-inventory.yml"
|
"/etc/activity-core/ops/service-inventory.yml"
|
||||||
|
|||||||
160
tests/test_resolve_context_binding.py
Normal file
@@ -0,0 +1,160 @@
|
|||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import json
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
from temporalio.exceptions import ApplicationError
|
||||||
|
|
||||||
|
from activity_core import activities
|
||||||
|
from activity_core.activities import _bind_resolver_result, resolve_context
|
||||||
|
|
||||||
|
|
||||||
|
def test_bind_resolver_result_unwraps_single_key_wrapper() -> None:
|
||||||
|
projects = [{"repo": "kaizen-agentic", "has_metrics": True}]
|
||||||
|
assert _bind_resolver_result("projects", {"projects": projects}) == projects
|
||||||
|
|
||||||
|
|
||||||
|
def test_bind_resolver_result_keeps_multi_key_summary() -> None:
|
||||||
|
summary = {
|
||||||
|
"repos": [{"repo_slug": "a"}],
|
||||||
|
"stale_count": 1,
|
||||||
|
"total_count": 2,
|
||||||
|
}
|
||||||
|
assert _bind_resolver_result("repos", summary) == summary
|
||||||
|
|
||||||
|
|
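The two `_bind_resolver_result` tests above pin down an unwrapping rule: a result that wraps its payload under a single key is unwrapped, while a multi-key summary is bound as is. A minimal sketch consistent with both assertions follows. Requiring the single key to equal the bind name is an assumption (the tests do not distinguish it from unwrapping any single-key dict), and `bind_resolver_result` is an illustrative stand-in, not the function under test.

```python
from typing import Any


def bind_resolver_result(bind_name: str, result: Any) -> Any:
    """Unwrap {bind_name: payload} to payload; leave anything else untouched."""
    if isinstance(result, dict) and list(result) == [bind_name]:
        return result[bind_name]
    return result


def bind_into_snapshot(snapshot: dict[str, Any], bind_to: str, result: Any) -> None:
    """Store a resolver result under the last segment of a 'context.x' path."""
    name = bind_to.rsplit(".", 1)[-1]  # "context.projects" -> "projects"
    snapshot[name] = bind_resolver_result(name, result)
```

With this rule, a resolver returning `{"projects": [...]}` bound to `context.projects` produces `{"projects": [...]}` in the snapshot rather than a doubly nested `{"projects": {"projects": [...]}}`.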
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_resolve_context_unwraps_kaizen_projects(monkeypatch) -> None:
|
||||||
|
class _FakeResolver:
|
||||||
|
def resolve(self, query: str, event: object, params: dict) -> dict:
|
||||||
|
assert query == "discover_kaizen_projects"
|
||||||
|
return {"projects": [{"repo": "pilot", "has_metrics": True}]}
|
||||||
|
|
||||||
|
import activity_core.context_resolvers # noqa: F401
|
||||||
|
from activity_core.context_resolvers.base import CONTEXT_RESOLVER_REGISTRY
|
||||||
|
|
||||||
|
monkeypatch.setitem(CONTEXT_RESOLVER_REGISTRY, "kaizen", lambda: _FakeResolver())
|
||||||
|
|
||||||
|
snapshot = await resolve_context(
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"type": "kaizen",
|
||||||
|
"query": "discover_kaizen_projects",
|
||||||
|
"params": {},
|
||||||
|
"bind_to": "context.projects",
|
||||||
|
}
|
||||||
|
]
|
||||||
|
)
|
||||||
|
|
||||||
|
assert snapshot == {"projects": [{"repo": "pilot", "has_metrics": True}]}
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_resolve_context_binds_event_payload_attributes() -> None:
|
||||||
|
envelope = {
|
||||||
|
"type": "kaizen.metrics.recorded",
|
||||||
|
"attributes": {
|
||||||
|
"agent": "coach",
|
||||||
|
"project": "kaizen-agentic",
|
||||||
|
"summary": {
|
||||||
|
"success_rate": 0.75,
|
||||||
|
"execution_count": 12,
|
||||||
|
"avg_quality": 0.81,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
snapshot = await resolve_context(
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"type": "event-payload",
|
||||||
|
"bind_to": "context.metrics",
|
||||||
|
}
|
||||||
|
],
|
||||||
|
json.dumps(envelope),
|
||||||
|
)
|
||||||
|
|
||||||
|
assert snapshot == {
|
||||||
|
"metrics": {
|
||||||
|
"agent": "coach",
|
||||||
|
"project": "kaizen-agentic",
|
||||||
|
"summary": {
|
||||||
|
"success_rate": 0.75,
|
||||||
|
"execution_count": 12,
|
||||||
|
"avg_quality": 0.81,
|
||||||
|
},
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_event_payload_context_supports_low_success_rate_rule() -> None:
|
||||||
|
snapshot = await resolve_context(
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"type": "event-payload",
|
||||||
|
"bind_to": "context.metrics",
|
||||||
|
}
|
||||||
|
],
|
||||||
|
json.dumps({
|
||||||
|
"type": "kaizen.metrics.recorded",
|
||||||
|
"attributes": {
|
||||||
|
"agent": "coach",
|
||||||
|
"project": "kaizen-agentic",
|
||||||
|
"summary": {"success_rate": 0.75},
|
||||||
|
},
|
||||||
|
}),
|
||||||
|
)
|
||||||
|
|
||||||
|
result = await activities.evaluate_rules({
|
||||||
|
"rules": [
|
||||||
|
{
|
||||||
|
"id": "flag-low-success-rate",
|
||||||
|
"condition": "context.metrics.summary.success_rate < 0.8",
|
||||||
|
"action": {
|
||||||
|
"task_template": (
|
||||||
|
"Review low success rate for {context.metrics.agent}"
|
||||||
|
),
|
||||||
|
"target_repo": "context.metrics.project",
|
||||||
|
"priority": "high",
|
||||||
|
"labels": ["kaizen", "{context.metrics.agent}"],
|
||||||
|
},
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"event": {},
|
||||||
|
"context": snapshot,
|
||||||
|
})
|
||||||
|
|
||||||
|
assert len(result) == 1
|
||||||
|
assert result[0]["source_id"] == "flag-low-success-rate"
|
||||||
|
assert result[0]["title"] == "Review low success rate for coach"
|
||||||
|
assert result[0]["target_repo"] == "kaizen-agentic"
|
||||||
|
assert result[0]["labels"] == ["kaizen", "coach"]
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_event_payload_context_binds_empty_when_optional_envelope_missing() -> None:
|
||||||
|
snapshot = await resolve_context(
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"type": "event-payload",
|
||||||
|
"bind_to": "context.metrics",
|
||||||
|
}
|
||||||
|
],
|
||||||
|
)
|
||||||
|
|
||||||
|
assert snapshot == {"metrics": {}}
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_event_payload_context_fails_when_required_envelope_missing() -> None:
|
||||||
|
with pytest.raises(ApplicationError, match="Required context resolver"):
|
||||||
|
await resolve_context(
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"type": "event-payload",
|
||||||
|
"bind_to": "context.metrics",
|
||||||
|
"required": True,
|
||||||
|
}
|
||||||
|
],
|
||||||
|
)
|
||||||
167
tests/test_reuse_surface_context_resolver.py
Normal file
@@ -0,0 +1,167 @@
from __future__ import annotations

import json
from pathlib import Path
from typing import Any

import pytest
from temporalio.exceptions import ApplicationError

from activity_core.activities import resolve_context
from activity_core.context_resolvers import reuse_surface
from activity_core.context_resolvers.base import CONTEXT_RESOLVER_REGISTRY


class _Response:
    def __init__(self, payload: Any) -> None:
        self._payload = payload

    def raise_for_status(self) -> None:
        return None

    def json(self) -> Any:
        return self._payload


class _Completed:
    returncode = 0
    stderr = ""

    def __init__(self, payload: dict[str, Any]) -> None:
        self.stdout = json.dumps(payload)


def _write_rollout(path: Path) -> None:
    path.write_text(
        """
domains:
  reuse:
    phase: active
    repos:
      - reuse-surface
      - activity-core
  parked:
    phase: backlog
    repos:
      - ignored-repo
""".lstrip(),
        encoding="utf-8",
    )


def _write_cli_only_signals(path: Path) -> None:
    path.write_text(
        """
signals:
  empty_capability_scaffold:
    enabled: true
  registry_gap:
    enabled: false
  stale_scope:
    enabled: false
  stale_sbom:
    enabled: false
  publish_check_fail:
    enabled: false
""".lstrip(),
        encoding="utf-8",
    )


def test_shell_resolver_emits_reuse_surface_gaps_and_advances_cursor(
    tmp_path,
    monkeypatch,
) -> None:
    rollout = tmp_path / "rollout.yaml"
    _write_rollout(rollout)
    _write_cli_only_signals(tmp_path / "signals.yml")
    reuse_root = tmp_path / "reuse-surface"
    reuse_root.mkdir()
    (reuse_root / "SCOPE.md").write_text("fresh\n", encoding="utf-8")
    activity_root = tmp_path / "activity-core"
    activity_root.mkdir()

    monkeypatch.setenv("KAIZEN_RUNNER_HOST", "runner")

    def fake_get(url: str, **kwargs: Any) -> _Response:
        assert url.endswith("/repos/")
        return _Response(
            [
                {
                    "slug": "reuse-surface",
                    "host_paths": {"runner": str(reuse_root)},
                },
                {
                    "slug": "activity-core",
                    "host_paths": {"runner": str(activity_root)},
                },
            ]
        )

    def fake_run(cmd: list[str], **kwargs: Any) -> _Completed:
        assert cmd == ["reuse-surface", "report", "gaps", "--format", "json"]
        return _Completed({"empty_scaffolds": ["reuse-surface"]})

    monkeypatch.setattr(reuse_surface.httpx, "get", fake_get)
    monkeypatch.setattr(reuse_surface.subprocess, "run", fake_run)

    import activity_core.context_resolvers  # noqa: F401

    result = CONTEXT_RESOLVER_REGISTRY["shell"]().resolve(
        "reuse_surface_report_gaps",
        None,
        {
            "roster": str(rollout),
            "batch_size": 1,
        },
    )

    assert result == {
        "gaps": [
            {
                "repo": "reuse-surface",
                "root": str(reuse_root),
                "signal": "empty_capability_scaffold",
                "hygiene_signal": "empty_capability_scaffold",
            }
        ]
    }
    state = json.loads((tmp_path / "round-robin-state.json").read_text(encoding="utf-8"))
    assert state["cursor"] == 1
    assert state["last_batch"] == ["reuse-surface"]


def test_shell_resolver_keeps_kaizen_fallback_for_existing_queries() -> None:
    assert CONTEXT_RESOLVER_REGISTRY["shell"]().resolve("unknown_query", None, {}) == {}


@pytest.mark.asyncio
async def test_optional_reuse_surface_missing_roster_binds_empty_list(tmp_path) -> None:
    snapshot = await resolve_context(
        [
            {
                "type": "shell",
                "query": "reuse_surface_report_gaps",
                "params": {"roster": str(tmp_path / "missing.yaml")},
                "bind_to": "context.gaps",
            }
        ]
    )

    assert snapshot == {"gaps": []}


@pytest.mark.asyncio
async def test_required_reuse_surface_missing_roster_fails_visibly(tmp_path) -> None:
    with pytest.raises(ApplicationError, match="Required context resolver"):
        await resolve_context(
            [
                {
                    "type": "shell",
                    "query": "reuse_surface_report_gaps",
                    "params": {"roster": str(tmp_path / "missing.yaml")},
                    "bind_to": "context.gaps",
                    "required": True,
                }
            ]
        )
81
tests/test_schedule_health.py
Normal file
@@ -0,0 +1,81 @@
"""ACTIVITY-WP-0014 T03: missed-fire detection verdict tests."""

from __future__ import annotations

from datetime import datetime, timedelta, timezone

from activity_core.schedule_health import evaluate_schedule_health

NOW = datetime(2026, 6, 23, 12, 0, tzinfo=timezone.utc)


def test_healthy_when_recent_fire_and_no_drops() -> None:
    health = evaluate_schedule_health(
        activity_id="a1",
        missed_catchup_window=0,
        last_fired_at=NOW - timedelta(minutes=5),
        now=NOW,
        expected_interval=timedelta(hours=1),
    )
    assert health.healthy is True
    assert health.missed is False
    assert health.reasons == []


def test_unhealthy_when_catchup_window_dropped_fires() -> None:
    health = evaluate_schedule_health(
        activity_id="a1",
        missed_catchup_window=2,
        last_fired_at=NOW - timedelta(minutes=5),
        now=NOW,
    )
    assert health.missed is True
    assert "2 fire(s) dropped" in health.reasons[0]


def test_unhealthy_when_last_fire_too_stale() -> None:
    health = evaluate_schedule_health(
        activity_id="daily",
        missed_catchup_window=0,
        last_fired_at=NOW - timedelta(days=2),
        now=NOW,
        expected_interval=timedelta(days=1),
    )
    assert health.missed is True
    assert any("exceeding the expected" in r for r in health.reasons)
    assert health.staleness == timedelta(days=2)


def test_within_tolerance_is_healthy() -> None:
    health = evaluate_schedule_health(
        activity_id="daily",
        missed_catchup_window=0,
        last_fired_at=NOW - (timedelta(days=1) + timedelta(minutes=5)),
        now=NOW,
        expected_interval=timedelta(days=1),
        tolerance=timedelta(minutes=10),
    )
    assert health.healthy is True


def test_no_fire_recorded_for_due_schedule_is_unhealthy() -> None:
    health = evaluate_schedule_health(
        activity_id="daily",
        missed_catchup_window=0,
        last_fired_at=None,
        now=NOW,
        expected_interval=timedelta(days=1),
    )
    assert health.missed is True
    assert "no recorded fire" in health.reasons[0]


def test_no_interval_and_no_fire_is_not_flagged() -> None:
    # Without an expected interval we cannot assert a miss from absence alone.
    health = evaluate_schedule_health(
        activity_id="event-ish",
        missed_catchup_window=0,
        last_fired_at=None,
        now=NOW,
    )
    assert health.healthy is True
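Reviewer note: the tests in tests/test_schedule_health.py pin down the verdict rules fairly tightly, so a minimal sketch of a function that would satisfy them may help when reading the diff. This is not the code in `activity_core.schedule_health`; the `ScheduleHealth` dataclass, the zero default for `tolerance`, and the exact reason wording beyond the substrings the tests check are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class ScheduleHealth:
    """Hypothetical verdict object; field names mirror what the tests read."""

    activity_id: str
    missed: bool
    reasons: List[str] = field(default_factory=list)
    staleness: Optional[timedelta] = None

    @property
    def healthy(self) -> bool:
        return not self.missed


def evaluate_schedule_health(
    *,
    activity_id: str,
    missed_catchup_window: int,
    last_fired_at: Optional[datetime],
    now: datetime,
    expected_interval: Optional[timedelta] = None,
    tolerance: timedelta = timedelta(0),  # assumed default
) -> ScheduleHealth:
    reasons: List[str] = []
    staleness: Optional[timedelta] = None

    # Rule 1: any fire dropped outside the catchup window is a miss.
    if missed_catchup_window > 0:
        reasons.append(
            f"{missed_catchup_window} fire(s) dropped outside the catchup window"
        )

    if last_fired_at is not None:
        # Rule 2: the last fire is older than the interval plus tolerance.
        staleness = now - last_fired_at
        if expected_interval is not None and staleness > expected_interval + tolerance:
            reasons.append(
                f"last fire was {staleness} ago, exceeding the expected "
                f"interval of {expected_interval}"
            )
    elif expected_interval is not None:
        # Rule 3: a schedule with a known interval has never fired.
        reasons.append("no recorded fire for a schedule with an expected interval")

    return ScheduleHealth(
        activity_id=activity_id,
        missed=bool(reasons),
        reasons=reasons,
        staleness=staleness,
    )
```

Without an expected interval and without a recorded fire, no rule triggers, which matches the last test: absence alone is not treated as a miss.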
@@ -37,6 +37,7 @@ def _make_defn(
|
|||||||
misfire_policy: str = "skip",
|
misfire_policy: str = "skip",
|
||||||
enabled: bool = True,
|
enabled: bool = True,
|
||||||
jitter: int = 0,
|
jitter: int = 0,
|
||||||
|
catchup_window_seconds: int | None = None,
|
||||||
) -> ActivityDefinition:
|
) -> ActivityDefinition:
|
||||||
return ActivityDefinition(
|
return ActivityDefinition(
|
||||||
id=uuid.uuid4(),
|
id=uuid.uuid4(),
|
||||||
@@ -46,6 +47,7 @@ def _make_defn(
|
|||||||
cron_expression=cron,
|
cron_expression=cron,
|
||||||
misfire_policy=misfire_policy,
|
misfire_policy=misfire_policy,
|
||||||
jitter_seconds=jitter,
|
jitter_seconds=jitter,
|
||||||
|
catchup_window_seconds=catchup_window_seconds,
|
||||||
),
|
),
|
||||||
)
|
)
|
||||||
|
|
||||||
@@ -186,6 +188,76 @@ async def test_misfire_policy_compress_sets_overlap_buffer_one(env: WorkflowEnvi
|
|||||||
await delete_schedule(env.client, defn.id)
|
await delete_schedule(env.client, defn.id)
|
||||||
|
|
||||||
|
|
||||||
|
# ── ACTIVITY-WP-0014: explicit run-miss policies + catchup window ────────────
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_skip_sets_short_catchup_window(env: WorkflowEnvironment) -> None:
|
||||||
|
"""skip = run on trigger or skip: tiny grace window, no real recovery."""
|
||||||
|
defn = _make_defn(misfire_policy="skip")
|
||||||
|
await upsert_schedule(env.client, defn)
|
||||||
|
|
||||||
|
desc = await env.client.get_schedule_handle(schedule_id(defn.id)).describe()
|
||||||
|
assert desc.schedule.policy.overlap == ScheduleOverlapPolicy.SKIP
|
||||||
|
assert desc.schedule.policy.catchup_window == timedelta(seconds=60)
|
||||||
|
|
||||||
|
await delete_schedule(env.client, defn.id)
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_catchup_all_recovers_full_window(env: WorkflowEnvironment) -> None:
|
||||||
|
"""catchup_all = recover every missed fire: long window, BUFFER_ALL."""
|
||||||
|
defn = _make_defn(misfire_policy="catchup_all")
|
||||||
|
await upsert_schedule(env.client, defn)
|
||||||
|
|
||||||
|
desc = await env.client.get_schedule_handle(schedule_id(defn.id)).describe()
|
||||||
|
assert desc.schedule.policy.overlap == ScheduleOverlapPolicy.BUFFER_ALL
|
||||||
|
assert desc.schedule.policy.catchup_window == timedelta(days=365)
|
||||||
|
|
||||||
|
await delete_schedule(env.client, defn.id)
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_catchup_latest_does_not_accumulate(env: WorkflowEnvironment) -> None:
|
||||||
|
"""catchup_latest = recover only the most recent missed fire: BUFFER_ONE."""
|
||||||
|
defn = _make_defn(misfire_policy="catchup_latest")
|
||||||
|
await upsert_schedule(env.client, defn)
|
||||||
|
|
||||||
|
desc = await env.client.get_schedule_handle(schedule_id(defn.id)).describe()
|
||||||
|
assert desc.schedule.policy.overlap == ScheduleOverlapPolicy.BUFFER_ONE
|
||||||
|
assert desc.schedule.policy.catchup_window == timedelta(hours=24)
|
||||||
|
|
||||||
|
await delete_schedule(env.client, defn.id)
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_legacy_aliases_map_to_explicit_policies(env: WorkflowEnvironment) -> None:
|
||||||
|
"""Legacy catchup/compress keep working and pick up the new catchup windows."""
|
||||||
|
catchup = _make_defn(misfire_policy="catchup")
|
||||||
|
compress = _make_defn(misfire_policy="compress")
|
||||||
|
await upsert_schedule(env.client, catchup)
|
||||||
|
await upsert_schedule(env.client, compress)
|
||||||
|
|
||||||
|
d1 = await env.client.get_schedule_handle(schedule_id(catchup.id)).describe()
|
||||||
|
d2 = await env.client.get_schedule_handle(schedule_id(compress.id)).describe()
|
||||||
|
assert d1.schedule.policy.catchup_window == timedelta(days=365)
|
||||||
|
assert d2.schedule.policy.catchup_window == timedelta(hours=24)
|
||||||
|
|
||||||
|
await delete_schedule(env.client, catchup.id)
|
||||||
|
await delete_schedule(env.client, compress.id)
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_explicit_catchup_window_override(env: WorkflowEnvironment) -> None:
|
||||||
|
"""An explicit catchup_window_seconds overrides the per-policy default."""
|
||||||
|
defn = _make_defn(misfire_policy="skip", catchup_window_seconds=7200)
|
||||||
|
await upsert_schedule(env.client, defn)
|
||||||
|
|
||||||
|
desc = await env.client.get_schedule_handle(schedule_id(defn.id)).describe()
|
||||||
|
assert desc.schedule.policy.catchup_window == timedelta(hours=2)
|
||||||
|
|
||||||
|
await delete_schedule(env.client, defn.id)
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
|
@pytest.mark.asyncio
|
||||||
async def test_schedule_smoke_test_creates_one_shot_schedule(
|
async def test_schedule_smoke_test_creates_one_shot_schedule(
|
||||||
env: WorkflowEnvironment,
|
env: WorkflowEnvironment,
|
||||||
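Reviewer note: the ACTIVITY-WP-0014 schedule tests above imply a small lookup from misfire policy to overlap policy and catchup window. The sketch below restates that mapping in plain Python so it can be checked at a glance. `resolve_misfire_policy` and both tables are hypothetical names, the overlap policies are written as strings rather than `ScheduleOverlapPolicy` members, and treating the legacy `catchup` and `compress` values as pure aliases is an assumption (the tests only confirm their catchup windows, plus `BUFFER_ONE` for `compress`).

```python
from datetime import timedelta
from typing import Optional, Tuple

# Explicit policies and the defaults the tests assert for them.
_POLICY_DEFAULTS = {
    "skip": ("SKIP", timedelta(seconds=60)),
    "catchup_all": ("BUFFER_ALL", timedelta(days=365)),
    "catchup_latest": ("BUFFER_ONE", timedelta(hours=24)),
}

# Legacy names, assumed to be aliases of the explicit policies.
_LEGACY_ALIASES = {
    "catchup": "catchup_all",
    "compress": "catchup_latest",
}


def resolve_misfire_policy(
    policy: str,
    catchup_window_seconds: Optional[int] = None,
) -> Tuple[str, timedelta]:
    """Return (overlap policy name, catchup window) for a misfire policy."""
    canonical = _LEGACY_ALIASES.get(policy, policy)
    overlap, window = _POLICY_DEFAULTS[canonical]
    # An explicit window always wins over the per-policy default.
    if catchup_window_seconds is not None:
        window = timedelta(seconds=catchup_window_seconds)
    return overlap, window
```

For example, `resolve_misfire_policy("skip", catchup_window_seconds=7200)` keeps the `SKIP` overlap policy but widens the window to two hours, which is the behaviour `test_explicit_catchup_window_override` checks.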
|
|||||||
@@ -215,6 +215,29 @@ def test_coding_retro_returns_latest_progress_suggestions(monkeypatch) -> None:
|
|||||||
],
|
],
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"id": "newer-30-day-retro",
|
||||||
|
"event_type": "coding_retro",
|
||||||
|
"summary": "monthly coding retro ready",
|
||||||
|
"created_at": "2026-06-07T17:15:00Z",
|
||||||
|
"detail": {
|
||||||
|
"generated_at": "2026-06-07T17:14:30Z",
|
||||||
|
"window": {
|
||||||
|
"days": 30,
|
||||||
|
"since": "2026-05-08T00:00:00Z",
|
||||||
|
"until": "2026-06-07T00:00:00Z",
|
||||||
|
},
|
||||||
|
"suggestions": [
|
||||||
|
{
|
||||||
|
"repo": "broad-retro-repo",
|
||||||
|
"title": "Should not displace the weekly retro",
|
||||||
|
"recommendation": "Keep weekly schedule bounded.",
|
||||||
|
"priority": "high",
|
||||||
|
"score": 99,
|
||||||
|
}
|
||||||
|
],
|
||||||
|
},
|
||||||
|
},
|
||||||
])
|
])
|
||||||
|
|
||||||
monkeypatch.setenv("STATE_HUB_URL", "http://state-hub.test/")
|
monkeypatch.setenv("STATE_HUB_URL", "http://state-hub.test/")
|
||||||
@@ -229,7 +252,7 @@ def test_coding_retro_returns_latest_progress_suggestions(monkeypatch) -> None:
|
|||||||
assert calls == [
|
assert calls == [
|
||||||
{
|
{
|
||||||
"url": "http://state-hub.test/progress/",
|
"url": "http://state-hub.test/progress/",
|
||||||
"params": {"limit": 20},
|
"params": {"event_type": "coding_retro", "limit": 20},
|
||||||
"timeout": 10.0,
|
"timeout": 10.0,
|
||||||
}
|
}
|
||||||
]
|
]
|
||||||
@@ -251,6 +274,47 @@ def test_coding_retro_returns_latest_progress_suggestions(monkeypatch) -> None:
|
|||||||
]
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def test_coding_retro_returns_empty_when_window_does_not_match(monkeypatch) -> None:
|
||||||
|
def fake_get(url: str, **kwargs: Any) -> DummyResponse:
|
||||||
|
return DummyResponse([
|
||||||
|
{
|
||||||
|
"id": "monthly-retro",
|
||||||
|
"event_type": "coding_retro",
|
||||||
|
"summary": "monthly coding retro ready",
|
||||||
|
"created_at": "2026-06-07T17:10:00Z",
|
||||||
|
"detail": {
|
||||||
|
"window": {"days": 30},
|
||||||
|
"suggestions": [
|
||||||
|
{
|
||||||
|
"repo": "activity-core",
|
||||||
|
"title": "Broad retro item",
|
||||||
|
"recommendation": "Do not emit from weekly schedule.",
|
||||||
|
"priority": "high",
|
||||||
|
"score": 10,
|
||||||
|
}
|
||||||
|
],
|
||||||
|
},
|
||||||
|
}
|
||||||
|
])
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx, "get", fake_get)
|
||||||
|
|
||||||
|
result = StateHubContextResolver().resolve(
|
||||||
|
"coding_retro",
|
||||||
|
None,
|
||||||
|
{"event_type": "coding_retro", "window_days": 7},
|
||||||
|
)
|
||||||
|
|
||||||
|
assert result == {
|
||||||
|
"suggestions": [],
|
||||||
|
"window": None,
|
||||||
|
"generated_at": None,
|
||||||
|
"source_progress_id": None,
|
||||||
|
"event_type": "coding_retro",
|
||||||
|
"summary": "",
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
def test_coding_retro_returns_empty_shape_when_not_published(monkeypatch) -> None:
|
def test_coding_retro_returns_empty_shape_when_not_published(monkeypatch) -> None:
|
||||||
def fake_get(url: str, **kwargs: Any) -> DummyResponse:
|
def fake_get(url: str, **kwargs: Any) -> DummyResponse:
|
||||||
return DummyResponse([
|
return DummyResponse([
|
||||||
@@ -343,6 +407,70 @@ def test_recently_on_scope_hourly_failure_bubbles(monkeypatch) -> None:
|
|||||||
StateHubContextResolver().resolve("recently_on_scope_hourly", None, {"range": "1h"})
|
StateHubContextResolver().resolve("recently_on_scope_hourly", None, {"range": "1h"})
|
||||||
|
|
||||||
|
|
||||||
|
def test_consistency_sweep_remote_all_posts_batch(monkeypatch) -> None:
|
||||||
|
calls: list[dict[str, Any]] = []
|
||||||
|
|
||||||
|
def fake_post(url: str, **kwargs: Any) -> DummyResponse:
|
||||||
|
calls.append({"url": url, **kwargs})
|
||||||
|
return DummyResponse(
|
||||||
|
{
|
||||||
|
"exit_code": 0,
|
||||||
|
"lock_skipped": False,
|
||||||
|
"repos_processed": [{"repo_slug": "state-hub", "result": "pass"}],
|
||||||
|
"skipped_clean": ["quiet-repo"],
|
||||||
|
"skipped_missing": [],
|
||||||
|
"skipped_budget": [],
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
monkeypatch.setenv("STATE_HUB_URL", "http://state-hub.test/")
|
||||||
|
monkeypatch.setattr(httpx, "post", fake_post)
|
||||||
|
|
||||||
|
result = StateHubContextResolver().resolve(
|
||||||
|
"consistency_sweep_remote_all",
|
||||||
|
None,
|
||||||
|
{"max_seconds": 300, "source": "activity-core", "required": True},
|
||||||
|
)
|
||||||
|
|
||||||
|
assert result["exit_code"] == 0
|
||||||
|
assert result["repos_processed"][0]["repo_slug"] == "state-hub"
|
||||||
|
assert calls == [
|
||||||
|
{
|
||||||
|
"url": "http://state-hub.test/consistency/sweep/remote-all",
|
||||||
|
"json": {"max_seconds": 300, "source": "activity-core"},
|
||||||
|
"timeout": 330.0,
|
||||||
|
}
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def test_consistency_sweep_remote_all_failure_bubbles(monkeypatch) -> None:
|
||||||
|
def fake_post(url: str, **kwargs: Any) -> DummyResponse:
|
||||||
|
raise httpx.ConnectError("offline")
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx, "post", fake_post)
|
||||||
|
|
||||||
|
with pytest.raises(httpx.ConnectError):
|
||||||
|
StateHubContextResolver().resolve(
|
||||||
|
"consistency_sweep_remote_all",
|
||||||
|
None,
|
||||||
|
{"max_seconds": 300},
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def test_consistency_sweep_remote_all_rejects_empty_response(monkeypatch) -> None:
|
||||||
|
def fake_post(url: str, **kwargs: Any) -> DummyResponse:
|
||||||
|
return DummyResponse({})
|
||||||
|
|
||||||
|
monkeypatch.setattr(httpx, "post", fake_post)
|
||||||
|
|
||||||
|
with pytest.raises(RuntimeError, match="missing required key"):
|
||||||
|
StateHubContextResolver().resolve(
|
||||||
|
"consistency_sweep_remote_all",
|
||||||
|
None,
|
||||||
|
{"max_seconds": 300},
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
def test_recently_on_scope_hourly_rejects_empty_response(monkeypatch) -> None:
|
def test_recently_on_scope_hourly_rejects_empty_response(monkeypatch) -> None:
|
||||||
def fake_post(url: str, **kwargs: Any) -> DummyResponse:
|
def fake_post(url: str, **kwargs: Any) -> DummyResponse:
|
||||||
return DummyResponse({})
|
return DummyResponse({})
|
||||||
|
|||||||
81
tests/test_state_hub_write.py
Normal file
@@ -0,0 +1,81 @@
"""ACTIVITY-WP-0014 T05: idempotency-keyed State Hub writes."""

from __future__ import annotations

import httpx
import pytest

from activity_core import report_sinks
from activity_core.state_hub_write import (
    IDEMPOTENCY_HEADER,
    idempotency_headers,
    idempotency_key,
)


def test_key_is_stable_and_deterministic() -> None:
    a = idempotency_key("run1", "daily-triage-report", "daily_triage")
    b = idempotency_key("run1", "daily-triage-report", "daily_triage")
    assert a == b == "run1:daily-triage-report:daily_triage"


def test_key_shape_stable_with_missing_parts() -> None:
    assert idempotency_key("run1", None, "daily_triage") == "run1::daily_triage"


def test_key_sanitizes_control_and_whitespace() -> None:
    key = idempotency_key("run 1", "a\tb", "x\n")
    assert "\t" not in key and "\n" not in key and " " not in key


def test_headers_carry_the_key() -> None:
    headers = idempotency_headers("run1", "i", "e")
    assert headers == {IDEMPOTENCY_HEADER: "run1:i:e"}


def test_distinct_identities_get_distinct_keys() -> None:
    assert idempotency_key("r", "i", "daily_triage") != idempotency_key(
        "r", "i", "schedule_miss"
    )


def test_progress_exists_is_best_effort_on_connection_error(monkeypatch) -> None:
    """A down State Hub must not hard-fail the dedup read; it returns False so the
    keyed write can still proceed."""

    def _boom(*args, **kwargs):
        raise httpx.ConnectError("Connection refused")

    monkeypatch.setattr(report_sinks.httpx, "get", _boom)
    assert (
        report_sinks._progress_exists(
            "http://127.0.0.1:8000", "run1", "daily-triage-report", "daily_triage"
        )
        is False
    )


def test_report_sink_post_sends_idempotency_header(monkeypatch) -> None:
    """The state-hub-progress write carries a stable Idempotency-Key header."""
    captured: dict[str, object] = {}

    monkeypatch.setattr(report_sinks, "_progress_exists", lambda *a, **k: False)

    class _Resp:
        def raise_for_status(self) -> None: ...
        def json(self) -> dict[str, str]:
            return {"id": "pid-1"}

    def _capture_post(url, json, headers, timeout):  # noqa: A002
        captured["headers"] = headers
        return _Resp()

    monkeypatch.setattr(report_sinks.httpx, "post", _capture_post)

    payload = {"run_id": "run1", "activity_id": "act1", "scheduled_for": None}
    report_entry = {"instruction_id": "daily-triage-report", "report": {"summary": "s"}}
    sink = {"event_type": "daily_triage"}

    result = report_sinks._post_state_hub_progress(payload, report_entry, sink)
    assert result["status"] == "posted"
    assert captured["headers"][IDEMPOTENCY_HEADER] == "run1:daily-triage-report:daily_triage"
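Reviewer note: the key shape these tests fix (`run_id:instruction_id:event_type`, empty segment for a missing part, no whitespace or control characters) can be met by a very small helper. The sketch below is one way to do it, not the contents of `activity_core.state_hub_write`. The header value `"Idempotency-Key"` is taken from the test docstring, and replacing each run of whitespace or control characters with an underscore is an assumed sanitization rule; the tests only require that such characters do not survive.

```python
import re
from typing import Dict, Optional

IDEMPOTENCY_HEADER = "Idempotency-Key"

# Whitespace plus ASCII control characters (assumed sanitization set).
_UNSAFE = re.compile(r"[\s\x00-\x1f\x7f]+")


def _clean(part: Optional[str]) -> str:
    """Map None to an empty segment and squash unsafe characters."""
    return _UNSAFE.sub("_", part or "")


def idempotency_key(
    run_id: Optional[str],
    instruction_id: Optional[str],
    event_type: Optional[str],
) -> str:
    """Build a deterministic key; the three-segment shape never changes."""
    return ":".join(_clean(p) for p in (run_id, instruction_id, event_type))


def idempotency_headers(
    run_id: Optional[str],
    instruction_id: Optional[str],
    event_type: Optional[str],
) -> Dict[str, str]:
    """Wrap the key in the HTTP header sent with State Hub writes."""
    return {IDEMPOTENCY_HEADER: idempotency_key(run_id, instruction_id, event_type)}
```

Because the key depends only on its three inputs, a retried write for the same run, instruction, and event type sends the same header, which is what lets the receiving side deduplicate.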
126
tests/test_sync_schedules.py
Normal file
@@ -0,0 +1,126 @@
from __future__ import annotations

import uuid
from datetime import datetime, timezone
from types import SimpleNamespace
from typing import Any

import pytest

from activity_core import sync_schedules


def _row(
    *,
    activity_id: uuid.UUID,
    enabled: bool,
    trigger_config: dict[str, Any],
) -> SimpleNamespace:
    return SimpleNamespace(
        id=activity_id,
        name=f"definition-{activity_id}",
        enabled=enabled,
        trigger_config=trigger_config,
        context_sources=[],
        task_templates=[],
        dedupe_key_strategy="skip",
        version=1,
    )


@pytest.mark.asyncio
async def test_sync_schedule_rows_reports_drift_counts_and_preserves_one_shots(
    monkeypatch,
) -> None:
    new_id = uuid.uuid4()
    disabled_old_id = uuid.uuid4()
    one_shot_id = uuid.uuid4()
    orphan_id = uuid.uuid4()
    upserted: list[tuple[uuid.UUID, bool, str]] = []
    deleted: list[str] = []

    async def fake_upsert_schedule(client: object, defn: object) -> None:
        upserted.append((
            defn.id,
            defn.enabled,
            defn.trigger_config.trigger_type,
        ))

    async def fake_list_schedules(client: object) -> list[dict[str, str]]:
        return [
            {
                "schedule_id": f"activity-schedule-{disabled_old_id}",
                "activity_id": str(disabled_old_id),
            },
            {
                "schedule_id": f"activity-schedule-{one_shot_id}-once",
                "activity_id": f"{one_shot_id}-once",
            },
            {
                "schedule_id": f"activity-schedule-{orphan_id}",
                "activity_id": str(orphan_id),
            },
        ]

    async def fake_delete_schedule(client: object, activity_id: str) -> None:
        deleted.append(activity_id)

    monkeypatch.setattr(sync_schedules, "upsert_schedule", fake_upsert_schedule)
    monkeypatch.setattr(sync_schedules, "list_schedules", fake_list_schedules)
    monkeypatch.setattr(sync_schedules, "delete_schedule", fake_delete_schedule)

    result = await sync_schedules.sync_schedule_rows(
        object(),
        [
            _row(
                activity_id=new_id,
                enabled=True,
                trigger_config={
                    "trigger_type": "cron",
                    "cron_expression": "20 7 * * *",
                    "timezone": "Europe/Berlin",
                    "misfire_policy": "skip",
                },
            ),
            _row(
                activity_id=disabled_old_id,
                enabled=False,
                trigger_config={
                    "trigger_type": "cron",
                    "cron_expression": "20 * * * *",
                    "timezone": "Europe/Berlin",
                    "misfire_policy": "skip",
                },
            ),
            _row(
                activity_id=one_shot_id,
                enabled=True,
                trigger_config={
                    "trigger_type": "scheduled",
                    "at": datetime(2026, 6, 19, 8, 0, tzinfo=timezone.utc),
                    "timezone": "UTC",
                },
            ),
            _row(
                activity_id=uuid.uuid4(),
                enabled=True,
                trigger_config={
                    "trigger_type": "event",
                    "event_type": "kaizen.metrics.recorded",
                    "filters": {},
                },
            ),
        ],
    )

    assert result.to_dict() == {
        "upserted": 2,
        "paused": 1,
        "deleted_orphans": 1,
    }
    assert upserted == [
        (new_id, True, "cron"),
        (disabled_old_id, False, "cron"),
        (one_shot_id, True, "scheduled"),
    ]
    assert deleted == [str(orphan_id)]
134
tests/test_sync_service.py
Normal file
@@ -0,0 +1,134 @@
from __future__ import annotations

from typing import Any

import pytest

from activity_core import sync_service
from activity_core.sync_schedules import ScheduleSyncResult


@pytest.mark.asyncio
async def test_run_sync_runs_requested_sections(monkeypatch) -> None:
    calls: list[str] = []

    async def fake_definitions(session_factory: object) -> int:
        calls.append("definitions")
        return 2

    async def fake_event_types(session_factory: object) -> int:
        calls.append("event_types")
        return 5

    async def fake_schedules(
        temporal_client: object,
        session_factory: object,
    ) -> ScheduleSyncResult:
        calls.append("schedules")
        return ScheduleSyncResult(upserted=3, paused=1, deleted_orphans=2)

    monkeypatch.setattr(sync_service, "sync_activity_definitions", fake_definitions)
    monkeypatch.setattr(sync_service, "sync_event_types", fake_event_types)
    monkeypatch.setattr(sync_service, "sync_with_session_factory", fake_schedules)

    result = await sync_service.run_sync(
        session_factory=object(),
        temporal_client=object(),
        definitions=True,
        schedules=True,
        event_types=True,
    )

    assert calls == ["definitions", "event_types", "schedules"]
    assert result["ok"] is True
    assert result["ran"] == {
        "definitions": True,
        "schedules": True,
        "event_types": True,
    }
    assert result["definitions"] == {"synced": 2}
    assert result["event_types"] == {"synced": 5}
    assert result["schedules"] == {
        "upserted": 3,
        "paused": 1,
        "deleted_orphans": 2,
    }
    assert result["errors"] == []


@pytest.mark.asyncio
async def test_run_sync_collects_errors_and_continues(monkeypatch) -> None:
    calls: list[str] = []

    async def failing_definitions(session_factory: object) -> int:
        calls.append("definitions")
        raise RuntimeError("definition parse failed")

    async def fake_schedules(
        temporal_client: object,
        session_factory: object,
    ) -> ScheduleSyncResult:
        calls.append("schedules")
        return ScheduleSyncResult(upserted=1)

    monkeypatch.setattr(
        sync_service,
        "sync_activity_definitions",
        failing_definitions,
    )
    monkeypatch.setattr(sync_service, "sync_with_session_factory", fake_schedules)

    result = await sync_service.run_sync(
        session_factory=object(),
        temporal_client=object(),
        definitions=True,
        schedules=True,
        event_types=False,
    )

    assert calls == ["definitions", "schedules"]
    assert result["ok"] is False
    assert result["definitions"] == {"synced": 0}
    assert result["schedules"]["upserted"] == 1
    assert result["errors"] == [
        {
            "stage": "definitions",
            "type": "RuntimeError",
            "message": "definition parse failed",
        }
    ]


@pytest.mark.asyncio
async def test_run_sync_reports_missing_temporal_client_for_schedules() -> None:
    result = await sync_service.run_sync(
        session_factory=object(),
        temporal_client=None,
        definitions=False,
        schedules=True,
        event_types=False,
    )

    assert result["ok"] is False
    assert result["errors"] == [
        {
            "stage": "schedules",
            "type": "RuntimeError",
            "message": "Temporal client is required for schedule sync",
        }
    ]


def test_record_error_bounds_error_count() -> None:
    result: dict[str, Any] = {
        "ok": True,
        "errors": [],
    }

    for i in range(25):
        sync_service._record_error(result, "stage", RuntimeError(f"boom {i}"))

    assert result["ok"] is False
    assert len(result["errors"]) == 20
    assert result["errors"][0]["message"] == "boom 0"
    assert result["errors"][-1]["message"] == "boom 19"
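The last test above pins down a bounded error list: after 25 recorded failures only the first 20 remain and `ok` is false. The real `sync_service._record_error` is not shown in this diff, so the following is only a minimal sketch of a helper that would satisfy those assertions. The name `record_error` and the constant `MAX_ERRORS` are hypothetical; the cap of 20 and the `stage` / `type` / `message` keys are inferred from the test expectations.

```python
from typing import Any

MAX_ERRORS = 20  # assumed cap, inferred from the expected list length in the test


def record_error(result: dict[str, Any], stage: str, exc: BaseException) -> None:
    """Mark the sync result as failed and keep at most MAX_ERRORS entries."""
    result["ok"] = False
    errors = result.setdefault("errors", [])
    if len(errors) >= MAX_ERRORS:
        return  # keep the earliest errors and drop the overflow
    errors.append(
        {
            "stage": stage,
            "type": type(exc).__name__,
            "message": str(exc),
        }
    )


result: dict[str, Any] = {"ok": True, "errors": []}
for i in range(25):
    record_error(result, "stage", RuntimeError(f"boom {i}"))
```

Keeping the earliest entries rather than the latest matches the `boom 0` / `boom 19` assertions; a design that kept the most recent errors would need a different test.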
2
uv.lock
generated
@@ -12,6 +12,7 @@ dependencies = [
     { name = "httpx" },
     { name = "nats-py" },
     { name = "pydantic" },
+    { name = "pyyaml" },
     { name = "sqlalchemy", extra = ["asyncio"] },
     { name = "temporalio" },
     { name = "uvicorn", extra = ["standard"] },
@@ -34,6 +35,7 @@ requires-dist = [
     { name = "pydantic", specifier = ">=2.0" },
     { name = "pytest", marker = "extra == 'dev'", specifier = ">=8.0" },
     { name = "pytest-asyncio", marker = "extra == 'dev'", specifier = ">=0.24" },
+    { name = "pyyaml", specifier = ">=6.0" },
     { name = "sqlalchemy", extras = ["asyncio"], specifier = ">=2.0" },
     { name = "temporalio", specifier = ">=1.7" },
     { name = "temporalio", extras = ["testing"], marker = "extra == 'dev'", specifier = ">=1.7" },

@@ -8,7 +8,7 @@ status: active
 owner: codex
 topic_slug: custodian
 created: "2026-06-03"
-updated: "2026-06-07"
+updated: "2026-06-27"
 state_hub_workstream_id: "5646e13a-13af-4724-bca6-3c0d86f96733"
 ---
 
@@ -150,6 +150,59 @@ State Hub to `state-hub` (`dc10704f`), `railiance-cluster` (`53e78702`),
 activity-core runner plus three clean scheduled daily runs and calibration
 feedback.
 
+2026-06-16: Rechecked State Hub and the configured working-memory sink. State
+Hub `/progress/?event_type=daily_triage` still only shows activity-core
+`daily_triage` progress through 2026-06-06, and
+`/home/worsch/the-custodian/memory/working` only has `daily-triage-*` notes
+for 2026-06-02 through 2026-06-06. There is still no evidence of three clean
+consecutive scheduled runs after the June 7 runtime projection failure, so
+T03 remains `wait`.
+
+2026-06-18: Consumed the verified in-cluster llm-connect Service URL in the
+Railiance runtime projection. `actcore-runtime-config` now sets
+`LLM_CONNECT_URL=http://llm-connect.activity-core.svc.cluster.local:8080` and
+keeps `LLM_CONNECT_TIMEOUT_SECONDS=300`. The remaining live gate is no longer
+the URL slot itself; it is operator-owned provider credential custody for
+`activity-core/llm-connect-provider-secrets`, a schema-valid fixture smoke, and
+then three clean scheduled daily triage runs.
+
+2026-06-18 follow-up: `llm-connect` reported State Hub message
+`6a098e1e-65de-4309-ab4a-446aba2f3587`: the provider Secret now has a populated
+key count and the in-namespace fixture smoke passed on the llm-connect side.
+The remaining activity-core gate is to reconcile the live Railiance runtime so
+the worker consumes the configured URL, then produce schema-valid daily triage
+evidence and three clean scheduled runs. This narrower path is tracked in
+`ACTIVITY-WP-0010`.
+
+2026-06-25: Consecutive-run streak resumed. State Hub `daily_triage` progress
+events from author `activity-core` fired on time on **2026-06-24 05:20:56Z** and
+**2026-06-25 05:20:47Z** (07:20 Berlin), both delivered, no misfires. That is two
+clean consecutive scheduled runs. **RECHECK 2026-06-26 (after 05:20Z):** confirm
+the 06-26 scheduled `daily_triage` event delivered. If clean, that completes three
+clean consecutive scheduled runs (06-24 / 06-25 / 06-26) — record the calibration
+result in State Hub and close T03. If the 06-26 run misfires or is missing, the
+streak resets and T03 stays `wait`. Flag deliberately kept in-repo (agent-agnostic)
+rather than tied to any single coding agent's scheduler.
+
+2026-06-26 recheck outcome: **streak reset at two.** The 06-26 scheduled run fired
+on time (`daily_triage` event 05:20:57Z) — scheduling layer healthy, no misfire —
+but the `daily-triage-report` instruction output **failed schema validation**:
+`Expecting ',' delimiter: line 136 column 22 (char 5268)`. The model produced a
+long ranked WSJF recommendation list (reached rank 7+ with nested `wsjf` objects)
+whose JSON broke ~char 5268; only a bounded 4000-char preview is preserved in the
+State Hub event, so the exact offending token needs the runtime llm-connect log.
+This is an LLM-output-quality failure (tracked by `ACTIVITY-WP-0010`), not a
+runtime/projection failure. T03 stays `wait`; three clean consecutive scheduled
+runs not yet achieved (06-24 ✅, 06-25 ✅, 06-26 ✗-validation).
+
+2026-06-27 recheck outcome: streak remains reset. The scheduled run fired and
+wrote State Hub progress plus working memory, but daily-triage-report failed
+validation again with an unterminated string around char 5246. This confirms the
+runner/sink path is alive and the active blocker is live deployment of the
+ACTIVITY-WP-0016 output-robustness bundle and runtime prompt/token changes, not
+a missing schedule. T03 stays wait until a post-deployment smoke passes and three
+new clean scheduled runs are collected.
+
 ## Rule Action Contract Documentation
 
 ```task

@@ -8,7 +8,7 @@ status: blocked
 owner: codex
 topic_slug: custodian
 created: "2026-06-07"
-updated: "2026-06-07"
+updated: "2026-06-17"
 state_hub_workstream_id: "7387fc50-1f2c-471a-9d85-bb085cbd0b63"
 ---
 
@@ -47,6 +47,12 @@ resolver. It reads recent `/progress/` items, selects the latest
 `event_type=coding_retro`, normalizes `suggestions[]`, and returns an empty
 suggestion list while the upstream publisher has not produced a read model yet.
 
+**2026-06-17:** Hardened the resolver lookup after live review found recent
+non-retro progress could hide older retro events. The resolver now queries
+State Hub with `event_type=coding_retro` and only selects a read model matching
+the requested `window_days`, so the weekly schedule cannot accidentally route a
+broader 30-day retro batch.
+
 ## `weekly-coding-retro` Activity-Definition
 
 ```task
@@ -92,3 +98,12 @@ make fix-consistency REPO=activity-core
 Live State Hub did not yet expose a published `event_type=coding_retro` progress
 item, so the real dry-run, duplicate check, and `enabled: true` flip remain
 blocked on `AGENTIC-WP-0010`.
+
+**2026-06-17:** `AGENTIC-WP-0010` is finished and State Hub has
+`coding_retro` progress. A live no-write smoke now resolves the matching weekly
+read model `ec20ac1c-ef50-4db4-a5dc-364d31a259a5`
+(`generated_at=2026-06-07T19:25:19Z`, `window.days=7`) and emits zero task
+specs because that weekly read model has zero suggestions. The schedule remains
+disabled until a non-empty weekly read model, or an explicit operator decision
+that a zero-suggestion dry-run is an acceptable enablement proof, confirms
+correct routing and no duplicate target tasks on re-run.
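
The 2026-06-17 resolver hardening in the hunks above (filter by `event_type=coding_retro`, then accept only a read model whose window matches the requested `window_days`) can be illustrated with a small sketch. This is not the repository's resolver: the function names `select_read_model` and `resolve_suggestions` are hypothetical, and the flat item layout (`id`, `event_type`, `generated_at`, `window.days`, `suggestions`) is an assumption based only on the field names quoted in the workplan.

```python
from typing import Any, Optional


def select_read_model(
    progress_items: list[dict[str, Any]],
    window_days: int,
) -> Optional[dict[str, Any]]:
    """Return the newest coding_retro item whose window matches, else None."""
    candidates = [
        item
        for item in progress_items
        if item.get("event_type") == "coding_retro"
        and item.get("window", {}).get("days") == window_days
    ]
    if not candidates:
        return None
    # ISO-8601 UTC timestamps in a single format sort chronologically as strings.
    return max(candidates, key=lambda item: item["generated_at"])


def resolve_suggestions(
    progress_items: list[dict[str, Any]],
    window_days: int,
) -> list[Any]:
    """Return the matching read model's suggestions, or an empty list."""
    model = select_read_model(progress_items, window_days)
    return [] if model is None else list(model.get("suggestions", []))


# Illustrative data: a newer non-retro event and a newer 30-day retro
# must not hide or replace the older weekly read model.
items = [
    {"id": "triage", "event_type": "daily_triage", "generated_at": "2026-06-16T05:20:00Z"},
    {"id": "monthly", "event_type": "coding_retro", "generated_at": "2026-06-10T00:00:00Z",
     "window": {"days": 30}, "suggestions": ["broad advice"]},
    {"id": "weekly", "event_type": "coding_retro", "generated_at": "2026-06-07T19:25:19Z",
     "window": {"days": 7}, "suggestions": []},
]
```

With this data a weekly lookup selects the `weekly` item and yields zero suggestions, which mirrors the zero-task-spec outcome described in the log entry.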
250
workplans/ACTIVITY-WP-0009-intent-gap-closure.md
Normal file
@@ -0,0 +1,250 @@
---
id: ACTIVITY-WP-0009
type: workplan
title: "Intent gap closure"
domain: custodian
repo: activity-core
status: blocked
owner: codex
topic_slug: custodian
created: "2026-06-16"
updated: "2026-06-18"
state_hub_workstream_id: "d64cfbba-6da7-4737-afb9-866afa0e9cda"
---

# ACTIVITY-WP-0009 - Intent gap closure

## Context

The 2026-06-16 review of activity-core against `INTENT.md` found that the repo
matches the intended Event Bridge shape, but several production and contract
gaps remain before the implementation fully satisfies the operational promise:

- recurring scheduled work must be trusted without manual coordination
- live task creation must be proven through issue-core, not only null-sink audit
- `review_required` semantics must either be implemented or documented as
  metadata only
- ops evidence must either remain explicitly fallback-first or activate the
  Inter-Hub / ops-hub backend behind operator-owned secrets
- the `TaskExecutorWorkflow` stub must not become a back door into execution
  ownership
- the internal FastAPI surface needs an explicit production access decision

The preserved analysis lives in:

`history/2026-06-16-intent-gap-analysis.md`

## Close Daily Triage Scheduled-Run Trust Gap

```task
id: ACTIVITY-WP-0009-T01
status: wait
priority: high
state_hub_task_id: "7012e4fd-2530-49b7-9c2f-1d949809a144"
```

Close the scheduled-run trust gap identified in `ACTIVITY-WP-0006-T03`.

Acceptance criteria:

- activity-core has three clean consecutive scheduled daily State Hub WSJF
  triage runs after the June 7 runtime projection failure
- each run has matching Temporal workflow history, `activity_runs` row, State
  Hub `daily_triage` progress, and working-memory report note
- calibration feedback is recorded in State Hub
- `ACTIVITY-WP-0006-T03` can move from `wait` to `done`
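
The "three clean consecutive scheduled runs" criterion above can be made concrete with a small counting sketch. It is not code from this repository: `RunEvidence` and `clean_streak` are hypothetical names, and the sketch assumes one evidence record per run with a `scheduled` flag (false for manual or smoke triggers) and an `output_validated` flag.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class RunEvidence:
    day: date
    scheduled: bool         # False for manual or smoke triggers
    output_validated: bool  # result of schema validation for the run


def clean_streak(runs: list[RunEvidence]) -> int:
    """Count the most recent clean scheduled runs on consecutive days."""
    scheduled_runs = [run for run in runs if run.scheduled]
    streak = 0
    previous_day: Optional[date] = None
    for run in sorted(scheduled_runs, key=lambda r: r.day, reverse=True):
        if not run.output_validated:
            break  # a failed validation ends the streak
        if previous_day is not None and previous_day - run.day != timedelta(days=1):
            break  # a missing day ends the streak
        streak += 1
        previous_day = run.day
    return streak


runs = [
    RunEvidence(date(2026, 6, 24), scheduled=True, output_validated=True),
    RunEvidence(date(2026, 6, 25), scheduled=True, output_validated=True),
    RunEvidence(date(2026, 6, 26), scheduled=True, output_validated=False),
]
```

Under these assumptions the first two runs alone give a streak of two, and adding the failed third run drops the current streak to zero, so the task would stay at `wait` until `clean_streak(...) >= 3`.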

Current wait reason: as of 2026-06-16, State Hub `daily_triage` progress and
working-memory `daily-triage-*` notes only show activity-core evidence through
2026-06-06.

2026-06-18 update: activity-core now consumes the verified in-cluster
llm-connect Service URL in `k8s/railiance/20-runtime.yaml`:
`LLM_CONNECT_URL=http://llm-connect.activity-core.svc.cluster.local:8080` with
`LLM_CONNECT_TIMEOUT_SECONDS=300`. This removes the activity-core repo-side URL
gap. Closure still waits on the operator-owned provider Secret for llm-connect,
a schema-valid fixture smoke, and three clean scheduled daily triage runs with
matching State Hub and working-memory evidence.

2026-06-18 follow-up: State Hub message
`6a098e1e-65de-4309-ab4a-446aba2f3587` reports that the llm-connect side is now
complete: the provider Secret has a populated key count and the in-namespace
fixture smoke passed. The remaining work is the activity-core / Railiance
runtime reconciliation and daily-triage evidence collection path captured in
`ACTIVITY-WP-0010`.

## Promote Issue-Core Task Emission Safely

```task
id: ACTIVITY-WP-0009-T02
status: wait
priority: high
state_hub_task_id: "3854677b-32b4-43f8-a6ca-5a2b25a08dd9"
```

Move selected production-safe definitions from `ISSUE_SINK_TYPE=null` audit mode
toward real issue-core task creation.

Acceptance criteria:

- issue-core endpoint, credentials, and duplicate-handling posture are approved
  for the target environment
- one known-safe definition is run first in null-sink mode and its task specs are
  reviewed
- the same definition creates exactly the expected issue-core task(s) through
  `IssueCoreRestSink`
- `task_spawn_log` records the real returned task references
- rollback to null-sink mode is documented

Current wait reason: production Railiance currently uses null-sink audit mode;
live issue-core credentials/access and duplicate-handling are not yet verified
for this repo.

## Resolve Review-Required Contract Drift

```task
id: ACTIVITY-WP-0009-T03
status: done
priority: medium
state_hub_task_id: "1eafe5e4-8412-4104-a417-933efe8e7bbd"
```

Resolve the mismatch between ADR language and current code for
`review_required`.

Options:

- implement an issue-core-owned pending review queue contract and route
  `review_required=true` instruction outputs there, or
- update ADR/docs to state that `review_required` is currently audit/report
  metadata only

Acceptance criteria:

- `docs/adr/adr-003-rule-instruction-model.md`, `SCOPE.md`, and tests describe
  the same behavior
- no ActivityDefinition implies a review queue exists unless that downstream
  contract is live
- report/spawn metadata remains available for operator review either way

2026-06-16: Completed by aligning ADR-003 with the implemented behavior:
`review_required` is audit/report metadata only until issue-core owns a pending
review queue contract. `SCOPE.md` already had the same boundary, and
`tests/test_issue_sink.py` now asserts the REST issue sink does not send a
`review_required` field as though a review queue existed.

## Decide And Gate Ops Evidence Backend

```task
id: ACTIVITY-WP-0009-T04
status: done
priority: medium
state_hub_task_id: "61300966-c119-4ebf-af89-a6c50df93ac8"
```

Decide whether the `ops-inventory` evidence path should remain State Hub
fallback-first for now or activate Inter-Hub / ops-hub submission.

Acceptance criteria:

- the decision is recorded in State Hub and the relevant docs/workplans
- if fallback-first remains the chosen mode, docs explicitly say State Hub
  `ops_inventory_probe` progress is the accepted closure path
- if Inter-Hub is activated, `OPS_HUB_KEY` is provisioned outside Git, widget /
  capability mapping is configured, and live submission is tested without
  printing or storing secrets

2026-06-16: Completed the current posture decision. State Hub decision
`7c235bbb-ee6f-4c3e-b1dd-74717eac9082` records that State Hub
`ops_inventory_probe` progress is the accepted live evidence backend for now.
Inter-Hub / ops-hub per-entity submission remains future work gated on
operator-owned `OPS_HUB_KEY` custody, widget mapping, and production intake
smoke tests. `docs/runbook.md` documents the fallback-first posture.

## Remove Or Rehome TaskExecutor Stub Risk

```task
id: ACTIVITY-WP-0009-T05
status: done
priority: medium
state_hub_task_id: "fbe3e822-1a7c-4fe6-8251-cc8a782b9516"
```

Reduce the chance that `TaskExecutorWorkflow` attracts real execution work
inside activity-core.

Acceptance criteria:

- decide whether the stub should stay registered, be removed, or be moved to an
  execution-owned repo/workplan
- if it stays, docs and comments explicitly mark it as non-production and
  outside the activity-core ownership boundary
- no production ActivityDefinition or workflow path depends on `task_instances`
  as task lifecycle state

2026-06-16: Completed by deciding to keep `TaskExecutorWorkflow` registered only
as a compatibility/idempotency stub. `src/activity_core/workflows.py` and
`docs/conventions.md` now mark it as non-production and outside activity-core's
execution boundary. No production ActivityDefinition uses `task_instances` for
task lifecycle state.

## Decide FastAPI Production Access Posture

```task
id: ACTIVITY-WP-0009-T06
status: done
priority: medium
state_hub_task_id: "99e1e301-296b-4f78-8843-2a39e59ecd7d"
```

Choose and document the production access posture for the FastAPI admin surface.

Acceptance criteria:

- operator decides whether the API remains ClusterIP-only or receives an
  authenticated ingress
- if ingress is chosen, hostname, auth layer, allowed users/agents, and audit
  expectations are documented before exposure
- runbook and Railiance deployment docs match the chosen posture

2026-06-16: Completed the current access posture decision. State Hub decision
`9ffaf7a9-227a-4e39-92e3-cd93d8cda1f2` records that the FastAPI admin surface
remains ClusterIP-only until a separate authenticated ingress/access-policy work
item chooses hostname, auth layer, allowed users/agents, and audit expectations.
`docs/runbook.md` and `k8s/railiance/README.md` now agree on this posture.

## Completion Criteria

- The historical findings are preserved under `history/`.
- `SCOPE.md`, ADRs, workplans, and implementation agree on activity-core's
  boundary.
- Daily scheduled triage has real consecutive-run calibration evidence.
- At least one production-safe task creation path is proven against issue-core,
  or null-sink mode is explicitly accepted as the current production posture.
- Ops evidence backend posture is explicit and tested in the chosen mode.
- No registered workflow or API path invites activity-core to own execution,
  task lifecycle, project state, or privileged ops control.

## Implementation Pass - 2026-06-16

Agent-actionable closure is complete for T03, T04, T05, and T06.

Remaining waits:

- T01 waits on real scheduled daily triage run evidence.
- T02 waits on issue-core production endpoint/credentials and duplicate-handling
  approval.

Verification:

```bash
.venv/bin/pytest tests/test_issue_sink.py tests/rules/test_executor.py -k "review_required or issue_core_rest_sink"
```

Result: 3 passed, 24 deselected.

After this workplan is synced by the custodian operator, run from `~/state-hub`:

```bash
make fix-consistency REPO=activity-core
```
225
workplans/ACTIVITY-WP-0010-daily-triage-llm-reconciliation.md
Normal file
@@ -0,0 +1,225 @@
---
id: ACTIVITY-WP-0010
type: workplan
title: "Daily Triage LLM Reconciliation And Evidence"
domain: custodian
repo: activity-core
status: blocked
owner: codex
topic_slug: custodian
created: "2026-06-18"
updated: "2026-06-27"
state_hub_workstream_id: "f2c73ac6-13f0-4005-82cc-76c7c9f9c8b9"
---

# ACTIVITY-WP-0010 - Daily Triage LLM Reconciliation And Evidence

## Context

This workplan implements the in-scope portion of the latest activity-core
suggestion review against `INTENT.md` and `SCOPE.md`.

Relevant accepted suggestion:

- State Hub message `6a098e1e-65de-4309-ab4a-446aba2f3587` from
  `llm-connect` says `LLM-WP-0006` is complete on the llm-connect side. The
  stable Service URL is
  `http://llm-connect.activity-core.svc.cluster.local:8080`, timeout remains
  `300`, the provider Secret reports populated key count, and the in-namespace
  fixture smoke passed with schema-valid endpoint behavior.

Why this belongs in activity-core:

- `INTENT.md` says activity-core owns the **when/what/where** loop for
  scheduled coordination work.
- `SCOPE.md` keeps LLM instruction execution in scope through the llm-connect
  boundary, while keeping provider credentials and cluster reconciliation out of
  scope.
- `ACTIVITY-WP-0006-T03` and `ACTIVITY-WP-0009-T01` remain open because daily
  State Hub WSJF triage has not yet produced three clean scheduled runs after
  the June 7 runtime projection failure.

Suggestions reviewed but not accepted as product/runtime implementation work:

- `coding_retro` activity-core suggestions for Bash tool thrash, schema thrash,
  and read-before-edit hygiene are agent workflow advice. They are useful for
  Codex operating style, but they do not change activity-core's Event Bridge
  product surface and should not become runtime code.
- The earlier local-kubectl / cluster-owned evidence suggestion for
  `ACTIVITY-WP-0007` has already been handled by moving live evidence ownership
  to Railiance and closing the workplan from cluster-owned proof.

Latest evidence before this workplan:

- State Hub `daily_triage` progress on 2026-06-18 still shows
  `LLM_CONNECT_URL is not configured`, which means the live activity-core
  runtime has not yet consumed the repo-side URL update.
- `k8s/railiance/20-runtime.yaml` now sets the verified llm-connect Service URL
  and `LLM_CONNECT_TIMEOUT_SECONDS=300`.

## Confirm Repo-Side Runtime Contract

```task
id: ACTIVITY-WP-0010-T01
status: done
priority: high
state_hub_task_id: "dd52ce21-23b8-4e46-b3af-cb7bf486e40f"
```

Update activity-core's Railiance runtime projection so the daily triage worker
consumes the verified llm-connect Service URL by default.

Done when:

- `k8s/railiance/20-runtime.yaml` sets
  `LLM_CONNECT_URL=http://llm-connect.activity-core.svc.cluster.local:8080`.
- `LLM_CONNECT_TIMEOUT_SECONDS=300` remains configured.
- Wiring tests assert the URL and timeout.
- The Railiance README states that provider credentials remain operator-owned
  and outside Git / State Hub.

2026-06-18: Completed. Updated the runtime ConfigMap, README, and
`tests/test_railiance_ops_inventory_wiring.py`. Focused tests passed:
`tests/test_railiance_ops_inventory_wiring.py tests/test_llm_client.py`
reported 9 passed.

## Reconcile Live Railiance Runtime

```task
id: ACTIVITY-WP-0010-T02
status: done
priority: high
state_hub_task_id: "23545ddc-926b-485a-8535-5cc11e01134a"
```

Apply or reconcile the updated activity-core Railiance runtime through the
cluster-owned deployment path, not through ad hoc local kubectl from this repo.

Done when non-secret evidence shows:

- live `actcore-runtime-config` has the verified `LLM_CONNECT_URL` and timeout;
- the activity-core worker has restarted or otherwise consumed the new config;
- `activity-core/llm-connect-provider-secrets` remains present with a populated
  key count only, without printing or storing secret values;
- the State Hub bridge remains reachable from the activity-core runtime.

Current wait reason: this is Railiance/operator-owned live cluster work. State
Hub handoff message `9a074b7c-4b87-4e3c-a6bf-e1fe5580daa8` asks
`railiance-cluster` to reconcile the updated config and smoke it.

2026-06-19 recheck:

- Deployed `llm-connect` into the `activity-core` namespace on `railiance01`
  (the cluster that runs `actcore-worker`). `coulombcore` had llm-connect only;
  the in-cluster Service URL is cluster-local.
- `actcore-runtime-config` already exposed the verified URL and timeout;
  `deployment/actcore-worker` was restarted and now reports
  `LLM_CONNECT_URL=http://llm-connect.activity-core.svc.cluster.local:8080`.
- `llm-connect-provider-secrets` reports `DATA 1`; no Secret values were
  inspected.
- Worker health probe to llm-connect `/health` returns `{"status": "ok"}`.
- `actcore-state-hub-bridge` remains `0/1` Ready with upstream timeouts, so T02
  is not fully closed until the node-local State Hub tunnel is restored.

2026-06-27 recheck:

- Superseded by real scheduled runner evidence: State Hub daily_triage events on
  2026-06-24, 2026-06-25, 2026-06-26, and 2026-06-27 all reached State Hub and
  wrote working-memory notes. The bridge/sink is therefore reachable for the
  live runner.
- 2026-06-24 and 2026-06-25 were schema-valid; 2026-06-26 and 2026-06-27 failed
  output validation after calling llm-connect. That moves the active blocker out
  of T02 and into the WP-0016 live bundle/smoke lane. Marking T02 done.

## Run Daily Triage Fixture Smoke

```task
id: ACTIVITY-WP-0010-T03
status: wait
priority: high
state_hub_task_id: "10e0df77-c230-4a82-b720-23c66bd17c0a"
```

After T02, run a manual or smoke execution of
`daily-statehub-wsjf-triage` against the live activity-core runtime.

Done when:

- the run calls llm-connect through the configured Service URL;
- llm-connect returns content accepted as schema-valid daily-triage JSON;
- State Hub receives a `daily_triage` progress item with `output_validated=true`;
- the working-memory daily-triage note exists at the path recorded in State Hub
  detail;
- `scripts/verify_daily_triage.py` reports the smoke/manual run as present.
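
The "schema-valid JSON" and `output_validated=true` conditions above can be sketched as a small validation step. This is only an illustration, not activity-core's validator: `validate_triage_output`, the `PREVIEW_LIMIT` constant, and the assumption that a valid report is a JSON object with a `recommendations` list are all hypothetical. Only the standard-library `json` module is used.

```python
import json
from typing import Any

PREVIEW_LIMIT = 4000  # assumed bound on how much raw output is kept for diagnostics


def validate_triage_output(raw: str) -> dict[str, Any]:
    """Parse model output and build a non-secret progress detail dict."""
    try:
        report = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Malformed JSON: record the parser message and a bounded preview only.
        return {
            "output_validated": False,
            "error": str(exc),
            "preview": raw[:PREVIEW_LIMIT],
        }
    # Assumed minimal shape check; a real schema would be stricter.
    if not isinstance(report, dict) or not isinstance(report.get("recommendations"), list):
        return {
            "output_validated": False,
            "error": "missing recommendations list",
            "preview": raw[:PREVIEW_LIMIT],
        }
    return {
        "output_validated": True,
        "recommendation_count": len(report["recommendations"]),
    }


ok = validate_triage_output('{"recommendations": [{"rank": 1}]}')
broken = validate_triage_output('{"recommendations": [{"rank": 1} {"rank": 2}]}')
truncated = validate_triage_output('{"note": "' + "x" * 5000)
```

Because the preview is bounded, a failure deep in a long response may fall outside the stored text, in which case the parser message (line, column, character offset) is the main clue until the full runtime log is consulted.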
|
||||||
|
|
||||||
|
2026-06-19 recheck:
|
||||||
|
|
||||||
|
- In-namespace llm-connect fixture smoke on `railiance01` passed:
|
||||||
|
`smoke: pass health=ok latency_seconds=1.681 recommendations=1`.
|
||||||
|
- Manual `POST /activity-definitions/6fca51fa-387a-4fd0-bc4e-d62c29eb859a/trigger`
|
||||||
|
reached llm-connect, but the workflow failed at `persist_instruction_reports`
|
||||||
|
with `state-hub-progress` sink `Connection refused` while
|
||||||
|
`actcore-state-hub-bridge` is unhealthy.
|
||||||
|
- T03 therefore remains open until State Hub bridge reachability is restored and
|
||||||
|
a run emits non-secret `daily_triage` progress with `output_validated=true`.
|
||||||
|
|
||||||
|
2026-06-27 recheck:
|
||||||
|
|
||||||
|
- Scheduled runs on 2026-06-24 and 2026-06-25 satisfy the non-secret smoke
|
||||||
|
evidence for llm-connect call, State Hub progress with output_validated=true,
|
||||||
|
and working-memory note creation.
|
||||||
|
- Kept T03 at progress rather than done because the workstation did not run the
|
||||||
|
live verifier against Temporal/activity-core DB, and the smoke must be repeated
|
||||||
|
after the WP-0016 code/schema/runtime-prompt deployment due to the 2026-06-26 and
|
||||||
|
2026-06-27 malformed-output failures.
|
||||||
|
|
||||||
|
## Collect Three Clean Scheduled Runs
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0010-T04
|
||||||
|
status: wait
|
||||||
|
priority: high
|
||||||
|
state_hub_task_id: "dc6b9482-cf43-4fc5-994b-dcd7dea47db7"
|
||||||
|
```
|
||||||
|
|
||||||
|
Let the normal 07:20 Europe/Berlin schedule produce three consecutive clean
|
||||||
|
daily triage runs after the live config reconciliation.
|
||||||
|
|
||||||
|
Done when:
|
||||||
|
|
||||||
|
- three consecutive scheduled runs have Temporal workflow evidence,
|
||||||
|
`activity_runs` rows, State Hub `daily_triage` progress, and working-memory
|
||||||
|
notes;
|
||||||
|
- none of the three runs are merely manual smoke tests or `execution_failed`
|
||||||
|
diagnostics;
|
||||||
|
- calibration feedback is recorded in State Hub;
|
||||||
|
- `ACTIVITY-WP-0006-T03` and `ACTIVITY-WP-0009-T01` can move from `wait` to
|
||||||
|
`done`.
|
||||||
|
|
||||||
|
2026-06-27 recheck:
|
||||||
|
|
||||||
|
- Three-clean-run streak is reset. The latest sequence is 2026-06-24 clean,
|
||||||
|
2026-06-25 clean, 2026-06-26 validation_failed, 2026-06-27 validation_failed.
|
||||||
|
- Current pickup is to deploy ACTIVITY-WP-0016 code/schema together with the
|
||||||
|
Railiance runtime prompt and max_tokens changes, run a live smoke, then restart
|
||||||
|
the three-consecutive-scheduled-run gate from zero.
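
The reset logic described above can be sketched as a trailing-streak count. This is a minimal illustration, not the project's verifier: the function names are hypothetical, and the status labels are taken from the recheck note.

```python
from typing import Sequence


def clean_streak(statuses: Sequence[str], clean: str = "clean") -> int:
    """Count consecutive clean runs at the end of a chronological list."""
    streak = 0
    for status in reversed(statuses):
        if status != clean:
            break  # any non-clean run resets the streak
        streak += 1
    return streak


def gate_passed(statuses: Sequence[str], required: int = 3) -> bool:
    """True once the most recent `required` scheduled runs were all clean."""
    return clean_streak(statuses) >= required


# Sequence recorded in the 2026-06-27 recheck (oldest first).
runs = ["clean", "clean", "validation_failed", "validation_failed"]
print(clean_streak(runs), gate_passed(runs))
```

Counting from the newest run backwards means a single failed run sends the gate back to zero, which matches the "restart from zero" wording.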
|
||||||
|
|
||||||
|
## Close Handoff State
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0010-T05
|
||||||
|
status: wait
|
||||||
|
priority: medium
|
||||||
|
state_hub_task_id: "ecc57e21-1716-4daa-aba6-d8a6d824e4ed"
|
||||||
|
```
|
||||||
|
|
||||||
|
Update the surrounding workplans and State Hub once the live daily triage gate
|
||||||
|
passes.
|
||||||
|
|
||||||
|
Done when:
|
||||||
|
|
||||||
|
- `ACTIVITY-WP-0006` records the three-run calibration evidence;
|
||||||
|
- `ACTIVITY-WP-0009` records the scheduled-run trust gap closure;
|
||||||
|
- any temporary `needs_human` flags created for the llm-connect provider/config
|
||||||
|
handoff are cleared or replaced by a narrower follow-up;
|
||||||
|
- this workplan is marked `finished`.
|
||||||
179
workplans/ACTIVITY-WP-0011-event-payload-context-resolver.md
Normal file
@@ -0,0 +1,179 @@
|
|||||||
|
---
|
||||||
|
id: ACTIVITY-WP-0011
|
||||||
|
type: workplan
|
||||||
|
title: "Event Payload Context Resolver"
|
||||||
|
domain: custodian
|
||||||
|
repo: activity-core
|
||||||
|
status: finished
|
||||||
|
owner: codex
|
||||||
|
topic_slug: custodian
|
||||||
|
created: "2026-06-18"
|
||||||
|
updated: "2026-06-18"
|
||||||
|
state_hub_workstream_id: "4efe4bcf-2148-4489-b57c-87f6039d4ed5"
|
||||||
|
---
|
||||||
|
|
||||||
|
# ACTIVITY-WP-0011 - Event Payload Context Resolver
|
||||||
|
|
||||||
|
## Context
|
||||||
|
|
||||||
|
State Hub message `d561ebd7-ba01-4dc6-8ffc-fe87d45304ee` from
|
||||||
|
`kaizen-agentic` handed off an urgent blocker for LOOP-WP-0002:
|
||||||
|
event-triggered definitions can receive the triggering EventEnvelope JSON, but
|
||||||
|
activity-core did not bind `source.type: event-payload` into the context
|
||||||
|
snapshot. The immediate customer is the disabled
|
||||||
|
`coulomb-low-success-rate-review` ActivityDefinition, whose
|
||||||
|
`flag-low-success-rate` rule needs to evaluate
|
||||||
|
`context.metrics.summary.success_rate`.
|
||||||
|
|
||||||
|
This is in activity-core scope because the repo owns ActivityDefinition context
|
||||||
|
resolution and the Event Bridge workflow boundary. The remaining event type
|
||||||
|
registry and live NATS smoke evidence are cross-repo/operator gates and should
|
||||||
|
wait in State Hub rather than depending on local kubectl or ad hoc live cluster
|
||||||
|
access from this repo.
|
||||||
|
|
||||||
|
## Implement Event Payload Resolver
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0011-T01
|
||||||
|
status: done
|
||||||
|
priority: high
|
||||||
|
state_hub_task_id: "5c87ce0b-3bd0-4a44-aae5-10d7586c939e"
|
||||||
|
```
|
||||||
|
|
||||||
|
Register resolver type `event-payload` so event-triggered definitions can bind
|
||||||
|
the triggering EventEnvelope attributes into `context.*`.
|
||||||
|
|
||||||
|
Done when:
|
||||||
|
|
||||||
|
- `activity_core.context_resolvers` imports and registers an `event-payload`
|
||||||
|
resolver.
|
||||||
|
- `resolve_context` parses `event_envelope_json` once and passes the parsed
|
||||||
|
envelope to registered resolvers.
|
||||||
|
- `source.type: event-payload` extracts envelope `attributes`.
|
||||||
|
- `bind_to: context.metrics` strips the `context.` prefix and unwraps a
|
||||||
|
single-key `{"metrics": ...}` attributes payload into `snapshot["metrics"]`.
|
||||||
|
- Missing or malformed envelopes fail required sources visibly and bind `{}` for
|
||||||
|
optional sources.
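
The binding rules in this list can be sketched roughly as follows. This is an assumption-laden illustration, not the code in `event_payload.py`: `bind_event_payload` and `RequiredSourceError` are hypothetical names, and the sketch parses the envelope itself for self-containment, whereas the workplan has `resolve_context` parse it once and pass the parsed envelope on.

```python
import json
from typing import Any, Dict, Optional


class RequiredSourceError(RuntimeError):
    """Hypothetical error type for a required source that cannot be resolved."""


def bind_event_payload(
    snapshot: Dict[str, Any],
    event_envelope_json: Optional[str],
    bind_to: str,
    required: bool = True,
) -> Dict[str, Any]:
    key = bind_to.removeprefix("context.")  # "context.metrics" -> "metrics"
    try:
        envelope = json.loads(event_envelope_json) if event_envelope_json else None
        attributes = envelope["attributes"]  # TypeError if the envelope is missing
        if not isinstance(attributes, dict):
            raise TypeError("attributes must be an object")
    except (TypeError, KeyError, ValueError) as exc:
        if required:
            raise RequiredSourceError(f"event-payload source for {bind_to!r} failed") from exc
        snapshot[key] = {}  # optional sources bind an empty mapping
        return snapshot
    # Unwrap a single-key payload such as {"metrics": {...}} when it matches the key.
    if set(attributes) == {key}:
        attributes = attributes[key]
    snapshot[key] = attributes
    return snapshot
```

Catching `ValueError` covers malformed JSON, since `json.JSONDecodeError` is a subclass of it.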
|
||||||
|
|
||||||
|
2026-06-18: Completed in `src/activity_core/activities.py` and
|
||||||
|
`src/activity_core/context_resolvers/event_payload.py`.
|
||||||
|
|
||||||
|
## Cover Binding And Rule Evaluation
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0011-T02
|
||||||
|
status: done
|
||||||
|
priority: high
|
||||||
|
state_hub_task_id: "c6f7dea6-9adc-4997-a22e-4bf2e94dc05a"
|
||||||
|
```
|
||||||
|
|
||||||
|
Add focused tests for the handoff acceptance contract.
|
||||||
|
|
||||||
|
Done when:
|
||||||
|
|
||||||
|
- sample `kaizen.metrics.recorded` envelope attributes resolve to:
|
||||||
|
`{"metrics": {"agent": "coach", "project": "kaizen-agentic", "summary": ...}}`;
|
||||||
|
- `flag-low-success-rate` evaluates
|
||||||
|
`context.metrics.summary.success_rate < 0.8`;
|
||||||
|
- optional missing envelopes bind `{}`;
|
||||||
|
- required missing envelopes raise a visible activity failure.
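
As a rough illustration of the rule check in this list, the dotted path can be walked through the resolved snapshot and compared with the threshold. The helper names are hypothetical and the real rule evaluator may work quite differently; only the path and the `0.8` threshold come from the workplan.

```python
from typing import Any, Dict


def lookup(snapshot: Dict[str, Any], dotted_path: str) -> Any:
    """Resolve a path such as 'context.metrics.summary.success_rate'."""
    parts = dotted_path.split(".")
    if parts[0] == "context":
        parts = parts[1:]  # the snapshot itself is the context root
    node: Any = snapshot
    for part in parts:
        node = node[part]  # KeyError makes a missing field visible
    return node


def flag_low_success_rate(snapshot: Dict[str, Any], threshold: float = 0.8) -> bool:
    return lookup(snapshot, "context.metrics.summary.success_rate") < threshold


sample = {"metrics": {"agent": "coach", "summary": {"success_rate": 0.75}}}
print(flag_low_success_rate(sample))
```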
|
||||||
|
|
||||||
|
2026-06-18: Completed in `tests/test_resolve_context_binding.py`. Focused
|
||||||
|
tests passed:
|
||||||
|
`.venv/bin/python -m pytest tests/test_resolve_context_binding.py tests/test_rule_evaluation_activity.py`
|
||||||
|
reported 8 passed, and adjacent rule tests
|
||||||
|
`.venv/bin/python -m pytest tests/rules/test_evaluator.py tests/rules/test_actions.py`
|
||||||
|
reported 55 passed.
|
||||||
|
|
||||||
|
## Wait For Event Type Registry
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0011-T03
|
||||||
|
status: done
|
||||||
|
priority: high
|
||||||
|
state_hub_task_id: "a4f277de-eb83-41bc-860e-b26586c72495"
|
||||||
|
```
|
||||||
|
|
||||||
|
Confirm that `kaizen.metrics.recorded` is registered in the shared event type
|
||||||
|
catalog through the owning State Hub / producer workflow.
|
||||||
|
|
||||||
|
Done when:
|
||||||
|
|
||||||
|
- State Hub or the producer-owned event catalog exposes
|
||||||
|
`kaizen.metrics.recorded` with an attributes schema covering
|
||||||
|
`metrics.agent`, `metrics.project`, and `metrics.summary.success_rate`;
|
||||||
|
- the registry decision names the owning repo for future schema changes;
|
||||||
|
- activity-core has no local-only event type drift from the producer contract.
|
||||||
|
|
||||||
|
Registry ownership: the event type is producer/catalog owned. Activity-core
|
||||||
|
accepted State Hub-backed registry confirmation before closing the workplan.
|
||||||
|
|
||||||
|
2026-06-18: Closed from State Hub acknowledgement
|
||||||
|
`3efb56d8-c3d6-4308-82ea-76eaaa172255` from `kaizen-agentic`. The producer
|
||||||
|
registered `kaizen.metrics.recorded` in `kaizen-agentic/event-types/` with
|
||||||
|
status `active`, publisher `kaizen-agentic`, and schema fields
|
||||||
|
`agent`, `project`, `summary.success_rate`, `summary.execution_count`, and
|
||||||
|
`summary.avg_quality`. The sync command reported was
|
||||||
|
`ACTIVITY_DEFINITION_DIRS=~/coulomb-loop:~/kaizen-agentic make sync-event-types`.
|
||||||
|
|
||||||
|
## Wait For Live Event Smoke
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0011-T04
|
||||||
|
status: done
|
||||||
|
priority: high
|
||||||
|
state_hub_task_id: "3b636d5e-8f93-49b4-ae53-3da4f736a4d9"
|
||||||
|
```
|
||||||
|
|
||||||
|
After T03, run the live event-triggered path without relying on local kubectl
|
||||||
|
from activity-core.
|
||||||
|
|
||||||
|
Done when State Hub records non-secret evidence that:
|
||||||
|
|
||||||
|
- a sample `kaizen.metrics.recorded` envelope was published on the expected NATS
|
||||||
|
subject;
|
||||||
|
- activity-core triggered `coulomb-low-success-rate-review`;
|
||||||
|
- the resolved context snapshot contained `context.metrics.summary.success_rate`;
|
||||||
|
- `flag-low-success-rate` matched and produced the expected task/report output;
|
||||||
|
- any disabled-definition or operator-controlled enablement state was recorded.
|
||||||
|
|
||||||
|
Execution ownership: this cross-repo/live-runtime smoke was owned by the event
|
||||||
|
producer, customer definition owner, and cluster/operator path. Activity-core
|
||||||
|
accepted the non-secret evidence from State Hub.
|
||||||
|
|
||||||
|
2026-06-18: Closed from State Hub acknowledgement
|
||||||
|
`68bfcd0d-7c47-4b42-85fc-64d63f38a909` from `kaizen-agentic`.
|
||||||
|
Supplier confirms R1 acceptance criteria met and LOOP-WP-0002 closed. Evidence:
|
||||||
|
NATS `activity.kaizen.metrics.recorded` triggered
|
||||||
|
`coulomb-low-success-rate-review` (`da7a9af7`), run
|
||||||
|
`e61554c6-1e67-5fa1-b34e-478d154a188e`, `tasks_spawned=1`, with
|
||||||
|
`metrics.summary.success_rate=0.75`.
|
||||||
|
|
||||||
|
## Close Handoff
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0011-T05
|
||||||
|
status: done
|
||||||
|
priority: medium
|
||||||
|
state_hub_task_id: "5169d8c5-769f-4272-97cf-c25b31087601"
|
||||||
|
```
|
||||||
|
|
||||||
|
Close the urgent R1/live-smoke handoff once State Hub has acknowledgement that
|
||||||
|
the resolver-side blocker is removed. The broader workplan remains blocked only
|
||||||
|
on T03 event-type registry confirmation.
|
||||||
|
|
||||||
|
Done when:
|
||||||
|
|
||||||
|
- State Hub message `d561ebd7-ba01-4dc6-8ffc-fe87d45304ee` is answered or
|
||||||
|
linked to this workplan;
|
||||||
|
- `kaizen-agentic` / LOOP-WP-0002 can proceed without an activity-core code
|
||||||
|
blocker;
|
||||||
|
- this workplan has no remaining activity-core code or live-smoke blocker.
|
||||||
|
|
||||||
|
2026-06-18: Closed from State Hub acknowledgement
|
||||||
|
`68bfcd0d-7c47-4b42-85fc-64d63f38a909`. The original handoff message
|
||||||
|
`d561ebd7-ba01-4dc6-8ffc-fe87d45304ee` was answered, and the live smoke
|
||||||
|
evidence in T04 unblocks LOOP-WP-0002.
|
||||||
|
|
||||||
|
2026-06-18: Workplan finished. T03 registry confirmation, T04 live event smoke,
|
||||||
|
and T05 handoff closure are all done in State Hub.
|
||||||
192
workplans/ACTIVITY-WP-0012-definition-schedule-hot-reload.md
Normal file
@@ -0,0 +1,192 @@
|
|||||||
|
---
|
||||||
|
id: ACTIVITY-WP-0012
|
||||||
|
type: workplan
|
||||||
|
title: "Definition And Schedule Hot Reload"
|
||||||
|
domain: custodian
|
||||||
|
repo: activity-core
|
||||||
|
status: finished
|
||||||
|
owner: codex
|
||||||
|
topic_slug: custodian
|
||||||
|
created: "2026-06-18"
|
||||||
|
updated: "2026-06-22"
|
||||||
|
state_hub_workstream_id: "8887075e-21ec-451b-b82b-cd81035c9ca5"
|
||||||
|
---
|
||||||
|
|
||||||
|
# ACTIVITY-WP-0012 - Definition And Schedule Hot Reload
|
||||||
|
|
||||||
|
## Context
|
||||||
|
|
||||||
|
State Hub message `f4876517-f738-4571-a2d6-76f2965e9a13` from
|
||||||
|
`coulomb-loop` reports an operational gap from the Coulomb cadence ramp: after
|
||||||
|
renaming customer definitions from hourly to daily, operators had to run
|
||||||
|
definition/schedule sync and restart the worker before new Temporal schedule
|
||||||
|
state was reliable.
|
||||||
|
|
||||||
|
Current behavior:
|
||||||
|
|
||||||
|
- `worker.py` runs `sync_activity_definitions` and `sync_schedules` once at
|
||||||
|
startup.
|
||||||
|
- `RunActivityWorkflow` loads ActivityDefinitions from the DB at activity time.
|
||||||
|
- The event router reloads enabled event definitions per NATS message.
|
||||||
|
- Cron schedule changes only take effect when `sync_schedules` runs.
|
||||||
|
|
||||||
|
This belongs in activity-core because the repo owns ActivityDefinition sync,
|
||||||
|
Temporal schedule projection, and the admin API. The first implementation
|
||||||
|
should expose an operator-triggered sync path without turning activity-core into
|
||||||
|
a repo checkout manager or CI system.
|
||||||
|
|
||||||
|
## Extract Reusable Sync Service
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0012-T01
|
||||||
|
status: done
|
||||||
|
priority: high
|
||||||
|
state_hub_task_id: "53a7970b-7eec-47f5-ad30-bbd7c6271952"
|
||||||
|
```
|
||||||
|
|
||||||
|
Refactor the worker-startup sync sequence into a reusable async service that can
|
||||||
|
be called by startup and the API.
|
||||||
|
|
||||||
|
Done when:
|
||||||
|
|
||||||
|
- the service can run ActivityDefinition sync, event type sync, and Temporal
|
||||||
|
schedule sync independently based on booleans;
|
||||||
|
- it accepts the existing DB session factory / Temporal client dependencies
|
||||||
|
without creating hidden global state;
|
||||||
|
- startup behavior remains unchanged except for calling the shared service;
|
||||||
|
- failures are collected into a bounded `errors[]` result while preserving the
|
||||||
|
current startup best-effort behavior.
|
||||||
|
|
||||||
|
2026-06-19: Completed. Added `activity_core.sync_service.run_sync`, which
|
||||||
|
orchestrates ActivityDefinition, event type, and schedule sync independently
|
||||||
|
from explicit DB session factory and Temporal client dependencies. Worker
|
||||||
|
startup now calls the shared service for definitions+schedules and logs bounded
|
||||||
|
stage errors while continuing startup.
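
A minimal sketch of the shape described above, assuming each stage is passed in as an async callable. The name `run_sync_sketch`, the `SyncResult` fields, and the error caps are illustrative assumptions; the real `activity_core.sync_service.run_sync` signature is not shown in this workplan.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, Dict, List

MAX_ERRORS = 20       # assumed bound on the errors list
MAX_ERROR_CHARS = 200  # assumed bound on each error message

Stage = Callable[[], Awaitable[None]]


@dataclass
class SyncResult:
    stages_run: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)


async def run_sync_sketch(
    stages: Dict[str, Stage],
    *,
    definitions: bool = True,
    event_types: bool = False,
    schedules: bool = True,
) -> SyncResult:
    """Run the selected stages in order, collecting failures instead of raising."""
    wanted = {"definitions": definitions, "event_types": event_types, "schedules": schedules}
    result = SyncResult()
    for name, enabled in wanted.items():
        if not enabled:
            continue
        try:
            await stages[name]()
            result.stages_run.append(name)
        except Exception as exc:  # best-effort: later stages still run
            if len(result.errors) < MAX_ERRORS:
                result.errors.append(f"{name}: {exc}"[:MAX_ERROR_CHARS])
    return result
```

Passing the stage callables in explicitly mirrors the "no hidden global state" requirement: startup and the API can each supply their own session factory and Temporal client through closures.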
|
||||||
|
|
||||||
|
## Add Admin Sync Endpoint
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0012-T02
|
||||||
|
status: done
|
||||||
|
priority: high
|
||||||
|
state_hub_task_id: "8697c761-15d1-4da0-b66b-d838218a2495"
|
||||||
|
```
|
||||||
|
|
||||||
|
Add an operator-only API endpoint:
|
||||||
|
|
||||||
|
`POST /admin/sync?definitions=true&schedules=true&event_types=true`
|
||||||
|
|
||||||
|
Done when:
|
||||||
|
|
||||||
|
- the endpoint runs the shared sync service without requiring worker restart;
|
||||||
|
- response JSON reports counts for definitions, event types, schedules upserted,
|
||||||
|
schedules paused/deleted, and errors;
|
||||||
|
- default parameters sync definitions and schedules, with event types opt-in or
|
||||||
|
clearly documented;
|
||||||
|
- endpoint tests cover definitions-only, schedules-only, all-sync, and failure
|
||||||
|
result behavior.
|
||||||
|
|
||||||
|
2026-06-19: Completed. Added `POST /admin/sync` with defaults
|
||||||
|
`definitions=true`, `schedules=true`, and `event_types=false`. The response
|
||||||
|
reports definition/event counts, schedule upsert/pause/orphan-delete counts, and
|
||||||
|
bounded `errors[]`. Tests cover definitions-only, schedules-only, all-sync, and
|
||||||
|
failure-result behavior.
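
For operators scripting the call, the documented defaults can be made explicit when building the request URL. This is only a sketch: the base URL is a placeholder, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def admin_sync_url(
    base_url: str,
    *,
    definitions: bool = True,
    schedules: bool = True,
    event_types: bool = False,
) -> str:
    """Build the POST target for /admin/sync with the documented defaults."""
    query = urlencode(
        {
            "definitions": str(definitions).lower(),
            "schedules": str(schedules).lower(),
            "event_types": str(event_types).lower(),
        }
    )
    return f"{base_url.rstrip('/')}/admin/sync?{query}"


# "http://actcore-api.example:8000" is a placeholder host, not a real endpoint.
print(admin_sync_url("http://actcore-api.example:8000", event_types=True))
```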
|
||||||
|
|
||||||
|
## Preserve Schedule Drift Semantics
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0012-T03
|
||||||
|
status: done
|
||||||
|
priority: high
|
||||||
|
state_hub_task_id: "efeac412-632c-4c90-9428-bb575ac7a624"
|
||||||
|
```
|
||||||
|
|
||||||
|
Make the sync result explicit enough for cadence changes and renames.
|
||||||
|
|
||||||
|
Done when:
|
||||||
|
|
||||||
|
- disabled cron definitions pause their Temporal schedules on sync;
|
||||||
|
- renamed definitions create the new schedule and pause/delete orphaned old
|
||||||
|
schedules according to the existing `sync_schedules` semantics;
|
||||||
|
- event-triggered definitions remain hot through the existing router DB reload
|
||||||
|
path;
|
||||||
|
- regression tests demonstrate the Coulomb hourly-to-daily rename shape without
|
||||||
|
needing a worker restart.
|
||||||
|
|
||||||
|
2026-06-19: Completed. `sync_schedules` now returns explicit counts for enabled
|
||||||
|
schedule upserts, disabled schedule pauses, and orphan deletes. Regression tests
|
||||||
|
cover the hourly-to-daily rename shape: a new enabled cron schedule is upserted,
|
||||||
|
the old disabled cron schedule is preserved as paused, unrelated orphan
|
||||||
|
schedules are deleted, event-triggered definitions do not create schedules, and
|
||||||
|
one-shot scheduled definitions are no longer mistaken for orphans.
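
The classification described in this note can be sketched as a pure planning step. The data model here is an assumption for illustration: the `Definition` fields, the trigger labels, and the use of definition ids as schedule ids are simplifications, not the real `sync_schedules` internals.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Definition:
    id: str
    trigger: str  # assumed labels: "cron", "event", or "scheduled" (one-shot)
    enabled: bool


def plan_schedule_sync(
    definitions: Iterable[Definition], existing_schedule_ids: Iterable[str]
) -> Dict[str, List[str]]:
    upsert: List[str] = []
    pause: List[str] = []
    known = set()
    for d in definitions:
        if d.trigger == "event":
            continue  # event-triggered definitions never own a schedule
        known.add(d.id)  # one-shot schedules are known, so they are not orphans
        if d.trigger != "cron":
            continue
        (upsert if d.enabled else pause).append(d.id)
    orphans = sorted(set(existing_schedule_ids) - known)
    return {"upsert": upsert, "pause": pause, "delete_orphans": orphans}


# Hourly-to-daily rename shape: new definition enabled, old one disabled.
defs = [
    Definition("daily", "cron", True),
    Definition("hourly", "cron", False),
    Definition("evt", "event", True),
    Definition("once", "scheduled", True),
]
print(plan_schedule_sync(defs, ["hourly", "once", "stray"]))
```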
|
||||||
|
|
||||||
|
## Optional Background Sync Loop
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0012-T04
|
||||||
|
status: done
|
||||||
|
priority: medium
|
||||||
|
state_hub_task_id: "d774087b-c51d-4444-8e90-bfef43765456"
|
||||||
|
```
|
||||||
|
|
||||||
|
Decide whether to add a periodic sync loop after the admin endpoint exists.
|
||||||
|
|
||||||
|
Done when:
|
||||||
|
|
||||||
|
- either `ACTIVITY_SYNC_INTERVAL_SECONDS` is implemented with a default disabled
|
||||||
|
or conservative interval, or the workplan records why manual/admin-triggered
|
||||||
|
sync is the safer v1 posture;
|
||||||
|
- if implemented, logs and metrics expose the last successful sync timestamp and
|
||||||
|
last error summary;
|
||||||
|
- the loop does not block worker startup or workflow task processing.
|
||||||
|
|
||||||
|
2026-06-19: Completed by decision. v1 stays manual/operator-triggered through
|
||||||
|
`POST /admin/sync`; no background loop was added. The runbook records this
|
||||||
|
posture so customer definition changes stay explicit and the worker does not
|
||||||
|
start background repo scanning. A periodic loop remains a future option if live
|
||||||
|
operator use proves it is needed.
|
||||||
|
|
||||||
|
## Live No-Restart Smoke
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0012-T05
|
||||||
|
status: done
|
||||||
|
priority: high
|
||||||
|
state_hub_task_id: "68a0e22a-106a-4d21-9f39-c6279850cb5e"
|
||||||
|
```
|
||||||
|
|
||||||
|
Validate the hot-reload path in the cluster/operator environment.
|
||||||
|
|
||||||
|
Done when non-secret State Hub evidence shows:
|
||||||
|
|
||||||
|
- a customer repo definition rename or `enabled` flip is synced through
|
||||||
|
`/admin/sync`;
|
||||||
|
- new Temporal schedules are active and retired schedules are paused/deleted
|
||||||
|
without worker SIGTERM or pod restart;
|
||||||
|
- event-triggered definitions still fire normally;
|
||||||
|
- rollback or repeat sync is idempotent.
|
||||||
|
|
||||||
|
2026-06-22: Completed on Railiance01 (`KUBECONFIG=~/.kube/config-hosteurope`).
|
||||||
|
|
||||||
|
Smoke target: disabled projection `ops-service-inventory-probes`
|
||||||
|
(`40d15a87-7ff6-4d8e-992c-37df15f95110`) in
|
||||||
|
`actcore-external-activity-definitions`.
|
||||||
|
|
||||||
|
Evidence:
|
||||||
|
|
||||||
|
- ConfigMap flip `enabled: false -> true` and cadence `15 * * * * -> 25 * * * *`,
|
||||||
|
then `POST /admin/sync?definitions=true&schedules=true` from `actcore-api`.
|
||||||
|
- DB after sync: `enabled=true`, `cron=25 * * * *`.
|
||||||
|
- Temporal schedule after sync: `paused=false`, calendar minute `25`.
|
||||||
|
- Repeat sync returned identical schedule counts
|
||||||
|
(`upserted=5`, `paused=1`, `deleted_orphans=0`) — idempotent.
|
||||||
|
- Rollback flip restored `enabled=false`, `cron=15 * * * *`, schedule
|
||||||
|
`paused=true`, calendar minute `15`.
|
||||||
|
- `actcore-worker` pod UID unchanged (`a68d6539-2bba-457e-a78a-39564002a980`,
|
||||||
|
started `2026-06-21T18:46:46Z`); `actcore-event-router` pod UID unchanged.
|
||||||
|
- Event-triggered definitions: none projected on Railiance01 today; hot DB
|
||||||
|
reload path for event definitions remains covered by T03 unit tests and an
|
||||||
|
unchanged event-router deployment.
|
||||||
|
|
||||||
|
Automation: `scripts/smoke_admin_sync_no_restart.py`. Runbook section added
|
||||||
|
under "Railiance01 no-restart smoke".
|
||||||
@@ -0,0 +1,78 @@
|
|||||||
|
---
|
||||||
|
id: ACTIVITY-WP-0013
|
||||||
|
type: workplan
|
||||||
|
title: "Reuse Surface Report Gaps Resolver"
|
||||||
|
domain: custodian
|
||||||
|
repo: activity-core
|
||||||
|
status: finished
|
||||||
|
owner: codex
|
||||||
|
topic_slug: activity-core
|
||||||
|
created: "2026-06-18"
|
||||||
|
updated: "2026-06-18"
|
||||||
|
state_hub_workstream_id: "01e68dfd-b146-4aef-a575-2d3b178ca5c2"
|
||||||
|
---
|
||||||
|
|
||||||
|
# Reuse Surface Report Gaps Resolver
|
||||||
|
|
||||||
|
Implement the R2 handoff from kaizen-agentic (`bffa224c`) so the
|
||||||
|
`reuse_surface_report_gaps` shell context source populates
|
||||||
|
`context.gaps` for the Coulomb daily registry hygiene sweep.
|
||||||
|
|
||||||
|
## Register Shell Resolver Query
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0013-T01
|
||||||
|
status: done
|
||||||
|
priority: high
|
||||||
|
state_hub_task_id: "a6e1fc5c-7b42-436d-914e-4d605cb6f329"
|
||||||
|
```
|
||||||
|
|
||||||
|
Add a dedicated reuse-surface context resolver module and register
|
||||||
|
`reuse_surface_report_gaps` on the `shell` resolver path while preserving
|
||||||
|
the existing kaizen shell query behavior.
|
||||||
|
|
||||||
|
## Implement Batch And Signal Semantics
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0013-T02
|
||||||
|
status: done
|
||||||
|
priority: high
|
||||||
|
state_hub_task_id: "229cf285-8388-471d-95fd-08400db1553e"
|
||||||
|
```
|
||||||
|
|
||||||
|
Load the Coulomb rollout roster, select active repos with a persisted
|
||||||
|
round-robin cursor, resolve repo roots from State Hub host paths, run
|
||||||
|
`reuse-surface report gaps --format json`, and emit gap records for the
|
||||||
|
enabled registry hygiene signals.
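
One way to picture the persisted round-robin cursor is a small JSON file holding the next start index. This is a sketch under assumptions: the cursor file format, the `next_index` key, and the `next_batch` helper are hypothetical, and the real resolver may store its cursor elsewhere.

```python
import json
from pathlib import Path
from typing import List


def next_batch(active_repos: List[str], cursor_path: Path, batch_size: int) -> List[str]:
    """Return the next slice of repos and advance the persisted cursor."""
    if not active_repos:
        return []
    start = 0
    if cursor_path.exists():
        saved = json.loads(cursor_path.read_text())
        start = saved.get("next_index", 0) % len(active_repos)
    count = min(batch_size, len(active_repos))
    # Wrap around the roster so every repo is visited before any repeats.
    batch = [active_repos[(start + i) % len(active_repos)] for i in range(count)]
    next_index = (start + count) % len(active_repos)
    cursor_path.write_text(json.dumps({"next_index": next_index}))
    return batch
```

Taking the modulo when reading the cursor keeps it valid if the active roster shrinks between sweeps.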
|
||||||
|
|
||||||
|
## Cover Required And Optional Failure Modes
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0013-T03
|
||||||
|
status: done
|
||||||
|
priority: high
|
||||||
|
state_hub_task_id: "85b5c7d4-40e1-4945-8ada-1dff2363c194"
|
||||||
|
```
|
||||||
|
|
||||||
|
Ensure missing required dependencies fail visibly while optional resolver
|
||||||
|
sources bind an empty `context.gaps` list. Add unit coverage for fixture
|
||||||
|
rollout data, mocked CLI JSON, resolver binding, and `hygiene_signal`
|
||||||
|
rule gating.
|
||||||
|
|
||||||
|
## Smoke Real Coulomb Rollout
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0013-T04
|
||||||
|
status: done
|
||||||
|
priority: medium
|
||||||
|
state_hub_task_id: "6a5446ed-b4ec-4693-b508-65415571d834"
|
||||||
|
```
|
||||||
|
|
||||||
|
Run a live resolver smoke against
|
||||||
|
`/home/worsch/coulomb-loop/loops/registry-hygiene/rollout.yaml` using a
|
||||||
|
temporary round-robin cursor. The real active rollout produced five gaps,
|
||||||
|
including one for `reuse-surface` with `hygiene_signal: stale_sbom`.
|
||||||
|
The smoke supplied `reuse_surface_bin:
|
||||||
|
/home/worsch/reuse-surface/.venv/bin/reuse-surface` and
|
||||||
|
`runner_host: bnt-lap001`; the worker environment or definition params must
|
||||||
|
provide equivalent values before enabling the production sweep.
|
||||||
194
workplans/ACTIVITY-WP-0014-schedule-misfire-robustness.md
Normal file
@@ -0,0 +1,194 @@
|
|||||||
|
---
|
||||||
|
id: ACTIVITY-WP-0014
|
||||||
|
type: workplan
|
||||||
|
title: "Schedule Misfire Robustness & Run-Miss Recovery Options"
|
||||||
|
domain: infotech
|
||||||
|
repo: activity-core
|
||||||
|
status: finished
|
||||||
|
owner: claude
|
||||||
|
topic_slug: activity-core
|
||||||
|
created: "2026-06-23"
|
||||||
|
updated: "2026-06-24"
|
||||||
|
status_note: "T01-T05 complete; beachhead-endpoint adoption split to ACTIVITY-WP-0015"
|
||||||
|
state_hub_workstream_id: "91b64686-5d17-4c86-bc9e-3d0ee6720cf5"
|
||||||
|
---
|
||||||
|
|
||||||
|
# Schedule Misfire Robustness & Run-Miss Recovery Options
|
||||||
|
|
||||||
|
Make cron-triggered ActivityDefinitions robust to missed fires (worker/Temporal
|
||||||
|
unavailable at trigger time) with explicit, per-definition recovery behaviour,
|
||||||
|
plus detection/alerting when a scheduled fire is missed.
|
||||||
|
|
||||||
|
## Motivation
|
||||||
|
|
||||||
|
On 2026-06-22 and 2026-06-23 the `daily-statehub-wsjf-triage` definition
|
||||||
|
(cron `20 7 * * *` Europe/Berlin, projected into the Railiance runtime ConfigMap
|
||||||
|
`actcore-external-activity-definitions`) produced **no `daily_triage` progress
|
||||||
|
event at all** — neither a success nor a `could not run; operator review
|
||||||
|
required` failure.
|
||||||
|
|
||||||
|
> **Corrected by T01 (2026-06-23).** The initial hypothesis below (that
> `_build_schedule()` never set `catchup_window`, so a short-default catchup
> window silently dropped the fire) was **disproven on the live cluster**. The
> Temporal schedule is healthy with `CatchupWindow 365d` (the server default) and
> `0 MissedCatchupWindow`. The real cause is that the run **fired and ran but
> failed at the report sink** with `Connection refused` posting to State Hub,
> because railiance01 reaches State Hub via a reverse tunnel back to the
> workstation, which is asleep at 07:20 Berlin. See the T01 findings and T05.
|
||||||
|
|
||||||
|
The trigger now originates entirely on **railiance01** (in-cluster Temporal
|
||||||
|
Schedule, ConfigMap-projected definition) and is **not** laptop-dependent — but
|
||||||
|
the triage's State Hub *data dependencies* (context resolution and report
|
||||||
|
delivery) still route back to the workstation State Hub.
|
||||||
|
|
||||||
|
This workplan still delivers worthwhile robustness — explicit run-miss recovery
|
||||||
|
policies (T02) and missed-fire detection (T03) — but the fix for *this* incident
|
||||||
|
is T05 (resilient sinks/resolvers + a workstation-independent State Hub endpoint).
|
||||||
|
|
||||||
|
## Desired run-miss options (from Bernd)
|
||||||
|
|
||||||
|
Three explicit, per-definition behaviours when a fire is missed:
|
||||||
|
|
||||||
|
1. **Run on trigger or skip** — never recover a missed fire.
|
||||||
|
2. **Run on trigger or later if missed** — recover **all** missed fires when back up.
|
||||||
|
3. **Run on trigger or later if missed, but skip if next trigger reached** —
|
||||||
|
recover only the **most recent** missed fire; do not accumulate a backlog.
|
||||||
|
|
||||||
|
Proposed mapping to a new `misfire_policy` value set (names open to review):
|
||||||
|
|
||||||
|
| Policy | Semantics | Temporal mapping |
| --- | --- | --- |
| `skip` | Run on trigger or skip | `catchup_window ≈ 0`, `overlap=SKIP` |
| `catchup_all` | Run on trigger or all missed later | `catchup_window=<long>`, `overlap=BUFFER_ALL` |
| `catchup_latest` | Run on trigger or only the latest missed | `catchup_window ≈ 1 interval`, `overlap=BUFFER_ONE` |
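
The proposed mapping can be written down as a small lookup. This sketch deliberately avoids the Temporal SDK: the `Overlap` enum stands in for the SDK's overlap policy values, and `short_window` and `long_window` are placeholder defaults for the table's "≈ 0" and "<long>" entries, not values taken from the implementation.

```python
from datetime import timedelta
from enum import Enum
from typing import Tuple


class Overlap(Enum):
    """Stand-in for the scheduler's overlap policy values."""
    SKIP = "SKIP"
    BUFFER_ONE = "BUFFER_ONE"
    BUFFER_ALL = "BUFFER_ALL"


def misfire_settings(
    policy: str,
    interval: timedelta,
    short_window: timedelta = timedelta(seconds=60),  # placeholder for "≈ 0"
    long_window: timedelta = timedelta(days=365),     # placeholder for "<long>"
) -> Tuple[timedelta, Overlap]:
    """Return (catchup_window, overlap) for a misfire policy name."""
    if policy == "skip":
        return short_window, Overlap.SKIP
    if policy == "catchup_all":
        return long_window, Overlap.BUFFER_ALL
    if policy == "catchup_latest":
        return interval, Overlap.BUFFER_ONE  # one interval: only the latest miss
    raise ValueError(f"unknown misfire_policy: {policy!r}")


print(misfire_settings("catchup_latest", timedelta(days=1)))
```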
|
||||||
|
|
||||||
|
## Confirm root cause on Railiance01
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: ACTIVITY-WP-0014-T01
|
||||||
|
status: done
|
||||||
|
priority: high
|
||||||
|
state_hub_task_id: "c90ff214-9214-48c7-96b9-7d699528d5ab"
|
||||||
|
```
|
||||||
|
|
||||||
|
Inspected via `ssh railiance01` + in-node `kubectl`/`temporal` (no k3s tunnel is
|
||||||
|
defined for railiance01; the documented access path is SSH to the host).
|
||||||
|
|
||||||
|
**Findings (2026-06-23) — the WP-0014 premise was wrong for this incident:**
|
||||||
|
|
||||||
|
- All pods healthy; `actcore-worker` up 44h, 0 restarts. Not a crash.
|
||||||
|
- The daily-triage Temporal schedule (`activity-schedule-6fca51fa-…`) is
|
||||||
|
**healthy**: `Paused false`, `OverlapPolicy Skip`, **`CatchupWindow 365d`**
|
||||||
|
(Temporal's *default* when unset), `ActionCounts {Total:8, MissedCatchupWindow:0}`.
|
||||||
|
So fires were **not** silently dropped — my original "no catchup window → silent
|
||||||
|
drop" hypothesis does not hold; the server default is already 365d.
|
||||||
|
- The `2026-06-23T05:20:00Z` fire **did fire and ran**, then **Failed at the report
|
||||||
|
sink**: `report sink failure: state-hub-progress … '[Errno 111] Connection
|
||||||
|
refused'`. The run produced a report but could not deliver it to State Hub, so
|
||||||
|
no `daily_triage` progress event (not even a "could not run" one) was posted →
|
||||||
|
the silence. The 06-22 fire has no execution in retention (bridge likely down
|
||||||
|
then too / schedule update window at `LastUpdateAt 1d ago`).
|
||||||
|
- Root cause is **State Hub connectivity from railiance01**, not Temporal. The
|
||||||
|
in-cluster `actcore-state-hub-bridge` (`hostNetwork`) proxies to
|
||||||
|
`127.0.0.1:18000` on the node — the local end of the ops-bridge **reverse tunnel
|
||||||
|
back to the workstation's State Hub**. At 07:20 Europe/Berlin (= 05:20 UTC) the
|
||||||
|
workstation/tunnel was unreachable → `Connection refused`. Chronic flakiness
|
||||||
|
confirmed: 102 State Hub resolver timeouts in 24h (69 `recently_on_scope`,
|
||||||
|
33 `consistency_sweep`).
|
||||||
|
|
||||||
|
**Implication:** the trigger *is* independent of the laptop, but the triage's
|
||||||
|
**data dependencies (State Hub context resolution + report delivery) still route
|
||||||
|
back to the workstation State Hub**, which is asleep at 07:20 Berlin. WP-0014's
|
||||||
|
misfire policies are still good robustness, but the real fix is (a) State Hub
|
||||||
|
reachable from railiance01 independent of the workstation, and/or (b) sinks/
|
||||||
|
resolvers resilient to transient State Hub unavailability (retry/backoff,
|
||||||
|
store-and-forward) instead of hard-failing the workflow. Tracked as follow-up
|
||||||
|
below. Backfill deferred: a replay only succeeds while the workstation State Hub
|
||||||
|
is reachable.
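
Option (b) above could look roughly like a bounded retry with exponential backoff around the sink call. This is a generic sketch, not the T05 implementation: `deliver_with_retry`, its parameters, and the injected `send` callable are hypothetical.

```python
import time
from typing import Any, Callable


def deliver_with_retry(
    send: Callable[[Any], Any],
    payload: Any,
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send(payload), retrying connection errors with doubling delays."""
    for attempt in range(attempts):
        try:
            return send(payload)
        except ConnectionError:  # includes ConnectionRefusedError (Errno 111)
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)
```

Retries only help with short outages. A tunnel that stays down for hours, as in this incident, would still need store-and-forward or a State Hub endpoint that does not depend on the workstation.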
|
||||||
|
|
||||||
|
## Implement explicit misfire recovery modes

```task
id: ACTIVITY-WP-0014-T02
status: done
priority: high
state_hub_task_id: "19615562-4cb2-4f25-872f-505d6e40dcc5"
```

Add `catchup_window_seconds` to `CronTriggerConfig` and redefine `misfire_policy`
as the three explicit modes above. In `_build_schedule()` set
`SchedulePolicy(overlap=..., catchup_window=timedelta(...))` per mode. Remove the
ad-hoc 1-hour `backfill` hack in favour of native catchup-window semantics. Keep
backward compatibility for existing `skip`/`catchup`/`compress` values (alias
map). Add unit tests for each mode's `(catchup_window, overlap)` mapping.

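A minimal sketch of the per-mode `(catchup_window, overlap)` mapping described for T02. Only `catchup_latest` → `BufferOne` + one-day window is confirmed by the deployment note in this workplan. The mode name `catchup_all`, the alias targets, the skip window, and the function name `resolve_misfire` are assumptions for illustration. Overlap policies are kept as plain strings so the sketch runs without the Temporal SDK.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class MisfireSettings:
    """The (catchup_window, overlap) pair a schedule would be built with."""
    catchup_window: timedelta
    overlap: str  # overlap policy name, a plain string in this sketch


# Assumption: one-day default, matching the deployed `CatchupWindow 1d`.
DEFAULT_WINDOW = timedelta(days=1)
# Assumption: a short window so fires missed during a long outage are dropped.
SKIP_WINDOW = timedelta(seconds=60)

# Hypothetical alias map from legacy values to explicit mode names.
LEGACY_ALIASES = {"catchup": "catchup_all", "compress": "catchup_latest"}


def resolve_misfire(policy: str,
                    catchup_window_seconds: Optional[int] = None) -> MisfireSettings:
    """Translate a misfire_policy value into schedule settings."""
    mode = LEGACY_ALIASES.get(policy, policy)
    if catchup_window_seconds is None:
        window = DEFAULT_WINDOW
    else:
        window = timedelta(seconds=catchup_window_seconds)
    if mode == "skip":
        return MisfireSettings(SKIP_WINDOW, "SKIP")
    if mode == "catchup_latest":
        return MisfireSettings(window, "BUFFER_ONE")
    if mode == "catchup_all":  # hypothetical mode name
        return MisfireSettings(window, "BUFFER_ALL")
    raise ValueError(f"unknown misfire_policy: {policy!r}")
```

Keeping the mapping a pure function makes the per-mode unit tests trivial: each test asserts one `(catchup_window, overlap)` pair without constructing a real schedule.
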
## Missed-fire detection & alert sink

```task
id: ACTIVITY-WP-0014-T03
status: done
priority: medium
state_hub_task_id: "dbedd96a-59ca-4b83-bce6-35755b076807"
```

Detect when a scheduled definition has no successful run within its expected
interval + tolerance, and emit a signal (State Hub progress event and/or
agent-inbox message) so a miss is visible even under `skip`. This is the
observability the current silent-drop behaviour lacks — a miss should never again
be invisible.

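The detection rule above reduces to one comparison, sketched here under stated assumptions: the function name `is_missed` and its parameters are hypothetical, and treating a definition with no recorded success as missed is a choice made for this sketch, not something the workplan specifies.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_missed(last_success: Optional[datetime],
              now: datetime,
              interval: timedelta,
              tolerance: timedelta) -> bool:
    """Return True when no successful run falls within interval + tolerance."""
    if last_success is None:
        # Assumption: a definition that never succeeded counts as a miss.
        return True
    return now - last_success > interval + tolerance


# Usage: a daily schedule with one hour of tolerance.
now = datetime(2026, 6, 24, 7, 0, tzinfo=timezone.utc)
on_time = datetime(2026, 6, 23, 7, 30, tzinfo=timezone.utc)
stale = datetime(2026, 6, 22, 5, 20, tzinfo=timezone.utc)
print(is_missed(on_time, now, timedelta(days=1), timedelta(hours=1)))
print(is_missed(stale, now, timedelta(days=1), timedelta(hours=1)))
```

Because the check only reads the last success time, it stays valid under `skip`, where the scheduler itself never reports the dropped fire.
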
## Apply policy to runtime definitions & document

```task
id: ACTIVITY-WP-0014-T04
status: done
priority: medium
state_hub_task_id: "04e9d1d2-1192-4402-9402-b12c5d7d44e5"
```

Set `misfire_policy: catchup_latest` for `daily-statehub-wsjf-triage` and
documented the run-miss options in `docs/runbook.md`.

**Deployed to railiance01 and verified (2026-06-24):** built
`activity-core:railiance01-prod` with the WP-0014 code (T02/T03/T05), imported it
into k3s containerd, applied the ConfigMap, rolled
`actcore-worker`/`api`/`event-router` onto the new image, and ran `/admin/sync`
(6 defs, 4 schedules upserted, 0 errors). The live Temporal schedule now reports
`OverlapPolicy BufferOne` + `CatchupWindow 1d` (= `catchup_latest`); pods
healthy, API `db:true temporal:true`.

## Keep activity-core thin under the State Hub beachhead model

```task
id: ACTIVITY-WP-0014-T05
status: done
priority: high
state_hub_task_id: "b7e5b877-1b09-421c-a04e-78f785dc00a1"
```

**Architecture decision (Bernd, 2026-06-23):** the resilience that this incident
needs — queuing writes and caching reads while State Hub is unreachable — must
**not** be a burden carried by client repos. It belongs to State Hub as a
**per-machine local "beachhead"** (transparent read cache + write outbox, possibly
with State-Hub federation), owned by custodian/state-hub. It handles all three
failure modes: network interruption, central State Hub crash, central machine
down. This is handed off to state-hub (see the coordination message / proposal);
**do not build client-side queue/cache logic in activity-core.**

activity-core's only responsibilities under this model are thin:

- **Idempotent writes — DONE (2026-06-23, in-repo):** added
  `activity_core/state_hub_write` (`idempotency_headers`); every State Hub write
  (report-sink, ops-evidence, schedule-miss) now sends a stable `Idempotency-Key`
  header derived from `run_id:instruction_id:event_type`. The read-based
  `_progress_exists` dedup is now best-effort (returns `False` on connection
  error instead of hard-failing), so the guarantee lives on the keyed write, not
  a live read. Tests in `tests/test_state_hub_write.py`; documented in
  `docs/runbook.md`.
- **Adopt the beachhead endpoint — MOVED to [[ACTIVITY-WP-0015]]:** pointing
  `STATE_HUB_URL` at the local beachhead and retiring the bespoke
  `actcore-state-hub-bridge` proxy depend on the state-hub beachhead existing
  first. Split into WP-0015 (status `blocked`) so this workplan can close on its
  completed in-repo work rather than waiting on an external capability.

T05 is done as far as activity-core can act now; the external-dependent adoption
lives in WP-0015.
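
A minimal sketch of a stable `Idempotency-Key` derived from `run_id:instruction_id:event_type`, as the idempotent-writes bullet describes. The function name matches the `idempotency_headers` helper named above, but its signature and the SHA-256 hashing step are assumptions; the real helper may send the joined string as-is.

```python
import hashlib
from typing import Dict


def idempotency_headers(run_id: str, instruction_id: str, event_type: str) -> Dict[str, str]:
    """Build the header for one logical State Hub write.

    The same (run_id, instruction_id, event_type) always yields the same key,
    so a retried or replayed write can be deduplicated by the receiver.
    """
    raw = f"{run_id}:{instruction_id}:{event_type}"
    # Assumption: hash the tuple to get a fixed-length, opaque key.
    key = hashlib.sha256(raw.encode("utf-8")).hexdigest()
    return {"Idempotency-Key": key}


# Usage: a retry of the same write reuses the key; a different event does not.
first = idempotency_headers("run-1", "daily-statehub-wsjf-triage", "daily_triage")
retry = idempotency_headers("run-1", "daily-statehub-wsjf-triage", "daily_triage")
other = idempotency_headers("run-1", "daily-statehub-wsjf-triage", "schedule_miss")
print(first == retry, first == other)
```

Deriving the key from identifiers that are fixed before the write is what lets the dedup guarantee live on the write itself rather than on a prior read.
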
@@ -0,0 +1,54 @@
---
id: ACTIVITY-WP-0015
type: workplan
title: "Adopt State Hub Beachhead Endpoint"
domain: infotech
repo: activity-core
status: blocked
owner: claude
topic_slug: activity-core
created: "2026-06-24"
updated: "2026-06-24"
state_hub_workstream_id: "bbc07f9e-9323-4b2b-b556-c33b37d0b228"
---

# Adopt State Hub Beachhead Endpoint

Carries the **blocked remainder** of [[ACTIVITY-WP-0014]] T05. The in-repo half
(idempotency-keyed State Hub writes) shipped in WP-0014; this workplan is the
client-side adoption that depends on the state-hub-owned **beachhead** capability
(per-machine read cache + write outbox) existing first.

**Blocked on:** the state-hub beachhead (proposal sent to the `state-hub` agent,
2026-06-23). Do not build queue/cache logic in activity-core — see
[[statehub-beachhead-principle]].

## Point STATE_HUB_URL at the beachhead

```task
id: ACTIVITY-WP-0015-T01
status: wait
priority: medium
state_hub_task_id: "76b6132d-394a-4a67-bef6-73bb9d1e277e"
```

Once the state-hub beachhead exposes a local endpoint, point activity-core's
`STATE_HUB_URL` (and the railiance runtime config) at it and verify reads are
served from cache and writes are queued/flushed correctly when central State Hub
is unreachable. Confirm idempotency-keyed writes dedup on flush (no duplicate
`daily_triage`/progress events).

## Retire the bespoke actcore-state-hub-bridge proxy

```task
id: ACTIVITY-WP-0015-T02
status: wait
priority: medium
state_hub_task_id: "526c2129-cbf7-4531-a319-aebfc75cc6a3"
```

Remove the inline `hostNetwork` HTTP proxy `actcore-state-hub-bridge` from
`k8s/railiance/20-runtime.yaml` — it is a primitive precursor of the beachhead
and should be replaced by the state-hub-owned component, not extended. Re-verify
the daily triage end-to-end after cutover, including an overnight scheduled run
while the workstation is asleep (the original failure condition).
@@ -0,0 +1,379 @@
---
id: ACTIVITY-WP-0016
type: workplan
title: "LLM Output Robustness & The Producer Trust Boundary"
domain: custodian
repo: activity-core
status: active
owner: codex
topic_slug: custodian
created: "2026-06-26"
updated: "2026-06-27"
state_hub_workstream_id: "4ef0d53b-1777-41ae-80c6-1b69fdb34726"
---

# ACTIVITY-WP-0016 — LLM Output Robustness & The Producer Trust Boundary

## Context

On 2026-06-26 the scheduled `daily-statehub-wsjf-triage` instruction fired on
time (`daily_triage` event 05:20:57Z) but its output **failed schema
validation**: `Expecting ',' delimiter: line 136 column 22 (char 5268)`. The
model emitted a long ranked WSJF recommendation list (reached rank 7+ with
nested `wsjf` objects) and the JSON broke deep in that list. Because the report
is a single monolithic JSON document, one malformed delimiter discarded the
**entire** run. This reset the three-clean-consecutive-scheduled-runs streak in
`ACTIVITY-WP-0006-T03` (06-24 ✅, 06-25 ✅, 06-26 ✗-validation) and is the
LLM-output-quality surface deferred from `ACTIVITY-WP-0010`.

The scheduling/runtime layer is healthy — this is purely an output-robustness
and boundary-design problem. Today's code (`src/activity_core/rules/executor.py`)
already: passes the output schema to llm-connect as a `json_schema` model param
(`_llm_run_config`), retries once, runs a fenced/`raw_decode` tolerant parser
(`_parse_json_output`), and preserves a bounded 4000-char preview on hard
failure (`_invalid_output_report`). None of that helps when error locality is
zero: the failure unit is the whole document, not the offending item.

## Design Frame — The Producer Trust Boundary

This workplan is anchored to a deliberate architectural stance, not just a bug
fix. Capture it in an ADR (T04) so future work inherits it.

**Premise.** activity-core has a *trust boundary* where free-form producer
output meets strict deterministic consumers (JSON Schema validators, the task
emitter, classic compute pipelines). The producers are **LLMs and humans (and
agents acting for either)**. Both are *untrusted producers*: their output may be

- **erroneous** — hallucination, truncation (token-limit cutoff), drift,
  type slips, typos; or
- **malicious** — prompt injection, crafted payloads, oversized/deeply-nested
  structures aimed at exhausting or confusing the consumer.

The architecture should treat the boundary as an adversarial frontier and place
**guardrails + error-correction tooling there**, rather than letting raw
producer output flow into deterministic consumers and fail (or worse, partially
succeed) downstream.

**Two non-fail-fast postures.** When we do *not* want to hard-fail on a problem,
there are two sensible strategies — and they compose:

- **A) Trust but handle exceptions** (optimistic / reactive). Consume the output
  as-is; on exception, catch → repair → retry → or quarantine. Cheap on the
  happy path. Blast radius depends entirely on how granular the catch is. Good
  when failures are rare and locally recoverable. Risk: failures surface late,
  possibly after partial side effects.
- **B) Verify and mitigate** (defensive / proactive). Validate, sanitize, clamp,
  and normalize the output to a known-good shape *before* it enters the pipeline
  — drop bad items, coerce types, bound sizes/depth, allow-list references — so
  the consumer only ever sees clean input. Higher upfront cost, smaller blast
  radius, no partial side effects. Good when failures are common or
  consequences are high.

**Governing principles for this repo:**

1. **Push verification to the boundary; keep the interior strict.** Apply
   posture **B** at the producer→consumer boundary (verify+mitigate structure);
   keep posture **A** for residual exceptions inside the verified core. Never
   relax the interior schema to absorb producer sloppiness.
2. **Make error locality match the unit of work.** One bad recommendation must
   cost one recommendation, not the whole report. Framing the payload so each
   item is independently parseable is the single highest-leverage change.
3. **Quarantine, never silently drop.** Invalid units are preserved as bounded,
   provenance-tagged artifacts (index, error, raw snippet) so they can be
   debugged or replayed — degraded-but-usable is distinct from total loss.
4. **Both human and agent input get the same rigor.** Guardrails are
   producer-agnostic: the same size/depth/count caps, reference allow-lists, and
   truncation detection apply whether the producer is an LLM, an agent, or a
   human form submission.

## Reproduce & Root-Cause The Failure

```task
id: ACTIVITY-WP-0016-T01
status: wait
priority: high
state_hub_task_id: "74fd16a5-4ea5-4dfe-8526-dfa27cf76138"
```

Recover the **full** raw llm-connect response for the 06-26 failure (the State
Hub event keeps only a 4000-char preview; the break is at char 5268) and
establish the precise cause.

Done when:

- the full raw response is pulled from the runtime llm-connect log / response
  store and the exact offending token at char 5268 is identified;
- `finish_reason` is captured to confirm or rule out token-limit **truncation**
  vs a structural mid-stream glitch;
- it is confirmed whether llm-connect actually **enforced** the `json_schema`
  constrained-decoding hint or merely accepted it as advisory (this determines
  whether the schema param is load-bearing);
- the failing payload is captured as a regression fixture under `tests/`.

2026-06-26 findings (local analysis on the workstation):

- **Mechanism confirmed structurally.** There are **16 active workstreams**
  org-wide and the triage instruction emits ~one ranked recommendation per
  candidate. The preserved preview holds 7 fully-formed recommendations; the JSON
  break is at char 5268 (~rank 8–9). The unbounded one-per-workstream list is the
  structural cause — more items = more tokens = higher odds of a mid-stream JSON
  slip and/or truncation. This directly justifies T02's bounded top-N + per-item
  framing.
- **Both attempts failed.** `executor._execute` retries once
  (`src/activity_core/rules/executor.py:166-171`); the recorded error is from the
  **retry** output, so the model produced invalid JSON twice — not a one-off.
- **activity-core discards the diagnostics needed to root-cause this.** Three
  retention gaps mean the exact char-5268 token cannot be recovered from
  activity-core data at all:
  1. `LLMConnectClient.complete()` returns only `data["content"]`
     (`llm_client.py:57-60`) — it drops `finish_reason`/`usage` from the
     llm-connect HTTP response, so truncation-vs-structural cannot be
     distinguished locally.
  2. the report sink caps raw output at **4000 chars** (`_invalid_output_report`,
     `executor.py:259`) — below the 5268 break.
  3. the worker log caps the preview at **2000 chars** (`executor.py:175`).
- **Remaining (remote, operator-owned).** Confirming the exact offending token
  and `finish_reason` requires llm-connect's producer-side logs on `railiance01`
  — cluster access, outside this repo's SCOPE for direct action. Truncation is
  the leading hypothesis given the 16-item input, but the mitigation (T02/T03) is
  identical either way, so T01 does not block the build work.
- **Feeds T03/T04.** The retention gaps are themselves defects to fix: capture
  `finish_reason`/`usage` and persist a larger bounded raw artifact on validation
  failure so this class of failure is never un-debuggable again.
- Partial fixture saved:
  `tests/fixtures/wp0016/daily_triage_2026-06-26_validation_failure.partial.json`
  (the 4000-char preview + validation error; full payload pending the remote pull).

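The first retention gap above (returning only `data["content"]`) could be closed by returning a small result object instead of a bare string. This is a sketch under stated assumptions: the `Completion` type and `parse_completion` function are hypothetical, the response is assumed to carry `finish_reason` and `usage` as top-level keys next to `content`, and `"length"` as the token-limit marker follows a common provider convention that llm-connect may or may not share.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Mapping, Optional


@dataclass(frozen=True)
class Completion:
    """What the client keeps from one llm-connect response (hypothetical type)."""
    content: str
    finish_reason: Optional[str] = None
    usage: Dict[str, Any] = field(default_factory=dict)

    @property
    def truncated(self) -> bool:
        # Assumption: "length" marks a token-limit cutoff.
        return self.finish_reason == "length"


def parse_completion(data: Mapping[str, Any]) -> Completion:
    """Keep the diagnostics alongside the content instead of dropping them."""
    return Completion(
        content=data["content"],
        finish_reason=data.get("finish_reason"),
        usage=dict(data.get("usage") or {}),
    )


# Usage: the executor can now tell truncation from a structural glitch.
result = parse_completion(
    {"content": '{"summary": "cut', "finish_reason": "length",
     "usage": {"completion_tokens": 1200}}
)
print(result.truncated, result.usage)
```

With `finish_reason` in hand, the truncation-vs-structural question becomes a local check rather than a remote log pull.
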
## Schema + Prompt Redesign For Error Locality

```task
id: ACTIVITY-WP-0016-T02
status: progress
priority: high
state_hub_task_id: "ae67ca8c-ee01-4a8d-9e8a-a0a36c999758"
```

Redesign the daily-triage report contract so a single malformed item can no
longer discard the whole report (principle #2).

Done when:

- the recommendation list is **bounded** (configurable top-N, default 5–7) in
  both the prompt and the output schema — long lists are where the model drifts;
- the report uses a **per-item-framed** shape (JSON Lines / NDJSON — one
  recommendation object per line — or an equivalent delimited per-item form)
  behind a minimal stable envelope (`summary` + framed items), so each item is
  an independent parse unit;
- the prompt explicitly states the contract, the per-item framing, the cap, and
  an "if uncertain, emit fewer well-formed items rather than more" instruction;
- `max_tokens` is set with headroom for the bounded list so truncation cannot
  occur at the expected size;
- the output schema file (`_load_output_schema` target) is updated to match.

2026-06-26 progress (in-repo portion):

- **Strict, bounded schema written** — `schemas/daily-triage-report.json` went
  from `recommendations.items: {type: object}` (accept-anything) to a strict
  per-item contract: `required [rank, candidate, action, why]` with typed
  `wsjf` sub-fields, plus `maxItems: 7`. The strict item shape is what lets the
  T03 boundary parser validate each recommendation independently.
- **`maxItems` is a hint, not a hard reject** — the in-repo validator
  (`_validate_schema_node`) only enforces `type`/`required`/`properties`/`items`
  and ignores `maxItems`/`enum`. That is deliberate: a hard `maxItems` reject
  would discard a whole 16-item report — the exact blast-radius bug WP-0016
  removes. The bound is enforced via the prompt + the llm-connect `json_schema`
  constraint hint + T03 mitigation (keep top-N by rank, quarantine extras).
- **DEPLOY COUPLING (important):** this schema file is consumed *both* as the
  llm-connect hint *and* by the current whole-document validator. Tightening
  per-item `required` fields makes the existing whole-doc validation hard-fail
  **more often** until T03 replaces it with per-item quarantine. Therefore the
  schema change MUST ship together with T03 — do not deploy the strict schema to
  the runtime bundle ahead of the T03 parser. Four executor/instruction tests that
  asserted the old loose contract were updated to the strict contract; the
  forwarded-schema test now reads the live file instead of hard-coding it.
- **Truncation hypothesis corroborated** — the instruction config carries a
  `max_tokens` of roughly 1200 (per the wiring test fixture). 5268 chars ≈
  1300–1500 tokens, so a ~1200-token cap would truncate a 16-item list right at
  the observed break. This strengthens T01's leading hypothesis and makes the
  `max_tokens` headroom change below concrete.

**Bundle handoff (NOT in this repo — runtime-projected definition).** The triage
prompt and `max_tokens` live in the Railiance runtime bundle, not in repo files.
Apply there:

1. Instruct a **bounded top-N** (≤ 7) ranked recommendations, "if uncertain emit
   fewer well-formed items rather than more."
2. Specify the **per-item framing** the T03 parser will consume (NDJSON: a
   leading summary object, then one recommendation JSON object per line).
3. Raise **`max_tokens`** to give clear headroom for 7 framed items (eliminate
   truncation at the expected size).
4. State the value vocabularies (`action`, `confidence`) the T04 guardrails will
   check.

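To make the per-item framing concrete, here is a minimal sketch of consuming the NDJSON shape named in the handoff (a leading summary object, then one recommendation object per line). The function name `parse_framed_report` and the 200-character snippet bound are assumptions; the sketch only shows why a bad line costs one item rather than the whole report.

```python
import json
from typing import Any, Dict, List


def parse_framed_report(text: str) -> Dict[str, Any]:
    """Parse a summary line followed by one recommendation object per line."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty report")
    summary = json.loads(lines[0])  # the envelope itself must still parse
    items: List[Dict[str, Any]] = []
    quarantined: List[Dict[str, Any]] = []
    for index, line in enumerate(lines[1:]):
        try:
            obj = json.loads(line)
            if not isinstance(obj, dict):
                raise ValueError("item is not a JSON object")
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            # Keep a bounded, provenance-tagged record instead of raising.
            quarantined.append({"index": index, "error": str(exc), "raw": line[:200]})
            continue
        items.append(obj)
    return {
        "summary": summary,
        "recommendations": items,
        "partial": bool(quarantined),
        "quarantined_count": len(quarantined),
        "quarantined_items": quarantined,
    }


# Usage: the malformed second item is quarantined; the other two survive.
report = parse_framed_report(
    '{"summary": "ok"}\n{"rank": 1}\n{"rank": 2,,}\n{"rank": 3}\n'
)
print(len(report["recommendations"]), report["quarantined_count"])
```

Line-splitting like this only works if the model really emits one object per line, which is why the framing has to be stated in the prompt contract.
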
## Boundary Parser — Verify & Mitigate (Posture B)

```task
id: ACTIVITY-WP-0016-T03
status: done
priority: high
state_hub_task_id: "d65a6281-f1f9-4a9b-a835-da065411b709"
```

Implement item-granular parsing with a quarantine lane in
`src/activity_core/rules/executor.py`, applying posture **B** at the boundary
(principles #1–#3).

Done when:

- the parser splits the envelope from the framed items, then parses **each item
  independently**; a malformed item is routed to a bounded `quarantined_items`
  artifact (index + validation error + raw snippet), not raised;
- a run with some valid and some invalid items emits a report over the surviving
  valid items with `output_validated=true`, plus `partial=true` and
  `quarantined_count` / `quarantined_items` markers — degraded-but-usable is
  reported distinctly from total loss;
- a best-effort **repair** pass (close unterminated brackets/quotes, recover the
  valid prefix) is attempted per item before quarantining it;
- truncation detected in T01 is handled as its own signal (recover whole items
  emitted before the cutoff rather than failing the document);
- the existing monolithic-document path remains as the fallback when framing is
  absent (backward compatible with task-only instructions).

2026-06-26 progress (implemented in `src/activity_core/rules/executor.py`):

- **Resilient recovery wired into `_execute`.** When the whole-document parse +
  one retry still fail, report instructions (those with `report_sinks`) now run
  `_resilient_report` *before* the total-loss `_invalid_output_report`. If it
  recovers ≥1 valid item it returns a partial report; otherwise it returns None
  and the prior total-loss path is preserved unchanged.
- **Brace/quote-aware object scanner, not line-splitting.** The real 06-26 output
  was pretty-printed (multi-line objects), so naive NDJSON line recovery would
  have failed. `_extract_object_spans` walks the `recommendations` array
  brace-depth- and string-aware, so it recovers each recommendation object
  whether pretty-printed across many lines *or* emitted one-per-line (NDJSON).
  The truncated trailing object is returned with `complete=False`.
- **Layered mitigation per item:** `json.loads` → on failure for a truncated
  tail, a best-effort `_try_repair` (balance open string/brackets/braces) →
  then `_partition_items` validates each recovered object against the T02 item
  schema. Valid items survive; malformed or over-`maxItems` items are
  quarantined with provenance (`index`, `error`, `raw` snippet, `reason`).
- **Report shape on degradation:** `output_validated=True` over the survivors,
  `review_required=True`, `partial=True`, `quarantined_count`, and a bounded
  `quarantined_items` list (cap 20). Degraded-but-usable is now reported
  distinctly from total loss.
- **Verified against the real failure shape.** New tests reconstruct a
  pretty-printed report with 7 valid recommendations + a truncated tail (the
  06-26 shape) and a one-bad-item-among-valid case. The 7-item run now recovers
  all 7 and quarantines the broken tail (previously: whole run discarded);
  log line `instruction_output_recovered: kept=7, quarantined=1`. The bad-item
  run keeps 2 and quarantines the rank-less one.
- **Deferred to T04 (clean scope boundary):** enforcing `maxItems` top-N on the
  *happy* path (valid JSON, all items schema-valid, but > N items) — the resilient
  path only runs on failure, so over-limit-on-success is a guardrail/count-cap
  concern, which is exactly T04's remit.

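The scanner and repair steps described above can be sketched as follows. The names echo `_extract_object_spans` and `_try_repair` from the progress notes, but these bodies are an illustrative reconstruction under assumptions, not the code in `executor.py`: the sketch locates the array by a plain text search for the key and tracks only brace depth and string state.

```python
import json
from typing import Any, Dict, List, Optional, Tuple


def extract_object_spans(text: str, key: str = "recommendations") -> List[Tuple[str, bool]]:
    """Return (object_text, complete) for each top-level object in the array."""
    key_pos = text.find(f'"{key}"')
    start = text.find("[", key_pos) if key_pos >= 0 else -1
    if start < 0:
        return []
    spans: List[Tuple[str, bool]] = []
    depth, in_string, escaped = 0, False, False
    obj_start: Optional[int] = None
    for i in range(start + 1, len(text)):
        ch = text[i]
        if in_string:  # braces inside strings must not change depth
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "{":
            if depth == 0:
                obj_start = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0 and obj_start is not None:
                spans.append((text[obj_start:i + 1], True))
                obj_start = None
        elif ch == "]" and depth == 0:
            break  # end of the recommendations array
    if obj_start is not None:  # input ended inside an object: truncated tail
        spans.append((text[obj_start:], False))
    return spans


def try_repair(fragment: str) -> Optional[Dict[str, Any]]:
    """Best effort: close an open string, then any open brackets and braces."""
    closers: List[str] = []
    in_string, escaped = False, False
    for ch in fragment:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in "{[":
            closers.append("}" if ch == "{" else "]")
        elif ch in "}]" and closers:
            closers.pop()
    candidate = fragment + ('"' if in_string else "") + "".join(reversed(closers))
    try:
        obj = json.loads(candidate)
    except json.JSONDecodeError:
        return None  # not repairable: the caller quarantines the raw fragment
    return obj if isinstance(obj, dict) else None
```

A repaired tail is only syntactically closed; it still has to pass the per-item schema check before it counts as a surviving recommendation.
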
## Producer Guardrails + ADR-004

```task
id: ACTIVITY-WP-0016-T04
status: done
priority: medium
state_hub_task_id: "f5c3af5b-9e28-42b0-9af5-4c99284e99b9"
```

Write the architecture decision record and add the producer-agnostic guardrails
(principle #4).

Done when:

- `docs/adr/adr-004-producer-trust-boundary.md` documents the trust boundary,
  the untrusted-producer premise (erroneous **and** malicious; human and agent),
  the A vs B taxonomy and where each applies, the error-locality principle, and
  the quarantine-with-provenance rule;
- boundary guardrails are enforced at the consumer edge: max item **count**, max
  string length, max nesting **depth**, and a **reference allow-list** (e.g. a
  recommendation `candidate` / a task `target_repo` must resolve to a known
  workstream/repo before it is acted on);
- guardrail rejections are quarantined with provenance, consistent with T03;
- SCOPE.md / INTENT.md are checked for drift and updated if the boundary stance
  changes the documented contract.

2026-06-26 progress:

- **ADR-004 written** — `docs/adr/adr-004-producer-trust-boundary.md` documents
  the untrusted-producer premise (erroneous + malicious; LLM/agent/human), the
  A-vs-B posture taxonomy, the four governing principles, the concrete
  activity-core mechanisms, a posture-by-layer table, consequences, and
  alternatives considered. Accepted, scope cross-repo.
- **Producer guardrails implemented** in `executor.py`, applied uniformly on the
  happy path *and* the recovery path via `_partition_items`: per-item order is
  structural-type → schema → structural caps (`_MAX_DEPTH=8`,
  `_MAX_STRING_LEN=4000`) → reference allow-list → count cap (`maxItems`). Each
  quarantine carries a `reason` (`malformed`/`schema`/`guardrail`/`allow_list`/
  `over_limit`).
- **Happy-path count cap closed** (the item deferred from T03): a syntactically
  valid 9-item report now keeps 7 and quarantines 2 as `over_limit`, emitting a
  `partial` report — without a retry.
- **Reference allow-list wired but inert.** `_allow_list_from_context` reads
  `context["known_candidates"]`; when present, recommendations with an unknown
  `candidate` are quarantined (`reason: allow_list`). Absent today → check is
  inert; activation is a one-line context-resolver change. Keeps the guardrail
  producer-agnostic (principle #4) and ready.
- **SCOPE.md updated** — instruction-executor bullet now names the quarantine
  lane + guardrails; ADR-004 added to the Architecture Decisions list. No INTENT
  drift: this hardens the existing output contract, it does not extend scope.
- New tests: happy-path count cap, oversized-string guardrail, allow-list
  rejection (all green).

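The per-item check order listed above (structural type → schema → structural caps → allow-list → count cap) can be sketched as a single partition function. This is an illustration under assumptions, not the `_partition_items` in `executor.py`: the schema step is reduced to the four required fields plus basic types, and the helper names and 200-character snippet bound are hypothetical. The caps of 8 and 4000 are the values given in the progress notes.

```python
from typing import Any, Dict, List, Optional, Set, Tuple

MAX_DEPTH = 8          # value from the notes (`_MAX_DEPTH`)
MAX_STRING_LEN = 4000  # value from the notes (`_MAX_STRING_LEN`)
REQUIRED = ("rank", "candidate", "action", "why")


def nesting_depth(value: Any) -> int:
    """Number of nested containers (a flat dict has depth 1)."""
    if isinstance(value, dict):
        return 1 + max((nesting_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((nesting_depth(v) for v in value), default=0)
    return 0


def longest_string(value: Any) -> int:
    if isinstance(value, str):
        return len(value)
    if isinstance(value, dict):
        return max((longest_string(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return max((longest_string(v) for v in value), default=0)
    return 0


def partition_items(items: List[Any], max_items: int = 7,
                    known_candidates: Optional[Set[str]] = None
                    ) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Split items into (kept, quarantined); nothing is silently dropped."""
    valid: List[Tuple[int, Dict[str, Any]]] = []
    quarantined: List[Dict[str, Any]] = []

    def reject(index: int, item: Any, reason: str, error: str) -> None:
        quarantined.append({"index": index, "reason": reason,
                            "error": error, "raw": repr(item)[:200]})

    for index, item in enumerate(items):
        if not isinstance(item, dict):
            reject(index, item, "malformed", "item is not an object")
        elif any(k not in item for k in REQUIRED):
            reject(index, item, "schema", "missing required field")
        elif not isinstance(item["rank"], int) or not isinstance(item["candidate"], str):
            reject(index, item, "schema", "wrong field type")
        elif nesting_depth(item) > MAX_DEPTH or longest_string(item) > MAX_STRING_LEN:
            reject(index, item, "guardrail", "depth or string length cap exceeded")
        elif known_candidates is not None and item["candidate"] not in known_candidates:
            reject(index, item, "allow_list", "unknown candidate")
        else:
            valid.append((index, item))

    # Count cap last: keep the top-N by rank, quarantine the rest.
    valid.sort(key=lambda pair: pair[1]["rank"])
    kept = [item for _, item in valid[:max_items]]
    for index, item in valid[max_items:]:
        reject(index, item, "over_limit", f"more than {max_items} items")
    return kept, quarantined
```

Passing `known_candidates=None` leaves the allow-list step inert, which mirrors the "wired but inert" state described in the notes.
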
## Tests + Calibration Re-Entry

```task
id: ACTIVITY-WP-0016-T05
status: progress
priority: high
state_hub_task_id: "c881500b-5459-4620-81c0-b176971e989f"
```

Prove the new posture and hand back to the calibration gates.

Done when:

- regression tests cover: the captured 06-26 payload, a truncated-mid-list
  payload, a one-bad-item-among-good payload (asserts quarantine + partial), an
  oversized/over-deep payload (asserts guardrail rejection), and an
  injection-shaped reference (asserts allow-list rejection);
- the full suite passes and the result is recorded here with the count;
- a daily-triage smoke against the live runtime shows a previously-failing
  payload now **degrades gracefully** (valid items delivered, bad items
  quarantined) instead of discarding the run;
- a progress note hands back to `ACTIVITY-WP-0010-T04` and `ACTIVITY-WP-0006-T03`
  that the output-robustness blocker is cleared so the three-clean-run gate can
  resume on its own.

2026-06-26 progress (in-repo portion complete):

- **Regression coverage complete.** Across T03/T04/T05: truncated-mid-list,
  one-bad-item-among-good (quarantine + partial), oversized-string and over-depth
  guardrail rejection, allow-list (injection-shaped) rejection, happy-path count
  cap, and a test driving the **actual captured 2026-06-26 payload**
  (`tests/fixtures/wp0016/daily_triage_2026-06-26_validation_failure.partial.json`)
  — it now recovers 6+ valid recommendations and quarantines the truncated tail,
  where before it discarded the whole run.
- **Full suite green:** 218 passed, 1 skipped (recorded at T04; the T05 fixture +
  over-depth tests add to this — see the commit).
- **Hand-back notes posted** to `ACTIVITY-WP-0006-T03` (State Hub event
  `b6b8c2b8`) and `ACTIVITY-WP-0010-T04` (`b813f0dc`).
- **Remaining (remote, operator-owned):** the live daily-triage smoke on
  `railiance01` proving end-to-end graceful degradation. It depends on deploying
  the T02 bundle prompt/`max_tokens`/NDJSON changes together with this code, which
  is cluster/operator work outside this repo's SCOPE. T05 therefore stays
  `progress` until that live run exists; the in-repo deliverables are done.

## Relationships

- **Blocks / feeds:** `ACTIVITY-WP-0006-T03` (three clean scheduled runs) and
  `ACTIVITY-WP-0010-T04` (collect three clean scheduled runs) — both stalled on
  the same output-quality failure this workplan removes.
- **References:** `ACTIVITY-WP-0009` (scheduled-run trust gap).
- **Boundary discipline:** keeps activity-core inside its SCOPE — this hardens
  the instruction-executor output contract; it does not move provider
  credentials, cluster reconciliation, or task lifecycle into this repo.
58
workplans/ACTIVITY-WP-0017-core-hub-ops-evidence-sink.md
Normal file
@@ -0,0 +1,58 @@
---
id: ACTIVITY-WP-0017
type: workplan
title: "Core Hub ops evidence sink"
domain: infotech
repo: activity-core
status: finished
owner: codex
topic_slug: custodian
created: "2026-06-27"
updated: "2026-06-27"
state_hub_workstream_id: "2a073bf4-febf-433e-a721-5daf71760912"
---

# Core Hub ops evidence sink

## Goal

Provide the activity-core side of the Core Hub replacement evidence path for
`CORE-WP-0008-T03`, without depending on the legacy Haskell Inter-Hub sink and
without placing secret material in activity definitions, logs, State Hub, or
chat.

## Task: Add Core Hub interaction-event sink

```task
id: ACTIVITY-WP-0017-T01
status: done
priority: high
state_hub_task_id: "32aab1af-6be5-4b52-afa1-c11f52c65892"
```

Add a `core-hub-interaction-event` ops evidence sink that posts sanitized
ops-inventory probe evidence to Core Hub `/api/v2/interaction-events`, verifies
the created event is visible, and reports only non-secret ids/statuses.

Acceptance:

- runtime token is read through `CORE_HUB_RUNTIME_TOKEN_FILE` or a named
  environment variable, never from workplan content;
- sink configuration accepts `CORE_HUB_BASE_URL` and a widget id or widget
  mapping;
- emitted metadata reuses the existing compact/sanitized probe evidence path;
- missing Core Hub config skips cleanly with explicit non-secret missing keys;
- tests prove the POST/visibility check and secret non-disclosure.

Verification 2026-06-27: `tests/test_ops_evidence_sinks.py` passed, and
a disposable local Core Hub runtime accepted an activity-core
`core-hub-interaction-event` sink emission, then listed the created
`ops-endpoint-verified` event back through `/api/v2/interaction-events`.
The verification asserted sanitized metadata did not include response body,
authorization header, URL userinfo, or token query material.

Completed 2026-06-27: implemented the Core Hub interaction-event sink in
`activity_core.ops_evidence_sinks` with unit coverage for POST/visibility
verification, missing config behavior, and secret non-disclosure. This provides
the direct Core Hub consumer path needed by `CORE-WP-0008-T03`; deployed use
still requires an approved Core Hub runtime token and widget id/mapping.
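
A minimal sketch of the "skip cleanly with explicit non-secret missing keys" acceptance item. `CORE_HUB_BASE_URL` and `CORE_HUB_RUNTIME_TOKEN_FILE` come from the acceptance list; the variable name `CORE_HUB_WIDGET_ID`, the function name `load_core_hub_config`, and the returned dict shape are assumptions made for illustration. Only key names are ever reported back, never values.

```python
from pathlib import Path
from typing import Dict, List, Mapping, Optional, Tuple


def load_core_hub_config(env: Mapping[str, str]) -> Tuple[Optional[Dict[str, str]], List[str]]:
    """Return (config, missing_keys); config is None when anything is missing."""
    missing: List[str] = []

    base_url = env.get("CORE_HUB_BASE_URL", "").strip()
    if not base_url:
        missing.append("CORE_HUB_BASE_URL")

    # Hypothetical variable name for the widget id.
    widget_id = env.get("CORE_HUB_WIDGET_ID", "").strip()
    if not widget_id:
        missing.append("CORE_HUB_WIDGET_ID")

    # The token is read from a file path, so the secret never sits in config.
    token = ""
    token_file = env.get("CORE_HUB_RUNTIME_TOKEN_FILE", "").strip()
    if token_file:
        try:
            token = Path(token_file).read_text(encoding="utf-8").strip()
        except OSError:
            token = ""
    if not token:
        missing.append("CORE_HUB_RUNTIME_TOKEN_FILE")

    if missing:
        return None, missing  # caller logs the key names and skips the sink
    return {"base_url": base_url.rstrip("/"), "widget_id": widget_id, "token": token}, []


# Usage: with nothing configured the sink would skip and name what is absent.
config, missing = load_core_hub_config({})
print(config, missing)
```

Reporting the key name rather than the file path or token contents keeps the skip message safe to write to logs and State Hub.
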
@@ -3,6 +3,7 @@ type: session-note
 created: "2026-03-28"
 updated: "2026-06-03"
 status: archived
+state_hub_workstream_id: "b221e65a-6f97-44b0-8dae-442fffcb7f64"
 ---

 # WP-0002 Handoff Note — Continue on CoulombCore