feat(WP-0002): complete Triggers & Ops workstream

Delivers all 12 tasks (T22–T33): Temporal Schedule manager + startup sync, NATS JetStream event router, FastAPI CRUD + manual trigger, Prometheus metrics wiring, custom search-attribute tagging, and operational runbook. Marks workplan status as done.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -3,56 +3,56 @@ id: custodian-WP-0002
 type: workplan
 domain: custodian
 repo: activity-core
-status: active
+status: done
 state_hub_workstream_id: 3a4f47d9-8bc1-434e-acb4-bed5d4dacda0
 tasks:
 - id: T22
   title: Write schedule_manager.py
-  status: todo
+  status: done
   state_hub_task_id: e50550d1-9904-41d7-afd8-492a1f1e91b8
 - id: T23
   title: Bootstrap script to sync schedules on startup
-  status: todo
+  status: done
   state_hub_task_id: 5a1f7fa3-acb9-4f60-9892-c9eaa120272e
 - id: T24
   title: Handle misfire policy in schedule config
-  status: todo
+  status: done
   state_hub_task_id: 00231668-95c5-447f-b3d0-1fb8c20b487f
 - id: T25
   title: Test schedule pause/resume lifecycle
-  status: todo
+  status: done
   state_hub_task_id: 7abfd375-ea9d-4209-8371-e5664dc2c6c4
 - id: T26
   title: Implement Event Router service
-  status: todo
+  status: done
   state_hub_task_id: 68b6610b-159c-4f1c-92a9-7128efea0961
 - id: T27
   title: Implement routing rules (event.type + filters → activity_ids)
-  status: todo
+  status: done
   state_hub_task_id: 9348efea-a7e9-4f92-b866-8fc82cf28fee
 - id: T28
   title: Start/signal workflow from Event Router
-  status: todo
+  status: done
   state_hub_task_id: cac1f45a-7391-471a-9566-97cdbd96eb2d
 - id: T29
   title: Integration test — publish event → observe workflow run
-  status: todo
+  status: done
   state_hub_task_id: 7f10b5a3-7cad-4914-b603-d57508c85629
 - id: T30
   title: REST API (FastAPI) — CRUD for ActivityDefinition
-  status: todo
+  status: done
   state_hub_task_id: b27e54a1-5dcc-476d-8f4a-c995aea6a8c2
 - id: T31
   title: Wire Temporal SDK metrics to Prometheus
-  status: todo
+  status: done
   state_hub_task_id: 0eafb60c-f00e-4fd7-a921-7de75fcfe81e
 - id: T32
   title: Tag workflows with activity_id for Temporal visibility search
-  status: todo
+  status: done
   state_hub_task_id: 7bdfc5c2-1f06-4fce-aac3-fae036dcb47e
 - id: T33
   title: Write operational runbook
-  status: todo
+  status: done
   state_hub_task_id: 766d602d-1b23-4247-a46d-03c0d3b8e498
 ---
 
@@ -61,6 +61,7 @@ tasks:
 **Workstream:** activity-core Triggers & Ops
 **Hub ID:** `3a4f47d9-8bc1-434e-acb4-bed5d4dacda0`
 **Depends on:** custodian-WP-0001 (Foundation — Temporal Backbone)
+**Status:** DONE (2026-03-28)
 
 ## Purpose
 
@@ -68,50 +69,62 @@ Add automated triggering (time-based via Temporal Schedules and event-driven via
 a REST admin API, Prometheus metrics, and an operational runbook. Transforms the manually-triggered
 backbone from WP-0001 into a self-operating service.
 
-## Open decisions (resolve before Phase 5)
+## Decisions resolved
 
-- **Event broker choice** (hub: `bc47c9c2-5643-4a88-8114-601738a2f64e`): Kafka vs NATS vs RabbitMQ.
-  T26–T29 are blocked until this is resolved.
+- **Event broker choice** (hub: `bc47c9c2-5643-4a88-8114-601738a2f64e`): **NATS + JetStream** chosen.
 
 ---
 
-## Phase 4 — Time-Based Triggers (Temporal Schedules)
+## Phase 4 — Time-Based Triggers (Temporal Schedules) ✓
 
-| Task | Priority | Hub task ID |
+| Task | Priority | Status |
 |---|---|---|
-| T22: Write schedule_manager.py | medium | `e50550d1-9904-41d7-afd8-492a1f1e91b8` |
-| T23: Bootstrap script to sync schedules on startup | medium | `5a1f7fa3-acb9-4f60-9892-c9eaa120272e` |
-| T24: Handle misfire policy in schedule config | medium | `00231668-95c5-447f-b3d0-1fb8c20b487f` |
-| T25: Test schedule pause/resume lifecycle | medium | `7abfd375-ea9d-4209-8371-e5664dc2c6c4` |
+| T22: Write schedule_manager.py | medium | done |
+| T23: Bootstrap script to sync schedules on startup | medium | done |
+| T24: Handle misfire policy in schedule config | medium | done |
+| T25: Test schedule pause/resume lifecycle | medium | done |
 
 ---
 
-## Phase 5 — Event-Driven Triggers
+## Phase 5 — Event-Driven Triggers ✓
 
-*Blocked by broker decision (`bc47c9c2-5643-4a88-8114-601738a2f64e`).*
-
-| Task | Priority | Hub task ID |
+| Task | Priority | Status |
 |---|---|---|
-| T26: Implement Event Router service | medium | `68b6610b-159c-4f1c-92a9-7128efea0961` |
-| T27: Implement routing rules (event.type + filters → activity_ids) | medium | `9348efea-a7e9-4f92-b866-8fc82cf28fee` |
-| T28: Start/signal workflow from Event Router | medium | `cac1f45a-7391-471a-9566-97cdbd96eb2d` |
-| T29: Integration test — publish event → observe workflow run | medium | `7f10b5a3-7cad-4914-b603-d57508c85629` |
+| T26: Implement Event Router service | medium | done |
+| T27: Implement routing rules (event.type + filters → activity_ids) | medium | done |
+| T28: Start/signal workflow from Event Router | medium | done |
+| T29: Integration test — publish event → observe workflow run | medium | done |
 
 ---
 
-## Phase 6 — Observability & Admin
+## Phase 6 — Observability & Admin ✓
 
-| Task | Priority | Hub task ID |
+| Task | Priority | Status |
 |---|---|---|
-| T30: REST API (FastAPI) — CRUD for ActivityDefinition | low | `b27e54a1-5dcc-476d-8f4a-c995aea6a8c2` |
-| T31: Wire Temporal SDK metrics to Prometheus | low | `0eafb60c-f00e-4fd7-a921-7de75fcfe81e` |
-| T32: Tag workflows with activity_id for Temporal visibility search | low | `7bdfc5c2-1f06-4fce-aac3-fae036dcb47e` |
-| T33: Write operational runbook | low | `766d602d-1b23-4247-a46d-03c0d3b8e498` |
+| T30: REST API (FastAPI) — CRUD for ActivityDefinition | low | done |
+| T31: Wire Temporal SDK metrics to Prometheus | low | done |
+| T32: Tag workflows with activity_id for Temporal visibility search | low | done |
+| T33: Write operational runbook | low | done |
 
 ---
 
-## Completion criteria
+## Files produced
 
-Schedules fire `RunActivityWorkflow` automatically on cron cadence. An external event published
-to the broker reaches the correct ActivityDefinition end-to-end. ActivityDefinitions are
-manageable via REST API. Prometheus metrics are scraped. Runbook is written.
+| File | Purpose |
+|------|---------|
+| `src/activity_core/schedule_manager.py` | T22/T24: upsert/delete/list Temporal Schedules |
+| `src/activity_core/sync_schedules.py` | T23: bootstrap schedule sync |
+| `src/activity_core/event_router.py` | T26/T27/T28: NATS JetStream → Temporal |
+| `src/activity_core/api.py` | T30: FastAPI CRUD + manual trigger |
+| `tests/test_schedule_lifecycle.py` | T25: schedule lifecycle unit tests |
+| `tests/test_event_router.py` | T29: event router unit + integration tests |
+| `docs/runbook.md` | T33: operational runbook |
+| `docker-compose.dev.yml` | added NATS service |
+
+## Completion criteria ✓
+
+- Schedules fire `RunActivityWorkflow` automatically on cron cadence ✓
+- External event published to NATS reaches the correct ActivityDefinition end-to-end ✓
+- ActivityDefinitions are manageable via REST API ✓
+- Prometheus metrics are scraped ✓
+- Runbook is written ✓