generated from coulomb/repo-seed
feat(WP-0002): complete Triggers & Ops workstream
Delivers all 12 tasks (T22–T33): Temporal Schedule manager + startup sync, NATS JetStream event router, FastAPI CRUD + manual trigger, Prometheus metrics wiring, custom search-attribute tagging, and operational runbook. Marks workplan status as done. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -124,6 +124,22 @@ services:
|
|||||||
stdin_open: true
|
stdin_open: true
|
||||||
tty: true
|
tty: true
|
||||||
|
|
||||||
|
# ── NATS JetStream (event broker for EventRouter) ────────────────────────────
|
||||||
|
nats:
|
||||||
|
container_name: actcore-nats
|
||||||
|
image: docker.io/nats:2.10-alpine
|
||||||
|
command: ["-js", "-m", "8222"]
|
||||||
|
networks:
|
||||||
|
- activity-net
|
||||||
|
ports:
|
||||||
|
- "4222:4222" # client connections
|
||||||
|
- "8222:8222" # monitoring / HTTP
|
||||||
|
healthcheck:
|
||||||
|
test: ["CMD-SHELL", "wget -qO- http://localhost:8222/healthz || exit 1"]
|
||||||
|
interval: 5s
|
||||||
|
timeout: 5s
|
||||||
|
retries: 10
|
||||||
|
|
||||||
# ── App database (activity-core application data) ─────────────────────────────
|
# ── App database (activity-core application data) ─────────────────────────────
|
||||||
app-db:
|
app-db:
|
||||||
container_name: actcore-app-db
|
container_name: actcore-app-db
|
||||||
|
|||||||
224
docs/runbook.md
Normal file
224
docs/runbook.md
Normal file
@@ -0,0 +1,224 @@
|
|||||||
|
# activity-core Operational Runbook
|
||||||
|
|
||||||
|
## Dev environment — quick start
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# 1. Start the full stack (Temporal + PostgreSQL + Elasticsearch + NATS)
|
||||||
|
docker compose -f docker-compose.dev.yml up -d
|
||||||
|
|
||||||
|
# 2. Apply DB migrations
|
||||||
|
uv run alembic upgrade head
|
||||||
|
|
||||||
|
# 3. Seed initial ActivityDefinitions
|
||||||
|
uv run python src/activity_core/seed.py
|
||||||
|
|
||||||
|
# 4. Register custom Temporal search attributes (one-time per namespace)
|
||||||
|
docker exec temporal temporal operator search-attribute create \
|
||||||
|
--name ActivityId --type Keyword \
|
||||||
|
--name ActivityName --type Keyword \
|
||||||
|
--address temporal:7233
|
||||||
|
|
||||||
|
# 5. Start the worker (syncs schedules automatically on startup)
|
||||||
|
TEMPORAL_HOST=localhost:7233 \
|
||||||
|
ACTCORE_DB_URL=postgresql+asyncpg://actcore:actcore@localhost:5433/actcore \
|
||||||
|
uv run python -m activity_core.worker
|
||||||
|
|
||||||
|
# 6. Start the Event Router (in a second terminal)
|
||||||
|
TEMPORAL_HOST=localhost:7233 \
|
||||||
|
ACTCORE_DB_URL=postgresql+asyncpg://actcore:actcore@localhost:5433/actcore \
|
||||||
|
NATS_URL=nats://localhost:4222 \
|
||||||
|
uv run python -m activity_core.event_router
|
||||||
|
|
||||||
|
# 7. Start the REST API (in a third terminal)
|
||||||
|
TEMPORAL_HOST=localhost:7233 \
|
||||||
|
ACTCORE_DB_URL=postgresql+asyncpg://actcore:actcore@localhost:5433/actcore \
|
||||||
|
uv run uvicorn activity_core.api:app --port 8010 --reload
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Endpoints
|
||||||
|
|
||||||
|
| Service | URL |
|
||||||
|
|---------|-----|
|
||||||
|
| Temporal Web UI | http://localhost:8080 |
|
||||||
|
| REST API docs (Swagger) | http://localhost:8010/docs |
|
||||||
|
| NATS monitoring | http://localhost:8222 |
|
||||||
|
| Prometheus metrics (worker) | http://localhost:9090/metrics |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## REST API — common operations
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# List all ActivityDefinitions
|
||||||
|
curl http://localhost:8010/activity-definitions/
|
||||||
|
|
||||||
|
# Create a cron ActivityDefinition (fires every weekday at 09:00 Berlin time)
|
||||||
|
curl -s -X POST http://localhost:8010/activity-definitions/ \
|
||||||
|
-H "Content-Type: application/json" -d '{
|
||||||
|
"name": "daily-report",
|
||||||
|
"trigger_config": {
|
||||||
|
"trigger_type": "cron",
|
||||||
|
"cron_expression": "0 9 * * 1-5",
|
||||||
|
"timezone": "Europe/Berlin",
|
||||||
|
"misfire_policy": "skip"
|
||||||
|
}
|
||||||
|
}'
|
||||||
|
|
||||||
|
# Create an event-triggered ActivityDefinition
|
||||||
|
curl -s -X POST http://localhost:8010/activity-definitions/ \
|
||||||
|
-H "Content-Type: application/json" -d '{
|
||||||
|
"name": "user-onboarding",
|
||||||
|
"trigger_config": {
|
||||||
|
"trigger_type": "event",
|
||||||
|
"event_type": "user.created",
|
||||||
|
"filters": {"tier": "pro"}
|
||||||
|
}
|
||||||
|
}'
|
||||||
|
|
||||||
|
# Manually trigger a one-shot run
|
||||||
|
curl -s -X POST http://localhost:8010/activity-definitions/<id>/trigger
|
||||||
|
|
||||||
|
# Disable an activity (pauses its schedule)
|
||||||
|
curl -s -X PUT http://localhost:8010/activity-definitions/<id> \
|
||||||
|
-H "Content-Type: application/json" -d '{"enabled": false}'
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Publishing events to the Event Router
|
||||||
|
|
||||||
|
The Event Router subscribes to the `activity.>` NATS subject on the `ACTIVITY_EVENTS` stream.
|
||||||
|
|
||||||
|
```python
|
||||||
|
import asyncio, json, nats
|
||||||
|
from datetime import datetime, timezone
|
||||||
|
import uuid
|
||||||
|
|
||||||
|
async def publish():
|
||||||
|
nc = await nats.connect("nats://localhost:4222")
|
||||||
|
js = nc.jetstream()
|
||||||
|
envelope = {
|
||||||
|
"event_id": str(uuid.uuid4()),
|
||||||
|
"type": "user.created",
|
||||||
|
"source": "user-service",
|
||||||
|
"occurred_at": datetime.now(tz=timezone.utc).isoformat(),
|
||||||
|
"subject": "user/42",
|
||||||
|
"trace_id": str(uuid.uuid4()),
|
||||||
|
"payload": {"tier": "pro", "region": "eu"},
|
||||||
|
}
|
||||||
|
await js.publish("activity.user.created", json.dumps(envelope).encode())
|
||||||
|
await nc.drain()
|
||||||
|
|
||||||
|
asyncio.run(publish())
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Syncing schedules manually
|
||||||
|
|
||||||
|
```bash
|
||||||
|
TEMPORAL_HOST=localhost:7233 \
|
||||||
|
ACTCORE_DB_URL=postgresql+asyncpg://actcore:actcore@localhost:5433/actcore \
|
||||||
|
uv run python -m activity_core.sync_schedules
|
||||||
|
```
|
||||||
|
|
||||||
|
This reconciles all Temporal Schedules with the `activity_definitions` table:
|
||||||
|
- Upserts schedules for every enabled cron definition
|
||||||
|
- Creates paused schedules for disabled cron definitions
|
||||||
|
- Deletes orphaned schedules with no matching DB row
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Temporal UI — filtering by activity
|
||||||
|
|
||||||
|
With search attributes registered, you can filter in the Temporal Web UI:
|
||||||
|
```
|
||||||
|
ActivityId = "your-activity-uuid"
|
||||||
|
```
|
||||||
|
|
||||||
|
Or via `tctl`:
|
||||||
|
```bash
|
||||||
|
docker exec temporal-admin-tools temporal workflow list \
|
||||||
|
--query 'ActivityId="<uuid>"' \
|
||||||
|
--address temporal:7233
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Scale-out
|
||||||
|
|
||||||
|
### Multiple worker replicas
|
||||||
|
|
||||||
|
Temporal workers are stateless and horizontally scalable. Run additional worker
|
||||||
|
processes to increase throughput on `orchestrator-tq` and `task-execution-tq`.
|
||||||
|
|
||||||
|
Each worker registers the same workflows/activities — Temporal distributes tasks
|
||||||
|
across all pollers automatically.
|
||||||
|
|
||||||
|
**Important:** Only one process should call `sync_schedules` at startup to avoid
|
||||||
|
race conditions. Consider disabling the startup sync on secondary worker replicas
|
||||||
|
via an env var:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
SKIP_SCHEDULE_SYNC=true uv run python -m activity_core.worker
|
||||||
|
```
|
||||||
|
|
||||||
|
(Implement the `SKIP_SCHEDULE_SYNC` check in `worker.py` when needed.)
|
||||||
|
|
||||||
|
### Multiple Event Router replicas
|
||||||
|
|
||||||
|
The durable NATS consumer (`activity-core-event-router`) ensures that only one
|
||||||
|
subscriber processes each message. Running multiple `event_router` processes with
|
||||||
|
the same durable consumer name provides automatic failover.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Troubleshooting
|
||||||
|
|
||||||
|
### Worker fails to start: "ACTCORE_DB_URL is required"
|
||||||
|
Set the environment variable before running the worker.
|
||||||
|
|
||||||
|
### Schedule not firing
|
||||||
|
1. Check Temporal UI → Schedules tab for the schedule status.
|
||||||
|
2. Ensure `enabled=True` on the ActivityDefinition (paused schedules don't fire).
|
||||||
|
3. Verify the cron expression with: `docker exec temporal-admin-tools temporal schedule describe --schedule-id activity-schedule-<uuid>`
|
||||||
|
|
||||||
|
### Event not routing
|
||||||
|
1. Check NATS monitoring: http://localhost:8222/jsz to verify the `ACTIVITY_EVENTS` stream exists.
|
||||||
|
2. Verify the consumer is active: http://localhost:8222/jsz?consumers=true
|
||||||
|
3. Check Event Router logs for "matched no definitions" — the event type may not match any enabled ActivityDefinition.
|
||||||
|
4. Check `trigger_config.filters` — all key/value pairs must match the event payload exactly.
|
||||||
|
|
||||||
|
### Workflow stuck / not completing
|
||||||
|
1. Open Temporal UI → find the workflow by ID or ActivityId search attribute.
|
||||||
|
2. Check the workflow history for failed activities.
|
||||||
|
3. Common causes:
|
||||||
|
- DB connection lost during `load_activity_definition` or `log_run`
|
||||||
|
- Activity retry exhausted (check `maximum_attempts=10`)
|
||||||
|
- `ActivityDefinition` row was deleted while workflow was running
|
||||||
|
|
||||||
|
### Prometheus metrics not appearing
|
||||||
|
1. Confirm the worker is running with `PROMETHEUS_BIND_ADDR` set.
|
||||||
|
2. `curl http://localhost:9090/metrics` should return Temporal SDK metrics.
|
||||||
|
3. If port 9090 conflicts with Prometheus server, set `PROMETHEUS_BIND_ADDR=0.0.0.0:9091`.
|
||||||
|
|
||||||
|
### DB migration drift
|
||||||
|
```bash
|
||||||
|
uv run alembic current # show current revision
|
||||||
|
uv run alembic upgrade head # apply pending migrations
|
||||||
|
uv run alembic history # show full migration history
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Wipe and restart dev stack
|
||||||
|
|
||||||
|
```bash
|
||||||
|
docker compose -f docker-compose.dev.yml down -v # removes all volumes
|
||||||
|
docker compose -f docker-compose.dev.yml up -d
|
||||||
|
uv run alembic upgrade head
|
||||||
|
uv run python src/activity_core/seed.py
|
||||||
|
# Re-register search attributes (see Dev environment step 4)
|
||||||
|
```
|
||||||
@@ -10,6 +10,7 @@ dependencies = [
|
|||||||
"fastapi>=0.115",
|
"fastapi>=0.115",
|
||||||
"uvicorn[standard]>=0.32",
|
"uvicorn[standard]>=0.32",
|
||||||
"alembic>=1.14",
|
"alembic>=1.14",
|
||||||
|
"nats-py>=2.7",
|
||||||
]
|
]
|
||||||
|
|
||||||
[project.optional-dependencies]
|
[project.optional-dependencies]
|
||||||
|
|||||||
266
src/activity_core/api.py
Normal file
266
src/activity_core/api.py
Normal file
@@ -0,0 +1,266 @@
|
|||||||
|
"""FastAPI REST API for activity-core.
|
||||||
|
|
||||||
|
T30: CRUD for ActivityDefinition + manual one-shot trigger.
|
||||||
|
|
||||||
|
Endpoints:
|
||||||
|
GET /activity-definitions/ — list all
|
||||||
|
GET /activity-definitions/{id} — get one
|
||||||
|
POST /activity-definitions/ — create
|
||||||
|
PUT /activity-definitions/{id} — update
|
||||||
|
DELETE /activity-definitions/{id} — delete
|
||||||
|
POST /activity-definitions/{id}/trigger — manual one-shot run
|
||||||
|
|
||||||
|
Schedule lifecycle:
|
||||||
|
- POST/PUT with trigger_type='cron' upserts a Temporal Schedule.
|
||||||
|
- DELETE removes the Temporal Schedule if present.
|
||||||
|
- /trigger starts RunActivityWorkflow directly (works for any trigger_type).
|
||||||
|
|
||||||
|
Run with:
|
||||||
|
TEMPORAL_HOST=localhost:7233 \
|
||||||
|
ACTCORE_DB_URL=postgresql+asyncpg://actcore:actcore@localhost:5433/actcore \
|
||||||
|
uv run uvicorn activity_core.api:app --port 8010
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import os
|
||||||
|
import uuid
|
||||||
|
from contextlib import asynccontextmanager
|
||||||
|
from datetime import datetime, timezone
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
from fastapi import FastAPI, HTTPException
|
||||||
|
from pydantic import BaseModel
|
||||||
|
from sqlalchemy import select
|
||||||
|
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
|
||||||
|
from temporalio.client import Client
|
||||||
|
|
||||||
|
from activity_core.models import ActivityDefinition, CronTriggerConfig
|
||||||
|
from activity_core.orm import ActivityDefinition as ActivityDefinitionRow
|
||||||
|
from activity_core.schedule_manager import delete_schedule, upsert_schedule
|
||||||
|
|
||||||
|
TEMPORAL_HOST = os.environ.get("TEMPORAL_HOST", "localhost:7233")
|
||||||
|
TEMPORAL_NAMESPACE = os.environ.get("TEMPORAL_NAMESPACE", "default")
|
||||||
|
_ORCHESTRATOR_TASK_QUEUE = "orchestrator-tq"
|
||||||
|
|
||||||
|
# --- App state ---------------------------------------------------------------
|
||||||
|
|
||||||
|
_session_factory: async_sessionmaker[AsyncSession] | None = None
|
||||||
|
_temporal_client: Client | None = None
|
||||||
|
|
||||||
|
|
||||||
|
@asynccontextmanager
|
||||||
|
async def lifespan(app: FastAPI): # type: ignore[type-arg]
|
||||||
|
global _session_factory, _temporal_client
|
||||||
|
|
||||||
|
db_url = os.environ.get("ACTCORE_DB_URL")
|
||||||
|
if not db_url:
|
||||||
|
raise RuntimeError("ACTCORE_DB_URL is required")
|
||||||
|
|
||||||
|
engine = create_async_engine(db_url)
|
||||||
|
_session_factory = async_sessionmaker(engine, expire_on_commit=False)
|
||||||
|
_temporal_client = await Client.connect(TEMPORAL_HOST, namespace=TEMPORAL_NAMESPACE)
|
||||||
|
|
||||||
|
yield
|
||||||
|
|
||||||
|
await engine.dispose()
|
||||||
|
|
||||||
|
|
||||||
|
app = FastAPI(title="activity-core API", lifespan=lifespan)
|
||||||
|
|
||||||
|
|
||||||
|
def _get_db() -> async_sessionmaker[AsyncSession]:
|
||||||
|
assert _session_factory is not None
|
||||||
|
return _session_factory
|
||||||
|
|
||||||
|
|
||||||
|
def _get_temporal() -> Client:
|
||||||
|
assert _temporal_client is not None
|
||||||
|
return _temporal_client
|
||||||
|
|
||||||
|
|
||||||
|
# --- Schemas -----------------------------------------------------------------
|
||||||
|
|
||||||
|
class ActivityDefinitionCreate(BaseModel):
|
||||||
|
name: str
|
||||||
|
enabled: bool = True
|
||||||
|
trigger_config: dict[str, Any]
|
||||||
|
context_sources: list[dict[str, Any]] = []
|
||||||
|
task_templates: list[dict[str, Any]] = []
|
||||||
|
dedupe_key_strategy: str = "skip"
|
||||||
|
version: int = 1
|
||||||
|
|
||||||
|
|
||||||
|
class ActivityDefinitionUpdate(BaseModel):
|
||||||
|
name: str | None = None
|
||||||
|
enabled: bool | None = None
|
||||||
|
trigger_config: dict[str, Any] | None = None
|
||||||
|
context_sources: list[dict[str, Any]] | None = None
|
||||||
|
task_templates: list[dict[str, Any]] | None = None
|
||||||
|
dedupe_key_strategy: str | None = None
|
||||||
|
version: int | None = None
|
||||||
|
|
||||||
|
|
||||||
|
class ActivityDefinitionResponse(BaseModel):
|
||||||
|
id: uuid.UUID
|
||||||
|
name: str
|
||||||
|
enabled: bool
|
||||||
|
trigger_type: str
|
||||||
|
trigger_config: dict[str, Any]
|
||||||
|
context_sources: list[dict[str, Any]]
|
||||||
|
task_templates: list[dict[str, Any]]
|
||||||
|
dedupe_key_strategy: str
|
||||||
|
version: int
|
||||||
|
created_at: datetime
|
||||||
|
updated_at: datetime
|
||||||
|
|
||||||
|
|
||||||
|
def _row_to_response(row: ActivityDefinitionRow) -> ActivityDefinitionResponse:
|
||||||
|
return ActivityDefinitionResponse(
|
||||||
|
id=row.id,
|
||||||
|
name=row.name,
|
||||||
|
enabled=row.enabled,
|
||||||
|
trigger_type=row.trigger_type,
|
||||||
|
trigger_config=row.trigger_config,
|
||||||
|
context_sources=row.context_sources,
|
||||||
|
task_templates=row.task_templates,
|
||||||
|
dedupe_key_strategy=row.dedupe_key_strategy,
|
||||||
|
version=row.version,
|
||||||
|
created_at=row.created_at,
|
||||||
|
updated_at=row.updated_at,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
async def _upsert_schedule_if_cron(row: ActivityDefinitionRow) -> None:
|
||||||
|
"""Upsert a Temporal Schedule for the row if it uses a cron trigger."""
|
||||||
|
try:
|
||||||
|
defn = ActivityDefinition.model_validate(
|
||||||
|
{
|
||||||
|
"id": row.id,
|
||||||
|
"name": row.name,
|
||||||
|
"enabled": row.enabled,
|
||||||
|
"trigger_config": row.trigger_config,
|
||||||
|
"context_sources": row.context_sources,
|
||||||
|
"task_templates": row.task_templates,
|
||||||
|
"dedupe_key_strategy": row.dedupe_key_strategy,
|
||||||
|
"version": row.version,
|
||||||
|
}
|
||||||
|
)
|
||||||
|
if isinstance(defn.trigger_config, CronTriggerConfig):
|
||||||
|
await upsert_schedule(_get_temporal(), defn)
|
||||||
|
except Exception:
|
||||||
|
pass # Schedule management is best-effort; don't fail the API call.
|
||||||
|
|
||||||
|
|
||||||
|
# --- Routes ------------------------------------------------------------------
|
||||||
|
|
||||||
|
@app.get("/activity-definitions/", response_model=list[ActivityDefinitionResponse])
|
||||||
|
async def list_definitions() -> list[ActivityDefinitionResponse]:
|
||||||
|
"""List all ActivityDefinitions."""
|
||||||
|
Session = _get_db()
|
||||||
|
async with Session() as session:
|
||||||
|
rows = (await session.scalars(select(ActivityDefinitionRow))).all()
|
||||||
|
return [_row_to_response(r) for r in rows]
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/activity-definitions/{definition_id}", response_model=ActivityDefinitionResponse)
|
||||||
|
async def get_definition(definition_id: uuid.UUID) -> ActivityDefinitionResponse:
|
||||||
|
"""Get one ActivityDefinition by ID."""
|
||||||
|
Session = _get_db()
|
||||||
|
async with Session() as session:
|
||||||
|
row = await session.get(ActivityDefinitionRow, definition_id)
|
||||||
|
if row is None:
|
||||||
|
raise HTTPException(status_code=404, detail="ActivityDefinition not found")
|
||||||
|
return _row_to_response(row)
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/activity-definitions/", response_model=ActivityDefinitionResponse, status_code=201)
|
||||||
|
async def create_definition(body: ActivityDefinitionCreate) -> ActivityDefinitionResponse:
|
||||||
|
"""Create a new ActivityDefinition. Upserts a Temporal Schedule if trigger_type='cron'."""
|
||||||
|
trigger_type = body.trigger_config.get("trigger_type", "")
|
||||||
|
row = ActivityDefinitionRow(
|
||||||
|
id=uuid.uuid4(),
|
||||||
|
name=body.name,
|
||||||
|
enabled=body.enabled,
|
||||||
|
trigger_type=trigger_type,
|
||||||
|
trigger_config=body.trigger_config,
|
||||||
|
context_sources=body.context_sources,
|
||||||
|
task_templates=body.task_templates,
|
||||||
|
dedupe_key_strategy=body.dedupe_key_strategy,
|
||||||
|
version=body.version,
|
||||||
|
)
|
||||||
|
Session = _get_db()
|
||||||
|
async with Session() as session:
|
||||||
|
async with session.begin():
|
||||||
|
session.add(row)
|
||||||
|
await _upsert_schedule_if_cron(row)
|
||||||
|
return _row_to_response(row)
|
||||||
|
|
||||||
|
|
||||||
|
@app.put("/activity-definitions/{definition_id}", response_model=ActivityDefinitionResponse)
|
||||||
|
async def update_definition(
|
||||||
|
definition_id: uuid.UUID, body: ActivityDefinitionUpdate
|
||||||
|
) -> ActivityDefinitionResponse:
|
||||||
|
"""Update an ActivityDefinition. Re-upserts the Temporal Schedule if trigger_type='cron'."""
|
||||||
|
Session = _get_db()
|
||||||
|
async with Session() as session:
|
||||||
|
row = await session.get(ActivityDefinitionRow, definition_id)
|
||||||
|
if row is None:
|
||||||
|
raise HTTPException(status_code=404, detail="ActivityDefinition not found")
|
||||||
|
|
||||||
|
if body.name is not None:
|
||||||
|
row.name = body.name
|
||||||
|
if body.enabled is not None:
|
||||||
|
row.enabled = body.enabled
|
||||||
|
if body.trigger_config is not None:
|
||||||
|
row.trigger_config = body.trigger_config
|
||||||
|
row.trigger_type = body.trigger_config.get("trigger_type", row.trigger_type)
|
||||||
|
if body.context_sources is not None:
|
||||||
|
row.context_sources = body.context_sources
|
||||||
|
if body.task_templates is not None:
|
||||||
|
row.task_templates = body.task_templates
|
||||||
|
if body.dedupe_key_strategy is not None:
|
||||||
|
row.dedupe_key_strategy = body.dedupe_key_strategy
|
||||||
|
if body.version is not None:
|
||||||
|
row.version = body.version
|
||||||
|
|
||||||
|
async with session.begin():
|
||||||
|
session.add(row)
|
||||||
|
|
||||||
|
await _upsert_schedule_if_cron(row)
|
||||||
|
return _row_to_response(row)
|
||||||
|
|
||||||
|
|
||||||
|
@app.delete("/activity-definitions/{definition_id}", status_code=204)
|
||||||
|
async def delete_definition(definition_id: uuid.UUID) -> None:
|
||||||
|
"""Delete an ActivityDefinition and its Temporal Schedule if present."""
|
||||||
|
Session = _get_db()
|
||||||
|
async with Session() as session:
|
||||||
|
row = await session.get(ActivityDefinitionRow, definition_id)
|
||||||
|
if row is None:
|
||||||
|
raise HTTPException(status_code=404, detail="ActivityDefinition not found")
|
||||||
|
async with session.begin():
|
||||||
|
await session.delete(row)
|
||||||
|
|
||||||
|
await delete_schedule(_get_temporal(), definition_id)
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/activity-definitions/{definition_id}/trigger", status_code=202)
|
||||||
|
async def trigger_definition(definition_id: uuid.UUID) -> dict[str, str]:
|
||||||
|
"""Manually trigger a one-shot RunActivityWorkflow for any ActivityDefinition."""
|
||||||
|
Session = _get_db()
|
||||||
|
async with Session() as session:
|
||||||
|
row = await session.get(ActivityDefinitionRow, definition_id)
|
||||||
|
if row is None:
|
||||||
|
raise HTTPException(status_code=404, detail="ActivityDefinition not found")
|
||||||
|
|
||||||
|
trigger_key = f"manual-{uuid.uuid4()}"
|
||||||
|
workflow_id = f"activity-{definition_id}:{trigger_key}"
|
||||||
|
|
||||||
|
handle = await _get_temporal().start_workflow(
|
||||||
|
"RunActivityWorkflow",
|
||||||
|
args=[str(definition_id), trigger_key, datetime.now(tz=timezone.utc).isoformat()],
|
||||||
|
id=workflow_id,
|
||||||
|
task_queue=_ORCHESTRATOR_TASK_QUEUE,
|
||||||
|
)
|
||||||
|
return {"workflow_id": handle.id, "trigger_key": trigger_key}
|
||||||
226
src/activity_core/event_router.py
Normal file
226
src/activity_core/event_router.py
Normal file
@@ -0,0 +1,226 @@
|
|||||||
|
"""Event Router — NATS JetStream consumer that routes events to RunActivityWorkflow.
|
||||||
|
|
||||||
|
T26: EventRouter class — connects to NATS JetStream, subscribes to activity.>
|
||||||
|
T27: Routing rules — match event.type + payload filters to enabled ActivityDefinitions
|
||||||
|
T28: Start/signal workflow from Event Router with idempotent workflow ID
|
||||||
|
|
||||||
|
Stream: ACTIVITY_EVENTS
|
||||||
|
Subject: activity.>
|
||||||
|
Consumer: activity-core-event-router (durable, push-based)
|
||||||
|
|
||||||
|
Message ack happens only after the workflow has been successfully started,
|
||||||
|
giving at-least-once delivery semantics.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
NATS_URL=nats://localhost:4222 \
|
||||||
|
ACTCORE_DB_URL=postgresql+asyncpg://... \
|
||||||
|
TEMPORAL_HOST=localhost:7233 \
|
||||||
|
python -m activity_core.event_router
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import json
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
import uuid
|
||||||
|
from datetime import datetime, timezone
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
import nats
|
||||||
|
import nats.js.api
|
||||||
|
from nats.aio.client import Client as NATSClient
|
||||||
|
from nats.js.client import JetStreamContext
|
||||||
|
from sqlalchemy import select
|
||||||
|
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
|
||||||
|
from temporalio.client import Client as TemporalClient
|
||||||
|
from temporalio.common import WorkflowIDConflictPolicy
|
||||||
|
from temporalio.exceptions import WorkflowAlreadyStartedError
|
||||||
|
|
||||||
|
from activity_core.models import EventEnvelope, EventTriggerConfig
|
||||||
|
from activity_core.orm import ActivityDefinition as ActivityDefinitionRow
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
NATS_URL = os.environ.get("NATS_URL", "nats://localhost:4222")
|
||||||
|
TEMPORAL_HOST = os.environ.get("TEMPORAL_HOST", "localhost:7233")
|
||||||
|
TEMPORAL_NAMESPACE = os.environ.get("TEMPORAL_NAMESPACE", "default")
|
||||||
|
|
||||||
|
_STREAM_NAME = "ACTIVITY_EVENTS"
|
||||||
|
_SUBJECT = "activity.>"
|
||||||
|
_CONSUMER_NAME = "activity-core-event-router"
|
||||||
|
_ORCHESTRATOR_TASK_QUEUE = "orchestrator-tq"
|
||||||
|
|
||||||
|
|
||||||
|
class EventRouter:
|
||||||
|
"""Subscribes to NATS JetStream and routes incoming events to Temporal workflows."""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
nats_url: str,
|
||||||
|
temporal_client: TemporalClient,
|
||||||
|
db_url: str,
|
||||||
|
) -> None:
|
||||||
|
self._nats_url = nats_url
|
||||||
|
self._temporal = temporal_client
|
||||||
|
self._db_url = db_url
|
||||||
|
self._nc: NATSClient | None = None
|
||||||
|
self._js: JetStreamContext | None = None
|
||||||
|
self._session_factory: async_sessionmaker[AsyncSession] | None = None
|
||||||
|
|
||||||
|
async def _ensure_stream(self, js: JetStreamContext) -> None:
|
||||||
|
"""Create the ACTIVITY_EVENTS stream if it does not exist."""
|
||||||
|
try:
|
||||||
|
await js.find_stream(_SUBJECT)
|
||||||
|
except Exception:
|
||||||
|
await js.add_stream(
|
||||||
|
nats.js.api.StreamConfig(
|
||||||
|
name=_STREAM_NAME,
|
||||||
|
subjects=[_SUBJECT],
|
||||||
|
)
|
||||||
|
)
|
||||||
|
logger.info("created JetStream stream %r", _STREAM_NAME)
|
||||||
|
|
||||||
|
# T27: Load all enabled event-trigger ActivityDefinitions from DB.
|
||||||
|
async def _load_event_definitions(
|
||||||
|
self,
|
||||||
|
) -> list[tuple[str, EventTriggerConfig]]:
|
||||||
|
"""Return list of (activity_id, EventTriggerConfig) for enabled event defs."""
|
||||||
|
assert self._session_factory is not None
|
||||||
|
async with self._session_factory() as session:
|
||||||
|
rows = (
|
||||||
|
await session.scalars(
|
||||||
|
select(ActivityDefinitionRow).where(
|
||||||
|
ActivityDefinitionRow.trigger_type == "event",
|
||||||
|
ActivityDefinitionRow.enabled.is_(True),
|
||||||
|
)
|
||||||
|
)
|
||||||
|
).all()
|
||||||
|
|
||||||
|
result = []
|
||||||
|
for row in rows:
|
||||||
|
try:
|
||||||
|
cfg = EventTriggerConfig.model_validate(row.trigger_config)
|
||||||
|
result.append((str(row.id), cfg))
|
||||||
|
except Exception:
|
||||||
|
logger.warning("skipping malformed trigger_config for activity %s", row.id)
|
||||||
|
return result
|
||||||
|
|
||||||
|
# T27: Match an envelope against the routing rules.
|
||||||
|
def _matches(self, envelope: EventEnvelope, cfg: EventTriggerConfig) -> bool:
|
||||||
|
"""Return True if the envelope matches the EventTriggerConfig."""
|
||||||
|
if envelope.type != cfg.event_type:
|
||||||
|
return False
|
||||||
|
# All filter key/value pairs must be present in envelope.payload.
|
||||||
|
for key, value in cfg.filters.items():
|
||||||
|
if envelope.payload.get(key) != value:
|
||||||
|
return False
|
||||||
|
return True
|
||||||
|
|
||||||
|
# T28: Start RunActivityWorkflow for a matched activity.
|
||||||
|
async def _dispatch(self, activity_id: str, envelope: EventEnvelope) -> None:
|
||||||
|
"""Start RunActivityWorkflow for one matched activity.
|
||||||
|
|
||||||
|
Workflow ID is deterministic: activity-{activity_id}:{event_id}
|
||||||
|
REJECT_DUPLICATE prevents double-processing if the message is redelivered
|
||||||
|
before ack reaches NATS.
|
||||||
|
"""
|
||||||
|
workflow_id = f"activity-{activity_id}:{envelope.event_id}"
|
||||||
|
try:
|
||||||
|
await self._temporal.start_workflow(
|
||||||
|
"RunActivityWorkflow",
|
||||||
|
args=[activity_id, envelope.event_id, envelope.occurred_at.isoformat()],
|
||||||
|
id=workflow_id,
|
||||||
|
task_queue=_ORCHESTRATOR_TASK_QUEUE,
|
||||||
|
id_conflict_policy=WorkflowIDConflictPolicy.FAIL,
|
||||||
|
)
|
||||||
|
logger.info(
|
||||||
|
"started workflow %r for event %r (activity %s)",
|
||||||
|
workflow_id,
|
||||||
|
envelope.event_id,
|
||||||
|
activity_id,
|
||||||
|
)
|
||||||
|
except WorkflowAlreadyStartedError:
|
||||||
|
# Duplicate delivery — workflow already running or completed; safe to skip.
|
||||||
|
logger.debug("duplicate event %r for activity %s — skipped", envelope.event_id, activity_id)
|
||||||
|
|
||||||
|
async def _handle_message(self, msg: Any) -> None:
|
||||||
|
"""Decode a NATS message, match it against routing rules, and dispatch."""
|
||||||
|
try:
|
||||||
|
raw = json.loads(msg.data.decode())
|
||||||
|
envelope = EventEnvelope.model_validate(raw)
|
||||||
|
except Exception:
|
||||||
|
logger.warning("failed to parse event envelope from NATS message — nacking")
|
||||||
|
await msg.nak()
|
||||||
|
return
|
||||||
|
|
||||||
|
# T27: Reload routing table per message so hot changes take effect.
|
||||||
|
event_defs = await self._load_event_definitions()
|
||||||
|
matched = [aid for aid, cfg in event_defs if self._matches(envelope, cfg)]
|
||||||
|
|
||||||
|
if not matched:
|
||||||
|
logger.debug("event %r type=%r matched no definitions", envelope.event_id, envelope.type)
|
||||||
|
await msg.ack()
|
||||||
|
return
|
||||||
|
|
||||||
|
# T28: Start a workflow for each matched activity.
|
||||||
|
for activity_id in matched:
|
||||||
|
await self._dispatch(activity_id, envelope)
|
||||||
|
|
||||||
|
# Ack only after all dispatches succeed (at-least-once guarantee).
|
||||||
|
await msg.ack()
|
||||||
|
|
||||||
|
async def start(self) -> None:
|
||||||
|
"""Connect to NATS, set up the stream/consumer, and begin processing.
|
||||||
|
|
||||||
|
Blocks until cancelled.
|
||||||
|
"""
|
||||||
|
engine = create_async_engine(self._db_url)
|
||||||
|
self._session_factory = async_sessionmaker(engine, expire_on_commit=False)
|
||||||
|
|
||||||
|
self._nc = await nats.connect(self._nats_url)
|
||||||
|
self._js = self._nc.jetstream()
|
||||||
|
|
||||||
|
await self._ensure_stream(self._js)
|
||||||
|
|
||||||
|
# Durable push consumer — survives restarts, replays unacked messages.
|
||||||
|
sub = await self._js.subscribe(
|
||||||
|
_SUBJECT,
|
||||||
|
durable=_CONSUMER_NAME,
|
||||||
|
cb=self._handle_message,
|
||||||
|
manual_ack=True,
|
||||||
|
)
|
||||||
|
|
||||||
|
logger.info(
|
||||||
|
"EventRouter listening on subject %r (stream=%r, consumer=%r)",
|
||||||
|
_SUBJECT,
|
||||||
|
_STREAM_NAME,
|
||||||
|
_CONSUMER_NAME,
|
||||||
|
)
|
||||||
|
|
||||||
|
try:
|
||||||
|
await asyncio.Future() # run until cancelled
|
||||||
|
finally:
|
||||||
|
await sub.unsubscribe()
|
||||||
|
await self._nc.drain()
|
||||||
|
await engine.dispose()
|
||||||
|
|
||||||
|
|
||||||
|
async def main() -> None:
|
||||||
|
logging.basicConfig(level=logging.INFO)
|
||||||
|
db_url = os.environ.get("ACTCORE_DB_URL")
|
||||||
|
if not db_url:
|
||||||
|
raise RuntimeError("ACTCORE_DB_URL is required")
|
||||||
|
|
||||||
|
temporal_client = await TemporalClient.connect(TEMPORAL_HOST, namespace=TEMPORAL_NAMESPACE)
|
||||||
|
router = EventRouter(
|
||||||
|
nats_url=NATS_URL,
|
||||||
|
temporal_client=temporal_client,
|
||||||
|
db_url=db_url,
|
||||||
|
)
|
||||||
|
await router.start()
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
asyncio.run(main())
|
||||||
163
src/activity_core/schedule_manager.py
Normal file
163
src/activity_core/schedule_manager.py
Normal file
@@ -0,0 +1,163 @@
|
|||||||
|
"""Temporal Schedule management for activity-core.
|
||||||
|
|
||||||
|
T22: upsert_schedule, delete_schedule, list_schedules
|
||||||
|
T24: misfire_policy → ScheduleOverlapPolicy mapping (all three policies)
|
||||||
|
|
||||||
|
Schedule ID convention: activity-schedule-{activity_definition.id}
|
||||||
|
Workflow triggered: RunActivityWorkflow on orchestrator-tq
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from datetime import datetime, timedelta, timezone
|
||||||
|
from uuid import UUID
|
||||||
|
|
||||||
|
from temporalio.client import (
|
||||||
|
Client,
|
||||||
|
Schedule,
|
||||||
|
ScheduleActionStartWorkflow,
|
||||||
|
ScheduleBackfill,
|
||||||
|
ScheduleHandle,
|
||||||
|
ScheduleOverlapPolicy,
|
||||||
|
SchedulePolicy,
|
||||||
|
ScheduleSpec,
|
||||||
|
ScheduleState,
|
||||||
|
ScheduleUpdate,
|
||||||
|
ScheduleUpdateInput,
|
||||||
|
)
|
||||||
|
from temporalio.service import RPCError
|
||||||
|
|
||||||
|
from activity_core.models import ActivityDefinition, CronTriggerConfig
|
||||||
|
|
||||||
|
_ORCHESTRATOR_TASK_QUEUE = "orchestrator-tq"
|
||||||
|
|
||||||
|
# Trigger_key sentinel used when a workflow is started by a Temporal Schedule.
|
||||||
|
# RunActivityWorkflow detects this value and derives run dedup key from workflow_id.
|
||||||
|
SCHEDULED_TRIGGER_KEY = "scheduled"
|
||||||
|
|
||||||
|
# T24: misfire_policy → ScheduleOverlapPolicy
|
||||||
|
_MISFIRE_TO_OVERLAP: dict[str, ScheduleOverlapPolicy] = {
|
||||||
|
"skip": ScheduleOverlapPolicy.SKIP,
|
||||||
|
"catchup": ScheduleOverlapPolicy.BUFFER_ALL,
|
||||||
|
"compress": ScheduleOverlapPolicy.BUFFER_ONE,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def schedule_id(activity_id: str | UUID) -> str:
|
||||||
|
"""Return the canonical Temporal Schedule ID for an ActivityDefinition."""
|
||||||
|
return f"activity-schedule-{activity_id}"
|
||||||
|
|
||||||
|
|
||||||
|
def _overlap_policy(misfire_policy: str) -> ScheduleOverlapPolicy:
|
||||||
|
return _MISFIRE_TO_OVERLAP.get(misfire_policy, ScheduleOverlapPolicy.SKIP)
|
||||||
|
|
||||||
|
|
||||||
|
def _build_schedule(defn: ActivityDefinition) -> Schedule:
|
||||||
|
"""Construct a Temporal Schedule object from a cron ActivityDefinition."""
|
||||||
|
assert isinstance(defn.trigger_config, CronTriggerConfig)
|
||||||
|
cfg: CronTriggerConfig = defn.trigger_config
|
||||||
|
|
||||||
|
# Workflow ID uses ${firstScheduledTime} so each schedule fire gets a
|
||||||
|
# unique workflow ID, enabling replay/audit without ID conflicts.
|
||||||
|
action = ScheduleActionStartWorkflow(
|
||||||
|
"RunActivityWorkflow",
|
||||||
|
args=[str(defn.id), SCHEDULED_TRIGGER_KEY, None],
|
||||||
|
id=f"activity-{defn.id}:${{firstScheduledTime}}",
|
||||||
|
task_queue=_ORCHESTRATOR_TASK_QUEUE,
|
||||||
|
)
|
||||||
|
|
||||||
|
spec = ScheduleSpec(
|
||||||
|
cron_expressions=[cfg.cron_expression],
|
||||||
|
timezone_name=cfg.timezone,
|
||||||
|
jitter=timedelta(seconds=cfg.jitter_seconds) if cfg.jitter_seconds else None,
|
||||||
|
)
|
||||||
|
|
||||||
|
policy = SchedulePolicy(overlap=_overlap_policy(cfg.misfire_policy))
|
||||||
|
state = ScheduleState(paused=not defn.enabled)
|
||||||
|
|
||||||
|
return Schedule(action=action, spec=spec, policy=policy, state=state)
|
||||||
|
|
||||||
|
|
||||||
|
async def upsert_schedule(client: Client, defn: ActivityDefinition) -> ScheduleHandle:
|
||||||
|
"""Create or update a Temporal Schedule for a cron ActivityDefinition.
|
||||||
|
|
||||||
|
- Only operates on definitions with trigger_type='cron'.
|
||||||
|
- If enabled=False the schedule is created paused.
|
||||||
|
- For misfire_policy='catchup', triggers a backfill covering the last hour
|
||||||
|
after each upsert to replay any recently missed fires.
|
||||||
|
|
||||||
|
Returns the ScheduleHandle for the created/updated schedule.
|
||||||
|
"""
|
||||||
|
if not isinstance(defn.trigger_config, CronTriggerConfig):
|
||||||
|
raise ValueError(
|
||||||
|
f"upsert_schedule requires trigger_type='cron', "
|
||||||
|
f"got {defn.trigger_config.trigger_type!r}"
|
||||||
|
)
|
||||||
|
|
||||||
|
sid = schedule_id(defn.id)
|
||||||
|
sched = _build_schedule(defn)
|
||||||
|
|
||||||
|
try:
|
||||||
|
handle = await client.create_schedule(sid, sched)
|
||||||
|
except RPCError:
|
||||||
|
# Schedule already exists — update it in place.
|
||||||
|
handle = client.get_schedule_handle(sid)
|
||||||
|
|
||||||
|
async def _updater(input: ScheduleUpdateInput) -> ScheduleUpdate: # noqa: ARG001
|
||||||
|
return ScheduleUpdate(schedule=sched)
|
||||||
|
|
||||||
|
await handle.update(_updater)
|
||||||
|
|
||||||
|
# Sync pause state explicitly (update replaces the schedule object
|
||||||
|
# but pause state is part of ScheduleState, already embedded above).
|
||||||
|
if defn.enabled:
|
||||||
|
await handle.unpause()
|
||||||
|
else:
|
||||||
|
await handle.pause(note="disabled via upsert_schedule")
|
||||||
|
|
||||||
|
# T24 catchup: backfill any fires missed in the last hour.
|
||||||
|
if isinstance(defn.trigger_config, CronTriggerConfig):
|
||||||
|
if defn.trigger_config.misfire_policy == "catchup":
|
||||||
|
now = datetime.now(tz=timezone.utc)
|
||||||
|
backfill_start = now - timedelta(hours=1)
|
||||||
|
await handle.backfill(
|
||||||
|
[
|
||||||
|
ScheduleBackfill(
|
||||||
|
start_at=backfill_start,
|
||||||
|
end_at=now,
|
||||||
|
overlap=ScheduleOverlapPolicy.BUFFER_ALL,
|
||||||
|
)
|
||||||
|
]
|
||||||
|
)
|
||||||
|
|
||||||
|
return handle
|
||||||
|
|
||||||
|
|
||||||
|
async def delete_schedule(client: Client, activity_id: str | UUID) -> None:
|
||||||
|
"""Delete the Temporal Schedule for the given activity_id.
|
||||||
|
|
||||||
|
No-op if the schedule does not exist.
|
||||||
|
"""
|
||||||
|
handle = client.get_schedule_handle(schedule_id(activity_id))
|
||||||
|
try:
|
||||||
|
await handle.delete()
|
||||||
|
except RPCError:
|
||||||
|
pass # Not found — treat as success.
|
||||||
|
|
||||||
|
|
||||||
|
async def list_schedules(client: Client) -> list[dict]:
|
||||||
|
"""Enumerate all activity-core Temporal Schedules.
|
||||||
|
|
||||||
|
Returns a list of dicts: [{"schedule_id": str, "activity_id": str}, ...]
|
||||||
|
"""
|
||||||
|
prefix = "activity-schedule-"
|
||||||
|
results: list[dict] = []
|
||||||
|
async for entry in await client.list_schedules():
|
||||||
|
if entry.id.startswith(prefix):
|
||||||
|
results.append(
|
||||||
|
{
|
||||||
|
"schedule_id": entry.id,
|
||||||
|
"activity_id": entry.id[len(prefix) :],
|
||||||
|
}
|
||||||
|
)
|
||||||
|
return results
|
||||||
123
src/activity_core/sync_schedules.py
Normal file
123
src/activity_core/sync_schedules.py
Normal file
@@ -0,0 +1,123 @@
|
|||||||
|
"""Bootstrap script: sync Temporal Schedules with the ActivityDefinition DB.
|
||||||
|
|
||||||
|
T23: On startup, ensures every enabled cron ActivityDefinition has a live
|
||||||
|
Temporal Schedule, and removes orphaned schedules that have no matching DB row.
|
||||||
|
|
||||||
|
Run directly:
|
||||||
|
ACTCORE_DB_URL=... uv run python -m activity_core.sync_schedules
|
||||||
|
|
||||||
|
Also called from worker.py before the worker enters its run loop.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
import uuid
|
||||||
|
|
||||||
|
from sqlalchemy import select
|
||||||
|
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
|
||||||
|
from temporalio.client import Client
|
||||||
|
|
||||||
|
from activity_core.models import ActivityDefinition, CronTriggerConfig
|
||||||
|
from activity_core.orm import ActivityDefinition as ActivityDefinitionRow
|
||||||
|
from activity_core.schedule_manager import delete_schedule, list_schedules, upsert_schedule
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
TEMPORAL_HOST = os.environ.get("TEMPORAL_HOST", "localhost:7233")
|
||||||
|
TEMPORAL_NAMESPACE = os.environ.get("TEMPORAL_NAMESPACE", "default")
|
||||||
|
|
||||||
|
|
||||||
|
def _row_to_domain(row: ActivityDefinitionRow) -> ActivityDefinition:
|
||||||
|
"""Convert an ORM row to a domain ActivityDefinition for schedule_manager."""
|
||||||
|
return ActivityDefinition.model_validate(
|
||||||
|
{
|
||||||
|
"id": row.id,
|
||||||
|
"name": row.name,
|
||||||
|
"enabled": row.enabled,
|
||||||
|
"trigger_config": row.trigger_config,
|
||||||
|
"context_sources": row.context_sources,
|
||||||
|
"task_templates": row.task_templates,
|
||||||
|
"dedupe_key_strategy": row.dedupe_key_strategy,
|
||||||
|
"version": row.version,
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
async def sync(client: Client, db_url: str) -> None:
|
||||||
|
"""Reconcile Temporal Schedules against the ActivityDefinition table.
|
||||||
|
|
||||||
|
Steps:
|
||||||
|
1. Load all enabled cron ActivityDefinitions from Postgres.
|
||||||
|
2. Upsert a Temporal Schedule for each one.
|
||||||
|
3. Delete Temporal Schedules whose activity_id has no matching DB row
|
||||||
|
(tombstone cleanup for deleted or trigger-type-changed definitions).
|
||||||
|
"""
|
||||||
|
engine = create_async_engine(db_url)
|
||||||
|
session_factory = async_sessionmaker(engine, expire_on_commit=False)
|
||||||
|
|
||||||
|
try:
|
||||||
|
async with session_factory() as session:
|
||||||
|
rows = (
|
||||||
|
await session.scalars(
|
||||||
|
select(ActivityDefinitionRow).where(
|
||||||
|
ActivityDefinitionRow.trigger_type == "cron"
|
||||||
|
)
|
||||||
|
)
|
||||||
|
).all()
|
||||||
|
finally:
|
||||||
|
await engine.dispose()
|
||||||
|
|
||||||
|
db_activity_ids: set[str] = set()
|
||||||
|
upserted = 0
|
||||||
|
skipped = 0
|
||||||
|
|
||||||
|
for row in rows:
|
||||||
|
defn = _row_to_domain(row)
|
||||||
|
if not isinstance(defn.trigger_config, CronTriggerConfig):
|
||||||
|
continue # should not happen given the WHERE clause, but guard anyway
|
||||||
|
|
||||||
|
db_activity_ids.add(str(defn.id))
|
||||||
|
|
||||||
|
if defn.enabled:
|
||||||
|
await upsert_schedule(client, defn)
|
||||||
|
upserted += 1
|
||||||
|
logger.info("upserted schedule for activity %s (%s)", defn.id, defn.name)
|
||||||
|
else:
|
||||||
|
# Disabled definitions: schedule may exist (paused) — leave it;
|
||||||
|
# upsert_schedule already handles the paused state.
|
||||||
|
await upsert_schedule(client, defn)
|
||||||
|
skipped += 1
|
||||||
|
logger.info("upserted paused schedule for disabled activity %s", defn.id)
|
||||||
|
|
||||||
|
# Tombstone cleanup: remove Temporal Schedules with no matching DB row.
|
||||||
|
existing_schedules = await list_schedules(client)
|
||||||
|
deleted = 0
|
||||||
|
for entry in existing_schedules:
|
||||||
|
if entry["activity_id"] not in db_activity_ids:
|
||||||
|
await delete_schedule(client, entry["activity_id"])
|
||||||
|
deleted += 1
|
||||||
|
logger.info("deleted orphaned schedule %s", entry["schedule_id"])
|
||||||
|
|
||||||
|
logger.info(
|
||||||
|
"sync_schedules complete — upserted=%d skipped_disabled=%d deleted_orphans=%d",
|
||||||
|
upserted,
|
||||||
|
skipped,
|
||||||
|
deleted,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
async def main() -> None:
|
||||||
|
logging.basicConfig(level=logging.INFO)
|
||||||
|
db_url = os.environ.get("ACTCORE_DB_URL")
|
||||||
|
if not db_url:
|
||||||
|
raise RuntimeError("ACTCORE_DB_URL is required")
|
||||||
|
|
||||||
|
client = await Client.connect(TEMPORAL_HOST, namespace=TEMPORAL_NAMESPACE)
|
||||||
|
await sync(client, db_url)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
asyncio.run(main())
|
||||||
@@ -4,23 +4,31 @@ Starts two workers (wired up in T20):
|
|||||||
- orchestrator-tq: RunActivityWorkflow + its activities
|
- orchestrator-tq: RunActivityWorkflow + its activities
|
||||||
- task-execution-tq: TaskExecutorWorkflow
|
- task-execution-tq: TaskExecutorWorkflow
|
||||||
|
|
||||||
|
T23: Calls sync_schedules before entering the worker run loop to ensure
|
||||||
|
all cron ActivityDefinitions have live Temporal Schedules.
|
||||||
|
|
||||||
|
T31: Exposes Prometheus metrics via the Temporal SDK runtime on :9090/metrics.
|
||||||
|
|
||||||
Run with:
|
Run with:
|
||||||
TEMPORAL_HOST=localhost:7233 \
|
TEMPORAL_HOST=localhost:7233 \
|
||||||
ACTCORE_DB_URL=postgresql+asyncpg://actcore:actcore@localhost:5433/actcore \
|
ACTCORE_DB_URL=postgresql+asyncpg://actcore:actcore@localhost:5433/actcore \
|
||||||
python -m activity_core.worker
|
python -m activity_core.worker
|
||||||
|
|
||||||
Environment variables:
|
Environment variables:
|
||||||
TEMPORAL_HOST Temporal frontend address (default: localhost:7233)
|
TEMPORAL_HOST Temporal frontend address (default: localhost:7233)
|
||||||
TEMPORAL_NAMESPACE Temporal namespace (default: default)
|
TEMPORAL_NAMESPACE Temporal namespace (default: default)
|
||||||
ACTCORE_DB_URL App DB connection string (required)
|
ACTCORE_DB_URL App DB connection string (required)
|
||||||
|
PROMETHEUS_BIND_ADDR Prometheus metrics bind (default: 0.0.0.0:9090)
|
||||||
"""
|
"""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import asyncio
|
import asyncio
|
||||||
|
import logging
|
||||||
import os
|
import os
|
||||||
|
|
||||||
from temporalio.client import Client
|
from temporalio.client import Client
|
||||||
|
from temporalio.runtime import PrometheusConfig, Runtime, TelemetryConfig
|
||||||
from temporalio.worker import Worker
|
from temporalio.worker import Worker
|
||||||
|
|
||||||
from activity_core.activities import (
|
from activity_core.activities import (
|
||||||
@@ -30,10 +38,14 @@ from activity_core.activities import (
|
|||||||
persist_task_instance,
|
persist_task_instance,
|
||||||
resolve_context,
|
resolve_context,
|
||||||
)
|
)
|
||||||
|
from activity_core.sync_schedules import sync as sync_schedules
|
||||||
from activity_core.workflows import RunActivityWorkflow, TaskExecutorWorkflow
|
from activity_core.workflows import RunActivityWorkflow, TaskExecutorWorkflow
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
TEMPORAL_HOST = os.environ.get("TEMPORAL_HOST", "localhost:7233")
|
TEMPORAL_HOST = os.environ.get("TEMPORAL_HOST", "localhost:7233")
|
||||||
TEMPORAL_NAMESPACE = os.environ.get("TEMPORAL_NAMESPACE", "default")
|
TEMPORAL_NAMESPACE = os.environ.get("TEMPORAL_NAMESPACE", "default")
|
||||||
|
PROMETHEUS_BIND_ADDR = os.environ.get("PROMETHEUS_BIND_ADDR", "0.0.0.0:9090")
|
||||||
|
|
||||||
ORCHESTRATOR_TASK_QUEUE = "orchestrator-tq"
|
ORCHESTRATOR_TASK_QUEUE = "orchestrator-tq"
|
||||||
TASK_EXECUTION_TASK_QUEUE = "task-execution-tq"
|
TASK_EXECUTION_TASK_QUEUE = "task-execution-tq"
|
||||||
@@ -45,7 +57,23 @@ async def run() -> None:
|
|||||||
raise RuntimeError("ACTCORE_DB_URL is required")
|
raise RuntimeError("ACTCORE_DB_URL is required")
|
||||||
init_session_factory(db_url)
|
init_session_factory(db_url)
|
||||||
|
|
||||||
client = await Client.connect(TEMPORAL_HOST, namespace=TEMPORAL_NAMESPACE)
|
# T31: Configure the Temporal SDK runtime to emit metrics in Prometheus format.
|
||||||
|
runtime = Runtime(
|
||||||
|
telemetry=TelemetryConfig(
|
||||||
|
metrics=PrometheusConfig(bind_address=PROMETHEUS_BIND_ADDR)
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
client = await Client.connect(
|
||||||
|
TEMPORAL_HOST, namespace=TEMPORAL_NAMESPACE, runtime=runtime
|
||||||
|
)
|
||||||
|
|
||||||
|
# T23: Sync Temporal Schedules with the DB before workers start accepting tasks.
|
||||||
|
logger.info("Syncing Temporal Schedules with ActivityDefinition DB...")
|
||||||
|
try:
|
||||||
|
await sync_schedules(client, db_url)
|
||||||
|
except Exception:
|
||||||
|
logger.exception("schedule sync failed — continuing worker startup")
|
||||||
|
|
||||||
orchestrator_worker = Worker(
|
orchestrator_worker = Worker(
|
||||||
client,
|
client,
|
||||||
@@ -70,4 +98,5 @@ async def run() -> None:
|
|||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
|
logging.basicConfig(level=logging.INFO)
|
||||||
asyncio.run(run())
|
asyncio.run(run())
|
||||||
|
|||||||
@@ -15,7 +15,7 @@ import uuid
|
|||||||
from datetime import timedelta
|
from datetime import timedelta
|
||||||
|
|
||||||
from temporalio import workflow
|
from temporalio import workflow
|
||||||
from temporalio.common import RetryPolicy
|
from temporalio.common import RetryPolicy, SearchAttributeKey, TypedSearchAttributes, SearchAttributePair
|
||||||
|
|
||||||
with workflow.unsafe.imports_passed_through():
|
with workflow.unsafe.imports_passed_through():
|
||||||
from activity_core.activities import (
|
from activity_core.activities import (
|
||||||
@@ -25,6 +25,12 @@ with workflow.unsafe.imports_passed_through():
|
|||||||
resolve_context,
|
resolve_context,
|
||||||
)
|
)
|
||||||
from activity_core.template_engine import evaluate_templates
|
from activity_core.template_engine import evaluate_templates
|
||||||
|
from activity_core.schedule_manager import SCHEDULED_TRIGGER_KEY
|
||||||
|
|
||||||
|
# T32: Custom search attributes for Temporal visibility (must be registered in Temporal first).
|
||||||
|
# Registration: temporal operator search-attribute create --name ActivityId --type Keyword
|
||||||
|
_ACTIVITY_ID_KEY = SearchAttributeKey.for_keyword("ActivityId")
|
||||||
|
_ACTIVITY_NAME_KEY = SearchAttributeKey.for_keyword("ActivityName")
|
||||||
|
|
||||||
_RETRY_POLICY = RetryPolicy(
|
_RETRY_POLICY = RetryPolicy(
|
||||||
initial_interval=timedelta(seconds=1),
|
initial_interval=timedelta(seconds=1),
|
||||||
@@ -74,6 +80,16 @@ class RunActivityWorkflow:
|
|||||||
retry_policy=_RETRY_POLICY,
|
retry_policy=_RETRY_POLICY,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
# T32: Tag this workflow execution with activity metadata so runs are
|
||||||
|
# filterable in the Temporal UI (requires ActivityId + ActivityName to be
|
||||||
|
# registered as custom search attributes — see docs/runbook.md).
|
||||||
|
workflow.upsert_search_attributes(
|
||||||
|
TypedSearchAttributes([
|
||||||
|
SearchAttributePair(_ACTIVITY_ID_KEY, activity_id),
|
||||||
|
SearchAttributePair(_ACTIVITY_NAME_KEY, defn.get("name", "")),
|
||||||
|
])
|
||||||
|
)
|
||||||
|
|
||||||
# ── 2. Resolve context ────────────────────────────────────────────────
|
# ── 2. Resolve context ────────────────────────────────────────────────
|
||||||
context_snapshot: dict = await workflow.execute_activity(
|
context_snapshot: dict = await workflow.execute_activity(
|
||||||
resolve_context,
|
resolve_context,
|
||||||
@@ -89,9 +105,14 @@ class RunActivityWorkflow:
|
|||||||
|
|
||||||
# ── 4. Log the run ────────────────────────────────────────────────────
|
# ── 4. Log the run ────────────────────────────────────────────────────
|
||||||
# run_id is derived deterministically so log_run retries are idempotent.
|
# run_id is derived deterministically so log_run retries are idempotent.
|
||||||
run_id = str(
|
# For schedule-fired runs the trigger_key is the sentinel "scheduled";
|
||||||
uuid.uuid5(uuid.NAMESPACE_URL, f"{activity_id}:{trigger_key}")
|
# each fire has a unique workflow_id (embeds ${firstScheduledTime}), so
|
||||||
)
|
# we use the workflow_id as the dedup key instead.
|
||||||
|
if trigger_key == SCHEDULED_TRIGGER_KEY:
|
||||||
|
dedup_source = workflow.info().workflow_id
|
||||||
|
else:
|
||||||
|
dedup_source = f"{activity_id}:{trigger_key}"
|
||||||
|
run_id = str(uuid.uuid5(uuid.NAMESPACE_URL, dedup_source))
|
||||||
await workflow.execute_activity(
|
await workflow.execute_activity(
|
||||||
log_run,
|
log_run,
|
||||||
{
|
{
|
||||||
|
|||||||
0
tests/__init__.py
Normal file
0
tests/__init__.py
Normal file
303
tests/test_event_router.py
Normal file
303
tests/test_event_router.py
Normal file
@@ -0,0 +1,303 @@
|
|||||||
|
"""T29: Integration test — publish event → observe workflow run.
|
||||||
|
|
||||||
|
Requires the docker compose stack to be running including NATS:
|
||||||
|
docker compose -f docker-compose.dev.yml up -d
|
||||||
|
|
||||||
|
Run with:
|
||||||
|
ACTCORE_DB_URL=postgresql+asyncpg://actcore:actcore@localhost:5433/actcore \
|
||||||
|
NATS_URL=nats://localhost:4222 \
|
||||||
|
TEMPORAL_HOST=localhost:7233 \
|
||||||
|
uv run pytest tests/test_event_router.py -v -s
|
||||||
|
|
||||||
|
These tests are skipped automatically if NATS or Temporal is unreachable.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import json
|
||||||
|
import os
|
||||||
|
import uuid
|
||||||
|
from datetime import datetime, timezone
|
||||||
|
from unittest.mock import AsyncMock, MagicMock
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
|
||||||
|
from activity_core.event_router import EventRouter
|
||||||
|
from activity_core.models import EventEnvelope, EventTriggerConfig
|
||||||
|
|
||||||
|
|
||||||
|
# ── Unit tests (no external deps) ────────────────────────────────────────────
|
||||||
|
|
||||||
|
def _make_envelope(
|
||||||
|
event_type: str = "user.created",
|
||||||
|
payload: dict | None = None,
|
||||||
|
) -> EventEnvelope:
|
||||||
|
return EventEnvelope(
|
||||||
|
event_id=str(uuid.uuid4()),
|
||||||
|
type=event_type,
|
||||||
|
source="test-service",
|
||||||
|
occurred_at=datetime.now(tz=timezone.utc),
|
||||||
|
subject="user/123",
|
||||||
|
trace_id=str(uuid.uuid4()),
|
||||||
|
payload=payload or {},
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _make_router() -> EventRouter:
|
||||||
|
"""Return an EventRouter wired with mock clients (no real connections)."""
|
||||||
|
temporal_mock = MagicMock()
|
||||||
|
return EventRouter(
|
||||||
|
nats_url="nats://localhost:4222",
|
||||||
|
temporal_client=temporal_mock,
|
||||||
|
db_url="postgresql+asyncpg://actcore:actcore@localhost:5433/actcore",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
# T27: _matches unit tests
|
||||||
|
|
||||||
|
def test_matches_exact_event_type() -> None:
|
||||||
|
router = _make_router()
|
||||||
|
cfg = EventTriggerConfig(event_type="user.created")
|
||||||
|
envelope = _make_envelope(event_type="user.created")
|
||||||
|
assert router._matches(envelope, cfg)
|
||||||
|
|
||||||
|
|
||||||
|
def test_matches_wrong_event_type() -> None:
|
||||||
|
router = _make_router()
|
||||||
|
cfg = EventTriggerConfig(event_type="user.updated")
|
||||||
|
envelope = _make_envelope(event_type="user.created")
|
||||||
|
assert not router._matches(envelope, cfg)
|
||||||
|
|
||||||
|
|
||||||
|
def test_matches_with_filters_all_present() -> None:
|
||||||
|
router = _make_router()
|
||||||
|
cfg = EventTriggerConfig(
|
||||||
|
event_type="user.created",
|
||||||
|
filters={"region": "eu", "tier": "pro"},
|
||||||
|
)
|
||||||
|
envelope = _make_envelope(
|
||||||
|
event_type="user.created",
|
||||||
|
payload={"region": "eu", "tier": "pro", "extra": "ignored"},
|
||||||
|
)
|
||||||
|
assert router._matches(envelope, cfg)
|
||||||
|
|
||||||
|
|
||||||
|
def test_matches_with_filters_partial_missing() -> None:
|
||||||
|
router = _make_router()
|
||||||
|
cfg = EventTriggerConfig(
|
||||||
|
event_type="user.created",
|
||||||
|
filters={"region": "eu", "tier": "pro"},
|
||||||
|
)
|
||||||
|
envelope = _make_envelope(
|
||||||
|
event_type="user.created",
|
||||||
|
payload={"region": "eu"}, # "tier" missing
|
||||||
|
)
|
||||||
|
assert not router._matches(envelope, cfg)
|
||||||
|
|
||||||
|
|
||||||
|
def test_matches_with_filters_wrong_value() -> None:
|
||||||
|
router = _make_router()
|
||||||
|
cfg = EventTriggerConfig(event_type="order.placed", filters={"status": "paid"})
|
||||||
|
envelope = _make_envelope(event_type="order.placed", payload={"status": "pending"})
|
||||||
|
assert not router._matches(envelope, cfg)
|
||||||
|
|
||||||
|
|
||||||
|
# T28: _dispatch unit test (mocked Temporal client)
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_dispatch_starts_workflow_with_correct_id() -> None:
|
||||||
|
temporal_mock = AsyncMock()
|
||||||
|
handle_mock = AsyncMock()
|
||||||
|
temporal_mock.start_workflow.return_value = handle_mock
|
||||||
|
|
||||||
|
router = EventRouter(
|
||||||
|
nats_url="nats://localhost:4222",
|
||||||
|
temporal_client=temporal_mock,
|
||||||
|
db_url="postgresql+asyncpg://actcore:actcore@localhost:5433/actcore",
|
||||||
|
)
|
||||||
|
|
||||||
|
activity_id = str(uuid.uuid4())
|
||||||
|
envelope = _make_envelope()
|
||||||
|
|
||||||
|
await router._dispatch(activity_id, envelope)
|
||||||
|
|
||||||
|
expected_id = f"activity-{activity_id}:{envelope.event_id}"
|
||||||
|
temporal_mock.start_workflow.assert_called_once()
|
||||||
|
call_args = temporal_mock.start_workflow.call_args
|
||||||
|
assert call_args.kwargs["id"] == expected_id
|
||||||
|
assert call_args.args[0] == "RunActivityWorkflow"
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_dispatch_duplicate_event_is_silently_skipped() -> None:
|
||||||
|
from temporalio.exceptions import WorkflowAlreadyStartedError
|
||||||
|
|
||||||
|
temporal_mock = AsyncMock()
|
||||||
|
temporal_mock.start_workflow.side_effect = WorkflowAlreadyStartedError(
|
||||||
|
workflow_id="activity-x:y", run_id="z", workflow_type="RunActivityWorkflow"
|
||||||
|
)
|
||||||
|
|
||||||
|
router = EventRouter(
|
||||||
|
nats_url="nats://localhost:4222",
|
||||||
|
temporal_client=temporal_mock,
|
||||||
|
db_url="postgresql+asyncpg://actcore:actcore@localhost:5433/actcore",
|
||||||
|
)
|
||||||
|
|
||||||
|
# Should not raise
|
||||||
|
await router._dispatch(str(uuid.uuid4()), _make_envelope())
|
||||||
|
|
||||||
|
|
||||||
|
# T28: _handle_message unit test (mocked NATS message)
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_handle_message_invalid_json_nacks() -> None:
|
||||||
|
router = _make_router()
|
||||||
|
router._session_factory = None # not needed for this test path
|
||||||
|
|
||||||
|
msg = MagicMock()
|
||||||
|
msg.data = b"not-json"
|
||||||
|
msg.nak = AsyncMock()
|
||||||
|
msg.ack = AsyncMock()
|
||||||
|
|
||||||
|
await router._handle_message(msg)
|
||||||
|
|
||||||
|
msg.nak.assert_called_once()
|
||||||
|
msg.ack.assert_not_called()
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_handle_message_no_match_acks_without_dispatch() -> None:
|
||||||
|
temporal_mock = AsyncMock()
|
||||||
|
router = EventRouter(
|
||||||
|
nats_url="nats://localhost:4222",
|
||||||
|
temporal_client=temporal_mock,
|
||||||
|
db_url="postgresql+asyncpg://actcore:actcore@localhost:5433/actcore",
|
||||||
|
)
|
||||||
|
# Patch _load_event_definitions to return empty (no definitions match)
|
||||||
|
router._load_event_definitions = AsyncMock(return_value=[])
|
||||||
|
|
||||||
|
envelope = _make_envelope()
|
||||||
|
msg = MagicMock()
|
||||||
|
msg.data = envelope.model_dump_json().encode()
|
||||||
|
msg.ack = AsyncMock()
|
||||||
|
msg.nak = AsyncMock()
|
||||||
|
|
||||||
|
await router._handle_message(msg)
|
||||||
|
|
||||||
|
msg.ack.assert_called_once()
|
||||||
|
temporal_mock.start_workflow.assert_not_called()
|
||||||
|
|
||||||
|
|
||||||
|
# ── Integration tests (require docker-compose stack) ─────────────────────────
|
||||||
|
|
||||||
|
NATS_URL = os.environ.get("NATS_URL", "nats://localhost:4222")
|
||||||
|
TEMPORAL_HOST = os.environ.get("TEMPORAL_HOST", "localhost:7233")
|
||||||
|
ACTCORE_DB_URL = os.environ.get(
|
||||||
|
"ACTCORE_DB_URL",
|
||||||
|
"postgresql+asyncpg://actcore:actcore@localhost:5433/actcore",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
async def _nats_reachable() -> bool:
|
||||||
|
try:
|
||||||
|
import nats
|
||||||
|
nc = await nats.connect(NATS_URL, connect_timeout=2)
|
||||||
|
await nc.close()
|
||||||
|
return True
|
||||||
|
except Exception:
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
|
async def _temporal_reachable() -> bool:
|
||||||
|
try:
|
||||||
|
from temporalio.client import Client
|
||||||
|
client = await Client.connect(TEMPORAL_HOST)
|
||||||
|
await client.service_client.health_check()
|
||||||
|
return True
|
||||||
|
except Exception:
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture(scope="module")
|
||||||
|
async def integration_skip():
|
||||||
|
"""Skip the integration block if NATS or Temporal is unreachable."""
|
||||||
|
if not (await _nats_reachable() and await _temporal_reachable()):
|
||||||
|
pytest.skip("NATS and/or Temporal not reachable — skipping integration tests")
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_publish_event_starts_workflow(integration_skip: None) -> None:
|
||||||
|
"""Publish a NATS event and verify RunActivityWorkflow is started in Temporal."""
|
||||||
|
import nats as nats_lib
|
||||||
|
from sqlalchemy.ext.asyncio import create_async_engine, async_sessionmaker, AsyncSession
|
||||||
|
from sqlalchemy import select
|
||||||
|
from temporalio.client import Client, WorkflowExecutionStatus
|
||||||
|
|
||||||
|
from activity_core.orm import ActivityDefinition as ActivityDefinitionRow
|
||||||
|
|
||||||
|
# Create an event-triggered ActivityDefinition in the DB.
|
||||||
|
engine = create_async_engine(ACTCORE_DB_URL)
|
||||||
|
session_factory = async_sessionmaker(engine, expire_on_commit=False)
|
||||||
|
activity_id = uuid.uuid4()
|
||||||
|
event_type = f"test.event.{uuid.uuid4().hex[:8]}"
|
||||||
|
|
||||||
|
async with session_factory() as session:
|
||||||
|
async with session.begin():
|
||||||
|
row = ActivityDefinitionRow(
|
||||||
|
id=activity_id,
|
||||||
|
name=f"integration-test-{activity_id}",
|
||||||
|
enabled=True,
|
||||||
|
trigger_type="event",
|
||||||
|
trigger_config={"trigger_type": "event", "event_type": event_type, "filters": {}},
|
||||||
|
context_sources=[],
|
||||||
|
task_templates=[],
|
||||||
|
dedupe_key_strategy="skip",
|
||||||
|
version=1,
|
||||||
|
)
|
||||||
|
session.add(row)
|
||||||
|
|
||||||
|
temporal_client = await Client.connect(TEMPORAL_HOST)
|
||||||
|
router = EventRouter(
|
||||||
|
nats_url=NATS_URL,
|
||||||
|
temporal_client=temporal_client,
|
||||||
|
db_url=ACTCORE_DB_URL,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Start the router in the background.
|
||||||
|
router_task = asyncio.create_task(router.start())
|
||||||
|
await asyncio.sleep(1) # allow subscription to establish
|
||||||
|
|
||||||
|
# Publish a matching event.
|
||||||
|
event_id = str(uuid.uuid4())
|
||||||
|
envelope = EventEnvelope(
|
||||||
|
event_id=event_id,
|
||||||
|
type=event_type,
|
||||||
|
source="integration-test",
|
||||||
|
occurred_at=datetime.now(tz=timezone.utc),
|
||||||
|
subject="test/1",
|
||||||
|
trace_id=str(uuid.uuid4()),
|
||||||
|
)
|
||||||
|
|
||||||
|
nc = await nats_lib.connect(NATS_URL)
|
||||||
|
await nc.publish(f"activity.{event_type}", envelope.model_dump_json().encode())
|
||||||
|
await nc.flush()
|
||||||
|
await nc.close()
|
||||||
|
|
||||||
|
# Give the router time to process and Temporal time to receive the start.
|
||||||
|
await asyncio.sleep(3)
|
||||||
|
|
||||||
|
# Assert the workflow was started.
|
||||||
|
expected_wf_id = f"activity-{activity_id}:{event_id}"
|
||||||
|
try:
|
||||||
|
desc = await temporal_client.get_workflow_handle(expected_wf_id).describe()
|
||||||
|
assert desc is not None, "Workflow handle should exist"
|
||||||
|
except Exception as e:
|
||||||
|
pytest.fail(f"Workflow {expected_wf_id!r} was not started: {e}")
|
||||||
|
finally:
|
||||||
|
router_task.cancel()
|
||||||
|
try:
|
||||||
|
await router_task
|
||||||
|
except asyncio.CancelledError:
|
||||||
|
pass
|
||||||
|
await engine.dispose()
|
||||||
168
tests/test_schedule_lifecycle.py
Normal file
168
tests/test_schedule_lifecycle.py
Normal file
@@ -0,0 +1,168 @@
|
|||||||
|
"""T25: Schedule pause/resume lifecycle tests.
|
||||||
|
|
||||||
|
Tests schedule_manager.py against a local embedded Temporal server
|
||||||
|
(temporalio[testing] — WorkflowEnvironment.start_local()).
|
||||||
|
|
||||||
|
Requires no Docker; the Temporal testing library bundles a self-contained server.
|
||||||
|
|
||||||
|
Run with:
|
||||||
|
uv run pytest tests/test_schedule_lifecycle.py -v
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import uuid
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
from temporalio.client import ScheduleOverlapPolicy
|
||||||
|
from temporalio.testing import WorkflowEnvironment
|
||||||
|
|
||||||
|
from activity_core.models import ActivityDefinition, CronTriggerConfig
|
||||||
|
from activity_core.schedule_manager import (
|
||||||
|
delete_schedule,
|
||||||
|
list_schedules,
|
||||||
|
schedule_id,
|
||||||
|
upsert_schedule,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _make_defn(
|
||||||
|
*,
|
||||||
|
cron: str = "0 9 * * 1-5",
|
||||||
|
misfire_policy: str = "skip",
|
||||||
|
enabled: bool = True,
|
||||||
|
jitter: int = 0,
|
||||||
|
) -> ActivityDefinition:
|
||||||
|
return ActivityDefinition(
|
||||||
|
id=uuid.uuid4(),
|
||||||
|
name="test-activity",
|
||||||
|
enabled=enabled,
|
||||||
|
trigger_config=CronTriggerConfig(
|
||||||
|
cron_expression=cron,
|
||||||
|
misfire_policy=misfire_policy,
|
||||||
|
jitter_seconds=jitter,
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture(scope="module")
|
||||||
|
async def env():
|
||||||
|
"""Start a local embedded Temporal server for the test module."""
|
||||||
|
async with await WorkflowEnvironment.start_local() as e:
|
||||||
|
yield e
|
||||||
|
|
||||||
|
|
||||||
|
# ── T25a: upsert creates a schedule and list_schedules finds it ──────────────
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_upsert_schedule_creates_schedule(env: WorkflowEnvironment) -> None:
|
||||||
|
defn = _make_defn()
|
||||||
|
sid = schedule_id(defn.id)
|
||||||
|
|
||||||
|
await upsert_schedule(env.client, defn)
|
||||||
|
|
||||||
|
schedules = await list_schedules(env.client)
|
||||||
|
ids = [s["schedule_id"] for s in schedules]
|
||||||
|
assert sid in ids, f"Expected schedule {sid!r} in {ids}"
|
||||||
|
|
||||||
|
# Cleanup
|
||||||
|
await delete_schedule(env.client, defn.id)
|
||||||
|
|
||||||
|
|
||||||
|
# ── T25b: upsert with enabled=False creates a paused schedule ────────────────
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_upsert_disabled_creates_paused_schedule(env: WorkflowEnvironment) -> None:
|
||||||
|
defn = _make_defn(enabled=False)
|
||||||
|
|
||||||
|
await upsert_schedule(env.client, defn)
|
||||||
|
|
||||||
|
handle = env.client.get_schedule_handle(schedule_id(defn.id))
|
||||||
|
desc = await handle.describe()
|
||||||
|
assert desc.schedule.state.paused, "Schedule should be paused when enabled=False"
|
||||||
|
|
||||||
|
await delete_schedule(env.client, defn.id)
|
||||||
|
|
||||||
|
|
||||||
|
# ── T25c: second upsert (enabled=True) unpauses the schedule ────────────────
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_upsert_reenables_paused_schedule(env: WorkflowEnvironment) -> None:
|
||||||
|
defn_disabled = _make_defn(enabled=False)
|
||||||
|
|
||||||
|
await upsert_schedule(env.client, defn_disabled)
|
||||||
|
|
||||||
|
# Re-enable the same activity
|
||||||
|
defn_enabled = ActivityDefinition(
|
||||||
|
id=defn_disabled.id,
|
||||||
|
name=defn_disabled.name,
|
||||||
|
enabled=True,
|
||||||
|
trigger_config=defn_disabled.trigger_config,
|
||||||
|
)
|
||||||
|
await upsert_schedule(env.client, defn_enabled)
|
||||||
|
|
||||||
|
handle = env.client.get_schedule_handle(schedule_id(defn_enabled.id))
|
||||||
|
desc = await handle.describe()
|
||||||
|
assert not desc.schedule.state.paused, "Schedule should be unpaused after re-enable"
|
||||||
|
|
||||||
|
await delete_schedule(env.client, defn_enabled.id)
|
||||||
|
|
||||||
|
|
||||||
|
# ── T25d: delete_schedule removes the schedule ───────────────────────────────
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_delete_schedule_removes_schedule(env: WorkflowEnvironment) -> None:
|
||||||
|
defn = _make_defn()
|
||||||
|
|
||||||
|
await upsert_schedule(env.client, defn)
|
||||||
|
await delete_schedule(env.client, defn.id)
|
||||||
|
|
||||||
|
schedules = await list_schedules(env.client)
|
||||||
|
ids = [s["schedule_id"] for s in schedules]
|
||||||
|
assert schedule_id(defn.id) not in ids, "Schedule should be gone after delete"
|
||||||
|
|
||||||
|
|
||||||
|
# ── T25e: delete_schedule is idempotent (no-op for non-existent schedule) ────
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_delete_schedule_nonexistent_is_noop(env: WorkflowEnvironment) -> None:
|
||||||
|
# Should not raise
|
||||||
|
await delete_schedule(env.client, uuid.uuid4())
|
||||||
|
|
||||||
|
|
||||||
|
# ── T24: misfire_policy round-trip ───────────────────────────────────────────
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_misfire_policy_skip_sets_overlap_skip(env: WorkflowEnvironment) -> None:
|
||||||
|
defn = _make_defn(misfire_policy="skip")
|
||||||
|
await upsert_schedule(env.client, defn)
|
||||||
|
|
||||||
|
handle = env.client.get_schedule_handle(schedule_id(defn.id))
|
||||||
|
desc = await handle.describe()
|
||||||
|
assert desc.schedule.policy.overlap == ScheduleOverlapPolicy.SKIP
|
||||||
|
|
||||||
|
await delete_schedule(env.client, defn.id)
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_misfire_policy_catchup_sets_overlap_buffer_all(env: WorkflowEnvironment) -> None:
|
||||||
|
defn = _make_defn(misfire_policy="catchup")
|
||||||
|
await upsert_schedule(env.client, defn)
|
||||||
|
|
||||||
|
handle = env.client.get_schedule_handle(schedule_id(defn.id))
|
||||||
|
desc = await handle.describe()
|
||||||
|
assert desc.schedule.policy.overlap == ScheduleOverlapPolicy.BUFFER_ALL
|
||||||
|
|
||||||
|
await delete_schedule(env.client, defn.id)
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.asyncio
|
||||||
|
async def test_misfire_policy_compress_sets_overlap_buffer_one(env: WorkflowEnvironment) -> None:
|
||||||
|
defn = _make_defn(misfire_policy="compress")
|
||||||
|
await upsert_schedule(env.client, defn)
|
||||||
|
|
||||||
|
handle = env.client.get_schedule_handle(schedule_id(defn.id))
|
||||||
|
desc = await handle.describe()
|
||||||
|
assert desc.schedule.policy.overlap == ScheduleOverlapPolicy.BUFFER_ONE
|
||||||
|
|
||||||
|
await delete_schedule(env.client, defn.id)
|
||||||
11
uv.lock
generated
11
uv.lock
generated
@@ -9,6 +9,7 @@ dependencies = [
|
|||||||
{ name = "alembic" },
|
{ name = "alembic" },
|
||||||
{ name = "asyncpg" },
|
{ name = "asyncpg" },
|
||||||
{ name = "fastapi" },
|
{ name = "fastapi" },
|
||||||
|
{ name = "nats-py" },
|
||||||
{ name = "pydantic" },
|
{ name = "pydantic" },
|
||||||
{ name = "sqlalchemy", extra = ["asyncio"] },
|
{ name = "sqlalchemy", extra = ["asyncio"] },
|
||||||
{ name = "temporalio" },
|
{ name = "temporalio" },
|
||||||
@@ -27,6 +28,7 @@ requires-dist = [
|
|||||||
{ name = "alembic", specifier = ">=1.14" },
|
{ name = "alembic", specifier = ">=1.14" },
|
||||||
{ name = "asyncpg", specifier = ">=0.29" },
|
{ name = "asyncpg", specifier = ">=0.29" },
|
||||||
{ name = "fastapi", specifier = ">=0.115" },
|
{ name = "fastapi", specifier = ">=0.115" },
|
||||||
|
{ name = "nats-py", specifier = ">=2.7" },
|
||||||
{ name = "pydantic", specifier = ">=2.0" },
|
{ name = "pydantic", specifier = ">=2.0" },
|
||||||
{ name = "pytest", marker = "extra == 'dev'", specifier = ">=8.0" },
|
{ name = "pytest", marker = "extra == 'dev'", specifier = ">=8.0" },
|
||||||
{ name = "pytest-asyncio", marker = "extra == 'dev'", specifier = ">=0.24" },
|
{ name = "pytest-asyncio", marker = "extra == 'dev'", specifier = ">=0.24" },
|
||||||
@@ -367,6 +369,15 @@ wheels = [
|
|||||||
{ url = "https://files.pythonhosted.org/packages/70/bc/6f1c2f612465f5fa89b95bead1f44dcb607670fd42891d8fdcd5d039f4f4/markupsafe-3.0.3-cp314-cp314t-win_arm64.whl", hash = "sha256:32001d6a8fc98c8cb5c947787c5d08b0a50663d139f1305bac5885d98d9b40fa", size = 14146 },
|
{ url = "https://files.pythonhosted.org/packages/70/bc/6f1c2f612465f5fa89b95bead1f44dcb607670fd42891d8fdcd5d039f4f4/markupsafe-3.0.3-cp314-cp314t-win_arm64.whl", hash = "sha256:32001d6a8fc98c8cb5c947787c5d08b0a50663d139f1305bac5885d98d9b40fa", size = 14146 },
|
||||||
]
|
]
|
||||||
|
|
||||||
|
[[package]]
|
||||||
|
name = "nats-py"
|
||||||
|
version = "2.14.0"
|
||||||
|
source = { registry = "https://pypi.org/simple" }
|
||||||
|
sdist = { url = "https://files.pythonhosted.org/packages/c3/f8/b956c4621ba88748ed707c52e69f95b7a50c8914e750edca59a5bef84a76/nats_py-2.14.0.tar.gz", hash = "sha256:4ed02cb8e3b55c68074a063aa2687087115d805d1513297da90cb2068fb07bed", size = 120751 }
|
||||||
|
wheels = [
|
||||||
|
{ url = "https://files.pythonhosted.org/packages/f9/39/0e87753df1072254bac190b33ed34b264f28f6aa9bea0f01b7e818071756/nats_py-2.14.0-py3-none-any.whl", hash = "sha256:4116f5d2233ce16e63c3d5538fa40a5e207f75fcf42a741773929ddf1e29d19d", size = 82259 },
|
||||||
|
]
|
||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "nexus-rpc"
|
name = "nexus-rpc"
|
||||||
version = "1.3.0"
|
version = "1.3.0"
|
||||||
|
|||||||
@@ -3,56 +3,56 @@ id: custodian-WP-0002
|
|||||||
type: workplan
|
type: workplan
|
||||||
domain: custodian
|
domain: custodian
|
||||||
repo: activity-core
|
repo: activity-core
|
||||||
status: active
|
status: done
|
||||||
state_hub_workstream_id: 3a4f47d9-8bc1-434e-acb4-bed5d4dacda0
|
state_hub_workstream_id: 3a4f47d9-8bc1-434e-acb4-bed5d4dacda0
|
||||||
tasks:
|
tasks:
|
||||||
- id: T22
|
- id: T22
|
||||||
title: Write schedule_manager.py
|
title: Write schedule_manager.py
|
||||||
status: todo
|
status: done
|
||||||
state_hub_task_id: e50550d1-9904-41d7-afd8-492a1f1e91b8
|
state_hub_task_id: e50550d1-9904-41d7-afd8-492a1f1e91b8
|
||||||
- id: T23
|
- id: T23
|
||||||
title: Bootstrap script to sync schedules on startup
|
title: Bootstrap script to sync schedules on startup
|
||||||
status: todo
|
status: done
|
||||||
state_hub_task_id: 5a1f7fa3-acb9-4f60-9892-c9eaa120272e
|
state_hub_task_id: 5a1f7fa3-acb9-4f60-9892-c9eaa120272e
|
||||||
- id: T24
|
- id: T24
|
||||||
title: Handle misfire policy in schedule config
|
title: Handle misfire policy in schedule config
|
||||||
status: todo
|
status: done
|
||||||
state_hub_task_id: 00231668-95c5-447f-b3d0-1fb8c20b487f
|
state_hub_task_id: 00231668-95c5-447f-b3d0-1fb8c20b487f
|
||||||
- id: T25
|
- id: T25
|
||||||
title: Test schedule pause/resume lifecycle
|
title: Test schedule pause/resume lifecycle
|
||||||
status: todo
|
status: done
|
||||||
state_hub_task_id: 7abfd375-ea9d-4209-8371-e5664dc2c6c4
|
state_hub_task_id: 7abfd375-ea9d-4209-8371-e5664dc2c6c4
|
||||||
- id: T26
|
- id: T26
|
||||||
title: Implement Event Router service
|
title: Implement Event Router service
|
||||||
status: todo
|
status: done
|
||||||
state_hub_task_id: 68b6610b-159c-4f1c-92a9-7128efea0961
|
state_hub_task_id: 68b6610b-159c-4f1c-92a9-7128efea0961
|
||||||
- id: T27
|
- id: T27
|
||||||
title: Implement routing rules (event.type + filters → activity_ids)
|
title: Implement routing rules (event.type + filters → activity_ids)
|
||||||
status: todo
|
status: done
|
||||||
state_hub_task_id: 9348efea-a7e9-4f92-b866-8fc82cf28fee
|
state_hub_task_id: 9348efea-a7e9-4f92-b866-8fc82cf28fee
|
||||||
- id: T28
|
- id: T28
|
||||||
title: Start/signal workflow from Event Router
|
title: Start/signal workflow from Event Router
|
||||||
status: todo
|
status: done
|
||||||
state_hub_task_id: cac1f45a-7391-471a-9566-97cdbd96eb2d
|
state_hub_task_id: cac1f45a-7391-471a-9566-97cdbd96eb2d
|
||||||
- id: T29
|
- id: T29
|
||||||
title: Integration test — publish event → observe workflow run
|
title: Integration test — publish event → observe workflow run
|
||||||
status: todo
|
status: done
|
||||||
state_hub_task_id: 7f10b5a3-7cad-4914-b603-d57508c85629
|
state_hub_task_id: 7f10b5a3-7cad-4914-b603-d57508c85629
|
||||||
- id: T30
|
- id: T30
|
||||||
title: REST API (FastAPI) — CRUD for ActivityDefinition
|
title: REST API (FastAPI) — CRUD for ActivityDefinition
|
||||||
status: todo
|
status: done
|
||||||
state_hub_task_id: b27e54a1-5dcc-476d-8f4a-c995aea6a8c2
|
state_hub_task_id: b27e54a1-5dcc-476d-8f4a-c995aea6a8c2
|
||||||
- id: T31
|
- id: T31
|
||||||
title: Wire Temporal SDK metrics to Prometheus
|
title: Wire Temporal SDK metrics to Prometheus
|
||||||
status: todo
|
status: done
|
||||||
state_hub_task_id: 0eafb60c-f00e-4fd7-a921-7de75fcfe81e
|
state_hub_task_id: 0eafb60c-f00e-4fd7-a921-7de75fcfe81e
|
||||||
- id: T32
|
- id: T32
|
||||||
title: Tag workflows with activity_id for Temporal visibility search
|
title: Tag workflows with activity_id for Temporal visibility search
|
||||||
status: todo
|
status: done
|
||||||
state_hub_task_id: 7bdfc5c2-1f06-4fce-aac3-fae036dcb47e
|
state_hub_task_id: 7bdfc5c2-1f06-4fce-aac3-fae036dcb47e
|
||||||
- id: T33
|
- id: T33
|
||||||
title: Write operational runbook
|
title: Write operational runbook
|
||||||
status: todo
|
status: done
|
||||||
state_hub_task_id: 766d602d-1b23-4247-a46d-03c0d3b8e498
|
state_hub_task_id: 766d602d-1b23-4247-a46d-03c0d3b8e498
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -61,6 +61,7 @@ tasks:
|
|||||||
**Workstream:** activity-core Triggers & Ops
|
**Workstream:** activity-core Triggers & Ops
|
||||||
**Hub ID:** `3a4f47d9-8bc1-434e-acb4-bed5d4dacda0`
|
**Hub ID:** `3a4f47d9-8bc1-434e-acb4-bed5d4dacda0`
|
||||||
**Depends on:** custodian-WP-0001 (Foundation — Temporal Backbone)
|
**Depends on:** custodian-WP-0001 (Foundation — Temporal Backbone)
|
||||||
|
**Status:** DONE (2026-03-28)
|
||||||
|
|
||||||
## Purpose
|
## Purpose
|
||||||
|
|
||||||
@@ -68,50 +69,62 @@ Add automated triggering (time-based via Temporal Schedules and event-driven via
|
|||||||
a REST admin API, Prometheus metrics, and an operational runbook. Transforms the manually-triggered
|
a REST admin API, Prometheus metrics, and an operational runbook. Transforms the manually-triggered
|
||||||
backbone from WP-0001 into a self-operating service.
|
backbone from WP-0001 into a self-operating service.
|
||||||
|
|
||||||
## Open decisions (resolve before Phase 5)
|
## Decisions resolved
|
||||||
|
|
||||||
- **Event broker choice** (hub: `bc47c9c2-5643-4a88-8114-601738a2f64e`): Kafka vs NATS vs RabbitMQ.
|
- **Event broker choice** (hub: `bc47c9c2-5643-4a88-8114-601738a2f64e`): **NATS + JetStream** chosen.
|
||||||
T26–T29 are blocked until this is resolved.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Phase 4 — Time-Based Triggers (Temporal Schedules)
|
## Phase 4 — Time-Based Triggers (Temporal Schedules) ✓
|
||||||
|
|
||||||
| Task | Priority | Hub task ID |
|
| Task | Priority | Status |
|
||||||
|---|---|---|
|
|---|---|---|
|
||||||
| T22: Write schedule_manager.py | medium | `e50550d1-9904-41d7-afd8-492a1f1e91b8` |
|
| T22: Write schedule_manager.py | medium | done |
|
||||||
| T23: Bootstrap script to sync schedules on startup | medium | `5a1f7fa3-acb9-4f60-9892-c9eaa120272e` |
|
| T23: Bootstrap script to sync schedules on startup | medium | done |
|
||||||
| T24: Handle misfire policy in schedule config | medium | `00231668-95c5-447f-b3d0-1fb8c20b487f` |
|
| T24: Handle misfire policy in schedule config | medium | done |
|
||||||
| T25: Test schedule pause/resume lifecycle | medium | `7abfd375-ea9d-4209-8371-e5664dc2c6c4` |
|
| T25: Test schedule pause/resume lifecycle | medium | done |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Phase 5 — Event-Driven Triggers
|
## Phase 5 — Event-Driven Triggers ✓
|
||||||
|
|
||||||
*Blocked by broker decision (`bc47c9c2-5643-4a88-8114-601738a2f64e`).*
|
| Task | Priority | Status |
|
||||||
|
|
||||||
| Task | Priority | Hub task ID |
|
|
||||||
|---|---|---|
|
|---|---|---|
|
||||||
| T26: Implement Event Router service | medium | `68b6610b-159c-4f1c-92a9-7128efea0961` |
|
| T26: Implement Event Router service | medium | done |
|
||||||
| T27: Implement routing rules (event.type + filters → activity_ids) | medium | `9348efea-a7e9-4f92-b866-8fc82cf28fee` |
|
| T27: Implement routing rules (event.type + filters → activity_ids) | medium | done |
|
||||||
| T28: Start/signal workflow from Event Router | medium | `cac1f45a-7391-471a-9566-97cdbd96eb2d` |
|
| T28: Start/signal workflow from Event Router | medium | done |
|
||||||
| T29: Integration test — publish event → observe workflow run | medium | `7f10b5a3-7cad-4914-b603-d57508c85629` |
|
| T29: Integration test — publish event → observe workflow run | medium | done |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Phase 6 — Observability & Admin
|
## Phase 6 — Observability & Admin ✓
|
||||||
|
|
||||||
| Task | Priority | Hub task ID |
|
| Task | Priority | Status |
|
||||||
|---|---|---|
|
|---|---|---|
|
||||||
| T30: REST API (FastAPI) — CRUD for ActivityDefinition | low | `b27e54a1-5dcc-476d-8f4a-c995aea6a8c2` |
|
| T30: REST API (FastAPI) — CRUD for ActivityDefinition | low | done |
|
||||||
| T31: Wire Temporal SDK metrics to Prometheus | low | `0eafb60c-f00e-4fd7-a921-7de75fcfe81e` |
|
| T31: Wire Temporal SDK metrics to Prometheus | low | done |
|
||||||
| T32: Tag workflows with activity_id for Temporal visibility search | low | `7bdfc5c2-1f06-4fce-aac3-fae036dcb47e` |
|
| T32: Tag workflows with activity_id for Temporal visibility search | low | done |
|
||||||
| T33: Write operational runbook | low | `766d602d-1b23-4247-a46d-03c0d3b8e498` |
|
| T33: Write operational runbook | low | done |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Completion criteria
|
## Files produced
|
||||||
|
|
||||||
Schedules fire `RunActivityWorkflow` automatically on cron cadence. An external event published
|
| File | Purpose |
|
||||||
to the broker reaches the correct ActivityDefinition end-to-end. ActivityDefinitions are
|
|------|---------|
|
||||||
manageable via REST API. Prometheus metrics are scraped. Runbook is written.
|
| `src/activity_core/schedule_manager.py` | T22/T24: upsert/delete/list Temporal Schedules |
|
||||||
|
| `src/activity_core/sync_schedules.py` | T23: bootstrap schedule sync |
|
||||||
|
| `src/activity_core/event_router.py` | T26/T27/T28: NATS JetStream → Temporal |
|
||||||
|
| `src/activity_core/api.py` | T30: FastAPI CRUD + manual trigger |
|
||||||
|
| `tests/test_schedule_lifecycle.py` | T25: schedule lifecycle unit tests |
|
||||||
|
| `tests/test_event_router.py` | T29: event router unit + integration tests |
|
||||||
|
| `docs/runbook.md` | T33: operational runbook |
|
||||||
|
| `docker-compose.dev.yml` | added NATS service |
|
||||||
|
|
||||||
|
## Completion criteria ✓
|
||||||
|
|
||||||
|
- Schedules fire `RunActivityWorkflow` automatically on cron cadence ✓
|
||||||
|
- External event published to NATS reaches the correct ActivityDefinition end-to-end ✓
|
||||||
|
- ActivityDefinitions are manageable via REST API ✓
|
||||||
|
- Prometheus metrics are scraped ✓
|
||||||
|
- Runbook is written ✓
|
||||||
|
|||||||
Reference in New Issue
Block a user