Implement weekly coding retro schedule

2026-06-07 20:58:34 +02:00
parent 992fe94034
commit 14b2d40eb7
6 changed files with 526 additions and 0 deletions
--- a/activity-definitions/weekly-coding-retro.md
+++ b/activity-definitions/weekly-coding-retro.md
@@ -0,0 +1,48 @@
+---
+id: weekly-coding-retro
+name: Weekly Coding Retrospection
+enabled: false   # flip to true once the coding_retro resolver + session-memory publish (AGENTIC-WP-0010) are verified
+owner: custodian-agent
+governance: custodian
+status: proposed
+trigger:
+  type: cron
+  cron_expression: "0 19 * * 6"   # Saturday 19:00
+  timezone: Europe/Berlin
+  misfire_policy: skip
+context_sources:
+  - type: state-hub
+    query: coding_retro
+    params:
+      window_days: 7
+      limit: 100
+    bind_to: context.retro
+# The coding_retro resolver returns the most recent event_type=coding_retro read
+# model published to the hub by helix_forge session-memory (AGENTIC-WP-0010).
+# Its detail.suggestions[] are already ranked (impact x frequency, cross-flavor
+# first) and capped at 3 per repo, so the rule below just routes them.
+---
+
+# Weekly Coding Retrospection
+
+Runs every Saturday at 19:00 Berlin time. Reads the previous week's coding-session
+analysis (published to the hub by helix_forge session-memory) and opens one
+improvement task per positive-score suggestion: at most three per relevant repo,
+already ranked upstream.
+
+```rule
+id: propose-weekly-improvements
+for_each: context.retro.suggestions
+bind_as: s
+condition: 'context.s.score > 0'
+action:
+  task_template: context.s.title
+  description: context.s.recommendation
+  target_repo: context.s.repo
+  priority: context.s.priority
+  labels: ["coding-retro", "improvement", "automated"]
+```
+
+Each suggestion carries `repo`, `title`, `recommendation`, `priority`, and
+`score`. The upstream retro caps the list at three per repo, so this emits at most
+three improvement tasks per relevant repository per week.
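The `propose-weekly-improvements` rule can be read as a filter-and-map over `context.retro.suggestions`. The following is a minimal standalone sketch of that reading, not the engine's actual `expand_rule_actions` implementation. The function name `expand_suggestions` and the plain-dict task shape are assumptions made for illustration only.

```python
from typing import Any

# Labels copied from the rule's `action.labels`.
LABELS = ["coding-retro", "improvement", "automated"]


def expand_suggestions(retro: dict[str, Any]) -> list[dict[str, Any]]:
    """Approximate the rule: emit one task per suggestion whose score is > 0.

    Hypothetical helper; the real rule engine is not shown in this commit.
    """
    tasks: list[dict[str, Any]] = []
    for s in retro.get("suggestions", []):
        # Mirrors `condition: 'context.s.score > 0'`.
        if not s.get("score", 0) > 0:
            continue
        tasks.append(
            {
                "title": s["title"],                 # task_template
                "description": s["recommendation"],  # description
                "target_repo": s["repo"],            # target_repo
                "priority": s["priority"],           # priority
                "labels": list(LABELS),
            }
        )
    return tasks
```

Because the upstream read model is described as capped at three suggestions per repo, a function of this shape would emit at most three tasks per repo without needing its own limit.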
--- a/docs/runbook.md
+++ b/docs/runbook.md
@@ -149,6 +149,26 @@ activity registration issues before the next scheduled run.

 ---

+## Weekly maintenance definitions
+
+`weekly-sbom-staleness` is the canonical rule-only weekly maintenance schedule.
+It runs Mondays at 09:00 Europe/Berlin, resolves State Hub SBOM status for all
+repos, and emits one automated task per stale repo through an explicit
+`for_each: context.repos.repos`.
+
+`weekly-coding-retro` follows the same cron -> context resolver -> per-repo task
+pattern for coding-session retrospection. It runs Saturdays at 19:00
+Europe/Berlin and resolves the latest State Hub `/progress/` item with
+`event_type=coding_retro` into `context.retro.suggestions`. Each positive-score
+suggestion emits one task to `context.s.repo` with labels
+`coding-retro`, `improvement`, and `automated`.
+
+Keep `weekly-coding-retro` disabled until Helix Forge publishes the
+`coding_retro` read model and a smoke run confirms the resolver returns a
+non-empty suggestion set with no duplicate target tasks on re-run.
+
+---
+
 ## Temporal UI — filtering by activity

 With search attributes registered, you can filter in the Temporal Web UI:
--- a/src/activity_core/context_resolvers/state_hub.py
+++ b/src/activity_core/context_resolvers/state_hub.py
@@ -9,6 +9,7 @@ Supported queries:
  - next_steps:       GET {STATE_HUB_URL}/state/next_steps
  - workplan_index:   GET {STATE_HUB_URL}/workstreams/workplan-index
  - hub_inbox:        GET {STATE_HUB_URL}/messages/?to_agent=hub&unread_only=true
+  - coding_retro:     latest /progress/ item with event_type=coding_retro
  - daily_triage_digest: curated scalar JSON digest for daily WSJF triage
  - recently_on_scope_hourly: POST {STATE_HUB_URL}/recently-on-scope/hourly

@@ -94,6 +95,8 @@ class StateHubContextResolver(ContextResolver):
                "unread_only": params.get("unread_only", True),
            }
            return _fetch_json("/messages/", query_params)
+        if query == "coding_retro":
+            return _coding_retro(params)
        if query == "daily_triage_digest":
            return _daily_triage_digest(params)
        if query == "recently_on_scope_hourly":
@@ -206,6 +209,181 @@ def _sbom_age_days(last_sbom_at: Any) -> tuple[int, bool, str | None]:
    return max(0, delta.days), True, last_sbom_at


+def _coding_retro(params: dict[str, Any]) -> dict[str, Any]:
+    """Return the latest weekly coding-retro read model from State Hub progress.
+
+    Helix Forge publishes this as a `progress` item with event_type=coding_retro.
+    The resolver keeps the workflow-facing shape stable even before the first
+    publication exists, so rules can safely iterate over
+    `context.retro.suggestions`.
+    """
+    event_type = str(params.get("event_type") or "coding_retro")
+    limit = _bounded_int(params.get("limit", 100), default=100, minimum=1, maximum=500)
+    items = _fetch_json("/progress/", {"limit": limit})
+    if not isinstance(items, list):
+        return _empty_coding_retro(event_type)
+
+    item = _latest_progress_item(items, event_type)
+    if item is None:
+        return _empty_coding_retro(event_type)
+
+    detail = _progress_detail(item)
+    return {
+        "suggestions": _normalise_coding_retro_suggestions(
+            detail.get("suggestions")
+        ),
+        "window": _coding_retro_window(detail, params),
+        "generated_at": _string_or_none(
+            detail.get("generated_at") or item.get("created_at")
+        ),
+        "source_progress_id": _string_or_none(item.get("id")),
+        "event_type": event_type,
+        "summary": _short_text(item.get("summary", ""), 200),
+    }
+
+
+def _empty_coding_retro(event_type: str) -> dict[str, Any]:
+    return {
+        "suggestions": [],
+        "window": None,
+        "generated_at": None,
+        "source_progress_id": None,
+        "event_type": event_type,
+        "summary": "",
+    }
+
+
+def _latest_progress_item(
+    items: list[Any],
+    event_type: str,
+) -> dict[str, Any] | None:
+    newest: dict[str, Any] | None = None
+    newest_key: tuple[datetime, int] | None = None
+    for index, item in enumerate(items):
+        if not isinstance(item, dict) or item.get("event_type") != event_type:
+            continue
+        key = (_parse_progress_timestamp(item.get("created_at")), index)
+        if newest_key is None or key > newest_key:
+            newest = item
+            newest_key = key
+    return newest
+
+
+def _parse_progress_timestamp(value: Any) -> datetime:
+    if not isinstance(value, str) or not value:
+        return datetime.min.replace(tzinfo=timezone.utc)
+    try:
+        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
+    except ValueError:
+        return datetime.min.replace(tzinfo=timezone.utc)
+    if parsed.tzinfo is None:
+        parsed = parsed.replace(tzinfo=timezone.utc)
+    return parsed.astimezone(timezone.utc)
+
+
+def _progress_detail(item: dict[str, Any]) -> dict[str, Any]:
+    detail = item.get("detail")
+    if detail is None:
+        detail = item.get("details")
+    if isinstance(detail, str):
+        try:
+            detail = json.loads(detail)
+        except ValueError:
+            return {}
+    if isinstance(detail, dict):
+        return detail
+    return {}
+
+
+def _normalise_coding_retro_suggestions(value: Any) -> list[dict[str, Any]]:
+    if not isinstance(value, list):
+        return []
+    suggestions: list[dict[str, Any]] = []
+    for raw in value:
+        suggestion = _normalise_coding_retro_suggestion(raw)
+        if suggestion is not None:
+            suggestions.append(suggestion)
+    return suggestions
+
+
+def _normalise_coding_retro_suggestion(raw: Any) -> dict[str, Any] | None:
+    if not isinstance(raw, dict):
+        return None
+    repo = _clean_scalar(
+        raw.get("repo") or raw.get("target_repo") or raw.get("repo_slug")
+    )
+    title = _clean_scalar(raw.get("title") or raw.get("summary"))
+    if not repo or not title:
+        return None
+    return {
+        "repo": repo,
+        "title": title,
+        "recommendation": _clean_scalar(
+            raw.get("recommendation") or raw.get("description") or raw.get("body")
+        ),
+        "priority": _normalise_coding_retro_priority(raw.get("priority")),
+        "score": _normalise_score(raw.get("score")),
+    }
+
+
+def _coding_retro_window(
+    detail: dict[str, Any],
+    params: dict[str, Any],
+) -> Any:
+    window = detail.get("window")
+    if window is not None:
+        return window
+    derived = {
+        key: detail.get(key)
+        for key in ("window_start", "window_end", "since", "until")
+        if detail.get(key) is not None
+    }
+    if derived:
+        return derived
+    if params.get("window_days") is not None:
+        return {
+            "days": _bounded_int(
+                params.get("window_days"),
+                default=7,
+                minimum=1,
+                maximum=366,
+            )
+        }
+    return None
+
+
+def _normalise_coding_retro_priority(value: Any) -> str:
+    priority = str(value or "medium").strip().lower()
+    if priority in {"high", "medium", "low"}:
+        return priority
+    return "medium"
+
+
+def _normalise_score(value: Any) -> float:
+    try:
+        return float(value)
+    except (TypeError, ValueError, OverflowError):  # float() overflows on huge ints
+        return 0.0
+
+
+def _bounded_int(value: Any, *, default: int, minimum: int, maximum: int) -> int:
+    try:
+        number = int(value)
+    except (TypeError, ValueError, OverflowError):  # int() overflows on float inf
+        number = default
+    return max(minimum, min(maximum, number))
+
+
+def _clean_scalar(value: Any) -> str:
+    return " ".join(str(value or "").split())
+
+
+def _string_or_none(value: Any) -> str | None:
+    if value is None:
+        return None
+    return str(value)
+
+
 def _daily_triage_digest(params: dict[str, Any]) -> str:
    """Return a compact JSON string safe to inject into an instruction prompt.

--- a/tests/test_integration_event_bridge.py
+++ b/tests/test_integration_event_bridge.py
@@ -25,6 +25,7 @@ from activity_core.rules.models import TaskRef, TaskSpec

 _DEFINITIONS_DIR = Path(__file__).parent.parent / "activity-definitions"
 _SBOM_DEF_PATH = _DEFINITIONS_DIR / "weekly-sbom-staleness.md"
+_CODING_RETRO_DEF_PATH = _DEFINITIONS_DIR / "weekly-coding-retro.md"


 # ── Helpers ───────────────────────────────────────────────────────────────────
@@ -95,6 +96,69 @@ def test_sbom_definition_parses_correctly():
    assert defn.rules[0]["id"] == "flag-stale-sbom"


+def test_coding_retro_definition_parses_disabled_until_verified():
+    defn = parse_file(_CODING_RETRO_DEF_PATH)
+
+    assert defn.id == "weekly-coding-retro"
+    assert defn.enabled is False
+    assert defn.trigger_config["trigger_type"] == "cron"
+    assert defn.trigger_config["cron_expression"] == "0 19 * * 6"
+    assert defn.trigger_config["timezone"] == "Europe/Berlin"
+    assert defn.context_sources == [
+        {
+            "type": "state-hub",
+            "query": "coding_retro",
+            "params": {"window_days": 7, "limit": 100},
+            "bind_to": "context.retro",
+        }
+    ]
+    assert len(defn.rules) == 1
+    assert defn.rules[0]["id"] == "propose-weekly-improvements"
+
+
+def test_coding_retro_rule_emits_one_task_per_positive_suggestion():
+    defn = parse_file(_CODING_RETRO_DEF_PATH)
+    rule = defn.rules[0]
+    context = {
+        "retro": {
+            "suggestions": [
+                {
+                    "repo": "activity-core",
+                    "title": "Harden coding retro smoke gates",
+                    "recommendation": "Dry-run with fixture and live hub evidence.",
+                    "priority": "high",
+                    "score": 8.5,
+                },
+                {
+                    "repo": "quiet-repo",
+                    "title": "Do not emit zero-score suggestion",
+                    "recommendation": "This should stay quiet.",
+                    "priority": "low",
+                    "score": 0,
+                },
+            ]
+        }
+    }
+
+    specs = expand_rule_actions([rule], _EmptyEvent(), context)
+
+    assert specs == [
+        {
+            "title": "Harden coding retro smoke gates",
+            "description": "Dry-run with fixture and live hub evidence.",
+            "target_repo": "activity-core",
+            "priority": "high",
+            "labels": ["coding-retro", "improvement", "automated"],
+            "due_in_days": None,
+            "source_type": "rule",
+            "source_id": "propose-weekly-improvements",
+            "triggering_event_id": "",
+            "activity_definition_id": "",
+            "condition": "context.s.score > 0",
+        }
+    ]
+
+
 def test_pipeline_emits_one_task_for_stale_repo_only():
    """Stale repo (45 days) matches; fresh repo (10 days) does not."""
    defn = parse_file(_SBOM_DEF_PATH)
--- a/tests/test_state_hub_context_resolver.py
+++ b/tests/test_state_hub_context_resolver.py
@@ -157,6 +157,128 @@ def test_repo_sbom_status_returns_empty_on_failure(monkeypatch) -> None:
    assert resolver.resolve("repo_sbom_status", None, {"repos": "all"}) == {}


+def test_coding_retro_returns_latest_progress_suggestions(monkeypatch) -> None:
+    calls: list[dict[str, Any]] = []
+
+    def fake_get(url: str, **kwargs: Any) -> DummyResponse:
+        calls.append({"url": url, **kwargs})
+        return DummyResponse([
+            {
+                "id": "older-retro",
+                "event_type": "coding_retro",
+                "summary": "older",
+                "created_at": "2026-05-31T17:00:00Z",
+                "detail": {
+                    "generated_at": "2026-05-31T17:00:00Z",
+                    "suggestions": [
+                        {
+                            "repo": "old-repo",
+                            "title": "Old recommendation",
+                            "recommendation": "Do the older thing.",
+                            "priority": "low",
+                            "score": 1,
+                        }
+                    ],
+                },
+            },
+            {
+                "id": "note-1",
+                "event_type": "note",
+                "summary": "ignore me",
+                "created_at": "2026-06-07T17:05:00Z",
+                "detail": {},
+            },
+            {
+                "id": "newer-retro",
+                "event_type": "coding_retro",
+                "summary": "weekly coding retro ready",
+                "created_at": "2026-06-07T17:10:00Z",
+                "detail": {
+                    "generated_at": "2026-06-07T17:09:30Z",
+                    "window": {
+                        "since": "2026-05-31T00:00:00Z",
+                        "until": "2026-06-07T00:00:00Z",
+                    },
+                    "suggestions": [
+                        {
+                            "target_repo": "activity-core",
+                            "title": "Harden schedule smoke gates",
+                            "description": "Add a smoke proof before enablement.",
+                            "priority": "HIGH",
+                            "score": "8.5",
+                        },
+                        {
+                            "repo_slug": "repo-without-title",
+                            "recommendation": "missing title should be skipped",
+                            "score": 9,
+                        },
+                    ],
+                },
+            },
+        ])
+
+    monkeypatch.setenv("STATE_HUB_URL", "http://state-hub.test/")
+    monkeypatch.setattr(httpx, "get", fake_get)
+
+    result = StateHubContextResolver().resolve(
+        "coding_retro",
+        None,
+        {"limit": 20, "window_days": 7},
+    )
+
+    assert calls == [
+        {
+            "url": "http://state-hub.test/progress/",
+            "params": {"limit": 20},
+            "timeout": 10.0,
+        }
+    ]
+    assert result["source_progress_id"] == "newer-retro"
+    assert result["generated_at"] == "2026-06-07T17:09:30Z"
+    assert result["window"] == {
+        "since": "2026-05-31T00:00:00Z",
+        "until": "2026-06-07T00:00:00Z",
+    }
+    assert result["summary"] == "weekly coding retro ready"
+    assert result["suggestions"] == [
+        {
+            "repo": "activity-core",
+            "title": "Harden schedule smoke gates",
+            "recommendation": "Add a smoke proof before enablement.",
+            "priority": "high",
+            "score": 8.5,
+        }
+    ]
+
+
+def test_coding_retro_returns_empty_shape_when_not_published(monkeypatch) -> None:
+    def fake_get(url: str, **kwargs: Any) -> DummyResponse:
+        return DummyResponse([
+            {
+                "id": "note-1",
+                "event_type": "note",
+                "created_at": "2026-06-07T17:10:00Z",
+            }
+        ])
+
+    monkeypatch.setattr(httpx, "get", fake_get)
+
+    result = StateHubContextResolver().resolve(
+        "coding_retro",
+        None,
+        {"event_type": "coding_retro"},
+    )
+
+    assert result == {
+        "suggestions": [],
+        "window": None,
+        "generated_at": None,
+        "source_progress_id": None,
+        "event_type": "coding_retro",
+        "summary": "",
+    }
+
+
 def test_resolver_failure_returns_empty(monkeypatch) -> None:
    def fake_get(url: str, **kwargs: Any) -> DummyResponse:
        raise httpx.ConnectError("offline")
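The `coding_retro` resolver tests depend on how the newest matching progress item is chosen. Below is a minimal standalone sketch of that selection. It restates the logic of `_latest_progress_item` and `_parse_progress_timestamp` from this commit under illustrative names (`parse_ts`, `latest_item`) that are not part of the codebase.

```python
from __future__ import annotations

from datetime import datetime, timezone
from typing import Any

# Sentinel used when a timestamp is missing or unparseable, so such items
# sort before every properly dated item.
_OLDEST = datetime.min.replace(tzinfo=timezone.utc)


def parse_ts(value: Any) -> datetime:
    """Parse an ISO-8601 string into an aware UTC datetime, or return _OLDEST."""
    if not isinstance(value, str) or not value:
        return _OLDEST
    try:
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return _OLDEST
    if parsed.tzinfo is None:
        # Naive timestamps are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def latest_item(items: list[Any], event_type: str) -> dict[str, Any] | None:
    """Return the newest item of `event_type`; later list position breaks ties."""
    candidates = [
        (parse_ts(item.get("created_at")), index, item)
        for index, item in enumerate(items)
        if isinstance(item, dict) and item.get("event_type") == event_type
    ]
    if not candidates:
        return None
    # Compare on (timestamp, index) only, never on the dict itself.
    return max(candidates, key=lambda c: (c[0], c[1]))[2]
```

Keying on `(timestamp, index)` means two items with the same instant (for example `17:10:00Z` and `19:10:00+02:00`) resolve deterministically to the one that appears later in the response, and undated items can only win when no dated item of that type exists.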
--- a/workplans/ACTIVITY-WP-0008-weekly-coding-retro.md
+++ b/workplans/ACTIVITY-WP-0008-weekly-coding-retro.md
@@ -0,0 +1,94 @@
+---
+id: ACTIVITY-WP-0008
+type: workplan
+title: "Weekly Coding Retrospection schedule (Saturday evenings)"
+domain: custodian
+repo: activity-core
+status: blocked
+owner: codex
+topic_slug: custodian
+created: "2026-06-07"
+updated: "2026-06-07"
+state_hub_workstream_id: "7387fc50-1f2c-471a-9d85-bb085cbd0b63"
+---
+
+# Weekly Coding Retrospection schedule (Saturday evenings)
+
+**Origin:** requested from the `helix_forge` domain. Every Saturday 19:00
+Europe/Berlin, read the previous week's coding-session analysis (published to the
+hub by helix_forge session-memory) and open **one improvement suggestion per
+relevant repo — the three most promising**.
+
+This is the same shape as the existing `weekly-sbom-staleness` activity-definition
+(cron → context resolver → per-repo task emission); only the data source is new.
+
+**Dependency:** `AGENTIC-WP-0010` (helix_forge) publishes the
+`event_type=coding_retro` read model this schedule consumes. That side computes
+and ranks (top-3 per repo, cross-flavor first, recommendations from the Pattern
+Catalog); this side schedules and routes.
+
+## `coding_retro` Context Resolver
+
+```task
+id: ACTIVITY-WP-0008-T01
+status: done
+priority: high
+state_hub_task_id: "26846304-f5f1-4edf-aba3-227c9b11c9fa"
+```
+
+Add a context resolver (`context_resolvers/`) returning the latest weekly
+coding-retro published to the hub (`event_type=coding_retro`): its
+`suggestions[]` (repo, title, recommendation, priority, score), window, and
+`generated_at`. Bind under `context.retro`. Mirror the `repo_sbom_status` resolver
+shape so rules can `for_each` over `context.retro.suggestions`.
+
+**2026-06-07:** Implemented `query: coding_retro` in the State Hub context
+resolver. It reads recent `/progress/` items, selects the latest
+`event_type=coding_retro`, normalizes `suggestions[]`, and returns an empty
+suggestion list while the upstream publisher has not produced a read model yet.
+
+## `weekly-coding-retro` Activity-Definition
+
+```task
+id: ACTIVITY-WP-0008-T02
+status: done
+priority: high
+state_hub_task_id: "09eeacb7-dc0d-4617-8398-a99a4e5a227e"
+```
+
+Add `activity-definitions/weekly-coding-retro.md`: cron `0 19 * * 6`
+Europe/Berlin, `context_source` `coding_retro`, and a rule that `for_each` over
+`context.retro.suggestions` emits one improvement task to `target_repo` with the
+suggestion title + recommendation, priority, and labels
+`[coding-retro, improvement, automated]`. Ship `enabled: false` until the resolver
+and publish are verified. A starter draft is provided at
+`activity-definitions/weekly-coding-retro.md` (proposed by helix_forge).
+
+**2026-06-07:** Updated the starter definition against the implemented resolver:
+cron Saturday 19:00 Europe/Berlin, `context_source` `coding_retro` bound to
+`context.retro`, and a rule that emits one task per positive-score suggestion,
+routed to its target repo with the coding-retro/improvement/automated labels.
+It remains `enabled: false` until live publish verification succeeds.
+
+## Dry-Run Verify + Enable + Docs
+
+```task
+id: ACTIVITY-WP-0008-T03
+status: wait
+priority: medium
+state_hub_task_id: "9dcbebe7-13dd-4957-9a72-858418049aef"
+```
+
+Dry-run the definition end-to-end against a published `coding_retro` read model;
+confirm one task per relevant repo (≤ 3) with correct routing and no duplicates on
+re-run. Flip `enabled: true`. Document alongside `weekly-sbom-staleness`. After
+workplan updates, run from `~/state-hub`:
+
+```bash
+make fix-consistency REPO=activity-core
+```
+
+**2026-06-07:** Added fixture-level dry-run coverage and runbook documentation.
+Live State Hub did not yet expose a published `event_type=coding_retro` progress
+item, so the real dry-run, duplicate check, and `enabled: true` flip remain
+blocked on `AGENTIC-WP-0010`.
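The pending dry-run asks for two checks: no duplicate tasks on re-run and no more than three tasks per repo. One way those checks might be scripted is sketched below. Everything here is hypothetical: the helper names, the `(target_repo, normalised title)` duplicate key, and the assumption that the dry-run can list the tasks newly created by each run are not taken from the codebase.

```python
from collections import Counter
from typing import Any

MAX_TASKS_PER_REPO = 3  # cap stated in the workplan and definition


def task_key(task: dict[str, Any]) -> tuple[str, str]:
    """Assumed identity of a task: target repo plus whitespace/case-folded title."""
    return (task["target_repo"], " ".join(task["title"].split()).lower())


def check_dry_run(
    first_run: list[dict[str, Any]],
    second_run: list[dict[str, Any]],
) -> list[str]:
    """Return human-readable problems; an empty list means both checks passed.

    `first_run` and `second_run` are the tasks newly created by each run.
    """
    problems: list[str] = []

    # Duplicate check: the same key must not be created more than once
    # across the initial run and the re-run.
    key_counts = Counter(task_key(t) for t in first_run + second_run)
    for key, count in sorted(key_counts.items()):
        if count > 1:
            problems.append(f"duplicate task {key!r} created {count} times")

    # Cap check: a single run should not exceed the per-repo limit.
    repo_counts = Counter(t["target_repo"] for t in first_run)
    for repo, count in sorted(repo_counts.items()):
        if count > MAX_TASKS_PER_REPO:
            problems.append(f"repo {repo!r} received {count} tasks")

    return problems
```

Under these assumptions, a clean verification would be a first run that creates tasks and a re-run that creates none, so `check_dry_run(first, [])` returns an empty list.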