diff --git a/canon/architecture/adr-001-workplans-as-repo-artefacts.md b/canon/architecture/adr-001-workplans-as-repo-artefacts.md
index b4eb5d3..4e3ee5c 100644
--- a/canon/architecture/adr-001-workplans-as-repo-artefacts.md
+++ b/canon/architecture/adr-001-workplans-as-repo-artefacts.md
@@ -174,6 +174,77 @@ and others as "file-authoritative."
 Rejected: introduces ambiguity about which records matter; violates the
 "single source of truth" principle.
 
+## Workstream Closure Protocol
+
+When a workstream is about to be marked `completed`, the responsible agent
+MUST perform a closure review before writing the status change. This prevents
+the stale-task accumulation that this ADR was designed to make detectable.
+
+### Steps
+
+1. **Query all non-done tasks** in the workstream via
+   `GET /tasks/?workstream_id=` (filter for `todo`, `in_progress`,
+   `blocked`).
+
+2. **Classify each task** into one of three outcomes:
+
+   | Outcome | Action |
+   |---------|--------|
+   | **Done** — work was completed, DB record just wasn't updated | `PATCH /tasks/{id}/ {"status": "done"}` |
+   | **Cancelled** — dropped, superseded, or out of scope | `PATCH /tasks/{id}/ {"status": "cancelled", "blocking_reason": ""}` |
+   | **Carry-forward** — genuinely unfinished, belongs in the next run | Leave open; note in closure review; trigger new workplan |
+
+3. **Append a `## Closure Review` section** to the workplan file:
+
+   ```markdown
+   ## Closure Review — YYYY-MM-DD
+
+   **Outcome:** All tasks completed / N tasks carried forward / N tasks dropped.
+
+   ### Completed (DB updated)
+   - TASK-ID — title
+
+   ### Cancelled (dropped)
+   | Task | Reason |
+   |------|--------|
+   | TASK-ID — title | Superseded by X |
+
+   ### Carried forward
+   | Task | Target workplan |
+   |------|----------------|
+   | TASK-ID — title | CUST-WP-XXXX |
+   ```
+
+4. **If any tasks are carried forward**: do not mark the workstream
+   `completed` yet. Create the new workplan file (or amend an existing active
+   one), then close the current workstream.
+
+5. **Update the workplan frontmatter** `status: completed` and `updated:` date.
+
+6. **Mark the workstream `completed`** in the state hub via MCP or API.
+
+### Daily Stale-Task Cleanup
+
+As a safety net for cases where the closure review was skipped or incomplete,
+a cleanup script cancels any surviving open tasks in completed/archived
+workstreams:
+
+```bash
+cd ~/the-custodian/state-hub
+make cleanup-stale  # run immediately
+# or add to cron:
+# 0 3 * * * cd ~/the-custodian/state-hub && make cleanup-stale
+```
+
+The script (`scripts/cleanup_stale_tasks.py`) emits a `cleanup` progress event
+recording which tasks were cancelled and in which workstreams. Tasks cancelled
+by the cleanup carry a `blocking_reason` noting they should be verified against
+the workplan file.
+
+The closure review is the primary mechanism; the cleanup is the fallback. If
+the cleanup regularly cancels tasks, it signals that closure reviews are being
+skipped — that is the process failure to address, not just the stale tasks.
+
 ## Related
 
 - Custodian Constitution v0.1 §2 (Powers) — canon changes require review gate
diff --git a/state-hub/Makefile b/state-hub/Makefile
index a378dc4..dd2bdd5 100644
--- a/state-hub/Makefile
+++ b/state-hub/Makefile
@@ -1,4 +1,4 @@
-.PHONY: install install-cli db db-tools migrate seed api dashboard check start clean register-project validate-adr add-domain rename-domain add-repo list-repos
+.PHONY: install install-cli db db-tools migrate seed api dashboard check start clean register-project validate-adr add-domain rename-domain add-repo list-repos cleanup-stale
 
 COMPOSE = docker compose -f infra/docker-compose.yml --env-file .env
 
@@ -89,5 +89,11 @@ validate-adr:
 	@test -n "$(REPO)" || (echo "ERROR: REPO is required. Usage: make validate-adr REPO= [DOMAIN=]"; exit 1)
 	uv run python scripts/validate_repo_adr.py "$(REPO)" $(if $(DOMAIN),--domain "$(DOMAIN)",)
 
+## Cancel open tasks belonging to completed/archived workstreams.
+## Safe to run at any time; also suitable for a daily cron job.
+## Cron example: 0 3 * * * cd ~/the-custodian/state-hub && make cleanup-stale
+cleanup-stale:
+	uv run python scripts/cleanup_stale_tasks.py
+
 clean:
 	$(COMPOSE) down -v
diff --git a/state-hub/scripts/cleanup_stale_tasks.py b/state-hub/scripts/cleanup_stale_tasks.py
new file mode 100644
index 0000000..6bca210
--- /dev/null
+++ b/state-hub/scripts/cleanup_stale_tasks.py
@@ -0,0 +1,137 @@
+#!/usr/bin/env python3
+"""
+cleanup_stale_tasks.py — cancel tasks that are still open in completed/archived workstreams.
+
+Run manually: python3 scripts/cleanup_stale_tasks.py
+Run via make: make cleanup-stale
+Cron example: 0 3 * * * cd ~/the-custodian/state-hub && .venv/bin/python scripts/cleanup_stale_tasks.py
+
+Exit codes:
+  0 — ran successfully (zero or more tasks cancelled)
+  1 — API unreachable or unexpected error
+"""
+
+import json
+import sys
+import urllib.error
+import urllib.request
+from datetime import datetime, timezone
+
+API = "http://127.0.0.1:8000"
+STALE_STATUSES = {"todo", "in_progress", "blocked"}
+CLOSED_WS_STATUS = {"completed", "archived"}
+
+
+def get(path: str) -> list | dict:
+    with urllib.request.urlopen(f"{API}{path}") as r:
+        return json.loads(r.read())
+
+
+def _request(method: str, url: str, payload: dict) -> dict:
+    """Send a JSON request, following 307/308 redirects with the same method."""
+    data = json.dumps(payload).encode()
+    for _ in range(5):  # max redirects
+        req = urllib.request.Request(
+            url,
+            data=data,
+            headers={"Content-Type": "application/json"},
+            method=method,
+        )
+        try:
+            with urllib.request.urlopen(req) as r:
+                return json.loads(r.read())
+        except urllib.error.HTTPError as e:
+            if e.code in (307, 308):
+                url = e.headers.get("Location", url)
+                if not url.startswith("http"):
+                    url = API + url
+            else:
+                raise
+    raise RuntimeError(f"Too many redirects for {url}")
+
+
+def patch(path: str, payload: dict) -> dict:
+    return _request("PATCH", f"{API}{path}", payload)
+
+
+def post(path: str, payload: dict) -> dict:
+    return _request("POST", f"{API}{path}", payload)
+
+
+def main() -> int:
+    print(f"[cleanup-stale] {datetime.now(timezone.utc).isoformat(timespec='seconds')} — scanning…")
+
+    try:
+        tasks = get("/tasks/?limit=500")
+        workstreams = get("/workstreams/")
+    except urllib.error.URLError as e:
+        print(f"[cleanup-stale] ERROR: API unreachable — {e}", file=sys.stderr)
+        print("[cleanup-stale] Start the API with: cd ~/the-custodian/state-hub && make api", file=sys.stderr)
+        return 1
+
+    closed_ws = {w["id"]: w for w in workstreams if w["status"] in CLOSED_WS_STATUS}
+
+    stale = [
+        t for t in tasks
+        if t["status"] in STALE_STATUSES
+        and t["workstream_id"] in closed_ws
+    ]
+
+    if not stale:
+        print("[cleanup-stale] Nothing to cancel — all open tasks belong to active workstreams.")
+        return 0
+
+    print(f"[cleanup-stale] Found {len(stale)} stale task(s) in completed/archived workstreams:")
+
+    cancelled = []
+    errors = []
+
+    for t in stale:
+        ws = closed_ws[t["workstream_id"]]
+        reason = (
+            f"Workstream '{ws['title']}' is {ws['status']}. "
+            f"Task was still '{t['status']}' at cleanup time. "
+            f"See workplan closure review for actual outcome."
+        )
+        try:
+            patch(
+                f"/tasks/{t['id']}/",
+                {"status": "cancelled", "blocking_reason": reason},
+            )
+            cancelled.append(t)
+            print(f"  cancelled [{t['priority']:8}] {t['title'][:70]}")
+        except Exception as e:
+            errors.append((t, str(e)))
+            print(f"  ERROR {t['title'][:60]} — {e}", file=sys.stderr)
+
+    # Emit a single progress event summarising the run
+    if cancelled:
+        by_ws: dict[str, list] = {}
+        for t in cancelled:
+            by_ws.setdefault(closed_ws[t["workstream_id"]]["title"], []).append(t["title"])
+
+        summary = (
+            f"Stale-task cleanup: cancelled {len(cancelled)} task(s) "
+            f"across {len(by_ws)} completed/archived workstream(s)"
+        )
+        detail = {
+            "cancelled_count": len(cancelled),
+            "by_workstream": by_ws,
+            "error_count": len(errors),
+        }
+        try:
+            post("/progress/", {"summary": summary, "event_type": "cleanup", "detail": detail})
+            print("[cleanup-stale] Progress event recorded.")
+        except Exception as e:
+            print(f"[cleanup-stale] WARNING: could not record progress event — {e}", file=sys.stderr)
+
+    if errors:
+        print(f"[cleanup-stale] Completed with {len(errors)} error(s).")
+        return 1
+
+    print(f"[cleanup-stale] Done. {len(cancelled)} task(s) cancelled.")
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/workplans/CUST-WP-0002-contribution-tracking-sbom.md b/workplans/CUST-WP-0002-contribution-tracking-sbom.md
index 314b357..946dfde 100644
--- a/workplans/CUST-WP-0002-contribution-tracking-sbom.md
+++ b/workplans/CUST-WP-0002-contribution-tracking-sbom.md
@@ -292,3 +292,33 @@ SBOM-specific additions to the workflow:
 SBOM MCP tools, and `make ingest-sbom` target.
 
 Prerequisite: v0.5 P2.3 (updated registration workflow) must be complete.
+
+---
+
+## Closure Review — 2026-03-02
+
+**Outcome:** All 15 tasks completed. No carry-forwards. No dropped tasks.
+
+**Context:** This workplan was created DB-first on 2026-02-28, before ADR-001 was formalised. The workplan file correctly recorded all tasks as `status: done`, but the DB rows were never synced from the file — they remained in their initial `todo` state in the database. The daily stale-task cleanup script (`scripts/cleanup_stale_tasks.py`) detected these 15 stale DB rows and cancelled them on 2026-03-02. No actual work was lost: all deliverables in Phase 1–4 were shipped as part of State Hub v0.3.
+
+### Completed (file status = done; DB rows not updated at delivery time)
+
+- CUST-WP-0002-T01 — Write canon/standards/contribution-convention_v0.1.md
+- CUST-WP-0002-T02 — Create contrib/ templates for BR, FR, EP, UPR
+- CUST-WP-0002-T03 — Store Observable Framework TOC sidebar UPR as first artifact
+- CUST-WP-0002-T04 — Align Railiance EP convention docs with canon master spec
+- CUST-WP-0002-T05 — Design and implement contributions DB table + migration
+- CUST-WP-0002-T06 — sbom_entries table + migration
+- CUST-WP-0002-T07 — Implement /contributions/ router (CRUD + status patch)
+- CUST-WP-0002-T08 — Implement /sbom/ router
+- CUST-WP-0002-T09 — Write make ingest-sbom tooling
+- CUST-WP-0002-T10 — New MCP tools: register_contribution, update_contribution_status, get_contributions
+- CUST-WP-0002-T11 — New MCP tools: ingest_sbom, get_licence_report
+- CUST-WP-0002-T12 — Dashboard: contributions.md page
+- CUST-WP-0002-T13 — Dashboard: sbom.md page
+- CUST-WP-0002-T14 — Update index.md overview to surface contribution and SBOM health
+- CUST-WP-0002-T15 — Add sbom_source prompt + ingest-sbom step to registration workflow
+
+### Cancelled (DB records only — legacy stale rows, not real cancellations)
+
+All 15 DB task rows were cancelled by the cleanup script. The workplan file was authoritative; the DB rows were artefacts of the pre-ADR-001 DB-first creation pattern. This does not reflect a change in work outcome.
diff --git a/workplans/CUST-WP-0005-dynamic-domains.md b/workplans/CUST-WP-0005-dynamic-domains.md
index 0a02a9b..5fac9ea 100644
--- a/workplans/CUST-WP-0005-dynamic-domains.md
+++ b/workplans/CUST-WP-0005-dynamic-domains.md
@@ -304,3 +304,29 @@ Check if `WorkstreamRead` already exposes `domain_slug`; add if missing.
 The rename endpoint cascades updates to these columns.
 - `sync_workplans.py` (future, v0.3 Phase 4) will be able to ingest this file and
   reconcile it with the DB rows created during the planning session.
+
+---
+
+## Closure Review — 2026-03-02
+
+**Outcome:** All 11 tasks completed. No carry-forwards. No dropped tasks.
+
+**Context:** This workplan was created DB-first on 2026-02-28, before ADR-001 was formalised. The workplan file correctly recorded all tasks as `status: done`, but the DB rows were never synced from the file — they remained in their initial `todo` state in the database. The daily stale-task cleanup script (`scripts/cleanup_stale_tasks.py`) detected these 11 stale DB rows and cancelled them on 2026-03-02. No actual work was lost: all deliverables in Phase 1–4 were shipped as part of State Hub v0.5.
+
+### Completed (file status = done; DB rows not updated at delivery time)
+
+- CUST-WP-0005-T01 — Create `domains` table + Alembic migration
+- CUST-WP-0005-T02 — Domain ORM model + Pydantic schemas
+- CUST-WP-0005-T03 — Domain API router: list, get, create, rename, archive
+- CUST-WP-0005-T04 — Update seed.py + TopicCreate schema for new domain model
+- CUST-WP-0005-T05 — Create `managed_repos` table + migration
+- CUST-WP-0005-T06 — Repo API router: register, list, update, archive
+- CUST-WP-0005-T07 — Update registration workflow: multi-repo + dynamic domains
+- CUST-WP-0005-T08 — MCP tools: domain lifecycle + repo registration
+- CUST-WP-0005-T09 — Live domain validation for EP/TD + domain stats in state summary
+- CUST-WP-0005-T10 — Dashboard: domains.md page
+- CUST-WP-0005-T11 — Dashboard: domain filter on workstreams, EP, TD pages
+
+### Cancelled (DB records only — legacy stale rows, not real cancellations)
+
+All 11 DB task rows were cancelled by the cleanup script. The workplan file was authoritative; the DB rows were artefacts of the pre-ADR-001 DB-first creation pattern. This does not reflect a change in work outcome.
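
Review note (not part of this change): both closure reviews above describe tasks that the workplan file recorded as `done` while the DB rows were still `todo`, so the cleanup cancelled work that had in fact shipped. The sketch below shows one way the three outcomes from the ADR-001 closure table could be chosen per task before anything is cancelled. It is a sketch under stated assumptions, not the script's behaviour: the function name `closure_action`, the `"none"` return value, and the idea that a workplan file may record `cancelled` are hypothetical. Only the open statuses and the `done` file status come from the diff.

```python
from typing import Optional

# Same open statuses as STALE_STATUSES in cleanup_stale_tasks.py.
OPEN_STATUSES = {"todo", "in_progress", "blocked"}


def closure_action(db_status: str, file_status: Optional[str]) -> str:
    """Pick a closure outcome for one task (hypothetical helper).

    db_status   -- status of the task row in the state hub
    file_status -- status recorded in the workplan file, or None if the
                   task does not appear in the file (assumption)
    """
    if db_status not in OPEN_STATUSES:
        return "none"            # DB row is already closed; nothing to do
    if file_status == "done":
        return "done"            # work finished, DB record just wasn't updated
    if file_status == "cancelled":
        return "cancelled"       # assumed: the file can record a dropped task
    return "carry-forward"       # still open everywhere; leave for review


if __name__ == "__main__":
    # Illustrative rows only; the task IDs are placeholders.
    for task_id, db, in_file in [
        ("TASK-A", "todo", "done"),
        ("TASK-B", "blocked", None),
    ]:
        print(task_id, closure_action(db, in_file))
```

With a check like this, the 15 and 11 rows described above would have been classified as `done` rather than `cancelled`, which matches what the workplan files recorded.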