diff --git a/workplans/IHUB-WP-0004-ihf-phase4-outcome-observation-and-antifragility.md b/workplans/IHUB-WP-0004-ihf-phase4-outcome-observation-and-antifragility.md
new file mode 100644
index 0000000..44465fb
--- /dev/null
+++ b/workplans/IHUB-WP-0004-ihf-phase4-outcome-observation-and-antifragility.md
@@ -0,0 +1,387 @@
+---
+id: IHUB-WP-0004
+type: workplan
+title: "IHF Phase 4 — Outcome Observation and Antifragility Loop"
+domain: inter_hub
+repo: inter-hub
+status: active
+owner: custodian
+topic_slug: inter_hub
+created: "2026-03-29"
+updated: "2026-03-29"
+state_hub_workstream_id: "07e9c860-e39b-407f-9c0b-d44989498b48"
+---
+
+# IHF Phase 4 — Outcome Observation and Antifragility Loop
+
+## Goal
+
+Close the improvement loop by observing whether implemented changes actually
+helped. Phase 3 established the governance layer — requirements, decisions,
+policy constraints, and implementation references. Phase 4 connects those
+implementation references to deployed versions, captures outcome signals per
+widget, compares behaviour before and after a change, and detects regressions
+and recurring friction.
+
+## Background
+
+Phase 1 (IHUB-WP-0001) delivered the Minimal Interaction Core. Phase 2
+(IHUB-WP-0002) delivered Structured Feedback and Triage. Phase 3
+(IHUB-WP-0003) delivered Governance and Decision Linkage. All Phase 3 exit
+criteria are met.
+
+Phase 4 is the fourth of eight phases in the IHF specification
+(`specs/InteractionHubFrameworkSpecification_v0.1.md`, §14 Phase 4). It
+completes the antifragility loop:
+
+```
+Widget → InteractionEvent / Annotation
+       → RequirementCandidate → Requirement
+       → DecisionRecord → ImplementationChangeReference
+       → DeploymentRecord
+       → OutcomeSignal ← ChangeEvaluation
+       → RegressionDetection / RecurrenceTracking
+```
+
+**Technology stack:** IHP v1.5 (Haskell, Nix), PostgreSQL, AutoRefresh
+(antifragility dashboard). Outcome signals and deployment records are
+append-only / immutable by the same conventions as InteractionEvent and
+TriageState.
+
+Reference: `docs/ihp-overview.md`, `docs/ihp-data-and-queries.md`,
+`docs/ihp-controllers-views-forms.md`, `docs/ihp-realtime.md`.
+
+## Phase 4 Exit Criteria (from IHF spec §14 Phase 4)
+
+- The platform can determine whether a change improved outcomes
+- Recurrent friction becomes visible
+- The system supports evidence-based UI evolution
+
+## Data Artifacts Introduced (Phase 4)
+
+`DeploymentRecord`, `OutcomeSignal`, `ChangeEvaluation`
+
+---
+
+## Tasks
+
+### T01 — Schema: DeploymentRecord, OutcomeSignal, ChangeEvaluation
+
+```task
+id: IHUB-WP-0004-T01
+status: todo
+priority: high
+state_hub_task_id: "4d0aa6d5-f291-4053-a487-8c64627f8271"
+```
+
+Add Phase 4 tables to `Application/Schema.sql` and write migration:
+
+```sql
+CREATE TABLE deployment_records (
+    id UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
+    impl_ref_id UUID REFERENCES implementation_change_references(id) ON DELETE SET NULL,
+    decision_id UUID NOT NULL REFERENCES decision_records(id) ON DELETE RESTRICT,
+    version_ref TEXT NOT NULL,
+    deployed_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL,
+    deployed_by UUID REFERENCES users(id),
+    notes TEXT,
+    created_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL
+);
+
+CREATE INDEX deployment_records_decision_id_idx ON deployment_records (decision_id);
+CREATE INDEX deployment_records_deployed_at_idx ON deployment_records (deployed_at DESC);
+
+-- Outcome signals — append-only, no update/delete
+CREATE TABLE outcome_signals (
+    id UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
+    widget_id UUID NOT NULL REFERENCES widgets(id) ON DELETE CASCADE,
+    deployment_id UUID NOT NULL REFERENCES deployment_records(id) ON DELETE CASCADE,
+    signal_type TEXT NOT NULL,
+    value NUMERIC,
+    observed_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL
+);
+
+CREATE INDEX outcome_signals_widget_id_idx ON outcome_signals (widget_id);
+CREATE INDEX outcome_signals_deployment_id_idx ON outcome_signals (deployment_id);
+CREATE INDEX outcome_signals_observed_at_idx ON outcome_signals (observed_at DESC);
+
+-- Enforce append-only on outcome_signals
+CREATE OR REPLACE FUNCTION prevent_outcome_signal_mutation()
+RETURNS TRIGGER AS $$
+BEGIN
+    RAISE EXCEPTION 'outcome_signals is append-only: UPDATE and DELETE are not permitted';
+END;
+$$ LANGUAGE plpgsql;
+
+CREATE TRIGGER outcome_signals_no_update
+    BEFORE UPDATE ON outcome_signals
+    FOR EACH ROW EXECUTE FUNCTION prevent_outcome_signal_mutation();
+
+CREATE TRIGGER outcome_signals_no_delete
+    BEFORE DELETE ON outcome_signals
+    FOR EACH ROW EXECUTE FUNCTION prevent_outcome_signal_mutation();
+
+-- Change evaluations — one per deployment
+CREATE TABLE change_evaluations (
+    id UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
+    deployment_id UUID NOT NULL REFERENCES deployment_records(id) ON DELETE CASCADE,
+    decision_id UUID REFERENCES decision_records(id) ON DELETE SET NULL,
+    score SMALLINT NOT NULL CHECK (score BETWEEN 1 AND 5),
+    rationale TEXT NOT NULL,
+    evaluated_by UUID REFERENCES users(id),
+    evaluated_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL,
+    UNIQUE (deployment_id)
+);
+
+CREATE INDEX change_evaluations_deployment_id_idx ON change_evaluations (deployment_id);
+```
+
+- Valid `outcome_signals.signal_type` values: `improved`, `regressed`, `neutral`, `inconclusive`
+- `deployment_records` is immutable (no update/delete — append-only by convention; controller enforces)
+- `change_evaluations` has a UNIQUE constraint on `deployment_id` — one evaluation per deployment
+- Verify Haskell types are generated correctly
+
+**Exit criteria:** `migrate` runs cleanly; all Phase 4 types available in GHCi.
+
+---
+
+### T02 — DeploymentRecord controller and views
+
+```task
+id: IHUB-WP-0004-T02
+status: todo
+priority: high
+state_hub_task_id: "4932b036-fe91-4146-9b35-7d3031894c2d"
+```
+
+1. Scaffold `DeploymentRecordsController`
+2. Actions: index, show, new, create (no update/delete — immutable)
+3. Fields: `decisionId` (select — required), `implRefId` (optional select from linked
+   impl refs), `versionRef` (free text, e.g. `v1.2.3`, `git:abc1234`), `deployedAt`,
+   `notes`
+4. Index: table with decision title, version_ref, deployed_at, outcome signal count,
+   evaluation score (if present)
+5. Show: full detail + linked decision chain (→ requirement → candidate → widget)
+   + list of outcome signals + change evaluation panel
+6. Add "New Deployment" button on decision show page (only for decisions with at least
+   one impl ref)
+
+**Exit criteria:** Deployment records can be created and viewed; linked back to the
+full decision → requirement → widget chain.
+
+---
+
+### T03 — OutcomeSignal capture
+
+```task
+id: IHUB-WP-0004-T03
+status: todo
+priority: high
+state_hub_task_id: "8b39bbb3-4129-4acc-97ac-38ecfcfd7c88"
+```
+
+1. `RecordOutcomeSignalAction { deploymentId }` (POST from deployment show page or
+   widget show page)
+2. Fields: `signalType` (select: improved/regressed/neutral/inconclusive),
+   `value` (optional 0–100), `observedAt` (default now)
+3. Append-only — no edit/delete in UI (DB trigger enforces)
+4. List signals on deployment show page ordered by `observedAt` DESC
+5. List signals on widget show page (last 10, across all deployments)
+6. Signal type color roles:
+   - `improved` → green
+   - `regressed` → red
+   - `neutral` → gray
+   - `inconclusive` → yellow/amber
+
+**Exit criteria:** Outcome signals can be recorded from deployment and widget pages;
+append-only constraint verified; color roles applied consistently.
+
+---
+
+### T04 — Pre/post comparison: interaction behaviour before and after deployment
+
+```task
+id: IHUB-WP-0004-T04
+status: todo
+priority: high
+state_hub_task_id: "27c4de52-755a-40e7-bef3-986fe4470f7c"
+```
+
+1. `ComparisonAction { deploymentId }` — rendered on deployment show page as a panel
+2. Time windows: 30 days before `deployed_at` vs 30 days after (or until present)
+3. Metrics computed via SQL aggregates (no in-memory processing for large datasets):
+   - Interaction event count by type
+   - Total annotation count
+   - Annotation severity distribution (low/medium/high/critical counts)
+   - High/critical annotation rate (high+critical / total)
+4. Render side-by-side comparison table: Before | After | Delta
+5. Delta column: green if annotation rate decreased, red if increased, gray if flat
+
+**Exit criteria:** Comparison panel renders correct before/after counts; delta
+direction color-coded correctly; works with no post-deployment data (shows "—").
+
+---
+
+### T05 — Regression detection
+
+```task
+id: IHUB-WP-0004-T05
+status: todo
+priority: high
+state_hub_task_id: "844a828b-b7d7-4822-becc-b377c08c673a"
+```
+
+1. A **regression** is defined as: a widget that has an `OutcomeSignal(improved)`
+   for a deployment, followed by a new `Annotation(severity IN ['high','critical'])`
+   created more than 1 day after the signal's `observed_at` (grace period)
+2. `RegressionQuery` (pure SQL query, no controller action — used by dashboard and
+   widget show page):
+   ```sql
+   -- widgets with improved signal then subsequent high/critical annotation
+   ```
+3. Surface on widget show page: regression warning badge if widget is in regression
+4. Surface on governance dashboard: regression count in KPI row, list of regressed
+   widgets
+5. Surface on antifragility dashboard (T08): prominent regression alert panel
+
+**Exit criteria:** Regression query returns correct results; badge visible on
+affected widgets; count accurate on dashboard.
+
+---
+
+### T06 — ChangeEvaluation: score changes by observed effect
+
+```task
+id: IHUB-WP-0004-T06
+status: todo
+priority: medium
+state_hub_task_id: "391c6136-baea-417a-9291-5ba9f633e03f"
+```
+
+1. `EvaluateChangeAction { deploymentId }` (POST from deployment show page)
+2. Idempotent: if `change_evaluations.deployment_id` already exists, redirect to
+   deployment show page with "Already evaluated" message
+3. Fields: `score` (1–5, required), `rationale` (textarea, required)
+4. Show evaluation on deployment show page: score as ★ stars + rationale
+5. Show evaluation summary on decision show page: "Deployment evaluated: ★★★★☆"
+6. Score 1–2 → red, 3 → yellow, 4–5 → green (in all views)
+
+**Exit criteria:** One evaluation per deployment; idempotent; score displayed with
+correct color in all views.
+
+---
+
+### T07 — Recurrence tracking: detect repeated unresolved friction
+
+```task
+id: IHUB-WP-0004-T07
+status: todo
+priority: medium
+state_hub_task_id: "6a5eea23-4e73-441a-ab93-49f5b76bf3ea"
+```
+
+1. A **recurrence** is: a widget that has had 2 or more `RequirementCandidate`s
+   created in separate decision cycles (a new cycle begins after a prior candidate
+   for the same widget was `accepted` and a `DeploymentRecord` exists for that
+   decision)
+2. `RecurrenceQuery` (SQL): per widget, count completed cycles and flag widgets
+   with cycle_count ≥ 2
+3. Widget show page: recurrence count badge ("⟳ 3 cycles") if cycle_count ≥ 2
+4. Antifragility dashboard: recurrence leaderboard — top 10 widgets by cycle count,
+   sortable
+5. Recurrence is informational only — no automated blocking
+
+**Exit criteria:** Recurrence count accurate per widget; leaderboard renders; badge
+visible on widget show page.
+
+---
+
+### T08 — Antifragility dashboard (AutoRefresh)
+
+```task
+id: IHUB-WP-0004-T08
+status: todo
+priority: high
+state_hub_task_id: "e5c65c77-c757-49a6-a8e9-99c8d3503f59"
+```
+
+1. Add `AntifragilityDashboardAction { hubId }` to `HubsController` wrapped with
+   `autoRefresh do`
+2. Dashboard panels:
+   - **KPI row**: total deployments / avg evaluation score / % improved signals /
+     regression count
+   - **Open gaps**: decisions with impl refs but no deployment record yet
+   - **Recent deployments** (last 20): version_ref, decision title, signal summary,
+     evaluation score
+   - **Regression alerts**: widgets currently in regression state
+   - **Recurrence leaderboard**: top 10 widgets by cycle count
+3. Link from hub Show page alongside Triage Dashboard and Governance Dashboard
+
+**Exit criteria:** Dashboard live-updates on deployment/signal/evaluation changes.
+All five panels render with correct data.
+
+---
+
+### T09 — Phase 4 gate: tests, consistency, docs
+
+```task
+id: IHUB-WP-0004-T09
+status: todo
+priority: high
+state_hub_task_id: "1dda0a32-4913-4007-a9f4-1d86761a8cf1"
+```
+
+1. **Integration tests** (`Test/`):
+   - DeploymentRecord create + link to decision
+   - OutcomeSignal append-only (DB trigger fires on update/delete)
+   - Pre/post comparison: correct counts with known fixture data
+   - Regression detection: widget with improved signal + subsequent high annotation
+   - ChangeEvaluation create + idempotent (second create → duplicate rejection)
+   - Recurrence count: widget with 2 completed cycles
+   - Antifragility dashboard data fetch: compiles and returns correct counts
+2. **Consistency sync** via State Hub MCP:
+   `check_repo_consistency(repo_slug="inter-hub", fix=True)`
+3. **Documentation updates:**
+   - Update `SCOPE.md` current state section: Phase 4 complete
+   - Write `docs/phase4-summary.md`: what was built, known limitations, Phase 5
+     readiness
+4. **Smoke test checklist:**
+   - Create deployment record linked to a decision
+   - Record outcome signals (improved, then regressed)
+   - Observe pre/post comparison panel
+   - Evaluate the change (score 4)
+   - Confirm regression badge appears on widget show page
+   - Confirm antifragility dashboard shows all panels
+
+**Exit criteria:** All tests pass; consistency sync reports no errors; smoke test
+completed; SCOPE.md updated.
+
+---
+
+## Phase 4 Dependencies
+
+- Phase 3 schema stable (T01 depends on `decision_records`,
+  `implementation_change_references`, `widgets` from Phase 3)
+- `deployment_records` before `outcome_signals` and `change_evaluations` (FK)
+- Schema (T01) before all controller work (T02–T08)
+- `DeploymentRecord` (T02) before `OutcomeSignal` (T03), comparison (T04),
+  regression (T05), `ChangeEvaluation` (T06)
+- All feature tasks (T01–T08) before gate (T09)
+
+## Notes
+
+- **DeploymentRecord is immutable.** No update/delete — same convention as
+  `DecisionRecord`. A mis-recorded deployment should be noted in a new
+  `DeploymentRecord` with a correcting note.
+- **OutcomeSignal is append-only.** DB trigger enforces — same pattern as
+  `InteractionEvent`. Observations are evidence; they cannot be revised.
+- **ChangeEvaluation is one-per-deployment** (UNIQUE constraint). A wrong
+  evaluation cannot be changed — create a new deployment record if you need to
+  re-evaluate a re-deployment.
+- **Regression detection is a heuristic, not a hard constraint.** It is a signal
+  for operator attention, not an automated gate.
+- **Recurrence tracking uses completed cycles only.** A cycle is only counted
+  when: prior candidate accepted → deployment record exists for that decision →
+  new annotation created. Partial cycles (no deployment yet) do not count.
+- **No ML in Phase 4.** All scoring, comparison, and detection is rule-based SQL.
+  Agent-assisted distillation begins in Phase 5.