inter-hub/workplans/IHUB-WP-0004-ihf-phase4-outcome-observation-and-antifragility.md
Bernd Worsch 878d2577ae
feat(P4): IHF Phase 4 complete — Outcome Observation and Antifragility Loop
Closes the IHF improvement loop. Full antifragility chain now traversable:
Widget → Annotation → Candidate → Requirement → Decision → Deployment → OutcomeSignal

New artifacts:
- DeploymentRecord (immutable, links DecisionRecord to a deployed version)
- OutcomeSignal (append-only; DB trigger prevents UPDATE/DELETE)
- ChangeEvaluation (one-per-deployment; UNIQUE constraint; 1–5 score)

New capabilities:
- DeploymentRecordsController (index, show, new, create)
- RecordOutcomeSignalAction — capture improved/regressed/neutral/inconclusive signals
- Pre/post comparison panel on deployment show (±30-day event/annotation counts)
- Regression detection — improved signal followed by high/critical annotation
- ChangeEvaluation — idempotent score+rationale per deployment
- Recurrence tracking — cycle count per widget, leaderboard
- AntifragilityDashboardAction (autoRefresh, 5 panels) per hub
- Phase 4 integration tests (T01–T08 logic coverage)
- docs/phase4-summary.md; SCOPE.md updated to Phase 4 complete

State Hub: workstream 07e9c860 → completed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 12:27:30 +00:00

---
id: IHUB-WP-0004
type: workplan
title: "IHF Phase 4 — Outcome Observation and Antifragility Loop"
domain: inter_hub
repo: inter-hub
status: done
owner: custodian
topic_slug: inter_hub
created: "2026-03-29"
updated: "2026-03-29"
state_hub_workstream_id: "07e9c860-e39b-407f-9c0b-d44989498b48"
---
# IHF Phase 4 — Outcome Observation and Antifragility Loop
## Goal
Close the improvement loop by observing whether implemented changes actually
helped. Phase 3 established the governance layer — requirements, decisions,
policy constraints, and implementation references. Phase 4 connects those
implementation references to deployed versions, captures outcome signals per
widget, compares behaviour before and after a change, and detects regressions
and recurring friction.
## Background
Phase 1 (IHUB-WP-0001) delivered the Minimal Interaction Core. Phase 2
(IHUB-WP-0002) delivered Structured Feedback and Triage. Phase 3
(IHUB-WP-0003) delivered Governance and Decision Linkage. All Phase 3 exit
criteria are met.
Phase 4 is the fourth of eight phases in the IHF specification
(`specs/InteractionHubFrameworkSpecification_v0.1.md`, §14 Phase 4). It
completes the antifragility loop:
```
Widget → InteractionEvent / Annotation
→ RequirementCandidate → Requirement
→ DecisionRecord → ImplementationChangeReference
→ DeploymentRecord
→ OutcomeSignal ← ChangeEvaluation
→ RegressionDetection / RecurrenceTracking
```
**Technology stack:** IHP v1.5 (Haskell, Nix), PostgreSQL, AutoRefresh
(antifragility dashboard). Outcome signals and deployment records are
append-only / immutable by the same conventions as InteractionEvent and
TriageState.
Reference: `docs/ihp-overview.md`, `docs/ihp-data-and-queries.md`,
`docs/ihp-controllers-views-forms.md`, `docs/ihp-realtime.md`.
## Phase 4 Exit Criteria (from IHF spec §14 Phase 4)
- The platform can determine whether a change improved outcomes
- Recurrent friction becomes visible
- The system supports evidence-based UI evolution
## Data Artifacts Introduced (Phase 4)
`DeploymentRecord`, `OutcomeSignal`, `ChangeEvaluation`
---
## Tasks
### T01 — Schema: DeploymentRecord, OutcomeSignal, ChangeEvaluation
```task
id: IHUB-WP-0004-T01
status: done
priority: high
state_hub_task_id: "4d0aa6d5-f291-4053-a487-8c64627f8271"
```
Add Phase 4 tables to `Application/Schema.sql` and write migration:
```sql
-- Deployment records: immutable by convention (enforced in the controller)
CREATE TABLE deployment_records (
    id          UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
    impl_ref_id UUID REFERENCES implementation_change_references(id) ON DELETE SET NULL,
    decision_id UUID NOT NULL REFERENCES decision_records(id) ON DELETE RESTRICT,
    version_ref TEXT NOT NULL,
    deployed_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL,
    deployed_by UUID REFERENCES users(id),
    notes       TEXT,
    created_at  TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL
);
CREATE INDEX deployment_records_decision_id_idx ON deployment_records (decision_id);
CREATE INDEX deployment_records_deployed_at_idx ON deployment_records (deployed_at DESC);

-- Outcome signals: append-only, no update/delete
CREATE TABLE outcome_signals (
    id            UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
    widget_id     UUID NOT NULL REFERENCES widgets(id) ON DELETE CASCADE,
    deployment_id UUID NOT NULL REFERENCES deployment_records(id) ON DELETE CASCADE,
    signal_type   TEXT NOT NULL,
    value         NUMERIC,
    observed_at   TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL
);
CREATE INDEX outcome_signals_widget_id_idx ON outcome_signals (widget_id);
CREATE INDEX outcome_signals_deployment_id_idx ON outcome_signals (deployment_id);
CREATE INDEX outcome_signals_observed_at_idx ON outcome_signals (observed_at DESC);

-- Enforce append-only on outcome_signals
CREATE OR REPLACE FUNCTION prevent_outcome_signal_mutation()
RETURNS TRIGGER AS $$
BEGIN
    RAISE EXCEPTION 'outcome_signals is append-only: UPDATE and DELETE are not permitted';
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER outcome_signals_no_update
    BEFORE UPDATE ON outcome_signals
    FOR EACH ROW EXECUTE FUNCTION prevent_outcome_signal_mutation();

CREATE TRIGGER outcome_signals_no_delete
    BEFORE DELETE ON outcome_signals
    FOR EACH ROW EXECUTE FUNCTION prevent_outcome_signal_mutation();

-- Change evaluations: one per deployment
CREATE TABLE change_evaluations (
    id            UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
    deployment_id UUID NOT NULL REFERENCES deployment_records(id) ON DELETE CASCADE,
    decision_id   UUID REFERENCES decision_records(id) ON DELETE SET NULL,
    score         SMALLINT NOT NULL CHECK (score BETWEEN 1 AND 5),
    rationale     TEXT NOT NULL,
    evaluated_by  UUID REFERENCES users(id),
    evaluated_at  TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL,
    UNIQUE (deployment_id)
);
CREATE INDEX change_evaluations_deployment_id_idx ON change_evaluations (deployment_id);
```
- Valid `outcome_signals.signal_type` values: `improved`, `regressed`, `neutral`, `inconclusive`
- `deployment_records` is immutable (no update/delete — append-only by convention; controller enforces)
- `change_evaluations` has a UNIQUE constraint on `deployment_id` — one evaluation per deployment
- Verify Haskell types are generated correctly
**Exit criteria:** `migrate` runs cleanly; all Phase 4 types available in GHCi.
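
The schema stores `signal_type` as plain `TEXT`, and this workplan does not say where the four allowed values are validated. If database-level enforcement were wanted, one option would be a `CHECK` constraint. The following is a sketch only, not part of the recorded migration, and the constraint name is hypothetical:

```sql
-- Hypothetical follow-up migration (not in the workplan): reject any
-- signal_type outside the four values listed under T01.
ALTER TABLE outcome_signals
    ADD CONSTRAINT outcome_signals_signal_type_check
    CHECK (signal_type IN ('improved', 'regressed', 'neutral', 'inconclusive'));
```

Adding the constraint only scans existing rows for validation; it does not issue an `UPDATE`, so the append-only trigger does not interfere.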
---
### T02 — DeploymentRecord controller and views
```task
id: IHUB-WP-0004-T02
status: done
priority: high
state_hub_task_id: "4932b036-fe91-4146-9b35-7d3031894c2d"
```
1. Scaffold `DeploymentRecordsController`
2. Actions: index, show, new, create (no update/delete — immutable)
3. Fields: `decisionId` (select — required), `implRefId` (optional select from linked
impl refs), `versionRef` (free text, e.g. `v1.2.3`, `git:abc1234`), `deployedAt`,
`notes`
4. Index: table with decision title, version_ref, deployed_at, outcome signal count,
evaluation score (if present)
5. Show: full detail + linked decision chain (→ requirement → candidate → widget) +
list of outcome signals + change evaluation panel
6. Add "New Deployment" button on decision show page (only for decisions with at least
one impl ref)
**Exit criteria:** Deployment records can be created and viewed; linked back to the
full decision → requirement → widget chain.
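
The index columns listed in item 4 could be fetched with one aggregate query along these lines. This is a minimal sketch that uses only the Phase 4 tables from T01; the decision title is left out because the columns of `decision_records` are not shown in this workplan.

```sql
-- One row per deployment, newest first, with its signal count and
-- evaluation score (NULL when the deployment has not been evaluated).
SELECT dr.id,
       dr.version_ref,
       dr.deployed_at,
       count(os.id) AS outcome_signal_count,
       ce.score     AS evaluation_score
FROM deployment_records dr
LEFT JOIN outcome_signals    os ON os.deployment_id = dr.id
LEFT JOIN change_evaluations ce ON ce.deployment_id = dr.id
GROUP BY dr.id, ce.score
ORDER BY dr.deployed_at DESC;
```

Because `change_evaluations.deployment_id` is unique, the second join cannot multiply rows, so `count(os.id)` stays accurate.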
---
### T03 — OutcomeSignal capture
```task
id: IHUB-WP-0004-T03
status: done
priority: high
state_hub_task_id: "8b39bbb3-4129-4acc-97ac-38ecfcfd7c88"
```
1. `RecordOutcomeSignalAction { deploymentId }` (POST from deployment show page or
widget show page)
2. Fields: `signalType` (select: improved/regressed/neutral/inconclusive),
`value` (optional, 0–100), `observedAt` (default now)
3. Append-only — no edit/delete in UI (DB trigger enforces)
4. List signals on deployment show page ordered by `observedAt` DESC
5. List signals on widget show page (last 10, across all deployments)
6. Signal type color roles:
   - `improved` → green
   - `regressed` → red
   - `neutral` → gray
   - `inconclusive` → yellow/amber
**Exit criteria:** Outcome signals can be recorded from deployment and widget pages;
append-only constraint verified; color roles applied consistently.
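
One way to verify the append-only constraint by hand is to attempt a no-op `UPDATE` and catch the trigger's exception. The block below is a sketch that assumes the T01 migration has run and that `outcome_signals` holds at least one row.

```sql
-- Assumption: outcome_signals has at least one row. On an empty table the
-- row-level trigger never fires and the UPDATE trivially succeeds.
DO $$
BEGIN
    -- No-op assignment: harmless if it goes through, but it still fires
    -- the BEFORE UPDATE trigger for every row.
    UPDATE outcome_signals SET signal_type = signal_type;
    RAISE WARNING 'UPDATE was not blocked (empty table or trigger missing)';
EXCEPTION
    WHEN raise_exception THEN
        RAISE NOTICE 'append-only enforced: %', SQLERRM;
END;
$$;
```

The `DELETE` trigger can be checked the same way inside a transaction that is rolled back afterwards. Note that row-level triggers do not fire on `TRUNCATE`, so the append-only guarantee covers `UPDATE` and `DELETE` only.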
---
### T04 — Pre/post comparison: interaction behaviour before and after deployment
```task
id: IHUB-WP-0004-T04
status: done
priority: high
state_hub_task_id: "27c4de52-755a-40e7-bef3-986fe4470f7c"
```
1. `ComparisonAction { deploymentId }` — rendered on deployment show page as a panel
2. Time windows: 30 days before `deployed_at` vs 30 days after (or until present)
3. Metrics computed via SQL aggregates (no in-memory processing for large datasets):
   - Interaction event count by type
   - Total annotation count
   - Annotation severity distribution (low/medium/high/critical counts)
   - High/critical annotation rate (high+critical / total)
4. Render side-by-side comparison table: Before | After | Delta
5. Delta column: green if annotation rate decreased, red if increased, gray if flat
**Exit criteria:** Comparison panel renders correct before/after counts; delta
direction color-coded correctly; works with no post-deployment data (shows "—").
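
The annotation part of the comparison can be sketched as a single aggregate with `FILTER` clauses. The example is self-contained: the two CTEs are inline fixtures, and the `annotations` columns (`severity`, `created_at`) are assumptions, since the Phase 2 schema is not shown here. A real query would also filter by widget and read `deployed_at` from `deployment_records`.

```sql
WITH deployment (deployed_at) AS (
    VALUES (TIMESTAMPTZ '2026-02-01 00:00+00')
),
annotations (severity, created_at) AS (        -- fixture rows, assumed columns
    VALUES
        ('high',     TIMESTAMPTZ '2026-01-10 00:00+00'),
        ('critical', TIMESTAMPTZ '2026-01-20 00:00+00'),
        ('low',      TIMESTAMPTZ '2026-01-25 00:00+00'),
        ('medium',   TIMESTAMPTZ '2026-01-28 00:00+00'),
        ('low',      TIMESTAMPTZ '2026-02-05 00:00+00'),
        ('medium',   TIMESTAMPTZ '2026-02-10 00:00+00'),
        ('low',      TIMESTAMPTZ '2025-11-01 00:00+00')  -- outside both windows
)
SELECT
    CASE WHEN a.created_at < d.deployed_at THEN 'before' ELSE 'after' END AS period,
    count(*) AS total_annotations,
    count(*) FILTER (WHERE a.severity IN ('high', 'critical')) AS high_critical,
    round((count(*) FILTER (WHERE a.severity IN ('high', 'critical')))::numeric
          / count(*), 2) AS high_critical_rate
FROM annotations a
CROSS JOIN deployment d
WHERE a.created_at >= d.deployed_at - INTERVAL '30 days'
  AND a.created_at <  d.deployed_at + INTERVAL '30 days'
GROUP BY 1
ORDER BY 1 DESC;   -- 'before' sorts ahead of 'after'
```

With these fixtures the `before` row has 4 annotations, 2 of them high/critical (rate 0.50), and the `after` row has 2 annotations, none high/critical (rate 0.00), so the delta would render green. A period with no annotations produces no row at all, which is the case the panel has to render as a placeholder.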
---
### T05 — Regression detection
```task
id: IHUB-WP-0004-T05
status: done
priority: high
state_hub_task_id: "844a828b-b7d7-4822-becc-b377c08c673a"
```
1. A **regression** is defined as: a widget that has an `OutcomeSignal(improved)`
for a deployment, followed by a new `Annotation(severity IN ['high','critical'])`
created more than 1 day after the signal's `observed_at` (grace period)
2. `RegressionQuery` (pure SQL query, no controller action — used by dashboard and
widget show page):
```sql
-- widgets with improved signal then subsequent high/critical annotation
```
3. Surface on widget show page: regression warning badge if widget is in regression
4. Surface on governance dashboard: regression count in KPI row, list of regressed
widgets
5. Surface on antifragility dashboard (T08): prominent regression alert panel
**Exit criteria:** Regression query returns correct results; badge visible on
affected widgets; count accurate on dashboard.
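
The regression definition in item 1 can be sketched as a join between improved signals and later high/critical annotations. The example below is illustrative rather than the project's actual `RegressionQuery`: the CTEs are inline fixtures with integer widget ids for brevity (the real ids are UUIDs), and the `annotations` columns are assumptions.

```sql
WITH outcome_signals (widget_id, signal_type, observed_at) AS (
    VALUES
        (1, 'improved', TIMESTAMPTZ '2026-01-10 00:00+00'),
        (2, 'improved', TIMESTAMPTZ '2026-01-10 00:00+00'),
        (3, 'neutral',  TIMESTAMPTZ '2026-01-10 00:00+00')
),
annotations (widget_id, severity, created_at) AS (   -- assumed columns
    VALUES
        (1, 'critical', TIMESTAMPTZ '2026-01-15 00:00+00'),  -- 5 days later
        (2, 'high',     TIMESTAMPTZ '2026-01-10 12:00+00'),  -- inside grace period
        (3, 'high',     TIMESTAMPTZ '2026-01-20 00:00+00')   -- no improved signal
)
SELECT DISTINCT s.widget_id
FROM outcome_signals s
JOIN annotations a ON a.widget_id = s.widget_id
WHERE s.signal_type = 'improved'
  AND a.severity IN ('high', 'critical')
  AND a.created_at > s.observed_at + INTERVAL '1 day'   -- 1-day grace period
ORDER BY s.widget_id;
```

Only widget 1 is returned: widget 2's annotation falls inside the grace period, and widget 3 never had an `improved` signal. Removing the two fixture CTEs leaves a query shape that could run against the real tables, provided the assumed column names match.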
---
### T06 — ChangeEvaluation: score changes by observed effect
```task
id: IHUB-WP-0004-T06
status: done
priority: medium
state_hub_task_id: "391c6136-baea-417a-9291-5ba9f633e03f"
```
1. `EvaluateChangeAction { deploymentId }` (POST from deployment show page)
2. Idempotent: if `change_evaluations.deployment_id` already exists, redirect to
deployment show page with "Already evaluated" message
3. Fields: `score` (1–5, required), `rationale` (textarea, required)
4. Show evaluation on deployment show page: score as ★ stars + rationale
5. Show evaluation summary on decision show page: "Deployment evaluated: ★★★★☆"
6. Score 1–2 → red, 3 → yellow, 4–5 → green (in all views)
**Exit criteria:** One evaluation per deployment; idempotent; score displayed with
correct color in all views.
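
The workplan describes the idempotency check at the controller level. As a complementary sketch, the same guarantee can be expressed in one statement with `ON CONFLICT`, which also closes the gap between "check" and "insert" under concurrent requests. The prepared-statement name is hypothetical; the table and columns are those from T01.

```sql
-- Returns the new id on first evaluation, and zero rows if the deployment
-- was already evaluated (the caller can then show "Already evaluated").
PREPARE evaluate_change (uuid, smallint, text) AS
    INSERT INTO change_evaluations (deployment_id, score, rationale)
    VALUES ($1, $2, $3)
    ON CONFLICT (deployment_id) DO NOTHING
    RETURNING id;

-- Usage, with a real deployment id substituted:
--   EXECUTE evaluate_change('<deployment uuid>', 4::smallint, 'rationale text');

-- Score-to-color mapping from item 6, expressed as a CASE for reference.
SELECT deployment_id,
       score,
       CASE WHEN score <= 2 THEN 'red'
            WHEN score = 3  THEN 'yellow'
            ELSE 'green'
       END AS score_color
FROM change_evaluations;
```

The `CHECK (score BETWEEN 1 AND 5)` constraint from T01 means the `ELSE` branch can only be reached by scores 4 and 5.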
---
### T07 — Recurrence tracking: detect repeated unresolved friction
```task
id: IHUB-WP-0004-T07
status: done
priority: medium
state_hub_task_id: "6a5eea23-4e73-441a-ab93-49f5b76bf3ea"
```
1. A **recurrence** is: a widget that has had 2 or more `RequirementCandidate`s
created in separate decision cycles (a new cycle begins after a prior candidate
for the same widget was `accepted` and a `DeploymentRecord` exists for that
decision)
2. `RecurrenceQuery` (SQL): per widget, count completed cycles and flag widgets
with cycle_count ≥ 2
3. Widget show page: recurrence count badge ("⟳ 3 cycles") if cycle_count ≥ 2
4. Antifragility dashboard: recurrence leaderboard — top 10 widgets by cycle count,
sortable
5. Recurrence is informational only — no automated blocking
**Exit criteria:** Recurrence count accurate per widget; leaderboard renders; badge
visible on widget show page.
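
A simplified sketch of the cycle count follows. It treats a completed cycle as an accepted candidate whose decision has at least one deployment record, and it ignores the "new annotation created" step mentioned in the notes. The CTEs are inline fixtures; `candidate_decisions` is a hypothetical flattening of the candidate → requirement → decision chain, whose real columns are not shown in this workplan.

```sql
WITH requirement_candidates (id, widget_id, status) AS (
    VALUES (101, 1, 'accepted'),
           (102, 1, 'accepted'),
           (103, 1, 'open'),       -- partial cycle, not counted
           (201, 2, 'accepted')
),
candidate_decisions (candidate_id, decision_id) AS (   -- hypothetical link
    VALUES (101, 11), (102, 12), (201, 21)
),
deployment_records (id, decision_id) AS (
    VALUES (1001, 11), (1002, 12), (1003, 12), (1004, 21)
)
SELECT rc.widget_id,
       count(DISTINCT rc.id) AS cycle_count
FROM requirement_candidates rc
JOIN candidate_decisions cd ON cd.candidate_id = rc.id
WHERE rc.status = 'accepted'
  AND EXISTS (SELECT 1
              FROM deployment_records dr
              WHERE dr.decision_id = cd.decision_id)
GROUP BY rc.widget_id
HAVING count(DISTINCT rc.id) >= 2
ORDER BY cycle_count DESC, rc.widget_id
LIMIT 10;   -- leaderboard size from item 4
```

With these fixtures only widget 1 is returned, with `cycle_count` 2. Using `EXISTS` rather than a join on `deployment_records` keeps a decision that was deployed twice (decision 12 here) from inflating the count.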
---
### T08 — Antifragility dashboard (AutoRefresh)
```task
id: IHUB-WP-0004-T08
status: done
priority: high
state_hub_task_id: "e5c65c77-c757-49a6-a8e9-99c8d3503f59"
```
1. Add `AntifragilityDashboardAction { hubId }` to `HubsController` wrapped with
`autoRefresh do`
2. Dashboard panels:
   - **KPI row**: total deployments / avg evaluation score / % improved signals /
     regression count
   - **Open gaps**: decisions with impl refs but no deployment record yet
   - **Recent deployments** (last 20): version_ref, decision title, signal summary,
     evaluation score
   - **Regression alerts**: widgets currently in regression state
   - **Recurrence leaderboard**: top 10 widgets by cycle count
3. Link from hub Show page alongside Triage Dashboard and Governance Dashboard
**Exit criteria:** Dashboard live-updates on deployment/signal/evaluation changes.
All five panels render with correct data.
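
Three of the four KPI values can be computed from the Phase 4 tables alone. The query below is a sketch that omits the per-hub filter, because the path from a deployment to its hub runs through Phase 1 to 3 tables not shown here; the regression count would come from the T05 query.

```sql
SELECT
    (SELECT count(*) FROM deployment_records)             AS total_deployments,
    (SELECT round(avg(score), 2) FROM change_evaluations) AS avg_evaluation_score,
    (SELECT round(100.0 * count(*) FILTER (WHERE signal_type = 'improved')
                  / NULLIF(count(*), 0), 1)
       FROM outcome_signals)                              AS pct_improved_signals;
```

`NULLIF(count(*), 0)` makes the percentage `NULL` rather than raising a division-by-zero error when no signals have been recorded yet; `avg` likewise returns `NULL` on an empty table.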
---
### T09 — Phase 4 gate: tests, consistency, docs
```task
id: IHUB-WP-0004-T09
status: done
priority: high
state_hub_task_id: "1dda0a32-4913-4007-a9f4-1d86761a8cf1"
```
1. **Integration tests** (`Test/`):
   - DeploymentRecord create + link to decision
   - OutcomeSignal append-only (DB trigger fires on update/delete)
   - Pre/post comparison: correct counts with known fixture data
   - Regression detection: widget with improved signal + subsequent high annotation
   - ChangeEvaluation create + idempotent (second create → duplicate rejection)
   - Recurrence count: widget with 2 completed cycles
   - Antifragility dashboard data fetch: compiles and returns correct counts
2. **Consistency sync** via State Hub MCP:
   `check_repo_consistency(repo_slug="inter-hub", fix=True)`
3. **Documentation updates:**
   - Update `SCOPE.md` current state section: Phase 4 complete
   - Write `docs/phase4-summary.md`: what was built, known limitations, Phase 5
     readiness
4. **Smoke test checklist:**
   - Create deployment record linked to a decision
   - Record outcome signals (improved, then regressed)
   - Observe pre/post comparison panel
   - Evaluate the change (score 4)
   - Confirm regression badge appears on widget show page
   - Confirm antifragility dashboard shows all panels
**Exit criteria:** All tests pass; consistency sync reports no errors; smoke test
completed; SCOPE.md updated.
---
## Phase 4 Dependencies
- Phase 3 schema stable (T01 depends on `decision_records`,
`implementation_change_references`, `widgets` from Phase 3)
- `deployment_records` before `outcome_signals` and `change_evaluations` (FK)
- Schema (T01) before all controller work (T02–T08)
- `DeploymentRecord` (T02) before `OutcomeSignal` (T03), comparison (T04),
regression (T05), `ChangeEvaluation` (T06)
- All feature tasks (T01–T08) before gate (T09)
## Notes
- **DeploymentRecord is immutable.** No update/delete — same convention as
`DecisionRecord`. A mis-recorded deployment should be noted in a new
`DeploymentRecord` with a correcting note.
- **OutcomeSignal is append-only.** DB trigger enforces — same pattern as
`InteractionEvent`. Observations are evidence; they cannot be revised.
- **ChangeEvaluation is one-per-deployment** (UNIQUE constraint). A wrong
evaluation cannot be changed — create a new deployment record if you need to
re-evaluate a re-deployment.
- **Regression detection is a heuristic, not a hard constraint.** It is a signal
for operator attention, not an automated gate.
- **Recurrence tracking uses completed cycles only.** A cycle is only counted
when: prior candidate accepted → deployment record exists for that decision →
new annotation created. Partial cycles (no deployment yet) do not count.
- **No ML in Phase 4.** All scoring, comparison, and detection is rule-based SQL.
Agent-assisted distillation begins in Phase 5.