Closes the IHF improvement loop. Full antifragility chain now traversable:
Widget → Annotation → Candidate → Requirement → Decision → Deployment → OutcomeSignal

New artifacts:
- DeploymentRecord (immutable, links DecisionRecord to a deployed version)
- OutcomeSignal (append-only; DB trigger prevents UPDATE/DELETE)
- ChangeEvaluation (one-per-deployment; UNIQUE constraint; 1–5 score)

New capabilities:
- DeploymentRecordsController (index, show, new, create)
- RecordOutcomeSignalAction: capture improved/regressed/neutral/inconclusive signals
- Pre/post comparison panel on deployment show (±30-day event/annotation counts)
- Regression detection: improved signal followed by high/critical annotation
- ChangeEvaluation: idempotent score+rationale per deployment
- Recurrence tracking: cycle count per widget, leaderboard
- AntifragilityDashboardAction (autoRefresh, 5 panels) per hub
- Phase 4 integration tests (T01–T08 logic coverage)
- docs/phase4-summary.md; SCOPE.md updated to Phase 4 complete

State Hub: workstream 07e9c860 → completed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| id | type | title | domain | repo | status | owner | topic_slug | created | updated | state_hub_workstream_id |
|---|---|---|---|---|---|---|---|---|---|---|
| IHUB-WP-0004 | workplan | IHF Phase 4 — Outcome Observation and Antifragility Loop | inter_hub | inter-hub | done | custodian | inter_hub | 2026-03-29 | 2026-03-29 | 07e9c860-e39b-407f-9c0b-d44989498b48 |
IHF Phase 4 — Outcome Observation and Antifragility Loop
Goal
Close the improvement loop by observing whether implemented changes actually helped. Phase 3 established the governance layer — requirements, decisions, policy constraints, and implementation references. Phase 4 connects those implementation references to deployed versions, captures outcome signals per widget, compares behaviour before and after a change, and detects regressions and recurring friction.
Background
Phase 1 (IHUB-WP-0001) delivered the Minimal Interaction Core. Phase 2 (IHUB-WP-0002) delivered Structured Feedback and Triage. Phase 3 (IHUB-WP-0003) delivered Governance and Decision Linkage. All Phase 3 exit criteria are met.
Phase 4 is the fourth of eight phases in the IHF specification
(specs/InteractionHubFrameworkSpecification_v0.1.md, §14 Phase 4). It
completes the antifragility loop:
Widget → InteractionEvent / Annotation
→ RequirementCandidate → Requirement
→ DecisionRecord → ImplementationChangeReference
→ DeploymentRecord
→ OutcomeSignal ← ChangeEvaluation
→ RegressionDetection / RecurrenceTracking
Technology stack: IHP v1.5 (Haskell, Nix), PostgreSQL, AutoRefresh (antifragility dashboard). Outcome signals and deployment records are append-only / immutable by the same conventions as InteractionEvent and TriageState.
Reference: docs/ihp-overview.md, docs/ihp-data-and-queries.md,
docs/ihp-controllers-views-forms.md, docs/ihp-realtime.md.
Phase 4 Exit Criteria (from IHF spec §14 Phase 4)
- The platform can determine whether a change improved outcomes
- Recurrent friction becomes visible
- The system supports evidence-based UI evolution
Data Artifacts Introduced (Phase 4)
DeploymentRecord, OutcomeSignal, ChangeEvaluation
Tasks
T01 — Schema: DeploymentRecord, OutcomeSignal, ChangeEvaluation
id: IHUB-WP-0004-T01
status: done
priority: high
state_hub_task_id: "4d0aa6d5-f291-4053-a487-8c64627f8271"
Add Phase 4 tables to Application/Schema.sql and write migration:
```sql
CREATE TABLE deployment_records (
    id UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
    impl_ref_id UUID REFERENCES implementation_change_references(id) ON DELETE SET NULL,
    decision_id UUID NOT NULL REFERENCES decision_records(id) ON DELETE RESTRICT,
    version_ref TEXT NOT NULL,
    deployed_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL,
    deployed_by UUID REFERENCES users(id),
    notes TEXT,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL
);
CREATE INDEX deployment_records_decision_id_idx ON deployment_records (decision_id);
CREATE INDEX deployment_records_deployed_at_idx ON deployment_records (deployed_at DESC);

-- Outcome signals: append-only, no update/delete
CREATE TABLE outcome_signals (
    id UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
    widget_id UUID NOT NULL REFERENCES widgets(id) ON DELETE CASCADE,
    deployment_id UUID NOT NULL REFERENCES deployment_records(id) ON DELETE CASCADE,
    signal_type TEXT NOT NULL,
    value NUMERIC,
    observed_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL
);
CREATE INDEX outcome_signals_widget_id_idx ON outcome_signals (widget_id);
CREATE INDEX outcome_signals_deployment_id_idx ON outcome_signals (deployment_id);
CREATE INDEX outcome_signals_observed_at_idx ON outcome_signals (observed_at DESC);

-- Enforce append-only on outcome_signals
CREATE OR REPLACE FUNCTION prevent_outcome_signal_mutation()
RETURNS TRIGGER AS $$
BEGIN
    RAISE EXCEPTION 'outcome_signals is append-only: UPDATE and DELETE are not permitted';
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER outcome_signals_no_update
    BEFORE UPDATE ON outcome_signals
    FOR EACH ROW EXECUTE FUNCTION prevent_outcome_signal_mutation();
CREATE TRIGGER outcome_signals_no_delete
    BEFORE DELETE ON outcome_signals
    FOR EACH ROW EXECUTE FUNCTION prevent_outcome_signal_mutation();

-- Change evaluations: one per deployment
CREATE TABLE change_evaluations (
    id UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
    deployment_id UUID NOT NULL REFERENCES deployment_records(id) ON DELETE CASCADE,
    decision_id UUID REFERENCES decision_records(id) ON DELETE SET NULL,
    score SMALLINT NOT NULL CHECK (score BETWEEN 1 AND 5),
    rationale TEXT NOT NULL,
    evaluated_by UUID REFERENCES users(id),
    evaluated_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL,
    UNIQUE (deployment_id)
);
CREATE INDEX change_evaluations_deployment_id_idx ON change_evaluations (deployment_id);
```
- Valid `outcome_signals.signal_type` values: `improved`, `regressed`, `neutral`, `inconclusive`
- `deployment_records` is immutable (no update/delete; append-only by convention, enforced by the controller)
- `change_evaluations` has a UNIQUE constraint on `deployment_id`: one evaluation per deployment
- Verify Haskell types are generated correctly
Exit criteria: migrate runs cleanly; all Phase 4 types available in GHCi.
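The two database-level guarantees above (append-only `outcome_signals`, one `change_evaluations` row per deployment) can be exercised in isolation. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for PostgreSQL: the triggers use SQLite's `RAISE(ABORT, ...)` in place of the PL/pgSQL function, the tables carry only the columns needed for the check, and the helper names `is_rejected` and `check_phase4_constraints` are hypothetical.

```python
import sqlite3

# SQLite stand-in for the Phase 4 schema (assumption: reduced columns,
# integer ids instead of UUIDs, RAISE(ABORT) instead of a plpgsql function).
SCHEMA = """
CREATE TABLE outcome_signals (
    id INTEGER PRIMARY KEY,
    signal_type TEXT NOT NULL
);
CREATE TRIGGER outcome_signals_no_update BEFORE UPDATE ON outcome_signals
BEGIN
    SELECT RAISE(ABORT, 'outcome_signals is append-only');
END;
CREATE TRIGGER outcome_signals_no_delete BEFORE DELETE ON outcome_signals
BEGIN
    SELECT RAISE(ABORT, 'outcome_signals is append-only');
END;
CREATE TABLE change_evaluations (
    id INTEGER PRIMARY KEY,
    deployment_id INTEGER NOT NULL UNIQUE,
    score INTEGER NOT NULL CHECK (score BETWEEN 1 AND 5)
);
"""


def is_rejected(conn, sql):
    """Return True when the database refuses the statement (trigger or constraint)."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False


def check_phase4_constraints():
    """Insert one signal and one evaluation, then attempt every forbidden mutation."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO outcome_signals (signal_type) VALUES ('improved')")
    conn.execute("INSERT INTO change_evaluations (deployment_id, score) VALUES (1, 4)")
    results = {
        "update_blocked": is_rejected(
            conn, "UPDATE outcome_signals SET signal_type = 'neutral'"),
        "delete_blocked": is_rejected(conn, "DELETE FROM outcome_signals"),
        "second_evaluation_blocked": is_rejected(
            conn, "INSERT INTO change_evaluations (deployment_id, score) VALUES (1, 2)"),
        "out_of_range_score_blocked": is_rejected(
            conn, "INSERT INTO change_evaluations (deployment_id, score) VALUES (2, 6)"),
    }
    # The original signal row must still be present after the failed mutations.
    results["signals_kept"] = conn.execute(
        "SELECT COUNT(*) FROM outcome_signals").fetchone()[0]
    conn.close()
    return results
```

The same four attempts map directly onto the integration tests planned for the gate task; against PostgreSQL the rejected statements would surface as exceptions raised by the trigger function and the UNIQUE/CHECK constraints.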
T02 — DeploymentRecord controller and views
id: IHUB-WP-0004-T02
status: done
priority: high
state_hub_task_id: "4932b036-fe91-4146-9b35-7d3031894c2d"
- Scaffold `DeploymentRecordsController`
- Actions: index, show, new, create (no update/delete; records are immutable)
- Fields: `decisionId` (select, required), `implRefId` (optional select from linked impl refs), `versionRef` (free text, e.g. `v1.2.3`, `git:abc1234`), `deployedAt`, `notes`
- Index: table with decision title, version_ref, deployed_at, outcome signal count, evaluation score (if present)
- Show: full detail + linked decision chain (→ requirement → candidate → widget) + list of outcome signals + change evaluation panel
- Add "New Deployment" button on decision show page (only for decisions with at least one impl ref)
Exit criteria: Deployment records can be created and viewed; linked back to the full decision → requirement → widget chain.
T03 — OutcomeSignal capture
id: IHUB-WP-0004-T03
status: done
priority: high
state_hub_task_id: "8b39bbb3-4129-4acc-97ac-38ecfcfd7c88"
- `RecordOutcomeSignalAction { deploymentId }` (POST from deployment show page or widget show page)
- Fields: `signalType` (select: improved/regressed/neutral/inconclusive), `value` (optional 0–100), `observedAt` (default now)
- Append-only: no edit/delete in UI (DB trigger enforces)
- List signals on deployment show page ordered by `observedAt` DESC
- List signals on widget show page (last 10, across all deployments)
- Signal type color roles:
  - `improved` → green
  - `regressed` → red
  - `neutral` → gray
  - `inconclusive` → yellow/amber
Exit criteria: Outcome signals can be recorded from deployment and widget pages; append-only constraint verified; color roles applied consistently.
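The input rules and color roles for a signal are small enough to state as a lookup. This is an illustrative sketch only: the names `SIGNAL_COLOR_ROLES` and `validate_signal` are hypothetical, and "amber" is chosen here for the yellow/amber role.

```python
# Color role per signal type, as listed in T03 ("amber" stands for yellow/amber).
SIGNAL_COLOR_ROLES = {
    "improved": "green",
    "regressed": "red",
    "neutral": "gray",
    "inconclusive": "amber",
}


def validate_signal(signal_type, value=None):
    """Return the color role for a signal, or raise ValueError on invalid input."""
    if signal_type not in SIGNAL_COLOR_ROLES:
        raise ValueError(f"unknown signal_type: {signal_type!r}")
    # value is optional; when present it must fall within 0-100.
    if value is not None and not 0 <= value <= 100:
        raise ValueError(f"value must be within 0-100, got {value}")
    return SIGNAL_COLOR_ROLES[signal_type]
```

Keeping the mapping in one place is one way to meet the "applied consistently" exit criterion, since every view would read the same table.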
T04 — Pre/post comparison: interaction behaviour before and after deployment
id: IHUB-WP-0004-T04
status: done
priority: high
state_hub_task_id: "27c4de52-755a-40e7-bef3-986fe4470f7c"
- `ComparisonAction { deploymentId }`, rendered on deployment show page as a panel
- Time windows: 30 days before `deployed_at` vs 30 days after (or until present)
- Metrics computed via SQL aggregates (no in-memory processing for large datasets):
  - Interaction event count by type
  - Total annotation count
  - Annotation severity distribution (low/medium/high/critical counts)
  - High/critical annotation rate (high+critical / total)
- Render side-by-side comparison table: Before | After | Delta
- Delta column: green if annotation rate decreased, red if increased, gray if flat
Exit criteria: Comparison panel renders correct before/after counts; delta direction color-coded correctly; works with no post-deployment data (shows "—").
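The high/critical annotation rate and its delta color can be sketched with a single conditional aggregate. The example below assumes SQLite in place of PostgreSQL, a hypothetical `annotations` table with `widget_id`, `severity`, and `created_at` (text timestamps in `YYYY-MM-DD HH:MM:SS` form), and a widget id passed in directly rather than resolved through the decision chain. The function names are illustrative.

```python
import sqlite3

# One pass over the +/-30-day window, split into 'before' and 'after' deployed_at.
WINDOW_SQL = """
SELECT
    CASE WHEN created_at < :deployed_at THEN 'before' ELSE 'after' END AS period,
    COUNT(*) AS total,
    SUM(CASE WHEN severity IN ('high', 'critical') THEN 1 ELSE 0 END) AS high_critical
FROM annotations
WHERE widget_id = :widget_id
  AND created_at >= datetime(:deployed_at, '-30 days')
  AND created_at <  datetime(:deployed_at, '+30 days')
GROUP BY period
"""


def annotation_rates(conn, widget_id, deployed_at):
    """Return (before_rate, after_rate); a rate is None when its window is empty."""
    rows = conn.execute(
        WINDOW_SQL, {"widget_id": widget_id, "deployed_at": deployed_at}
    ).fetchall()
    counts = {period: (total, high) for period, total, high in rows}

    def rate(period):
        total, high = counts.get(period, (0, 0))
        return high / total if total else None

    return rate("before"), rate("after")


def delta_color(before, after):
    """Green if the rate decreased, red if it increased, gray if flat.

    Returns None when either window has no data, so the view can show a placeholder.
    """
    if before is None or after is None:
        return None
    if after < before:
        return "green"
    if after > before:
        return "red"
    return "gray"
```

Returning `None` for an empty window keeps the "no post-deployment data" case distinct from a genuinely flat rate, which would otherwise both render as gray.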
T05 — Regression detection
id: IHUB-WP-0004-T05
status: done
priority: high
state_hub_task_id: "844a828b-b7d7-4822-becc-b377c08c673a"
- A regression is defined as: a widget that has an `OutcomeSignal(improved)` for a deployment, followed by a new `Annotation(severity IN ['high','critical'])` created more than 1 day after the signal's `observed_at` (grace period)
- `RegressionQuery` (pure SQL query, no controller action; used by dashboard and widget show page): selects widgets with an improved signal and a subsequent high/critical annotation
- Surface on widget show page: regression warning badge if widget is in regression
- Surface on governance dashboard: regression count in KPI row, list of regressed widgets
- Surface on antifragility dashboard (T08): prominent regression alert panel
Exit criteria: Regression query returns correct results; badge visible on affected widgets; count accurate on dashboard.
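The regression rule translates into a join between improved signals and later high/critical annotations on the same widget. The sketch below is one possible shape for it, run against SQLite for illustration; the `annotations` columns (`widget_id`, `severity`, `created_at`) and the function name `regressed_widgets` are assumptions, and in PostgreSQL the grace period would be written as `s.observed_at + interval '1 day'`.

```python
import sqlite3

# Widgets with an 'improved' signal followed, after a 1-day grace period,
# by a high or critical annotation on the same widget.
REGRESSION_SQL = """
SELECT DISTINCT s.widget_id
FROM outcome_signals AS s
JOIN annotations AS a ON a.widget_id = s.widget_id
WHERE s.signal_type = 'improved'
  AND a.severity IN ('high', 'critical')
  AND a.created_at > datetime(s.observed_at, '+1 day')
ORDER BY s.widget_id
"""


def regressed_widgets(conn):
    """Return the ids of widgets currently matching the regression heuristic."""
    return [row[0] for row in conn.execute(REGRESSION_SQL)]
```

Because the result is a plain list of widget ids, the same query can feed the badge on the widget show page and the count in the dashboard KPI row.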
T06 — ChangeEvaluation: score changes by observed effect
id: IHUB-WP-0004-T06
status: done
priority: medium
state_hub_task_id: "391c6136-baea-417a-9291-5ba9f633e03f"
- `EvaluateChangeAction { deploymentId }` (POST from deployment show page)
- Idempotent: if `change_evaluations.deployment_id` already exists, redirect to deployment show page with "Already evaluated" message
- Fields: `score` (1–5, required), `rationale` (textarea, required)
score(1–5, required),rationale(textarea, required) - Show evaluation on deployment show page: score as ★ stars + rationale
- Show evaluation summary on decision show page: "Deployment evaluated: ★★★★☆"
- Score 1–2 → red, 3 → yellow, 4–5 → green (in all views)
Exit criteria: One evaluation per deployment; idempotent; score displayed with correct color in all views.
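The idempotency check and the score display rules can be sketched without a database. In this illustration a plain dict keyed by deployment id stands in for the `change_evaluations` table; the function names and the "Evaluation recorded" message are hypothetical, while "Already evaluated" is the message named in the task.

```python
def score_color(score):
    """Map a 1-5 score to its color: 1-2 red, 3 yellow, 4-5 green."""
    if not 1 <= score <= 5:
        raise ValueError(f"score must be 1-5, got {score}")
    if score <= 2:
        return "red"
    if score == 3:
        return "yellow"
    return "green"


def stars(score):
    """Render a score as filled and empty stars, always five characters wide."""
    return "★" * score + "☆" * (5 - score)


def evaluate_change(evaluations, deployment_id, score, rationale):
    """Record an evaluation once per deployment; later attempts change nothing.

    `evaluations` is a dict standing in for the change_evaluations table.
    """
    if deployment_id in evaluations:
        return "Already evaluated"
    if not rationale.strip():
        raise ValueError("rationale is required")
    score_color(score)  # raises ValueError for an out-of-range score
    evaluations[deployment_id] = {"score": score, "rationale": rationale}
    return "Evaluation recorded"
```

In the real action the existence check would be backed by the UNIQUE constraint, so a concurrent second submission is still rejected by the database even if both requests pass the check.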
T07 — Recurrence tracking: detect repeated unresolved friction
id: IHUB-WP-0004-T07
status: done
priority: medium
state_hub_task_id: "6a5eea23-4e73-441a-ab93-49f5b76bf3ea"
- A recurrence is: a widget that has had 2 or more `RequirementCandidate`s created in separate decision cycles (a new cycle begins after a prior candidate for the same widget was `accepted` and a `DeploymentRecord` exists for that decision)
- `RecurrenceQuery` (SQL): per widget, count completed cycles and flag widgets with cycle_count ≥ 2
- Widget show page: recurrence count badge ("⟳ 3 cycles") if cycle_count ≥ 2
- Antifragility dashboard: recurrence leaderboard — top 10 widgets by cycle count, sortable
- Recurrence is informational only — no automated blocking
Exit criteria: Recurrence count accurate per widget; leaderboard renders; badge visible on widget show page.
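A cycle count per widget can be sketched as a grouped count of accepted candidates whose decision has a deployment record. This is a simplified illustration on SQLite: it assumes a hypothetical `decision_id` column directly on `requirement_candidates` (the actual chain runs through requirements), it does not model the follow-up annotation that starts the next cycle, and the function name is illustrative.

```python
import sqlite3

# Completed cycles per widget: accepted candidates whose decision was deployed.
# Only widgets with at least 2 cycles are returned, top 10 by cycle count.
RECURRENCE_SQL = """
SELECT c.widget_id, COUNT(DISTINCT c.id) AS cycle_count
FROM requirement_candidates AS c
WHERE c.status = 'accepted'
  AND EXISTS (
      SELECT 1 FROM deployment_records AS d
      WHERE d.decision_id = c.decision_id
  )
GROUP BY c.widget_id
HAVING COUNT(DISTINCT c.id) >= 2
ORDER BY cycle_count DESC, c.widget_id
LIMIT 10
"""


def recurrence_leaderboard(conn):
    """Return [(widget_id, cycle_count), ...] for widgets with recurring friction."""
    return conn.execute(RECURRENCE_SQL).fetchall()
```

An accepted candidate without a deployment record is skipped by the `EXISTS` clause, which matches the rule that partial cycles do not count.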
T08 — Antifragility dashboard (AutoRefresh)
id: IHUB-WP-0004-T08
status: done
priority: high
state_hub_task_id: "e5c65c77-c757-49a6-a8e9-99c8d3503f59"
- Add `AntifragilityDashboardAction { hubId }` to `HubsController` wrapped with `autoRefresh do`
- Dashboard panels:
  - KPI row: total deployments / avg evaluation score / % improved signals / regression count
  - Open gaps: decisions with impl refs but no deployment record yet
  - Recent deployments (last 20): version_ref, decision title, signal summary, evaluation score
  - Regression alerts: widgets currently in regression state
  - Recurrence leaderboard: top 10 widgets by cycle count
- Link from hub Show page alongside Triage Dashboard and Governance Dashboard
Exit criteria: Dashboard live-updates on deployment/signal/evaluation changes. All five panels render with correct data.
T09 — Phase 4 gate: tests, consistency, docs
id: IHUB-WP-0004-T09
status: done
priority: high
state_hub_task_id: "1dda0a32-4913-4007-a9f4-1d86761a8cf1"
- Integration tests (`Test/`):
  - DeploymentRecord create + link to decision
  - OutcomeSignal append-only (DB trigger fires on update/delete)
  - Pre/post comparison: correct counts with known fixture data
  - Regression detection: widget with improved signal + subsequent high annotation
  - ChangeEvaluation create + idempotent (second create → duplicate rejection)
  - Recurrence count: widget with 2 completed cycles
  - Antifragility dashboard data fetch: compiles and returns correct counts
- Consistency sync via State Hub MCP: `check_repo_consistency(repo_slug="inter-hub", fix=True)`
- Documentation updates:
  - Update `SCOPE.md` current state section: Phase 4 complete
  - Write `docs/phase4-summary.md`: what was built, known limitations, Phase 5 readiness
- Smoke test checklist:
  - Create deployment record linked to a decision
  - Record outcome signals (improved, then regressed)
  - Observe pre/post comparison panel
  - Evaluate the change (score 4)
  - Confirm regression badge appears on widget show page
  - Confirm antifragility dashboard shows all panels
Exit criteria: All tests pass; consistency sync reports no errors; smoke test completed; SCOPE.md updated.
Phase 4 Dependencies
- Phase 3 schema stable (T01 depends on `decision_records`, `implementation_change_references`, `widgets` from Phase 3)
- `deployment_records` before `outcome_signals` and `change_evaluations` (FK)
- Schema (T01) before all controller work (T02–T08)
- `DeploymentRecord` (T02) before `OutcomeSignal` (T03), comparison (T04), regression (T05), `ChangeEvaluation` (T06)
- All feature tasks (T01–T08) before gate (T09)
Notes
- DeploymentRecord is immutable. No update/delete, same convention as `DecisionRecord`. A mis-recorded deployment should be noted in a new `DeploymentRecord` with a correcting note.
- OutcomeSignal is append-only. DB trigger enforces, same pattern as `InteractionEvent`. Observations are evidence; they cannot be revised.
- ChangeEvaluation is one-per-deployment (UNIQUE constraint). A wrong evaluation cannot be changed; create a new deployment record if you need to re-evaluate a re-deployment.
- Regression detection is a heuristic, not a hard constraint. It is a signal for operator attention, not an automated gate.
- Recurrence tracking uses completed cycles only. A cycle is only counted when: prior candidate accepted → deployment record exists for that decision → new annotation created. Partial cycles (no deployment yet) do not count.
- No ML in Phase 4. All scoring, comparison, and detection is rule-based SQL. Agent-assisted distillation begins in Phase 5.