chore: register Phase 4 workplan (IHUB-WP-0004)
IHF Phase 4 — Outcome Observation and Antifragility Loop. 9 tasks covering DeploymentRecord, OutcomeSignal, ChangeEvaluation, pre/post comparison, regression detection, recurrence tracking, and the antifragility dashboard. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -0,0 +1,387 @@
---
id: IHUB-WP-0004
type: workplan
title: "IHF Phase 4 — Outcome Observation and Antifragility Loop"
domain: inter_hub
repo: inter-hub
status: active
owner: custodian
topic_slug: inter_hub
created: "2026-03-29"
updated: "2026-03-29"
state_hub_workstream_id: "07e9c860-e39b-407f-9c0b-d44989498b48"
---

# IHF Phase 4 — Outcome Observation and Antifragility Loop

## Goal

Close the improvement loop by observing whether implemented changes actually
helped. Phase 3 established the governance layer — requirements, decisions,
policy constraints, and implementation references. Phase 4 connects those
implementation references to deployed versions, captures outcome signals per
widget, compares behaviour before and after a change, and detects regressions
and recurring friction.

## Background

Phase 1 (IHUB-WP-0001) delivered the Minimal Interaction Core. Phase 2
(IHUB-WP-0002) delivered Structured Feedback and Triage. Phase 3
(IHUB-WP-0003) delivered Governance and Decision Linkage. All Phase 3 exit
criteria are met.

Phase 4 is the fourth of eight phases in the IHF specification
(`specs/InteractionHubFrameworkSpecification_v0.1.md`, §14 Phase 4). It
completes the antifragility loop:

```
Widget → InteractionEvent / Annotation
       → RequirementCandidate → Requirement
       → DecisionRecord → ImplementationChangeReference
       → DeploymentRecord
       → OutcomeSignal ← ChangeEvaluation
       → RegressionDetection / RecurrenceTracking
```

**Technology stack:** IHP v1.5 (Haskell, Nix), PostgreSQL, AutoRefresh
(antifragility dashboard). Outcome signals and deployment records are
append-only / immutable by the same conventions as InteractionEvent and
TriageState.

Reference: `docs/ihp-overview.md`, `docs/ihp-data-and-queries.md`,
`docs/ihp-controllers-views-forms.md`, `docs/ihp-realtime.md`.

## Phase 4 Exit Criteria (from IHF spec §14 Phase 4)

- The platform can determine whether a change improved outcomes
- Recurrent friction becomes visible
- The system supports evidence-based UI evolution

## Data Artifacts Introduced (Phase 4)

`DeploymentRecord`, `OutcomeSignal`, `ChangeEvaluation`

---

## Tasks

### T01 — Schema: DeploymentRecord, OutcomeSignal, ChangeEvaluation

```task
id: IHUB-WP-0004-T01
status: todo
priority: high
state_hub_task_id: "4d0aa6d5-f291-4053-a487-8c64627f8271"
```

Add Phase 4 tables to `Application/Schema.sql` and write migration:

```sql
-- Deployment records: immutable by convention (no UPDATE/DELETE; the controller enforces this)
CREATE TABLE deployment_records (
    id UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
    impl_ref_id UUID REFERENCES implementation_change_references(id) ON DELETE SET NULL,
    decision_id UUID NOT NULL REFERENCES decision_records(id) ON DELETE RESTRICT,
    version_ref TEXT NOT NULL,
    deployed_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL,
    deployed_by UUID REFERENCES users(id),
    notes TEXT,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL
);

CREATE INDEX deployment_records_decision_id_idx ON deployment_records (decision_id);
CREATE INDEX deployment_records_deployed_at_idx ON deployment_records (deployed_at DESC);

-- Outcome signals — append-only, no update/delete
CREATE TABLE outcome_signals (
    id UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
    widget_id UUID NOT NULL REFERENCES widgets(id) ON DELETE CASCADE,
    deployment_id UUID NOT NULL REFERENCES deployment_records(id) ON DELETE CASCADE,
    signal_type TEXT NOT NULL,
    value NUMERIC,
    observed_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL
);

CREATE INDEX outcome_signals_widget_id_idx ON outcome_signals (widget_id);
CREATE INDEX outcome_signals_deployment_id_idx ON outcome_signals (deployment_id);
CREATE INDEX outcome_signals_observed_at_idx ON outcome_signals (observed_at DESC);

-- Enforce append-only on outcome_signals
-- Note: row-level triggers also fire for deletes cascaded through foreign keys, so
-- deleting a widget or deployment record that still has signals raises this exception.
CREATE OR REPLACE FUNCTION prevent_outcome_signal_mutation()
RETURNS TRIGGER AS $$
BEGIN
    RAISE EXCEPTION 'outcome_signals is append-only: UPDATE and DELETE are not permitted';
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER outcome_signals_no_update
    BEFORE UPDATE ON outcome_signals
    FOR EACH ROW EXECUTE FUNCTION prevent_outcome_signal_mutation();

CREATE TRIGGER outcome_signals_no_delete
    BEFORE DELETE ON outcome_signals
    FOR EACH ROW EXECUTE FUNCTION prevent_outcome_signal_mutation();

-- Change evaluations — one per deployment
CREATE TABLE change_evaluations (
    id UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
    deployment_id UUID NOT NULL REFERENCES deployment_records(id) ON DELETE CASCADE,
    decision_id UUID REFERENCES decision_records(id) ON DELETE SET NULL,
    score SMALLINT NOT NULL CHECK (score BETWEEN 1 AND 5),
    rationale TEXT NOT NULL,
    evaluated_by UUID REFERENCES users(id),
    evaluated_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL,
    UNIQUE (deployment_id)
);

-- Note: UNIQUE (deployment_id) already creates a unique index on this column,
-- so the explicit index below is redundant.
CREATE INDEX change_evaluations_deployment_id_idx ON change_evaluations (deployment_id);
```

- Valid `outcome_signals.signal_type` values: `improved`, `regressed`, `neutral`, `inconclusive`
- `deployment_records` is immutable (no update/delete — append-only by convention; controller enforces)
- `change_evaluations` has a UNIQUE constraint on `deployment_id` — one evaluation per deployment
- Verify Haskell types are generated correctly
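
The schema above stores `signal_type` as plain `TEXT`, so the four valid values are enforced only by the application. One possible way to enforce them in the database is a pair of `CHECK` constraints. This is an optional sketch, not part of the workplan: the constraint names are hypothetical, and the 0 to 100 range for `value` is taken from T03.

```sql
-- Sketch only: both constraint names are made up for illustration.
ALTER TABLE outcome_signals
    ADD CONSTRAINT outcome_signals_signal_type_check
    CHECK (signal_type IN ('improved', 'regressed', 'neutral', 'inconclusive'));

-- value is optional: a NULL value makes the CHECK expression NULL, which passes,
-- so only non-NULL values are range-checked.
ALTER TABLE outcome_signals
    ADD CONSTRAINT outcome_signals_value_range_check
    CHECK (value BETWEEN 0 AND 100);
```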

**Exit criteria:** `migrate` runs cleanly; all Phase 4 types available in GHCi.

---

### T02 — DeploymentRecord controller and views

```task
id: IHUB-WP-0004-T02
status: todo
priority: high
state_hub_task_id: "4932b036-fe91-4146-9b35-7d3031894c2d"
```

1. Scaffold `DeploymentRecordsController`
2. Actions: index, show, new, create (no update/delete — immutable)
3. Fields: `decisionId` (select — required), `implRefId` (optional select from linked
   impl refs), `versionRef` (free text, e.g. `v1.2.3`, `git:abc1234`), `deployedAt`,
   `notes`
4. Index: table with decision title, version_ref, deployed_at, outcome signal count,
   evaluation score (if present)
5. Show: full detail + linked decision chain (→ requirement → candidate → widget) +
   list of outcome signals + change evaluation panel
6. Add "New Deployment" button on decision show page (only for decisions with at least
   one impl ref)

**Exit criteria:** Deployment records can be created and viewed; linked back to the
full decision → requirement → widget chain.

---

### T03 — OutcomeSignal capture

```task
id: IHUB-WP-0004-T03
status: todo
priority: high
state_hub_task_id: "8b39bbb3-4129-4acc-97ac-38ecfcfd7c88"
```

1. `RecordOutcomeSignalAction { deploymentId }` (POST from deployment show page or
   widget show page)
2. Fields: `signalType` (select: improved/regressed/neutral/inconclusive),
   `value` (optional 0–100), `observedAt` (default now)
3. Append-only — no edit/delete in UI (DB trigger enforces)
4. List signals on deployment show page ordered by `observedAt` DESC
5. List signals on widget show page (last 10, across all deployments)
6. Signal type color roles:
   - `improved` → green
   - `regressed` → red
   - `neutral` → gray
   - `inconclusive` → yellow/amber
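
The append-only constraint from item 3 can be checked by hand before the integration tests in T09 exist. The following is a minimal sketch of such a check, assuming the T01 triggers are installed and at least one row exists in `outcome_signals`. It is meant for a scratch database and is not the project's test suite.

```sql
-- Sketch: manual check that the T01 triggers reject UPDATE and DELETE.
DO $$
DECLARE
    sig_id  UUID;
    blocked BOOLEAN := FALSE;
BEGIN
    -- STRICT raises NO_DATA_FOUND if the table is empty.
    SELECT id INTO STRICT sig_id FROM outcome_signals LIMIT 1;

    BEGIN
        UPDATE outcome_signals SET signal_type = 'neutral' WHERE id = sig_id;
    EXCEPTION WHEN raise_exception THEN
        blocked := TRUE;  -- the trigger raised, as intended
    END;
    ASSERT blocked, 'UPDATE on outcome_signals was not blocked';

    blocked := FALSE;
    BEGIN
        DELETE FROM outcome_signals WHERE id = sig_id;
    EXCEPTION WHEN raise_exception THEN
        blocked := TRUE;
    END;
    ASSERT blocked, 'DELETE on outcome_signals was not blocked';

    RAISE NOTICE 'outcome_signals rejected both UPDATE and DELETE';
END;
$$;
```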

**Exit criteria:** Outcome signals can be recorded from deployment and widget pages;
append-only constraint verified; color roles applied consistently.

---

### T04 — Pre/post comparison: interaction behaviour before and after deployment

```task
id: IHUB-WP-0004-T04
status: todo
priority: high
state_hub_task_id: "27c4de52-755a-40e7-bef3-986fe4470f7c"
```

1. `ComparisonAction { deploymentId }` — rendered on deployment show page as a panel
2. Time windows: 30 days before `deployed_at` vs 30 days after (or until present)
3. Metrics computed via SQL aggregates (no in-memory processing for large datasets):
   - Interaction event count by type
   - Total annotation count
   - Annotation severity distribution (low/medium/high/critical counts)
   - High/critical annotation rate (high+critical / total)
4. Render side-by-side comparison table: Before | After | Delta
5. Delta column: green if annotation rate decreased, red if increased, gray if flat
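
The annotation metrics from item 3 could be computed in a single aggregate query along the following lines. This is a sketch under stated assumptions: the `annotations` table and its `widget_id`, `severity`, and `created_at` columns come from earlier phases and are not specified in this workplan, and the statement name `annotation_comparison` is hypothetical. The widget is passed as a parameter because the join path from a deployment to its widget is not spelled out here.

```sql
-- ASSUMPTION: annotations(widget_id UUID, severity TEXT, created_at TIMESTAMPTZ).
-- $1 = deployment id, $2 = widget id.
PREPARE annotation_comparison(UUID, UUID) AS
WITH d AS (
    SELECT deployed_at FROM deployment_records WHERE id = $1
),
windowed AS (
    SELECT
        CASE WHEN a.created_at < d.deployed_at THEN 'before' ELSE 'after' END AS bucket,
        a.severity
    FROM annotations a
    CROSS JOIN d
    WHERE a.widget_id = $2
      AND a.created_at >= d.deployed_at - INTERVAL '30 days'
      AND a.created_at <  d.deployed_at + INTERVAL '30 days'
)
SELECT
    bucket,
    COUNT(*)                                      AS total_annotations,
    COUNT(*) FILTER (WHERE severity = 'low')      AS low_count,
    COUNT(*) FILTER (WHERE severity = 'medium')   AS medium_count,
    COUNT(*) FILTER (WHERE severity = 'high')     AS high_count,
    COUNT(*) FILTER (WHERE severity = 'critical') AS critical_count,
    ROUND(
        COUNT(*) FILTER (WHERE severity IN ('high', 'critical'))::NUMERIC
        / NULLIF(COUNT(*), 0), 3
    )                                             AS high_critical_rate
FROM windowed
GROUP BY bucket;
```

A bucket with no annotations returns no row at all, which the view can render as "—".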

**Exit criteria:** Comparison panel renders correct before/after counts; delta
direction color-coded correctly; works with no post-deployment data (shows "—").

---

### T05 — Regression detection

```task
id: IHUB-WP-0004-T05
status: todo
priority: high
state_hub_task_id: "844a828b-b7d7-4822-becc-b377c08c673a"
```

1. A **regression** is defined as: a widget that has an `OutcomeSignal(improved)`
   for a deployment, followed by a new `Annotation(severity IN ['high','critical'])`
   created more than 1 day after the signal's `observed_at` (grace period)
2. `RegressionQuery` (pure SQL query, no controller action — used by dashboard and
   widget show page):
   ```sql
   -- widgets with improved signal then subsequent high/critical annotation
   ```
3. Surface on widget show page: regression warning badge if widget is in regression
4. Surface on governance dashboard: regression count in KPI row, list of regressed
   widgets
5. Surface on antifragility dashboard (T08): prominent regression alert panel
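
The `RegressionQuery` placeholder in item 2 could be filled in roughly as follows. This is one possible reading of the definition in item 1, not a settled implementation. It assumes an `annotations` table with `widget_id`, `severity`, and `created_at` columns, which this workplan does not specify.

```sql
-- ASSUMPTION: annotations(widget_id, severity, created_at); column names are illustrative.
-- Returns one row per (widget, deployment) that had an 'improved' signal followed by
-- a high/critical annotation more than 1 day after the signal was observed.
SELECT
    os.widget_id,
    os.deployment_id,
    MIN(a.created_at) AS first_regressing_annotation_at
FROM outcome_signals os
JOIN annotations a
  ON a.widget_id = os.widget_id
 AND a.severity IN ('high', 'critical')
 AND a.created_at > os.observed_at + INTERVAL '1 day'  -- grace period
WHERE os.signal_type = 'improved'
GROUP BY os.widget_id, os.deployment_id
ORDER BY first_regressing_annotation_at;
```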

**Exit criteria:** Regression query returns correct results; badge visible on
affected widgets; count accurate on dashboard.

---

### T06 — ChangeEvaluation: score changes by observed effect

```task
id: IHUB-WP-0004-T06
status: todo
priority: medium
state_hub_task_id: "391c6136-baea-417a-9291-5ba9f633e03f"
```

1. `EvaluateChangeAction { deploymentId }` (POST from deployment show page)
2. Idempotent: if `change_evaluations.deployment_id` already exists, redirect to
   deployment show page with "Already evaluated" message
3. Fields: `score` (1–5, required), `rationale` (textarea, required)
4. Show evaluation on deployment show page: score as ★ stars + rationale
5. Show evaluation summary on decision show page: "Deployment evaluated: ★★★★☆"
6. Score 1–2 → red, 3 → yellow, 4–5 → green (in all views)
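
The idempotency rule in item 2 can lean on the `UNIQUE (deployment_id)` constraint from T01 instead of a separate existence check. A minimal sketch, assuming the controller issues raw SQL; the statement name `evaluate_change` is hypothetical:

```sql
-- $1 = deployment id, $2 = score, $3 = rationale, $4 = evaluating user id.
PREPARE evaluate_change(UUID, SMALLINT, TEXT, UUID) AS
INSERT INTO change_evaluations (deployment_id, decision_id, score, rationale, evaluated_by)
SELECT d.id, d.decision_id, $2, $3, $4
FROM deployment_records d
WHERE d.id = $1
ON CONFLICT (deployment_id) DO NOTHING
RETURNING id;
```

An empty result means nothing was inserted: either the deployment was already evaluated (the "Already evaluated" redirect) or the deployment id does not exist, so the caller still has to tell those two cases apart.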

**Exit criteria:** One evaluation per deployment; idempotent; score displayed with
correct color in all views.

---

### T07 — Recurrence tracking: detect repeated unresolved friction

```task
id: IHUB-WP-0004-T07
status: todo
priority: medium
state_hub_task_id: "6a5eea23-4e73-441a-ab93-49f5b76bf3ea"
```

1. A **recurrence** is: a widget that has had 2 or more `RequirementCandidate`s
   created in separate decision cycles (a new cycle begins after a prior candidate
   for the same widget was `accepted` and a `DeploymentRecord` exists for that
   decision)
2. `RecurrenceQuery` (SQL): per widget, count completed cycles and flag widgets
   with cycle_count ≥ 2
3. Widget show page: recurrence count badge ("⟳ 3 cycles") if cycle_count ≥ 2
4. Antifragility dashboard: recurrence leaderboard — top 10 widgets by cycle count,
   sortable
5. Recurrence is informational only — no automated blocking
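
A rough sketch of `RecurrenceQuery` for the leaderboard follows. It counts a completed cycle as an accepted candidate whose decision has at least one deployment record. The join columns between candidates, requirements, and decisions are guesses, since the Phase 2 and Phase 3 schemas are not reproduced in this workplan, and would need to be adapted to the real schema.

```sql
-- ASSUMPTIONS (none of these columns are specified in this workplan):
--   requirement_candidates(id, widget_id, status)
--   requirements(id, candidate_id)
--   decision_records(id, requirement_id)
SELECT
    rc.widget_id,
    COUNT(DISTINCT rc.id) AS cycle_count
FROM requirement_candidates rc
JOIN requirements r      ON r.candidate_id = rc.id
JOIN decision_records dr ON dr.requirement_id = r.id
WHERE rc.status = 'accepted'
  AND EXISTS (
      SELECT 1 FROM deployment_records d WHERE d.decision_id = dr.id
  )
GROUP BY rc.widget_id
HAVING COUNT(DISTINCT rc.id) >= 2  -- only widgets that qualify as recurrent
ORDER BY cycle_count DESC
LIMIT 10;                          -- leaderboard size from item 4
```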

**Exit criteria:** Recurrence count accurate per widget; leaderboard renders; badge
visible on widget show page.

---

### T08 — Antifragility dashboard (AutoRefresh)

```task
id: IHUB-WP-0004-T08
status: todo
priority: high
state_hub_task_id: "e5c65c77-c757-49a6-a8e9-99c8d3503f59"
```

1. Add `AntifragilityDashboardAction { hubId }` to `HubsController` wrapped with
   `autoRefresh do`
2. Dashboard panels:
   - **KPI row**: total deployments / avg evaluation score / % improved signals /
     regression count
   - **Open gaps**: decisions with impl refs but no deployment record yet
   - **Recent deployments** (last 20): version_ref, decision title, signal summary,
     evaluation score
   - **Regression alerts**: widgets currently in regression state
   - **Recurrence leaderboard**: top 10 widgets by cycle count
3. Link from hub Show page alongside Triage Dashboard and Governance Dashboard
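
Three of the four KPI values can be derived from the Phase 4 tables alone. The sketch below uses only columns defined in T01. It leaves out the regression count, which depends on the regression query from T05, and it is not scoped by `hubId`, because the path from a deployment to its hub is not specified in this workplan.

```sql
-- Sketch: global KPI values; hub scoping and regression count are intentionally omitted.
SELECT
    (SELECT COUNT(*) FROM deployment_records)             AS total_deployments,
    (SELECT ROUND(AVG(score), 2) FROM change_evaluations) AS avg_evaluation_score,
    (SELECT ROUND(
                100.0 * COUNT(*) FILTER (WHERE signal_type = 'improved')
                / NULLIF(COUNT(*), 0), 1)
       FROM outcome_signals)                              AS pct_improved_signals;
```

With no evaluations or no signals, the corresponding value is `NULL` rather than zero.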

**Exit criteria:** Dashboard live-updates on deployment/signal/evaluation changes.
All five panels render with correct data.

---

### T09 — Phase 4 gate: tests, consistency, docs

```task
id: IHUB-WP-0004-T09
status: todo
priority: high
state_hub_task_id: "1dda0a32-4913-4007-a9f4-1d86761a8cf1"
```

1. **Integration tests** (`Test/`):
   - DeploymentRecord create + link to decision
   - OutcomeSignal append-only (DB trigger fires on update/delete)
   - Pre/post comparison: correct counts with known fixture data
   - Regression detection: widget with improved signal + subsequent high annotation
   - ChangeEvaluation create + idempotent (second create → duplicate rejection)
   - Recurrence count: widget with 2 completed cycles
   - Antifragility dashboard data fetch: compiles and returns correct counts
2. **Consistency sync** via State Hub MCP:
   `check_repo_consistency(repo_slug="inter-hub", fix=True)`
3. **Documentation updates:**
   - Update `SCOPE.md` current state section: Phase 4 complete
   - Write `docs/phase4-summary.md`: what was built, known limitations, Phase 5
     readiness
4. **Smoke test checklist:**
   - Create deployment record linked to a decision
   - Record outcome signals (improved, then regressed)
   - Observe pre/post comparison panel
   - Evaluate the change (score 4)
   - Confirm regression badge appears on widget show page (per T05, this requires a
     high/critical annotation created more than 1 day after the `improved` signal)
   - Confirm antifragility dashboard shows all panels

**Exit criteria:** All tests pass; consistency sync reports no errors; smoke test
completed; SCOPE.md updated.

---

## Phase 4 Dependencies

- Phase 3 schema stable (T01 depends on `decision_records`,
  `implementation_change_references`, `widgets` from Phase 3)
- `deployment_records` before `outcome_signals` and `change_evaluations` (FK)
- Schema (T01) before all controller work (T02–T08)
- `DeploymentRecord` (T02) before `OutcomeSignal` (T03), comparison (T04),
  regression (T05), `ChangeEvaluation` (T06)
- All feature tasks (T01–T08) before gate (T09)

## Notes

- **DeploymentRecord is immutable.** No update/delete — same convention as
  `DecisionRecord`. A mis-recorded deployment should be noted in a new
  `DeploymentRecord` with a correcting note.
- **OutcomeSignal is append-only.** DB trigger enforces — same pattern as
  `InteractionEvent`. Observations are evidence; they cannot be revised.
- **ChangeEvaluation is one-per-deployment** (UNIQUE constraint). A wrong
  evaluation cannot be changed — create a new deployment record if you need to
  re-evaluate a re-deployment.
- **Regression detection is a heuristic, not a hard constraint.** It is a signal
  for operator attention, not an automated gate.
- **Recurrence tracking uses completed cycles only.** A cycle is only counted
  when: prior candidate accepted → deployment record exists for that decision →
  new annotation created. Partial cycles (no deployment yet) do not count.
- **No ML in Phase 4.** All scoring, comparison, and detection is rule-based SQL.
  Agent-assisted distillation begins in Phase 5.