generated from coulomb/repo-seed
Some checks failed
Test / test (push) Has been cancelled
Closes the IHF improvement loop. Full antifragility chain now traversable: Widget → Annotation → Candidate → Requirement → Decision → Deployment → OutcomeSignal New artifacts: - DeploymentRecord (immutable, links DecisionRecord to a deployed version) - OutcomeSignal (append-only; DB trigger prevents UPDATE/DELETE) - ChangeEvaluation (one-per-deployment; UNIQUE constraint; 1–5 score) New capabilities: - DeploymentRecordsController (index, show, new, create) - RecordOutcomeSignalAction — capture improved/regressed/neutral/inconclusive signals - Pre/post comparison panel on deployment show (±30-day event/annotation counts) - Regression detection — improved signal followed by high/critical annotation - ChangeEvaluation — idempotent score+rationale per deployment - Recurrence tracking — cycle count per widget, leaderboard - AntifragilityDashboardAction (autoRefresh, 5 panels) per hub - Phase 4 integration tests (T01–T08 logic coverage) - docs/phase4-summary.md; SCOPE.md updated to Phase 4 complete State Hub: workstream 07e9c860 → completed Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
82 lines
4.7 KiB
Markdown
82 lines
4.7 KiB
Markdown
# Phase 4 Summary — Outcome Observation and Antifragility Loop
|
||
|
||
**Workplan:** IHUB-WP-0004
|
||
**Completed:** 2026-03-29
|
||
**Phase:** 4 of 8 in the IHF specification
|
||
|
||
---
|
||
|
||
## What Was Built
|
||
|
||
Phase 4 closes the IHF improvement loop by connecting deployed versions to observed outcomes. The full chain is now traversable:
|
||
|
||
```
|
||
Widget → InteractionEvent / Annotation
|
||
→ RequirementCandidate → Requirement
|
||
→ DecisionRecord → ImplementationChangeReference
|
||
→ DeploymentRecord
|
||
→ OutcomeSignal ← ChangeEvaluation
|
||
→ RegressionDetection / RecurrenceTracking
|
||
```
|
||
|
||
### T01 — Schema
|
||
|
||
Three new tables:
|
||
- **`deployment_records`** — immutable link from a decision to a deployed version (`version_ref`, `deployed_at`, `deployed_by`, `notes`)
|
||
- **`outcome_signals`** — append-only observations of widget behaviour post-deployment (`signal_type`, `value`, `observed_at`); PostgreSQL trigger prevents UPDATE/DELETE
|
||
- **`change_evaluations`** — one score (1–5) per deployment with rationale; UNIQUE constraint on `deployment_id`
|
||
|
||
### T02 — DeploymentRecords Controller and Views
|
||
|
||
`DeploymentRecordsController` with index, show, new, create (no update/delete — immutable by convention). Decision show page includes a "New Deployment" button (gated on having at least one implementation reference). Deployment show page renders the full decision chain: Decision → ImplRef → Requirement → Candidate → Widget.
|
||
|
||
### T03 — OutcomeSignal Capture
|
||
|
||
`RecordOutcomeSignalAction` (POST from deployment show page). Signal types: `improved`, `regressed`, `neutral`, `inconclusive` with color coding (green/red/gray/yellow). Widget show page lists the last 10 signals across all deployments.
|
||
|
||
### T04 — Pre/Post Comparison
|
||
|
||
Comparison panel on the deployment show page. Computes interaction event counts and annotation severity distribution for the 30-day window before vs. after `deployed_at`. Delta column: green if annotation rate decreased, red if increased. Works with no post-deployment data (shows "—").
|
||
|
||
### T05 — Regression Detection
|
||
|
||
`regressedWidgetIds` pure function in `Application.Helper.Controller`. A regression is: any widget with an `OutcomeSignal(improved)` followed (> 1 day later) by a new `Annotation(severity=high|critical)`. Regression badge on widget show page; regression count and widget list on governance dashboard; prominent alerts panel on antifragility dashboard.
|
||
|
||
### T06 — ChangeEvaluation
|
||
|
||
`EvaluateChangeAction` (POST). Idempotent — second attempt on same deployment redirects with "Already evaluated" message. Score 1–5 rendered as ★ stars with color (1–2 red, 3 yellow, 4–5 green). Evaluation summary shown on decision show page alongside each deployment row.
|
||
|
||
### T07 — Recurrence Tracking
|
||
|
||
`widgetCycleCounts` function computes completed improvement cycles per widget (cycle = accepted candidate → requirement → decision → deployment → new candidate). Cycle count badge ("⟳ N cycles") on widget show page for cycle_count ≥ 2. Top-10 leaderboard on antifragility dashboard.
|
||
|
||
### T08 — Antifragility Dashboard
|
||
|
||
`AntifragilityDashboardAction` wrapped with `autoRefresh do`. Five panels:
|
||
1. **KPI row**: total deployments / avg evaluation score / % improved signals / regression count
|
||
2. **Regression alerts**: widgets currently in regression (red panel, links to widget show pages)
|
||
3. **Open gaps**: decisions with impl refs but no deployment record yet
|
||
4. **Recent deployments** (last 20): version ref, decision title, signal dots, evaluation score
|
||
5. **Recurrence leaderboard**: top 10 widgets by cycle count
|
||
|
||
Linked from hub show page (green "Antifragility" button) and governance dashboard.
|
||
|
||
---
|
||
|
||
## Known Limitations
|
||
|
||
- **Pre/post comparison uses Haskell-side filtering**, not SQL aggregates. For production use with large datasets, replace `computeComparison` with `[typedSql|...|]` queries (IHP v1.5 typed quasiquoter).
|
||
- **Regression detection is a heuristic**. It detects any high/critical annotation after an improved signal — it does not distinguish whether the annotation relates to the same aspect of the widget that was improved.
|
||
- **Cycle count requires a strict data chain**. Cycles are only counted when the full candidate→requirement→decision→deployment chain exists. Partial cycles (e.g., a decision without a deployment) are not counted.
|
||
- **No ML in Phase 4**. All scoring and detection is rule-based. Agent-assisted distillation begins in Phase 5.
|
||
|
||
---
|
||
|
||
## Phase 5 Readiness
|
||
|
||
Phase 5 (Agent-Assisted Distillation) can now build on:
|
||
- `OutcomeSignal` as evidence input for agent distillation
|
||
- `ChangeEvaluation` scores as feedback signal for agent learning
|
||
- `RegressionDetection` results as priority signal for agent attention
|
||
- The full traceability chain from widget to outcome is traversable in one hop
|