inter-hub/workplans/IHUB-WP-0005-ihf-phase5-agent-assisted-distillation.md
Bernd Worsch a7ee8d414e
chore(consistency): sync task status from DB [auto]
Marks all Phase 5 tasks and workplan done in file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 15:55:06 +00:00

---
id: IHUB-WP-0005
type: workplan
title: "IHF Phase 5 — Agent-Assisted Distillation and Suggestion"
domain: inter_hub
repo: inter-hub
status: done
owner: custodian
topic_slug: inter_hub
created: "2026-03-29"
updated: "2026-03-29"
state_hub_workstream_id: "535a6479-9852-4386-8ad0-f86397a018c5"
---
# IHF Phase 5 — Agent-Assisted Distillation and Suggestion
## Goal
Introduce bounded AI support into the interaction-governance loop. Phase 4
established the antifragility loop — outcome signals, regression detection,
recurrence tracking. Phase 5 adds AI capability to reduce the cognitive burden
on human reviewers: cluster summaries, drafted requirement candidates, duplicate
detection, policy-sensitivity flagging, and implementation path proposals.
All AI outputs are **attributable** (model_ref recorded), **reviewable**
(AgentReviewRecord), and **reversible** (proposals can be rejected). No
autonomous final decisions. No silent requirement promotion.
## Background
Phase 1 (IHUB-WP-0001) delivered the Minimal Interaction Core. Phase 2
(IHUB-WP-0002) delivered Structured Feedback and Triage. Phase 3
(IHUB-WP-0003) delivered Governance and Decision Linkage. Phase 4
(IHUB-WP-0004) delivered Outcome Observation and Antifragility. All Phase 4
exit criteria are met.
Phase 5 is the fifth of eight phases in the IHF specification
(`specs/InteractionHubFrameworkSpecification_v0.1.md`, §14 Phase 5).
**Technology stack:** IHP v1.5 (Haskell, Nix), PostgreSQL, Anthropic API
(claude-sonnet-4-6). Agent calls are synchronous HTTP in controller actions.
API key from `IHP_ANTHROPIC_API_KEY` environment variable.
Reference: `docs/ihp-overview.md`, `docs/ihp-data-and-queries.md`,
`docs/ihp-controllers-views-forms.md`, `docs/ihp-realtime.md`.
## Phase 5 Exit Criteria (from IHF spec §14 Phase 5)
- AI assistance reduces triage and synthesis burden
- Human reviewers remain in control
- AI outputs are auditable and attributable
## Data Artifacts Introduced (Phase 5)
`AgentProposal`, `AgentReviewRecord`, `ConfidenceAnnotation`
---
## Tasks
### T01 — Schema: AgentProposal, AgentReviewRecord, ConfidenceAnnotation
```task
id: IHUB-WP-0005-T01
status: done
priority: high
state_hub_task_id: "6e1a9d31-a7e9-4d71-a726-44eaf739371c"
```
Add Phase 5 tables to `Application/Schema.sql` and write migration:
```sql
CREATE TABLE agent_proposals (
id UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
proposal_type TEXT NOT NULL,
-- proposal_type values: summary | requirement_draft | duplicate_flag |
-- policy_flag | impl_proposal
source_widget_id UUID REFERENCES widgets(id) ON DELETE SET NULL,
source_candidate_id UUID REFERENCES requirement_candidates(id) ON DELETE SET NULL,
source_thread_id UUID REFERENCES annotation_threads(id) ON DELETE SET NULL,
source_decision_id UUID REFERENCES decision_records(id) ON DELETE SET NULL,
content TEXT NOT NULL,
model_ref TEXT NOT NULL, -- e.g. "claude-sonnet-4-6"
confidence NUMERIC CHECK (confidence BETWEEN 0 AND 1),
status TEXT NOT NULL DEFAULT 'pending',
-- status values: pending | accepted | rejected | superseded
created_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL
);
CREATE INDEX agent_proposals_proposal_type_idx ON agent_proposals (proposal_type);
CREATE INDEX agent_proposals_status_idx ON agent_proposals (status);
CREATE INDEX agent_proposals_source_widget_id_idx ON agent_proposals (source_widget_id);
CREATE INDEX agent_proposals_created_at_idx ON agent_proposals (created_at DESC);
-- One review record per proposal (human decision on AI output)
CREATE TABLE agent_review_records (
id UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
proposal_id UUID NOT NULL REFERENCES agent_proposals(id) ON DELETE CASCADE,
reviewer_id UUID REFERENCES users(id),
decision TEXT NOT NULL, -- accepted | rejected | modified
notes TEXT,
reviewed_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL,
UNIQUE (proposal_id)
);
CREATE INDEX agent_review_records_proposal_id_idx ON agent_review_records (proposal_id);
-- Confidence annotations — per-dimension breakdown of AI confidence
CREATE TABLE confidence_annotations (
id UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
proposal_id UUID NOT NULL REFERENCES agent_proposals(id) ON DELETE CASCADE,
dimension TEXT NOT NULL,
-- dimension values: accuracy | relevance | completeness | policy_alignment
score NUMERIC NOT NULL CHECK (score BETWEEN 0 AND 1),
explanation TEXT,
created_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL
);
CREATE INDEX confidence_annotations_proposal_id_idx ON confidence_annotations (proposal_id);
```
- `agent_proposals.status` transitions: `pending → accepted | rejected | superseded`
- `agent_review_records` has UNIQUE on `proposal_id` — one review per proposal
- Add `IHP_ANTHROPIC_API_KEY` to CLAUDE.md required env vars table
**Exit criteria:** `migrate` runs cleanly; all Phase 5 types available in GHCi.
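The status transition rule above can be sketched as a small pure check. This is a hedged illustration only: `ProposalStatus`, `parseStatus`, and `canTransition` are hypothetical names, not the types IHP generates from the schema.

```haskell
-- Hypothetical Haskell mirror of agent_proposals.status (not IHP-generated code).
data ProposalStatus = Pending | Accepted | Rejected | Superseded
  deriving (Eq, Show)

-- Parse the TEXT value stored in the status column.
parseStatus :: String -> Maybe ProposalStatus
parseStatus "pending"    = Just Pending
parseStatus "accepted"   = Just Accepted
parseStatus "rejected"   = Just Rejected
parseStatus "superseded" = Just Superseded
parseStatus _            = Nothing

-- Only a pending proposal may move; every other status is treated as terminal.
canTransition :: ProposalStatus -> ProposalStatus -> Bool
canTransition Pending to = to /= Pending
canTransition _       _  = False
```

A controller could call such a check before writing a new status, so that an accepted or rejected proposal is never reopened.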
---
### T02 — AgentProposalsController and views
```task
id: IHUB-WP-0005-T02
status: done
priority: high
state_hub_task_id: "5a9b9d51-dcdf-4dad-b449-08a7091e4563"
```
1. Scaffold `AgentProposalsController`
2. Actions: `index`, `show`, `AcceptProposalAction`, `RejectProposalAction`
(no update/delete — proposals are immutable audit artifacts)
3. Index: table filterable by `proposal_type` and `status`:
- type badge (color per type), source widget name, confidence bar, status,
created_at
4. Show: content panel + confidence dimension breakdown + review form
(decision select + notes textarea) + attribution footer (model_ref, created_at)
5. After accept/reject: redirect to proposal show with success message
6. Accept/reject is idempotent — if review record already exists, redirect
with "Already reviewed" message
7. Add nav link "Agent" to global nav in `FrontController.hs`
**Proposal type colors:**
- `summary` → blue
- `requirement_draft` → indigo
- `duplicate_flag` → orange
- `policy_flag` → red
- `impl_proposal` → green
**Exit criteria:** Proposals can be listed, viewed, accepted, and rejected;
review record created correctly; idempotent guard works.
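The idempotent guard in step 6 can be modelled without a database, as in the sketch below. It assumes review records are held in a plain list; all type and function names are hypothetical, and the "Review recorded" text is a placeholder for whatever success message the controller uses.

```haskell
import Data.List (find)

-- Hypothetical in-memory stand-ins for agent_review_records rows.
data Decision = Accepted | Rejected deriving (Eq, Show)

data ReviewRecord = ReviewRecord
  { proposalId :: String   -- UUID rendered as text, for illustration
  , decision   :: Decision
  } deriving (Eq, Show)

data ReviewOutcome
  = Recorded [ReviewRecord]       -- a new record was added
  | AlreadyReviewed ReviewRecord  -- the existing record is left untouched
  deriving (Eq, Show)

-- Record a decision unless the proposal already has a review record.
recordReview :: String -> Decision -> [ReviewRecord] -> ReviewOutcome
recordReview pid d existing =
  case find ((== pid) . proposalId) existing of
    Just r  -> AlreadyReviewed r
    Nothing -> Recorded (ReviewRecord pid d : existing)

-- Flash message shown after the redirect (success text is a placeholder).
flashFor :: ReviewOutcome -> String
flashFor (Recorded _)        = "Review recorded"
flashFor (AlreadyReviewed _) = "Already reviewed"
```

In the real controller the UNIQUE constraint on `proposal_id` is the final safeguard; a lookup like this only decides which message to show.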
---
### T03 — Cluster summarization via Claude API
```task
id: IHUB-WP-0005-T03
status: done
priority: high
state_hub_task_id: "630d8d95-a009-4406-82e2-27f62fabcd3c"
```
1. `SummarizeClusterAction { widgetId }` (POST from widget show page)
2. Fetch last 20 annotations + annotation threads for the widget
3. Build prompt:
- **System:** "You are a distillation assistant for a governed interaction
hub. Summarize the following user feedback cluster into a concise,
actionable summary (2-4 sentences). Be factual and neutral."
- **User:** formatted annotation text
4. Call Anthropic Messages API (`claude-sonnet-4-6`, `max_tokens=300`)
5. Create `AgentProposal`:
- `proposal_type = "summary"`
- `source_widget_id = widgetId`
- `model_ref = "claude-sonnet-4-6"`
- `content = response text`
- `confidence = NULL` (summaries have no numeric confidence)
6. On widget show page: "Summarize Feedback" button → POST; after POST
redirect back, show latest summary proposal in a collapsible panel
7. HTTP client: use `http-conduit` + `aeson`; API key from
`IHP_ANTHROPIC_API_KEY` env var (error if absent)
8. Handle API errors gracefully — set error flash, redirect back
**Exit criteria:** Summary proposal created and visible on widget show page;
API errors produce a user-visible error message (not a 500).
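Steps 2 to 4 fix most of the request parameters, so they can be assembled by a pure function before any HTTP call is made. The sketch below is an assumption-laden illustration: `SummaryRequest`, `lastN`, and `buildSummaryRequest` are hypothetical names, annotations are assumed to arrive oldest first, and the numbered-line format of the user message is one possible choice.

```haskell
import Data.List (intercalate)

-- Hypothetical container for the request parameters fixed by the steps above.
data SummaryRequest = SummaryRequest
  { model        :: String
  , maxTokens    :: Int
  , systemPrompt :: String
  , userMessage  :: String
  } deriving (Eq, Show)

-- Keep only the last n items of a list (assumed to be ordered oldest first).
lastN :: Int -> [a] -> [a]
lastN n xs = drop (length xs - n) xs

-- Build the request from annotation bodies; the numbering is an illustrative format.
buildSummaryRequest :: [String] -> SummaryRequest
buildSummaryRequest annotations = SummaryRequest
  { model        = "claude-sonnet-4-6"
  , maxTokens    = 300
  , systemPrompt =
      "You are a distillation assistant for a governed interaction hub. "
        ++ "Summarize the following user feedback cluster into a concise, "
        ++ "actionable summary (2-4 sentences). Be factual and neutral."
  , userMessage  = intercalate "\n" (zipWith line [1 :: Int ..] (lastN 20 annotations))
  }
  where
    line i body = show i ++ ". " ++ body
```

Keeping prompt assembly separate from the HTTP call makes the 20-annotation cap and the prompt text testable without network access.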
---
### T04 — AI-drafted requirement candidate
```task
id: IHUB-WP-0005-T04
status: done
priority: high
state_hub_task_id: "4c9d23f7-1744-48c8-90b5-71854d9b7daf"
```
1. `DraftRequirementAction { widgetId }` (POST from widget show page)
2. Fetch last 20 annotations for the widget
3. Build prompt:
- **System:** "You are a requirements analyst. Given these friction
annotations, draft a single structured requirement candidate. Respond
with JSON: {\"title\": \"...\", \"description\": \"...\"}."
- **User:** formatted annotation text
4. Parse JSON response; create `AgentProposal`:
- `proposal_type = "requirement_draft"`
- `content = raw JSON string from response`
5. On `AcceptProposalAction` for a `requirement_draft` proposal:
- Parse `content` as JSON
- Create `RequirementCandidate`:
- `title` and `description` from parsed JSON
- `source_widget_id = proposal.source_widget_id`
- `category = "friction"`
- `status = "open"`
- Update proposal `status = "accepted"`
- Create `AgentReviewRecord` (decision = "accepted")
- Set success flash: "Requirement candidate created from AI draft"
6. "Draft Requirement" button on widget show page (only when ≥ 3 annotations)
**Exit criteria:** Draft proposal created; acceptance creates a
`RequirementCandidate`; review record present; no candidate created without
human acceptance.
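The acceptance rule in step 5 can be sketched as a pure function that either refuses or returns the updated proposal together with the new candidate. The records here are hypothetical plain types standing in for the IHP-generated ones, and the draft is assumed to have been decoded from the JSON `content` already.

```haskell
-- Hypothetical plain records; the real ones would be IHP-generated types.
data Draft = Draft { draftTitle :: String, draftDescription :: String }
  deriving (Eq, Show)

data Proposal = Proposal
  { proposalType   :: String
  , sourceWidgetId :: Maybe String
  , status         :: String
  } deriving (Eq, Show)

data Candidate = Candidate
  { title           :: String
  , description     :: String
  , widgetId        :: Maybe String
  , category        :: String
  , candidateStatus :: String
  } deriving (Eq, Show)

-- Accepting a pending requirement_draft yields the updated proposal and the new candidate.
acceptDraft :: Proposal -> Draft -> Either String (Proposal, Candidate)
acceptDraft p d
  | proposalType p /= "requirement_draft" = Left "Not a requirement draft"
  | status p /= "pending"                 = Left "Already reviewed"
  | otherwise = Right
      ( p { status = "accepted" }
      , Candidate (draftTitle d) (draftDescription d) (sourceWidgetId p) "friction" "open"
      )

-- The "Draft Requirement" button is shown only with at least 3 annotations.
showDraftButton :: Int -> Bool
showDraftButton annotationCount = annotationCount >= 3
```

Because a candidate is only ever produced by the `Right` branch, nothing is created unless a reviewer triggers acceptance on a pending draft.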
---
### T05 — Duplicate candidate detection
```task
id: IHUB-WP-0005-T05
status: done
priority: medium
state_hub_task_id: "969b7c7f-c3ba-4892-a1d0-faedf536d1c6"
```
1. `DetectDuplicatesAction { requirementCandidateId }` (POST from candidate
show page)
2. Fetch the candidate + all other `RequirementCandidate` records
3. Build prompt:
- **System:** "You are a deduplication assistant. Given a target candidate
and a list of existing candidates, identify likely duplicates. Respond
with JSON: {\"duplicates\": [{\"id\": \"uuid\", \"reason\": \"...\"}]}."
- **User:** target candidate + list of candidates (id, title, description)
4. Parse response; create `AgentProposal`:
- `proposal_type = "duplicate_flag"`
- `source_candidate_id = requirementCandidateId`
- `content = JSON string of duplicates array`
5. On candidate show page: "Check Duplicates" button → POST; show "Possible
Duplicates" panel with links to flagged candidates and rationale
6. Informational only — no automated merging or status changes
**Exit criteria:** Duplicate proposal created; flagged candidates rendered as
links on candidate show page; empty duplicates array handled gracefully.
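Rendering the "Possible Duplicates" panel, including the empty case, might look like the sketch below. `DuplicateFlag`, `validFlags`, and `panelLines` are hypothetical names. The filter that drops self-references and unknown ids is an added assumption, not a step in this workplan, and the fallback text is a placeholder.

```haskell
-- Hypothetical decoded form of one entry in the "duplicates" array.
data DuplicateFlag = DuplicateFlag { dupId :: String, reason :: String }
  deriving (Eq, Show)

-- Drop flags that point at the target itself or at ids not in the known set.
-- (This sanity filter is an assumption, not a step in the workplan.)
validFlags :: String -> [String] -> [DuplicateFlag] -> [DuplicateFlag]
validFlags targetId knownIds =
  filter (\f -> dupId f /= targetId && dupId f `elem` knownIds)

-- One display line per flagged candidate, or a fallback for an empty array.
panelLines :: [DuplicateFlag] -> [String]
panelLines []    = ["No likely duplicates found"]
panelLines flags = [dupId f ++ ": " ++ reason f | f <- flags]
```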
---
### T06 — Policy-sensitive issue detection
```task
id: IHUB-WP-0005-T06
status: done
priority: medium
state_hub_task_id: "475290e0-7842-4336-a57c-04fa62652094"
```
1. `DetectPolicySensitivityAction { requirementCandidateId }` (POST from
candidate show page)
2. Fetch candidate + widget `policy_scope` + existing `PolicyReference`
constraint notes for linked decisions
3. Build prompt:
- **System:** "You are a policy compliance assistant. Analyse this
requirement candidate for potential policy concerns. Valid scopes:
internal, external, regulatory, contractual, architectural. Respond
with JSON: {\"concerns\": [{\"scope\": \"...\", \"note\": \"...\"}],
\"severity\": \"low|medium|high\"}."
- **User:** candidate title/description + policy context
4. Create `AgentProposal`:
- `proposal_type = "policy_flag"`
- `content = JSON string`
- `confidence = severity mapped to numeric (low=0.3, medium=0.6, high=0.9)`
5. Create one `ConfidenceAnnotation` per concern scope dimension
6. On candidate show page: "Policy Check" panel — amber badge if concerns,
green badge if clean
**Exit criteria:** Policy proposal created with confidence annotation;
concern severity rendered correctly; clean result (empty concerns) handled.
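The severity-to-confidence mapping in step 4 and the badge rule in step 6 are small enough to state as code. A minimal sketch, with hypothetical type and function names:

```haskell
-- Severity as returned in the model's JSON response.
data Severity = Low | Medium | High deriving (Eq, Show)

parseSeverity :: String -> Maybe Severity
parseSeverity "low"    = Just Low
parseSeverity "medium" = Just Medium
parseSeverity "high"   = Just High
parseSeverity _        = Nothing

-- Mapping from the workplan: low=0.3, medium=0.6, high=0.9.
severityConfidence :: Severity -> Double
severityConfidence Low    = 0.3
severityConfidence Medium = 0.6
severityConfidence High   = 0.9

-- Badge on the "Policy Check" panel: amber if any concerns, green if clean.
data Badge = Amber | Green deriving (Eq, Show)

badgeFor :: [a] -> Badge
badgeFor [] = Green
badgeFor _  = Amber
```

Returning `Nothing` for an unrecognised severity lets the caller decide whether to store a NULL confidence or reject the response.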
---
### T07 — Implementation path proposal
```task
id: IHUB-WP-0005-T07
status: done
priority: medium
state_hub_task_id: "7ee1274e-fa1b-4ae8-a360-4abafc1773f0"
```
1. `ProposeImplementationAction { decisionRecordId }` (POST from decision
show page)
2. Fetch decision + linked requirement + existing `ImplementationChangeReference`
records
3. Build prompt:
- **System:** "You are a traceability-aware implementation analyst. Propose
1-3 concrete implementation paths for this decision. Each path should
include a work_item_ref (e.g. PROJ-123), a system (github|linear|jira),
and a rationale. Respond with JSON: {\"proposals\": [{\"work_item_ref\":
\"...\", \"system\": \"...\", \"rationale\": \"...\"}]}."
- **User:** decision title, rationale, outcome, requirement description,
existing impl refs
4. Create `AgentProposal`:
- `proposal_type = "impl_proposal"`
- `source_decision_id = decisionRecordId`
- `content = JSON string`
5. On decision show page: "Propose Implementation" button; accepted proposals
pre-fill the AddImplementationRef form (first proposal in the array)
6. Surface as collapsed "AI Suggestions" panel on decision show page
**Exit criteria:** Impl proposal created; acceptance pre-fills the impl ref
form; multiple proposal paths rendered clearly.
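The pre-fill rule in step 5 (first proposal in the array, accepted proposals only) can be sketched as follows. `ImplPath`, `ImplRefForm`, and `prefillForm` are hypothetical names; the actual fields of the AddImplementationRef form are not specified here, so two are assumed.

```haskell
import Data.Maybe (listToMaybe)

-- Hypothetical decoded form of one entry in the "proposals" array.
data ImplPath = ImplPath
  { workItemRef :: String
  , pathSystem  :: String   -- github | linear | jira
  , rationale   :: String
  } deriving (Eq, Show)

-- Assumed shape of the form fields that get pre-filled.
data ImplRefForm = ImplRefForm
  { formWorkItemRef :: String
  , formSystem      :: String
  } deriving (Eq, Show)

emptyForm :: ImplRefForm
emptyForm = ImplRefForm "" ""

-- Pre-fill from the first path of an accepted proposal; otherwise leave the form blank.
prefillForm :: Bool -> [ImplPath] -> ImplRefForm
prefillForm accepted paths =
  case (accepted, listToMaybe paths) of
    (True, Just p) -> ImplRefForm (workItemRef p) (pathSystem p)
    _              -> emptyForm
```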
---
### T08 — Agent attribution and audit dashboard (AutoRefresh)
```task
id: IHUB-WP-0005-T08
status: done
priority: high
state_hub_task_id: "53b58abb-cb50-4985-a1c0-b05da17dfc3f"
```
1. Add `AgentAuditDashboardAction { hubId }` to `HubsController` wrapped
with `autoRefresh do`
2. Dashboard panels:
- **KPI row**: total proposals / pending / acceptance rate / rejection rate
- **Proposals by type**: count breakdown per `proposal_type`
- **Unreviewed queue**: proposals with `status = "pending"`, oldest first,
with "Review" links
- **Recent proposals** (last 20): type badge, source widget, status,
confidence, age
- **Attribution log**: `model_ref × proposal_type` count matrix
3. Link from hub Show page alongside Triage / Governance / Antifragility
4. Add "Agent" link to global nav
**Exit criteria:** Dashboard live-updates on proposal/review changes. All five
panels render with correct data.
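The KPI row and attribution log reduce to simple counts over proposal rows. The sketch below uses hypothetical names and makes one assumption the workplan leaves open: acceptance and rejection rates are computed over reviewed proposals (accepted plus rejected), not over all proposals.

```haskell
import Data.List (group, sort)

-- Hypothetical KPI record for the dashboard's first row.
data Kpis = Kpis
  { total          :: Int
  , pending        :: Int
  , acceptanceRate :: Double
  , rejectionRate  :: Double
  } deriving (Eq, Show)

-- Compute KPIs from the list of proposal statuses.
-- Assumption: rates are relative to reviewed proposals (accepted + rejected).
kpis :: [String] -> Kpis
kpis statuses = Kpis (length statuses) (count "pending") (rate "accepted") (rate "rejected")
  where
    count s  = length (filter (== s) statuses)
    reviewed = count "accepted" + count "rejected"
    rate s
      | reviewed == 0 = 0
      | otherwise     = fromIntegral (count s) / fromIntegral reviewed

-- Attribution log: number of proposals per (model_ref, proposal_type) pair.
attributionMatrix :: [(String, String)] -> [((String, String), Int)]
attributionMatrix pairs = [(k, length g) | g@(k : _) <- group (sort pairs)]
```

In the running application these counts would more likely come from SQL aggregates; the pure version is mainly useful for pinning down the definitions in tests.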
---
### T09 — Phase 5 gate: tests, consistency, docs
```task
id: IHUB-WP-0005-T09
status: done
priority: high
state_hub_task_id: "c7b8aef0-b241-4038-99c0-494c45f226a6"
```
1. **Integration tests** (`Test/`):
- AgentProposal create + fetch (all fields)
- AcceptProposalAction for `requirement_draft` → creates `RequirementCandidate`
- RejectProposalAction → sets `status = "rejected"`, creates review record
- Review record idempotency (second accept/reject → "Already reviewed")
- ConfidenceAnnotation create + link to proposal
- Duplicate detection: proposal with empty duplicates array
- Agent audit dashboard data fetch: compiles and returns correct counts
2. **Consistency sync** via State Hub MCP:
`check_repo_consistency(repo_slug="inter-hub", fix=True)`
3. **Documentation updates:**
- Update `SCOPE.md` current state section: Phase 5 complete
- Write `docs/phase5-summary.md`: what was built, governance constraints
upheld, known limitations, Phase 6 readiness
4. **Smoke test checklist:**
- Summarize feedback cluster on a widget with annotations
- Draft a requirement from the summary
- Accept the draft → verify RequirementCandidate created
- Check duplicates on a candidate
- Run policy check on a candidate
- Confirm agent audit dashboard shows all panels
**Exit criteria:** All tests pass; consistency sync reports no errors; smoke
test completed; SCOPE.md updated.
---
## Phase 5 Dependencies
- Phase 4 schema stable (Phase 4 tables needed as context for impl proposals)
- `http-conduit` and `aeson` available (already in IHP's Haskell ecosystem)
- `IHP_ANTHROPIC_API_KEY` set in environment
- Schema (T01) before all controller work (T02-T08)
- `AgentProposalsController` (T02) before all agent action tasks (T03-T07)
- All feature tasks (T01-T08) before gate (T09)
## Notes
- **All AI outputs are attributed.** `model_ref` is recorded on every
`AgentProposal`. Reviewers see exactly which model produced the output.
- **No silent promotion.** A `requirement_draft` proposal only becomes a
`RequirementCandidate` when a human explicitly accepts it via
`AcceptProposalAction`. The controller enforces this.
- **Review record is idempotent.** UNIQUE constraint on
`agent_review_records.proposal_id`. Second accept/reject redirects with
"Already reviewed" message.
- **API errors do not crash.** All Claude API calls are wrapped in error
handling; failures produce a user-visible flash message and redirect.
- **Confidence is optional for summaries.** `AgentProposal.confidence` is
nullable — summary proposals do not produce a numeric confidence score.
Policy flag proposals derive confidence from severity (low/medium/high).
- **No ML infrastructure.** All intelligence is delegated to the Anthropic
API. Phase 5 adds no local model serving, no embeddings storage.