Seeded intent and maturity model

2026-06-15 00:04:09 +02:00
parent 5529976295
commit 84924f8160
5 changed files with 1644 additions and 0 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -0,0 +1,162 @@
 # reuse-surface — Agent Instructions
 ## Repo Identity
 **Purpose:** Capability registry for planning and implementation reuse based on discovery and delivery maturity.
 **Domain:** helix_forge
 **Repo slug:** reuse-surface
 **Topic ID:** `f39fa2a3-c491-414c-a91b-b4c5fcc6139c`
 **Workplan prefix:** `REUSE-WP-`
 ---
 ## State Hub Integration
 The Custodian State Hub tracks work across all domains. Interact via HTTP REST —
 there is no MCP server for Codex agents.
 | Context | URL |
 |---------|-----|
 | Local workstation | `http://127.0.0.1:8000` |
 | Remote via tunnel | `http://127.0.0.1:18000` |
 ### Orient at session start
 ```bash
 # Offline brief — works without hub connection
 cat .custodian-brief.md
 # Active workstreams for this domain
 curl -s "http://127.0.0.1:8000/workstreams/?topic_id=f39fa2a3-c491-414c-a91b-b4c5fcc6139c&status=active" \
  | python3 -m json.tool
 # Check inbox
 curl -s "http://127.0.0.1:8000/messages/?to_agent=reuse-surface&unread_only=true" \
  | python3 -m json.tool
 ```
 Mark a message read:
 ```bash
 curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
  -H "Content-Type: application/json" -d '{}'
 ```
 ### Log progress (required at session close)
 ```bash
 curl -s -X POST http://127.0.0.1:8000/progress/ \
  -H "Content-Type: application/json" \
  -d '{
    "summary": "what was done",
    "event_type": "note",
    "author": "codex",
    "workstream_id": "<uuid>",
    "task_id": "<uuid>"
  }'
 ```
 Omit `workstream_id` / `task_id` when not applicable.
 ### Update task status
 ```bash
 curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
  -H "Content-Type: application/json" \
  -d '{"status": "progress"}'
 # values: wait | todo | progress | done | cancel
 ```
 ### Flag a task for human review
 ```bash
 curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
  -H "Content-Type: application/json" \
  -d '{"needs_human": true, "intervention_note": "reason"}'
 ```
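The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names (`progress_payload`, `task_status_request`, `send`) are illustrative and not part of the State Hub; the endpoint paths, field names, and status values are copied from the curl examples above. That the hub answers with a JSON body is an assumption.

```python
import json
import urllib.request

HUB_URL = "http://127.0.0.1:8000"  # local workstation URL from the table above
TASK_STATUSES = {"wait", "todo", "progress", "done", "cancel"}


def progress_payload(summary, workstream_id=None, task_id=None,
                     event_type="note", author="codex"):
    """Build the POST /progress/ body, omitting ids that do not apply."""
    payload = {"summary": summary, "event_type": event_type, "author": author}
    if workstream_id is not None:
        payload["workstream_id"] = workstream_id
    if task_id is not None:
        payload["task_id"] = task_id
    return payload


def task_status_request(task_id, status, base_url=HUB_URL):
    """Build (but do not send) the PATCH /tasks/<task_id> request."""
    if status not in TASK_STATUSES:
        raise ValueError(f"unknown task status: {status!r}")
    return urllib.request.Request(
        f"{base_url}/tasks/{task_id}",
        data=json.dumps({"status": status}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


def send(request):
    """Send a prepared request; assumes a running hub that replies with JSON."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Building a request is kept separate from sending it, so a payload can be inspected or logged while the hub is unreachable.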
 ---
 ## Session Protocol
 **Start:**
 1. `cat .custodian-brief.md` — domain goal and open workstreams (offline-safe)
 2. Check inbox: `GET /messages/?to_agent=reuse-surface&unread_only=true`; mark each handled message read via `PATCH /messages/<id>/read`
 3. Scan workplans: `ls workplans/` — note `status: ready`, `active`, or `blocked` files and open tasks
 4. Check human-needed tasks: `GET /tasks/?needs_human=true`
 **During work:**
 - Update task statuses in workplan files as tasks progress
 - Record significant decisions via `POST /decisions/`
 **Close:**
 1. Update workplan file task statuses to reflect progress
 2. Log: `POST /progress/` with a summary of what changed
 3. Note for the custodian operator: after workplan file changes, run from
   `~/state-hub`:
   ```bash
   make fix-consistency REPO=reuse-surface
   ```
   This syncs task status from files into the hub DB.
 ---
 ## Workplan Convention (ADR-001)
 Work items originate as files in this repo — not in the hub. The hub is a
 read/cache/index layer that rebuilds from files.
 **File location:** `workplans/REUSE-WP-NNNN-<slug>.md`
 **Archived location:** finished workplans may move to
 `workplans/archived/YYMMDD-REUSE-WP-NNNN-<slug>.md`. The `YYMMDD` prefix is
 the completion/archive date; the frontmatter `id` does not change.
 **Ad Hoc Tasks:** small opportunistic fixes discovered during a session use
 `workplans/ADHOC-YYYY-MM-DD.md` with task ids `ADHOC-YYYY-MM-DD-T01`, etc. Use
 this only for low-risk work completed directly; create a normal workplan for
 anything needing analysis, design, approval, dependencies, or multiple phases.
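The three file-name conventions above can be checked mechanically. This is a sketch under stated assumptions: `NNNN` is taken to be exactly four digits and `<slug>` to be lowercase alphanumeric words joined by hyphens, neither of which ADR-001 is quoted as requiring here. `classify_workplan` is a hypothetical helper name.

```python
import re

# Assumption: NNNN is four digits; <slug> is lowercase words joined by hyphens.
SLUG = r"[a-z0-9]+(?:-[a-z0-9]+)*"
WORKPLAN = re.compile(rf"^REUSE-WP-(\d{{4}})-{SLUG}\.md$")
ARCHIVED = re.compile(rf"^\d{{6}}-REUSE-WP-(\d{{4}})-{SLUG}\.md$")
ADHOC = re.compile(r"^ADHOC-(\d{4}-\d{2}-\d{2})\.md$")


def classify_workplan(filename):
    """Return (kind, id) for a workplan file name, or None if no convention matches."""
    match = WORKPLAN.match(filename)
    if match:
        return ("workplan", f"REUSE-WP-{match.group(1)}")
    match = ARCHIVED.match(filename)
    if match:
        # The YYMMDD prefix is dropped: the frontmatter id does not change on archive.
        return ("archived", f"REUSE-WP-{match.group(1)}")
    match = ADHOC.match(filename)
    if match:
        return ("adhoc", f"ADHOC-{match.group(1)}")
    return None
```

The function takes a bare file name, so callers pass `path.name` rather than a full path under `workplans/` or `workplans/archived/`.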
 **Frontmatter:**
 ```yaml
 ---
 id: REUSE-WP-NNNN
 type: workplan
 title: "..."
 domain: helix_forge
 repo: reuse-surface
 status: proposed | ready | active | blocked | backlog | finished | archived
 owner: codex
 topic_slug: ...
 created: "YYYY-MM-DD"
 updated: "YYYY-MM-DD"
 state_hub_workstream_id: "<uuid>"   # written by fix-consistency — do not edit
 ---
 ```
 Use `proposed` for a new draft, `ready` after review against current repo
 state, and `finished` after implementation. `stalled` and `needs_review` are
 derived health labels, not frontmatter statuses.
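A small check can catch a derived health label that was written into frontmatter by mistake. The sketch below reads only flat `key: value` lines between the first two `---` markers and does not handle inline comments or nested YAML; a real tool would more likely use a YAML parser. `read_frontmatter` and `check_status` are hypothetical names.

```python
FRONTMATTER_STATUSES = {
    "proposed", "ready", "active", "blocked", "backlog", "finished", "archived",
}
DERIVED_LABELS = {"stalled", "needs_review"}


def read_frontmatter(text):
    """Return the flat `key: value` pairs between the first two `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    return fields


def check_status(fields):
    """Return None when the status is valid, otherwise a short problem description."""
    status = fields.get("status")
    if status in FRONTMATTER_STATUSES:
        return None
    if status in DERIVED_LABELS:
        return f"{status!r} is a derived health label, not a frontmatter status"
    return f"unknown status: {status!r}"
```

Returning a message instead of raising lets a scan over all of `workplans/` collect every problem in one pass.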
 **Task block format** (one per `##` section):
 ````
 ## Task Title
 ```task
 id: REUSE-WP-NNNN-T01
 status: wait | todo | progress | done | cancel
 priority: high | medium | low
 state_hub_task_id: "<uuid>"         # written by fix-consistency — do not edit
 ```
 Task description text.
 ````
 Status progression: `todo` → `progress` → `done`; use `wait` for waiting/blocked work and `cancel` for stopped work.
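The task block format lends itself to a line-based reader. The following is a sketch, not the parser that `fix-consistency` uses: `parse_tasks` and `open_tasks` are hypothetical names, and treating `wait`, `todo`, and `progress` as "open" is an assumption about what the session protocol means by open tasks.

```python
FENCE = "`" * 3  # built at runtime so this example contains no literal fence
OPEN_STATUSES = ("wait", "todo", "progress")  # assumption: not yet done or cancelled


def parse_tasks(text):
    """Return one dict per task block, with the enclosing `##` heading as `title`."""
    tasks, title, current = [], None, None
    for raw in text.splitlines():
        line = raw.strip()
        if current is None:
            if line.startswith("## "):
                title = line[3:].strip()
            elif line == FENCE + "task":
                current = {"title": title}
        elif line == FENCE:
            tasks.append(current)
            current = None
        else:
            key, sep, value = line.partition(":")
            if sep:
                # Drop a trailing `# comment`, then surrounding quotes.
                current[key.strip()] = value.split("#", 1)[0].strip().strip('"')
    return tasks


def open_tasks(tasks):
    """Filter to tasks whose status still needs attention."""
    return [task for task in tasks if task.get("status") in OPEN_STATUSES]
```

Other fenced blocks in a workplan, such as `bash` examples, are skipped because only a fence opened with the `task` info string starts a block.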
 To create a new workplan:
 1. Write the file following the format above
 2. Notify the custodian operator to run `make fix-consistency REPO=reuse-surface`
   (or send a message to the hub agent via `POST /messages/`)
--- a/INTENT.md
+++ b/INTENT.md
@@ -0,0 +1,291 @@
 # INTENT.md — reuse-surface
 ## Project Name
 `reuse-surface`
 ## One-Line Intent
 `reuse-surface` provides a registry-centric capability reuse surface that makes capabilities visible, comparable, assessable, and reusable for planning, implementation, and operation.
 ## Purpose
 The purpose of `reuse-surface` is to turn scattered capabilities into inspectable reuse assets.
 A capability that is not registered is effectively invisible for reuse. `reuse-surface` therefore treats registry membership as the boundary of relevance: once a capability is present in the registry, it can be discovered, assessed, compared, planned with, implemented against, improved, and eventually standardized.
 The project exists to answer four practical questions:
 1. **What capabilities do we have or intend to have?**
 2. **How mature are they for planning reuse?**
 3. **How available are they for implementation or operational reuse?**
 4. **How well do consumers experience them in terms of completeness and reliability?**
 ## Core Idea
 `reuse-surface` is not merely a catalog of features or services. It is a structured reuse layer for capabilities.
 A capability describes a bounded behavior or power that can be reused across products, repositories, systems, agents, workflows, or organizations. Features may expose capabilities. Services may deliver capabilities. Products may bundle capabilities. But the registry focuses on the reusable capability itself.
 The central assumption is:
 > A capability registry should not primarily describe what exists. It should describe what can be reused, at which confidence level, for which kind of work.
 ## Scope
 `reuse-surface` shall provide the conceptual, structural, and eventually technical foundation for a capability registry.
 In scope:
 - capability registration
 - capability identity and naming
 - capability descriptions and boundaries
 - discovery maturity assessment
 - availability maturity assessment
 - completeness assessment based on SCOPE vs INTENT and consumer expectations
 - reliability assessment based on consumer-relevant quality signals
 - relationships between capabilities
 - evidence references for maturity claims
 - registry formats usable by humans and agents
 - planning support for prototype, MVP, enhancement, and platform decisions
 - implementation support through discoverable consumption modes
 Out of scope for the initial intent:
 - judging internal code quality as capability maturity
 - replacing feature maturity, service maturity, or operational maturity models
 - enforcing one specific implementation architecture
 - requiring all capabilities to become services or products
 - treating unregistered capabilities as relevant for reuse analysis
 ## Capability Maturity Dimensions
 `reuse-surface` uses four complementary dimensions.
 ### Internal Registry Assessments
 These dimensions are assessed from the registry perspective.
 #### Discovery Maturity
 Discovery maturity measures how reusable the capability is for planning, orientation, comparison, roadmap design, and architectural reasoning.
 Canonical levels:
 - **D0 Named** — capability is visible in the registry.
 - **D1 Described** — capability has meaning, intent, and context.
 - **D2 Bounded** — scope, inclusions, exclusions, and neighboring capabilities are defined.
 - **D3 Explored** — obvious relevant aspects have been investigated.
 - **D4 Researched** — prior art, alternatives, products, standards, and tradeoffs have been deeply examined.
 - **D5 Grounded** — concrete use cases, actors, scenarios, and prioritization criteria are documented.
 - **D6 Exhaustive** — use-case and scope exploration has likely reached saturation.
 - **D7 Generalized** — the capability has become a reusable planning primitive beyond one repo, product, or domain.
 #### Availability Maturity
 Availability maturity measures how directly the capability can be consumed for implementation or operation.
 Canonical levels:
 - **A0 Informational Only** — read and plan only.
 - **A1 Experimental Prototype** — learn and experiment.
 - **A2 Source Module / Library** — import and build with code.
 - **A3 Command-Line Package** — automate via CLI.
 - **A4 Service API / SDK** — integrate into applications.
 - **A5 Containerized Service** — deploy and operate.
 - **A6 Managed Platform Capability** — consume as an internal platform service.
 - **A7 External Cloud Service Offering** — consume as a public or commercial cloud/API service.
 ### External Consumer Evidence
 These dimensions are derived from consumer experience and evidence.
 #### Completeness
 Completeness measures how well current SCOPE satisfies declared INTENT and consumer expectations.
 Canonical levels:
 - **C0 Unknown** — no meaningful evidence.
 - **C1 Fragmentary** — isolated parts of the expected capability are present.
 - **C2 Partial** — some important expectations are satisfied, but major gaps remain.
 - **C3 Functional Core** — the central expected use case works.
 - **C4 Broadly Covered** — most common expectations are satisfied; gaps are known and bounded.
 - **C5 Expectation Complete** — declared intent is substantially fulfilled for known expectations.
 - **C6 Saturated** — further consumer discovery rarely reveals missing scope.
 #### Reliability
 Reliability measures how consistently the capability satisfies consumer-relevant quality expectations in real or realistic use.
 Canonical levels:
 - **R0 Unknown** — no meaningful evidence.
 - **R1 Fragile** — frequently breaks, surprises, or disappoints consumers.
 - **R2 Tolerable** — works in selected situations, but consumers must expect friction.
 - **R3 Usable** — works reliably for normal use with known limitations.
 - **R4 Dependable** — consumers can rely on it for important workflows.
 - **R5 Trusted** — strong consumer confidence; failures are rare and well handled.
 - **R6 Proven** — reliability is demonstrated across broad, repeated, and demanding use.
 ## Capability Vector
 A registered capability may be summarized with a compact vector:
 ```text
 D5 / A4 / C3 / R3
 ```
 Meaning:
 - discovery maturity: grounded
 - availability maturity: service API / SDK
 - completeness: functional core
 - reliability: usable
 The vector is descriptive, not a moral grade. Different capability types may have different target vectors.
 For example:
 - a research method may naturally target `D5 / A0 / C4 / R3`
 - a CLI tool may naturally target `D5 / A3 / C5 / R5`
 - an internal platform service may naturally target `D6 / A6 / C5 / R5`
 - a commercial API offering may naturally target `D7 / A7 / C6 / R6`
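Because the vector has a fixed shape, it can be parsed and checked against the canonical levels listed earlier. The sketch below is one possible reading of the notation; `parse_vector` and `describe` are hypothetical names, and the level names are copied from the four level lists in this document.

```python
import re

LEVELS = {
    "D": ["Named", "Described", "Bounded", "Explored", "Researched",
          "Grounded", "Exhaustive", "Generalized"],
    "A": ["Informational Only", "Experimental Prototype", "Source Module / Library",
          "Command-Line Package", "Service API / SDK", "Containerized Service",
          "Managed Platform Capability", "External Cloud Service Offering"],
    "C": ["Unknown", "Fragmentary", "Partial", "Functional Core",
          "Broadly Covered", "Expectation Complete", "Saturated"],
    "R": ["Unknown", "Fragile", "Tolerable", "Usable",
          "Dependable", "Trusted", "Proven"],
}
VECTOR = re.compile(r"^D(\d)\s*/\s*A(\d)\s*/\s*C(\d)\s*/\s*R(\d)$")


def parse_vector(text):
    """Parse 'D5 / A4 / C3 / R3' into {'D': 5, 'A': 4, 'C': 3, 'R': 3}."""
    match = VECTOR.match(text.strip())
    if not match:
        raise ValueError(f"not a capability vector: {text!r}")
    levels = dict(zip("DACR", map(int, match.groups())))
    for dim, level in levels.items():
        # D and A run 0-7; C and R run 0-6.
        if level >= len(LEVELS[dim]):
            raise ValueError(f"{dim}{level} is not a canonical level")
    return levels


def describe(levels):
    """Map each numeric level to its canonical name."""
    return {dim: LEVELS[dim][level] for dim, level in levels.items()}
```

For the example vector, `describe(parse_vector("D5 / A4 / C3 / R3"))` yields Grounded, Service API / SDK, Functional Core, and Usable, matching the meanings listed above.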
 ## Guiding Principles
 ### Registry First
 Capabilities outside the registry are invisible for reuse analysis. The registry is the reuse surface.
 ### Reuse Over Inventory
 The registry should optimize for capability reuse, not merely inventory completeness.
 ### Planning and Implementation Are Different
 Planning reuse feeds primarily on discovery maturity. Implementation reuse feeds primarily on availability maturity.
 ### Internal and External Evidence Must Stay Separate
 Discovery and availability are internal registry assessments. Completeness and reliability are external consumer-evidence dimensions.
 ### Capability Maturity Is Not Feature Maturity
 Internal code structure, UI polish, local implementation elegance, and feature detail quality may matter, but they are not the same as capability maturity. They can be assessed separately when needed.
 ### SCOPE vs INTENT Matters
 Completeness depends on the relationship between declared intent, current scope, and consumer expectations. Broken expectations are first-class evidence.
 ### Consumer Experience Matters
 Reliability depends on consumer-relevant evidence such as bug reports, support tickets, incidents, failed integrations, ratings, adoption, retention, and qualitative feedback.
 ### Target Maturity Should Be Explicit
 Not every capability needs to become a cloud service. Each capability should have a current vector and, where useful, a target vector.
 ## Expected Registry Entry Shape
 A registry entry should eventually support a structure similar to:
 ```yaml
 id: capability.example
 name: Example Capability
 summary: Short capability summary.
 maturity:
  discovery:
    current: D2
    target: D5
  availability:
    current: A1
    target: A4
 external_evidence:
  completeness:
    current: C1
    confidence: low
  reliability:
    current: R0
    confidence: low
 discovery:
  intent: What this capability is meant to make possible.
  includes: []
  excludes: []
  assumptions: []
  use_cases: []
  research_memos: []
 availability:
  current_artifacts: []
  target_artifacts: []
  consumption_modes: []
 relations:
  depends_on: []
  supports: []
  related_to: []
 evidence:
  documentation: []
  tests: []
  consumer_feedback: []
  bug_reports: []
  incidents: []
 ```
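An entry of this shape is enough to derive the compact vector and the distance to the declared targets. The sketch below works on a plain dictionary, as a YAML loader would produce; it covers only the maturity fields, and `current_vector` and `maturity_gaps` are hypothetical helper names rather than part of any existing tooling.

```python
# The maturity-related part of the example entry above, as a loaded dictionary.
EXAMPLE_ENTRY = {
    "id": "capability.example",
    "maturity": {
        "discovery": {"current": "D2", "target": "D5"},
        "availability": {"current": "A1", "target": "A4"},
    },
    "external_evidence": {
        "completeness": {"current": "C1", "confidence": "low"},
        "reliability": {"current": "R0", "confidence": "low"},
    },
}


def current_vector(entry):
    """Join the four current levels into the 'D / A / C / R' notation."""
    maturity = entry["maturity"]
    evidence = entry["external_evidence"]
    return " / ".join([
        maturity["discovery"]["current"],
        maturity["availability"]["current"],
        evidence["completeness"]["current"],
        evidence["reliability"]["current"],
    ])


def maturity_gaps(entry):
    """Return target minus current for each internal dimension that declares a target."""
    gaps = {}
    for name, block in entry["maturity"].items():
        if "target" in block:
            gaps[name] = int(block["target"][1:]) - int(block["current"][1:])
    return gaps
```

For the example entry, `current_vector` returns `D2 / A1 / C1 / R0` and `maturity_gaps` reports a gap of three levels for both discovery and availability. Completeness and reliability are left out of the gap because the entry shape gives them a confidence rather than a target.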
 ## Initial Repository Role
 The initial role of `reuse-surface` is to define and maintain the capability registry model, standards, schemas, examples, and reference tooling.
 Likely early repository contents:
 ```text
 reuse-surface/
 ├── INTENT.md
 ├── README.md
 ├── specs/
 │   └── CapabilityMaturityStandard.md
 ├── registry/
 │   ├── capabilities.yaml
 │   └── examples/
 ├── schemas/
 │   └── capability.schema.json
 ├── docs/
 │   ├── CapabilityRegistryConcept.md
 │   └── CapabilityAssessmentGuide.md
 └── tools/
    └── README.md
 ```
 ## Success Criteria
 `reuse-surface` is successful when it helps humans and agents:
 - find reusable capabilities before rebuilding them
 - compare capability maturity consistently
 - distinguish conceptual readiness from delivery availability
 - distinguish internal registry assessment from external consumer evidence
 - plan prototype, MVP, enhancement, and platform work more effectively
 - identify capability gaps, duplicates, overlaps, and standardization candidates
 - track progress from named ideas to generalized reusable capabilities
 - make capability reuse a normal part of product and architecture work
 ## Non-Goals
 `reuse-surface` is not intended to become:
 - a generic feature tracker
 - a project management replacement
 - a service catalog only
 - a CMDB only
 - an implementation quality gate by itself
 - a forced architecture pattern
 - a claim that all reusable things must be services
 ## Working Mantra
 > Make capabilities visible enough to plan with, available enough to build with, and evidenced enough to trust.
--- a/SCOPE.md
+++ b/SCOPE.md
@@ -0,0 +1,32 @@
 # SCOPE
 > This file was generated by `statehub register`. Refine it as the repository
 > boundaries become clearer.
 ## One-liner
 Capability registry for planning and implementation reuse based on discovery and delivery maturity.
 ## Core Idea
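 reuse-surface is a structured reuse layer for capabilities: a registry that records what can be reused, at which confidence level, and for which kind of work. INTENT.md defines the full model.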
 ## In Scope
 - Maintain the repository's primary implementation.
 - Keep docs, tests, and operational metadata current.
 ## Out of Scope
 - Own unrelated adjacent systems.
 - Make irreversible operational decisions without human approval.
 ## Current State
 - Status: active; implementation and stability should be verified by the repo agent.
 ## Getting Oriented
 - Start with: INTENT.md
 - Agent instructions: AGENTS.md
 - Workplans: workplans/
--- a/specs/CapabilityMaturityStandard.md
+++ b/specs/CapabilityMaturityStandard.md
--- a/workplans/REUSE-WP-0001-statehub-bootstrap.md
+++ b/workplans/REUSE-WP-0001-statehub-bootstrap.md
@@ -0,0 +1,58 @@
 ---
 id: REUSE-WP-0001
 type: workplan
 title: "Bootstrap State Hub integration"
 domain: helix_forge
 repo: reuse-surface
 status: ready
 owner: codex
 topic_slug: helix-forge
 created: "2026-06-14"
 updated: "2026-06-14"
 state_hub_workstream_id: "293ef16d-c15b-41cc-9258-4a3a233f4bf2"
 ---
 # Bootstrap State Hub integration
 Capability registry for planning and implementation reuse based on discovery and delivery maturity.
 ## Review Generated Integration Files
 ```task
 id: REUSE-WP-0001-T01
 status: todo
 priority: high
 state_hub_task_id: "18c811c8-f9f6-4452-b3f0-b713c91918a4"
 ```
 Review `INTENT.md`, `SCOPE.md`, `AGENTS.md`, and `.custodian-brief.md`.
 Replace generated placeholders with repo-specific facts where needed.
 ## Verify Local Developer Workflow
 ```task
 id: REUSE-WP-0001-T02
 status: todo
 priority: high
 state_hub_task_id: "0877cf67-6abe-48c3-b76c-df552620d678"
 ```
 Identify the repo's install, test, lint, build, and run commands. Add or refine
 those commands in the agent instructions so future coding sessions can verify
 changes confidently.
 ## Seed First Real Workplan
 ```task
 id: REUSE-WP-0001-T03
 status: todo
 priority: medium
 state_hub_task_id: "f480aa11-f7e1-45fe-bf0b-4a3c164eb85b"
 ```
 Create the first implementation workplan for the repository's most important
 next change. After workplan file updates, run from `~/state-hub`:
 ```bash
 make fix-consistency REPO=reuse-surface
 ```