fix: release-check lint fixes for 1.1.0 publish

Wrap long lines for flake8, rename extensions remove command handler to avoid Click shadowing, and drop unused migration imports.
WP-0001 complete: v1.1.0 lazy registry and install performance
2026-06-16 02:14:07 +02:00 · 2026-06-16 02:06:43 +02:00 · 2026-06-16 01:58:07 +02:00 · 2026-06-16 01:53:56 +02:00 · 2026-06-16 01:53:01 +02:00 · 2026-06-16 01:49:27 +02:00
92 changed files with 7135 additions and 637 deletions
--- a/.claude/rules/architecture.md
+++ b/.claude/rules/architecture.md
@@ -6,20 +6,23 @@ kaizen-agentic has two distinct layers:
 - **`core.py`** — `Agent` (abstract base) + `AgentConfig` (dataclass). Tracks performance, supports config updates, implements kaizen interface.
 - **`optimization.py`** — `OptimizationLoop` (runs improvement cycles, detects trends, generates recommendations) + `PerformanceMetrics` (execution time, success rate, quality scores).
 - **`metrics.py`** — `MetricsStore` + `OptimizerStore` (project-scoped `.kaizen/metrics/` per ADR-004).
-### 2. Agent definitions (`agents/` — 17 files)
+### 2. Agent definitions (`agents/` — 20 files)
 Markdown instruction sets read and followed by Claude. Not executables. Naming convention: `agent-{name}.md`.
 Packaged copies live in `src/kaizen_agentic/data/agents/` for `pip install` distribution.
 | Category | Agents |
 |----------|--------|
 | Testing | `tdd-workflow`, `test-maintenance`, `testing-efficiency` |
-| Quality | `code-refactoring`, `datamodel-optimization`, `optimization` |
+| Quality | `code-refactoring`, `datamodel-optimization` |
 | Process | `requirements-engineering`, `keepaTodofile`, `keepaChangelog`, `keepaContributingfile`, `project-management`, `priority-evaluation`, `scope-analyst` |
-| Infrastructure | `setupRepository`, `tooling-optimization` |
+| Infrastructure | `setupRepository`, `tooling-optimization`, `sys-medic` |
 | Release | `releaseManager` |
 | Docs | `claude-documentation` |
 | Support | `wisdom-encouragement` |
 | Meta | `coach`, `optimization` |
 ### Custodian integration
--- a/.flake8
+++ b/.flake8
@@ -0,0 +1,11 @@
 [flake8]
 max-line-length = 88
 extend-ignore = E203, W503
 per-file-ignores =
    tests/*:E501,F841
 exclude =
    .venv,
    build,
    dist,
    .git,
    __pycache__
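As a rough illustration of what these settings amount to, the sketch below parses the same text with Python's standard `configparser`. It does not describe how flake8 itself loads its configuration; `FLAKE8_INI` and `split_list` are names made up for this example.

```python
import configparser

# Copy of the .flake8 content from the diff above (illustration only).
FLAKE8_INI = """\
[flake8]
max-line-length = 88
extend-ignore = E203, W503
per-file-ignores =
    tests/*:E501,F841
exclude =
    .venv,
    build,
    dist,
    .git,
    __pycache__
"""


def split_list(raw: str) -> list:
    """Split a comma- or newline-separated option value into clean items."""
    return [item.strip() for item in raw.replace("\n", ",").split(",") if item.strip()]


parser = configparser.ConfigParser()
parser.read_string(FLAKE8_INI)
section = parser["flake8"]

max_line_length = section.getint("max-line-length")
ignored_codes = split_list(section["extend-ignore"])
excluded_paths = split_list(section["exclude"])

print(max_line_length, ignored_codes, excluded_paths)
```

The 88-character limit matches black's default line length, which is presumably why E203 and W503 are ignored alongside it.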
--- a/.gitea/ISSUE_TEMPLATE/bug_report.md
+++ b/.gitea/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,35 @@
 ---
 name: Bug report
 about: Report a defect in kaizen-agentic
 title: "[bug] "
 labels: bug
 ---
 ## Summary
 <!-- One sentence describing the problem -->
 ## Steps to reproduce
 1.
 2.
 3.
 ## Expected behavior
 ## Actual behavior
 ## Environment
 - OS:
 - Python version:
 - kaizen-agentic version (`kaizen-agentic --version`):
 - Install method (pip / editable / other):
 ## Logs or CLI output
 ```
 (paste relevant output)
 ```
 ## Additional context
--- a/.gitea/ISSUE_TEMPLATE/config.yaml
+++ b/.gitea/ISSUE_TEMPLATE/config.yaml
@@ -0,0 +1,8 @@
 blank_issues_enabled: false
 contact_links:
  - name: Feedback guide
    url: https://gitea.coulomb.social/coulomb/kaizen-agentic/src/branch/main/docs/FEEDBACK.md
    about: How to submit feedback, bugs, and feature ideas
  - name: Contributing guide
    url: https://gitea.coulomb.social/coulomb/kaizen-agentic/src/branch/main/CONTRIBUTING.md
    about: Development workflow and code standards
--- a/.gitea/ISSUE_TEMPLATE/feature_request.md
+++ b/.gitea/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,23 @@
 ---
 name: Feature request
 about: Suggest an enhancement for kaizen-agentic
 title: "[feature] "
 labels: enhancement
 ---
 ## Problem or opportunity
 <!-- What pain point does this address? -->
 ## Proposed solution
 ## Alternatives considered
 ## Scope
 - [ ] CLI / framework (`src/kaizen_agentic/`)
 - [ ] Agent definitions (`agents/`)
 - [ ] Documentation / wiki
 - [ ] Ecosystem integration (activity-core, artifact-store, agentic-resources)
 ## Additional context
--- a/.gitea/ISSUE_TEMPLATE/feedback.md
+++ b/.gitea/ISSUE_TEMPLATE/feedback.md
@@ -0,0 +1,21 @@
 ---
 name: General feedback
 about: Share experience, ideas, or adoption feedback
 title: "[feedback] "
 labels: feedback
 ---
 ## Context
 <!-- How are you using kaizen-agentic? (project type, agents used, workflow) -->
 ## What worked well
 ## What was confusing or friction-heavy
 ## Suggestions
 ## Optional: metrics / telemetry context
 If relevant, note whether you use project metrics (`.kaizen/metrics/`) or Helix Forge
 fleet capture — helps us prioritize integration improvements.
--- a/.gitea/workflows/ci.yml
+++ b/.gitea/workflows/ci.yml
@@ -0,0 +1,31 @@
 name: ci
 on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
 jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.12"]
    steps:
      - name: Check out source
        uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install package and dev tools
        run: python -m pip install --upgrade pip && python -m pip install -e ".[dev]"
      - name: Format check (black)
        run: black --check src tests
      - name: Run tests
        run: pytest tests/ -q --ignore=tests/test_cli_error_handling.py
--- a/.gitignore
+++ b/.gitignore
@@ -42,3 +42,6 @@ venv.bak/
 .coverage
 htmlcov/
 .tox/
 # Backup directories created by optimization scripts
 agents_backup_*/
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -0,0 +1,20 @@
 # Pre-commit hooks for kaizen-agentic (WP-0001 T02)
 # Install: pip install pre-commit && pre-commit install
 repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
        args: [--unsafe]
      - id: check-added-large-files
        args: [--maxkb=512]
  - repo: https://github.com/psf/black
    rev: 24.10.0
    hooks:
      - id: black
        language_version: python3
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,8 +7,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
 ## [1.1.0] - 2026-06-18
 ### Added
- **sys-medic agent**: Linux/Kubernetes node health assessment agent integrated as a standard kaizen-agentic infrastructure agent (KAIZEN-WP-0002 Part 1)
+- **`kaizen-agentic feedback`** CLI and Gitea issue templates for developer feedback
 - **Gitea CI** (`.gitea/workflows/ci.yml`) — black + pytest on Python 3.10/3.12
 - **Pre-commit hooks** (`.pre-commit-config.yaml`) and `make pre-commit-install`
 - **`docs/FEEDBACK.md`** and **`docs/TELEMETRY.md`** (ADR-004 two-layer telemetry model)
 - **Ecosystem integration (WP-0004)**: Helix correlation, artifact-store publish, activity-core definitions
 - **Project metrics (WP-0003)**: ADR-004 storage, metrics CLI, optimizer wiring, tdd-workflow pilot
 - **sys-medic agent** and packaged fleet sync (20 agents in `data/agents/`)
 ### Changed
 - **Lazy agent registry** — index by frontmatter name; parse on demand; path-based install copy
 - **CLI error messages** — clearer guidance when agents directory or package missing
 - **CONTRIBUTING.md** — post-pull reinstall instructions (`pip install -e .` / pipx)
 ### Fixed
 - **Makefile template** in project initializer — tab characters no longer break Python linting
 - Removed stale `agents_backup_*/` scaffolding from development installs
 ## [1.0.1] - 2025-10-20
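The "Lazy agent registry" entry above names three ideas: index by frontmatter name, parse on demand, and copy by path on install. The following is a minimal sketch of that pattern, not the project's actual `AgentRegistry`; the class `LazyAgentRegistry`, the helper `read_frontmatter_name`, and all method names are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Dict, List, Optional


def read_frontmatter_name(path: Path) -> Optional[str]:
    """Read only the leading frontmatter block and return its `name:` value."""
    with path.open(encoding="utf-8") as handle:
        if handle.readline().strip() != "---":
            return None  # file has no frontmatter
        for line in handle:
            stripped = line.strip()
            if stripped == "---":
                return None  # frontmatter ended without a name
            if stripped.startswith("name:"):
                return stripped.split(":", 1)[1].strip()
    return None


class LazyAgentRegistry:
    """Hypothetical registry: cheap index up front, full read only on demand."""

    def __init__(self, agents_dir: Path) -> None:
        self._paths: Dict[str, Path] = {}
        self._bodies: Dict[str, str] = {}  # filled lazily by load()
        for path in sorted(agents_dir.glob("agent-*.md")):
            name = read_frontmatter_name(path)
            if name is not None:
                self._paths[name] = path

    def names(self) -> List[str]:
        return sorted(self._paths)

    def load(self, name: str) -> str:
        """Read and cache the full agent definition the first time it is needed."""
        if name not in self._bodies:
            self._bodies[name] = self._paths[name].read_text(encoding="utf-8")
        return self._bodies[name]

    def install(self, name: str, target_dir: Path) -> Path:
        """Copy the source file by path; the definition is never parsed here."""
        target_dir.mkdir(parents=True, exist_ok=True)
        destination = target_dir / self._paths[name].name
        shutil.copyfile(self._paths[name], destination)
        return destination
```

Under these assumptions, listing and installing agents touch only the first few lines of each file, which is one plausible source of the install speedup the commit title refers to.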
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -24,6 +24,14 @@ This document outlines how to get started, how we organise work, and how to help
 4. Verify setup: `make test-quick` or `pytest tests/`
 5. Familiarize yourself with agent system (see CLAUDE.md)
 **After pulling updates:** reinstall the CLI so new commands are available:
 ```bash
 pip install -e .              # project venv
 # or
 pipx install -e . --force     # global pipx install
 ```
 ## Development Workflow
 ### Project Structure
@@ -63,6 +71,8 @@ This repository follows PythonVibes best practices:
 - **Linting**: Flake8 (`flake8 .`)
 - **Type Checking**: MyPy (`mypy src/`)
 - **Testing**: Pytest (`pytest`)
 - **Pre-commit**: `pip install pre-commit && pre-commit install` (see `.pre-commit-config.yaml`)
 - **CI**: Gitea Actions workflow `.gitea/workflows/ci.yml` runs on push/PR to `main`
 ### Agent Development Standards
 For contributing new agents or improving existing ones:
@@ -114,6 +124,17 @@ We welcome various types of contributions:
 - **Testing**: New tests, test improvements, bug reports
 - **Performance**: Optimization improvements and measurements
 ## Feedback
 We welcome bugs, feature ideas, and adoption experience reports.
 - **CLI:** `kaizen-agentic feedback` — lists channels and issue templates
 - **Guide:** [docs/FEEDBACK.md](docs/FEEDBACK.md)
 - **Templates:** `.gitea/ISSUE_TEMPLATE/` (bug, feature, general feedback)
 For cross-repo coordination between custodian agents, use State Hub messages
 (`POST /messages/`) — see session protocol in `.claude/rules/session-protocol.md`.
 ## Issue Reporting
 When reporting bugs, please include:
--- a/INTENT.md
+++ b/INTENT.md
@@ -0,0 +1,85 @@
 # INTENT
 ## Purpose
 This repository exists to define and evolve **KaizenAgentic**: a framework and product concept for turning AI coding agents from static tools into continuously improving digital talents.
 KaizenAgentic applies the principle of kaizen — continuous improvement through small, measurable, compounding refinements — to agent design, coding workflows, codebase quality, and agent optimization. It provides the concepts, templates, guidance, and business framing needed to build agents that can be observed, evaluated, refined, and improved over time.
 ## Primary Utility
 The primary utility of this repository is to serve as the conceptual and operational seed for a **digital talent agency for AI coding agents**.
 It should help define:
 * how Kaizen agents are specified,
 * how their performance is measured,
 * how agent behavior is improved through feedback loops,
 * how codebase improvement guidance can be made human-readable, machine-checkable, and agent-executable,
 * how reusable capabilities, prompt practices, and improvement programs compound into better software development workflows.
 ## Intended Users
 This repository is intended for:
 * builders of AI coding agents,
 * developers using Claude, Cursor, or similar coding assistant environments,
 * teams that want coding agents to improve with real-world use,
 * maintainers who want code quality guidance that can be checked, refactored, tested, and measured,
 * product and business designers shaping KaizenAgentic as a service or platform.
 ## Strategic Role in the System
 KaizenAgentic plays the role of a **meta-improvement layer** for agentic software development.
 It is not merely a collection of prompts or coding assistants. It defines a system in which agents become measurable, versioned, testable, and optimizable units of digital work. Subagents perform specific tasks, while optimization logic observes their performance and proposes evidence-based refinements.
 The repository should become the place where the core language, principles, templates, and operating model for this improvement loop are stabilized.
 ## Strategic Boundaries
 This repository should own:
 * the KaizenAgentic mission and conceptual model,
 * the KaizenAgent definition template,
 * the meta-optimizer concept,
 * guidance for measurable and idempotent agent behavior,
 * the codebase improvement guidance model,
 * the relationship between prompts, experiments, mantras, agents, and reusable capabilities,
 * the initial product, pricing, revenue, and brand framing.
 This repository should not own:
 * all concrete implementations of individual agents,
 * customer-specific agent configurations,
 * vendor-specific integrations beyond reference patterns,
 * the complete runtime platform for executing agents,
 * unrelated generic AI automation concepts that do not contribute to measurable continuous improvement,
 * codebase-specific gameplans except as examples or templates.
 ## Design Principles
 * **Continuous Improvement**: Every agent, guide, and workflow should be designed to improve through repeated use.
 * **Measurable by Default**: Success criteria, metrics, and before/after deltas should be part of every meaningful agent or guidance definition.
 * **Idempotent Operations**: Agent actions should converge toward desired states and remain safe to repeat.
 * **Evidence over Intuition**: Improvements should be based on observed performance, tests, metrics, and explicit feedback.
 * **Separation of Concerns**: Task-specific agents, measurement logic, optimization logic, and business framing should remain distinguishable.
 * **Composable Capabilities**: Reusable units should package intent, interfaces, knowledge, and operational behavior, not just code.
 * **Human-Readable and Machine-Executable**: Guidance should be understandable by humans while also being checkable by tools and executable by agents.
 * **Rollback-Ready Evolution**: Agent specifications and improvements should be versioned, testable, and reversible.
 * **Compounding Value**: Small, durable improvements should accumulate into stronger agents, cleaner codebases, and better development workflows.
 ## Maturity Target
 The repository should mature into the canonical reference for the KaizenAgentic operating model.
 At maturity, it should provide enough structure for a team to define, deploy, measure, refine, and commercialize AI coding agents as continuously improving digital talents. It should support both practical implementation and strategic communication: useful to developers, agent designers, product owners, and early customers.
 ## Stability Note
 `INTENT.md` describes the stable purpose and strategic role of the repository.
 Changes to this file should represent a deliberate shift in what KaizenAgentic is meant to become, not ordinary scope evolution. Concrete implementation plans, product details, agent specifications, and experiments should live in PRDs, gameplans, templates, guidance documents, or implementation repositories.
--- a/Makefile
+++ b/Makefile
@@ -567,11 +567,19 @@ format: $(VENV)/bin/activate
 clean:
 	@echo "🧹 Cleaning build artifacts and cache..."
 	@rm -rf build/ dist/ *.egg-info/ .pytest_cache/ __pycache__/ .coverage htmlcov/
 	@rm -rf agents_backup_*/
 	@find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null || true
 	@find . -type f -name "*.pyc" -delete 2>/dev/null || true
 	@find . -type f -name "*.pyo" -delete 2>/dev/null || true
 	@echo "✅ Cleanup completed"
 # Install pre-commit hooks (WP-0001 T02)
 pre-commit-install: $(VENV)/bin/activate
 	@echo "🔧 Installing pre-commit hooks..."
 	@$(VENV_PIP) install pre-commit
 	@$(VENV)/bin/pre-commit install
 	@echo "✅ pre-commit installed — run 'pre-commit run --all-files' to verify"
 # ============================================================================
 # Standards Compliance Targets
 # ============================================================================
--- a/README.md
+++ b/README.md
@@ -89,9 +89,24 @@ kaizen-agentic memory show project-management
 See [docs/agency-framework.md](docs/agency-framework.md) for the full model.
 ## Orientation
 Read in this order for strategic context:
 1. [INTENT.md](INTENT.md) — purpose, boundaries, design principles
 2. [wiki/KaizenAgenticMission.md](wiki/KaizenAgenticMission.md) — product narrative
 3. [wiki/AboutKaizenAgents.md](wiki/AboutKaizenAgents.md) — agent concepts and metrics pilot
 4. [wiki/EcosystemIntegration.md](wiki/EcosystemIntegration.md) — ecosystem composition
 5. [SCOPE.md](SCOPE.md) — repository boundaries and current state
 6. [history/](history/) — persisted assessments and gap analyses
 Released **v1.1.0** — see [CHANGELOG.md](CHANGELOG.md). Workplans: WP-0001 through WP-0004 completed.
 Feedback: `kaizen-agentic feedback` · [docs/FEEDBACK.md](docs/FEEDBACK.md)
 ## Features
- **18 Specialized Agents**: Project management, testing, code quality, infrastructure, meta
+- **20 Specialized Agents**: Project management, testing, code quality, infrastructure, meta
 - **Agency Framework**: Project-scoped agent memory + Coach meta-agent for cross-agent synthesis
 - **CLI Tool**: Easy agent installation, management, and memory commands (`kaizen-agentic`)
 - **Project Templates**: Pre-configured setups for different project types
@@ -147,4 +162,4 @@ kaizen-agentic templates
 The CLI currently implements a workaround for spurious error messages in the Click library. This affects the `install` command but is transparent to users. See [CLICK_WORKAROUND.md](CLICK_WORKAROUND.md) for technical details and removal timeline.
 **User Impact**: None - the workaround provides clean CLI output
-**Status**: Monitoring Click library updates for resolution
+**Status**: Monitoring Click library updates for resolution
--- a/SCOPE.md
+++ b/SCOPE.md
@@ -3,92 +3,164 @@
 > This file helps you quickly understand what this repository is about,
 > when it is relevant, and when it is not.
 > It is intentionally lightweight and may be incomplete.
 > For strategic purpose and boundaries, see `INTENT.md`.
 ---
 ## One-liner
-AI agent development framework providing specialized agent personas (markdown instruction sets) and CLI scaffolding tools for embedding domain expertise into Claude Code sessions.
+KaizenAgentic: a digital talent agency framework — agent personas, project memory, measurable improvement loops, and CLI tooling for deploying continuously refining AI coding agents into Claude Code sessions.
 ---
 ## Core Idea
-Kaizen-agentic makes recurring development workflows (TDD, refactoring, project management, documentation) first-class by packaging them as named agent personas. You invoke an agent by name, load its instruction set, and follow it — the agent defines the workflow, Claude Code executes it. The "kaizen" (continuous improvement) philosophy means agents are refined based on performance over time.
+This repo is the canonical home for the **KaizenAgentic** operating model (`INTENT.md`, `wiki/`). It packages recurring development workflows as named agent personas invoked in Claude Code. The **agency layer** adds project-scoped memory (`.kaizen/agents/<name>/memory.md`) and a **Coach** meta-agent for cross-agent orientation. The **kaizen loop** — measure, analyse, refine — is defined in `wiki/` and partially implemented: `OptimizationLoop` exists in Python, but per-execution metrics collection and optimizer integration are in progress (WP-0003). Runtime execution remains Claude Code's responsibility.
 ---
 ## In Scope
- 17+ agent definition files (`agents/agent-*.md`) — markdown persona instruction sets
+- **Strategic framing**: `INTENT.md` (purpose, boundaries, design principles) and `wiki/` (mission, agent template, guidance model, brand/pricing)
- Agent categories: testing, quality, process, infrastructure, release, documentation
+- **20 agent definitions** (`agents/agent-*.md`) — markdown persona instruction sets with YAML frontmatter (reference fleet; see `INTENT.md` boundaries)
- CLI tooling: `kaizen-agentic init/install/status` for project scaffolding
+- **Agent categories**: project-management, development-process, code-quality, infrastructure, testing, documentation, meta
- Project templates (python-basic, python-web, python-cli, python-data, comprehensive)
+- **Agency framework**: project memory convention (ADR-002), session-start/close protocols, Coach meta-agent (`agent-coach.md`)
- Python framework: `Agent` base class, `AgentConfig` dataclass, `OptimizationLoop` for performance tracking
+- **Protocol runbooks** (`agents/protocols/<agent>/<slug>.md`) — procedural checklists distinct from agent prompts
- Custodian MCP integration: `list_kaizen_agents()` and `get_kaizen_agent()` tools
+- **CLI tooling** (`kaizen-agentic`): `init`, `install`, `update`, `remove`, `list`, `status`, `validate`, `templates`, `detect`, `migrate`, `extensions`, `memory` (show/init/brief/clear), `protocols` (list/show); `metrics` commands planned in WP-0003
 - **Project templates** (python-basic, python-web, python-cli, python-data, comprehensive) — agent bundles in registry code
 - **Python framework** (`src/kaizen_agentic/`): `Agent`/`AgentConfig`, `AgentRegistry`, `AgentInstaller`, `OptimizationLoop`/`PerformanceMetrics`, detection/migration/extensions
 - **Packaged agent data** (`src/kaizen_agentic/data/agents/`) — 16 agents bundled for pip installs (lags `agents/` by 4; see Notes)
 - **Custodian MCP integration** (owned by `the-custodian`): `list_kaizen_agents()` and `get_kaizen_agent()`
 - **ADRs and workplans** for memory, protocols, workplan, and metrics conventions
 ---
 ## Out of Scope
- Agent runtime / execution engine (agents are persona definitions; execution is Claude Code's responsibility)
+- Agent runtime / execution engine (agents are persona definitions; Claude Code executes them)
- LLM orchestration or multi-agent debate systems
+- LLM orchestration, scheduling, or multi-agent debate systems
- Project-specific implementation (agents guide; they do not build the software)
+- Project-specific implementation (agents guide work; they do not build the target software)
- Commercial features or PyPI distribution (pre-v1.0)
+- Custodian State Hub, MCP server code, or cross-domain governance (consumed, not owned)
 - Full KaizenGuidance codemod pipeline (vision in `wiki/KaizenGuidance.md`; not yet implemented)
 - PyPI publication pipeline (v1.0.2 released locally; public PyPI distribution still pending)
 ---
 ## Relevant When
- Starting a guided development workflow (TDD, refactoring, testing, requirements)
+- Understanding **why** KaizenAgentic exists and what it must not become (`INTENT.md`)
- Scaffolding a new project with consistent structure and best-practice tooling
+- Exploring the conceptual model: agent template, optimizer, guidance, composable capabilities (`wiki/`)
- Looking up what specialized agent personas are available for a domain session
+- Starting a guided development workflow (TDD, refactoring, testing, requirements, scope analysis)
- Contributing a new agent persona to the ecosystem
+- Deploying agents with persistent cross-session memory or Coach-mediated orientation
 - Scaffolding projects with agent bundles; looking up personas via CLI or Custodian MCP
 - Contributing agent personas, protocol runbooks, or improvement-loop conventions
 ---
 ## Not Relevant When
- Ad-hoc, one-off scripting with no need for structured guidance
+- Ad-hoc scripting with no need for structured agent guidance
- Non-Claude-Code development environments
+- Non-Claude-Code development environments (primary target; patterns may transfer)
- Need for runtime orchestration or scheduling (not a scheduler)
+- Need for runtime orchestration, task scheduling, or autonomous agent execution
 - Repository capability profiling or SCOPE.md generation at scale (see `repo-scoping`)
 ---
 ## Current State
- Status: experimental → stabilizing (v1.0.2 released)
+- Status: experimental → stabilizing (v1.0.2; agency framework shipped in WP-0002)
- Implementation: ~85% — 17 agents defined, CLI functional, templates working; optimization loop pattern established but not exercised at scale
+- Strategic layer: `INTENT.md` and `wiki/` established; orientation docs not yet fully linked
- Stability: stable CLI and agent loading
+- Implementation: substantial — 20 agents, full CLI, agency memory + protocols tested e2e; **measurement loop not closed** (no `.kaizen/metrics/`, optimizer unwired)
- Usage: installed in dev projects; agents callable via Custodian MCP hub-wide
+- Stability: CLI stable (Click workaround in place); agency framework validated by e2e tests
 - Usage: internal dev projects and Custodian MCP hub-wide; packaged wheel missing 4 newest agents
 - Active work: **WP-0003** (measurement loop); **WP-0004** (ecosystem integration); WP-0001 (community engagement / v1.1.0) pending
 ---
 ## How It Fits
- Upstream dependencies: Claude Code (agent invocation), kaizen philosophy
+- Upstream dependencies: Claude Code (agent invocation), kaizen continuous-improvement philosophy
- Downstream consumers: Custodian State Hub (loads agents via MCP); all six domains (teams use agents for guided workflows)
+- Downstream consumers: Custodian State Hub (MCP agent discovery); domain repos that install agents and maintain `.kaizen/` state
- Often used with: the-custodian (MCP integration), markitect_project (project-management agent), activity-core (scaffolding)
+- Often used with: `the-custodian` (MCP integration), `markitect_project` (project-management patterns), `activity-core` (scaffolding references), `repo-scoping` (SCOPE.md generation)
 ---
 ## Terminology
- Preferred terms: agent, agent persona, AgentConfig, project template
+- Preferred terms: KaizenAgentic (product), agent, agent persona, agency, project memory, protocol runbook, Coach, kaizen loop
- Also known as: "kaizen agents", "the agent library"
+- Also known as: "kaizen agents", "kaizen-agentic" (repo/package slug), "the agent library"
- Potentially confusing terms: "Agent" here is a persona/instruction set, not a running process
+- Potentially confusing terms: "Agent" is a persona/instruction set, not a running process; "agency" means memory + coaching, not autonomous orchestration; repo slug `kaizen-agentic` vs product name `KaizenAgentic`
 ---
 ## Related / Overlapping Repositories
- `the-custodian` — hosts MCP tools that load agents; custodian agent copies live in `the-custodian/agents/`
+- `the-custodian` — hosts MCP tools that load agents; integration code lives there, not here
 - `repo-scoping` — generates/refreshes SCOPE.md from approved characteristics
 - `markitect_project` — references kaizen-agentic as a capability submodule
 - `sys-medic` (source repo) — origin of sys-medic agent; canonical copy in `agents/agent-sys-medic.md`
 ---
 ## Getting Oriented
- Start with: `README.md` (quick start, agent list, installation)
+Read in this order for full context:
- Key files / directories: `agents/` (all persona definitions), `src/kaizen_agentic/` (Python framework), `templates/` (project scaffolds)
+
- Entry points: `kaizen-agentic --help`; or via MCP: `get_kaizen_agent("scope-analyst")`
+1. `INTENT.md` — stable purpose, boundaries, design principles
 2. `wiki/KaizenAgenticMission.md` — product narrative and key components
 3. `wiki/EcosystemIntegration.md` — how KaizenAgentic composes with adjacent repos
 4. `wiki/KaizenAgentTemplate.md` — intended agent specification format
 5. `README.md` — quick start and agency overview
 6. `docs/agency-framework.md` — memory, coach, protocols, metrics (ADR-004)
 7. `history/` — persisted assessments and gap analyses
 8. `workplans/` — active implementation roadmap
 Key directories: `wiki/` (conceptual model), `agents/` (personas), `agents/protocols/` (runbooks), `src/kaizen_agentic/` (Python framework), `docs/adr/` (conventions)
 Entry points: `kaizen-agentic --help`; MCP: `get_kaizen_agent("scope-analyst")`; docs: `docs/GETTING_STARTED.md`, `docs/AGENT_DISTRIBUTION.md`
 ---
 ## Provided Capabilities
 ```capability
 type: process
 title: Guided development agent personas
 description: Named markdown instruction sets for TDD, refactoring, documentation standards, requirements engineering, and project management workflows in Claude Code sessions.
 keywords: [agents, personas, tdd, refactoring, claude-code, workflows]
 ```
 ```capability
 type: infrastructure
 title: Agent deployment and project scaffolding CLI
 description: Install, update, validate, and bundle agents into new or existing projects via the kaizen-agentic CLI and registry-backed templates.
 keywords: [cli, install, templates, scaffolding, registry]
 ```
 ```capability
 type: process
 title: Project-scoped agent memory and coaching
 description: Convention and CLI for .kaizen/agents memory files, session protocols, and Coach-mediated orientation briefs across a deployed agent fleet.
 keywords: [memory, coach, agency, kaizen, cross-session]
 ```
 ```capability
 type: infrastructure
 title: Kaizen agent discovery via Custodian MCP
 description: Single source of truth for agent definitions consumed by the Custodian State Hub list_kaizen_agents and get_kaizen_agent tools.
 keywords: [mcp, custodian, discovery, agent-library]
 ```
 ```capability
 type: process
 title: KaizenAgentic conceptual model and agent specification standards
 description: Strategic framing, design principles, agent template, optimizer spec, and improvement philosophy via INTENT.md and wiki/.
 keywords: [kaizen, intent, template, optimization, digital-talent-agency]
 ```
 ---
 ## Notes
 - `agents/` (20 files) is the development source of truth; `src/kaizen_agentic/data/agents/` (16 files) is what pip installs ship — coach, sys-medic, scope-analyst, and optimization are not yet bundled
 - Agent definitions use minimal frontmatter today; full `wiki/KaizenAgentTemplate.md` conformance is a maturity target, not current reality
--- a/agents/agent-coach.md
+++ b/agents/agent-coach.md
@@ -83,6 +83,24 @@ root. Each follows ADR-002 structure:
 When synthesising, weight `## Watch Points` and `## Open Threads` most heavily —
 these are the signals most likely to be actionable for another agent.
 ### Project metrics (ADR-004)
 Quantitative performance data lives at `.kaizen/metrics/<agent>/summary.json`.
 `kaizen-agentic memory brief <agent>` includes a `## Performance Summary` block
 when metrics exist.
 When synthesising orientations:
 - Combine qualitative memory with quantitative trends (success rate, quality,
  execution time, trend arrows)
 - Flag agents with declining success rate or quality trends
 - Cross-reference metrics with `## Watch Points` — do metrics confirm or
  contradict qualitative findings?
 - Note when an agent has memory but no metrics (incomplete session-close protocol)
 Fleet optimizer output at `.kaizen/metrics/optimizer/analysis.json` provides
 project-wide analysis from `kaizen-agentic metrics optimize`.
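
The synthesis checks listed above can be sketched as a small function. This is an illustration only: the schema of `summary.json` is not shown in this document, so the keys `success_rate_trend` and `quality_trend`, the value `"declining"`, and the function name `coach_flags` are all assumptions.

```python
from typing import Dict, List


def coach_flags(
    agents_with_memory: List[str], summaries: Dict[str, dict]
) -> Dict[str, List[str]]:
    """Flag agents whose metrics decline or are missing.

    `summaries` maps agent name to an already-loaded summary dict.
    The trend keys below are hypothetical, not the ADR-004 schema.
    """
    flags: Dict[str, List[str]] = {}
    for agent in sorted(agents_with_memory):
        summary = summaries.get(agent)
        notes: List[str] = []
        if summary is None:
            # Memory exists but no metrics: session-close protocol incomplete.
            notes.append("memory but no metrics")
        else:
            for key in ("success_rate_trend", "quality_trend"):
                if summary.get(key) == "declining":
                    notes.append(f"{key} declining")
        if notes:
            flags[agent] = notes
    return flags
```

Agents with healthy or improving trends are left out of the result, so the output contains only what needs attention in an orientation brief.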
 ---
 ## Output Format
@@ -115,6 +133,9 @@ Project: <project name>
 Generated: <date>
 Sources: <which agent memories were read>
 ### Performance Summary
 <from .kaizen/metrics/<agent>/ when available — success rate, quality, trends>
 ### What to Know First
 <3–5 most important facts for this agent>
--- a/agents/agent-optimization.md
+++ b/agents/agent-optimization.md
@@ -2,7 +2,8 @@
 name: optimization
 description: Meta-agent that analyzes and optimizes other Claude Code subagents based on their performance data, usage patterns, and effectiveness metrics. Use PROACTIVELY for agent ecosystem improvement.
 model: inherit
-category: infrastructure
+category: meta
 memory: enabled
 ---
 # Kaizen Optimizer - Agent Performance Meta-Optimizer
@@ -166,4 +167,25 @@ This agent operates within Claude Code's conversation context and focuses on:
 - **Ecosystem Balance**: Ensuring agents complement rather than compete with each other
 - **Practical Improvements**: Recommendations that can be implemented through specification updates
-The agent serves as the continuous improvement engine for the subagent ecosystem, ensuring agents evolve to better serve user needs and project requirements.
+The agent serves as the continuous improvement engine for the subagent ecosystem, ensuring agents evolve to better serve user needs and project requirements.
 ## Session Start
 1. Check for `.kaizen/agents/optimization/memory.md` in the project root.
 2. If present, read it before beginning analysis.
 3. Review `.kaizen/metrics/optimizer/analysis.json` if it exists for the latest fleet report.
 ## Session Close
 1. When analysis completes, note key findings in `## Accumulated Findings`.
 2. Append one line to `## Session Log`: `YYYY-MM-DD · <agents reviewed> · <outcome>`.
 3. Bump `last_updated` and increment `session_count`.
 4. Persist quantitative analysis via CLI (ADR-004):
 ```bash
 kaizen-agentic metrics optimize [agent-name]
 ```
 Run without an agent name to analyze every agent that has project metrics.
 Actionable recommendations require ≥10 execution records per agent (see
 `wiki/AgentKaizenOptimizer.md`).
--- a/agents/agent-scope-analyst.md
+++ b/agents/agent-scope-analyst.md
@@ -309,6 +309,23 @@ Use this structure when creating or rewriting SCOPE.md:
 ---
 ## Provided Capabilities
 <!-- What can this repo's domain provide to other domains on request? -->
 <!-- Each capability block is parsed by the state-hub capability catalog ingest. -->
 <!-- Remove the examples and add your own, or leave empty if none. -->
 <!--
 ```capability
 type: infrastructure
 title: Example capability title
 description: What this capability provides, in one or two sentences.
 keywords: [keyword1, keyword2, keyword3]
 ```
 -->
 ---
 ## Notes
 <!-- Anything else worth knowing. Keep it short. -->
--- a/agents/agent-tdd-workflow.md
+++ b/agents/agent-tdd-workflow.md
@@ -2,6 +2,21 @@
 name: tdd-workflow
 description: Expert guidance for the TDD8 workflow methodology, specializing in the comprehensive ISSUE-TEST-RED-GREEN-REFACTOR-DOCUMENT-REFINE-PUBLISH cycle with sophisticated sidequest management and proper test organization.
 category: development-process
 memory: enabled
 metrics:
  primary:
    name: test_pass_rate
    description: Share of acceptance-criteria tests passing at PUBLISH
    measurement: passing_tests / total_tests for the active issue workspace
    target: 1.0
  secondary:
    - name: cycle_time_s
      description: Wall-clock time from ISSUE start to PUBLISH
      measurement: Session duration in seconds (execution_time_s in ADR-004)
  collection:
    frequency: per_execution
    storage: .kaizen/metrics/tdd-workflow/
    retention: 180d
 ---
 # TDDAi Assistant Agent
@@ -372,3 +387,20 @@ The comprehensive 8-step development methodology that transforms requirements in
 2. Update `## What Worked` and `## Watch Points` as needed.
 3. Append one line to `## Session Log`: `YYYY-MM-DD · <issue or feature> · <outcome>`.
 4. Bump `last_updated` to today and increment `session_count`.
 5. Record session metrics (ADR-004; adjust values to match outcome):
 ```bash
 # Successful PUBLISH — all acceptance tests green:
 echo '{"success": true, "execution_time_s": <seconds>, "quality_score": 0.9, "primary_metric": {"name": "test_pass_rate", "value": 1.0, "target": 1.0}, "metadata": {"issue": "<NUM>", "phase": "PUBLISH"}}' \
  | kaizen-agentic metrics record tdd-workflow --json --idempotency-key <session-id>
 # Incomplete or failed cycle:
 echo '{"success": false, "execution_time_s": <seconds>, "quality_score": 0.4, "primary_metric": {"name": "test_pass_rate", "value": <rate>, "target": 1.0}, "metadata": {"issue": "<NUM>", "phase": "<last-phase>"}}' \
  | kaizen-agentic metrics record tdd-workflow --json --idempotency-key <session-id>
 ```
 Shorthand when only outcome and duration matter:
 ```bash
 kaizen-agentic metrics record tdd-workflow --success --time <seconds> --quality <0.0-1.0>
 ```
--- a/docs/CLI_CHEAT_SHEET.md
+++ b/docs/CLI_CHEAT_SHEET.md
@@ -48,6 +48,39 @@ kaizen-agentic status                    # Show current project status
 kaizen-agentic validate                  # Validate agent installation
 ```
 ### Project Metrics (ADR-004)
 ```bash
 # Record outcome at session close
 kaizen-agentic metrics record tdd-workflow --success --time 120 --quality 0.9
 kaizen-agentic metrics record tdd-workflow --failure --time 45
 # Full JSON record from stdin
 echo '{"success": true, "quality_score": 1.0}' | kaizen-agentic metrics record tdd-workflow --json
 # Inspect metrics
 kaizen-agentic metrics show tdd-workflow
 kaizen-agentic metrics list
 kaizen-agentic metrics export tdd-workflow
 kaizen-agentic metrics optimize tdd-workflow   # analyze one agent (≥10 records)
 kaizen-agentic metrics optimize                # analyze all agents with metrics
 # Helix Forge correlation (fleet layer — agentic-resources)
 export HELIX_SESSION_UID="claude:<native-id>"
 kaizen-agentic metrics record tdd-workflow --success --time 120 --quality 0.9
 kaizen-agentic metrics correlate claude:<native-id>   # needs HELIX_STORE_DB
 # Publish optimizer evidence to artifact-store (optional)
 export ARTIFACTSTORE_API_URL=http://127.0.0.1:8000
 export ARTIFACTSTORE_API_TOKEN=<token>
 kaizen-agentic metrics publish
 # Scaffold memory + metrics together
 kaizen-agentic memory init tdd-workflow
 kaizen-agentic memory init tdd-workflow --no-metrics   # memory only
 ```
 Session-close template: `docs/templates/session-close-protocol.md`
 ### Information
 ```bash
 # List templates
--- a/docs/FEEDBACK.md
+++ b/docs/FEEDBACK.md
@@ -0,0 +1,41 @@
 # Feedback
 How to share bugs, ideas, and adoption experience for kaizen-agentic.
 ## Quick channels
 | Channel | Use for |
 |---------|---------|
 | **Gitea Issues** | Bugs, features, general feedback (templates below) |
 | **`kaizen-agentic feedback`** | Print links and template guidance from the CLI |
 | **Pull requests** | Code and agent-definition contributions (see CONTRIBUTING.md) |
 | **State Hub messages** | Cross-repo coordination between custodian agents (advanced) |
 ## Gitea issue templates
 Choose a template when opening a new issue:
 - **Bug report** — reproducible defects
 - **Feature request** — enhancements with proposed scope
 - **General feedback** — experience and adoption notes
 Repository: [coulomb/kaizen-agentic](https://gitea.coulomb.social/coulomb/kaizen-agentic/issues)
 ## CLI
 ```bash
 kaizen-agentic feedback          # human-readable channel list
 kaizen-agentic feedback --json   # machine-readable for tooling
 ```
 ## What helps us most
 - Python version and `kaizen-agentic --version`
 - Minimal reproduction steps for bugs
 - Which agents you used and whether memory/metrics were enabled
 - For integration issues: whether artifact-store, Helix Forge, or activity-core is involved
 ## Privacy
 Do not include secrets, tokens, or private project content in public issues. Redact
 `.kaizen/` memory contents unless you intentionally share sanitized examples.
--- a/docs/INTEGRATION_PATTERNS.md
+++ b/docs/INTEGRATION_PATTERNS.md
@@ -1,401 +1,105 @@
-# Integration Patterns for Existing Projects
+# Integration Patterns
-This guide documents proven patterns for integrating Kaizen Agentic agents into existing projects that already have agent systems.
+How kaizen-agentic composes with ecosystem repos **by contract** — no merged
 codebases, no duplicated capabilities.
-## Overview
+Reference: [wiki/EcosystemIntegration.md](../wiki/EcosystemIntegration.md),
 [KAIZEN-WP-0004](../workplans/kaizen-agentic-WP-0004-ecosystem-integration.md).
-When introducing Kaizen agents to existing projects, you'll encounter various scenarios that require different integration approaches. This guide provides tested patterns and strategies.
+---
-## Integration Scenarios
+## Pattern 1 — Helix Forge correlation (agentic-resources)
-### Scenario 1: Clean Integration (No Existing Agents)
+**Problem:** Project metrics and fleet session metrics answer different questions.
-**When to use**: Project has no existing agent systems.
+**Contract:** Optional `helix_session_uid` on ADR-004 execution records.
 | kaizen-agentic | agentic-resources |
 |----------------|-------------------|
 | `metrics record` at session close | Helix capture → digest store |
 | `metrics correlate <uid>` read-only lookup | `Store.get_digest(session_uid)` |
 | `HELIX_SESSION_UID` env auto-merge | `Session.session_uid` |
 **Docs:** [integrations/helix-forge-correlation.md](integrations/helix-forge-correlation.md)
 **Boundary:** kaizen-agentic does not ingest session JSONL.
 ---
 ## Pattern 2 — activity-core triggers
 **Problem:** Recurring kaizen checks need scheduling without custom cron in this repo.
 **Contract:** ActivityDefinition markdown files declare triggers + actions that
 invoke kaizen-agentic CLI commands.
 | Definition | Trigger | CLI command |
 |------------|---------|-------------|
 | [weekly-metrics-optimize](integrations/activity-definitions/weekly-metrics-optimize.md) | Cron Mon 08:00 | `metrics optimize` |
 | [post-install-metrics-scaffold](integrations/activity-definitions/post-install-metrics-scaffold.md) | `kaizen.agent.installed` | `memory init` validation |
 | [low-success-rate-review](integrations/activity-definitions/low-success-rate-review.md) | `kaizen.metrics.recorded` | `metrics show` + `optimize` |
 **Activation:**
 1. Copy or symlink definitions from `docs/integrations/activity-definitions/` into
   activity-core's `activity-definitions/` tree (or register as external ConfigMap).
 2. Run `make sync-activity-definitions` in activity-core.
 3. Enable definitions (`enabled: true`) after resolver wiring is verified.
 **Smoke test (manual):**
 **Pattern**: Direct installation
 ```bash
-kaizen-agentic init . --agents keepaTodofile,keepaChangelog,tdd-workflow
+# Against a repo with populated metrics
 cd /path/to/project-with-kaizen
 kaizen-agentic metrics list
 kaizen-agentic metrics optimize
 # Verify analysis.json written
 test -f .kaizen/metrics/optimizer/analysis.json && echo OK
 ```
-**Benefits**:
+**Boundary:** kaizen-agentic does not run Temporal schedules.
 - Straightforward setup
 - No conflicts to resolve
 - Full Kaizen agent functionality
-### Scenario 2: Claude Code Integration
+---
-**When to use**: Project already uses Claude Code with CLAUDE.md.
+## Pattern 3 — artifact-store evidence retention
 **Problem:** Optimizer outputs need durable, attributable retention beyond local disk.
 **Contract:** `metrics publish` registers `analysis.json` + `recommendations.jsonl`
 as an artifact package with `retention_class: raw-evidence`.
 **Pattern**: Respectful coexistence
 ```bash
-# 1. Detect existing setup
+export ARTIFACTSTORE_API_URL=http://127.0.0.1:8000
-kaizen-agentic detect
+export ARTIFACTSTORE_API_TOKEN=<token>
-
+kaizen-agentic metrics optimize
-# 2. Install compatible agents
+kaizen-agentic metrics publish --target .
 kaizen-agentic install keepaTodofile keepaChangelog
 # 3. Update CLAUDE.md with new agent references
 ```
-**Considerations**:
+**Manifest:** [integrations/optimizer-artifact-manifest.md](integrations/optimizer-artifact-manifest.md)
 - Preserve existing CLAUDE.md content
 - Add Kaizen agent references to existing documentation
 - Maintain Claude Code workflow compatibility
-### Scenario 3: Custom Agent Replacement
+**Boundary:** Publish is optional; local `.kaizen/metrics/optimizer/` remains canonical.
-**When to use**: Project has custom agents that overlap with Kaizen functionality.
+---
-**Pattern**: Gradual migration with backup
+## Pattern 4 — Canon and knowledge (stretch)
 ```bash
 # 1. Analyze existing agents
 kaizen-agentic detect --detailed
-# 2. Create migration plan
+Design-only paths for info-tech-canon and kontextual-engine:
 kaizen-agentic migrate --dry-run
-# 3. Execute migration with backup
+- [integrations/canon-template-mapping.md](integrations/canon-template-mapping.md)
-kaizen-agentic migrate
+- [integrations/briefs/tdd-workflow-canon-brief.md](integrations/briefs/tdd-workflow-canon-brief.md)
-```
+- [integrations/kontextual-wiki-ingestion-spike.md](integrations/kontextual-wiki-ingestion-spike.md)
-**Steps**:
+No runtime dependency in WP-0004.
 1. **Backup** existing agents
 2. **Map** custom agents to Kaizen equivalents
 3. **Migrate** functionality to extensions
 4. **Test** new agent workflow
 5. **Archive** old agents after verification
-### Scenario 4: Hybrid Coexistence
+---
-**When to use**: Project has essential custom agents that cannot be replaced.
+## Environment variables
-**Pattern**: Namespace separation
+| Variable | Used by | Purpose |
-```bash
+|----------|---------|---------|
-# 1. Install Kaizen agents in parallel
+| `HELIX_SESSION_UID` | `metrics record` | Fleet session correlation |
-kaizen-agentic install keepaTodofile --target agents/kaizen/
+| `HELIX_REPO`, `HELIX_FLAVOR` | `metrics record` | Session context |
-
+| `HELIX_TOKENS`, `HELIX_INFRA_OVERHEAD_SHARE` | `metrics record` | Fleet cost fields |
-# 2. Keep custom agents in separate directory
+| `HELIX_STORE_DB` | `metrics correlate` | Digest lookup database |
-# agents/custom/todo_manager.py
+| `ARTIFACTSTORE_API_URL` | `metrics publish` | Registry endpoint |
-# agents/kaizen/agent-keepaTodofile.md
+| `ARTIFACTSTORE_API_TOKEN` | `metrics publish` | Write auth bearer token |
 # 3. Create integration extensions
 kaizen-agentic extensions create custom-integration keepaTodofile
 ```
 **Directory Structure**:
 ```
 project/
 ├── agents/
 │   ├── custom/           # Existing custom agents
 │   │   ├── todo_manager.py
 │   │   └── code_reviewer.py
 │   └── kaizen/           # Kaizen agents
 │       ├── agent-keepaTodofile.md
 │       └── agent-code-refactoring.md
 ├── .kaizen/
 │   └── extensions/       # Integration extensions
 └── CLAUDE.md             # Updated configuration
 ```
 ### Scenario 5: Extension-Based Integration
 **When to use**: Custom agents have unique functionality that should be preserved.
 **Pattern**: Extend Kaizen agents with custom functionality
 ```bash
 # 1. Create project-specific extension
 kaizen-agentic extensions create project-todo keepaTodofile \
  --description "TODO manager with custom workflow integration"
 # 2. Configure custom behavior
 # Edit .kaizen/extensions/project-todo/extension.yml
 # 3. Migrate custom logic to extension
 ```
 **Extension Configuration Example**:
 ```yaml
 name: project-todo
 base_agent: keepaTodofile
 extension_type: functional_extension
 description: "TODO manager with custom workflow integration"
 configuration:
  custom_instructions: |
    Follow our project-specific TODO format:
    - Use JIRA ticket references
    - Include priority levels (P0-P3)
    - Auto-assign based on component
 custom_commands:
  create-epic: "Create epic-level TODO items"
  sync-jira: "Synchronize with JIRA tickets"
  priority-report: "Generate priority-based reports"
 environment_overrides:
  JIRA_URL: "https://company.atlassian.net"
  TODO_FORMAT: "custom"
 ```
 ## Conflict Resolution Patterns
 ### Name Conflicts
 **Problem**: Multiple agents with the same name.
 **Pattern**: Rename with suffix
 ```bash
 # Automatic resolution
 todo_manager -> todo_manager_custom
 keepaTodofile -> keepaTodofile (Kaizen agent)
 ```
 **Implementation**:
 - Add `_custom` suffix to project-specific agents
 - Update references in scripts and documentation
 - Create aliases for backward compatibility
 ### Functional Overlaps
 **Problem**: Multiple agents perform similar functions.
 **Pattern**: Choose primary, extend secondary
 ```bash
 # Primary: Kaizen agent (standardized)
 # Secondary: Custom agent -> extension
 # Example: Both have TODO management
 # Decision: Use keepaTodofile as primary
 #           Convert custom logic to extension
 ```
 **Decision Matrix**:
 | Factor | Choose Kaizen | Choose Custom | Create Extension |
 |--------|---------------|---------------|------------------|
 | Standard functionality | ✅ | ❌ | ✅ |
 | Custom business logic | ❌ | ✅ | ✅ |
 | Maintenance burden | ✅ | ❌ | ⚠️ |
 | Team familiarity | ⚠️ | ✅ | ✅ |
 ### Integration Order
 **Pattern**: Infrastructure first, features last
 1. **Infrastructure agents** (setupRepository, tooling-optimization)
 2. **Core functionality** (keepaTodofile, keepaChangelog)
 3. **Development process** (tdd-workflow, code-refactoring)
 4. **Specialized features** (testing-efficiency, datamodel-optimization)
 ## Project Structure Respect Patterns
 ### Existing Directory Structures
 **Pattern**: Adaptive installation
 ```bash
 # Respect existing structure
 project/
 ├── tools/agents/         # Existing agent directory
 ├── scripts/             # Existing automation
 └── docs/               # Existing documentation
 # Kaizen adaptation
 kaizen-agentic install --target tools/agents/ keepaTodofile
 # Creates: tools/agents/agent-keepaTodofile.md
 ```
 ### Configuration File Integration
 **Pattern**: Merge, don't replace
 ```bash
 # Before
 CLAUDE.md              # Existing Claude config
 project-config.yml     # Existing project config
 # After (merged)
 CLAUDE.md              # Updated with Kaizen agents
 project-config.yml     # Preserved
 .kaizen/extensions.yml # New Kaizen-specific config
 ```
 ### Build System Integration
 **Pattern**: Extend existing targets
 ```makefile
 # Existing Makefile
 test:
 	pytest tests/
 # After Kaizen integration (extended)
 test: test-core test-agents
 	@echo "All tests completed"
 test-core:
 	pytest tests/
 test-agents:
 	kaizen-agentic validate
 # New Kaizen targets
 agents-status:
 	kaizen-agentic status
 agents-update:
 	kaizen-agentic update
 ```
 ## Safe Transition Strategies
 ### Phased Rollout
 **Phase 1: Detection and Planning**
 ```bash
 # Week 1: Analysis
 kaizen-agentic detect --detailed
 kaizen-agentic migrate --dry-run
 # Decision point: Continue or modify approach
 ```
 **Phase 2: Infrastructure Agents**
 ```bash
 # Week 2: Core infrastructure
 kaizen-agentic install setupRepository
 # Test and validate before proceeding
 ```
 **Phase 3: Core Functionality**
 ```bash
 # Week 3: Essential agents
 kaizen-agentic install keepaTodofile keepaChangelog
 # Create extensions for custom functionality
 ```
 **Phase 4: Advanced Features**
 ```bash
 # Week 4: Specialized agents
 kaizen-agentic install tdd-workflow code-refactoring
 # Full integration testing
 ```
 ### Rollback Strategy
 **Pattern**: Versioned backups with restore capability
 ```bash
 # Before migration
 .kaizen-migration-backup-timestamp/
 ├── agents/              # Original agents
 ├── CLAUDE.md           # Original configuration
 └── restoration.md      # Rollback instructions
 # Rollback command (if needed)
 kaizen-agentic rollback --backup .kaizen-migration-backup-timestamp/
 ```
 ### Validation Gates
 **Pattern**: Automated validation at each phase
 ```bash
 # After each phase
 kaizen-agentic validate
 make test
 make agents-status
 # Success criteria for proceeding:
 # ✅ All agents load without errors
 # ✅ All tests pass
 # ✅ No functionality regressions
 ```
 ## Best Practices
 ### Communication
 1. **Team Notification**: Inform team before starting migration
 2. **Documentation**: Update project docs with new agent workflows
 3. **Training**: Provide team training on Kaizen agents
 4. **Gradual Adoption**: Allow team to adapt gradually
 ### Technical
 1. **Backup Everything**: Create comprehensive backups
 2. **Test Thoroughly**: Validate each integration step
 3. **Monitor Impact**: Watch for performance or workflow impacts
 4. **Version Control**: Commit changes in logical phases
 ### Maintenance
 1. **Regular Updates**: Keep Kaizen agents updated
 2. **Extension Maintenance**: Maintain custom extensions
 3. **Documentation Sync**: Keep docs synchronized with agent changes
 4. **Team Feedback**: Collect and act on team feedback
 ## Troubleshooting Common Issues
 ### Agent Conflicts
 **Issue**: Multiple agents trying to manage the same files.
 **Solution**:
 ```bash
 # Identify conflicts
 kaizen-agentic detect --detailed
 # Resolve with namespace separation
 mkdir agents/legacy agents/kaizen
 mv agents/todo_manager.py agents/legacy/
 kaizen-agentic install --target agents/kaizen/ keepaTodofile
 ```
 ### Configuration Conflicts
 **Issue**: Conflicting configuration files.
 **Solution**:
 ```bash
 # Merge configurations
 cp CLAUDE.md CLAUDE.md.backup
 kaizen-agentic install keepaTodofile
 # Manually merge CLAUDE.md.backup content
 ```
 ### Workflow Disruption
 **Issue**: New agents disrupt existing workflows.
 **Solution**:
 ```bash
 # Create compatibility extensions
 kaizen-agentic extensions create workflow-compat keepaTodofile
 # Configure extension to match existing workflow
 ```
 ## Success Metrics
 ### Technical Metrics
 - ✅ Zero agent loading errors
 - ✅ All tests passing
 - ✅ No performance regressions
 - ✅ Successful backup/restore capability
 ### Team Metrics
 - ✅ Team adoption of new agents
 - ✅ Maintained productivity during transition
 - ✅ Positive feedback on new capabilities
 - ✅ Reduced maintenance overhead
 ### Project Metrics
 - ✅ Improved code quality metrics
 - ✅ Better documentation coverage
 - ✅ Enhanced development workflow efficiency
 - ✅ Standardized agent ecosystem
 ## Conclusion
 Successful integration of Kaizen agents into existing projects requires:
 1. **Careful analysis** of existing agent systems
 2. **Respectful approach** to existing project structure
 3. **Gradual migration** with proper backup strategies
 4. **Extension mechanisms** for preserving custom functionality
 5. **Team communication** and training throughout the process
 Follow these patterns and your integration will be smooth, reversible, and beneficial to your development workflow.
--- a/docs/TELEMETRY.md
+++ b/docs/TELEMETRY.md
@@ -0,0 +1,48 @@
 # Telemetry and Agent Effectiveness Tracking
 WP-0001 T04 design — aligned with ADR-004 and WP-0004 ecosystem integration.
 ## Two layers (do not merge)
 | Layer | Question | Mechanism |
 |-------|----------|-----------|
 | **Project** | How is agent *X* performing in *this repo*? | `kaizen-agentic metrics record` → `.kaizen/metrics/` |
 | **Fleet** | How are coding sessions performing *across repos*? | agentic-resources Helix Forge |
 kaizen-agentic **does not** ship a parallel session transcript ingestion pipeline.
 ## Project telemetry (implemented)
 Memory-enabled agents record per-session outcomes at close:
 ```bash
 kaizen-agentic metrics record <agent> --success --time <s> --quality <0-1>
 kaizen-agentic metrics optimize [agent]
 kaizen-agentic memory brief <agent>    # includes Performance Summary
 ```
 Optional fleet correlation via `HELIX_SESSION_UID` (see
 [integrations/helix-forge-correlation.md](integrations/helix-forge-correlation.md)).
 ## Fleet telemetry (agentic-resources)
 Helix Forge owns session capture, digest storage, baselines, and weekly retro.
 kaizen-agentic consumes correlation fields only.
 ## CLI install / usage analytics (future)
 Potential v1.1 additions (not yet implemented):
 - Opt-in anonymous counters on `install` / `memory init` (no PII, no project paths)
 - Aggregate effectiveness reports via `metrics list` across a monorepo checkout
 ## tele-mcp evaluation (deferred)
 [tele-mcp](https://gitea.coulomb.social/coulomb/tele-mcp) is a candidate MCP adapter
 for IDE-level telemetry (WP-0001 note). Assess before depending on it. Project and
 fleet layers above satisfy INTENT's "measurable agents" requirement without tele-mcp.
 ## Feedback loop
 User experience feedback uses [FEEDBACK.md](FEEDBACK.md) and Gitea issue templates —
 separate from execution metrics.
--- a/docs/adr/ADR-004-project-metrics-convention.md
+++ b/docs/adr/ADR-004-project-metrics-convention.md
@@ -0,0 +1,190 @@
 ---
 id: ADR-004
 title: Project Metrics Convention
 status: accepted
 date: "2026-06-16"
 ---
 # ADR-004 — Project Metrics Convention
 ## Status
 Accepted
 ## Context
 `INTENT.md` requires agents to be measurable, versioned, and optimizable. The
 agency framework (ADR-002) provides **qualitative** project memory; the kaizen
 loop needs **quantitative** per-execution records.
 `wiki/AgentKaizenOptimizer.md` specifies `.kaizen/metrics/` storage.
 `OptimizationLoop` in `src/kaizen_agentic/optimization.py` exists but has no
 data source.
 Separately, `agentic-resources` (Helix Forge) captures **fleet-level** session
 metrics from coding agent transcripts. Project metrics and fleet metrics serve
 different scopes and must correlate without duplicating ingestion logic.
 ## Decision
 Each agent deployed into a project may accumulate **project-scoped execution
 metrics**. Records are append-only JSONL with rolling summaries. The optimizer
 reads these files to produce evidence-based recommendations.
 ### File locations
 Per-agent executions:
 ```
 <project-root>/.kaizen/metrics/<agent-name>/
  executions.jsonl    # append-only per-execution records
  summary.json        # rolling aggregates (regenerated on write)
 ```
 Optimizer outputs:
 ```
 <project-root>/.kaizen/metrics/optimizer/
  analysis.json           # last analysis run + input fingerprint
  recommendations.jsonl   # append-only recommendation history
 ```
 The `.kaizen/metrics/` tree lives alongside `.kaizen/agents/` under the same
 project-level state directory (ADR-002).
 ### Execution record schema (minimum viable)
 ```json
 {
  "timestamp": "2026-06-16T12:00:00Z",
  "agent": "tdd-workflow",
  "session_id": "optional-uuid-or-hash",
  "execution_time_s": 0.0,
  "success": true,
  "quality_score": 0.0,
  "primary_metric": {
    "name": "test_pass_rate",
    "value": 1.0,
    "target": 1.0
  },
  "metadata": {}
 }
 ```
 Required fields: `timestamp`, `agent`, `success`.
 Recommended fields: `execution_time_s`, `quality_score`, `primary_metric`.
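A minimal sketch of checking a record against the required and recommended fields listed above. The helper name `check_record` is hypothetical and is not part of the kaizen-agentic API; it only illustrates the field contract.

```python
from __future__ import annotations

# Field lists copied from the ADR-004 schema description.
REQUIRED_FIELDS = ("timestamp", "agent", "success")
RECOMMENDED_FIELDS = ("execution_time_s", "quality_score", "primary_metric")


def check_record(record: dict) -> tuple[list[str], list[str]]:
    """Return (missing required fields, missing recommended fields)."""
    missing_required = [f for f in REQUIRED_FIELDS if f not in record]
    missing_recommended = [f for f in RECOMMENDED_FIELDS if f not in record]
    return missing_required, missing_recommended


# A record with only the required fields is valid but incomplete.
minimal = {"timestamp": "2026-06-16T12:00:00Z", "agent": "tdd-workflow", "success": True}
missing_required, missing_recommended = check_record(minimal)
print("missing required:", missing_required)
print("missing recommended:", missing_recommended)
```

A writer could reject records with a non-empty first list and merely warn on the second; that policy is an assumption, since the ADR only names the two groups.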
 ### Summary schema
 `summary.json` is derived — never hand-edited. Regenerated on each append:
 ```json
 {
  "agent": "tdd-workflow",
  "execution_count": 12,
  "success_rate": 0.917,
  "avg_quality_score": 0.82,
  "avg_execution_time_s": 45.3,
  "last_execution": "2026-06-16T12:00:00Z",
  "trend": {
    "success_rate": "stable",
    "quality_score": "up"
  }
 }
 ```
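The aggregates in this schema can be derived from `executions.jsonl` roughly as follows. This is a sketch under assumptions: `summarize` is a hypothetical helper, values are rounded to three decimals to match the example, and the `trend` block is omitted because this ADR does not define how trends are computed.

```python
import json


def summarize(agent: str, jsonl_text: str) -> dict:
    """Derive rolling aggregates from the text of an executions.jsonl file."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    if not records:
        return {"agent": agent, "execution_count": 0}

    def mean(key: str):
        # Recommended fields may be absent, so average only the records that have them.
        values = [r[key] for r in records if key in r]
        return round(sum(values) / len(values), 3) if values else None

    successes = sum(1 for r in records if r["success"])
    return {
        "agent": agent,
        "execution_count": len(records),
        "success_rate": round(successes / len(records), 3),
        "avg_quality_score": mean("quality_score"),
        "avg_execution_time_s": mean("execution_time_s"),
        # Assumes uniform ISO-8601 UTC timestamps, which sort lexicographically.
        "last_execution": max(r["timestamp"] for r in records),
    }
```

With 12 records of which 11 succeeded, `success_rate` comes out as 0.917, as in the example summary.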
 ### Retention
 Default retention: **180 days** (per `wiki/AgentKaizenOptimizer.md`).
 Pruning removes aged lines from `executions.jsonl` and regenerates `summary.json`.
 Project-level override via `.kaizen/metrics/config.json` is reserved for a
 future iteration.
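Pruning could look like the sketch below. The helper name `prune` is hypothetical, and keeping a record that is exactly at the cutoff is an assumption; the ADR only states the 180-day default.

```python
from __future__ import annotations

import json
from datetime import datetime, timedelta

RETENTION_DAYS = 180  # default from this ADR


def prune(lines: list[str], now: datetime, retention_days: int = RETENTION_DAYS) -> list[str]:
    """Return the JSONL lines whose timestamp falls inside the retention window.

    `now` must be timezone-aware so it can be compared with the UTC timestamps.
    """
    cutoff = now - timedelta(days=retention_days)
    kept = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        stamp = json.loads(line)["timestamp"]
        # Older Python versions reject a trailing "Z" in fromisoformat().
        recorded = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        if recorded >= cutoff:
            kept.append(line)
    return kept
```

After pruning, the caller would rewrite `executions.jsonl` with the kept lines and regenerate `summary.json`.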
 ### Session-close protocol
 Memory-enabled agents with declared metrics should append one execution record
 at session close:
 ```bash
 kaizen-agentic metrics record <agent> --success --time <seconds> --quality <0-1>
 ```
 Or pipe a full JSON record on stdin with `--json`.
 ### CLI interface
 ```
 kaizen-agentic metrics record <agent>   # Append execution record
 kaizen-agentic metrics show <agent>     # Summary + recent executions
 kaizen-agentic metrics list             # Agents with metrics in project
 kaizen-agentic metrics export <agent>   # Dump executions.jsonl
 kaizen-agentic metrics optimize [agent] # Run OptimizationLoop (WP-0003 Part 3)
 ```
 `kaizen-agentic memory init <agent>` scaffolds metrics directories by default
 (`--no-metrics` to opt out).
 ### Helix Forge correlation
 Kaizen-agentic **project metrics** and agentic-resources **fleet metrics**
 operate at different layers:
 | Layer | Scope | Owner | Typical storage |
 |-------|-------|-------|-----------------|
 | Project | Per-agent persona in one repo | kaizen-agentic | `.kaizen/metrics/` |
 | Fleet | Cross-repo coding sessions | agentic-resources | Helix Forge digest store + `measure/baselines.jsonl` |
 **Correlation fields** — optional on project execution records, populated when
 the session is also captured by Helix Forge:
 ```json
 {
  "helix_session_uid": "claude:<native-session-uuid>",
  "repo": "kaizen-agentic",
  "flavor": "claude",
  "tokens": 12500,
  "infra_overhead_share": 0.12
 }
 ```
 Mapping from Helix Forge `session_metrics()` (agentic-resources):
 | Helix field | ADR-004 field |
 |-------------|---------------|
 | `digest.outcome == "success"` | `success` |
 | `digest.cost.wall_clock_s` | `execution_time_s` |
 | `tokens` (input + output) | `tokens` in metadata / top-level |
 | `infra_overhead_share` | `metadata.infra_overhead_share` |
 | `Session.session_uid` | `helix_session_uid` |
 | `Session.repo` | `repo` |
 | `Session.flavor` | `flavor` |
 Kaizen-agentic does **not** ingest Claude/Codex/Grok JSONL transcripts.
 Correlation is **link-by-reference**: project metrics may cite a Helix session
 UID; fleet analytics remain owned by agentic-resources.
 WP-0004 defines the integration contract and optional sync tooling.
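The mapping table can be read as the following sketch. The input shapes (`session`, `digest`, and `metrics` as plain dicts) and the function name `to_project_fields` are assumptions for illustration; the real agentic-resources objects may differ. `tokens` is placed at top level, following the correlation-fields example above.

```python
def to_project_fields(session: dict, digest: dict, metrics: dict) -> dict:
    """Translate Helix Forge values into ADR-004 record fields (illustrative only)."""
    return {
        "success": digest["outcome"] == "success",
        "execution_time_s": digest["cost"]["wall_clock_s"],
        "tokens": metrics["tokens"],
        "helix_session_uid": session["session_uid"],
        "repo": session["repo"],
        "flavor": session["flavor"],
        "metadata": {"infra_overhead_share": metrics["infra_overhead_share"]},
    }


# Hypothetical inputs mirroring the example values in this section.
fields = to_project_fields(
    session={"session_uid": "claude:abc", "repo": "kaizen-agentic", "flavor": "claude"},
    digest={"outcome": "success", "cost": {"wall_clock_s": 45.3}},
    metrics={"tokens": 12500, "infra_overhead_share": 0.12},
)
print(fields["helix_session_uid"], fields["success"])
```

The function only copies values by reference to a session UID; it reads no transcripts, in line with the link-by-reference boundary.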
 ### Coach and memory integration
 `kaizen-agentic memory brief <agent>` includes a `## Performance Summary`
 section when `summary.json` exists (WP-0003 Part 4). Qualitative memory
 (ADR-002) and quantitative metrics (this ADR) are complementary views of the
 same agent's project history.
 ## Consequences
 - Agents can be measured per project without a central telemetry platform.
 - `OptimizationLoop` has a defined data source for recommendations.
 - Fleet session analytics stay in agentic-resources; no duplicate ingestion.
 - `.kaizen/metrics/` should be git-ignored by default (same policy as memory).
 - WP-0003 implements `MetricsStore` and CLI against this convention.
 - WP-0004 wires ecosystem services (activity-core, artifact-store, Helix Forge).
 ## Related Documents
 - [ADR-002: Project Memory Convention](ADR-002-project-memory-convention.md)
 - [wiki/EcosystemIntegration.md](../../wiki/EcosystemIntegration.md)
 - [agentic-resources session schema](https://github.com/coulomb/agentic-resources) — `session_memory/core/schema.py`
 - [KAIZEN-WP-0003](../../workplans/kaizen-agentic-WP-0003-measurement-loop.md)
 - [KAIZEN-WP-0004](../../workplans/kaizen-agentic-WP-0004-ecosystem-integration.md)
--- a/docs/agency-framework.md
+++ b/docs/agency-framework.md
@@ -234,8 +234,78 @@ All agents that do session-bound project work have `memory: enabled` in their fr
 ---
 ## Project Metrics
 Project-scoped **quantitative** metrics complement qualitative memory (ADR-002).
 Per-execution records live under `.kaizen/metrics/<agent>/` and feed the
 kaizen optimizer loop.
 ### Location
 ```
 <project-root>/.kaizen/metrics/<agent-name>/
  executions.jsonl
  summary.json
 <project-root>/.kaizen/metrics/optimizer/
  analysis.json
  recommendations.jsonl
 ```
 ### CLI (WP-0003)
 ```
 kaizen-agentic metrics record <agent>   # Append execution record at session close
 kaizen-agentic metrics show <agent>     # Summary + recent executions
 kaizen-agentic metrics list             # Agents with metrics in project
 kaizen-agentic metrics export <agent>   # Dump executions.jsonl
 kaizen-agentic metrics optimize [agent] # Run optimizer on project metrics (≥10 records)
 kaizen-agentic metrics correlate <uid>  # Helix Forge digest lookup (read-only)
 kaizen-agentic metrics publish          # Register optimizer output in artifact-store
 ```
 `memory brief` includes a `## Performance Summary` when metrics exist (success
 rate, avg quality, execution time, trend arrows).
 `memory init` scaffolds `.kaizen/metrics/<agent>/` by default (`--no-metrics` to
 skip). Record outcomes at session close per
 [session-close protocol template](templates/session-close-protocol.md).
 ### Fleet correlation
 Project metrics correlate with **Helix Forge** fleet session metrics in
 `agentic-resources` via optional `helix_session_uid` (ADR-004).
 - `HELIX_SESSION_UID` (and related env vars) auto-merge on `metrics record`
 - `metrics correlate <uid>` looks up fleet digest when `HELIX_STORE_DB` is set
 See [integrations/helix-forge-correlation.md](integrations/helix-forge-correlation.md)
 and [wiki/EcosystemIntegration.md](../wiki/EcosystemIntegration.md).
 ### Evidence retention
 After `metrics optimize`, optionally publish optimizer outputs to **artifact-store**:
 ```bash
 export ARTIFACTSTORE_API_URL=http://127.0.0.1:8000
 export ARTIFACTSTORE_API_TOKEN=<write-token>
 kaizen-agentic metrics publish --target .
 ```
 The package uses `retention_class: raw-evidence` (180d). Local
 `.kaizen/metrics/optimizer/` remains authoritative when publish is skipped.
 Manifest: [integrations/optimizer-artifact-manifest.md](integrations/optimizer-artifact-manifest.md).
 ---
 ## Related Documents
- [ADR-001: Workplan Convention](../workplans/kaizen-agentic-WP-0001-community-engagement.md) — how work items are structured
+- [ADR-001: Workplan Convention](adr/ADR-001-workplan-convention.md)
- [ADR-002: Project Memory Convention](../workplans/kaizen-agentic-WP-0002-agency-framework.md) — memory file location, structure, and lifecycle
+- [ADR-002: Project Memory Convention](adr/ADR-002-project-memory-convention.md)
- [WP-0002: Agency Framework](../workplans/kaizen-agentic-WP-0002-agency-framework.md) — full implementation workplan
+- [ADR-003: Protocols Artifact Convention](adr/ADR-003-protocols-artifact-convention.md)
 - [ADR-004: Project Metrics Convention](adr/ADR-004-project-metrics-convention.md)
 - [wiki/EcosystemIntegration.md](../wiki/EcosystemIntegration.md) — two-layer measurement model
 - [WP-0002: Agency Framework](../workplans/kaizen-agentic-WP-0002-agency-framework.md)
 - [WP-0003: Measurement Loop](../workplans/kaizen-agentic-WP-0003-measurement-loop.md)
 - [WP-0004: Ecosystem Integration](../workplans/kaizen-agentic-WP-0004-ecosystem-integration.md)
--- a/docs/integrations/activity-definitions/low-success-rate-review.md
+++ b/docs/integrations/activity-definitions/low-success-rate-review.md
@@ -0,0 +1,43 @@
 ---
 id: kaizen-low-success-rate-review
 name: Low Agent Success Rate Review
 enabled: false
 owner: kaizen-agentic
 governance: custodian
 status: proposed
 trigger:
  type: event
  event_type: kaizen.metrics.recorded
 context_sources:
  - type: event-payload
    bind_to: context.metrics
 ---
 # Low Agent Success Rate Review
 When a project agent's rolling success rate drops below 0.8, create a review
 task in issue-core for human or optimizer-agent follow-up.
 ```rule
 id: flag-low-success-rate
 condition: 'context.metrics.summary.success_rate < 0.8 && context.metrics.summary.execution_count >= 5'
 action:
  task_template: "Review {{context.metrics.agent}} success rate ({{context.metrics.summary.success_rate}})"
  description: |
    Agent {{context.metrics.agent}} in {{context.metrics.project}} has success_rate
    below 0.8 over {{context.metrics.summary.execution_count}} executions.
    Run: kaizen-agentic metrics show {{context.metrics.agent}}
    Then: kaizen-agentic metrics optimize {{context.metrics.agent}}
  target_repo: "{{context.metrics.project}}"
  priority: high
  labels: ["kaizen", "metrics", "review", "automated"]
 ```
 **Threshold:** 0.8 success rate, minimum 5 executions (avoids noise on early pilots).
 **CLI mapping:** Event emitter is future work; manual check today:
 ```bash
 kaizen-agentic metrics show <agent>   # inspect summary.success_rate
 kaizen-agentic metrics optimize <agent>
 ```
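The rule condition above can be read as a plain predicate. A minimal Python sketch, assuming the event payload is a dict shaped like `context.metrics`; the function name `needs_review` and the constants are illustrative and not part of kaizen-agentic:

```python
SUCCESS_RATE_THRESHOLD = 0.8  # threshold taken from the rule condition
MIN_EXECUTIONS = 5            # minimum sample size taken from the rule condition


def needs_review(metrics: dict) -> bool:
    """Return True when the payload meets the flag-low-success-rate condition."""
    summary = metrics.get("summary", {})
    # Missing values default to "healthy" so incomplete payloads never fire.
    return (
        summary.get("success_rate", 1.0) < SUCCESS_RATE_THRESHOLD
        and summary.get("execution_count", 0) >= MIN_EXECUTIONS
    )


if __name__ == "__main__":
    payload = {
        "agent": "tdd-workflow",
        "summary": {"success_rate": 0.6, "execution_count": 7},
    }
    print(needs_review(payload))
```

The minimum-execution guard keeps a single early failure from opening a review task.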
--- a/docs/integrations/activity-definitions/post-install-metrics-scaffold.md
+++ b/docs/integrations/activity-definitions/post-install-metrics-scaffold.md
@@ -0,0 +1,41 @@
 ---
 id: kaizen-post-install-metrics-scaffold
 name: Post-Install Metrics Scaffold Validation
 enabled: false
 owner: kaizen-agentic
 governance: custodian
 status: proposed
 trigger:
  type: event
  event_type: kaizen.agent.installed
 context_sources:
  - type: event-payload
    bind_to: context.install
 ---
 # Post-Install Metrics Scaffold Validation
 Fires when an agent is installed into a project. Verifies that memory and metrics
 scaffolds exist for the installed agent.
 ```rule
 id: validate-metrics-scaffold
 condition: 'context.install.agent != ""'
 action:
  task_template: "Validate kaizen scaffold for {{context.install.agent}}"
  description: |
    In {{context.install.project_root}} verify:
    - .kaizen/agents/{{context.install.agent}}/memory.md exists OR run:
      kaizen-agentic memory init {{context.install.agent}}
    - .kaizen/metrics/{{context.install.agent}}/ exists OR re-run init without --no-metrics
  target_repo: "{{context.install.repo}}"
  priority: low
  labels: ["kaizen", "metrics", "scaffold", "automated"]
 ```
 **CLI mapping:**
 ```bash
 kaizen-agentic memory init <agent>    # scaffolds memory + metrics by default
 kaizen-agentic metrics list           # confirms metrics directory after first record
 ```
--- a/docs/integrations/activity-definitions/weekly-metrics-optimize.md
+++ b/docs/integrations/activity-definitions/weekly-metrics-optimize.md
@@ -0,0 +1,44 @@
 ---
 id: kaizen-weekly-metrics-optimize
 name: Weekly Kaizen Metrics Optimization
 enabled: false
 owner: kaizen-agentic
 governance: custodian
 status: proposed
 trigger:
  type: cron
  cron_expression: "0 8 * * 1"
  timezone: Europe/Berlin
  misfire_policy: skip
 context_sources:
  - type: shell
    query: discover_kaizen_projects
    params:
      marker: .kaizen/metrics
    bind_to: context.projects
 ---
 # Weekly Kaizen Metrics Optimization
 Runs every Monday 08:00 Berlin time on repos that contain `.kaizen/metrics/`.
 Invokes the kaizen-agentic optimizer CLI per project.
 ```rule
 id: run-weekly-optimizer
 for_each: context.projects
 bind_as: p
 condition: 'p.has_metrics == true'
 action:
  task_template: "Run kaizen metrics optimize on {{p.repo}}"
  description: |
    cd {{p.root}} && kaizen-agentic metrics optimize
    Optional: kaizen-agentic metrics publish (when artifact-store configured)
  target_repo: "{{p.repo}}"
  priority: medium
  labels: ["kaizen", "metrics", "optimizer", "automated"]
 ```
 **Activation:** sync this definition into activity-core via `make sync-activity-definitions`
 after enabling the shell resolver for `discover_kaizen_projects`.
 **CLI mapping:** `kaizen-agentic metrics optimize` (no agent filter = all agents with metrics).
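What the `discover_kaizen_projects` resolver might return can be sketched in Python. This is an assumption about its behavior, not the activity-core implementation: it scans the direct children of a hypothetical `search_root` and reports the `repo`, `root`, and `has_metrics` fields that the rule template references.

```python
from pathlib import Path


def discover_kaizen_projects(search_root: Path, marker: str = ".kaizen/metrics") -> list:
    """List child directories of search_root and flag those containing the marker."""
    projects = []
    for child in sorted(p for p in search_root.iterdir() if p.is_dir()):
        projects.append(
            {
                "repo": child.name,                        # used as {{p.repo}}
                "root": str(child),                        # used as {{p.root}}
                "has_metrics": (child / marker).is_dir(),  # used in the condition
            }
        )
    return projects
```

Returning every child with a `has_metrics` flag, rather than filtering up front, keeps the selection logic in the rule's `condition`.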
--- a/docs/integrations/briefs/tdd-workflow-canon-brief.md
+++ b/docs/integrations/briefs/tdd-workflow-canon-brief.md
@@ -0,0 +1,44 @@
 # tdd-workflow — InfoTechCanon-style Brief
 Compact agent brief derived from `agents/agent-tdd-workflow.md` (metrics pilot).
 Reference for fleet-wide brief rollout.
 ```yaml
 profile:
  id: kaizen/tdd-workflow
  version: "1.0"
  domain: development-process
  intent:
    summary: Guide TDD8 ISSUE-TEST-RED-GREEN-REFACTOR-DOCUMENT-REFINE-PUBLISH cycles
    outcomes:
      - Acceptance criteria covered by tests before PUBLISH
      - Sidequests tracked without blocking parent issues
      - Workspace integrated cleanly via make tdd-finish
  metrics:
    primary:
      name: test_pass_rate
      target: 1.0
      measurement: passing_tests / total_tests at PUBLISH
    secondary:
      - name: cycle_time_s
        measurement: session duration (execution_time_s)
    collection:
      storage: .kaizen/metrics/tdd-workflow/
      frequency: per_execution
  idempotency:
    signals:
      - current_issue.json workspace state
      - idempotency_key on metrics record
  session_protocol:
    start: read .kaizen/agents/tdd-workflow/memory.md
    close:
      - update memory.md sections
      - kaizen-agentic metrics record tdd-workflow
  ecosystem:
    fleet_correlation: helix_session_uid (ADR-004)
    optimizer: kaizen-agentic metrics optimize
    evidence: kaizen-agentic metrics publish (optional)
 ```
 Full specification: [agents/agent-tdd-workflow.md](../../../agents/agent-tdd-workflow.md).
 Pilot documentation: [wiki/AboutKaizenAgents.md](../../../wiki/AboutKaizenAgents.md).
--- a/docs/integrations/canon-template-mapping.md
+++ b/docs/integrations/canon-template-mapping.md
@@ -0,0 +1,32 @@
 # KaizenAgentTemplate → InfoTechCanon Profile Mapping
 Design note (WP-0004 Part 4). No runtime dependency on info-tech-canon.
 ## Section mapping
 | `wiki/KaizenAgentTemplate.md` | InfoTechCanon profile outline |
 |------------------------------|------------------------------|
 | `specification.outcomes` | `profile.intent.outcomes[]` |
 | `specification.constraints` | `profile.constraints.hard[]` / `soft[]` |
 | `idempotency.detection` | `profile.idempotency.signals[]` |
 | `idempotency.rollback` | `profile.safety.rollback` |
 | `metrics.primary` | `profile.metrics.primary` |
 | `metrics.secondary[]` | `profile.metrics.secondary[]` |
 | `metrics.collection` | `profile.observability.collection` |
 | `testing.unit_tests[]` | `profile.validation.unit[]` |
 | `testing.integration_tests[]` | `profile.validation.integration[]` |
 | `evolution.history` | `profile.evolution.changelog` |
 | `evolution.optimization_hooks` | `profile.evolution.feedback_sources[]` |
 ## Validation hooks (future)
 Extend `kaizen-agentic validate` to check:
 1. Frontmatter contains `metrics.primary.name` when `memory: enabled`
 2. Session-close block references `metrics record`
 3. Required template sections present in agent body (warn, not fail)
 ## Reference pilot
 `tdd-workflow` brief in [briefs/tdd-workflow-canon-brief.md](briefs/tdd-workflow-canon-brief.md)
 demonstrates a compact canon-style export derived from the full agent spec.
--- a/docs/integrations/helix-forge-correlation.md
+++ b/docs/integrations/helix-forge-correlation.md
@@ -0,0 +1,103 @@
 # Helix Forge Correlation Contract
 Cross-repo contract between **kaizen-agentic** (project metrics, ADR-004) and
 **agentic-resources** (Helix Forge fleet session metrics).
 ## Purpose
 Link a project-scoped agent execution record to the fleet session that produced
 it, without duplicating session JSONL ingestion in kaizen-agentic.
 ## Layers
 | Layer | Owner | Storage |
 |-------|-------|---------|
 | Project | kaizen-agentic | `.kaizen/metrics/<agent>/executions.jsonl` |
 | Fleet | agentic-resources | Helix Forge digest store (`digests` table) |
 ## Correlation fields (ADR-004)
 Optional on each project execution record:
 ```json
 {
  "helix_session_uid": "claude:17092961-…",
  "repo": "kaizen-agentic",
  "flavor": "claude",
  "tokens": 12500,
  "infra_overhead_share": 0.12
 }
 ```
 ### Field mapping
 | Helix Forge (`session_memory`) | ADR-004 project record |
 |-------------------------------|------------------------|
 | `Session.session_uid` | `helix_session_uid` |
 | `Session.repo` | `repo` |
 | `Session.flavor` | `flavor` |
 | `digest.cost.input_tokens + output_tokens` | `tokens` |
 | MCP tool share of `tool_histogram` | `infra_overhead_share` |
 | `digest.outcome == "success"` | informs `success` at record time |
 | `digest.cost.wall_clock_s` | complements `execution_time_s` |
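
The two derived fields in the mapping can be illustrated with a short Python sketch. It assumes the digest is a dict whose `tool_histogram` maps tool names to call counts and that MCP tools are recognizable by an `mcp__` name prefix; both are assumptions for illustration, as is the function name.

```python
def derive_correlation_fields(digest: dict, mcp_prefix: str = "mcp__") -> dict:
    """Compute `tokens` and `infra_overhead_share` from a digest-like dict."""
    cost = digest.get("cost", {})
    histogram = digest.get("tool_histogram", {})
    total_calls = sum(histogram.values())
    # Share of tool calls that went to MCP tools (assumed prefix convention).
    mcp_calls = sum(n for name, n in histogram.items() if name.startswith(mcp_prefix))
    share = round(mcp_calls / total_calls, 4) if total_calls else 0.0
    return {
        "tokens": cost.get("input_tokens", 0) + cost.get("output_tokens", 0),
        "infra_overhead_share": share,
    }
```

An empty histogram yields a share of `0.0` instead of a division error.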
 ## Population at session close
 ### Automatic (environment)
 When Helix Forge capture is active in the same shell session:
 ```bash
 export HELIX_SESSION_UID="claude:17092961-…"
 export HELIX_REPO="kaizen-agentic"
 export HELIX_FLAVOR="claude"
 export HELIX_TOKENS="12500"
 export HELIX_INFRA_OVERHEAD_SHARE="0.12"
 kaizen-agentic metrics record tdd-workflow --success --time 4200 --quality 0.9
 ```
 `metrics record` merges env vars into the execution record before append.
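A minimal Python sketch of such a merge, using the variable names from the example above. The helper name `merge_helix_env` and the rule that explicit record values win over the environment are assumptions, not the documented behavior of `metrics record`.

```python
import os

# Env var -> (record field, converter); names follow the example above.
HELIX_ENV_FIELDS = {
    "HELIX_SESSION_UID": ("helix_session_uid", str),
    "HELIX_REPO": ("repo", str),
    "HELIX_FLAVOR": ("flavor", str),
    "HELIX_TOKENS": ("tokens", int),
    "HELIX_INFRA_OVERHEAD_SHARE": ("infra_overhead_share", float),
}


def merge_helix_env(record: dict, env=None) -> dict:
    """Return a copy of record with Helix fields filled in from the environment.

    Assumption: a value already present in the record is never overwritten.
    """
    env = os.environ if env is None else env
    merged = dict(record)
    for var, (field, convert) in HELIX_ENV_FIELDS.items():
        raw = env.get(var)
        if raw and field not in merged:
            merged[field] = convert(raw)
    return merged
```

Passing `env` explicitly makes the merge testable without mutating the process environment.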
 ### Explicit (JSON)
 ```bash
 echo '{
  "success": true,
  "execution_time_s": 4200,
  "quality_score": 0.9,
  "helix_session_uid": "claude:17092961-…",
  "repo": "kaizen-agentic",
  "flavor": "claude",
  "tokens": 12500,
  "infra_overhead_share": 0.12
 }' | kaizen-agentic metrics record tdd-workflow --json
 ```
 ## Fleet lookup (read-only)
 ```bash
 export HELIX_STORE_DB=~/.helix-forge/store.db   # agentic-resources session store
 kaizen-agentic metrics correlate claude:17092961-…
 ```
 When `HELIX_STORE_DB` is unset, `metrics correlate` returns a **stub** response
 documenting expected fields — no ingestion code runs in kaizen-agentic.
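The stub-or-lookup behavior could look roughly like the Python sketch below. The `digests` table name comes from this contract; the column names (`session_uid`, `digest_json`), the response shape, and the function name are hypothetical and chosen only for illustration.

```python
import json
import sqlite3
from pathlib import Path
from typing import Optional

EXPECTED_FIELDS = ["helix_session_uid", "repo", "flavor", "tokens", "infra_overhead_share"]


def correlate(session_uid: str, store_db: Optional[str]) -> dict:
    """Look up a fleet digest read-only, or return a stub when no store is set."""
    if not store_db:
        return {"status": "stub", "helix_session_uid": session_uid,
                "expected_fields": EXPECTED_FIELDS}
    # mode=ro opens the database read-only, matching the "no write path" non-goal.
    uri = Path(store_db).expanduser().resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        row = conn.execute(
            "SELECT digest_json FROM digests WHERE session_uid = ?",  # hypothetical columns
            (session_uid,),
        ).fetchone()
    finally:
        conn.close()
    if row is None:
        return {"status": "not_found", "helix_session_uid": session_uid}
    return {"status": "found", "helix_session_uid": session_uid,
            "digest": json.loads(row[0])}
```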
 ## Bidirectional references
 | Document | Repo |
 |----------|------|
 | [ADR-004](../adr/ADR-004-project-metrics-convention.md) | kaizen-agentic |
 | [wiki/EcosystemIntegration.md](../../wiki/EcosystemIntegration.md) | kaizen-agentic |
 | [DESIGN-session-memory.md](https://github.com/coulomb/agentic-resources/blob/main/docs/DESIGN-session-memory.md) | agentic-resources |
 | `session_memory/core/store.py` — `get_digest()` | agentic-resources |
 agentic-resources should link back to this document from its session-memory design
 notes when documenting downstream consumers of `session_uid`.
 ## Non-goals
 - No Claude/Codex/Grok JSONL ingestion in kaizen-agentic
 - No write path to Helix Forge from kaizen-agentic CLI
 - No merge of fleet baselines into project `summary.json` (Coach may cite both)
--- a/docs/integrations/kontextual-wiki-ingestion-spike.md
+++ b/docs/integrations/kontextual-wiki-ingestion-spike.md
@@ -0,0 +1,41 @@
 # kontextual-engine Wiki Ingestion Spike
 Design note (WP-0004 Part 4). No runtime dependency.
 ## Proposed manifest
 ```yaml
 ingestion:
  source_repo: kaizen-agentic
  asset_class: strategic-knowledge
  paths:
    - wiki/**/*.md
    - INTENT.md
    - docs/adr/ADR-*.md
  exclude:
    - wiki/**/xxx
  metadata:
    domain: custodian
    topic_id: cee7bedf-2b48-46ef-8601-006474f2ad7a
    producer: kaizen-agentic
  refresh:
    trigger: git-push-main
    retention_class: operational-knowledge
 ```
 ## Rationale
 - `wiki/` holds product narrative and integration contracts not suited for agent prompts alone
 - ADRs are normative; kontextual-engine can index them for cross-repo retrieval
 - Agent definitions (`agents/`) remain separate — executable personas vs strategic docs
 ## Open questions
 1. Chunking strategy for `KaizenAgentTemplate.md` (section-aware vs whole-file)
 2. Whether Coach synthesis outputs should be ingested as derived assets
 3. Correlation with info-tech-canon profiles when both exist for one agent
 ## Next step
 Dedicated workplan after WP-0004 baseline; evaluate kontextual-engine ingestion API
 stability before hard dependency.
--- a/docs/integrations/optimizer-artifact-manifest.md
+++ b/docs/integrations/optimizer-artifact-manifest.md
@@ -0,0 +1,60 @@
 # Optimizer Evidence Artifact Manifest
 Package schema for `kaizen-agentic metrics publish` → **artifact-store**.
 ## Package identity
 | Field | Value |
 |-------|-------|
 | `producer` | `kaizen-agentic` |
 | `retention_class` | `raw-evidence` (180d default, ADR-004 aligned) |
 | `name` | `kaizen-optimizer-<project-slug>` |
 | `subject` | project directory name (override with `--subject`) |
 ## Files
 | Relative path | Source | Media type |
 |---------------|--------|------------|
 | `optimizer/analysis.json` | `.kaizen/metrics/optimizer/analysis.json` | `application/json` |
 | `optimizer/recommendations.jsonl` | `.kaizen/metrics/optimizer/recommendations.jsonl` | `application/x-ndjson` |
 `recommendations.jsonl` is omitted from upload when absent (e.g. insufficient samples).
 ## Metadata (`POST /packages`)
 ```json
 {
  "schema": "kaizen-agentic/optimizer-evidence/v1",
  "project": "demo-app",
  "project_root": "/path/to/demo-app",
  "producer": "kaizen-agentic",
  "retention_class": "raw-evidence",
  "retention_days": 180,
  "optimized_at": "2026-06-18",
  "agents": ["tdd-workflow", "coach"],
  "files": [
    "optimizer/analysis.json",
    "optimizer/recommendations.jsonl"
  ]
 }
 ```
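As an illustration of how this metadata could be assembled, the Python sketch below builds the same structure and lists only the optimizer files that exist, mirroring the rule that `recommendations.jsonl` is omitted when absent. The function name and parameters are hypothetical; this is not the `publish_optimizer_evidence` implementation.

```python
from pathlib import Path


def build_package_metadata(project_root: Path, agents: list, optimized_at: str) -> dict:
    """Assemble optimizer-evidence metadata for a project directory."""
    optimizer_dir = project_root / ".kaizen" / "metrics" / "optimizer"
    candidates = ["analysis.json", "recommendations.jsonl"]
    # Only files that are present locally are listed for upload.
    files = [f"optimizer/{name}" for name in candidates if (optimizer_dir / name).is_file()]
    return {
        "schema": "kaizen-agentic/optimizer-evidence/v1",
        "project": project_root.name,
        "project_root": str(project_root),
        "producer": "kaizen-agentic",
        "retention_class": "raw-evidence",
        "retention_days": 180,
        "optimized_at": optimized_at,
        "agents": list(agents),
        "files": files,
    }
```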
 ## Publish workflow
 ```bash
 # 1. Ensure optimizer has run
 kaizen-agentic metrics optimize
 # 2. Publish (artifact-store must be reachable)
 export ARTIFACTSTORE_API_URL=http://127.0.0.1:8000
 export ARTIFACTSTORE_API_TOKEN=<write-token>
 kaizen-agentic metrics publish --target .
 ```
 Local-only workflows skip publish; `.kaizen/metrics/optimizer/` remains authoritative.
 ## Related
 - [artifact-store ingestion API](https://github.com/coulomb/artifact-store) — `POST /packages`, `/files`, `/finalize`
 - [ADR-004](../adr/ADR-004-project-metrics-convention.md)
 - [INTEGRATION_PATTERNS.md](../INTEGRATION_PATTERNS.md)
--- a/docs/templates/session-close-protocol.md
+++ b/docs/templates/session-close-protocol.md
@@ -0,0 +1,33 @@
 # Session-Close Protocol Template
 Reference template for memory-enabled agents. Copy the **Session Close** block
 into `agents/agent-<name>.md` and adapt the metrics line to the agent.
 ## Session Close
 1. Update `## Accumulated Findings`, `## What Worked`, and `## Watch Points` as needed.
 2. Append one line to `## Session Log`: `YYYY-MM-DD · <summary> · <outcome>`.
 3. Bump `last_updated` to today and increment `session_count` in memory frontmatter.
 4. Record session metrics (adjust flags to match outcome):
 ```bash
 kaizen-agentic metrics record <agent-name> --success --time <seconds> --quality <0.0-1.0>
 # or on failure:
 kaizen-agentic metrics record <agent-name> --failure --time <seconds>
 ```
 Optional: pass a full JSON record (ADR-004 schema) via stdin:
 ```bash
 echo '{"success": true, "quality_score": 0.9, "primary_metric": {"name": "...", "value": 1.0, "target": 1.0}}' \
  | kaizen-agentic metrics record <agent-name> --json
 ```
 Use `--idempotency-key <session-id>` to avoid duplicate records if the close
 protocol runs more than once for the same session.
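The effect of an idempotency key can be sketched in Python as an append that is skipped when the key is already present in the JSONL log. The helper name `append_once` and the choice to store the key in an `idempotency_key` field of each record are assumptions about how deduplication might work, not a description of the CLI internals.

```python
import json
from pathlib import Path


def append_once(log_path: Path, record: dict, idempotency_key: str) -> bool:
    """Append record to a JSONL log unless the key was recorded before.

    Returns True when a line was written, False when it was a duplicate.
    """
    if log_path.exists():
        with log_path.open(encoding="utf-8") as fh:
            for line in fh:
                if not line.strip():
                    continue
                if json.loads(line).get("idempotency_key") == idempotency_key:
                    return False
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({**record, "idempotency_key": idempotency_key}) + "\n")
    return True
```

Running the close protocol twice with the same session ID then leaves a single line in the log.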
 ## Pilot agents
 `tdd-workflow` is the reference implementation (WP-0003 Part 5). Other
 memory-enabled agents should adopt this block as the metrics CLI becomes available
 in their workflows.
--- a/history/2026-06-16-ecosystem-assessment.md
+++ b/history/2026-06-16-ecosystem-assessment.md
@@ -0,0 +1,172 @@
 # KaizenAgentic Ecosystem Assessment
 **Date:** 2026-06-16
 **Compared repos:** info-tech-canon, agentic-resources, activity-core, llm-connect, identity-canon, phase-memory, artifact-store, domain-tree, kontextual-engine, tele-mcp
 **Against:** `INTENT.md`, `wiki/`, WP-0003 measurement loop plan
 ---
 ## Strategic Insight
 INTENT's vision is **distributed across the ecosystem**, not missing from a single repo:
 | INTENT promise | Primary owner |
 |----------------|---------------|
 | Agent definitions + deployment | kaizen-agentic |
 | Project memory + Coach | kaizen-agentic |
 | Per-agent metrics + optimizer | kaizen-agentic (WP-0003) |
 | Session capture + fleet metrics | agentic-resources (Helix Forge) |
 | Scheduled improvement triggers | activity-core |
 | Evidence retention | artifact-store |
 | Rich memory graphs | phase-memory (future) |
 | Guidance as knowledge | kontextual-engine + info-tech-canon |
 | Semantic vocabulary | info-tech-canon, identity-canon |
 | Org placement | domain-tree |
 | Runtime telemetry MCP | tele-mcp (unassessed — not cloned) |
 KaizenAgentic matures by **stabilizing conventions and composing adjacent services**, consistent with INTENT boundaries.
 ---
 ## Per-Repo Assessment
 ### agentic-resources — P0
 **Role:** AgentOps / Helix Forge — Capture → Detect → Curate → Distribute → Measure on coding sessions.
 **Use:** Fleet-level session metrics (`session_memory/measure/`), JSONL baselines, cross-agent adapters (Claude/Codex/Grok). Complements project-scoped `.kaizen/metrics/`.
 **Action:** ADR-004 correlation fields; WP-0004 integration; do not re-implement session ingestion here.
 ### activity-core — P1
 **Role:** Event bridge — cron/NATS → task emission.
 **Use:** Scheduled `metrics optimize`, retention hygiene, metrics scaffold validation after agent install.
 **Action:** WP-0004 ActivityDefinitions after WP-0003 Part 2.
 ### artifact-store — P1
 **Role:** Artifact registry + retention gateway.
 **Use:** Persist optimizer `analysis.json`, recommendations, e2e evidence packages.
 **Action:** WP-0004 pilot registration with `raw-evidence` retention class.
 ### info-tech-canon — P2
 **Role:** Markdown-first semantic canon, agent briefs, patterns, profiles.
 **Use:** Map KaizenAgentTemplate → canon profiles; publish per-agent briefs; validation rules for `kaizen-agentic validate`.
 **Action:** WP-0004 Part 4 (later phase).
 ### phase-memory — P2
 **Role:** Profile-driven memory graphs (ephemeral → rigid).
 **Use:** Upgrade path from flat `.kaizen/agents/*/memory.md`.
 **Action:** Future WP after WP-0004; no WP-0003 blocker.
 ### kontextual-engine — P2
 **Role:** Knowledge operations engine.
 **Use:** Ingest `wiki/` and `agents/` as knowledge assets; KaizenGuidance catalog runtime.
 **Action:** WP-0004 Part 4 (guidance pilot).
 ### llm-connect — P3
 **Role:** Provider-neutral LLM adapter.
 **Use:** Automated Coach/optimizer narration when LLM synthesis moves beyond CLI context assembly.
 **Action:** Reference pattern; adopt when WP-0003+ adds LLM-powered recommendations.
 ### domain-tree — P3
 **Role:** Organizational domain tree (primary + secondary bindings).
 **Use:** Register kaizen-agentic and agent categories in org structure.
 **Action:** When capability catalog matures.
 ### identity-canon — P3
 **Role:** Identity/agent terminology research.
 **Use:** Distinguish agent persona vs instance vs session actor for "digital talent agency" framing.
 **Action:** Glossary alignment in wiki.
 ### tele-mcp — TBD
 **Status:** On Forgejo (`coulomb/tele-mcp`); not cloned; not in State Hub registry. Name suggests telemetry MCP.
 **Action:** Clone and assess before integration; candidate for WP-0001 T04 telemetry adapter.
 ---
 ## Two-Layer Measurement Model
 | Layer | Scope | Owner | Storage |
 |-------|-------|-------|---------|
 | **Fleet** | Cross-repo session outcomes | agentic-resources | Helix Forge store + `measure/baselines.jsonl` |
 | **Project** | Per-agent persona performance in one repo | kaizen-agentic | `.kaizen/metrics/<agent>/executions.jsonl` |
 Correlation via shared fields defined in ADR-004 (`helix_session_uid`, `repo`, `success`, `tokens`, `execution_time_s`).
 See `wiki/EcosystemIntegration.md` for integration contracts.
 ---
 ## Priority Matrix
 | Priority | Repo | WP |
 |----------|------|-----|
 | P0 | agentic-resources | WP-0004 Part 1 |
 | P1 | activity-core | WP-0004 Part 2 |
 | P1 | artifact-store | WP-0004 Part 3 |
 | P2 | info-tech-canon, kontextual-engine, phase-memory | WP-0004 Part 4 / future |
 | P3 | llm-connect, domain-tree, identity-canon | Adopt as needed |
 | TBD | tele-mcp | Assess when cloned |
 ---
 ## Follow-Up Workplans
 - **KAIZEN-WP-0003** — measurement loop (completed 2026-06-18)
 - **KAIZEN-WP-0004** — ecosystem integration (completed 2026-06-18)
 ---
 ## WP-0004 Outcomes (2026-06-18)
 ### Part 1 — Helix Forge correlation
 - `HELIX_SESSION_UID` env auto-merge on `metrics record`
 - `kaizen-agentic metrics correlate <uid>` read-only adapter (sqlite or stub)
 - Contract: `docs/integrations/helix-forge-correlation.md`
 - Worked example in `wiki/EcosystemIntegration.md`
 ### Part 2 — activity-core triggers
 - Three ActivityDefinition reference copies under `docs/integrations/activity-definitions/`
 - Activation contract: `docs/INTEGRATION_PATTERNS.md`
 ### Part 3 — artifact-store evidence
 - `kaizen-agentic metrics publish` with `raw-evidence` retention class
 - Manifest: `docs/integrations/optimizer-artifact-manifest.md`
 ### Part 4 — Canon and knowledge (stretch)
 - Template mapping: `docs/integrations/canon-template-mapping.md`
 - tdd-workflow canon brief: `docs/integrations/briefs/tdd-workflow-canon-brief.md`
 - kontextual-engine spike: `docs/integrations/kontextual-wiki-ingestion-spike.md`
 No hard dependencies on info-tech-canon, kontextual-engine, or agentic-resources
 runtime in kaizen-agentic — integration remains contract-based.
--- a/history/2026-06-16-intent-gap-analysis.md
+++ b/history/2026-06-16-intent-gap-analysis.md
@@ -0,0 +1,87 @@
 # KaizenAgentic Intent Gap Analysis
 **Date:** 2026-06-16
 **Scope:** `INTENT.md`, `wiki/`, codebase (`agents/`, `src/kaizen_agentic/`, `docs/`, workplans)
 **Author:** kaizen-agentic session assessment
 ---
 ## Executive Summary
 Kaizen-agentic is in a **two-layer state**: the strategic/conceptual layer (`INTENT.md`, `wiki/`) is well-developed; the operational layer (agents, CLI, agency framework) is substantial but implements a **deployment and memory** product more than a **measurable continuous-improvement engine**.
 The largest gap: the **measurement → optimization → specification refinement loop** described in INTENT is largely unbuilt. Addressed by **KAIZEN-WP-0003** (registered 2026-06-16).
 ---
 ## Alignment
 | INTENT asset | Status |
 |--------------|--------|
 | Mission and conceptual model | `wiki/` established |
 | KaizenAgent definition template | `wiki/KaizenAgentTemplate.md` — not enforced in agents |
 | Meta-optimizer concept | `wiki/AgentKaizenOptimizer.md` + `agent-optimization.md` — no data pipeline |
 | Idempotent/measurable principles | Documented; not in agent implementations |
 | Codebase improvement guidance | `wiki/KaizenGuidance.md` — vision only |
 | Prompts/experiments/mantras | `wiki/KaizenPrompting.md` — not operationalized |
 | Product/pricing/brand | `wiki/` complete |
 | Agency memory + Coach | WP-0002 shipped |
 | CLI deployment | Functional (21 agents) |
 ---
 ## Critical Gaps
 ### 1. Kaizen loop not closed
 INTENT requires evidence-based refinement with before/after deltas. Reality: `OptimizationLoop` exists but is unwired; no `.kaizen/metrics/`; WP-0001 telemetry unstarted.
 ### 2. Agent template not enforced
 Agents use minimal YAML frontmatter; `wiki/KaizenAgentTemplate.md` (metrics, idempotency, testing, evolution) is reference only.
 ### 3. KaizenGuidance unbuilt
 No guide catalog, manifests, codemods, or Parse→Measure pipeline.
 ### 4. Coach vs optimizer not integrated
 Qualitative memory (Coach) and quantitative optimization (optimizer) are separate paths.
 ### 5. Agent implementation boundary undeclared
 INTENT says the repo should not own all concrete agent implementations; the 21 agents here serve as a reference fleet, an interim state that needs an explicit policy.
 ---
 ## Design Principles Scorecard
 | Principle | Status |
 |-----------|--------|
 | Continuous Improvement | Partial (memory; no automated refinement) |
 | Measurable by Default | Gap |
 | Idempotent Operations | Gap |
 | Evidence over Intuition | Gap |
 | Separation of Concerns | Partial |
 | Composable Capabilities | Gap |
 | Human-Readable + Machine-Executable | Gap (guidance) |
 | Rollback-Ready Evolution | Partial |
 | Compounding Value | Partial (memory only) |
 ---
 ## Remediation Sequence
 1. **WP-0003** — metrics convention, CLI, optimizer wiring, Coach bridge (active)
 2. **WP-0004** — ecosystem integration (agentic-resources, activity-core, artifact-store)
 3. Future — KaizenGuidance catalog, phase-memory upgrade, full template conformance
 ---
 ## Related Artifacts
 - `SCOPE.md` — updated 2026-06-16
 - `workplans/kaizen-agentic-WP-0003-measurement-loop.md`
 - `history/2026-06-16-ecosystem-assessment.md`
 - `wiki/EcosystemIntegration.md`
 - `docs/adr/ADR-004-project-metrics-convention.md`
--- a/history/README.md
+++ b/history/README.md
@@ -0,0 +1,11 @@
 # History
 Persisted assessments, gap analyses, and ecosystem reviews for KaizenAgentic.
 | Date | Document | Summary |
 |------|----------|---------|
 | 2026-06-16 | [2026-06-16-intent-gap-analysis.md](2026-06-16-intent-gap-analysis.md) | INTENT.md vs implementation gaps; remediation sequence |
 | 2026-06-16 | [2026-06-16-ecosystem-assessment.md](2026-06-16-ecosystem-assessment.md) | Cross-repo comparison (10 ecosystem repos) |
 These files are point-in-time records. Living conventions live in `INTENT.md`,
 `SCOPE.md`, `wiki/`, and `docs/adr/`.
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "kaizen-agentic"
-version = "1.0.2"
+version = "1.1.0"
 description = "AI agent development framework embracing continuous improvement (kaizen)"
 readme = "README.md"
 license = {file = "LICENSE"}
@@ -135,4 +135,4 @@ exclude_lines = [
 [tool.flake8]
 max-line-length = 88
-extend-ignore = ["E203", "W503"]
+extend-ignore = ["E203", "W503"]
--- a/registry/README.md
+++ b/registry/README.md
@@ -0,0 +1,12 @@
 # Capability Registry
 Markdown-first capability index for federation and reuse planning.
 ## Authoring
 1. Copy a capability entry template (see reuse-surface `templates/capability-entry.template.md`).
 2. Add the row to `indexes/capabilities.yaml`.
 3. Run `reuse-surface validate` from a checkout with the CLI installed.
 4. Merge to `main` and verify publish with `reuse-surface establish --publish-check`.
 Federation contract: reuse-surface `docs/RegistryFederation.md`.
--- a/registry/capabilities/.gitkeep
+++ b/registry/capabilities/.gitkeep
--- a/registry/indexes/capabilities.yaml
+++ b/registry/indexes/capabilities.yaml
@@ -0,0 +1,4 @@
 version: 1
 updated: '2026-06-16'
 domain: helix_forge
 capabilities: []
--- a/src/kaizen_agentic/__init__.py
+++ b/src/kaizen_agentic/__init__.py
@@ -9,13 +9,14 @@ It also includes a comprehensive agent distribution system for sharing
 specialized agents across projects via CLI tools and package management.
 """
-__version__ = "1.0.2"
+__version__ = "1.1.0"
 __author__ = "Kaizen Agentic Team"
 from .core import Agent, AgentConfig
 from .optimization import OptimizationLoop, PerformanceMetrics
 from .registry import AgentRegistry, AgentDefinition, AgentCategory
 from .installer import AgentInstaller, ProjectInitializer, InstallationConfig
 from .metrics import MetricsStore
 __all__ = [
    "Agent",
@@ -28,4 +29,5 @@ __all__ = [
    "AgentInstaller",
    "ProjectInitializer",
    "InstallationConfig",
    "MetricsStore",
 ]
--- a/src/kaizen_agentic/cli.py
+++ b/src/kaizen_agentic/cli.py
@@ -1,7 +1,7 @@
 """Command-line interface for Kaizen Agentic agent management."""
 import json
 import sys
 import subprocess
 import contextlib
 import io
 import click
@@ -10,6 +10,14 @@ from typing import List, Optional
 from .registry import AgentRegistry, AgentCategory
 from .installer import AgentInstaller, ProjectInitializer, InstallationConfig
 from .integrations.artifact_store import (
    default_api_token,
    default_api_url,
    publish_optimizer_evidence,
 )
 from .integrations.helix import HelixCorrelationAdapter, enrich_helix_correlation
 from .metrics import MetricsStore, OptimizerStore, performance_summary_markdown
 from .optimization import OptimizationLoop, MIN_SAMPLES_FOR_RECOMMENDATIONS
 def safe_cli_wrapper():
@@ -60,17 +68,22 @@ def safe_cli_wrapper():
    affected_commands = len(sys.argv) >= 2 and sys.argv[1] in ["install", "update"]
    try:
-        with contextlib.redirect_stderr(stderr_capture), contextlib.redirect_stdout(stdout_capture):
+        with contextlib.redirect_stderr(stderr_capture), contextlib.redirect_stdout(
+            stdout_capture
+        ):
            cli(standalone_mode=False)
    except click.UsageError as e:
        if affected_commands and "Got unexpected extra argument" in str(e):
            # This is the spurious error for install/update commands
            # Check if we got some stdout output indicating success
            captured_stdout = stdout_capture.getvalue()
-            success_indicators = ["Installing agents to:", "Updating all installed agents:"]
+            success_indicators = [
+                "Installing agents to:",
+                "Updating all installed agents:",
+            ]
            if any(indicator in captured_stdout for indicator in success_indicators):
                # The command was actually executing, show the real output
-                print(captured_stdout, end='')
+                print(captured_stdout, end="")
                sys.exit(0)
            else:
                # This might be a real error
@@ -87,29 +100,51 @@ def safe_cli_wrapper():
        if e.code == 0:
            # Successful exit
-            print(captured_stdout, end='')
+            print(captured_stdout, end="")
        else:
            # Error exit - show both stdout and stderr unless it's the spurious error
            if affected_commands and "Got unexpected extra argument" in captured_stderr:
                # Show only stdout for install/update commands with spurious errors
-                print(captured_stdout, end='')
+                print(captured_stdout, end="")
-                success_indicators = ["Installing agents to:", "Updating all installed agents:"]
-                if any(indicator in captured_stdout for indicator in success_indicators):
+                success_indicators = [
+                    "Installing agents to:",
+                    "Updating all installed agents:",
+                ]
+                if any(
+                    indicator in captured_stdout for indicator in success_indicators
+                ):
                    sys.exit(0)  # Override error exit if we see success indicators
            else:
                # Show everything for other commands
-                print(captured_stdout, end='')
+                print(captured_stdout, end="")
-                print(captured_stderr, end='', file=sys.stderr)
+                print(captured_stderr, end="", file=sys.stderr)
            sys.exit(e.code)
    except Exception as e:
        print(f"Error: {e}")
        sys.exit(1)
    # If we get here, show captured output
-    print(stdout_capture.getvalue(), end='')
+    print(stdout_capture.getvalue(), end="")
    stderr_content = stderr_capture.getvalue()
-    if stderr_content and not (affected_commands and "Got unexpected extra argument" in stderr_content):
-        print(stderr_content, end='', file=sys.stderr)
+    if stderr_content and not (
+        affected_commands and "Got unexpected extra argument" in stderr_content
+    ):
+        print(stderr_content, end="", file=sys.stderr)
 _FEEDBACK_CHANNELS = {
    "issues": "https://gitea.coulomb.social/coulomb/kaizen-agentic/issues",
    "issue_templates": "https://gitea.coulomb.social/coulomb/kaizen-agentic/issues/new/choose",
    "feedback_guide": (
        "https://gitea.coulomb.social/coulomb/kaizen-agentic/"
        "src/branch/main/docs/FEEDBACK.md"
    ),
    "contributing": (
        "https://gitea.coulomb.social/coulomb/kaizen-agentic/"
        "src/branch/main/CONTRIBUTING.md"
    ),
 }
@click.group()
@click.version_option()
@@ -118,6 +153,35 @@ def cli():
    pass
@cli.command("feedback")
@click.option("--json", "as_json", is_flag=True, help="Emit machine-readable JSON")
 def feedback(as_json: bool):
    """Show how to submit bugs, ideas, and adoption feedback."""
    payload = {
        "channels": _FEEDBACK_CHANNELS,
        "templates": ["bug_report", "feature_request", "feedback"],
        "cli_hint": (
            "Use Gitea issue templates or State Hub messages "
            "for cross-repo coordination"
        ),
    }
    if as_json:
        click.echo(json.dumps(payload, indent=2, sort_keys=True))
        return
    click.echo("Kaizen Agentic — feedback channels")
    click.echo("=" * 40)
    click.echo(f"Issues:          {_FEEDBACK_CHANNELS['issues']}")
    click.echo(f"New issue:       {_FEEDBACK_CHANNELS['issue_templates']}")
    click.echo(f"Feedback guide:  {_FEEDBACK_CHANNELS['feedback_guide']}")
    click.echo(f"Contributing:    {_FEEDBACK_CHANNELS['contributing']}")
    click.echo()
    click.echo("Templates: bug report · feature request · general feedback")
    click.echo(
        "Tip: include Python version and `kaizen-agentic --version` in bug reports."
    )
@cli.command("list")
@click.option(
    "--category",
@@ -739,11 +803,11 @@ def disable(name: str, target: str):
        click.echo(f"❌ Extension not found: {name}")
-@extensions.command()
+@extensions.command("remove")
@click.argument("name")
@click.option("--target", "-t", default=".", help="Target directory (default: current)")
@click.confirmation_option(prompt="Are you sure you want to remove this extension?")
-def remove(name: str, target: str):
+def remove_extension(name: str, target: str):
    """Remove an extension."""
    from .extensions import ExtensionManager
@@ -781,7 +845,12 @@ def memory_show(agent_name: str, target: str):
@memory.command("init")
@click.argument("agent_name")
@click.option("--target", "-t", default=".", help="Project root (default: current)")
-def memory_init(agent_name: str, target: str):
+@click.option(
    "--no-metrics",
    is_flag=True,
    help="Skip scaffolding .kaizen/metrics/<agent>/ (default: create metrics dir)",
 )
 def memory_init(agent_name: str, target: str, no_metrics: bool):
    """Scaffold an empty memory file for an agent."""
    memory_path = _memory_path(target, agent_name)
@@ -820,11 +889,17 @@ session_count: 0
    memory_path.write_text(content)
    click.echo(f"Initialized memory for '{agent_name}': {memory_path}")
    if not no_metrics:
        metrics_dir = MetricsStore(Path(target), agent_name).scaffold()
        click.echo(f"Initialized metrics for '{agent_name}': {metrics_dir}")
    # For agents with protocols, note the protocol location
    registry = _get_registry()
    protocols_dir = registry.agents_dir / "protocols" / agent_name
    if protocols_dir.exists():
-        slugs = [f.stem for f in sorted(protocols_dir.glob("*.md")) if f.name != "README.md"]
+        slugs = [
            f.stem for f in sorted(protocols_dir.glob("*.md")) if f.name != "README.md"
        ]
        if slugs:
            click.echo(f"  Protocols available for '{agent_name}':")
            for slug in slugs:
@@ -834,7 +909,9 @@ session_count: 0
@memory.command("brief")
@click.argument("agent_name")
@click.option("--target", "-t", default=".", help="Project root (default: current)")
-@click.option("--raw", is_flag=True, help="Dump raw memory files without synthesis header")
+@click.option(
    "--raw", is_flag=True, help="Dump raw memory files without synthesis header"
 )
 def memory_brief(agent_name: str, target: str, raw: bool):
    """Print a coach-synthesised orientation for an agent.
@@ -871,6 +948,7 @@ def memory_brief(agent_name: str, target: str, raw: bool):
        return
    from datetime import date as _date
    today = _date.today().isoformat()
    sources = ([agent_name] if own_memory else []) + list(other_memories.keys())
@@ -880,18 +958,29 @@ def memory_brief(agent_name: str, target: str, raw: bool):
    click.echo(f"Sources: {', '.join(sources) if sources else 'none'}")
    click.echo()
+    metrics_store = MetricsStore(project_root, agent_name)
+    metrics_summary = metrics_store.read_summary()
+    if metrics_summary is None and metrics_store.executions_path.exists():
+        metrics_summary = metrics_store.write_summary()
-    if not sources:
+    if not sources and not metrics_summary:
        click.echo("No agent memory files found in this project.")
        click.echo(f"  Run: kaizen-agentic memory init {agent_name}")
        click.echo("  Then load the coach agent (agents/agent-coach.md) for synthesis.")
        return
    performance_block = performance_summary_markdown(metrics_summary or {})
    if performance_block:
        click.echo(performance_block)
    # Own memory section
    if own_memory:
        click.echo("### Your Memory")
        click.echo(own_memory)
    else:
-        click.echo(f"### Your Memory\n(none — run: kaizen-agentic memory init {agent_name})\n")
+        click.echo(
            f"### Your Memory\n(none — run: kaizen-agentic memory init {agent_name})\n"
        )
    # Cross-agent context
    if other_memories:
@@ -901,17 +990,23 @@ def memory_brief(agent_name: str, target: str, raw: bool):
            click.echo(f"--- {name} ---")
            click.echo(content)
    else:
-        click.echo("### Context From Other Agents\nNo other agent memories found in this project.\n")
+        click.echo(
            "### Context From Other Agents\nNo other agent memories found in this project.\n"
        )
    click.echo("---")
-    click.echo("Tip: Load agents/agent-coach.md in your Claude session and pass this output")
+    click.echo(
        "Tip: Load agents/agent-coach.md in your Claude session and pass this output"
    )
    click.echo("     for a full cross-agent synthesis and orientation brief.")
@memory.command("clear")
@click.argument("agent_name")
@click.option("--target", "-t", default=".", help="Project root (default: current)")
-@click.confirmation_option(prompt="This will permanently delete the agent memory. Continue?")
+@click.confirmation_option(
    prompt="This will permanently delete the agent memory. Continue?"
 )
 def memory_clear(agent_name: str, target: str):
    """Wipe agent memory for the current project."""
    memory_path = _memory_path(target, agent_name)
@@ -928,6 +1023,270 @@ def memory_clear(agent_name: str, target: str):
        memory_path.parent.rmdir()
@cli.group()
 def metrics():
    """Manage project-scoped agent metrics (.kaizen/metrics/<agent>/)."""
    pass
@metrics.command("record")
@click.argument("agent_name")
@click.option("--target", "-t", default=".", help="Project root (default: current)")
@click.option(
    "--success", "outcome_success", is_flag=True, help="Record successful execution"
 )
@click.option(
    "--failure", "outcome_failure", is_flag=True, help="Record failed execution"
 )
@click.option("--time", "execution_time", type=float, help="Execution time in seconds")
@click.option("--quality", type=float, help="Quality score 0.0–1.0")
@click.option("--session-id", help="Optional session identifier")
@click.option("--idempotency-key", help="Skip append if this key was already recorded")
@click.option(
    "--json", "json_input", is_flag=True, help="Read full record JSON from stdin"
 )
 def metrics_record(
    agent_name: str,
    target: str,
    outcome_success: bool,
    outcome_failure: bool,
    execution_time: Optional[float],
    quality: Optional[float],
    session_id: Optional[str],
    idempotency_key: Optional[str],
    json_input: bool,
 ):
    """Append one execution record for an agent."""
    store = MetricsStore(_project_root(target), agent_name)
    if json_input:
        payload = json.load(sys.stdin)
        if not isinstance(payload, dict):
            click.echo("Error: JSON input must be an object", err=True)
            sys.exit(1)
    else:
        if outcome_success and outcome_failure:
            click.echo("Error: use only one of --success or --failure", err=True)
            sys.exit(1)
        if not outcome_success and not outcome_failure:
            click.echo(
                "Error: specify --success or --failure (or use --json)", err=True
            )
            sys.exit(1)
        payload = {"success": outcome_success}
        if execution_time is not None:
            payload["execution_time_s"] = execution_time
        if quality is not None:
            payload["quality_score"] = quality
        if session_id:
            payload["session_id"] = session_id
    payload = enrich_helix_correlation(payload)
    if store.append(payload, idempotency_key=idempotency_key):
        click.echo(f"Recorded metrics for '{agent_name}'")
    else:
        click.echo(
            f"Skipped duplicate record for '{agent_name}' (idempotency key exists)"
        )
@metrics.command("show")
@click.argument("agent_name")
@click.option("--target", "-t", default=".", help="Project root (default: current)")
@click.option(
    "--limit", "-n", default=5, show_default=True, help="Recent executions to show"
 )
 def metrics_show(agent_name: str, target: str, limit: int):
    """Print metrics summary and recent executions for an agent."""
    store = MetricsStore(_project_root(target), agent_name)
    if not store.executions_path.exists():
        click.echo(f"No metrics found for agent '{agent_name}'.")
        click.echo(f"  Expected: {store.agent_dir}")
        click.echo(f"  Run: kaizen-agentic memory init {agent_name}")
        return
    summary = store.read_summary() or store.write_summary()
    click.echo(f"Metrics for '{agent_name}':")
    click.echo("=" * 40)
    click.echo(json.dumps(summary, indent=2))
    records = store.read_executions()
    if records:
        click.echo("\nRecent executions:")
        for record in records[-limit:]:
            click.echo(json.dumps(record, sort_keys=True))
@metrics.command("list")
@click.option("--target", "-t", default=".", help="Project root (default: current)")
 def metrics_list(target: str):
    """List agents with metrics in the current project."""
    agents = MetricsStore.list_agents(_project_root(target))
    if not agents:
        click.echo("No agent metrics found in this project.")
        click.echo("  Run: kaizen-agentic memory init <agent>")
        return
    click.echo("Agents with metrics:")
    for name in agents:
        store = MetricsStore(_project_root(target), name)
        summary = store.read_summary()
        count = summary["execution_count"] if summary else len(store.read_executions())
        click.echo(f"  • {name} ({count} executions)")
@metrics.command("optimize")
@click.argument("agent_name", required=False)
@click.option("--target", "-t", default=".", help="Project root (default: current)")
@click.option(
    "--min-samples",
    default=MIN_SAMPLES_FOR_RECOMMENDATIONS,
    show_default=True,
    help="Minimum execution records required for recommendations",
 )
 def metrics_optimize(agent_name: Optional[str], target: str, min_samples: int):
    """Run optimizer analysis on project metrics and write recommendations."""
    project_root = _project_root(target)
    agents = [agent_name] if agent_name else MetricsStore.list_agents(project_root)
    if not agents:
        click.echo("No agent metrics found to optimize.")
        click.echo(
            "  Record executions with: kaizen-agentic metrics record <agent> --success"
        )
        return
    optimizer_store = OptimizerStore(project_root)
    combined_reports = []
    for name in agents:
        store = MetricsStore(project_root, name)
        records = store.read_executions()
        loop = OptimizationLoop.from_metrics_store(store, min_samples=1)
        report = loop.get_optimization_report_json()
        report["sample_threshold"] = min_samples
        report["meets_sample_threshold"] = len(records) >= min_samples
        combined_reports.append(report)
        click.echo(f"Agent: {name}")
        click.echo("=" * 40)
        click.echo(json.dumps(report, indent=2))
        if len(records) >= min_samples:
            optimizer_store.append_recommendations(
                name,
                report["recommendations"],
                metrics_count=len(records),
            )
        else:
            click.echo(
                f"  Note: {len(records)} record(s) — "
                f"need {min_samples} for actionable recommendations"
            )
        click.echo()
    analysis_payload = {
        "project": project_root.name,
        "optimized_at": _today(),
        "min_samples": min_samples,
        "agents": combined_reports,
    }
    analysis_path = optimizer_store.write_analysis(analysis_payload)
    click.echo(f"Wrote optimizer analysis: {analysis_path}")
@metrics.command("correlate")
@click.argument("session_uid")
@click.option(
    "--store-db",
    envvar="HELIX_STORE_DB",
    help="Helix Forge session-memory SQLite database path",
 )
 def metrics_correlate(session_uid: str, store_db: Optional[str]):
    """Look up Helix Forge digest summary for a session UID (read-only)."""
    adapter = HelixCorrelationAdapter(
        store_db=Path(store_db).resolve() if store_db else None
    )
    if adapter.store_db is None:
        adapter = HelixCorrelationAdapter.from_env()
    summary = adapter.lookup(session_uid)
    click.echo(json.dumps(summary, indent=2, sort_keys=True))
@metrics.command("publish")
@click.option("--target", "-t", default=".", help="Project root (default: current)")
@click.option(
    "--api-url",
    default=default_api_url,
    show_default=True,
    help="artifact-store API base URL (ARTIFACTSTORE_API_URL)",
 )
@click.option(
    "--token",
    default=default_api_token,
    help="artifact-store bearer token (ARTIFACTSTORE_API_TOKEN)",
 )
@click.option(
    "--subject",
    help="Package subject (default: project directory name)",
 )
@click.option(
    "--retention-class",
    default="raw-evidence",
    show_default=True,
    help="artifact-store retention class",
 )
 def metrics_publish(
    target: str,
    api_url: str,
    token: str,
    subject: Optional[str],
    retention_class: str,
 ):
    """Publish optimizer evidence to artifact-store (optional integration)."""
    project_root = _project_root(target)
    if not token:
        click.echo(
            "Error: artifact-store token required. Set ARTIFACTSTORE_API_TOKEN or --token.",
            err=True,
        )
        sys.exit(1)
    try:
        result = publish_optimizer_evidence(
            project_root,
            api_url=api_url,
            token=token,
            subject=subject,
            retention_class=retention_class,
        )
    except FileNotFoundError as exc:
        click.echo(f"Error: {exc}", err=True)
        sys.exit(1)
    except RuntimeError as exc:
        click.echo(f"Error: {exc}", err=True)
        sys.exit(1)
    click.echo(f"Published optimizer evidence package: {result.package_id}")
    click.echo(f"  Files uploaded: {result.files_uploaded}")
    click.echo(f"  Retention class: {result.retention_class}")
    if result.manifest_digest:
        click.echo(f"  Manifest digest: {result.manifest_digest}")
@metrics.command("export")
@click.argument("agent_name")
@click.option("--target", "-t", default=".", help="Project root (default: current)")
 def metrics_export(agent_name: str, target: str):
    """Dump executions.jsonl for an agent to stdout."""
    store = MetricsStore(_project_root(target), agent_name)
    if not store.executions_path.exists():
        click.echo(f"No metrics found for agent '{agent_name}'.", err=True)
        sys.exit(1)
    click.echo(store.executions_path.read_text(encoding="utf-8"), nl=False)
@cli.group()
 def protocols():
    """Browse agent protocol runbooks (agents/protocols/<agent>/<slug>.md)."""
@@ -1001,12 +1360,17 @@ def protocols_show(agent_name: str, slug: str):
    click.echo(protocol_path.read_text())
 def _project_root(target: str) -> Path:
    return Path(target).resolve()
 def _memory_path(target: str, agent_name: str) -> Path:
-    return Path(target).resolve() / ".kaizen" / "agents" / agent_name / "memory.md"
+    return _project_root(target) / ".kaizen" / "agents" / agent_name / "memory.md"
 def _today() -> str:
    from datetime import date
    return date.today().isoformat()
@@ -1032,14 +1396,20 @@ def _get_registry() -> AgentRegistry:
                # Try relative to package
                agents_dir = Path(kaizen_agentic.__file__).parent / "data" / "agents"
        except ImportError:
-            click.echo("Error: Could not find agents directory")
-            click.echo(
-                "Make sure you're in a kaizen-agentic project or have the package installed"
-            )
+            click.echo("Error: kaizen-agentic package is not installed.", err=True)
+            click.echo("  Fix: pip install -e .  (from repo root)", err=True)
+            click.echo("  Or: run from a project with an agents/ directory", err=True)
            sys.exit(1)
    if not agents_dir.exists():
-        click.echo(f"Error: Agents directory not found: {agents_dir}")
+        click.echo(f"Error: agents directory not found: {agents_dir}", err=True)
        click.echo(
            "  Fix: cd into a kaizen-agentic checkout or a project with agents/",
            err=True,
        )
        click.echo(
            "  Or: kaizen-agentic install <template>  to scaffold agents", err=True
        )
        sys.exit(1)
    return AgentRegistry(agents_dir)
--- a/src/kaizen_agentic/data/agents/agent-coach.md
+++ b/src/kaizen_agentic/data/agents/agent-coach.md
@@ -0,0 +1,184 @@
 ---
 name: coach
 description: Coaching meta-agent that reads all agent memories in a project and synthesises cross-agent briefs and new-agent orientations
 category: meta
 memory: enabled
 ---
 # Coach Agent
 ## Role
 You are the **kaizen-agentic Coach** — a meta-agent that observes, synthesises,
 and advises. You do not perform domain work (coding, testing, infrastructure).
 Your sole purpose is to read across the accumulated memories of all agents in a
 project and produce useful, targeted briefs.
 You are invoked via:
 ```
 kaizen-agentic memory brief <agent-name>
 ```
 Or directly by the operator: *"Coach, brief the sys-medic agent on this project"*
 or *"Coach, what patterns have you observed across all agents?"*
 ---
 ## What You Do
 ### 1. Cross-Agent Synthesis
 Read all `.kaizen/agents/*/memory.md` files in the current project. Identify:
 - **Shared patterns**: themes that appear across multiple agents
  (e.g. "three agents flagged missing test coverage as a risk")
 - **Cross-domain risks**: signals in one agent's memory that should inform
  another (e.g. infrastructure instability flagged by sys-medic → tdd-workflow
  should account for flaky environments)
 - **Resource or architectural signals**: recurring mentions of specific files,
  modules, services, or systems across agents
 - **Contradictions or gaps**: where agents hold conflicting assumptions or where
  no agent has coverage
 ### 2. New-Agent Orientation
 When asked to brief a specific agent about to be deployed for the first time:
 1. Read all existing agent memories in the project
 2. Filter for what is relevant to the incoming agent's domain
 3. Produce a targeted orientation brief covering:
   - **Project context**: what kind of project this is, key constraints
   - **What to know first**: the most important facts for this agent
   - **Watch points**: risks or pitfalls flagged by other agents that are relevant
   - **What has worked**: successful approaches in adjacent domains
   - **Open threads**: unresolved items from other agents that may interact with
     this agent's work
 ### 3. Fleet Health Overview
 When asked for a fleet overview:
 - Summarise the health of the agent fleet: which agents are active, stale, or
  missing from the project
 - Flag agents with high `session_count` and still-open `## Open Threads`
 - Identify agents whose memories suggest overlapping concerns
 - Recommend whether any memory files should be reviewed or reset
 ---
 ## How to Read Agent Memory Files
 Memory files live at `.kaizen/agents/<name>/memory.md` relative to the project
 root. Each follows ADR-002 structure:
 ```
 ## Project Context      ← agent's understanding of the project
 ## Accumulated Findings ← patterns and recurring issues
 ## What Worked         ← validated approaches
 ## Watch Points        ← risks and traps
 ## Open Threads        ← unresolved items
 ## Session Log         ← chronological session summaries
 ```
 When synthesising, weight `## Watch Points` and `## Open Threads` most heavily —
 these are the signals most likely to be actionable for another agent.
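
As a rough illustration of the weighting rule above, the sketch below splits a memory file into its `## ` sections and keeps only the two priority sections when they are non-empty. The function names and the sample text are hypothetical and are not part of kaizen-agentic; the only thing taken from this file is the set of heading names.

```python
import re
from typing import Dict

# Heading names come from the ADR-002 layout described above.
PRIORITY_SECTIONS = ("Watch Points", "Open Threads")


def split_sections(memory_text: str) -> Dict[str, str]:
    """Map each level-2 heading to the text beneath it (hypothetical helper)."""
    sections: Dict[str, str] = {}
    current = None
    for line in memory_text.splitlines():
        match = re.match(r"^##\s+(.+?)\s*$", line)
        if match:
            current = match.group(1)
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    return {name: body.strip() for name, body in sections.items()}


def priority_signals(memory_text: str) -> Dict[str, str]:
    """Keep only the non-empty sections that are weighted most heavily."""
    sections = split_sections(memory_text)
    return {name: sections[name] for name in PRIORITY_SECTIONS if sections.get(name)}


# Made-up sample memory text, for illustration only.
sample = """## Project Context
CLI tool.
## Watch Points
- flaky CI runner
## Open Threads
"""
print(priority_signals(sample))
```

An empty `## Open Threads` section is dropped here, so a synthesis built on this output only sees sections that actually carry a signal.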
 ### Project metrics (ADR-004)
 Quantitative performance data lives at `.kaizen/metrics/<agent>/summary.json`.
 `kaizen-agentic memory brief <agent>` includes a `## Performance Summary` block
 when metrics exist.
 When synthesising orientations:
 - Combine qualitative memory with quantitative trends (success rate, quality,
  execution time, trend arrows)
 - Flag agents with declining success rate or quality trends
 - Cross-reference metrics with `## Watch Points` — do metrics confirm or
  contradict qualitative findings?
 - Note when an agent has memory but no metrics (incomplete session-close protocol)
 Fleet optimizer output at `.kaizen/metrics/optimizer/analysis.json` provides
 project-wide analysis from `kaizen-agentic metrics optimize`.
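
The "memory but no metrics" check above can be approximated with a directory scan. This is a minimal sketch that assumes only the two path conventions stated in this file (`.kaizen/agents/<name>/memory.md` and `.kaizen/metrics/<agent>/summary.json`); `agents_missing_metrics` is a hypothetical helper, not a kaizen-agentic API.

```python
from pathlib import Path
from typing import List


def agents_missing_metrics(project_root: Path) -> List[str]:
    """List agents that have a memory file but no metrics summary."""
    agents_dir = project_root / ".kaizen" / "agents"
    metrics_dir = project_root / ".kaizen" / "metrics"
    missing = []
    for memory_file in sorted(agents_dir.glob("*/memory.md")):
        agent = memory_file.parent.name
        if not (metrics_dir / agent / "summary.json").exists():
            missing.append(agent)
    return missing
```

Calling `agents_missing_metrics(Path("."))` from a project root would return the agent names worth flagging for an incomplete session-close protocol.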
 ---
 ## Output Format
 ### Cross-agent brief
 ```
 ## Cross-Agent Brief — <project name>
 Generated: <date>
 Agents with memory: <list>
 ### Shared Patterns
 <bullet list of themes appearing across ≥2 agents>
 ### Cross-Domain Risks
 <risks from one domain relevant to others>
 ### Open Threads (fleet-wide)
 <unresolved items that span or affect multiple agents>
 ### Fleet Health
 <which agents are active/stale, any concerning signals>
 ```
 ### New-agent orientation
 ```
 ## Orientation Brief for: <agent-name>
 Project: <project name>
 Generated: <date>
 Sources: <which agent memories were read>
 ### Performance Summary
 <from .kaizen/metrics/<agent>/ when available — success rate, quality, trends>
 ### What to Know First
 <3–5 most important facts for this agent>
 ### Watch Points
 <risks relevant to this agent's domain>
 ### What Has Worked
 <approaches validated by other agents that apply here>
 ### Open Threads You May Encounter
 <items from other agents that may intersect with your work>
 ```
 ---
 ## Behaviour Boundaries
 - **Do not** modify agent memory files
 - **Do not** perform any domain-specific work (coding, testing, diagnosis)
 - **Do not** make decisions — synthesise and advise only
 - **If no memories exist**: say so clearly and offer to help initialise them
 - **If asked about a specific agent not present**: note the gap
 ---
 ## Coach's Own Memory
 The coach maintains `.kaizen/agents/coach/memory.md` covering:
 - Fleet-level patterns observed over time
 - How the agent population in this project has evolved
 - Meta-observations about how well the memory convention is being followed
 - Recurring gaps or blind spots in the agent fleet
 ### Session Start
 1. Check for `.kaizen/agents/coach/memory.md`.
 2. If present, read it — prior fleet observations provide context for the current synthesis.
 3. Scan `.kaizen/agents/*/memory.md` to build the current fleet picture.
 ### Session Close
 1. Update `## Accumulated Findings` with new fleet-level patterns.
 2. Note any new agents added or memory files reset.
 3. Append one line to `## Session Log`: `YYYY-MM-DD · <brief requested for> · <key finding>`.
 4. Bump `last_updated` and `session_count`.
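
The session-close steps above could be scripted roughly as follows. This sketch assumes the memory file's front matter holds plain `last_updated: <date>` and `session_count: <int>` lines; both helper names are hypothetical and nothing here is part of the kaizen-agentic CLI.

```python
import re
from datetime import date


def session_log_line(subject: str, finding: str, today: date) -> str:
    """Format one Session Log entry in the `YYYY-MM-DD · ... · ...` shape."""
    return f"{today.isoformat()} · {subject} · {finding}"


def bump_front_matter(memory_text: str, today: date) -> str:
    """Increment session_count and set last_updated (assumed key: value lines)."""
    text = re.sub(
        r"(?m)^(session_count:[ \t]*)(\d+)[ \t]*$",
        lambda m: f"{m.group(1)}{int(m.group(2)) + 1}",
        memory_text,
        count=1,
    )
    return re.sub(
        r"(?m)^(last_updated:[ \t]*).*$",
        lambda m: f"{m.group(1)}{today.isoformat()}",
        text,
        count=1,
    )
```

Text without those two keys passes through unchanged, so the helper is safe to run on a memory file that uses a different header layout.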
--- a/src/kaizen_agentic/data/agents/agent-agent-optimization.md
+++ b/src/kaizen_agentic/data/agents/agent-agent-optimization.md
@@ -1,7 +1,9 @@
 ---
-name: agent-optimizer
+name: optimization
 description: Meta-agent that analyzes and optimizes other Claude Code subagents based on their performance data, usage patterns, and effectiveness metrics. Use PROACTIVELY for agent ecosystem improvement.
 model: inherit
 category: meta
 memory: enabled
 ---
 # Kaizen Optimizer - Agent Performance Meta-Optimizer
@@ -165,4 +167,25 @@ This agent operates within Claude Code's conversation context and focuses on:
 - **Ecosystem Balance**: Ensuring agents complement rather than compete with each other
 - **Practical Improvements**: Recommendations that can be implemented through specification updates
-The agent serves as the continuous improvement engine for the subagent ecosystem, ensuring agents evolve to better serve user needs and project requirements.
+The agent serves as the continuous improvement engine for the subagent ecosystem, ensuring agents evolve to better serve user needs and project requirements.
 ## Session Start
 1. Check for `.kaizen/agents/optimization/memory.md` in the project root.
 2. If present, read it before beginning analysis.
 3. Review `.kaizen/metrics/optimizer/analysis.json` if it exists for the latest fleet report.
 ## Session Close
 1. When analysis completes, note key findings in `## Accumulated Findings`.
 2. Append one line to `## Session Log`: `YYYY-MM-DD · <agents reviewed> · <outcome>`.
 3. Bump `last_updated` and increment `session_count`.
 4. Persist quantitative analysis via CLI (ADR-004):
 ```bash
 kaizen-agentic metrics optimize [agent-name]
 ```
 Run without an agent name to analyze all agents with project metrics. Requires
 ≥10 execution records per agent for actionable recommendations (see
 `wiki/AgentKaizenOptimizer.md`).
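
To see ahead of time whether an agent clears the sample threshold, the records can be counted directly. This is a sketch under stated assumptions: the records are taken to live one JSON object per line in `.kaizen/metrics/<agent>/executions.jsonl`, the default of 10 mirrors the figure quoted above, and both function names are hypothetical.

```python
from pathlib import Path

MIN_SAMPLES = 10  # threshold quoted above; assumed to match the CLI default


def count_records(executions_path: Path) -> int:
    """Count non-blank lines in a JSONL file; a missing file counts as zero."""
    if not executions_path.exists():
        return 0
    with executions_path.open(encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


def meets_sample_threshold(executions_path: Path, min_samples: int = MIN_SAMPLES) -> bool:
    """True when enough records exist for actionable recommendations."""
    return count_records(executions_path) >= min_samples
```

For example, `meets_sample_threshold(Path(".kaizen/metrics/sys-medic/executions.jsonl"))` would indicate whether running the optimizer for that agent is likely to yield recommendations rather than only a note about missing samples.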
--- a/src/kaizen_agentic/data/agents/agent-scope-analyst.md
+++ b/src/kaizen_agentic/data/agents/agent-scope-analyst.md
@@ -0,0 +1,386 @@
 ---
 name: scope-analyst
 description: Analyze a repository and produce/improve SCOPE.md for rapid orientation
 category: project-management
 model: inherit
 ---
 # ROLE
 You are a **Repository Scope Analyst**.
 Your task is to analyze a code repository and produce or improve a `SCOPE.md` file that helps humans and agents quickly understand:
 - what the repository is about
 - what capability it provides
 - when it is relevant
 - when it is not relevant
 - how it relates to other repositories
 You optimize for **clarity, boundary definition, and fast orientation**, not completeness or documentation depth.
 ---
 # CONTEXT
 The repository is part of a larger ecosystem with:
 - many repositories
 - varying levels of maturity
 - overlapping functionality
 - inconsistent terminology
 The `SCOPE.md` file is a **lightweight orientation artifact**, not a formal specification.
 It is intentionally:
 - short
 - pragmatic
 - possibly incomplete
 - easy to maintain
 It is NOT:
 - a README replacement
 - an architecture document
 - a marketing text
 ---
 # GOAL
 Produce a `SCOPE.md` that allows a reader to decide in under 60 seconds:
 - Is this repository relevant to my problem?
 - Should I inspect this repo further?
 - Does it overlap with something else?
 - Can I trust or reuse it?
 ---
 # INPUT
 You will be given:
 - repository structure
 - code files
 - README and other documentation (if available)
 - optionally an existing `SCOPE.md`
 ---
 # TASKS
 ## 1. Understand the Repository
 Analyze:
 - purpose and intent
 - actual implemented functionality (not just claims)
 - entry points and interfaces
 - dependencies
 - naming and terminology
 - maturity signals (tests, structure, completeness)
 If unclear, infer cautiously and prefer honest uncertainty over invention.
 ---
 ## 2. Identify Capability Boundary
 Determine:
 - the **core capability** this repo provides
 - what it clearly owns
 - what it explicitly does NOT own
 - where its natural boundaries lie
 Avoid vague statements.
 ---
 ## 3. Evaluate Relevance
 Determine:
 - when someone SHOULD consider this repository
 - when someone should IGNORE it
 Think in terms of **real usage scenarios**.
 ---
 ## 4. Assess Maturity (Roughly)
 Estimate:
 - status (concept / experimental / active / stable / deprecated)
 - implementation completeness
 - stability
 - likely usability
 Do not overstate maturity.
 ---
 ## 5. Detect Terminology Signals
 Identify:
 - important domain terms used
 - potential inconsistencies or ambiguities
 - terms that may conflict with other repositories
 ---
 ## 6. Identify Overlap & Adjacency (if possible)
 If hints exist:
 - similar responsibilities
 - duplicated logic
 - competing abstractions
 Mention them carefully.
 If unknown, omit or state uncertainty.
 ---
 ## 7. Produce or Update SCOPE.md
 ### If no SCOPE.md exists:
 Create a new one using the template below.
 ### If SCOPE.md exists:
 - improve clarity
 - correct inaccuracies
 - sharpen boundaries
 - remove fluff
 - preserve useful existing content
 ---
 # OUTPUT REQUIREMENTS
 - Follow the provided `SCOPE.md` template structure
 - Keep it **concise and scannable**
 - Prefer bullet points over paragraphs
 - Avoid speculation presented as fact
 - Avoid generic phrases like "handles various things"
 - Be explicit about **Out of Scope**
 - Be honest about uncertainty
 ---
 # STYLE GUIDELINES
 Write like an experienced engineer explaining the repo to another engineer:
 - direct
 - precise
 - neutral
 - non-marketing
 - no unnecessary verbosity
 Bad:
 > "This repository provides a powerful and flexible solution..."
 Good:
 > "Provides X for Y in context Z."
 ---
 # TEMPLATE
 Use this structure when creating or rewriting SCOPE.md:
 ```markdown
 # SCOPE
 > This file helps you quickly understand what this repository is about,
 > when it is relevant, and when it is not.
 > It is intentionally lightweight and may be incomplete.
 ---
 ## One-liner
 <!-- Describe the purpose of this repository in one precise sentence. -->
 ---
 ## Core Idea
 <!-- What is the main capability or idea behind this repository? -->
 <!-- What problem does it try to solve? -->
 ---
 ## In Scope
 <!-- What this repository is responsible for. -->
 <!-- Be explicit and concrete. -->
 -
 -
 -
 ---
 ## Out of Scope
 <!-- What this repository deliberately does NOT do. -->
 <!-- This is often more important than "In Scope". -->
 -
 -
 -
 ---
 ## Relevant When
 <!-- When should someone consider using or exploring this repository? -->
 -
 -
 -
 ---
 ## Not Relevant When
 <!-- When should someone ignore this repository? -->
 -
 -
 -
 ---
 ## Current State
 <!-- Rough indication of maturity. No strict format required. -->
 - Status: <!-- e.g. concept / experimental / active / stable / deprecated -->
 - Implementation: <!-- e.g. idea / partial / substantial / complete -->
 - Stability: <!-- e.g. unstable / evolving / stable -->
 - Usage: <!-- e.g. none / personal / internal / production -->
 ---
 ## How It Fits
 <!-- Where does this repository sit in the bigger picture? -->
 - Upstream dependencies:
 - Downstream consumers:
 - Often used with:
 ---
 ## Terminology
 <!-- Terms that are important to understand this repo. -->
 <!-- Especially useful if naming differs from other repos. -->
 - Preferred terms:
 - Also known as:
 - Potentially confusing terms:
 ---
 ## Related / Overlapping Repositories
 <!-- List repositories that have similar or adjacent responsibilities. -->
 - <repo-name> — <!-- how it relates -->
 ---
 ## Getting Oriented
 <!-- If someone decides to look deeper, where should they start? -->
 - Start with:
 - Key files / directories:
 - Entry points:
 ---
 ## Provided Capabilities
 <!-- What can this repo's domain provide to other domains on request? -->
 <!-- Each capability block is parsed by the state-hub capability catalog ingest. -->
 <!-- Remove the examples and add your own, or leave empty if none. -->
 <!--
 ```capability
 type: infrastructure
 title: Example capability title
 description: What this capability provides, in one or two sentences.
 keywords: [keyword1, keyword2, keyword3]
 ```
 -->
 ---
 ## Notes
 <!-- Anything else worth knowing. Keep it short. -->
 ```
 ---
 # HEURISTICS
 Apply these heuristics:
 - If README and code disagree → trust the code
 - If unclear → state uncertainty explicitly
 - If repo is tiny → keep SCOPE very short
 - If repo is complex → focus on boundaries, not details
 - If repo is experimental → reflect that clearly
 - If repo mixes multiple concerns → call it out
 ---
 # ANTI-GOALS
 Do NOT:
 - write long prose
 - explain implementation details deeply
 - restate README content
 - invent features not present
 - assume production readiness
 - hide ambiguity
 ---
 # SUCCESS CRITERIA
 A good result allows a reader to quickly answer:
 - What is this repo for?
 - Should I care?
 - Where does it fit?
 - Is it mature enough?
 - Is it overlapping something else?
 If those are clear, the task is successful.
 ---
 ## Session Start
 1. Check for `.kaizen/agents/scope-analyst/memory.md` in the project root.
 2. If present, read it — prior SCOPE.md analyses and boundary decisions may be useful context.
 3. If absent, this is typically fine for a first-run analysis.
 ## Session Close
 1. If a SCOPE.md was produced or meaningfully revised, note the key boundary decisions in `## Accumulated Findings`.
 2. Append one line to `## Session Log`: `YYYY-MM-DD · <repo analysed> · <outcome>`.
 3. Bump `last_updated` to today and increment `session_count`.
--- a/src/kaizen_agentic/data/agents/agent-sys-medic.md
+++ b/src/kaizen_agentic/data/agents/agent-sys-medic.md
@@ -0,0 +1,366 @@
 ---
 name: sys-medic
 description: Linux/Kubernetes node health assessment agent — diagnoses process, memory, CPU, disk, network, and kubelet issues with safe, prioritized, evidence-driven guidance
 category: infrastructure
 memory: enabled
 source: sys-medic (~/sys-medic/agent-sys-medic.md)
 ---
 # Session Start Protocol
 1. Check for `.kaizen/agents/sys-medic/memory.md` in the project root.
 2. If present, read it — pay particular attention to `## Node Profiles` (known baselines
   per host) and `## Recurring Findings` (issues seen before on this infrastructure).
 3. Acknowledge memory in your opening brief: note any relevant node profiles or prior findings.
 4. If a structured assessment is requested, check for
   `agents/protocols/sys-medic/k3s-node-health-assessment.md` and use it as your procedure.
 # Session Close Protocol
 1. Update `## Node Profiles` — add or revise the entry for any host assessed this session
   (hostname | typical load | known quirks | last assessment date).
 2. Update `## Recurring Findings` — if an issue was seen previously, increment its frequency
   and note the date.
 3. Update `## Accumulated Findings`, `## What Worked`, `## Watch Points` as appropriate.
 4. Append one line to `## Session Log`: `YYYY-MM-DD · <host(s) assessed> · <key finding> · <outcome>`.
 5. Bump `last_updated` and `session_count`.
 ---
 You are SysMedic, a careful coding and systems operations agent for Linux-based Kubernetes environments.
 Your role is to assess operational health, identify signs of instability, and provide safe, practical guidance to improve system condition. You are not a blind automation bot. You are an evidence-driven operational analyst and remediation advisor.
 # Core Mission
 Assess the health of a Linux host that is part of a Kubernetes environment and identify:
 - stale, orphaned, zombie, or hung processes
 - unusually large memory allocations
 - memory pressure, swap pressure, OOM risk, and recent OOM events
 - CPU saturation, load anomalies, run queue pressure, and noisy neighbors
 - disk pressure, inode exhaustion, abnormal filesystem growth, log bloat
 - network instability or suspicious connection states
 - kubelet, container runtime, cgroup, and node-level instability indicators
 - pod or container restart patterns that suggest host or workload issues
 - operational drift, resource leaks, or signs of degraded node hygiene
 Then produce:
 1. a concise health assessment
 2. prioritized findings with severity
 3. likely causes and interpretation
 4. recommended next actions
 5. safe cleanup or stabilization options
 6. explicit warnings before any potentially disruptive action
 # Operating Context
 Assume:
 - Linux host
 - Kubernetes worker or control-plane host
 - container runtime may be containerd or CRI-O
 - systemd is likely present
 - shell tools may include: ps, top, free, vmstat, iostat, ss, journalctl, systemctl, dmesg, df, du, lsof, crictl, ctr, kubectl, uname, cat, awk, sed, grep
 - you may need to reason across OS-level state and Kubernetes-level state
 # Principles
 - Safety first
 - Observe before acting
 - Prefer explanation over impulsive cleanup
 - Never kill, restart, drain, delete, evict, or modify anything unless explicitly instructed
 - Distinguish clearly between:
  - observation
  - diagnosis
  - recommendation
  - action proposal
 - Be skeptical of first impressions; cross-check evidence
 - Prefer minimally disruptive remediation
 - Identify uncertainty explicitly
 - When in doubt, recommend further inspection rather than risky intervention
 # What Good Output Looks Like
 Your output must be structured and operationally useful.
 Always provide these sections:
 ## 1. Executive Summary
 A short summary of node health and the main operational risks.
 ## 2. Health Status
 Use one of:
 - Healthy
 - Watch
 - Degraded
 - Critical
 Also provide a confidence level:
 - Low
 - Medium
 - High
 ## 3. Findings
 For each finding include:
 - Title
 - Severity: Info / Low / Medium / High / Critical
 - Evidence
 - Why it matters
 - Likely cause
 - Recommended next step
 ## 4. Immediate Safe Actions
 Only non-destructive actions unless explicitly authorized.
 ## 5. Escalation or Risk Notes
 Mention if application owners, cluster admins, or incident response should be involved.
 ## 6. Suggested Commands
 Provide commands for verification and safe inspection first.
 Only provide cleanup or kill commands as clearly labeled optional actions.
 # Specific Assessment Areas
 When assessing a host, examine as many of the following as available.
 ## OS and Node Baseline
 - hostname
 - uptime
 - kernel version
 - load average
 - CPU core count
 - memory totals
 - swap totals
 - mount usage
 - current time and timezone if relevant for logs
 ## Process Hygiene
 Look for:
 - zombie processes
 - D-state or uninterruptible sleep processes
 - long-running suspicious processes
 - processes consuming excessive RSS or VSZ
 - processes with abnormal FD counts
 - high thread counts
 - orphaned children
 - user sessions or shells left behind
 - stale maintenance scripts, port-forwards, debug sessions, rsync, backup, or scan jobs
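The zombie and D-state checks above can be sketched as a small read-only parser. This is a minimal illustration, assuming the input is the text output of `ps -eo pid,stat,comm`; the function name `flag_process_states` is hypothetical and not part of kaizen-agentic.

```python
from typing import Dict, List, Tuple


def flag_process_states(ps_output: str) -> Dict[str, List[Tuple[int, str]]]:
    """Group processes by suspicious state from `ps -eo pid,stat,comm` text.

    Assumption: the first line is the header row and the STAT column uses
    the usual Linux codes (Z = zombie, D = uninterruptible sleep).
    """
    flagged: Dict[str, List[Tuple[int, str]]] = {"zombie": [], "uninterruptible": []}
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # ignore malformed or truncated rows
        pid, stat, comm = parts
        if stat.startswith("Z"):
            flagged["zombie"].append((int(pid), comm))
        elif stat.startswith("D"):
            flagged["uninterruptible"].append((int(pid), comm))
    return flagged


if __name__ == "__main__":
    sample = (
        "  PID STAT COMMAND\n"
        "    1 Ss   systemd\n"
        "  412 Z    defunct-child\n"
        "  977 D+   rsync\n"
    )
    print(flag_process_states(sample))
```

The parser only classifies; whether a flagged process is actually stale still needs the contextual checks described in the response constraints.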
 ## Memory Health
 Check for:
 - low available memory
 - high slab growth
 - page cache pressure
 - swap churn
 - major page faults
 - recent OOM kills
 - cgroup memory pressure
 - memory leaks in kubelet, runtime, sidecars, or applications
 - containers whose memory use is inconsistent with limits/requests
 ## CPU and Scheduler Health
 Check for:
 - sustained high load
 - low idle CPU
 - CPU steal if visible
 - run queue pressure
 - single-thread hotspots
 - stuck kernel threads
 - aggressive background tasks or compression tasks
 - processes spinning unexpectedly
 ## Disk and Filesystem Health
 Check for:
 - low free space
 - inode exhaustion
 - large log files
 - rapidly growing directories
 - abandoned temp files
 - container image accumulation
 - dead volume mounts
 - overlay filesystem growth
 - kubelet directories consuming space
 - journald growth
 ## Network and Connection State
 Check for:
 - excessive ESTABLISHED, TIME_WAIT, CLOSE_WAIT, SYN_RECV
 - suspicious open listeners
 - unresolved DNS symptoms if evident
 - failed kubelet/runtime API connectivity
 - API server reachability symptoms if visible
 - long-lived unexpected tunnels or forwards
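A connection-state tally like the one described above can be sketched as follows. This is an illustrative helper, assuming the input is the text output of `ss -tan`; note that `ss` prints state names with hyphens and abbreviations (`ESTAB`, `TIME-WAIT`, `CLOSE-WAIT`, `SYN-RECV`). The function name `count_tcp_states` is hypothetical.

```python
from collections import Counter


def count_tcp_states(ss_output: str) -> Counter:
    """Count TCP sockets per state from `ss -tan` text output.

    Assumption: the first line is the header and the state is the
    first whitespace-separated column of every following row.
    """
    counts: Counter = Counter()
    for line in ss_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields:
            counts[fields[0]] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "State      Recv-Q Send-Q Local Address:Port  Peer Address:Port\n"
        "LISTEN     0      128    0.0.0.0:22          0.0.0.0:*\n"
        "ESTAB      0      0      10.0.0.5:22         10.0.0.9:51514\n"
        "CLOSE-WAIT 1      0      10.0.0.5:8080       10.0.0.7:40112\n"
        "CLOSE-WAIT 1      0      10.0.0.5:8080       10.0.0.7:40113\n"
    )
    for state, count in count_tcp_states(sample).most_common():
        print(f"{state}: {count}")
```

What counts as "excessive" depends on the workload, so the tally is best compared against a known baseline for the host rather than a fixed threshold.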
 ## Kubernetes Node Health
 If kubectl access is available, inspect:
 - node Ready status
 - conditions: MemoryPressure, DiskPressure, PIDPressure, NetworkUnavailable
 - recent events on the node
 - top pods by CPU and memory
 - restarting pods
 - crashlooping workloads
 - daemonset health
 - pods pinned to node causing pressure
 - node cordon/drain history if visible
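The Ready status and pressure conditions listed above can be read from the node object without changing anything. The sketch below assumes the input is the JSON printed by `kubectl get node <name> -o json`, where `status.conditions` is a list of objects with `type` and `status` fields; the helper name `summarize_node_conditions` is hypothetical.

```python
import json
from typing import Any, Dict

# Condition types named in the checklist above.
PRESSURE_TYPES = ("MemoryPressure", "DiskPressure", "PIDPressure", "NetworkUnavailable")


def summarize_node_conditions(node_json: str) -> Dict[str, Any]:
    """Report Ready state and any active pressure conditions for one node."""
    node = json.loads(node_json)
    conditions = node.get("status", {}).get("conditions", [])
    by_type = {c.get("type"): c.get("status") for c in conditions}
    return {
        "ready": by_type.get("Ready") == "True",
        "active_pressure": [t for t in PRESSURE_TYPES if by_type.get(t) == "True"],
    }


if __name__ == "__main__":
    sample = json.dumps(
        {
            "status": {
                "conditions": [
                    {"type": "MemoryPressure", "status": "False"},
                    {"type": "DiskPressure", "status": "True"},
                    {"type": "PIDPressure", "status": "False"},
                    {"type": "Ready", "status": "True"},
                ]
            }
        }
    )
    print(summarize_node_conditions(sample))
```

A condition that is absent from the output is treated as not active here; a stricter variant could report it as unknown instead.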
 ## Runtime and Control Services
 Inspect status and recent logs for:
 - kubelet
 - container runtime
 - node-exporter or monitoring agents if present
 - CNI components if local visibility exists
 Look for:
 - repeated restarts
 - API timeout errors
 - cgroup issues
 - image GC failures
 - pod sandbox creation failures
 - PLEG issues
 - disk or inode manager warnings
 # Diagnostic Style
 When you interpret evidence:
 - separate symptom from cause
 - do not overstate certainty
 - explicitly call out whether an issue is:
  - host-level
  - container-level
  - workload-level
  - cluster-level
  - uncertain / cross-layer
 When several causes are possible, rank them.
 # Safety Rules
 Never perform or recommend as a default:
 - kill -9 on broad process sets
 - rm -rf on system or kubelet directories
 - deleting container images blindly
 - restarting kubelet or container runtime without noting impact
 - draining or cordoning nodes without explaining implications
 - deleting pods without checking controller ownership and service impact
 - clearing logs blindly
 - dropping caches unless explicitly justified and authorized
 If cleanup is needed, prefer:
 - inspect first
 - estimate impact
 - identify ownership
 - recommend reversible or bounded steps
 - state rollback considerations where applicable
 # Guidance Style
 Your guidance should be:
 - concise but technically solid
 - actionable
 - prioritized
 - explicit about risk
 Prefer wording like:
 - "Evidence suggests…"
 - "Most likely…"
 - "Before acting, verify…"
 - "Low-risk next step…"
 - "Potentially disruptive action…"
 - "Do not do this unless…"
 # Command Strategy
 When suggesting commands, use phases:
 ## Phase 1 – Safe Inspection
 Read-only inspection commands.
 ## Phase 2 – Focused Verification
 Commands to confirm or disprove likely causes.
 ## Phase 3 – Optional Remediation
 Clearly marked commands that may alter system state.
 Prefer common Linux/Kubernetes commands and explain what each is for.
 # Expected Inputs
 You may receive:
 - raw command output
 - copied logs
 - kubectl output
 - descriptions of symptoms
 - process lists
 - memory or disk reports
 - journald excerpts
 Work with what is available and say what is missing.
 # Response Constraints
 - Do not invent evidence
 - Do not assume root access unless stated
 - Do not assume kubectl access unless stated
 - Do not assume that high memory usage is bad unless pressure or leak symptoms are present
 - Do not assume old processes are stale without contextual clues
 - Do not treat cache as a leak by default
 - Do not recommend aggressive cleanup merely because resource usage is non-zero
 # Optional Heuristics
 Use heuristics such as:
 - zombie count > 0 is noteworthy
 - D-state tasks deserve attention
 - repeated OOM kills are high severity
 - available memory trending very low, combined with reclaim pressure, is serious
 - CLOSE_WAIT accumulation suggests application/socket cleanup issues
 - inode pressure is often missed and operationally important
 - frequent restarts plus node pressure may point to host instability
 - kubelet and runtime log repetition often reveals the real fault line
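Some of the heuristics above can be sketched as a simple triage function. This is an illustration only: the `NodeObservation` fields, the numeric thresholds, and every severity except "High" for repeated OOM kills are assumptions chosen for the example, not values defined by this agent.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class NodeObservation:
    """Hypothetical container for counts gathered during safe inspection."""

    zombie_count: int = 0
    d_state_count: int = 0
    oom_kills_recent: int = 0
    close_wait_count: int = 0
    inode_use_pct: float = 0.0


def triage(
    obs: NodeObservation,
    close_wait_threshold: int = 100,    # assumed threshold, tune per workload
    inode_threshold_pct: float = 90.0,  # assumed threshold, tune per filesystem
) -> List[Tuple[str, str]]:
    """Return (title, severity) pairs using the Findings severity scale."""
    findings: List[Tuple[str, str]] = []
    if obs.zombie_count > 0:
        findings.append(("Zombie processes present", "Low"))
    if obs.d_state_count > 0:
        findings.append(("Tasks in uninterruptible sleep (D state)", "Medium"))
    if obs.oom_kills_recent >= 2:  # "repeated" is read here as two or more
        findings.append(("Repeated OOM kills", "High"))
    if obs.close_wait_count >= close_wait_threshold:
        findings.append(("CLOSE_WAIT accumulation", "Medium"))
    if obs.inode_use_pct >= inode_threshold_pct:
        findings.append(("Inode pressure", "Medium"))
    return findings


if __name__ == "__main__":
    for title, severity in triage(NodeObservation(zombie_count=1, oom_kills_recent=3)):
        print(f"{severity}: {title}")
```

The output is a starting list for the Findings section; evidence, likely cause, and next steps still have to be filled in from the actual inspection.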
 # Default Task
 When invoked, begin by determining the current operational picture and producing a node health assessment focused on:
 - stale or abnormal processes
 - excessive memory consumers
 - resource pressure
 - signs of instability
 - safe guidance for stabilization
 If a structured assessment is requested, use the k3s-node-health-assessment protocol
 (`agents/protocols/sys-medic/k3s-node-health-assessment.md`) if available. The protocol
 provides a step-by-step procedure covering OS baseline, process hygiene, memory, CPU,
 disk, network, Kubernetes node state, and k3s runtime health.
 If insufficient evidence is available, state exactly which safe inspection commands should be run next.
 ---
 # Memory Template Extensions
 sys-medic's memory file (`.kaizen/agents/sys-medic/memory.md`) extends the base template
 (ADR-002) with three additional sections:
 ```markdown
 ## Node Profiles
 <!-- Per-node operational baseline established over sessions -->
 <!-- hostname | typical load | known quirks | last assessment date -->
 ## Recurring Findings
 <!-- Issues seen more than once: pattern · first seen · frequency -->
 ## Cleared Issues
 <!-- Issues that were resolved: what was done · when · outcome -->
 ```
 These sections are maintained by the session-close protocol above.
 ---
 # Related Documents
 - **Protocol runbook:** `agents/protocols/sys-medic/k3s-node-health-assessment.md`
 - **Memory convention:** `docs/adr/ADR-002-project-memory-convention.md`
 - **Protocols convention:** `docs/adr/ADR-003-protocols-artifact-convention.md`
 - **Agency framework:** `docs/agency-framework.md`
--- a/src/kaizen_agentic/data/agents/agent-tdd-workflow.md
+++ b/src/kaizen_agentic/data/agents/agent-tdd-workflow.md
@@ -1,6 +1,22 @@
 ---
-name: tddai-assistant
+name: tdd-workflow
 description: Expert guidance for the TDD8 workflow methodology, specializing in the comprehensive ISSUE-TEST-RED-GREEN-REFACTOR-DOCUMENT-REFINE-PUBLISH cycle with sophisticated sidequest management and proper test organization.
 category: development-process
 memory: enabled
 metrics:
  primary:
    name: test_pass_rate
    description: Share of acceptance-criteria tests passing at PUBLISH
    measurement: passing_tests / total_tests for the active issue workspace
    target: 1.0
  secondary:
    - name: cycle_time_s
      description: Wall-clock time from ISSUE start to PUBLISH
      measurement: Session duration in seconds (execution_time_s in ADR-004)
  collection:
    frequency: per_execution
    storage: .kaizen/metrics/tdd-workflow/
    retention: 180d
 ---
 # TDDAi Assistant Agent
@@ -356,3 +372,35 @@ Remember: The goal is to build software incrementally using the proven TDD8 cycl
 **ISSUE-TEST-RED-GREEN-REFACTOR-DOCUMENT-REFINE-PUBLISH**
 The comprehensive 8-step development methodology that transforms requirements into production-ready, well-tested, documented functionality while maintaining code quality and project momentum through intelligent sidequest management.
 ---
 ## Session Start
 1. Check for `.kaizen/agents/tdd-workflow/memory.md` in the project root.
 2. If present, read it — pay attention to `## Watch Points` (recurring test pitfalls) and `## What Worked` (effective patterns for this project).
 3. If absent, offer to initialise with `kaizen-agentic memory init tdd-workflow`.
 ## Session Close
 1. Update `## Accumulated Findings` with any new TDD patterns or recurring failure modes observed.
 2. Update `## What Worked` and `## Watch Points` as needed.
 3. Append one line to `## Session Log`: `YYYY-MM-DD · <issue or feature> · <outcome>`.
 4. Bump `last_updated` to today and increment `session_count`.
 5. Record session metrics (ADR-004; adjust values to match outcome):
 ```bash
 # Successful PUBLISH — all acceptance tests green:
 echo '{"success": true, "execution_time_s": <seconds>, "quality_score": 0.9, "primary_metric": {"name": "test_pass_rate", "value": 1.0, "target": 1.0}, "metadata": {"issue": "<NUM>", "phase": "PUBLISH"}}' \
  | kaizen-agentic metrics record tdd-workflow --json --idempotency-key <session-id>
 # Incomplete or failed cycle:
 echo '{"success": false, "execution_time_s": <seconds>, "quality_score": 0.4, "primary_metric": {"name": "test_pass_rate", "value": <rate>, "target": 1.0}, "metadata": {"issue": "<NUM>", "phase": "<last-phase>"}}' \
  | kaizen-agentic metrics record tdd-workflow --json --idempotency-key <session-id>
 ```
 Shorthand when only outcome and duration matter:
 ```bash
 kaizen-agentic metrics record tdd-workflow --success --time <seconds> --quality <0.0-1.0>
 ```
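The JSON payload used in the examples above can also be assembled programmatically. The sketch below mirrors the field names from those examples; the helper names `pass_rate` and `build_metrics_record` are hypothetical, and treating an empty test set as a rate of 0.0 is an assumption.

```python
import json
from typing import Any, Dict


def pass_rate(passing_tests: int, total_tests: int) -> float:
    """test_pass_rate = passing_tests / total_tests (0.0 when no tests exist)."""
    return passing_tests / total_tests if total_tests else 0.0


def build_metrics_record(
    *,
    success: bool,
    execution_time_s: float,
    quality_score: float,
    passing_tests: int,
    total_tests: int,
    issue: str,
    phase: str,
) -> Dict[str, Any]:
    """Build a record with the same keys as the shell examples above."""
    return {
        "success": success,
        "execution_time_s": execution_time_s,
        "quality_score": quality_score,
        "primary_metric": {
            "name": "test_pass_rate",
            "value": pass_rate(passing_tests, total_tests),
            "target": 1.0,
        },
        "metadata": {"issue": issue, "phase": phase},
    }


if __name__ == "__main__":
    record = build_metrics_record(
        success=True,
        execution_time_s=1840,
        quality_score=0.9,
        passing_tests=12,
        total_tests=12,
        issue="42",
        phase="PUBLISH",
    )
    # Print one JSON document to stdout so it can be piped into the CLI.
    print(json.dumps(record))
```

The printed line can be piped into `kaizen-agentic metrics record tdd-workflow --json --idempotency-key <session-id>` in the same way as the `echo` examples.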
--- a/src/kaizen_agentic/extensions.py
+++ b/src/kaizen_agentic/extensions.py
@@ -438,7 +438,10 @@ version: {extension.version}
        agent_content += "---\n\n"
        agent_content += f"# {extension.name}\n\n"
        agent_content += f"{extension.description}\n\n"
-        agent_content += f"This agent extends **{extension.base_agent}** with project-specific functionality.\n\n"
+        agent_content += (
+            f"This agent extends **{extension.base_agent}** "
+            f"with project-specific functionality.\n\n"
+        )
        if extension.configuration.get("custom_instructions"):
            agent_content += "## Custom Instructions\n\n"
--- a/src/kaizen_agentic/installer.py
+++ b/src/kaizen_agentic/installer.py
@@ -47,16 +47,16 @@ class AgentInstaller:
        if config.create_backup and agents_dir.exists():
            self._create_backup(agents_dir)
-        # Install each agent
+        # Install each agent (copy by path — avoids parsing unrelated agents)
        for agent_name in resolved_agents:
            try:
-                agent = self.registry.get_agent(agent_name)
+                source_path = self.registry.get_agent_path(agent_name)
-                if not agent:
+                if not source_path:
                    results[agent_name] = "ERROR: Agent not found"
                    continue
                target_path = agents_dir / f"agent-{agent_name}.md"
-                shutil.copy2(agent.file_path, target_path)
+                shutil.copy2(source_path, target_path)
                results[agent_name] = "INSTALLED"
            except Exception as e:
@@ -520,106 +520,61 @@ __version__ = "0.1.0"
    def _create_makefile(self, project_dir: Path, project_name: str):
        """Create Makefile with standard targets."""
        package_name = project_name.replace("-", "_")
-        makefile_content = f"""# {project_name} - Makefile for development workflow
-# Generated by Kaizen Agentic
-
-.PHONY: help setup-complete setup-python setup-tools test lint format clean agents-status agents-update
-
-# Default target
-help:
-	@echo "Available targets:"
-	@echo "  setup-complete  - Complete development environment setup"
-	@echo "  setup-python    - Set up Python virtual environment and dependencies"
-	@echo "  setup-tools     - Install development tools"
-	@echo "  test           - Run test suite"
-	@echo "  lint           - Run code quality checks"
-	@echo "  format         - Format code with black"
-	@echo "  clean          - Clean build artifacts"
-	@echo "  agents-status  - Show installed agents status"
-	@echo "  agents-update  - Update agents to latest versions"
-
-# Virtual environment detection
-VENV := .venv
-PYTHON := $(VENV)/bin/python
-PIP := $(VENV)/bin/pip
-
-# Complete setup
-setup-complete: setup-python setup-tools
-	@echo "✅ Development environment setup complete!"
-	@echo "Next steps:"
-	@echo "  source $(VENV)/bin/activate  # Activate virtual environment"
-	@echo "  make test                    # Run tests"
-	@echo "  make lint                    # Check code quality"
-
-# Python environment setup
-setup-python: $(VENV)/bin/activate
-
-$(VENV)/bin/activate: pyproject.toml
-	python3 -m venv $(VENV)
-	$(PIP) install --upgrade pip
-	$(PIP) install -e ".[dev]"
-	touch $(VENV)/bin/activate
-
-# Development tools setup
-setup-tools: $(VENV)/bin/activate
-	@echo "Development tools installed via pyproject.toml"
-
-# Testing
-test: $(VENV)/bin/activate
-	$(PYTHON) -m pytest tests/ -v
-
-test-coverage: $(VENV)/bin/activate
-	$(PYTHON) -m pytest tests/ --cov=src/{package_name} --cov-report=html --cov-report=term-missing
-
-# Code quality
-lint: $(VENV)/bin/activate
-	$(PYTHON) -m flake8 src/ tests/
-	$(PYTHON) -m mypy src/
-
-format: $(VENV)/bin/activate
-	$(PYTHON) -m black src/ tests/
-format-check: $(VENV)/bin/activate
-	$(PYTHON) -m black --check src/ tests/
-# Cleanup
-clean:
-	rm -rf build/
-	rm -rf dist/
-	rm -rf *.egg-info/
-	rm -rf .pytest_cache/
-	rm -rf .coverage
-	rm -rf htmlcov/
-	find . -type d -name __pycache__ -exec rm -rf {{}} +
-	find . -type f -name "*.pyc" -delete
-# Agent management
-agents-status:
-	@if command -v kaizen-agentic >/dev/null 2>&1; then \\
-		kaizen-agentic status; \\
-	else \\
-		echo "kaizen-agentic not found. Install with: pip install kaizen-agentic"; \\
-	fi
-agents-update:
-	@if command -v kaizen-agentic >/dev/null 2>&1; then \\
-		kaizen-agentic update; \\
-	else \\
-		echo "kaizen-agentic not found. Install with: pip install kaizen-agentic"; \\
-	fi
-agents-list:
-	@if command -v kaizen-agentic >/dev/null 2>&1; then \\
-		kaizen-agentic list; \\
-	else \\
-		echo "kaizen-agentic not found. Install with: pip install kaizen-agentic"; \\
-	fi
-agents-validate:
-	@if command -v kaizen-agentic >/dev/null 2>&1; then \\
-		kaizen-agentic validate; \\
-	else \\
-		echo "kaizen-agentic not found. Install with: pip install kaizen-agentic"; \\
-	fi
-"""
-        (project_dir / "Makefile").write_text(makefile_content)
+        tab = "\t"
+        lines = [
+            f"# {project_name} - Makefile for development workflow",
+            "# Generated by Kaizen Agentic",
+            "",
+            ".PHONY: help setup-complete setup-python setup-tools test lint "
+            "format clean agents-status agents-update",
+            "",
+            "help:",
+            f'{tab}@echo "Available targets:"',
+            f'{tab}@echo "  setup-complete  - Complete development environment setup"',
+            f'{tab}@echo "  setup-python    - Set up Python virtual environment"',
+            f'{tab}@echo "  test           - Run test suite"',
+            f'{tab}@echo "  agents-status  - Show installed agents status"',
+            "",
+            "VENV := .venv",
+            "PYTHON := $(VENV)/bin/python",
+            "PIP := $(VENV)/bin/pip",
+            "",
+            "setup-complete: setup-python setup-tools",
+            f'{tab}@echo "Development environment setup complete"',
+            "",
+            "setup-python: $(VENV)/bin/activate",
+            "",
+            "$(VENV)/bin/activate: pyproject.toml",
+            f"{tab}python3 -m venv $(VENV)",
+            f"{tab}$(PIP) install --upgrade pip",
+            f'{tab}$(PIP) install -e ".[dev]"',
+            f"{tab}touch $(VENV)/bin/activate",
+            "",
+            "setup-tools: $(VENV)/bin/activate",
+            f'{tab}@echo "Development tools installed via pyproject.toml"',
+            "",
+            "test: $(VENV)/bin/activate",
+            f"{tab}$(PYTHON) -m pytest tests/ -v",
+            "",
+            "test-coverage: $(VENV)/bin/activate",
+            f"{tab}$(PYTHON) -m pytest tests/ --cov=src/{package_name} "
+            f"--cov-report=html --cov-report=term-missing",
+            "",
+            "lint: $(VENV)/bin/activate",
+            f"{tab}$(PYTHON) -m flake8 src/ tests/",
+            "",
+            "format: $(VENV)/bin/activate",
+            f"{tab}$(PYTHON) -m black src/ tests/",
+            "",
+            "clean:",
+            f"{tab}rm -rf build/ dist/ *.egg-info/ .pytest_cache/ .coverage htmlcov/",
+            "",
+            "agents-status:",
+            f"{tab}@command -v kaizen-agentic >/dev/null 2>&1 && kaizen-agentic status "
+            f'|| echo "kaizen-agentic not installed"',
+            "",
+            "agents-update:",
+            f"{tab}@command -v kaizen-agentic >/dev/null 2>&1 && kaizen-agentic update "
+            f'|| echo "kaizen-agentic not installed"',
+        ]
+        (project_dir / "Makefile").write_text("\n".join(lines) + "\n")
--- a/src/kaizen_agentic/integrations/__init__.py
+++ b/src/kaizen_agentic/integrations/__init__.py
@@ -0,0 +1,10 @@
 """Ecosystem integration adapters (Helix Forge, artifact-store)."""
 from .artifact_store import publish_optimizer_evidence
 from .helix import HelixCorrelationAdapter, enrich_helix_correlation
 __all__ = [
    "HelixCorrelationAdapter",
    "enrich_helix_correlation",
    "publish_optimizer_evidence",
 ]
--- a/src/kaizen_agentic/integrations/artifact_store.py
+++ b/src/kaizen_agentic/integrations/artifact_store.py
@@ -0,0 +1,233 @@
 """artifact-store publish adapter for optimizer evidence (WP-0004 Part 3)."""
 from __future__ import annotations
 import json
 import os
 import uuid
 from dataclasses import dataclass
 from pathlib import Path
 from typing import Any, Dict, List, Optional
 from urllib import error, parse, request
 from ..metrics import OptimizerStore
 ENV_API_URL = "ARTIFACTSTORE_API_URL"
 ENV_API_TOKEN = "ARTIFACTSTORE_API_TOKEN"
 DEFAULT_RETENTION_CLASS = "raw-evidence"
 PRODUCER = "kaizen-agentic"
@dataclass
 class PublishResult:
    package_id: str
    manifest_digest: Optional[str]
    files_uploaded: int
    retention_class: str
 def build_optimizer_manifest(
    project_root: Path,
    *,
    agents: Optional[List[str]] = None,
 ) -> Dict[str, Any]:
    """Manifest metadata for an optimizer evidence package."""
    store = OptimizerStore(project_root)
    analysis = {}
    if store.analysis_path.exists():
        analysis = json.loads(store.analysis_path.read_text(encoding="utf-8"))
    return {
        "schema": "kaizen-agentic/optimizer-evidence/v1",
        "project": project_root.name,
        "project_root": str(project_root.resolve()),
        "producer": PRODUCER,
        "retention_class": DEFAULT_RETENTION_CLASS,
        "retention_days": 180,
        "optimized_at": analysis.get("optimized_at"),
        "agents": agents or [item.get("agent") for item in analysis.get("agents", [])],
        "files": [
            "optimizer/analysis.json",
            "optimizer/recommendations.jsonl",
        ],
    }
 def publish_optimizer_evidence(
    project_root: Path,
    *,
    api_url: str,
    token: str,
    subject: Optional[str] = None,
    retention_class: str = DEFAULT_RETENTION_CLASS,
 ) -> PublishResult:
    """Register optimizer outputs as an artifact-store package."""
    store = OptimizerStore(project_root)
    if not store.analysis_path.exists():
        raise FileNotFoundError(
            f"No optimizer analysis at {store.analysis_path}. "
            "Run: kaizen-agentic metrics optimize"
        )
    manifest = build_optimizer_manifest(project_root)
    package_name = f"kaizen-optimizer-{project_root.name}"
    package_subject = subject or project_root.name
    created = _http_json(
        "POST",
        api_url,
        "/packages",
        token,
        {
            "name": package_name,
            "producer": PRODUCER,
            "subject": package_subject,
            "retention_class": retention_class,
            "metadata": manifest,
        },
    )
    package_id = created["id"]
    uploads = [
        (
            store.analysis_path,
            "optimizer/analysis.json",
            "application/json",
        ),
    ]
    if store.recommendations_path.exists():
        uploads.append(
            (
                store.recommendations_path,
                "optimizer/recommendations.jsonl",
                "application/x-ndjson",
            )
        )
    for path, relative_path, media_type in uploads:
        _http_multipart(
            api_url,
            f"/packages/{package_id}/files",
            token,
            fields={"relative_path": relative_path, "media_type": media_type},
            file_field="file",
            file_name=path.name,
            file_content_type=media_type,
            file_bytes=path.read_bytes(),
        )
    finalized = _http_json(
        "POST",
        api_url,
        f"/packages/{package_id}/finalize",
        token,
        {},
    )
    return PublishResult(
        package_id=package_id,
        manifest_digest=finalized.get("manifest_digest"),
        files_uploaded=len(uploads),
        retention_class=retention_class,
    )
 def default_api_url() -> str:
    return os.environ.get(ENV_API_URL, "http://127.0.0.1:8000").rstrip("/")
 def default_api_token() -> str:
    return os.environ.get(ENV_API_TOKEN, "")
 def _http_json(
    method: str,
    base_url: str,
    path: str,
    token: str,
    payload: Dict[str, Any],
 ) -> Dict[str, Any]:
    body = json.dumps(payload).encode("utf-8") if payload else None
    headers = {"Accept": "application/json"}
    if body is not None:
        headers["Content-Type"] = "application/json"
    response = _http_bytes(method, base_url, path, token, body=body, headers=headers)
    decoded = json.loads(response)
    if not isinstance(decoded, dict):
        raise ValueError(f"expected JSON object from {path}")
    return decoded
 def _http_multipart(
    base_url: str,
    path: str,
    token: str,
    *,
    fields: Dict[str, str],
    file_field: str,
    file_name: str,
    file_content_type: str,
    file_bytes: bytes,
 ) -> Dict[str, Any]:
    boundary = f"kaizen-{uuid.uuid4().hex}"
    body = bytearray()
    for name, value in fields.items():
        body.extend(f"--{boundary}\r\n".encode("ascii"))
        body.extend(
            f'Content-Disposition: form-data; name="{_quote(name)}"\r\n\r\n'.encode()
        )
        body.extend(value.encode())
        body.extend(b"\r\n")
    body.extend(f"--{boundary}\r\n".encode("ascii"))
    body.extend(
        (
            f'Content-Disposition: form-data; name="{_quote(file_field)}"; '
            f'filename="{_quote(file_name)}"\r\n'
            f"Content-Type: {file_content_type}\r\n\r\n"
        ).encode()
    )
    body.extend(file_bytes)
    body.extend(b"\r\n")
    body.extend(f"--{boundary}--\r\n".encode("ascii"))
    response = _http_bytes(
        "POST",
        base_url,
        path,
        token,
        body=bytes(body),
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "Accept": "application/json",
        },
    )
    decoded = json.loads(response)
    if not isinstance(decoded, dict):
        raise ValueError(f"expected JSON object from {path}")
    return decoded
 def _http_bytes(
    method: str,
    base_url: str,
    path: str,
    token: str,
    *,
    body: Optional[bytes] = None,
    headers: Optional[Dict[str, str]] = None,
 ) -> bytes:
    url = f"{base_url.rstrip('/')}/{path.lstrip('/')}"
    effective_headers = dict(headers or {})
    if token:
        effective_headers["Authorization"] = f"Bearer {token}"
    req = request.Request(url, data=body, headers=effective_headers, method=method)
    try:
        with request.urlopen(req, timeout=30) as resp:
            return resp.read()
    except error.HTTPError as exc:
        detail = exc.read().decode("utf-8", errors="replace")
        raise RuntimeError(f"HTTP {exc.code} from {path}: {detail}") from exc
 def _quote(value: str) -> str:
    return parse.quote(value, safe="")
--- a/src/kaizen_agentic/integrations/helix.py
+++ b/src/kaizen_agentic/integrations/helix.py
@@ -0,0 +1,170 @@
 """Helix Forge correlation adapter (ADR-004, agentic-resources)."""
 from __future__ import annotations
 import json
 import os
 import sqlite3
 from dataclasses import dataclass
 from pathlib import Path
 from typing import Any, Dict, Optional
 ENV_SESSION_UID = "HELIX_SESSION_UID"
 ENV_REPO = "HELIX_REPO"
 ENV_FLAVOR = "HELIX_FLAVOR"
 ENV_TOKENS = "HELIX_TOKENS"
 ENV_INFRA_SHARE = "HELIX_INFRA_OVERHEAD_SHARE"
 ENV_STORE_DB = "HELIX_STORE_DB"
 def enrich_helix_correlation(record: Dict[str, Any]) -> Dict[str, Any]:
    """Apply optional Helix correlation fields from env or existing record."""
    payload = dict(record)
    uid = payload.get("helix_session_uid") or os.environ.get(ENV_SESSION_UID)
    if uid:
        payload["helix_session_uid"] = uid
    repo = payload.get("repo") or os.environ.get(ENV_REPO)
    if repo:
        payload["repo"] = repo
    flavor = payload.get("flavor") or os.environ.get(ENV_FLAVOR)
    if flavor:
        payload["flavor"] = flavor
    tokens_raw = payload.get("tokens")
    if tokens_raw is None and ENV_TOKENS in os.environ:
        try:
            tokens_raw = int(os.environ[ENV_TOKENS])
        except ValueError:
            pass
    if tokens_raw is not None:
        payload["tokens"] = int(tokens_raw)
    infra = payload.get("infra_overhead_share")
    if infra is None and ENV_INFRA_SHARE in os.environ:
        try:
            infra = float(os.environ[ENV_INFRA_SHARE])
        except ValueError:
            pass
    if infra is not None:
        payload["infra_overhead_share"] = float(infra)
    return payload
 def digest_to_correlation_summary(
    session_uid: str,
    digest: Dict[str, Any],
    *,
    adapter: str,
 ) -> Dict[str, Any]:
    """Project a Helix digest into ADR-004 correlation summary fields."""
    cost = digest.get("cost") or {}
    input_tokens = int(cost.get("input_tokens") or 0)
    output_tokens = int(cost.get("output_tokens") or 0)
    wall_clock_s = cost.get("wall_clock_s")
    summary: Dict[str, Any] = {
        "helix_session_uid": session_uid,
        "repo": digest.get("repo"),
        "flavor": digest.get("flavor"),
        "fleet_outcome": digest.get("outcome"),
        "tokens": input_tokens + output_tokens,
        "adapter": adapter,
    }
    if wall_clock_s is not None:
        summary["wall_clock_s"] = float(wall_clock_s)
    markers = digest.get("markers") or {}
    tool_histogram = digest.get("tool_histogram") or {}
    mcp_calls = sum(
        count for tool, count in tool_histogram.items() if str(tool).startswith("mcp__")
    )
    total_calls = sum(tool_histogram.values())
    if total_calls:
        summary["infra_overhead_share"] = round(mcp_calls / total_calls, 3)
    elif "infra_overhead_share" in digest:
        summary["infra_overhead_share"] = digest["infra_overhead_share"]
    if markers:
        summary["markers"] = {
            key: markers[key]
            for key in ("errors", "retries", "test_runs")
            if key in markers
        }
    return summary
@dataclass
 class HelixCorrelationAdapter:
    """Read-only lookup of Helix Forge session digests."""
    store_db: Optional[Path] = None
    @classmethod
    def from_env(cls) -> "HelixCorrelationAdapter":
        raw = os.environ.get(ENV_STORE_DB)
        return cls(store_db=Path(raw).resolve() if raw else None)
    def lookup(self, session_uid: str) -> Dict[str, Any]:
        """Return a correlation summary or a status payload."""
        if self.store_db and self.store_db.exists():
            digest = self._load_digest_sqlite(session_uid)
            if digest is not None:
                return digest_to_correlation_summary(
                    session_uid,
                    digest,
                    adapter="helix-sqlite",
                )
            return {
                "helix_session_uid": session_uid,
                "adapter": "helix-sqlite",
                "status": "not_found",
                "message": f"No digest for session_uid in {self.store_db}",
            }
        return {
            "helix_session_uid": session_uid,
            "adapter": "stub",
            "status": "not_configured",
            "message": (
                "Set HELIX_STORE_DB to an agentic-resources session-memory SQLite "
                "database for live lookup. Correlation fields on project metrics "
                "still work via HELIX_SESSION_UID at record time."
            ),
            "expected_fields": [
                "helix_session_uid",
                "repo",
                "flavor",
                "tokens",
                "infra_overhead_share",
                "fleet_outcome",
                "wall_clock_s",
            ],
        }
    def _load_digest_sqlite(self, session_uid: str) -> Optional[Dict[str, Any]]:
        conn = sqlite3.connect(str(self.store_db))
        try:
            row = conn.execute(
                "SELECT json FROM digests WHERE session_uid = ?",
                (session_uid,),
            ).fetchone()
            if not row:
                return None
            digest = json.loads(row[0])
            digest.setdefault("session_uid", session_uid)
            session_row = conn.execute(
                "SELECT json FROM sessions WHERE session_uid = ?",
                (session_uid,),
            ).fetchone()
            if session_row:
                session = json.loads(session_row[0])
                digest.setdefault("repo", session.get("repo"))
                digest.setdefault("flavor", session.get("flavor"))
            return digest
        finally:
            conn.close()
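The digest lookup above can be exercised against an in-memory database. This is a minimal sketch: the table layout (`session_uid` plus a `json` text column) is inferred from the two `SELECT` statements and is an assumption about the agentic-resources store, and `load_digest` is a hypothetical standalone version of `_load_digest_sqlite`.

```python
import json
import sqlite3
from typing import Any, Dict, Optional


def load_digest(
    conn: sqlite3.Connection, session_uid: str
) -> Optional[Dict[str, Any]]:
    """Fetch a digest and backfill repo/flavor from the session row."""
    row = conn.execute(
        "SELECT json FROM digests WHERE session_uid = ?", (session_uid,)
    ).fetchone()
    if not row:
        return None
    digest = json.loads(row[0])
    digest.setdefault("session_uid", session_uid)
    session_row = conn.execute(
        "SELECT json FROM sessions WHERE session_uid = ?", (session_uid,)
    ).fetchone()
    if session_row:
        session = json.loads(session_row[0])
        # setdefault keeps values already present on the digest.
        digest.setdefault("repo", session.get("repo"))
        digest.setdefault("flavor", session.get("flavor"))
    return digest


# Assumed schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE digests (session_uid TEXT PRIMARY KEY, json TEXT)")
conn.execute("CREATE TABLE sessions (session_uid TEXT PRIMARY KEY, json TEXT)")
conn.execute(
    "INSERT INTO digests VALUES (?, ?)",
    ("s-1", json.dumps({"outcome": "success", "flavor": "claude"})),
)
conn.execute(
    "INSERT INTO sessions VALUES (?, ?)",
    ("s-1", json.dumps({"repo": "demo-app", "flavor": "other"})),
)
found = load_digest(conn, "s-1")
missing = load_digest(conn, "s-2")
conn.close()
```

Because the backfill uses `setdefault`, a `flavor` recorded on the digest wins over the one on the session row; only absent keys are filled in.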
--- a/src/kaizen_agentic/metrics.py
+++ b/src/kaizen_agentic/metrics.py
@@ -0,0 +1,278 @@
 """Project-scoped agent metrics storage (ADR-004)."""
 from __future__ import annotations
 import json
 from dataclasses import dataclass
 from datetime import datetime, timedelta, timezone
 from pathlib import Path
 from typing import Any, Dict, List, Optional
 DEFAULT_RETENTION_DAYS = 180
 def _utc_now_iso() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
 def _parse_timestamp(value: str) -> datetime:
    normalized = value.replace("Z", "+00:00")
    return datetime.fromisoformat(normalized)
 _TREND_ARROWS = {"up": "↑", "down": "↓", "stable": "→", "unknown": "?"}
 def performance_summary_markdown(summary: Dict[str, Any]) -> str:
    """Format ADR-004 summary.json as a Coach brief markdown section."""
    if not summary or summary.get("execution_count", 0) == 0:
        return ""
    trend = summary.get("trend", {})
    success_trend = trend.get("success_rate", "unknown")
    quality_trend = trend.get("quality_score", "unknown")
    lines = [
        "## Performance Summary",
        "",
        f"- Executions: {summary['execution_count']}",
        (
            f"- Success rate: {summary['success_rate']:.1%} "
            f"({_TREND_ARROWS.get(success_trend, '?')} {success_trend})"
        ),
        (
            f"- Avg quality: {summary['avg_quality_score']:.2f} "
            f"({_TREND_ARROWS.get(quality_trend, '?')} {quality_trend})"
        ),
        f"- Avg execution time: {summary['avg_execution_time_s']:.1f}s",
    ]
    if summary.get("last_execution"):
        lines.append(f"- Last execution: {summary['last_execution']}")
    lines.append("")
    return "\n".join(lines)
 def _trend_direction(recent: List[float], prior: List[float]) -> str:
    """Compare window averages; a shift under 0.05 counts as stable."""
    if not recent:
        return "unknown"
    if not prior:
        return "stable"
    recent_avg = sum(recent) / len(recent)
    prior_avg = sum(prior) / len(prior)
    delta = recent_avg - prior_avg
    if abs(delta) < 0.05:
        return "stable"
    return "up" if delta > 0 else "down"
@dataclass
 class MetricsStore:
    """Append-only per-agent execution metrics under .kaizen/metrics/."""
    project_root: Path
    agent_name: str
    retention_days: int = DEFAULT_RETENTION_DAYS
    def __post_init__(self) -> None:
        self.project_root = Path(self.project_root).resolve()
        self.agent_dir = self.project_root / ".kaizen" / "metrics" / self.agent_name
        self.executions_path = self.agent_dir / "executions.jsonl"
        self.summary_path = self.agent_dir / "summary.json"
    @classmethod
    def list_agents(cls, project_root: Path) -> List[str]:
        metrics_root = Path(project_root).resolve() / ".kaizen" / "metrics"
        if not metrics_root.exists():
            return []
        agents = []
        for child in sorted(metrics_root.iterdir()):
            if child.is_dir() and (child / "executions.jsonl").exists():
                agents.append(child.name)
        return agents
    def scaffold(self) -> Path:
        """Create metrics directory for this agent."""
        self.agent_dir.mkdir(parents=True, exist_ok=True)
        if not self.executions_path.exists():
            self.executions_path.write_text("", encoding="utf-8")
        return self.agent_dir
    def append(
        self,
        record: Dict[str, Any],
        *,
        idempotency_key: Optional[str] = None,
    ) -> bool:
        """Append an execution record. Returns False if idempotency_key duplicates."""
        self.scaffold()
        payload = dict(record)
        payload.setdefault("agent", self.agent_name)
        payload.setdefault("timestamp", _utc_now_iso())
        if idempotency_key is not None:
            if self._has_idempotency_key(idempotency_key):
                return False
            payload["idempotency_key"] = idempotency_key
        if "success" not in payload:
            raise ValueError("execution record requires 'success' field")
        with self.executions_path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(payload, sort_keys=True))
            handle.write("\n")
        self.prune()
        self.write_summary()
        return True
    def read_executions(self) -> List[Dict[str, Any]]:
        if not self.executions_path.exists():
            return []
        records: List[Dict[str, Any]] = []
        with self.executions_path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:
                    records.append(json.loads(line))
        return records
    def summarise(self) -> Dict[str, Any]:
        records = self.read_executions()
        if not records:
            return {
                "agent": self.agent_name,
                "execution_count": 0,
                "success_rate": 0.0,
                "avg_quality_score": 0.0,
                "avg_execution_time_s": 0.0,
                "last_execution": None,
                "trend": {
                    "success_rate": "unknown",
                    "quality_score": "unknown",
                },
            }
        successes = [bool(r["success"]) for r in records]
        success_rate = sum(successes) / len(successes)
        quality_scores = [
            float(r["quality_score"])
            for r in records
            if r.get("quality_score") is not None
        ]
        execution_times = [
            float(r["execution_time_s"])
            for r in records
            if r.get("execution_time_s") is not None
        ]
        window = 5
        recent_success = [1.0 if s else 0.0 for s in successes[-window:]]
        prior_success = [1.0 if s else 0.0 for s in successes[:-window][-window:]]
        recent_quality = quality_scores[-window:]
        prior_quality = quality_scores[:-window][-window:]
        return {
            "agent": self.agent_name,
            "execution_count": len(records),
            "success_rate": round(success_rate, 3),
            "avg_quality_score": round(
                sum(quality_scores) / len(quality_scores) if quality_scores else 0.0,
                3,
            ),
            "avg_execution_time_s": round(
                sum(execution_times) / len(execution_times) if execution_times else 0.0,
                3,
            ),
            "last_execution": records[-1].get("timestamp"),
            "trend": {
                "success_rate": _trend_direction(recent_success, prior_success),
                "quality_score": _trend_direction(recent_quality, prior_quality),
            },
        }
    def write_summary(self) -> Dict[str, Any]:
        summary = self.summarise()
        self.agent_dir.mkdir(parents=True, exist_ok=True)
        self.summary_path.write_text(
            json.dumps(summary, indent=2, sort_keys=True) + "\n",
            encoding="utf-8",
        )
        return summary
    def read_summary(self) -> Optional[Dict[str, Any]]:
        if not self.summary_path.exists():
            return None
        return json.loads(self.summary_path.read_text(encoding="utf-8"))
    def prune(self) -> int:
        """Drop execution records older than retention_days. Returns removed count."""
        if not self.executions_path.exists():
            return 0
        cutoff = datetime.now(timezone.utc) - timedelta(days=self.retention_days)
        kept: List[Dict[str, Any]] = []
        removed = 0
        for record in self.read_executions():
            try:
                ts = _parse_timestamp(record["timestamp"])
            except (KeyError, ValueError):
                kept.append(record)
                continue
            if ts >= cutoff:
                kept.append(record)
            else:
                removed += 1
        if removed:
            with self.executions_path.open("w", encoding="utf-8") as handle:
                for record in kept:
                    handle.write(json.dumps(record, sort_keys=True))
                    handle.write("\n")
            self.write_summary()
        return removed
    def _has_idempotency_key(self, key: str) -> bool:
        return any(r.get("idempotency_key") == key for r in self.read_executions())
@dataclass
 class OptimizerStore:
    """Persist optimizer analysis output under .kaizen/metrics/optimizer/."""
    project_root: Path
    def __post_init__(self) -> None:
        self.project_root = Path(self.project_root).resolve()
        self.optimizer_dir = self.project_root / ".kaizen" / "metrics" / "optimizer"
        self.analysis_path = self.optimizer_dir / "analysis.json"
        self.recommendations_path = self.optimizer_dir / "recommendations.jsonl"
    def write_analysis(self, report: Dict[str, Any]) -> Path:
        self.optimizer_dir.mkdir(parents=True, exist_ok=True)
        self.analysis_path.write_text(
            json.dumps(report, indent=2, sort_keys=True) + "\n",
            encoding="utf-8",
        )
        return self.analysis_path
    def append_recommendations(
        self,
        agent_name: str,
        recommendations: List[Dict[str, Any]],
        *,
        metrics_count: int,
    ) -> None:
        self.optimizer_dir.mkdir(parents=True, exist_ok=True)
        entry = {
            "timestamp": _utc_now_iso(),
            "agent": agent_name,
            "metrics_count": metrics_count,
            "recommendations": recommendations,
        }
        with self.recommendations_path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(entry, sort_keys=True))
            handle.write("\n")
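The trend fields written by `MetricsStore.summarise` depend on how the two windows are sliced. The sketch below isolates that step; `split_windows` is a hypothetical helper name, and the window size of 5 and the 0.05 threshold are copied from the code above.

```python
from typing import List, Tuple


def split_windows(
    values: List[float], window: int = 5
) -> Tuple[List[float], List[float]]:
    """Return (recent, prior): the last `window` values and the ones before."""
    recent = values[-window:]
    prior = values[:-window][-window:]
    return recent, prior


def trend_direction(
    recent: List[float], prior: List[float], threshold: float = 0.05
) -> str:
    """Classify the change between two window averages."""
    if not recent:
        return "unknown"
    if not prior:
        return "stable"
    delta = sum(recent) / len(recent) - sum(prior) / len(prior)
    if abs(delta) < threshold:
        return "stable"
    return "up" if delta > 0 else "down"


# Ten runs: two successes in the first five, then five successes in a row.
successes = [1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0]
improving = trend_direction(*split_windows(successes))

# With five or fewer samples there is no prior window to compare against.
short_history = trend_direction(*split_windows([1.0, 1.0, 1.0]))
no_history = trend_direction(*split_windows([]))
```

A store with five or fewer records therefore reports `stable` rather than `unknown`; `unknown` only appears when there are no values at all.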
--- a/src/kaizen_agentic/migration.py
+++ b/src/kaizen_agentic/migration.py
@@ -2,9 +2,8 @@
 import json
 import shutil
 import yaml
 from pathlib import Path
-from typing import Dict, List, Optional, Set, Tuple
+from typing import Dict, List, Optional
 from dataclasses import dataclass
 from enum import Enum
--- a/src/kaizen_agentic/optimization.py
+++ b/src/kaizen_agentic/optimization.py
@@ -5,11 +5,16 @@ This module implements the kaizen loop for measuring, analyzing, and refining
 agent performance over time.
 """
-from typing import Dict, Any, List, Optional
+from typing import TYPE_CHECKING, Any, Dict, List, Optional
 from dataclasses import dataclass
 from datetime import datetime
 import statistics
 if TYPE_CHECKING:
    from .metrics import MetricsStore
 MIN_SAMPLES_FOR_RECOMMENDATIONS = 10
@dataclass
 class PerformanceMetrics:
@@ -35,6 +40,60 @@ class OptimizationLoop:
        self.metrics_history: List[PerformanceMetrics] = []
        self.optimization_history: List[Dict[str, Any]] = []
    @classmethod
    def from_metrics_store(
        cls,
        store: "MetricsStore",
        *,
        min_samples: int = 1,
    ) -> "OptimizationLoop":
        """Build an optimization loop from project-scoped execution records."""
        loop = cls(store.agent_name)
        records = store.read_executions()
        if len(records) < min_samples:
            return loop
        for record in records:
            loop.record_metrics(cls._metrics_from_record(record))
        return loop
    @staticmethod
    def _metrics_from_record(record: Dict[str, Any]) -> PerformanceMetrics:
        timestamp_raw = record.get("timestamp")
        try:
            timestamp = datetime.fromisoformat(
                str(timestamp_raw).replace("Z", "+00:00")
            )
        except (TypeError, ValueError):
            # Use an aware fallback so it compares with parsed "Z" values.
            timestamp = datetime.now().astimezone()
        success = bool(record.get("success", False))
        quality = record.get("quality_score")
        if quality is None:
            quality = 1.0 if success else 0.0
        metadata = {
            k: v
            for k, v in record.items()
            if k
            not in {
                "timestamp",
                "agent",
                "success",
                "execution_time_s",
                "quality_score",
                "primary_metric",
            }
        }
        return PerformanceMetrics(
            timestamp=timestamp,
            execution_time=float(record.get("execution_time_s") or 0.0),
            success_rate=1.0 if success else 0.0,
            quality_score=float(quality),
            resource_usage={},
            metadata=metadata or None,
        )
    def record_metrics(self, metrics: PerformanceMetrics) -> None:
        """Record performance metrics for analysis."""
        self.metrics_history.append(metrics)
@@ -160,3 +219,17 @@ class OptimizationLoop:
            "metrics_count": len(self.metrics_history),
            "optimization_cycles": len(self.optimization_history),
        }
    def get_optimization_report_json(self) -> Dict[str, Any]:
        """JSON-serializable optimization report."""
        return _to_json_safe(self.get_optimization_report())
 def _to_json_safe(value: Any) -> Any:
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, dict):
        return {k: _to_json_safe(v) for k, v in value.items()}
    if isinstance(value, list):
        return [_to_json_safe(item) for item in value]
    return value
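The recursive conversion in `_to_json_safe` can be tried on a small report. This is a standalone sketch with a made-up report dictionary; the field names are illustrative and are not taken from `get_optimization_report`.

```python
import json
from datetime import datetime, timezone
from typing import Any


def to_json_safe(value: Any) -> Any:
    """Recursively replace datetime values with ISO 8601 strings."""
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, dict):
        return {k: to_json_safe(v) for k, v in value.items()}
    if isinstance(value, list):
        return [to_json_safe(item) for item in value]
    return value


# Hypothetical report shape, for illustration only.
report = {
    "generated": datetime(2026, 6, 16, 2, 14, 7, tzinfo=timezone.utc),
    "history": [{"at": datetime(2026, 6, 15)}],
    "count": 3,
}
safe = to_json_safe(report)
encoded = json.dumps(safe, sort_keys=True)
```

Only `dict` and `list` containers are walked, so a datetime nested inside a tuple or set would pass through unchanged and still fail in `json.dumps`.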
--- a/src/kaizen_agentic/registry.py
+++ b/src/kaizen_agentic/registry.py
@@ -32,6 +32,18 @@ class AgentDefinition:
    model: Optional[str] = None
    memory: Optional[str] = None  # "enabled" (default) | "disabled"
    @staticmethod
    def _read_frontmatter(file_path: Path) -> dict:
        with open(file_path, "r", encoding="utf-8") as f:
            content = f.read()
        frontmatter_match = re.match(r"^---\n(.*?)\n---\n", content, re.DOTALL)
        if not frontmatter_match:
            raise ValueError(f"No YAML frontmatter found in {file_path}")
        frontmatter = yaml.safe_load(frontmatter_match.group(1))
        if not isinstance(frontmatter, dict) or "name" not in frontmatter:
            raise ValueError(f"Invalid frontmatter in {file_path}")
        return frontmatter
    @classmethod
    def from_file(cls, file_path: Path) -> "AgentDefinition":
        """Create AgentDefinition from a markdown file."""
@@ -135,7 +147,10 @@ class AgentDefinition:
            return AgentCategory.META
        # Infrastructure agents
-        if any(keyword in name_lower for keyword in ["setup", "repository", "tooling", "sys-medic", "medic"]):
+        if any(
            keyword in name_lower
            for keyword in ["setup", "repository", "tooling", "sys-medic", "medic"]
        ):
            return AgentCategory.INFRASTRUCTURE
        # Development process agents
@@ -155,29 +170,50 @@ class AgentRegistry:
    def __init__(self, agents_dir: Path):
        self.agents_dir = Path(agents_dir)
        self._agents: Dict[str, AgentDefinition] = {}
-        self._load_agents()
+        self._file_index: Dict[str, Path] = {}
        self._index_agent_files()
-    def _load_agents(self):
+    def _index_agent_files(self) -> None:
-        """Load all agents from the agents directory."""
+        """Index agent files by frontmatter name without full parse."""
        if not self.agents_dir.exists():
            return
        for agent_file in self.agents_dir.glob("agent-*.md"):
            try:
-                agent_def = AgentDefinition.from_file(agent_file)
+                frontmatter = AgentDefinition._read_frontmatter(agent_file)
-                self._agents[agent_def.name] = agent_def
+                self._file_index[frontmatter["name"]] = agent_file
            except Exception as e:
-                print(f"Warning: Failed to load agent {agent_file}: {e}")
+                print(f"Warning: Failed to index agent {agent_file}: {e}")
    def get_agent_path(self, name: str) -> Optional[Path]:
        """Return the source file path for an agent (no full parse)."""
        return self._file_index.get(name)
    def get_agent(self, name: str) -> Optional[AgentDefinition]:
-        """Get agent definition by name."""
+        """Get agent definition by name (lazy-loaded)."""
-        return self._agents.get(name)
+        if name in self._agents:
            return self._agents[name]
        file_path = self._file_index.get(name)
        if file_path is None:
            return None
        try:
            agent_def = AgentDefinition.from_file(file_path)
        except Exception as e:
            print(f"Warning: Failed to load agent {name}: {e}")
            return None
        self._agents[name] = agent_def
        return agent_def
    def agent_names(self) -> List[str]:
        """List indexed agent names without loading full definitions."""
        return sorted(self._file_index.keys())
    def list_agents(
        self, category: Optional[AgentCategory] = None
    ) -> List[AgentDefinition]:
        """List all agents, optionally filtered by category."""
-        agents = list(self._agents.values())
+        agents = [self.get_agent(name) for name in self.agent_names()]
        agents = [agent for agent in agents if agent is not None]
        if category:
            agents = [a for a in agents if a.category == category]
        return sorted(agents, key=lambda a: a.name)
@@ -185,7 +221,7 @@ class AgentRegistry:
    def get_categories(self) -> Dict[AgentCategory, List[AgentDefinition]]:
        """Get agents organized by category."""
        categories = {}
-        for agent in self._agents.values():
+        for agent in self.list_agents():
            if agent.category not in categories:
                categories[agent.category] = []
            categories[agent.category].append(agent)
@@ -227,12 +263,16 @@ class AgentRegistry:
        """Validate all agents and return validation errors."""
        errors = {}
-        for name, agent in self._agents.items():
+        for name in self.agent_names():
            agent = self.get_agent(name)
            if agent is None:
                errors[name] = ["Failed to load agent definition"]
                continue
            agent_errors = []
            # Check for missing dependencies
            for dep in agent.dependencies:
-                if dep not in self._agents:
+                if dep not in self._file_index:
                    agent_errors.append(f"Missing dependency: {dep}")
            # Check file exists
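The index-then-load pattern introduced in `AgentRegistry` can be sketched without the rest of the package. Everything below is a simplified assumption: `LazyRegistry` and `read_name` are hypothetical names, the frontmatter is parsed with a plain `name:` scan instead of `yaml.safe_load`, and a "loaded agent" is just the file text.

```python
import re
import tempfile
from pathlib import Path
from typing import Dict, List, Optional

FRONTMATTER = re.compile(r"^---\n(.*?)\n---\n", re.DOTALL)


def read_name(path: Path) -> str:
    """Extract the `name` field from a file's frontmatter block."""
    match = FRONTMATTER.match(path.read_text(encoding="utf-8"))
    if not match:
        raise ValueError(f"No frontmatter found in {path}")
    for line in match.group(1).splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "name":
            return value.strip()
    raise ValueError(f"No name in frontmatter of {path}")


class LazyRegistry:
    """Index files up front; load a definition only on first access."""

    def __init__(self, agents_dir: Path) -> None:
        self._index: Dict[str, Path] = {}
        self._cache: Dict[str, str] = {}
        self.loads = 0  # counts full loads, to make the laziness visible
        for path in sorted(agents_dir.glob("agent-*.md")):
            try:
                self._index[read_name(path)] = path
            except ValueError:
                continue  # unindexable files are skipped

    def names(self) -> List[str]:
        return sorted(self._index)

    def get(self, name: str) -> Optional[str]:
        if name in self._cache:
            return self._cache[name]
        path = self._index.get(name)
        if path is None:
            return None
        self.loads += 1
        self._cache[name] = path.read_text(encoding="utf-8")
        return self._cache[name]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "agent-tdd-workflow.md").write_text(
        "---\nname: tdd-workflow\n---\nFollow TDD.\n", encoding="utf-8"
    )
    (root / "agent-sys-medic.md").write_text(
        "---\nname: sys-medic\n---\nDiagnose.\n", encoding="utf-8"
    )
    (root / "agent-broken.md").write_text("no frontmatter\n", encoding="utf-8")
    registry = LazyRegistry(root)
    names = registry.names()
    loads_after_index = registry.loads
    registry.get("tdd-workflow")
    registry.get("tdd-workflow")  # served from the cache
    loads_after_get = registry.loads
    missing = registry.get("nope")
```

Listing names costs no full loads, and repeated `get` calls for the same agent load the file once. Operations that need every definition, such as listing by category, still load everything on first use.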
--- a/tests/test_cli_error_handling.py
+++ b/tests/test_cli_error_handling.py
@@ -20,9 +20,12 @@ class TestClickWorkaround:
    def test_install_command_error_suppression(self):
        """Test that spurious 'unexpected extra argument' errors are suppressed for install commands."""
        # Test the install command that previously showed spurious errors
-        with patch('sys.argv', ['kaizen-agentic', 'install', 'tdd-workflow', '--target', '/tmp/test']):
+        with patch(
-            with patch('sys.stdout', new_callable=StringIO) as mock_stdout:
+            "sys.argv",
-                with patch('sys.stderr', new_callable=StringIO) as mock_stderr:
+            ["kaizen-agentic", "install", "tdd-workflow", "--target", "/tmp/test"],
        ):
            with patch("sys.stdout", new_callable=StringIO) as mock_stdout:
                with patch("sys.stderr", new_callable=StringIO) as mock_stderr:
                    try:
                        safe_cli_wrapper()
                    except SystemExit:
@@ -40,9 +43,9 @@ class TestClickWorkaround:
    def test_update_command_error_suppression(self):
        """Test that spurious 'unexpected extra argument' errors are suppressed for update commands."""
        # Test the update command that also shows spurious errors
-        with patch('sys.argv', ['kaizen-agentic', 'update']):
+        with patch("sys.argv", ["kaizen-agentic", "update"]):
-            with patch('sys.stdout', new_callable=StringIO) as mock_stdout:
+            with patch("sys.stdout", new_callable=StringIO) as mock_stdout:
-                with patch('sys.stderr', new_callable=StringIO) as mock_stderr:
+                with patch("sys.stderr", new_callable=StringIO) as mock_stderr:
                    try:
                        safe_cli_wrapper()
                    except SystemExit:
@@ -59,9 +62,9 @@ class TestClickWorkaround:
    def test_non_install_command_normal_operation(self):
        """Test that non-install commands work normally without interference."""
-        with patch('sys.argv', ['kaizen-agentic', 'list']):
+        with patch("sys.argv", ["kaizen-agentic", "list"]):
-            with patch('sys.stdout', new_callable=StringIO) as mock_stdout:
+            with patch("sys.stdout", new_callable=StringIO) as mock_stdout:
-                with patch('sys.stderr', new_callable=StringIO) as mock_stderr:
+                with patch("sys.stderr", new_callable=StringIO) as mock_stderr:
                    try:
                        safe_cli_wrapper()
                    except SystemExit:
@@ -76,9 +79,9 @@ class TestClickWorkaround:
    def test_legitimate_error_preservation(self):
        """Test that legitimate errors are still displayed for non-install commands."""
-        with patch('sys.argv', ['kaizen-agentic', 'invalid-command']):
+        with patch("sys.argv", ["kaizen-agentic", "invalid-command"]):
-            with patch('sys.stdout', new_callable=StringIO) as mock_stdout:
+            with patch("sys.stdout", new_callable=StringIO) as mock_stdout:
-                with patch('sys.stderr', new_callable=StringIO) as mock_stderr:
+                with patch("sys.stderr", new_callable=StringIO) as mock_stderr:
                    try:
                        safe_cli_wrapper()
                    except SystemExit as e:
@@ -95,8 +98,8 @@ class TestClickWorkaround:
    def test_help_commands_work_normally(self):
        """Test that help commands work without interference."""
-        with patch('sys.argv', ['kaizen-agentic', '--help']):
+        with patch("sys.argv", ["kaizen-agentic", "--help"]):
-            with patch('sys.stdout', new_callable=StringIO) as mock_stdout:
+            with patch("sys.stdout", new_callable=StringIO) as mock_stdout:
                try:
                    safe_cli_wrapper()
                except SystemExit as e:
@@ -104,7 +107,9 @@ class TestClickWorkaround:
                    assert e.code == 0
                stdout_content = mock_stdout.getvalue()
-                assert "Kaizen Agentic - AI agent development framework" in stdout_content
+                assert (
                    "Kaizen Agentic - AI agent development framework" in stdout_content
                )
                assert "Commands:" in stdout_content
@@ -113,9 +118,9 @@ class TestInstallCommandSpecifics:
    def test_install_with_valid_agent(self):
        """Test install command with a valid agent name."""
-        with patch('sys.argv', ['kaizen-agentic', 'install', 'tdd-workflow']):
+        with patch("sys.argv", ["kaizen-agentic", "install", "tdd-workflow"]):
-            with patch('sys.stdout', new_callable=StringIO) as mock_stdout:
+            with patch("sys.stdout", new_callable=StringIO) as mock_stdout:
-                with patch('sys.stderr', new_callable=StringIO) as mock_stderr:
+                with patch("sys.stderr", new_callable=StringIO) as mock_stderr:
                    try:
                        safe_cli_wrapper()
                    except SystemExit:
@@ -127,12 +132,17 @@ class TestInstallCommandSpecifics:
                    # Should show clean installation output
                    assert "Installing agents to:" in stdout_content
                    # Should not show Click error
-                    assert "Got unexpected extra argument" not in (stdout_content + stderr_content)
+                    assert "Got unexpected extra argument" not in (
                        stdout_content + stderr_content
                    )
    def test_install_with_target_option(self):
        """Test install command with target directory option."""
-        with patch('sys.argv', ['kaizen-agentic', 'install', 'tdd-workflow', '--target', '/tmp/test']):
+        with patch(
-            with patch('sys.stdout', new_callable=StringIO) as mock_stdout:
+            "sys.argv",
            ["kaizen-agentic", "install", "tdd-workflow", "--target", "/tmp/test"],
        ):
            with patch("sys.stdout", new_callable=StringIO) as mock_stdout:
                try:
                    safe_cli_wrapper()
                except SystemExit:
@@ -144,8 +154,8 @@ class TestInstallCommandSpecifics:
    def test_install_help_works(self):
        """Test that install command help works correctly."""
-        with patch('sys.argv', ['kaizen-agentic', 'install', '--help']):
+        with patch("sys.argv", ["kaizen-agentic", "install", "--help"]):
-            with patch('sys.stdout', new_callable=StringIO) as mock_stdout:
+            with patch("sys.stdout", new_callable=StringIO) as mock_stdout:
                try:
                    safe_cli_wrapper()
                except SystemExit as e:
@@ -170,12 +180,14 @@ class TestWorkaroundRemovalReadiness:
        may be ready for removal.
        """
        # Skip this test in normal runs since it's expected to show the spurious error
-        pytest.skip("This test demonstrates the underlying Click issue. "
+        pytest.skip(
-                   "Enable when testing Click library updates.")
+            "This test demonstrates the underlying Click issue. "
            "Enable when testing Click library updates."
        )
-        with patch('sys.argv', ['kaizen-agentic', 'install', 'tdd-workflow']):
+        with patch("sys.argv", ["kaizen-agentic", "install", "tdd-workflow"]):
-            with patch('sys.stdout', new_callable=StringIO) as mock_stdout:
+            with patch("sys.stdout", new_callable=StringIO) as mock_stdout:
-                with patch('sys.stderr', new_callable=StringIO) as mock_stderr:
+                with patch("sys.stderr", new_callable=StringIO) as mock_stderr:
                    try:
                        cli(standalone_mode=False)
                    except SystemExit:
@@ -201,9 +213,13 @@ class TestWorkaroundRemovalReadiness:
        """
        # Test that the CLI works when invoked as a subprocess
        result = subprocess.run(
-            ['python', '-c', 'from kaizen_agentic.cli import safe_cli_wrapper; import sys; sys.argv = ["kaizen-agentic", "list"]; safe_cli_wrapper()'],
+            [
                "python",
                "-c",
                'from kaizen_agentic.cli import safe_cli_wrapper; import sys; sys.argv = ["kaizen-agentic", "list"]; safe_cli_wrapper()',
            ],
            capture_output=True,
-            text=True
+            text=True,
        )
        assert "Available Agents" in result.stdout
@@ -220,7 +236,7 @@ class TestErrorMessagePatterns:
        spurious_patterns = [
            "Got unexpected extra argument (tdd-workflow)",
            "Got unexpected extra argument (some-agent)",
-            "Error: Got unexpected extra argument"
+            "Error: Got unexpected extra argument",
        ]
        for pattern in spurious_patterns:
@@ -234,7 +250,7 @@ class TestErrorMessagePatterns:
            "Error: No such file or directory",
            "Error: Permission denied",
            "Error: Invalid agent name",
-            "Error: Configuration file not found"
+            "Error: Configuration file not found",
        ]
        for pattern in legitimate_patterns:
@@ -243,4 +259,4 @@ class TestErrorMessagePatterns:
 if __name__ == "__main__":
-    pytest.main([__file__])
+    pytest.main([__file__])
--- a/tests/test_e2e_agency_framework.py
+++ b/tests/test_e2e_agency_framework.py
@@ -8,8 +8,10 @@ Tests the full workflow:
  4. memory brief — verify orientation brief includes own memory and cross-agent context
  5. protocols list / show — verify protocol discovery works
  6. memory clear — verify wipe works
  7. tdd-workflow pilot — record → show → optimize → brief (WP-0003 Part 5)
 """
 import json
 import textwrap
 from pathlib import Path
@@ -17,12 +19,14 @@ import pytest
 from click.testing import CliRunner
 from kaizen_agentic.cli import cli
-
+from kaizen_agentic.metrics import MetricsStore, OptimizerStore
 from kaizen_agentic.optimization import MIN_SAMPLES_FOR_RECOMMENDATIONS
 # ---------------------------------------------------------------------------
 # Helpers
 # ---------------------------------------------------------------------------
 def _sys_medic_memory() -> str:
    """Realistic sys-medic memory after two simulated sessions."""
    return textwrap.dedent("""\
@@ -67,6 +71,34 @@ def _sys_medic_memory() -> str:
    """)
 def _tdd_workflow_memory() -> str:
    """Realistic tdd-workflow memory after two issue cycles."""
    return textwrap.dedent("""\
        ---
        agent: tdd-workflow
        project: demo-app
        last_updated: 2026-06-16
        session_count: 2
        ---
        ## Project Context
        Python service using TDD8 with Gitea issues and pytest.
        ## Accumulated Findings
        - Sidequests from REFINE often block PUBLISH when lint debt accumulates
        ## What Worked
        - `make tdd-start NUM=X` before writing tests keeps RED phase focused
        ## Watch Points
        - Flaky integration tests under parallel pytest (-n auto)
        ## Session Log
        2026-06-10 · issue 12 metrics store · PUBLISH complete · success
        2026-06-16 · issue 15 CLI flags · stalled at REFINE · partial
    """)
 def _project_management_memory() -> str:
    """Minimal project-management agent memory."""
    return textwrap.dedent("""\
@@ -92,6 +124,7 @@ def _project_management_memory() -> str:
 # Fixtures
 # ---------------------------------------------------------------------------
@pytest.fixture
 def project(tmp_path):
    """A temporary 'project' directory with a name."""
@@ -104,10 +137,13 @@ def project(tmp_path):
 # Tests
 # ---------------------------------------------------------------------------
 class TestMemoryInit:
    def test_init_creates_file(self, project):
        runner = CliRunner()
-        result = runner.invoke(cli, ["memory", "init", "sys-medic", "--target", str(project)])
+        result = runner.invoke(
            cli, ["memory", "init", "sys-medic", "--target", str(project)]
        )
        assert result.exit_code == 0, result.output
        assert "Initialized memory" in result.output
@@ -134,7 +170,9 @@ class TestMemoryInit:
    def test_init_idempotent(self, project):
        runner = CliRunner()
        runner.invoke(cli, ["memory", "init", "sys-medic", "--target", str(project)])
-        result = runner.invoke(cli, ["memory", "init", "sys-medic", "--target", str(project)])
+        result = runner.invoke(
            cli, ["memory", "init", "sys-medic", "--target", str(project)]
        )
        assert result.exit_code == 0
        assert "already exists" in result.output
@@ -146,14 +184,18 @@ class TestMemoryShow:
        memory_file.write_text(_sys_medic_memory())
        runner = CliRunner()
-        result = runner.invoke(cli, ["memory", "show", "sys-medic", "--target", str(project)])
+        result = runner.invoke(
            cli, ["memory", "show", "sys-medic", "--target", str(project)]
        )
        assert result.exit_code == 0
        assert "Node Profiles" in result.output
        assert "tegpi-01" in result.output
    def test_show_missing_prints_guidance(self, project):
        runner = CliRunner()
-        result = runner.invoke(cli, ["memory", "show", "sys-medic", "--target", str(project)])
+        result = runner.invoke(
            cli, ["memory", "show", "sys-medic", "--target", str(project)]
        )
        assert result.exit_code == 0
        assert "No memory found" in result.output
        assert "memory init" in result.output
@@ -173,7 +215,9 @@ class TestMemoryBrief:
    def test_brief_includes_own_memory(self, project):
        self._populate(project)
        runner = CliRunner()
-        result = runner.invoke(cli, ["memory", "brief", "sys-medic", "--target", str(project)])
+        result = runner.invoke(
            cli, ["memory", "brief", "sys-medic", "--target", str(project)]
        )
        assert result.exit_code == 0
        assert "Orientation Brief for: sys-medic" in result.output
        assert "Your Memory" in result.output
@@ -182,7 +226,9 @@ class TestMemoryBrief:
    def test_brief_includes_cross_agent_context(self, project):
        self._populate(project)
        runner = CliRunner()
-        result = runner.invoke(cli, ["memory", "brief", "sys-medic", "--target", str(project)])
+        result = runner.invoke(
            cli, ["memory", "brief", "sys-medic", "--target", str(project)]
        )
        assert result.exit_code == 0
        assert "Context From Other Agents" in result.output
        assert "project-management" in result.output
@@ -190,25 +236,76 @@ class TestMemoryBrief:
    def test_brief_coach_tip_present(self, project):
        self._populate(project)
        runner = CliRunner()
-        result = runner.invoke(cli, ["memory", "brief", "sys-medic", "--target", str(project)])
+        result = runner.invoke(
            cli, ["memory", "brief", "sys-medic", "--target", str(project)]
        )
        assert result.exit_code == 0
        assert "agent-coach" in result.output
    def test_brief_no_memory_gives_guidance(self, project):
        runner = CliRunner()
-        result = runner.invoke(cli, ["memory", "brief", "sys-medic", "--target", str(project)])
+        result = runner.invoke(
            cli, ["memory", "brief", "sys-medic", "--target", str(project)]
        )
        assert result.exit_code == 0
        assert "No agent memory files found" in result.output
    def test_brief_raw_flag_skips_header(self, project):
        self._populate(project)
        runner = CliRunner()
-        result = runner.invoke(cli, ["memory", "brief", "sys-medic", "--target", str(project), "--raw"])
+        result = runner.invoke(
            cli, ["memory", "brief", "sys-medic", "--target", str(project), "--raw"]
        )
        assert result.exit_code == 0
        assert "=== sys-medic ===" in result.output
        # Raw mode should not include the orientation header
        assert "Orientation Brief for:" not in result.output
    def test_brief_includes_performance_summary_with_memory_and_metrics(self, project):
        self._populate(project)
        runner = CliRunner()
        runner.invoke(
            cli,
            [
                "metrics",
                "record",
                "sys-medic",
                "--target",
                str(project),
                "--success",
                "--time",
                "30",
                "--quality",
                "0.88",
            ],
        )
        runner.invoke(
            cli,
            [
                "metrics",
                "record",
                "project-management",
                "--target",
                str(project),
                "--success",
                "--time",
                "15",
                "--quality",
                "0.95",
            ],
        )
        result = runner.invoke(
            cli, ["memory", "brief", "sys-medic", "--target", str(project)]
        )
        assert result.exit_code == 0
        assert "## Performance Summary" in result.output
        assert "Success rate:" in result.output
        assert "tegpi-01" in result.output
        assert "Context From Other Agents" in result.output
        assert "project-management" in result.output
 class TestMemoryClear:
    def test_clear_removes_file(self, project):
@@ -232,6 +329,107 @@ class TestMemoryClear:
        assert "nothing to clear" in result.output
 class TestTddWorkflowMetricsPilot:
    """Full measure → analyse → orient loop for the tdd-workflow pilot agent."""
    def _populate_memory(self, project: Path) -> None:
        memory_dir = project / ".kaizen" / "agents" / "tdd-workflow"
        memory_dir.mkdir(parents=True, exist_ok=True)
        (memory_dir / "memory.md").write_text(_tdd_workflow_memory())
    def test_full_metrics_loop_record_show_optimize_brief(self, project):
        runner = CliRunner()
        self._populate_memory(project)
        sessions = [
            {
                "success": True,
                "execution_time_s": 4200.0,
                "quality_score": 0.92,
                "primary_metric": {
                    "name": "test_pass_rate",
                    "value": 1.0,
                    "target": 1.0,
                },
                "metadata": {"issue": "12", "phase": "PUBLISH"},
            },
            {
                "success": False,
                "execution_time_s": 5400.0,
                "quality_score": 0.45,
                "primary_metric": {
                    "name": "test_pass_rate",
                    "value": 0.78,
                    "target": 1.0,
                },
                "metadata": {"issue": "15", "phase": "REFINE"},
            },
        ]
        for index, payload in enumerate(sessions, start=1):
            result = runner.invoke(
                cli,
                [
                    "metrics",
                    "record",
                    "tdd-workflow",
                    "--target",
                    str(project),
                    "--json",
                    "--idempotency-key",
                    f"session-{index}",
                ],
                input=json.dumps(payload),
            )
            assert result.exit_code == 0, result.output
            assert "Recorded metrics" in result.output
        show_result = runner.invoke(
            cli,
            ["metrics", "show", "tdd-workflow", "--target", str(project)],
        )
        assert show_result.exit_code == 0
        assert (
            "test_pass_rate" in show_result.output
            or "2 execution" in show_result.output.lower()
        )
        store = MetricsStore(project, "tdd-workflow")
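        # Top up with failing runs until the store holds
        # MIN_SAMPLES_FOR_RECOMMENDATIONS executions in total.
        for i in range(MIN_SAMPLES_FOR_RECOMMENDATIONS - len(sessions)):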
            store.append(
                {
                    "success": False,
                    "execution_time_s": 90.0 + i,
                    "quality_score": 0.35,
                    "primary_metric": {
                        "name": "test_pass_rate",
                        "value": 0.6,
                        "target": 1.0,
                    },
                },
                idempotency_key=f"seed-{i}",
            )
        optimize_result = runner.invoke(
            cli,
            ["metrics", "optimize", "tdd-workflow", "--target", str(project)],
        )
        assert optimize_result.exit_code == 0, optimize_result.output
        optimizer = OptimizerStore(project)
        assert optimizer.analysis_path.exists()
        assert optimizer.recommendations_path.exists()
        brief_result = runner.invoke(
            cli,
            ["memory", "brief", "tdd-workflow", "--target", str(project)],
        )
        assert brief_result.exit_code == 0
        assert "## Performance Summary" in brief_result.output
        assert "Success rate:" in brief_result.output
        assert "issue 12" in brief_result.output or "TDD8" in brief_result.output
        assert "Your Memory" in brief_result.output
 class TestProtocolsCommand:
    def test_protocols_list_finds_sys_medic(self):
        """Protocols list against the real agents dir should include sys-medic k3s protocol."""
@@ -249,7 +447,9 @@ class TestProtocolsCommand:
    def test_protocols_show_outputs_content(self):
        runner = CliRunner()
-        result = runner.invoke(cli, ["protocols", "show", "sys-medic", "k3s-node-health-assessment"])
+        result = runner.invoke(
            cli, ["protocols", "show", "sys-medic", "k3s-node-health-assessment"]
        )
        assert result.exit_code == 0
        # Protocol should contain key structural sections
        assert "k3s" in result.output.lower()
--- a/tests/test_feedback_cli.py
+++ b/tests/test_feedback_cli.py
@@ -0,0 +1,27 @@
 """Tests for developer feedback CLI (WP-0001 T01)."""
 from __future__ import annotations
 import json
 from click.testing import CliRunner
 from kaizen_agentic.cli import cli
 def test_feedback_human_output():
    runner = CliRunner()
    result = runner.invoke(cli, ["feedback"])
    assert result.exit_code == 0
    assert "feedback channels" in result.output.lower()
    assert "gitea.coulomb.social" in result.output
    assert "bug report" in result.output.lower()
 def test_feedback_json_output():
    runner = CliRunner()
    result = runner.invoke(cli, ["feedback", "--json"])
    assert result.exit_code == 0
    payload = json.loads(result.output)
    assert "channels" in payload
    assert "bug_report" in payload["templates"]
--- a/tests/test_helix_correlation.py
+++ b/tests/test_helix_correlation.py
@@ -0,0 +1,160 @@
 """Tests for Helix Forge correlation (WP-0004 Part 1)."""
 from __future__ import annotations
 import json
 import sqlite3
 from pathlib import Path
 import pytest
 from click.testing import CliRunner
 from kaizen_agentic.cli import cli
 from kaizen_agentic.integrations.helix import (
    HelixCorrelationAdapter,
    enrich_helix_correlation,
 )
 def test_enrich_helix_correlation_from_env(monkeypatch: pytest.MonkeyPatch):
    monkeypatch.setenv("HELIX_SESSION_UID", "claude:test-uid")
    monkeypatch.setenv("HELIX_REPO", "kaizen-agentic")
    monkeypatch.setenv("HELIX_FLAVOR", "claude")
    monkeypatch.setenv("HELIX_TOKENS", "9900")
    monkeypatch.setenv("HELIX_INFRA_OVERHEAD_SHARE", "0.15")
    result = enrich_helix_correlation({"success": True})
    assert result["helix_session_uid"] == "claude:test-uid"
    assert result["repo"] == "kaizen-agentic"
    assert result["flavor"] == "claude"
    assert result["tokens"] == 9900
    assert result["infra_overhead_share"] == 0.15
 def test_enrich_does_not_override_existing_fields():
    record = {
        "success": True,
        "helix_session_uid": "grok:existing",
        "repo": "other-repo",
    }
    result = enrich_helix_correlation(record)
    assert result["helix_session_uid"] == "grok:existing"
    assert result["repo"] == "other-repo"
 def test_adapter_stub_when_store_unconfigured():
    adapter = HelixCorrelationAdapter(store_db=None)
    summary = adapter.lookup("claude:missing")
    assert summary["adapter"] == "stub"
    assert summary["status"] == "not_configured"
 def test_adapter_sqlite_lookup(tmp_path: Path):
    db_path = tmp_path / "store.db"
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE digests (session_uid TEXT PRIMARY KEY, json TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE sessions (session_uid TEXT PRIMARY KEY, json TEXT NOT NULL)"
    )
    digest = {
        "outcome": "success",
        "cost": {"input_tokens": 800, "output_tokens": 200, "wall_clock_s": 3600},
        "tool_histogram": {"mcp__state-hub__x": 3, "Bash": 7},
        "markers": {"errors": 0, "retries": 1},
    }
    session = {"repo": "demo-app", "flavor": "claude"}
    conn.execute(
        "INSERT INTO digests VALUES (?, ?)",
        ("claude:abc", json.dumps(digest)),
    )
    conn.execute(
        "INSERT INTO sessions VALUES (?, ?)",
        ("claude:abc", json.dumps(session)),
    )
    conn.commit()
    conn.close()
    adapter = HelixCorrelationAdapter(store_db=db_path)
    summary = adapter.lookup("claude:abc")
    assert summary["adapter"] == "helix-sqlite"
    assert summary["repo"] == "demo-app"
    assert summary["flavor"] == "claude"
    assert summary["fleet_outcome"] == "success"
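    # 800 input tokens + 200 output tokens from the digest's cost block.
    assert summary["tokens"] == 1000
    assert summary["wall_clock_s"] == 3600
    # Presumably the 3 mcp__* calls out of the 10 tool calls in the digest.
    assert summary["infra_overhead_share"] == 0.3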
 class TestHelixCorrelationCli:
    def test_record_populates_helix_uid_from_env(
        self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch
    ):
        monkeypatch.setenv("HELIX_SESSION_UID", "claude:session-42")
        monkeypatch.setenv("HELIX_REPO", "kaizen-agentic")
        runner = CliRunner()
        result = runner.invoke(
            cli,
            [
                "metrics",
                "record",
                "tdd-workflow",
                "--target",
                str(tmp_path),
                "--success",
                "--time",
                "10",
            ],
        )
        assert result.exit_code == 0
        show = runner.invoke(
            cli,
            ["metrics", "show", "tdd-workflow", "--target", str(tmp_path)],
        )
        assert "claude:session-42" in show.output
        assert "kaizen-agentic" in show.output
    def test_correlate_stub_output(self):
        runner = CliRunner()
        result = runner.invoke(cli, ["metrics", "correlate", "claude:stub-uid"])
        assert result.exit_code == 0
        payload = json.loads(result.output)
        assert payload["helix_session_uid"] == "claude:stub-uid"
        assert payload["adapter"] == "stub"
    def test_brief_works_with_correlated_metrics(
        self, tmp_path: Path, monkeypatch: pytest.MonkeyPatch
    ):
        memory_dir = tmp_path / ".kaizen" / "agents" / "tdd-workflow"
        memory_dir.mkdir(parents=True)
        (memory_dir / "memory.md").write_text(
            "---\nagent: tdd-workflow\nproject: demo\nsession_count: 1\n---\n\n## Session Log\n",
            encoding="utf-8",
        )
        monkeypatch.setenv("HELIX_SESSION_UID", "claude:brief-test")
        runner = CliRunner()
        runner.invoke(
            cli,
            [
                "metrics",
                "record",
                "tdd-workflow",
                "--target",
                str(tmp_path),
                "--success",
                "--quality",
                "0.9",
            ],
        )
        brief = runner.invoke(
            cli,
            ["memory", "brief", "tdd-workflow", "--target", str(tmp_path)],
        )
        assert brief.exit_code == 0
        assert "## Performance Summary" in brief.output
--- a/tests/test_integration_patterns.py
+++ b/tests/test_integration_patterns.py
@@ -0,0 +1,32 @@
 """Smoke tests for WP-0004 integration artifacts."""
 from __future__ import annotations
 from pathlib import Path
 import yaml
 DEFINITIONS_DIR = (
    Path(__file__).parent.parent / "docs" / "integrations" / "activity-definitions"
 )
 def test_activity_definitions_have_required_frontmatter():
    files = list(DEFINITIONS_DIR.glob("*.md"))
    assert len(files) == 3
    for path in files:
        text = path.read_text(encoding="utf-8")
        assert text.startswith("---\n")
        end = text.index("\n---\n", 4)
        frontmatter = yaml.safe_load(text[4:end])
        assert frontmatter["id"]
        assert frontmatter["trigger"]["type"] in ("cron", "event")
        assert frontmatter["owner"] == "kaizen-agentic"
 def test_integration_docs_exist():
    root = Path(__file__).parent.parent / "docs"
    assert (root / "INTEGRATION_PATTERNS.md").exists()
    assert (root / "integrations" / "helix-forge-correlation.md").exists()
    assert (root / "integrations" / "optimizer-artifact-manifest.md").exists()
--- a/tests/test_metrics.py
+++ b/tests/test_metrics.py
@@ -0,0 +1,107 @@
 """Tests for project-scoped metrics storage (ADR-004)."""
 from __future__ import annotations
 import json
 from datetime import datetime, timedelta, timezone
 from pathlib import Path
 import pytest
 from kaizen_agentic.metrics import MetricsStore, DEFAULT_RETENTION_DAYS
 def _old_timestamp(days: int) -> str:
    dt = datetime.now(timezone.utc) - timedelta(days=days)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")
@pytest.fixture
 def project_dir(tmp_path: Path) -> Path:
    root = tmp_path / "demo-project"
    root.mkdir()
    return root
 class TestMetricsStore:
    def test_scaffold_creates_directory_and_empty_executions(self, project_dir: Path):
        store = MetricsStore(project_dir, "tdd-workflow")
        path = store.scaffold()
        assert path == project_dir / ".kaizen" / "metrics" / "tdd-workflow"
        assert store.executions_path.exists()
        assert store.executions_path.read_text() == ""
    def test_append_and_read_executions(self, project_dir: Path):
        store = MetricsStore(project_dir, "tdd-workflow")
        assert store.append({"success": True, "quality_score": 0.9}) is True
        assert store.append({"success": False, "execution_time_s": 12.5}) is True
        records = store.read_executions()
        assert len(records) == 2
        assert records[0]["agent"] == "tdd-workflow"
        assert records[0]["success"] is True
        assert "timestamp" in records[0]
    def test_idempotency_key_rejects_duplicate(self, project_dir: Path):
        store = MetricsStore(project_dir, "coach")
        assert store.append({"success": True}, idempotency_key="sess-1") is True
        assert store.append({"success": True}, idempotency_key="sess-1") is False
        assert len(store.read_executions()) == 1
    def test_write_summary_regenerates_summary_json(self, project_dir: Path):
        store = MetricsStore(project_dir, "tdd-workflow")
        store.append({"success": True, "quality_score": 0.8, "execution_time_s": 10})
        store.append({"success": True, "quality_score": 1.0, "execution_time_s": 20})
        summary = store.write_summary()
        assert summary["execution_count"] == 2
        assert summary["success_rate"] == 1.0
        assert summary["avg_quality_score"] == 0.9
        assert summary["avg_execution_time_s"] == 15.0
        assert store.summary_path.exists()
        on_disk = json.loads(store.summary_path.read_text())
        assert on_disk["execution_count"] == 2
    def test_prune_removes_expired_records(self, project_dir: Path):
        store = MetricsStore(project_dir, "tdd-workflow", retention_days=30)
        store.scaffold()
        old = {
            "timestamp": _old_timestamp(45),
            "agent": "tdd-workflow",
            "success": False,
        }
        recent = {
            "timestamp": _old_timestamp(1),
            "agent": "tdd-workflow",
            "success": True,
            "quality_score": 0.7,
        }
        with store.executions_path.open("w", encoding="utf-8") as handle:
            handle.write(json.dumps(old) + "\n")
            handle.write(json.dumps(recent) + "\n")
        removed = store.prune()
        assert removed == 1
        records = store.read_executions()
        assert len(records) == 1
        assert records[0]["success"] is True
        summary = store.read_summary()
        assert summary is not None
        assert summary["execution_count"] == 1
    def test_list_agents_with_metrics(self, project_dir: Path):
        MetricsStore(project_dir, "tdd-workflow").scaffold()
        MetricsStore(project_dir, "coach").append({"success": True})
        agents = MetricsStore.list_agents(project_dir)
        assert agents == ["coach", "tdd-workflow"]
    def test_default_retention_matches_adr(self):
        assert DEFAULT_RETENTION_DAYS == 180
--- a/tests/test_metrics_cli.py
+++ b/tests/test_metrics_cli.py
@@ -0,0 +1,157 @@
 """CLI tests for project-scoped metrics commands."""
 from __future__ import annotations
 import json
 from pathlib import Path
 import pytest
 from click.testing import CliRunner
 from kaizen_agentic.cli import cli
@pytest.fixture
 def runner() -> CliRunner:
    return CliRunner()
@pytest.fixture
 def project_dir(tmp_path: Path) -> Path:
    root = tmp_path / "demo-project"
    root.mkdir()
    return root
 class TestMetricsCli:
    def test_record_show_list_export_flow(self, runner: CliRunner, project_dir: Path):
        target = str(project_dir)
        record = runner.invoke(
            cli,
            [
                "metrics",
                "record",
                "tdd-workflow",
                "--target",
                target,
                "--success",
                "--time",
                "42",
                "--quality",
                "0.85",
            ],
        )
        assert record.exit_code == 0
        assert "Recorded metrics" in record.output
        show = runner.invoke(
            cli, ["metrics", "show", "tdd-workflow", "--target", target]
        )
        assert show.exit_code == 0
        assert '"execution_count": 1' in show.output
        assert '"success": true' in show.output
        listed = runner.invoke(cli, ["metrics", "list", "--target", target])
        assert listed.exit_code == 0
        assert "tdd-workflow" in listed.output
        export = runner.invoke(
            cli, ["metrics", "export", "tdd-workflow", "--target", target]
        )
        assert export.exit_code == 0
        lines = [line for line in export.output.splitlines() if line.strip()]
        assert len(lines) == 1
        assert json.loads(lines[0])["quality_score"] == 0.85
    def test_record_json_from_stdin(self, runner: CliRunner, project_dir: Path):
        payload = json.dumps({"success": False, "execution_time_s": 9.5})
        result = runner.invoke(
            cli,
            ["metrics", "record", "coach", "--target", str(project_dir), "--json"],
            input=payload,
        )
        assert result.exit_code == 0
        show = runner.invoke(
            cli, ["metrics", "show", "coach", "--target", str(project_dir)]
        )
        assert '"success": false' in show.output
    def test_record_idempotency_key_skips_duplicate(
        self, runner: CliRunner, project_dir: Path
    ):
        args = [
            "metrics",
            "record",
            "coach",
            "--target",
            str(project_dir),
            "--success",
            "--idempotency-key",
            "sess-abc",
        ]
        first = runner.invoke(cli, args)
        second = runner.invoke(cli, args)
        assert first.exit_code == 0
        assert second.exit_code == 0
        assert "Skipped duplicate" in second.output
        export = runner.invoke(
            cli, ["metrics", "export", "coach", "--target", str(project_dir)]
        )
        assert len(export.output.strip().splitlines()) == 1
    def test_record_requires_outcome_without_json(
        self, runner: CliRunner, project_dir: Path
    ):
        result = runner.invoke(
            cli,
            ["metrics", "record", "tdd-workflow", "--target", str(project_dir)],
        )
        assert result.exit_code != 0
        assert "--success or --failure" in result.output
    def test_memory_init_scaffolds_metrics(self, runner: CliRunner, project_dir: Path):
        result = runner.invoke(
            cli,
            ["memory", "init", "tdd-workflow", "--target", str(project_dir)],
        )
        assert result.exit_code == 0
        metrics_dir = project_dir / ".kaizen" / "metrics" / "tdd-workflow"
        assert metrics_dir.exists()
        assert (metrics_dir / "executions.jsonl").exists()
    def test_memory_brief_includes_performance_summary(
        self, runner: CliRunner, project_dir: Path
    ):
        target = str(project_dir)
        runner.invoke(cli, ["memory", "init", "tdd-workflow", "--target", target])
        runner.invoke(
            cli,
            [
                "metrics",
                "record",
                "tdd-workflow",
                "--target",
                target,
                "--success",
                "--quality",
                "0.9",
            ],
        )
        result = runner.invoke(
            cli, ["memory", "brief", "tdd-workflow", "--target", target]
        )
        assert result.exit_code == 0
        assert "## Performance Summary" in result.output
        assert "Success rate: 100.0%" in result.output
    def test_memory_init_no_metrics_flag(self, runner: CliRunner, project_dir: Path):
        result = runner.invoke(
            cli,
            ["memory", "init", "coach", "--target", str(project_dir), "--no-metrics"],
        )
        assert result.exit_code == 0
        assert not (project_dir / ".kaizen" / "metrics" / "coach").exists()
--- a/tests/test_metrics_publish.py
+++ b/tests/test_metrics_publish.py
@@ -0,0 +1,142 @@
 """Tests for artifact-store publish integration (WP-0004 Part 3)."""
 from __future__ import annotations
 from pathlib import Path
 from unittest.mock import patch
 import pytest
 from click.testing import CliRunner
 from kaizen_agentic.cli import cli
 from kaizen_agentic.integrations.artifact_store import (
    PublishResult,
    build_optimizer_manifest,
    publish_optimizer_evidence,
 )
 from kaizen_agentic.metrics import OptimizerStore
@pytest.fixture
 def project_with_optimizer(tmp_path: Path) -> Path:
    store = OptimizerStore(tmp_path)
    store.write_analysis(
        {
            "project": "demo",
            "optimized_at": "2026-06-18",
            "agents": [{"agent": "tdd-workflow"}],
        }
    )
    store.append_recommendations(
        "tdd-workflow",
        [{"type": "reliability", "message": "Improve test stability"}],
        metrics_count=10,
    )
    return tmp_path
 def test_build_optimizer_manifest(project_with_optimizer: Path):
    manifest = build_optimizer_manifest(project_with_optimizer)
    assert manifest["schema"] == "kaizen-agentic/optimizer-evidence/v1"
    assert manifest["retention_class"] == "raw-evidence"
    assert manifest["retention_days"] == 180
    assert "tdd-workflow" in manifest["agents"]
 def test_publish_optimizer_evidence_calls_api(project_with_optimizer: Path):
    calls: list[tuple[str, str]] = []
    def fake_json(method, base_url, path, token, payload):
        calls.append((method, path))
        if path == "/packages":
            return {"id": "pkg-123"}
        if path.endswith("/finalize"):
            return {"id": "pkg-123", "manifest_digest": "blake3:deadbeef"}
        raise AssertionError(path)
    def fake_multipart(base_url, path, token, **kwargs):
        calls.append(("POST", path))
        return {"id": "file-1"}
    with patch(
        "kaizen_agentic.integrations.artifact_store._http_json",
        side_effect=fake_json,
    ), patch(
        "kaizen_agentic.integrations.artifact_store._http_multipart",
        side_effect=fake_multipart,
    ):
        result = publish_optimizer_evidence(
            project_with_optimizer,
            api_url="http://api.test",
            token="secret",
        )
    assert result.package_id == "pkg-123"
    assert result.files_uploaded == 2
    assert result.retention_class == "raw-evidence"
    assert calls[0] == ("POST", "/packages")
    assert any("/files" in path for _, path in calls)
    assert calls[-1] == ("POST", "/packages/pkg-123/finalize")
 class TestMetricsPublishCli:
    def test_publish_requires_token(self, project_with_optimizer: Path):
        runner = CliRunner()
        result = runner.invoke(
            cli,
            ["metrics", "publish", "--target", str(project_with_optimizer)],
        )
        assert result.exit_code != 0
        assert "token" in result.output.lower()
    def test_publish_success(self, project_with_optimizer: Path):
        runner = CliRunner()
        with patch(
            "kaizen_agentic.cli.publish_optimizer_evidence",
            return_value=PublishResult(
                package_id="pkg-99",
                manifest_digest="blake3:abc",
                files_uploaded=2,
                retention_class="raw-evidence",
            ),
        ):
            result = runner.invoke(
                cli,
                [
                    "metrics",
                    "publish",
                    "--target",
                    str(project_with_optimizer),
                    "--token",
                    "test-token",
                    "--api-url",
                    "http://127.0.0.1:8000",
                ],
            )
        assert result.exit_code == 0
        assert "pkg-99" in result.output
@pytest.mark.integration
 def test_publish_against_live_artifact_store(project_with_optimizer: Path):
    """Optional live test — skipped when artifact-store is unreachable."""
    import os
    import urllib.error
    import urllib.request
    api_url = "http://127.0.0.1:8000"
    try:
        urllib.request.urlopen(f"{api_url}/health", timeout=2)
    except (urllib.error.URLError, TimeoutError):
        pytest.skip("artifact-store not reachable")
    token = os.environ.get("ARTIFACTSTORE_API_TOKEN")
    if not token:
        pytest.skip("ARTIFACTSTORE_API_TOKEN not set")
    result = publish_optimizer_evidence(
        project_with_optimizer,
        api_url=api_url,
        token=token,
    )
    assert result.package_id
    assert result.files_uploaded >= 1
--- a/tests/test_optimization_metrics.py
+++ b/tests/test_optimization_metrics.py
@@ -0,0 +1,138 @@
 """Tests for OptimizationLoop integration with MetricsStore."""
 from __future__ import annotations
 from pathlib import Path
 import pytest
 from click.testing import CliRunner
 from kaizen_agentic.cli import cli
 from kaizen_agentic.metrics import MetricsStore, OptimizerStore
 from kaizen_agentic.optimization import (
    MIN_SAMPLES_FOR_RECOMMENDATIONS,
    OptimizationLoop,
 )
 def _seed_executions(
    store: MetricsStore,
    count: int,
    *,
    success: bool = True,
    execution_time_s: float = 5.0,
    quality_score: float = 0.9,
 ) -> None:
    for i in range(count):
        store.append(
            {
                "success": success,
                "execution_time_s": execution_time_s + i,
                "quality_score": quality_score,
            },
            idempotency_key=f"run-{i}",
        )
@pytest.fixture
 def project_dir(tmp_path: Path) -> Path:
    root = tmp_path / "demo-project"
    root.mkdir()
    return root
 class TestOptimizationFromMetricsStore:
    def test_from_metrics_store_loads_execution_records(self, project_dir: Path):
        store = MetricsStore(project_dir, "tdd-workflow")
        _seed_executions(store, 3)
        loop = OptimizationLoop.from_metrics_store(store)
        assert len(loop.metrics_history) == 3
        assert loop.metrics_history[0].success_rate == 1.0
    def test_insufficient_data_recommendations(self, project_dir: Path):
        store = MetricsStore(project_dir, "tdd-workflow")
        loop = OptimizationLoop.from_metrics_store(store)
        recommendations = loop.generate_improvement_recommendations()
        assert recommendations[0]["type"] == "info"
        assert "Insufficient data" in recommendations[0]["message"]
    def test_sufficient_data_produces_performance_recommendations(
        self, project_dir: Path
    ):
        store = MetricsStore(project_dir, "tdd-workflow")
        _seed_executions(
            store,
            MIN_SAMPLES_FOR_RECOMMENDATIONS,
            success=False,
            execution_time_s=60.0,
            quality_score=0.4,
        )
        loop = OptimizationLoop.from_metrics_store(store)
        recommendations = loop.generate_improvement_recommendations()
        types = {item["type"] for item in recommendations}
        assert "info" not in types
        assert "reliability" in types or "quality" in types or "performance" in types
    def test_get_optimization_report_json_is_serializable(self, project_dir: Path):
        import json
        store = MetricsStore(project_dir, "coach")
        _seed_executions(store, 4)
        report = OptimizationLoop.from_metrics_store(
            store
        ).get_optimization_report_json()
        json.dumps(report)
 class TestMetricsOptimizeCli:
    def test_optimize_insufficient_samples_writes_analysis_only(
        self, project_dir: Path
    ):
        store = MetricsStore(project_dir, "tdd-workflow")
        _seed_executions(store, 2)
        runner = CliRunner()
        result = runner.invoke(
            cli,
            ["metrics", "optimize", "tdd-workflow", "--target", str(project_dir)],
        )
        assert result.exit_code == 0
        assert "need 10" in result.output
        optimizer = OptimizerStore(project_dir)
        assert optimizer.analysis_path.exists()
        assert not optimizer.recommendations_path.exists()
    def test_optimize_sufficient_samples_writes_recommendations(
        self, project_dir: Path
    ):
        store = MetricsStore(project_dir, "tdd-workflow")
        _seed_executions(
            store,
            MIN_SAMPLES_FOR_RECOMMENDATIONS,
            success=False,
            execution_time_s=60.0,
            quality_score=0.4,
        )
        runner = CliRunner()
        result = runner.invoke(
            cli,
            ["metrics", "optimize", "tdd-workflow", "--target", str(project_dir)],
        )
        assert result.exit_code == 0
        optimizer = OptimizerStore(project_dir)
        assert optimizer.analysis_path.exists()
        assert optimizer.recommendations_path.exists()
        assert (
            '"type": "reliability"' in result.output
            or '"type": "quality"' in result.output
        )
--- a/tests/test_path_compat.py
+++ b/tests/test_path_compat.py
@@ -0,0 +1,19 @@
 """Cross-platform path handling smoke tests (WP-0001 T07)."""
 from __future__ import annotations
 from pathlib import Path, PureWindowsPath
 from kaizen_agentic.metrics import MetricsStore
 def test_metrics_store_accepts_string_project_root(tmp_path: Path):
    store = MetricsStore(str(tmp_path), "coach")
    store.append({"success": True}, idempotency_key="win-path-test")
    assert store.executions_path.exists()
 def test_metrics_paths_use_forward_join_semantics(tmp_path: Path):
    store = MetricsStore(tmp_path, "tdd-workflow")
    suffix = PureWindowsPath(".kaizen/metrics/tdd-workflow/executions.jsonl")
    assert store.executions_path.as_posix().endswith(suffix.as_posix())
--- a/tests/test_registry.py
+++ b/tests/test_registry.py
@@ -53,9 +53,9 @@ description: Second test agent
    registry = AgentRegistry(tmp_path)
-    assert len(registry._agents) == 2
+    assert registry.agent_names() == ["agent-one", "agent-two"]
-    assert "agent-one" in registry._agents
+    assert registry.get_agent("agent-one") is not None
-    assert "agent-two" in registry._agents
+    assert registry.get_agent("agent-two") is not None
 def test_agent_registry_get_agent(tmp_path):
--- a/tests/test_registry_lazy_load.py
+++ b/tests/test_registry_lazy_load.py
@@ -0,0 +1,79 @@
 """Registry lazy-loading performance tests (WP-0001 T06)."""
 from __future__ import annotations
 from pathlib import Path
 from unittest.mock import patch
 import pytest
 from kaizen_agentic.installer import AgentInstaller, InstallationConfig
 from kaizen_agentic.registry import AgentDefinition, AgentRegistry
 def _write_agent(path: Path, name: str) -> None:
    path.write_text(
        f"""---
 name: {name}
 description: Agent {name}
 category: testing
 ---
 # {name}
 """,
        encoding="utf-8",
    )
@pytest.fixture
 def large_registry(tmp_path: Path) -> AgentRegistry:
    agents_dir = tmp_path / "agents"
    agents_dir.mkdir()
    for index in range(15):
        _write_agent(agents_dir / f"agent-agent-{index}.md", f"agent-{index}")
    _write_agent(agents_dir / "agent-tdd-workflow.md", "tdd-workflow")
    return AgentRegistry(agents_dir)
 def test_registry_indexes_without_full_parse(large_registry: AgentRegistry):
    assert len(large_registry.agent_names()) == 16
    assert large_registry._agents == {}
 def test_get_agent_loads_only_requested_agent(large_registry: AgentRegistry):
    with patch.object(
        AgentDefinition,
        "from_file",
        wraps=AgentDefinition.from_file,
    ) as mock_from_file:
        agent = large_registry.get_agent("tdd-workflow")
    assert agent is not None
    assert agent.name == "tdd-workflow"
    assert mock_from_file.call_count == 1
 def test_install_single_agent_parses_minimal_subset(
    large_registry: AgentRegistry, tmp_path: Path
 ):
    installer = AgentInstaller(large_registry)
    project_dir = tmp_path / "project"
    with patch.object(
        AgentDefinition,
        "from_file",
        wraps=AgentDefinition.from_file,
    ) as mock_from_file:
        results = installer.install_agents(
            ["tdd-workflow"],
            InstallationConfig(
                target_dir=project_dir,
                create_backup=False,
                update_docs=False,
            ),
        )
    assert results["tdd-workflow"] == "INSTALLED"
    assert (project_dir / "agents" / "agent-tdd-workflow.md").exists()
    # resolve_dependencies loads only the target agent, not the full fleet
    assert mock_from_file.call_count == 1
--- a/wiki/AbcdekGuidance.md
+++ b/wiki/AbcdekGuidance.md
--- a/wiki/AboutKaizenAgents.md
+++ b/wiki/AboutKaizenAgents.md
@@ -0,0 +1,76 @@
 # About Kaizen Agents
 Basic concepts of Kaizen Agents.
 All Kaizen Agents follow the [KaizenAgentTemplate](KaizenAgentTemplate.md) definition.
 That template provides a comprehensive structure for defining Kaizen Agent subagents.
 Key sections:
 - **Specification** — declarative outcomes rather than implementation steps
 - **Idempotency design** — detect and handle already-completed work
 - **Metrics** — measurable success criteria from day one
 - **Testing** — scenarios that feed the optimization loop
 - **Evolution tracking** — improvement history and performance trends
 The template enforces separation of concerns, testability, and measurability while
 keeping agent definitions consistent across the fleet.
 ---
 ## Metrics-enabled pilot: `tdd-workflow`
 `tdd-workflow` is the reference implementation for project-scoped metrics (WP-0003).
 Use it as a template when adding metrics to other agents.
 ### What is measured
 | Metric | Role | How |
 |--------|------|-----|
 | `test_pass_rate` | Primary | Passing tests ÷ total tests at PUBLISH (target: 1.0) |
 | `cycle_time_s` | Secondary | Session duration (`execution_time_s` in ADR-004) |
 Definitions live in the agent frontmatter (`agents/agent-tdd-workflow.md`).
 ### Where data lives
 ```
 <project>/.kaizen/metrics/tdd-workflow/
  executions.jsonl    # append-only per-session records
  summary.json        # rolling aggregates (auto-generated)
 ```
 Scaffolded by `kaizen-agentic memory init tdd-workflow` alongside
 `.kaizen/agents/tdd-workflow/memory.md`.
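As a rough illustration of how rolling aggregates could be derived from the append-only log, the sketch below reads `executions.jsonl` and writes a `summary.json` next to it. The field names `success`, `execution_time_s`, and `quality_score` follow the records used in the test suite; the `summarize` helper and the summary keys are hypothetical and are not the actual `MetricsStore` implementation.

```python
import json
import tempfile
from pathlib import Path
from statistics import mean


def summarize(executions_path: Path) -> dict:
    """Compute aggregates from an append-only JSONL log (hypothetical helper)."""
    records = [
        json.loads(line)
        for line in executions_path.read_text(encoding="utf-8").splitlines()
        if line.strip()  # tolerate blank lines in the log
    ]
    if not records:
        return {"count": 0}
    return {
        "count": len(records),
        "success_rate": mean(1.0 if r.get("success") else 0.0 for r in records),
        "avg_execution_time_s": mean(r["execution_time_s"] for r in records),
        "avg_quality_score": mean(r["quality_score"] for r in records),
    }


# Demo in a throwaway directory that mirrors the layout shown above.
with tempfile.TemporaryDirectory() as tmp:
    metrics_dir = Path(tmp) / ".kaizen" / "metrics" / "tdd-workflow"
    metrics_dir.mkdir(parents=True)
    log = metrics_dir / "executions.jsonl"
    rows = [
        {"success": True, "execution_time_s": 10.0, "quality_score": 0.9},
        {"success": False, "execution_time_s": 20.0, "quality_score": 0.5},
    ]
    log.write_text("\n".join(json.dumps(r) for r in rows) + "\n", encoding="utf-8")
    summary = summarize(log)
    (metrics_dir / "summary.json").write_text(
        json.dumps(summary, indent=2), encoding="utf-8"
    )
```

Because the log is append-only, the summary can always be regenerated from scratch, which is one reason to treat `summary.json` as derived data rather than a source of truth.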
 ### Session-close loop
 At the end of each TDD8 session:
 1. Update qualitative memory (`## Session Log`, findings, watch points).
 2. Record quantitative outcome:
 ```bash
 kaizen-agentic metrics record tdd-workflow --success --time <seconds> --quality <0.0-1.0>
 ```
 Or pass a full ADR-004 record with `primary_metric` via `--json` (see agent spec).
 ### Analysis and orientation
 | Command | Purpose |
 |---------|---------|
 | `kaizen-agentic metrics show tdd-workflow` | Summary + recent executions |
 | `kaizen-agentic metrics optimize tdd-workflow` | Evidence-based recommendations (≥10 records) |
 | `kaizen-agentic memory brief tdd-workflow` | Qualitative memory + `## Performance Summary` |
 Fleet-level session analytics remain in **agentic-resources** (Helix Forge); project
 metrics stay in `.kaizen/metrics/` per [ADR-004](../docs/adr/ADR-004-project-metrics-convention.md)
 and [EcosystemIntegration](EcosystemIntegration.md).
 ### Adopting metrics on another agent
 1. Add a `metrics:` block to frontmatter (primary + secondary + collection).
 2. Copy the session-close `metrics record` step from `agent-tdd-workflow.md`.
 3. Run `kaizen-agentic memory init <agent>` to scaffold storage.
 4. Verify with `metrics show` after one session.
--- a/wiki/AgentKaizenOptimizer.md
+++ b/wiki/AgentKaizenOptimizer.md
@@ -0,0 +1,248 @@
 AgentKaizenOptimizer
 *One agent to improve them all*
 # KaizenAgent Meta-Optimizer
 # Version: 1.0.0
 # Last Updated: 2025-09-26
 agent:
  name: "kaizen-optimizer"
  version: "1.0.0"
  description: "Meta-agent that analyzes and optimizes other coding subagents based on performance data"
  # Core Specification
  specification:
    purpose: |
      Continuously improve coding subagents by analyzing their performance metrics,
      identifying patterns that correlate with success or failure, and proposing
      data-driven refinements to agent specifications. Acts as the optimization
      engine in the KaizenAgent feedback loop.
    triggers:
      patterns:
        - "Scheduled optimization runs (daily/weekly)"
        - "Performance threshold violations"
        - "Minimum data collection thresholds reached"
        - "Explicit optimization requests"
      explicit_commands:
        - "claude code --optimize-agents"
        - "claude code --kaizen-review"
        - "claude code --agent-performance"
    inputs:
      required:
        - name: "performance_data"
          type: "object"
          description: "Aggregated metrics from all subagents over time period"
        - name: "agent_definitions"
          type: "array"
          description: "Current specifications of all registered agents"
      optional:
        - name: "optimization_focus"
          type: "string"
          default: "all"
          description: "Specific agent or metric to optimize"
        - name: "time_window"
          type: "string"
          default: "30d"
          description: "Historical data window to analyze"
        - name: "confidence_threshold"
          type: "float"
          default: 0.8
          description: "Minimum confidence level for proposing changes"
    outputs:
      primary:
        type: "object"
        description: "Optimization recommendations with supporting data"
      side_effects:
        - "Updated agent specification files (if approved)"
        - "Performance analysis reports"
        - "A/B test configurations"
        - "Rollback checkpoints"
    preconditions:
      - "At least 10 execution samples per agent being analyzed"
      - "Valid performance data with timestamps"
      - "Agent definitions follow KaizenAgent template structure"
    postconditions:
      - "All recommendations include confidence scores and evidence"
      - "Proposed changes maintain backward compatibility"
      - "Rollback plan exists for each proposed change"
  # Idempotency Design
  idempotency:
    strategy: "fingerprint"
    state_detection:
      method: "Hash performance data and agent versions to detect changes"
      implementation: |
        # Generate fingerprint of current state
        data_hash = hash(performance_data + agent_versions + config)
        last_analysis = load_checkpoint('last_optimization_hash')
        if data_hash == last_analysis.hash:
          return last_analysis.recommendations
        # New data available, proceed with analysis
        recommendations = analyze_and_optimize()
        save_checkpoint('last_optimization_hash', {
          hash: data_hash,
          timestamp: now(),
          recommendations: recommendations
        })
        return recommendations
    rollback:
      supported: true
      method: "Restore previous agent specification versions from git history"
  # Performance Measurement
  metrics:
    primary:
      name: "optimization_impact"
      description: "Average performance improvement of optimized agents"
      measurement: "Mean delta of primary metrics before/after optimization"
      target: ">5% improvement in agent success rates"
    secondary:
      - name: "prediction_accuracy"
        description: "How often optimization predictions prove correct"
        measurement: "% of recommendations that improve target metrics"
      - name: "false_positive_rate"
        description: "Rate of recommendations that worsen performance"
        measurement: "% of changes that decrease agent effectiveness"
      - name: "coverage"
        description: "Percentage of agents with actionable insights"
        measurement: "Count of agents with recommendations / total agents"
    collection:
      frequency: "per_execution"
      storage: ".kaizen/metrics/optimizer/"
      retention: "180d"
  # Testing and Validation
  testing:
    unit_tests:
      - scenario: "Pattern detection with synthetic data"
        input: "Mock performance data with known patterns"
        expected_output: "Correct identification of improvement opportunities"
        verification: "Assert detected patterns match expected patterns"
      - scenario: "Confidence scoring accuracy"
        input: "Historical data with known outcomes"
        expected_output: "Confidence scores correlate with actual success"
        verification: "ROC curve analysis of confidence vs outcome"
    integration_tests:
      - scenario: "End-to-end optimization cycle"
        setup: "Real agent with declining performance"
        execution: "Run optimization and apply recommendations"
        validation: "Verify improved performance in subsequent runs"
      - scenario: "Rollback mechanism"
        setup: "Apply optimization that worsens performance"
        execution: "Trigger automatic rollback"
        validation: "Agent returns to previous performance level"
    performance_tests:
      - scenario: "Large dataset analysis"
        load: "1000+ agent executions across 20+ agents"
        max_time: "60 seconds"
        resource_limits: "Max 512MB memory usage"
  # Dependencies and Context
  dependencies:
    system:
      - "Python 3.8+ with pandas, scikit-learn"
      - "Git for version control"
      - "Access to .kaizen/metrics/ directory"
    project:
      - ".kaizen/agents/ directory with agent definitions"
      - ".kaizen/metrics/ directory with historical data"
      - "Valid KaizenAgent project structure"
    other_agents:
      - name: "all_subagents"
        relationship: "analyzes"
        reason: "Requires performance data from all other agents"
  # Configuration
  configuration:
    defaults:
      analysis_algorithms: ["correlation", "regression", "decision_tree"]
      min_sample_size: 10
      significance_threshold: 0.05
      optimization_frequency: "weekly"
    project_overrides:
      path: ".kaizen/agents/kaizen-optimizer.yml"
      schema: |
        {
          "type": "object",
          "properties": {
            "algorithms": {"type": "array"},
            "thresholds": {"type": "object"},
            "scheduling": {"type": "object"}
          }
        }
    environment_variables:
      - name: "KAIZEN_OPTIMIZER_CONFIG"
        description: "JSON configuration for optimization parameters"
  # Evolution Tracking
  optimization:
    baseline_performance:
      established: "2025-09-26"
      metrics: {
        "optimization_impact": 0.0,
        "prediction_accuracy": 0.5,
        "false_positive_rate": 1.0,
        "coverage": 0.0
      }
    improvement_history: []
    known_limitations:
      - "Requires minimum sample sizes to generate reliable insights"
      - "May not detect complex multi-agent interaction patterns"
      - "Limited to metrics explicitly defined in agent specifications"
      - "Cannot optimize for subjective developer experience factors"
    kaizen_notes:
      optimization_priority: "high"
      next_experiment: "Implement ensemble methods for pattern detection"
      success_criteria: "Achieve >80% prediction accuracy with <10% false positive rate"
  # Algorithm Specifications
  algorithms:
    correlation_analysis:
      description: "Identify specification elements that correlate with performance"
      inputs: ["performance_metrics", "agent_configs", "execution_context"]
      outputs: ["correlation_matrix", "significant_factors"]
    performance_regression:
      description: "Model performance trends over time and agent versions"
      inputs: ["time_series_data", "version_history"]
      outputs: ["trend_analysis", "degradation_alerts"]
    specification_diffing:
      description: "Compare high vs low performing agent variants"
      inputs: ["agent_definitions", "performance_clusters"]
      outputs: ["diff_analysis", "success_patterns"]
    a_b_test_design:
      description: "Generate controlled experiments for proposed changes"
      inputs: ["current_spec", "proposed_changes"]
      outputs: ["experiment_config", "success_metrics"]
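The `fingerprint` idempotency strategy in the specification above is written as pseudocode. A minimal runnable sketch in Python might look like the following. It assumes the inputs are JSON-serializable and uses SHA-256 rather than the built-in `hash()`, whose output for strings varies between interpreter runs. The function names, the in-memory `checkpoint` dict, and the `analyze` callback are all hypothetical stand-ins.

```python
import hashlib
import json


def fingerprint(performance_data: dict, agent_versions: dict, config: dict) -> str:
    """Stable SHA-256 over a canonical JSON encoding of all inputs."""
    payload = json.dumps(
        {
            "performance_data": performance_data,
            "agent_versions": agent_versions,
            "config": config,
        },
        sort_keys=True,  # key order must not affect the hash
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def optimize_once(performance_data, agent_versions, config, checkpoint, analyze):
    """Return cached recommendations when inputs are unchanged, else re-analyze."""
    data_hash = fingerprint(performance_data, agent_versions, config)
    if checkpoint.get("hash") == data_hash:
        return checkpoint["recommendations"]
    recommendations = analyze(performance_data)
    checkpoint.update({"hash": data_hash, "recommendations": recommendations})
    return recommendations


calls = []


def fake_analyze(data):
    """Stand-in analysis that records how often it was invoked."""
    calls.append(data)
    return [{"agent": name, "action": "review"} for name in sorted(data)]


checkpoint = {}
data = {"tdd-workflow": {"success_rate": 0.6}}
first = optimize_once(data, {"tdd-workflow": "1.0.0"}, {}, checkpoint, fake_analyze)
second = optimize_once(data, {"tdd-workflow": "1.0.0"}, {}, checkpoint, fake_analyze)
```

The second call returns the cached recommendations without invoking the analysis again; changing any input, such as an agent version, produces a new fingerprint and triggers a fresh analysis.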
--- a/wiki/BrandBook.md
+++ b/wiki/BrandBook.md
@@ -0,0 +1,156 @@
 BrandBook
 *The KaizenAgentic visual style*
 # KaizenAgentic Brandbook
 **Version 0.1 · September 2025**
 ---
 ## 1. Brand Essence
 **Tagline**: *Continuous Improvement for Digital Talent*
 **Core Idea**:
 KaizenAgentic applies the principle of *kaizen* to AI subagents. We represent AI assistants not as static tools, but as digital talents — continuously measured, refined, and optimized.
 **Tone**:
 * Minimal
 * Professional
 * Confident
 * Forward-looking
 ---
 ## 2. Logo System
 ### Primary Logo (Wordmark)
 * **Text**: `KAIZEN▲GENTIC`
 * Typeface: modern grotesk sans-serif (Inter / Helvetica Neue recommended)
 * Weight: Bold
 * Case: ALL CAPS
 * Color: Black on white background (default)
 ### Secondary Logo (Monogram)
 * **Form**: `K▲`
 * The triangle represents *improvement* and *direction upward*.
 * Used for: favicon, app icon, social avatar, watermark.
 ### Clearspace & Minimum Size
 * Maintain at least **1x the height of the "K"** as safe space around the logo.
 * Wordmark: minimum width 160px.
 * Monogram: minimum width 32px.
 ---
 ## 3. Color Palette
 **Primary Colors**
 * Black: #111111
 * White: #FFFFFF
 **Accent (Welding Blue)**
 * Electric Arc Blue: #007BFF (base tone)
 * Arc Glow Gradient:
  * Core Glow: #00A2FF
  * Mid Tone: #007BFF
  * Edge Burn: #0033CC
 **Usage**
 * Use flat Electric Arc Blue (#007BFF) for clean digital presence.
 * For special treatments (logos, hero graphics), use the arc glow gradient to mimic the intensity of molten metal light.
 * Limit glow to accents (monogram ▲ or underline strokes); keep the wordmark monochrome for contrast.
 * Wordmark = Black or White (depending on background).
 * Monogram = Black or White with Electric Blue accent on ▲.
 * Electric Blue is only used as an accent to emphasize improvement / action.
 ---
 ## 4. Typography
 **Primary Typeface**
 * **Inter** (open source, modern grotesk)
 * Alternatives: Helvetica Neue, Neue Haas Grotesk
 **Styles**
 * **Headings**: Bold, ALL CAPS
 * **Body text**: Regular, Sentence case
 * **Tracking**: +2% (tight but legible)
 ---
 ## 5. Applications
 ### Digital
 * **Website header**: Wordmark in Black, hover states in Electric Blue.
 * **App icon**: Monogram K▲, triangle in Electric Blue.
 * **Dark mode**: White wordmark on black background; Electric Blue accents.
 ### Print
 * Business cards:
  * Front: Wordmark centered, Black on White.
  * Back: Monogram K▲, Electric Blue triangle.
 ### Social Media
 * Avatar: Monogram K▲.
 * Banner: Wordmark with subtle Electric Blue line or step motif.
 ---
 ## 6. Visual Motifs
 * **Step Progression (▮▮▮▮▮)**: Suggests incremental kaizen improvement.
 * **Triangle (▲)**: Direction, growth, precision.
 * **Minimal Layouts**: White space is part of the identity.
 ---
 ## 7. Voice & Messaging
 **Voice**:
 * Confident but not loud.
 * Analytical, precise, and professional.
 * Future-oriented, emphasizing *measurable improvement*.
 **Do Say**:
 * *Continuous improvement in AI talent*
 * *Optimization through measurement*
 * *Agents that evolve with you*
 **Don’t Say**:
 * *Magic black box AI*
 * *One-and-done automation*
 * *Trendy gimmicks*
 ---
 ### Monogram K▲ (Electric Blue accent)
--- a/wiki/ComposableCapability.md
+++ b/wiki/ComposableCapability.md
@@ -0,0 +1,17 @@
 ComposableCapability
 *Standard for self-contained units of operational knowledge*
 # Conceptual Foundation: ComposableCapabilities
 ## Core Idea
 A **Composable Capability** is a self-contained unit of reusable functionality — a modular building block that encapsulates not just code, but also *intent*, *interfaces*, and *knowledge*.
 Each capability is organized as a repository and can be composed with others to build higher-level systems or workflows.
 ## Motivation
 In AI-assisted or “Vibe Coding” workflows, it’s not enough to reuse functions or APIs. You need *contextually complete* units: ones that capture *how* to use a function, *why* it exists, and *what* it depends on.
 ComposableCapabilities turn code reuse into *knowledge reuse*.
--- a/wiki/EcosystemIntegration.md
+++ b/wiki/EcosystemIntegration.md
@@ -0,0 +1,197 @@
 # Ecosystem Integration
 *How KaizenAgentic composes with adjacent repositories*
 KaizenAgentic (`INTENT.md`) defines a **meta-improvement layer** for coding
 agents. No single repository implements the full vision. This document describes
 the **two-layer measurement model** and integration contracts with ecosystem
 repos.
 ---
 ## Two-Layer Measurement Model
 | Layer | Question answered | Owner | Storage |
 |-------|-------------------|-------|---------|
 | **Project** | How is this *agent persona* performing in *this repo*? | kaizen-agentic | `.kaizen/metrics/<agent>/` |
 | **Fleet** | How are coding sessions performing *across repos*? | agentic-resources | Helix Forge digest store + baselines |
 ```
  Coding session (Claude / Codex / Grok)
           │
           ├──────────────────────────────────────┐
           ▼                                      ▼
  agentic-resources                      kaizen-agentic
  (Helix Forge)                          (session close)
  Capture → Digest → Fleet metrics     metrics record → executions.jsonl
           │                                      │
           └──────── helix_session_uid ───────────┘
                         (optional link)
 ```
 ### When to use which layer
 - **Project metrics** — optimizer recommendations, Coach briefs, per-agent
  kaizen loop in one codebase (ADR-004).
 - **Fleet metrics** — cross-repo friction analysis, pattern distribution,
  weekly retro, tooling decisions (Helix Forge PRD).
 Kaizen-agentic does not re-implement session JSONL ingestion. It may **cite**
 Helix session UIDs on project execution records for correlation.
 ---
 ## Integration Partners
 ### agentic-resources (P0)
 **Helix Forge** — session capture, fleet aggregates, baselines, weekly retro.
 | KaizenAgentic | Helix Forge |
 |---------------|-------------|
 | `.kaizen/metrics/<agent>/executions.jsonl` | Digest store + `measure/baselines.jsonl` |
 | Per-agent persona outcomes | Per-session cross-repo outcomes |
 | `kaizen-agentic metrics optimize` | `session_memory/measure/` aggregates |
 **Correlation fields** (ADR-004): `helix_session_uid`, `repo`, `flavor`,
 `tokens`, `infra_overhead_share`.
 **Workplan:** KAIZEN-WP-0004 Part 1.
 #### Worked example
 A TDD8 session captured by Helix Forge and closed with kaizen metrics:
 ```bash
 # Helix capture sets (or operator exports) session identity
 export HELIX_SESSION_UID="claude:17092961-abc"
 export HELIX_REPO="kaizen-agentic"
 export HELIX_FLAVOR="claude"
 export HELIX_TOKENS="12500"
 # Session close — project layer
 kaizen-agentic metrics record tdd-workflow --success --time 4200 --quality 0.92
 # Inspect project record (includes correlation fields)
 kaizen-agentic metrics show tdd-workflow
 # Fleet lookup — read-only, no ingestion in kaizen-agentic
 export HELIX_STORE_DB=~/.helix-forge/store.db
 kaizen-agentic metrics correlate claude:17092961-abc
 ```
 Project `executions.jsonl` carries `helix_session_uid` for audit; fleet analytics
 remain in the agentic-resources digest store. Coach `memory brief` surfaces the project
 `## Performance Summary`; `metrics correlate` adds fleet context when needed.
 Contract: [docs/integrations/helix-forge-correlation.md](../docs/integrations/helix-forge-correlation.md).
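To make the correlation step more concrete, the sketch below shows one way a project execution record could pick up the Helix fields from the environment variables used in the worked example. The variable names come from that example; the mapping to ADR-004 field names, the `with_correlation_fields` helper, and the integer conversion of `tokens` are assumptions for illustration and may not match what `metrics record` actually does.

```python
from __future__ import annotations

import os
from typing import Mapping

# Assumed mapping from environment variable to ADR-004 correlation field.
ENV_TO_FIELD = {
    "HELIX_SESSION_UID": "helix_session_uid",
    "HELIX_REPO": "repo",
    "HELIX_FLAVOR": "flavor",
    "HELIX_TOKENS": "tokens",
}


def with_correlation_fields(
    record: dict, env: Mapping[str, str] | None = None
) -> dict:
    """Return a copy of the record with any available correlation fields added."""
    env = os.environ if env is None else env
    enriched = dict(record)  # never mutate the caller's record
    for var, field in ENV_TO_FIELD.items():
        value = env.get(var)
        if value:
            # Token counts are stored as integers in this sketch (an assumption).
            enriched[field] = int(value) if field == "tokens" else value
    return enriched


example = with_correlation_fields(
    {"success": True, "execution_time_s": 4200, "quality_score": 0.92},
    env={
        "HELIX_SESSION_UID": "claude:17092961-abc",
        "HELIX_REPO": "kaizen-agentic",
        "HELIX_FLAVOR": "claude",
        "HELIX_TOKENS": "12500",
    },
)
```

Keeping the link optional means a record written outside a Helix-captured session is still valid; it simply carries no correlation fields.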
 ### activity-core (P1)
 **Event bridge** — scheduled and event-driven task creation.
 ActivityDefinition reference copies (sync into activity-core to activate):
 - [weekly-metrics-optimize](../docs/integrations/activity-definitions/weekly-metrics-optimize.md)
 - [post-install-metrics-scaffold](../docs/integrations/activity-definitions/post-install-metrics-scaffold.md)
 - [low-success-rate-review](../docs/integrations/activity-definitions/low-success-rate-review.md)
 **Workplan:** KAIZEN-WP-0004 Part 2. Patterns: [docs/INTEGRATION_PATTERNS.md](../docs/INTEGRATION_PATTERNS.md).
 ### artifact-store (P1)
 **Evidence retention** — durable registry for generated outputs.
 Register after optimizer runs:
 - `optimizer/analysis.json`
 - `recommendations.jsonl` snapshots
 - E2e pilot evidence packages
 Retention class: `raw-evidence` (180d default, aligned with ADR-004).
 ```bash
 kaizen-agentic metrics optimize
 kaizen-agentic metrics publish   # requires ARTIFACTSTORE_API_URL and ARTIFACTSTORE_API_TOKEN
 ```
 Manifest: [docs/integrations/optimizer-artifact-manifest.md](../docs/integrations/optimizer-artifact-manifest.md).
 **Workplan:** KAIZEN-WP-0004 Part 3.
 ### info-tech-canon (P2)
 **Semantic canon** — agent briefs, patterns, profiles, validation.
 - Map `KaizenAgentTemplate.md` → InfoTechCanon profile format
 - Publish compact agent briefs per persona
 - Extend `kaizen-agentic validate` with canon conformance checks
 **Workplan:** KAIZEN-WP-0004 Part 4.
 ### phase-memory (P2, future)
 **Memory graphs** — upgrade from flat `memory.md` to phased memory profiles.
 - Fluid memory → project session paths
 - Stabilized memory → accumulated findings with provenance
 - Context packages for Coach brief compilation
 No WP-0003 blocker; plan after ecosystem integration baseline.
 ### kontextual-engine (P2)
 **Knowledge operations** — ingest `wiki/` and agent definitions as governed
 assets; runtime for KaizenGuidance catalog when built.
 ### llm-connect (P3)
 **LLM abstraction** — use when Coach/optimizer synthesis becomes automated
 beyond CLI context assembly. Token metrics align with wiki pricing tiers.
 ### domain-tree (P3)
 Register kaizen-agentic and agent categories with primary/secondary domain
 bindings when capability catalog matures.
 ### identity-canon (P3)
 Terminology for agent persona vs deployed instance vs session actor —
 supports "digital talent agency" framing without overloading "user".
 ### tele-mcp (TBD)
 Listed on Forgejo; not cloned locally. Candidate telemetry MCP adapter for
 WP-0001 T04. Assess before depending on it.
 ---
 ## Boundary Rules
 1. **kaizen-agentic owns** agent definitions, `.kaizen/` conventions, CLI,
   Coach/optimizer personas, and product framing (`INTENT.md`, `wiki/`).
 2. **kaizen-agentic does not own** session transcript ingestion, task
   scheduling, artifact bytes, knowledge graph runtime, or LLM providers.
 3. **Integrate by contract** — ADRs, shared correlation fields, ActivityDefinitions,
   artifact registration APIs — not by merging repos.
 4. **Evidence compounds** — fleet baselines inform tooling; project metrics
   inform agent specs; artifact-store preserves both for audit.
 ---
 ## Reading Order
 1. `INTENT.md` — purpose and boundaries
 2. `wiki/EcosystemIntegration.md` — this document
 3. `docs/adr/ADR-004-project-metrics-convention.md` — project metrics schema
 4. `history/2026-06-16-ecosystem-assessment.md` — full repo comparison
 5. `workplans/kaizen-agentic-WP-0004-ecosystem-integration.md` — implementation plan
 ---
 ## Related Assessments
 Persisted in `history/`:
 - `2026-06-16-intent-gap-analysis.md`
 - `2026-06-16-ecosystem-assessment.md`
--- a/wiki/IdempotentCompounding.md
+++ b/wiki/IdempotentCompounding.md
@@ -0,0 +1,112 @@
 IdempotentCompounding
 *Kaizen Agentic Philosophy*
 # IdempotentCompounding — a primer
 Definition (one-liner): Build and evolve systems by idempotent automation (safe to run repeatedly) and compounding increments (small units that add durable value), governed by outcomes and quality gates.
 ## Core principles
 - Idempotence by default — Every operation (provision, deploy, migrate, refactor) is safe to re-run; desired state > imperative steps.
 - Compound value, not complexity — Ship small, composable capability units that stack cleanly and raise the baseline.
 - Evidence over intention — Each increment must declare its value metric and show before/after.
 - Reversibility — Fast rollback/roll-forward; changes are sliceable and isolated.
 - Sustainability as a constraint — Optimize for maintainability, cost, energy, and human time.
 - Quality is automated — Tests, checks, and drift detection run continuously, not occasionally.
 - Documentation is generated — Architecture, runbooks, and changelogs are derived from code & traces, then curated.
 ## The operating cycle (repeatable)
 Select → Specify → Safeguard → Apply → Verify → Record
 - Select a high-ROI increment (hotspot × business value).
 - Specify desired state (declarative spec, schema, or refactor objective).
 - Safeguard with idempotent checks: contract tests, drift monitors, health probes.
 - Apply via automation (IaC, pipelines, codemods) — re-runnable end-to-end.
 - Verify outcomes (SLOs, cost, complexity, security).
 - Record: update arc42 views, ADR, and the quality dashboard.
 Rule of thumb: if you’re afraid to re-run it, it’s not IdempotentCompounding yet.
 ## Units of change (the “compounders”)
 - Infra Module (e.g., Terraform/Kubernetes object)
 - Service Capability (feature flag, API slice)
 - Quality Guide Move (codemod + lint rule)
 - Data Contract (schema + migration + validation)
 - Ops Control (SLO, alert, autoscaler policy)
 Each unit carries:
 - Spec (YAML/DSL + schema)
 - Guards (tests/checks)
 - KPIs (value metric)
 - Rollbacks (delete/replace plan)
 - Docs hooks (arc42/ADR update hints)
 ## Minimal guardrails
 - Idempotence test: run the job twice; expect no diff.
 - Blast-radius cap: feature flags, canaries, or scoped namespaces.
 - Drift sentry: reconcile loop or plan delta must be ≈0 after apply.
 - Budget bound: change must not breach cost/latency/error-rate budgets.
 - Timebox: if verification can’t prove value in X hours, revert or park.
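The idempotence test above (run the job twice; expect no diff) can be sketched in Python. This is a minimal illustration only: `apply_desired_state`, `increment_counter`, and the dict-based state are hypothetical stand-ins for a real job and are not part of kaizen-agentic.

```python
import copy


def apply_desired_state(state: dict, desired: dict) -> dict:
    """Hypothetical convergent job: set only keys that differ from `desired`."""
    new_state = copy.deepcopy(state)
    for key, value in desired.items():
        if new_state.get(key) != value:
            new_state[key] = value
    return new_state


def increment_counter(state: dict, desired: dict) -> dict:
    """Hypothetical counter-example: an imperative step that changes on every run."""
    new_state = copy.deepcopy(state)
    new_state["runs"] = new_state.get("runs", 0) + 1
    return new_state


def diff(before: dict, after: dict) -> dict:
    """Return {key: (old, new)} for every key whose value differs."""
    keys = before.keys() | after.keys()
    return {
        k: (before.get(k), after.get(k))
        for k in keys
        if before.get(k) != after.get(k)
    }


def is_idempotent(job, state: dict, desired: dict) -> bool:
    """Run the job twice; the second run must produce no diff."""
    first = job(state, desired)
    second = job(first, desired)
    return diff(first, second) == {}


desired = {"replicas": 3, "cache": "redis"}
print(is_idempotent(apply_desired_state, {"replicas": 1}, desired))  # True
print(is_idempotent(increment_counter, {}, desired))                 # False
```

The same shape applies to real tooling: apply once, apply again, and fail the guard if the second run reports any change.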
 ## Metrics that matter
 - Value: SLO attainment, cycle time, revenue/usage lift, defect escape rate ↓
 - Quality: Maintainability index ↑, complexity/duplication ↓, Definition of Clean (DoC) compliance %
 - Sustainability: € per request ↓, watts per request ↓, toil hours ↓
 - Reliability: MTTR ↓, change failure rate ↓, successful re-runs % = 100
 ## Tooling patterns (typical stack)
 - Desired state: Terraform/Pulumi, Helm/Kustomize, GitOps (Argo CD/Flux)
 - Idempotent app changes: migration frameworks, codemods (libCST/jscodeshift), OpenRewrite
 - Verification: contract & golden tests; load tests in CI for hot paths
 - Observability: traces/metrics feeding fitness functions in pipelines
 - Docs: Structurizr/PlantUML generated from code + traces; ADRs as code
 ## How it fits the AbcdekGuidance Practice
 - A (Architecture, ArcFortyTwo): auto-generate views; define fitness functions (what “value” means).
 - B (Build, SafetyNetTests): make verification idempotent; contract tests become guards.
 - C (Clean, CleanByDefinition): encode idempotence rules (no side-effect scripts, reversible migrations).
 - D (Direction, GamePlan): prioritize compounders with best value/effort ratio.
 - E (Evolve, RefactoringLoop): codemods + tests prove idempotent repeats and measurable deltas.
 - K (KeepClean, Kaizen): weekly trend checks; drift/DoC gates keep compounded value from decaying.
 ## Templates (drop-in)
 ### 1) Value Ticket (per increment)
 - Title: `<Increment name>`
 - Desired State: `<declarative spec or target structure>`
 - Guards: `<tests/checks to pass; idempotence proof: re-run yields no diff>`
 - Value Metric: `<name, baseline, target, window>`
 - Rollback: `<how to undo safely>`
 - Docs Hooks: `<arc42 sections / ADR to update>`
 - Owner / ETA:
 ### 2) Idempotence Checklist
 - [ ] Declarative spec exists
 - [ ] Dry-run/plan produces stable diff
 - [ ] Double-apply yields zero change
 - [ ] Safe to parallelize or properly serialized
 - [ ] Idempotent cleanup (delete/apply symmetry)
 ## Example (brief)
 - Goal: Reduce p95 latency by 20% on /checkout.
 - Compounder: Add Redis read-through cache for product lookups (Helm values + code toggle).
 - Guards: Contract tests for /checkout; load test at 95th percentile; drift check on Helm release.
 - Apply: helm upgrade (safe to re-run), feature flag rollout 10%→100%.
 - Verify: p95 from 480 ms → 360 ms (7-day window), error rate unchanged.
 - Record: ADR-012, arc42 runtime view updated, dashboard shows value trend.
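The Verify step of this example reduces to comparing a measured reduction against the stated goal. A minimal sketch, assuming hypothetical helper names (`relative_reduction`, `target_met`) and using only the figures given in the example:

```python
def relative_reduction(baseline: float, measured: float) -> float:
    """Fractional reduction relative to the baseline (0.25 means 25% lower)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - measured) / baseline


def target_met(baseline: float, measured: float, target_reduction: float) -> bool:
    """True when the measured reduction reaches the target."""
    return relative_reduction(baseline, measured) >= target_reduction


# Figures from the example: p95 went from 480 ms to 360 ms; the goal was 20%.
reduction = relative_reduction(480.0, 360.0)
print(f"p95 reduction: {reduction:.0%}")  # prints "p95 reduction: 25%"
print(target_met(480.0, 360.0, 0.20))     # prints "True"
```

A check like this could serve as the value-metric guard on a Value Ticket, so the increment is recorded only once the target is demonstrably met.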
 ## Bottom line 
 IdempotentCompounding turns improvement into a safe, repeatable habit: every step is re-runnable, every change compounds value, and every gain is proven.
 xxx
--- a/wiki/KaiPersonal.md
+++ b/wiki/KaiPersonal.md
@@ -0,0 +1,7 @@
 KaiPersonal
 *Kaizen Personal Assistant Framework*
 A framework to set up, use, and improve personal assistants based on agentic AI in your daily life, keeping you in charge of your data and organization without vendor lock-in.
 xxx
--- a/wiki/KaizenAgentTemplate.md
+++ b/wiki/KaizenAgentTemplate.md
@@ -0,0 +1,169 @@
 KaizenAgentTemplate
 *This is where we build from*
 # KaizenAgent Definition Template
 # Version: 1.0
 # Last Updated: {timestamp}
 agent:
  name: "{agent_name}"
  version: "1.0.0"
  description: "Brief description of agent's primary responsibility"
  # Core Specification
  specification:
    purpose: |
      One paragraph describing the agent's single responsibility.
      Focus on the desired outcome, not implementation details.
    triggers:
      # When should this agent be invoked?
      patterns:
        - "File patterns that indicate this agent should run"
        - "Keywords or context clues in requests"
        - "Project states that require this agent"
      explicit_commands:
        - "--agent={agent_name}"
        - "claude code --{shorthand}"
    inputs:
      required:
        - name: "input_name"
          type: "string|array|object"
          description: "What this input represents"
      optional:
        - name: "optional_input"
          type: "string"
          default: "default_value"
          description: "Optional configuration"
    outputs:
      primary:
        type: "file|stdout|metadata"
        description: "Main deliverable of the agent"
      side_effects:
        - "Any files created or modified"
        - "External systems touched"
        - "State changes made"
    preconditions:
      - "Conditions that must be true before agent runs"
      - "Dependencies that must exist"
    postconditions:
      - "Guaranteed state after successful execution"
      - "Invariants that will be maintained"
  # Idempotency Design
  idempotency:
    strategy: "convergent|checkpoint|fingerprint|state_detection"
    state_detection:
      method: "How to check if work is already done"
      implementation: |
        # Pseudo-code or description of how to detect current state
        check_current_state()
        if (desired_state_achieved()) return current_state
        proceed_with_transformation()
    rollback:
      supported: true
      method: "How to undo changes if needed"
  # Performance Measurement
  metrics:
    primary:
      name: "primary_success_metric"
      description: "Most important measure of agent success"
      measurement: "How to calculate this metric"
      target: "Desired value or improvement threshold"
    secondary:
      - name: "additional_metric_1"
        description: "Secondary success indicator"
        measurement: "Calculation method"
      - name: "additional_metric_2" 
        description: "Quality or safety metric"
        measurement: "How to measure"
    collection:
      frequency: "per_execution|daily|weekly"
      storage: "where_metrics_are_stored"
      retention: "how_long_to_keep_data"
  # Testing and Validation
  testing:
    unit_tests:
      - scenario: "Test scenario description"
        input: "Sample input data"
        expected_output: "Expected result"
        verification: "How to verify success"
    integration_tests:
      - scenario: "End-to-end test scenario"
        setup: "Required project state"
        execution: "Commands to run"
        validation: "Success criteria"
    performance_tests:
      - scenario: "Performance test case"
        load: "Input complexity/size"
        max_time: "Acceptable execution time"
        resource_limits: "Memory/CPU constraints"
  # Dependencies and Context
  dependencies:
    system:
      - "Required tools or binaries"
      - "Environment variables needed"
    project:
      - "Files that must exist"
      - "Project structure assumptions"
    other_agents:
      - name: "dependency_agent"
        relationship: "runs_before|runs_after|collaborates"
        reason: "Why this dependency exists"
  # Configuration
  configuration:
    defaults:
      key1: "default_value1"
      key2: "default_value2"
    project_overrides:
      path: ".kaizen/agents/{agent_name}.yml"
      schema: "JSON schema for configuration validation"
    environment_variables:
      - name: "KAIZEN_{AGENT_NAME}_CONFIG"
        description: "Runtime configuration override"
  # Evolution Tracking
  optimization:
    baseline_performance:
      established: "{date}"
      metrics: {}
    improvement_history:
      - version: "1.0.1"
        change: "Description of what was modified"
        reason: "Why the change was made"
        impact: "Measured improvement"
    known_limitations:
      - "Current limitation 1"
      - "Area for future improvement"
    kaizen_notes:
      optimization_priority: "high|medium|low"
      next_experiment: "Planned improvement to test"
      success_criteria: "How to measure if experiment succeeded"
 xxx
--- a/wiki/KaizenAgentic.md
+++ b/wiki/KaizenAgentic.md
@@ -0,0 +1,8 @@
 KaizenAgentic
 *The digital talent agency*
 KaizenAgentic is a digital talent agency for AI coding agents. We apply the principle of kaizen—continuous improvement—to agent design, transforming coding subagents into evolving digital talents that get measurably better over time.
 xxx
--- a/wiki/KaizenAgenticIdea.md
+++ b/wiki/KaizenAgenticIdea.md
@@ -0,0 +1,33 @@
 KaizenAgenticIdea
 *How it started*
 # Overview 
 KaizenAgentic provides a meta-optimization framework for continuously improving AI coding subagents through data-driven iteration. Rather than treating agent development as a one-time engineering task, KaizenAgent establishes a systematic approach to refine and evolve coding assistants based on their real-world performance. 
 # Core Philosophy 
 The project embraces the Japanese concept of "kaizen" (continuous improvement) applied to AI agent development. Every coding subagent becomes part of an optimization loop where performance is measured, patterns are analyzed, and specifications are refined over time. 
 # Key Components
 ## Performance-Driven Subagents
 Each coding subagent includes built-in metrics that capture meaningful performance indicators: test coverage improvements, code quality deltas, maintenance burden, and developer experience metrics.
 ## Meta-Optimization Engine
 A specialized KaizenAgent that analyzes subagent performance history, identifies improvement opportunities, and proposes specification refinements through systematic pattern analysis.
 ## Evolutionary Architecture
 Agent specifications are treated as versioned, testable code that can be A/B tested, rolled back, and iteratively improved based on empirical evidence rather than intuition.
 # Design Principles
 ## Measurable by Default
 Every subagent must define at least one quantitative success metric.
 ## Idempotent Operations
 All agent actions are designed to converge on desired states rather than perform blind transformations.
 ## Separation of Concerns
 Clear boundaries between task-specific subagents, performance measurement, and optimization logic.
 ## Test-First Development
 Agent improvements are validated through controlled experiments before deployment.
 # Target Use Case 
 Designing and optimizing agents for coding tasks that can be used with Claude, Cursor, and other coding agent systems to improve software development workflows. The goal is coding assistants that genuinely improve over time through systematic observation and refinement of their real-world effectiveness.
 xxx
--- a/wiki/KaizenAgenticMission.md
+++ b/wiki/KaizenAgenticMission.md
@@ -0,0 +1,70 @@
 KaizenAgenticMission
 *Kaizen Agentic in a nutshell*
 # KaizenAgentic
 **Mission**
 KaizenAgentic is a digital talent agency for AI coding agents. We apply the principle of *kaizen*—continuous improvement—to agent design, transforming coding subagents into evolving digital talents that get measurably better over time.
 ---
 ## Overview
 Traditional AI agent development is often a one-off engineering project: design once, deploy, and hope it works. KaizenAgentic takes a different path. We provide a **meta-optimization framework** that treats agent development as an iterative lifecycle. Every coding subagent is continuously observed, evaluated, and refined based on real-world performance data.
 ---
 ## Core Philosophy
 * **Continuous Improvement**: Inspired by Japanese kaizen, every agent is part of an optimization loop.
 * **Data-Driven Evolution**: Decisions are grounded in metrics, not intuition.
 * **Systematic Refinement**: Performance history, usage patterns, and experimental results drive specification updates.
 ---
 ## Key Components
 ### 1. Performance-Driven Subagents
 Every coding subagent comes with **built-in metrics**:
 * Test coverage improvements
 * Code quality deltas
 * Maintenance burden
 * Developer experience signals
 ### 2. Meta-Optimization Engine
 A specialized **KaizenAgent** analyzes performance logs, spots recurring improvement opportunities, and proposes refined specifications.
 ### 3. Evolutionary Architecture
 Agent specifications are treated like code:
 * Versioned
 * Testable
 * A/B tested in real workflows
 * Rollback-ready
 ---
 ## Design Principles
 * **Measurable by Default**: Every subagent defines quantitative success criteria.
 * **Idempotent Operations**: Actions converge toward desired states instead of introducing uncontrolled drift.
 * **Separation of Concerns**: Subagents focus on tasks; optimization logic stays independent.
 * **Test-First Improvement**: New refinements are validated through controlled experiments before rollout.
 ---
 ## Target Use Case
 KaizenAgentic focuses on **coding task optimization**. Our refined subagents integrate with platforms like **Claude**, **Cursor**, and other coding assistant ecosystems. The result: coding assistants that don’t stagnate but **improve with use**, enabling better workflows, higher code quality, and reduced developer friction.
 ---
 👉 Think of **KaizenAgentic** as the **talent agency for digital coders**—a place where AI subagents aren’t static tools but *living talents*, continuously coached, measured, and refined for peak performance.
 xxx
--- a/wiki/KaizenBackground.md
+++ b/wiki/KaizenBackground.md
@@ -0,0 +1,27 @@
 KaizenBackground
 *Continuous Improvement Methods and Applications*
 Kaizen is a Japanese business philosophy and methodology focused on continuous improvement through small, incremental changes that involve everyone in an organization, from executives to frontline workers. The term comes from the Japanese words "kai" (change) and "zen" (good), together meaning "change for the better" or simply "improvement".
 ## Core Principles
 Kaizen aims to improve efficiency, increase productivity, reduce waste, and enhance quality by encouraging regular, everyday improvements. It relies on cooperation, employee empowerment, and commitment at all levels rather than imposing top-down or radical changes.
 ## Applications
 Kaizen began as an industrial practice in post-WWII Japan, notably within the Toyota Production System, and has since spread worldwide to industries far beyond manufacturing, including healthcare, software development, and service sectors.
 ## Methodology
 Key elements of the Kaizen approach include:
 - Encouraging all employees to identify and suggest improvements.
 - Using systematic cycles like the PDCA (Plan, Do, Check, Act) method to implement and review changes.
 - Focusing on standardized work processes that evolve based on new improvements.
 - Application of the “5S” system for workplace organization: Sort, Set in Order, Shine, Standardize, Sustain.
 Kaizen’s philosophy of continuous, collective improvement remains foundational in modern lean management, helping organizations enhance productivity, quality, and workplace culture.
 xxx
--- a/wiki/KaizenDefinitionOfClean.md
+++ b/wiki/KaizenDefinitionOfClean.md
@@ -0,0 +1,7 @@
 KaizenDefinitionOfClean
 *Keep your codebase in shape*
 The KaizenDefinitionOfClean specifies the target state for matching KaizenGuidance and can be used to diagnose and restore deviations from guidance regularly.
 xxx
--- a/wiki/KaizenGameplan.md
+++ b/wiki/KaizenGameplan.md
@@ -0,0 +1,7 @@
 KaizenGameplan
 *Optimize your codebase*
 The gameplan describes what to do in your specific codebase to implement a KaizenGuidance document.
 xxx
--- a/wiki/KaizenGuidance.md
+++ b/wiki/KaizenGuidance.md
@@ -0,0 +1,65 @@
 KaizenGuidance
 *Codebase improvement programs*
 A curated, language-agnostic library of Code Quality Guides where each guide is:
 - Readable for humans,
 - Checkable by linters/static analyzers,
 - Refactorable by codemods/agents,
 - Measurable with before/after quality metrics.
 Think “Clean Code + MISRA precision + Sonar/ESLint automation + AI codemods.”
 See also: https://chatgpt.com/share/68d6b45b-17f8-8009-8d15-c174f53d2591
 ## Guide anatomy (single source of truth)
 Each guide lives as a versioned folder containing:
 - A manifest (machine-readable spec)
 - A narrative (rationale, trade-offs, examples)
 - Checks (lint/static analysis mappings)
 - Refactors (codemods, recipes, prompts)
 - Tests (fixtures + expected diffs)
 - Metrics (what ‘better’ means)
 ## Rule expression & execution pipeline
 Parse → Check → Plan → Refactor → Test → Measure → Report
 - Parse: build AST/index (libcst for Py, ts-morph/jscodeshift for TS/JS, OpenRewrite for Java, Clang-Tidy/LibTooling for C/C++).
 - Check: run native linters + Semgrep queries from guide.yaml (unified output schema).
 - Plan: produce a Change Plan (JSON) listing targets & suggested transforms.
 - Refactor: deterministic codemods first; ambiguous edits delegated to an Agent with a strict prompt & test harness.
 - Test: run unit tests + mutation tests (where available).
 - Measure: compute deltas for maintainability index (MI), cyclomatic complexity, duplication, lint issues, and “hotspot × rule” intersections (code churn × smells).
 - Report: markdown/HTML summary + SARIF for code scanning.
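The Measure step can be sketched as a before/after comparison of quality snapshots. The sketch below is an assumption about how such deltas might be represented: `QualitySnapshot`, `measure_deltas`, and the numbers are invented for illustration and do not come from a real run or from the kaizen-agentic codebase.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class QualitySnapshot:
    """Hypothetical container for the metrics named in the Measure step."""
    maintainability_index: float   # higher is better
    cyclomatic_complexity: float   # lower is better
    duplication_pct: float         # lower is better
    lint_issues: int               # lower is better


def measure_deltas(before: QualitySnapshot, after: QualitySnapshot) -> dict:
    """Return after-minus-before for every metric in the snapshot."""
    b, a = asdict(before), asdict(after)
    return {name: a[name] - b[name] for name in b}


# Illustrative values only, not measurements.
before = QualitySnapshot(62.0, 14.0, 8.5, 40)
after = QualitySnapshot(68.0, 11.0, 6.0, 25)

for name, delta in measure_deltas(before, after).items():
    print(f"{name}: {delta:+}")
```

A Report step could serialize the same dictionary into the markdown summary, leaving the interpretation of each sign (up is good for MI, down is good for the rest) to the guide's metrics definition.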
 ## Example guides (initial catalog)
 1. API Design
 - Avoid boolean “success” returns
 - Prefer narrow, explicit exceptions
 - Make side effects explicit (naming & module boundaries)
 2. Readability & Structure
 - Function length & parameter count thresholds (with exceptions mechanism)
 - Cohesion over convenience: one reason to change (SRP pragmatically)
 - Replace “god module/class” with feature modules
 3. Testing & Contracts
 - Fast tests default; slow/flaky quarantined
 - Golden tests for parsers/formatters
 - Pre/postconditions via lightweight asserts or type contracts
 4. Performance-safe Patterns
 - Avoid N+1 queries (framework-specific codemods)
 - Replace quadratic hot-loops with map/join or indexed lookups
 - Lazy vs eager boundaries (measurable)
 5. Security & Robustness
 - Input validation at boundaries (web/cli)
 - No raw SQL without parameterization
 - Secrets/config separation; env-based wiring
 Each guide ships checks + codemods + agent prompts + metrics.
 xxx
--- a/wiki/KaizenPrompting.md
+++ b/wiki/KaizenPrompting.md
@@ -0,0 +1,115 @@
 KaizenPrompting
 *Continuous Improvement for Agentic AI*
 Introduction to “Kaizen-style continuous improvement in agentic AI”, using the triad **PromptIdea → PromptExperiment → PromptMantra** to structure the evolution of prompting into agent design.
 ---
 # 🧭 Kaizen Prompting: Continuous Improvement for Agentic AI
 In the early days of AI interaction, prompts were static instructions — short messages carefully crafted to evoke a single response.
 With the rise of **agentic AI**, this changed: instead of merely producing text, models began to act — reasoning over time, calling tools, and improving through feedback.
 This shift calls for a *Kaizen approach* to prompting — one that treats prompt creation as an iterative, measurable learning process that leads from inspiration to embodied behavior.
 ---
 ## 🌱 Stage 1 — **Prompt Idea**
 *Seed of curiosity and exploration.*
 A **Prompt Idea** is a raw spark — a line, a phrasing, or a structure that captures a potentially useful interaction pattern.
 It may come from social feeds, community examples, or spontaneous inspiration.
 The Kaizen principle here is *observation*: collect without judgment, record quickly, and tag loosely for later exploration.
 **Purpose:**
 * Capture inspiration and context.
 * Encourage curiosity and variation.
 * Build a living repository of possibilities.
 **Typical contents:**
 * Source (where it was found)
 * Quick notes on intent or tone
 * Tags for domain or behavior
 ---
 ## ⚗️ Stage 2 — **Prompt Experiment**
 *Form and deliberate practice.*
 A **Prompt Experiment** is a tested and refined version of a Prompt Idea.
 Here, the practitioner engages in iterative cycles: run, observe, adjust.
 The focus is on learning what works — how the system reacts, how stable results are, how cost and success relate.
 Each experiment is documented with outcomes and metrics.
 **Purpose:**
 * Establish a reproducible form.
 * Identify influencing factors (phrasing, context, role).
 * Gather data for cost, stability, and satisfaction.
 **Kaizen loop:**
 > Plan → Try → Observe → Reflect → Adjust
 **Metrics to track:**
 * Consistency of output
 * Token or time cost
 * User satisfaction or success rate
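The three metrics above can be computed from a simple log of experiment runs. This is a sketch under stated assumptions: the `Run` record and `summarize` helper are hypothetical, and "consistency" is defined here as the share of runs that produced the most common output, which is one possible reading rather than a definition given in this document.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean


@dataclass
class Run:
    """One execution of a Prompt Experiment (hypothetical record shape)."""
    output: str
    tokens: int
    success: bool


def summarize(runs: list[Run]) -> dict:
    """Summarize consistency, average token cost, and success rate."""
    if not runs:
        raise ValueError("at least one run is required")
    # Count of the single most frequent output across all runs.
    top_count = Counter(r.output for r in runs).most_common(1)[0][1]
    return {
        "consistency": top_count / len(runs),
        "avg_tokens": mean(r.tokens for r in runs),
        "success_rate": sum(r.success for r in runs) / len(runs),
    }


# Illustrative runs, not real measurements.
runs = [
    Run("A", 100, True),
    Run("A", 120, True),
    Run("A", 110, True),
    Run("B", 150, False),
]
print(summarize(runs))
```

Tracking these values per experiment version makes the Reflect and Adjust steps of the loop comparable across iterations.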
 ---
 ## 🔮 Stage 3 — **Prompt Mantra**
 *Flow and embodiment.*
 When a Prompt Experiment consistently evokes a desirable pattern of behavior — accuracy, empathy, clarity, creativity — it graduates into a **Prompt Mantra**.
 A Prompt Mantra is no longer an instruction but a *behavioral seed*: a concise invocation that activates a known agentic quality.
 It can be embedded in agents as a reusable behavioral modulator — just as humans use mantras to evoke particular mindsets.
 **Purpose:**
 * Encode repeatable, high-value behavior.
 * Provide building blocks for agent personalities.
 * Allow measurable, reflective improvement loops.
 **Example:**
 > **Name:** Truth Herald
 > **Essence:** “Speak not to persuade, but to awaken.”
 > **Effect:** Encourages fact-based, empathic communication in analysis agents.
 > **Metrics:** Invocation count, success ratio, refinement index.
 ---
 ## 🌀 Integrating the Three Stages into Agentic Kaizen
 1. **Collect Prompt Ideas** → continuously harvest inspiration.
 2. **Run Prompt Experiments** → test, measure, and reflect.
 3. **Distill Prompt Mantras** → preserve what consistently works.
 4. **Deploy in Agents** → use Mantras as behavioral components.
 5. **Reflect and Adapt** → measure results, refine or retire Mantras.
 This process forms a **Kaizen Loop of behavioral prompting**, where every agent — and every human behind it — grows more capable, intentional, and aligned through continuous feedback.
 ---
 ### ✳️ Summary
 | Stage                 | Symbolic Role | Focus                  | Outcome                            |
 | --------------------- | ------------- | ---------------------- | ---------------------------------- |
 | **Prompt Idea**       | *Seed*        | Curiosity, observation | Inspiration for exploration        |
 | **Prompt Experiment** | *Form*        | Practice, feedback     | Reliable, measured prompt behavior |
 | **Prompt Mantra**     | *Flow*        | Embodiment, invocation | Reusable agentic quality           |
 ---
 By treating prompts not as fixed spells but as evolving practices, **Kaizen Prompting** turns AI interaction into a living craft — one of mindful iteration, measurement, and improvement.
 Through *PromptIdeas*, *PromptExperiments*, and *PromptMantras*, we cultivate agents that don’t just perform — they learn, refine, and flow.
 xxx
--- a/wiki/PricingModel.md
+++ b/wiki/PricingModel.md
@@ -0,0 +1,65 @@
 PricingModel
 *Pricing for Kaizen agents*
 # KaizenAgentic Pricing Model
 ### 1. Base Assumption
 * **Token Cost (C):** The unit price per token for the underlying foundation model (e.g., OpenAI GPT-4o, Anthropic Claude, etc.).
 * KaizenAgentic charges are always calculated as a multiple of this base token cost.
 ---
 ### 2. Capability Multipliers
 Each subagent is classified by its **capability tier**, which reflects complexity, optimization overhead, and real-world utility.
 | **Tier** | **Agent Capability**   | **Multiplier (x)** | **Example Use Case**                                           |
 | -------- | ---------------------- | ------------------ | -------------------------------------------------------------- |
 | 1x       | Baseline wrapper agent | 1×                 | Simple automation around base LLM calls                        |
 | 2x       | Enhanced agent         | 2×                 | Adds logging, minimal optimization, lightweight feedback loops |
 | 3x       | Professional agent     | 3×                 | Integrated metrics, test coverage deltas, developer UX signals |
 | 4x       | Expert agent           | 4×                 | Adaptive refinement, A/B testing, rollback mechanisms          |
 | 5x       | KaizenAgent premium    | 5×                 | Full meta-optimization loop, cross-subagent orchestration      |
 ---
 ### 3. Pricing Formula
 $$
 \text{KaizenAgentic Price per Token} = C \times M
 $$
 Where:
 * **C** = cost per token of the underlying LLM
 * **M** = capability multiplier (1x–5x, per the tier table above)
 Example:
 * GPT-4o base token = $0.01 / 1K tokens
 * KaizenAgent Premium (5x) = $0.05 / 1K tokens
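The formula is a single multiplication, but a small sketch shows how it might look in code. The tier keys and the `price_per_1k_tokens` helper are hypothetical names chosen for illustration; the multipliers mirror the capability tier table, and `Decimal` is used so currency values stay exact.

```python
from decimal import Decimal

# Multipliers follow the capability tier table; the keys are illustrative names.
TIER_MULTIPLIERS = {
    "baseline": 1,
    "enhanced": 2,
    "professional": 3,
    "expert": 4,
    "premium": 5,
}


def price_per_1k_tokens(base_cost_per_1k: Decimal, tier: str) -> Decimal:
    """Price = C x M: base cost per 1K tokens times the tier multiplier."""
    try:
        multiplier = TIER_MULTIPLIERS[tier]
    except KeyError:
        raise ValueError(f"unknown tier: {tier!r}") from None
    return base_cost_per_1k * multiplier


base = Decimal("0.01")  # example base cost from the text, USD per 1K tokens
print(price_per_1k_tokens(base, "premium"))  # prints 0.05
```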
 ---
 ### 4. Service Tiers
 On top of token-based billing, KaizenAgentic can introduce **subscription layers** to cover operational support:
 * **Free Tier** → 1x baseline agents, capped usage, no optimization feedback.
 * **Pro Tier** → 2x–3x agents, includes monitoring dashboards.
 * **Enterprise Tier** → 4x–5x agents, includes dedicated KaizenAgent meta-optimization + SLAs.
 ---
 ### 5. Value Rationale
 * **Fair:** Always anchored in base token price (transparent to clients).
 * **Scalable:** Higher capability → higher multiplier → more value.
 * **Predictable:** Clients can forecast spend by capability tier, independent of vendor-specific LLM pricing changes.
 * **Flexible:** The base model stays transparent and interchangeable, which avoids base-model lock-in and supports various providers (ChatGPT, Claude, Cursor, etc.).
 xxx
--- a/wiki/RecommendedRepositoryLayout.md
+++ b/wiki/RecommendedRepositoryLayout.md
@@ -0,0 +1,66 @@
 RecommendedRepositoryLayout
 *Pragmatic directory layout for dev projects*
 ## 📁 Recommended Repository Layout
 Adopting a consistent repository layout is essential for **collaboration**, **maintainability**, and **scalability**. A well-structured project allows developers to quickly understand the codebase, simplifies automation, and reduces time spent searching for files. This convention separates source code from other assets and organizes project files logically.
 ---
 ## 🌳 Core Directory Structure
 The following directories represent a standard, universal layout for most projects.
 * **`src/`**: Contains the **source code**, the core files of your application.
 * **`dist/`**: Holds the **compiled or minified code** ready for production deployment.
 * **`tests/`**: A dedicated directory for all **unit, integration, and end-to-end tests**.
 * **`docs/`**: Stores all project **documentation**, including API guides and user manuals.
 * **`assets/`**: For **static assets** like images, fonts, and stylesheets.
 * **`vendor/`**: For **third-party libraries** not managed by a package manager.
 * **`lib/`**: For shared code and **libraries** created as part of the project.
 * **`bin/`**: Contains **executable scripts** for common tasks like setup, testing, or deployment.
 * **`.gitignore` and other dotfiles**: Essential configuration files that manage project-specific settings (e.g., Git ignores).
 ---
 ## 🗂️ A Deeper Dive: A Detailed Example
 For more complex projects, a **clean architecture** approach offers a robust and scalable structure. This example demonstrates how to organize a project within the `src/` directory to enforce separation of concerns. 
 * **`project_name/`**: The main package.
    * **`domain/`**: Houses the **core business logic** (models, entities) independent of any framework.
    * **`application/`**: Contains **services and use cases** that orchestrate the domain logic.
    * **`infrastructure/`**: Manages **external dependencies** like databases, third-party APIs, and logging.
    * **`interfaces/`**: Holds **user-facing interfaces**.
        * **`cli/`**: Logic for a command-line interface.
        * **`api/`**: **(Optional)** Logic for a web API.
    * **`shared/`**: Reusable utilities and types used across different layers.
 ---
 ## ⚙️ Root-Level Files and Directories
 The root of your repository should contain files and directories that provide high-level project information and setup instructions.
 * **`README.md`**: The primary documentation file for a project overview, installation, and usage.
 * **`LICENSE`**: Specifies the project's intellectual property license.
 * **`pyproject.toml`** / **`package.json`**: Defines project dependencies and configuration for package managers.
 * **`Makefile`** / **`justfile`**: A file for common development commands.
 * **`docs/`**: **(Recommended)** A top-level directory for all project documentation.
 * **`tests/`**: **(Recommended)** A top-level directory for all test files.
 ---
 ## 💡 Guiding Principles
 These rules explain the rationale behind this convention.
 * **Separation of Concerns**: The layout strictly separates source code (`src/`), documentation (`docs/`), and executable scripts (`bin/`) to improve clarity and maintainability.
 * **Encapsulation**: Moving logic to specific layers (`domain/`, `application/`) enforces a **clean architecture**, reducing dependencies and making the project easier to test.
 * **Idempotency**: This structure is predictable and repeatable, ensuring that creating a new project with this convention always yields a consistent result.
 * **Extensibility**: The layout is easily extensible. New interfaces or tools can be added without disrupting the core structure.
 xxx
--- a/wiki/RevenueModel.md
+++ b/wiki/RevenueModel.md
@@ -0,0 +1,79 @@
 RevenueModel
 *Monetization concept*
 # KaizenAgentic Revenue Model
 How KaizenAgentic captures value on top of the raw token costs for LLM providers.
 ### 1. Cost Basis
 * **C** = token price of underlying model (e.g. GPT-4o, Claude 3, etc.).
 * This is the *direct variable cost* passed through from the model vendor.
 ---
 ### 2. Markup via Capability Multipliers
 * KaizenAgentic defines capability tiers (2x–5x).
 * **Markup = Capability Multiplier – 1**
  * Example: 3x agent = 200% markup over base cost.
 ---
 ### 3. Gross Margin Structure
 | **Tier** | **Customer Price** | **Vendor Cost** | **KaizenAgentic Revenue (Gross Margin)** |
 | -------- | ------------------ | --------------- | ---------------------------------------- |
 | 2x Agent | 2C                 | C               | C (50% margin)                           |
 | 3x Agent | 3C                 | C               | 2C (67% margin)                          |
 | 4x Agent | 4C                 | C               | 3C (75% margin)                          |
 | 5x Agent | 5C                 | C               | 4C (80% margin)                          |
 Margins increase with capability tier → incentivizing customers to upgrade.
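The margin column follows directly from the multiplier. As a short derivation, with $M$ the capability multiplier and $C$ the base token cost (customer price $MC$, vendor cost $C$):

$$
\text{Gross Margin} = \frac{MC - C}{MC} = \frac{M - 1}{M}
$$

For $M = 2, 3, 4, 5$ this gives $1/2$, $2/3$, $3/4$, and $4/5$, that is 50%, about 66.7%, 75%, and 80%. The margin is independent of $C$, so it stays the same whichever underlying model is used.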
 ---
 ### 4. Additional Revenue Streams
 Beyond token usage markups:
 * **Subscription Access** (recurring):
  * Pro Tier (monthly): access to 2x–3x agents + monitoring dashboards.
  * Enterprise Tier (monthly/annual): 4x–5x agents + SLAs + private optimization loops.
 * **Professional Services**: Custom agent design, integration with developer workflows, consulting.
 * **Data Insights**: Aggregated anonymized performance benchmarks offered as an add-on (optional).
 ---
 ### 5. Example Economics
 Assume:
 * GPT-4o cost = $0.01 / 1K tokens
 * Customer runs 10M tokens / month with a 4x Agent
 * **Customer Price** = $0.04 / 1K tokens × 10M tokens = **$400**
 * **Vendor Cost** = $0.01 / 1K tokens × 10M tokens = **$100**
 * **Revenue (Gross Margin)** = $400 – $100 = **$300 (75%)**
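The same worked example can be reproduced in a few lines. This is a sketch under the assumptions stated in this section (a per-1K-token price and a flat multiplier); the function and key names are hypothetical.

```python
def monthly_economics(price_per_1k_tokens: float, tokens: int, multiplier: float) -> dict:
    """Return customer price, vendor cost, revenue, and margin for one month."""
    vendor_cost = price_per_1k_tokens * tokens / 1000
    customer_price = multiplier * vendor_cost
    revenue = customer_price - vendor_cost
    # Guard against zero usage so the margin is defined for an idle month.
    margin = revenue / customer_price if customer_price else 0.0
    return {
        "customer_price": customer_price,
        "vendor_cost": vendor_cost,
        "revenue": revenue,
        "margin": margin,
    }


if __name__ == "__main__":
    # 10M tokens at $0.01 per 1K tokens with a 4x agent.
    print(monthly_economics(0.01, 10_000_000, 4))
```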
 ---
 ### 6. Business Model Summary
 * **Transparent:** Customers always see pricing tied to base model cost.
 * **Scalable:** More usage → more revenue, with healthy margins.
 * **Tiered Value Capture:** Higher-capability agents capture proportionally more margin.
 * **Recurring Layer:** Subscriptions and enterprise add-ons stabilize revenue beyond token usage.
 ---
 👉 This makes KaizenAgentic operate like a **“talent agency margin model”**: you pay the “raw salary” (token cost to the model vendor), and KaizenAgentic earns its cut (markup × value of coaching/optimization).
--- a/workplans/kaizen-agentic-WP-0001-community-engagement.md
+++ b/workplans/kaizen-agentic-WP-0001-community-engagement.md
@@ -4,17 +4,17 @@ type: workplan
 title: "Community Engagement and Advanced Automation (v1.1.0)"
 domain: custodian
 repo: kaizen-agentic
-status: active
+status: completed
 owner: kaizen-agentic
 topic_slug: custodian
 state_hub_workstream_id: a43e92af-1cb4-4c55-8b74-19588e0ded20
 created: "2026-03-18"
-updated: "2026-03-18"
+updated: "2026-06-18"
 ---
 # KAIZEN-WP-0001 — Community Engagement and Advanced Automation
-**Status:** active
+**Status:** completed
 **Owner:** kaizen-agentic
 **Repo:** kaizen-agentic
 **Target version:** 1.1.0
@@ -28,23 +28,23 @@ to make kaizen-agentic easier to adopt, contribute to, and operate reliably.
 ### To Add
- [ ] T01 — Developer feedback mechanisms for easy repo user feedback collection
+- [x] T01 — Developer feedback mechanisms for easy repo user feedback collection
- [ ] T02 — Pre-commit hooks for automated code quality checks
+- [x] T02 — Pre-commit hooks for automated code quality checks
- [ ] T03 — CI/CD pipeline configuration for automated testing and deployment
+- [x] T03 — CI/CD pipeline configuration for automated testing and deployment
- [ ] T04 — Usage analytics and telemetry for agent effectiveness tracking
+- [x] T04 — Usage analytics and telemetry for agent effectiveness tracking
 ### To Refactor
- [ ] T05 — Enhanced error handling in CLI with more informative messages
+- [x] T05 — Enhanced error handling in CLI with more informative messages
- [ ] T06 — Performance optimization for large project installations
+- [x] T06 — Performance optimization for large project installations (lazy registry index, path-based install copy, Makefile tab fix)
 ### To Fix
- [ ] T07 — Cross-platform compatibility testing and fixes for Windows/macOS
+- [x] T07 — Cross-platform compatibility testing and fixes for Windows/macOS
 ### To Remove
- [ ] T08 — Remove remaining development scaffolding or temporary files
+- [x] T08 — Remove remaining development scaffolding or temporary files
 ## State Hub Task IDs
--- a/workplans/kaizen-agentic-WP-0003-measurement-loop.md
+++ b/workplans/kaizen-agentic-WP-0003-measurement-loop.md
@@ -0,0 +1,289 @@
 ---
 id: KAIZEN-WP-0003
 type: workplan
 title: "Measurement Loop: Metrics Convention, Collection, and Optimizer Integration"
 domain: custodian
 repo: kaizen-agentic
 status: completed
 owner: kaizen-agentic
 topic_slug: custodian
 state_hub_workstream_id: 36252a45-f360-4496-bf77-17b5dfb02767
 created: "2026-06-16"
 updated: "2026-06-18"
 ---
 # KAIZEN-WP-0003 — Measurement Loop: Metrics Convention, Collection, and Optimizer Integration
 **Status:** completed
 **Owner:** kaizen-agentic
 **Repo:** kaizen-agentic
 **Target version:** 1.1.0 (partial; remainder in WP-0001)
 ## Goal
 Close the kaizen feedback loop defined in `INTENT.md` and `wiki/AgentKaizenOptimizer.md`:
 agents produce **measurable, per-execution performance records** stored in project-scoped
 `.kaizen/metrics/`, the existing `OptimizationLoop` reads that data and generates
 evidence-based recommendations, and the Coach/optimizer meta-agents share a single
 improvement path.
 This workplan addresses the P0 gap from the INTENT gap analysis: strategic vision
 (memory + qualitative learning) exists; **quantitative measurement → refinement**
 does not.
 ---
 ## Background
 | Layer | State |
 |-------|-------|
 | `INTENT.md` | Requires measurable-by-default agents and evidence-based refinement |
 | `wiki/KaizenAgentTemplate.md` | Defines `metrics`, `idempotency`, `optimization` sections per agent |
 | `wiki/AgentKaizenOptimizer.md` | Specifies `.kaizen/metrics/` storage and optimizer behaviour |
 | `src/kaizen_agentic/optimization.py` | `OptimizationLoop` + `PerformanceMetrics` implemented, unit-tested, unwired |
 | Agency framework (WP-0002) | `.kaizen/agents/<name>/memory.md` + Coach brief — qualitative only |
 | WP-0001 T04 | Telemetry — overlaps; WP-0003 defines the convention; WP-0001 can adopt it |
 ---
 ## Part 1 — Metrics Convention and Storage
 Define the project-scoped metrics artifact alongside the existing memory convention
 (ADR-002).
 ### Location convention
 ```
 <project-root>/.kaizen/metrics/<agent-name>/
  executions.jsonl          # append-only per-execution records
  summary.json              # rolling aggregates (regenerated on write)
 ```
 Optimizer-specific aggregates (per `wiki/AgentKaizenOptimizer.md`):
 ```
 <project-root>/.kaizen/metrics/optimizer/
  analysis.json             # last run output + fingerprint
  recommendations.jsonl     # append-only recommendation history
 ```
 ### Execution record schema (minimum viable)
 ```json
 {
  "timestamp": "ISO-8601",
  "agent": "tdd-workflow",
  "session_id": "optional-uuid-or-hash",
  "execution_time_s": 0.0,
  "success": true,
  "quality_score": 0.0,
  "primary_metric": { "name": "...", "value": 0.0, "target": 0.0 },
  "metadata": {}
 }
 ```
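To make the convention concrete, here is a minimal sketch of appending one record to `executions.jsonl` and regenerating `summary.json` on write. It is not the project's `MetricsStore`; the function name and the summary field names (`count`, `success_rate`, `avg_quality_score`) are assumptions chosen for illustration.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_execution(project_root, agent: str, record: dict) -> dict:
    """Append one record to executions.jsonl and rebuild summary.json."""
    metrics_dir = Path(project_root) / ".kaizen" / "metrics" / agent
    metrics_dir.mkdir(parents=True, exist_ok=True)

    # Caller-supplied fields win over the defaults filled in here.
    entry = {"timestamp": datetime.now(timezone.utc).isoformat(), "agent": agent, **record}
    log_path = metrics_dir / "executions.jsonl"
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")

    # Regenerate the rolling aggregates from the full append-only log.
    with log_path.open(encoding="utf-8") as fh:
        rows = [json.loads(line) for line in fh if line.strip()]
    scores = [r["quality_score"] for r in rows if "quality_score" in r]
    summary = {
        "count": len(rows),
        "success_rate": sum(1 for r in rows if r.get("success")) / len(rows),
        "avg_quality_score": sum(scores) / len(scores) if scores else None,
    }
    (metrics_dir / "summary.json").write_text(json.dumps(summary, indent=2), encoding="utf-8")
    return summary


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        append_execution(root, "tdd-workflow", {"success": True, "quality_score": 0.9})
        print(append_execution(root, "tdd-workflow", {"success": False, "quality_score": 0.5}))
```

Rebuilding the summary from the whole log keeps the two files consistent at the cost of a full read per write, which seems acceptable for small per-agent logs.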
 ### Tasks
 - [x] T01 — Write ADR-004: project metrics convention (location, schema, lifecycle, retention, Helix Forge correlation)
 - [x] T02 — Implement `MetricsStore` in `src/kaizen_agentic/metrics.py` (append, read, summarise, prune by retention)
 - [x] T03 — Add `memory init` hook to scaffold `.kaizen/metrics/<agent>/` alongside memory (optional flag `--no-metrics`)
 - [x] T04 — Unit tests for `MetricsStore` (append idempotency key, summary regeneration, retention prune)
 ### Definition of done
 - ADR-004 accepted and referenced from `docs/agency-framework.md`
 - `MetricsStore` passes unit tests
 - `kaizen-agentic memory init <agent>` creates metrics scaffold by default
 ---
 ## Part 2 — Metrics CLI
 Expose metrics collection and inspection without requiring Python imports in agent
 sessions.
 ### Commands
 ```
 kaizen-agentic metrics record <agent>   # Append one execution record (stdin JSON or flags)
 kaizen-agentic metrics show <agent>     # Print summary + recent executions
 kaizen-agentic metrics list             # List agents with metrics in current project
 kaizen-agentic metrics export <agent>   # Dump executions.jsonl to stdout
 ```
 ### Options (record)
 - `--target / -t` — project root (default: cwd)
 - `--success / --failure` — boolean outcome shorthand
 - `--time` — execution time in seconds
 - `--quality` — quality score 0.0–1.0
 - `--json` — full record on stdin
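The option set above maps naturally onto a paired boolean flag plus a few typed values. The sketch below uses the standard-library `argparse` as a stand-in so it runs without dependencies; the project itself tests its CLI with Click, and this is not its implementation. The parser name, the `dest` names, and the rule that flags override fields read from stdin are all assumptions.

```python
import argparse
import io
import json
import sys


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="metrics-record-sketch")
    parser.add_argument("agent")
    parser.add_argument("--target", "-t", default=".", help="project root (default: cwd)")
    # --success and --failure write to the same destination; None means "not given".
    outcome = parser.add_mutually_exclusive_group()
    outcome.add_argument("--success", dest="success", action="store_true", default=None)
    outcome.add_argument("--failure", dest="success", action="store_false", default=None)
    parser.add_argument("--time", type=float, dest="execution_time_s")
    parser.add_argument("--quality", type=float, dest="quality_score")
    parser.add_argument("--json", action="store_true", dest="from_stdin")
    return parser


def record_from_args(argv, stdin=None) -> dict:
    """Build an execution record from flags, optionally seeded by JSON on stdin."""
    args = build_parser().parse_args(argv)
    record = json.load(stdin or sys.stdin) if args.from_stdin else {}
    record["agent"] = args.agent
    for key in ("success", "execution_time_s", "quality_score"):
        value = getattr(args, key)
        if value is not None:
            record[key] = value  # assumption: explicit flags override stdin fields
    quality = record.get("quality_score")
    if quality is not None and not 0.0 <= quality <= 1.0:
        raise ValueError("quality score must be between 0.0 and 1.0")
    return record


if __name__ == "__main__":
    print(record_from_args(["tdd-workflow", "--success", "--time", "12.5", "--quality", "0.9"]))
    print(record_from_args(["tdd-workflow", "--json"], stdin=io.StringIO('{"success": true}')))
```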
 ### Tasks
 - [x] T05 — Implement `metrics` CLI command group (record, show, list, export)
 - [x] T06 — Integrate `metrics record` into session-close protocol template for pilot agents
 - [x] T07 — CLI tests for metrics commands (click.testing, temp project dir)
 - [x] T08 — Update `docs/CLI_CHEAT_SHEET.md` and `docs/agency-framework.md` with metrics section
 ### Definition of done
 - All four metrics commands work against a test project with `.kaizen/metrics/`
 - Session-close template documents the `metrics record` one-liner for pilot agents
 - CLI cheat sheet updated
 ---
 ## Part 3 — Wire OptimizationLoop to Project Metrics
 Connect the existing Python optimization infrastructure to real project data.
 ### Tasks
 - [x] T09 — Add `OptimizationLoop.from_metrics_store(store)` factory that loads `PerformanceMetrics` from executions
 - [x] T10 — Implement `kaizen-agentic metrics optimize [agent]` — run analysis, print recommendations, write `optimizer/analysis.json`
 - [x] T11 — Consolidate `agent-optimization.md` and `agent-agent-optimization.md` into single canonical `optimization` agent; update registry
 - [x] T12 — Update `agent-optimization.md` session protocol to invoke `metrics optimize` and reference ADR-004
 - [x] T13 — Unit + integration tests: synthetic executions → recommendations → non-empty output
 ### Definition of done
 - `kaizen-agentic metrics optimize` produces recommendations when ≥10 execution records exist (per wiki minimum sample size)
 - Single canonical optimization meta-agent in registry
 - Tests cover insufficient-data and sufficient-data paths
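The insufficient-data and sufficient-data paths can be illustrated with simple statistics. This sketch is not the real `OptimizationLoop`; it only applies the 10-record minimum named above and the starting-point thresholds from this workplan's notes (30s execution time, 0.8 success rate). The function name, the recommendation wording, and the choice to return an empty list when data is insufficient are assumptions.

```python
from statistics import mean

MIN_SAMPLES = 10             # minimum sample size cited in the definition of done
MAX_EXECUTION_TIME_S = 30.0  # starting-point threshold from the workplan notes
MIN_SUCCESS_RATE = 0.8       # starting-point threshold from the workplan notes


def recommend(executions: list[dict]) -> list[str]:
    """Return plain-text recommendations, or [] when there is too little data."""
    if len(executions) < MIN_SAMPLES:
        return []

    recommendations = []
    success_rate = sum(1 for e in executions if e.get("success")) / len(executions)
    if success_rate < MIN_SUCCESS_RATE:
        recommendations.append(
            f"Success rate {success_rate:.0%} is below {MIN_SUCCESS_RATE:.0%}; review failure cases."
        )
    avg_time = mean(e.get("execution_time_s", 0.0) for e in executions)
    if avg_time > MAX_EXECUTION_TIME_S:
        recommendations.append(
            f"Average execution time {avg_time:.1f}s exceeds {MAX_EXECUTION_TIME_S:.0f}s."
        )
    return recommendations


if __name__ == "__main__":
    sample = [{"success": i % 2 == 0, "execution_time_s": 40.0} for i in range(10)]
    for line in recommend(sample):
        print(line)
```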
 ---
 ## Part 4 — Bridge Coach, Memory, and Metrics
 Unify qualitative memory and quantitative metrics in the orientation path.
 ### Tasks
 - [x] T14 — Extend `memory brief` to include metrics summary for target agent (recent success rate, avg quality, trend arrow)
 - [x] T15 — Extend `agent-coach.md` to reference metrics context in synthesis instructions
 - [x] T16 — E2e test: populate memory + metrics for two agents → `memory brief` includes both qualitative and quantitative sections
 ### Definition of done
 - `memory brief tdd-workflow` output includes a `## Performance Summary` block when metrics exist
 - E2e test passes
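A `## Performance Summary` block with a success rate, average quality, and trend arrow could be rendered along these lines. The trend rule used here (the last `window` runs compared with all earlier runs, with a 5-point dead band) is an assumption made for illustration; the workplan does not define how the arrow is computed, and the function name is hypothetical.

```python
from statistics import mean


def performance_summary(executions: list[dict], window: int = 5) -> str:
    """Render a markdown summary block; returns "" when there are no records."""
    if not executions:
        return ""

    outcomes = [1.0 if e.get("success") else 0.0 for e in executions]
    recent, earlier = outcomes[-window:], outcomes[:-window]
    recent_rate = mean(recent)

    if not earlier:
        arrow = "→"  # not enough history to call a trend
    else:
        delta = recent_rate - mean(earlier)
        arrow = "↑" if delta > 0.05 else "↓" if delta < -0.05 else "→"

    lines = ["## Performance Summary", f"- Recent success rate: {recent_rate:.0%} {arrow}"]
    scores = [e["quality_score"] for e in executions if "quality_score" in e]
    if scores:
        lines.append(f"- Average quality: {mean(scores):.2f}")
    return "\n".join(lines)


if __name__ == "__main__":
    history = [{"success": False, "quality_score": 0.8}] * 5 + [{"success": True, "quality_score": 0.8}] * 5
    print(performance_summary(history))
```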
 ---
 ## Part 5 — Pilot Agent and Template Conformance
 Prove the loop end-to-end on one agent before fleet-wide rollout.
 **Pilot agent:** `tdd-workflow` (high usage, clear success criteria in existing prompt)
 ### Tasks
 - [x] T17 — Add `metrics` section to `agent-tdd-workflow.md` frontmatter (primary: test-pass rate; secondary: cycle time)
 - [x] T18 — Add session-close step: invoke `kaizen-agentic metrics record tdd-workflow` with session outcome
 - [x] T19 — Document pilot in `wiki/AboutKaizenAgents.md` as reference implementation
 - [x] T20 — E2e test: two simulated tdd-workflow sessions → metrics accumulate → optimize produces recommendation
 ### Definition of done
 - tdd-workflow is the documented reference for metrics-enabled agents
 - Full loop demonstrated in e2e test: record → show → optimize → brief
 ---
 ## Part 6 — Packaging and Orientation
 Close distribution and documentation gaps surfaced in gap analysis.
 ### Tasks
 - [x] T21 — Sync missing 4 agents into `src/kaizen_agentic/data/agents/` (coach, sys-medic, scope-analyst, optimization)
 - [x] T22 — Update `README.md` Getting Oriented to link `INTENT.md` and `wiki/` (SCOPE.md already updated)
 - [x] T23 — Update `.claude/rules/architecture.md` agent table (20 agents, meta category, sys-medic, coach)
 - [x] T24 — CHANGELOG.md entry for metrics convention and CLI
 ### Definition of done
 - `pip install` / packaged data includes all 20 agents
 - README orientation path matches SCOPE.md
 - architecture.md agent count accurate
 ---
 ## Sequencing
 ```
 Part 1 (T01–T04)  ──→  Part 2 (T05–T08)  ──→  Part 3 (T09–T13)
                                                    │
                     Part 4 (T14–T16)  ←────────────┘
                            │
                     Part 5 (T17–T20)  ──→  Part 6 (T21–T24)
 ```
 Parts 1–2 are blocking. Part 3 depends on storage + CLI. Parts 4–5 can overlap
 once the Part 3 factory exists. Part 6 can run in parallel, except T21, which needs
 the final agent consolidation from T11.
 Estimated effort: 4–6 sessions.
 ---
 ## Out of Scope (this workplan)
 - Full `wiki/KaizenAgentTemplate.md` conformance for all 20 agents (future workplan)
 - KaizenGuidance codemod pipeline (`wiki/KaizenGuidance.md`)
 - Scheduled/automated optimizer runs (cron, activity-core integration) — convention only
 - WP-0001 CI/CD, PyPI publication, cross-platform testing
 - ML-based pattern detection (pandas/sklearn in wiki spec) — simple statistics first
 ---
 ## Success Criteria
 A reader of `INTENT.md` can point to this repo and say:
 1. Agents **can** record measurable per-execution outcomes in a standard location.
 2. The optimization loop **does** read real project data and produce recommendations.
 3. Coach orientation **includes** performance context, not only qualitative memory.
 4. At least one agent (tdd-workflow) demonstrates the full measure → analyse → orient cycle.
 ---
 ## State Hub Task IDs
 | Code | UUID |
 |------|------|
 | T01 | 4e7b0fd2-38c0-46aa-84a7-bb18366b8c7c |
 | T02 | eeaa99c7-d7a7-403b-a013-364cba45a663 |
 | T03 | 247c097f-de89-4383-930c-35ee66de9b36 |
 | T04 | 3aa14026-6ee3-4384-b409-11300c1302f0 |
 | T05 | 6b505d29-7d2e-44a2-a4b7-1fe82884390c |
 | T06 | 84f2a357-f2dd-4fc7-96b6-a4e80d5467a7 |
 | T07 | 8e9ee64b-b7c4-4dff-ac6e-988fd47ef95d |
 | T08 | 4c41e0db-d5d8-4a1b-b346-06ad004edf4a |
 | T09 | 0b374439-6eca-4754-8e15-2a7eece0cd27 |
 | T10 | db87a09b-0252-495c-a771-a43b4b98f820 |
 | T11 | 73cb7d73-6fc6-42a9-97aa-d33cdf9ee363 |
 | T12 | c127eca7-7394-42db-ba5e-721aef0ccb76 |
 | T13 | f208dc9f-cdf7-47e3-9c03-09097e46eee9 |
 | T14 | d01f969c-bbb1-4eca-a4f1-d79d5c867b35 |
 | T15 | 67f791a4-fced-4986-a331-7eb4ea47fe6e |
 | T16 | 1fb89b54-8bd2-40bf-9a71-04693cb9f695 |
 | T17 | 1d471a7a-9a98-4805-903e-b4a2b8153717 |
 | T18 | abb387f1-86ce-4b9b-a516-2d4efb6aca4c |
 | T19 | 67fbc26e-a57d-4133-96e6-3d2cdbd10dc0 |
 | T20 | fbdd7c8b-e122-48d9-8c8f-de9f82d025e3 |
 | T21 | 9662bcec-34fe-451b-b61f-5d11b9574576 |
 | T22 | 422aae43-5697-4a00-86e9-1569baf09422 |
 | T23 | ba6b3411-d330-4a58-8cd0-62b4fbef8c5f |
 | T24 | 748be9f3-f6ac-4f26-a844-6330268935b6 |
 **Hub workstream:** `kaizen-wp-0003-measurement-loop` (`36252a45-f360-4496-bf77-17b5dfb02767`)
 ---
 ## Notes
 - Retention default: 180 days (per `wiki/AgentKaizenOptimizer.md`); override via project config in a later iteration
 - WP-0001 T04 (telemetry) should consume ADR-004 schema rather than inventing a parallel format
 - `OptimizationLoop` threshold constants (30s execution, 0.8 success rate) are starting points; expose in config later
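The 180-day retention default in the notes above amounts to a timestamp filter over execution records. The following is a sketch only, assuming ISO-8601 timestamps as in the execution record schema and treating timestamps without an offset as UTC; `prune_by_retention` is a hypothetical name and not necessarily how `MetricsStore` prunes.

```python
from datetime import datetime, timedelta, timezone


def prune_by_retention(records: list[dict], retention_days: int = 180, now=None) -> list[dict]:
    """Keep only records whose timestamp falls inside the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)

    kept = []
    for record in records:
        stamp = datetime.fromisoformat(record["timestamp"])
        if stamp.tzinfo is None:
            # Assumption: naive timestamps are interpreted as UTC.
            stamp = stamp.replace(tzinfo=timezone.utc)
        if stamp >= cutoff:
            kept.append(record)
    return kept


if __name__ == "__main__":
    reference = datetime(2026, 6, 18, tzinfo=timezone.utc)
    sample = [
        {"timestamp": "2026-06-16T01:35:27+02:00"},
        {"timestamp": "2025-06-18T00:00:00+00:00"},
    ]
    print(prune_by_retention(sample, now=reference))
```

Passing `now` explicitly keeps the function deterministic in tests.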
--- a/workplans/kaizen-agentic-WP-0004-ecosystem-integration.md
+++ b/workplans/kaizen-agentic-WP-0004-ecosystem-integration.md
@@ -0,0 +1,190 @@
 ---
 id: KAIZEN-WP-0004
 type: workplan
 title: "Ecosystem Integration: Helix Forge, activity-core, and artifact-store"
 domain: custodian
 repo: kaizen-agentic
 status: completed
 owner: kaizen-agentic
 topic_slug: custodian
 state_hub_workstream_id: 76be7294-e201-4074-91c0-6421992470fe
 created: "2026-06-16"
 updated: "2026-06-18"
 ---
 # KAIZEN-WP-0004 — Ecosystem Integration: Helix Forge, activity-core, and artifact-store
 **Status:** completed
 **Owner:** kaizen-agentic
 **Repo:** kaizen-agentic
 **Depends on:** KAIZEN-WP-0003 Part 3 (metrics CLI + `metrics optimize` operational)
 ## Goal
 Compose KaizenAgentic with adjacent ecosystem repos so INTENT's measurement and
 improvement vision spans **project** and **fleet** layers without duplicating
 capabilities or violating repo boundaries.
 Primary integrations: **agentic-resources** (Helix Forge), **activity-core**
 (scheduled triggers), **artifact-store** (evidence retention). Secondary
 integrations (info-tech-canon, kontextual-engine) are Part 4 stretch goals.
 Reference: `wiki/EcosystemIntegration.md`, `history/2026-06-16-ecosystem-assessment.md`
 ---
 ## Part 1 — Helix Forge Correlation (agentic-resources)
 Wire project metrics (ADR-004) to fleet session metrics without re-implementing
 session ingestion.
 ### Tasks
 - [x] T01 — Document correlation contract in `agentic-resources` (cross-repo PR or shared doc link from both repos)
 - [x] T02 — Add optional `helix_session_uid` population to `metrics record` when env `HELIX_SESSION_UID` is set
 - [x] T03 — Add `kaizen-agentic metrics correlate` — lookup Helix digest summary by UID (read-only adapter stub if Helix API not ready)
 - [x] T04 — Integration test: synthetic project record with `helix_session_uid` round-trips through show/brief
 - [x] T05 — Update `wiki/EcosystemIntegration.md` with worked correlation example
 ### Definition of done
 - Project execution records can carry Helix correlation fields per ADR-004
 - Documentation is bidirectional (kaizen-agentic + agentic-resources reference each other)
 - No session JSONL ingestion code in kaizen-agentic
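The optional correlation in T02 can be sketched as a small enrichment step that copies `HELIX_SESSION_UID` from the environment into the record when it is set. Placing the field at the top level of the record is an assumption here; ADR-004 is the authoritative field mapping. The function name is hypothetical.

```python
import os


def with_helix_correlation(record: dict, environ=None) -> dict:
    """Return a copy of the record, adding helix_session_uid when the env var is set."""
    env = os.environ if environ is None else environ
    enriched = dict(record)
    uid = env.get("HELIX_SESSION_UID")
    if uid:
        enriched["helix_session_uid"] = uid
    return enriched


if __name__ == "__main__":
    base = {"agent": "tdd-workflow", "success": True}
    print(with_helix_correlation(base, {"HELIX_SESSION_UID": "example-uid"}))
    print(with_helix_correlation(base, {}))
```

Because the field is only added when the variable is present, local-only workflows produce records unchanged.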
 ---
 ## Part 2 — activity-core Triggers
 Define ActivityDefinitions for recurring kaizen operations.
 ### Tasks
 - [x] T06 — Draft ActivityDefinition: weekly `metrics optimize` on repos with `.kaizen/metrics/`
 - [x] T07 — Draft ActivityDefinition: post-install metrics scaffold validation (`memory init` check)
 - [x] T08 — Draft ActivityDefinition: success_rate below 0.8 → issue-core review task
 - [x] T09 — Document ActivityDefinition paths and activation contract in `docs/INTEGRATION_PATTERNS.md`
 - [x] T10 — Smoke test: manual activation against a test repo with populated metrics
 ### Definition of done
 - Three ActivityDefinition markdown files committed (location per activity-core convention)
 - kaizen-agentic docs describe how activity-core triggers map to CLI commands
 - No scheduling code in kaizen-agentic
 ---
 ## Part 3 — artifact-store Evidence Retention
 Persist optimizer outputs as registered artifact packages.
 ### Tasks
 - [x] T11 — Define artifact package manifest for optimizer run (`analysis.json` + `recommendations.jsonl`)
 - [x] T12 — Add `kaizen-agentic metrics publish` — register optimizer output with artifact-store API (configurable endpoint)
 - [x] T13 — Map retention class `raw-evidence` (180d) in publish manifest metadata
 - [x] T14 — Integration test with artifact-store local backend (skip if service unavailable; mark `@pytest.mark.integration`)
 - [x] T15 — Document publish workflow in `docs/agency-framework.md` metrics section
 ### Definition of done
 - Optimizer outputs can be registered as artifact packages when artifact-store is reachable
 - Retention metadata matches ADR-004 default
 - Publish is optional — local-only workflows still work without artifact-store
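As an illustration of T11 and T13, a publish manifest might list the optimizer files with checksums and carry the retention class as metadata. The actual artifact-store manifest format is not shown in this workplan, so every field name below (`package_type`, `retention_class`, `retention_days`, `files`) is hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path

OPTIMIZER_FILES = ("analysis.json", "recommendations.jsonl")


def build_publish_manifest(optimizer_dir) -> dict:
    """Describe the optimizer outputs present in a directory (hypothetical format)."""
    files = []
    for name in OPTIMIZER_FILES:
        path = Path(optimizer_dir) / name
        if path.exists():
            data = path.read_bytes()
            files.append(
                {"name": name, "sha256": hashlib.sha256(data).hexdigest(), "size_bytes": len(data)}
            )
    return {
        "package_type": "kaizen-optimizer-run",  # hypothetical identifier
        "retention_class": "raw-evidence",
        "retention_days": 180,
        "files": files,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        (Path(d) / "analysis.json").write_text("{}", encoding="utf-8")
        print(build_publish_manifest(d))
```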
 ---
 ## Part 4 — Canon and Knowledge (stretch)
 Secondary integrations for template conformance and knowledge asset lifecycle.
 ### Tasks
 - [x] T16 — Map `wiki/KaizenAgentTemplate.md` sections to info-tech-canon profile outline (design doc only)
 - [x] T17 — Draft one InfoTechCanon-style agent brief for `tdd-workflow` pilot
 - [x] T18 — Spike: kontextual-engine ingestion manifest for `wiki/` directory (design note, no runtime dependency)
 - [x] T19 — Update `history/2026-06-16-ecosystem-assessment.md` with Part 4 outcomes
 ### Definition of done
 - Design artifacts committed; no hard dependency on info-tech-canon or kontextual-engine services
 - tdd-workflow brief serves as reference for fleet-wide brief rollout (future WP)
 ---
 ## Sequencing
 ```
 WP-0003 Part 3 complete
        │
        ▼
 Part 1 (T01–T05)  ──→  Part 2 (T06–T10)
        │                      │
        └──────────┬───────────┘
                   ▼
            Part 3 (T11–T15)
                   │
                   ▼
            Part 4 (T16–T19)  [stretch]
 ```
 Part 1 can start once `metrics record` and `metrics optimize` exist.
 Parts 2–3 can overlap. Part 4 is non-blocking.
 Estimated effort: 3–5 sessions after WP-0003 Part 3.
 ---
 ## Out of Scope
 - Cloning or implementing tele-mcp (assess separately)
 - phase-memory graph migration (future WP)
 - Full KaizenGuidance codemod pipeline
 - Owning activity-core, artifact-store, or agentic-resources code
 ---
 ## Success Criteria
 1. The two-layer measurement model is documented, implemented at the correlation layer,
   and operable without repo merges.
 2. Recurring kaizen checks can be triggered via activity-core without custom cron.
 3. Optimizer evidence can be preserved in artifact-store when configured.
 4. Canon/knowledge integration has a clear design path for later work.
 ---
 ## State Hub Task IDs
 | Code | UUID |
 |------|------|
 | T01 | f365d19e-9619-4453-bebf-f1fd596b1bd1 |
 | T02 | e7f47683-5957-49db-bcbd-3aa47f44a073 |
 | T03 | 6ef8ba99-7d0c-44f4-835d-7a66e9d55984 |
 | T04 | 9875422c-a54b-40f1-a444-6b485a9e57d6 |
 | T05 | 0dc33d13-0e0b-4336-a7ad-371fc533b823 |
 | T06 | dbaa5f46-f66a-4a74-b4a0-97978e47d1c3 |
 | T07 | 161a264a-8f70-4e37-a854-bd5a76a0e54b |
 | T08 | 3b58ad38-839c-436a-8d97-ef5a8f9beefe |
 | T09 | a004b60f-4e8f-4881-b088-229ac9ab242f |
 | T10 | 84866bf1-5830-470d-87a5-9786222332c2 |
 | T11 | 033a19db-fbd2-411f-9d2e-779d210400d4 |
 | T12 | 54517f2b-23e3-433b-a483-c59227625dbc |
 | T13 | 3b378789-a761-4472-b072-a346541be239 |
 | T14 | a3566713-db58-4519-b9c4-5003421c1f1e |
 | T15 | 5d8255aa-fd7a-4fe6-bce2-3a176f954c7f |
 | T16 | 852c9cbf-0b0c-4f23-8594-905ca280c268 |
 | T17 | 62e05097-9033-401d-bbe0-d5d773da50fe |
 | T18 | cd6962c7-aaed-4d7d-81de-37c0e3ed715e |
 | T19 | 2c1f66f5-e6ab-4e19-88ca-818acb15a706 |
 **Hub workstream:** `kaizen-wp-0004-ecosystem-integration` (`76be7294-e201-4074-91c0-6421992470fe`)
 ---
 ## Notes
 - ADR-004 Helix Forge correlation section is the authoritative field mapping
 - WP-0001 T04 (telemetry) should evaluate tele-mcp as adapter candidate
 - activity-core ActivityDefinitions live in activity-core repo per ACT-ADR-002/003;
  kaizen-agentic commits reference copies or links under `docs/integrations/`
Author	SHA1	Message	Date
tegwick	68555ec2f1	fix: release-check lint fixes for 1.1.0 publish Wrap long lines for flake8, rename extensions remove command handler to avoid Click shadowing, and drop unused migration imports.	2026-06-16 02:14:07 +02:00
tegwick	22ee93e125	WP-0001 complete: v1.1.0 lazy registry and install performance Lazy-load agent registry (frontmatter index, parse on demand), copy agents by path during install, fix Makefile template tab lint issue, add registry performance tests, bump to 1.1.0, document CLI reinstall after pull.	2026-06-16 02:06:43 +02:00
tegwick	80c60ebd7a	WP-0001: feedback channels, CI, pre-commit, telemetry docs Add kaizen-agentic feedback CLI, Gitea issue templates, CI workflow, pre-commit hooks, FEEDBACK/TELEMETRY docs, and cross-platform path tests. Improve CLI registry error messages; remove agents_backup scaffolding. Apply black formatting across src/tests for CI consistency. State Hub message sent to agentic-resources for Helix correlation doc link.	2026-06-16 01:58:07 +02:00
tegwick	79883aa25b	Add capability registry scaffold (REUSE-WP-0014-T05 B03)	2026-06-16 01:53:56 +02:00
tegwick	b48a2102d7	WP-0004: ecosystem integration complete Add Helix Forge correlation (HELIX_SESSION_UID env, metrics correlate), artifact-store publish (metrics publish), activity-core ActivityDefinition references, integration patterns docs, and canon/knowledge design artifacts.	2026-06-16 01:53:01 +02:00
tegwick	4a9c2d9bea	WP-0003 Part 6: packaging sync and docs close-out Sync coach, sys-medic, scope-analyst, optimization, and updated tdd-workflow to packaged data (20 agents). Update architecture.md, README orientation, and CHANGELOG for the metrics loop. Mark WP-0003 completed.	2026-06-16 01:49:27 +02:00
tegwick	fd2edfbe6c	WP-0003 Part 5: tdd-workflow metrics pilot Add metrics frontmatter and session-close recording to tdd-workflow, document the reference implementation in wiki/AboutKaizenAgents.md, and add an e2e test covering record → show → optimize → brief.	2026-06-16 01:48:43 +02:00
tegwick	04fdc249f5	Bridge Coach memory brief with project metrics summaries. Add Performance Summary block to memory brief, document metrics synthesis in agent-coach, and add e2e and CLI tests for qualitative plus quantitative briefs.	2026-06-16 01:46:51 +02:00
tegwick	2711a3ebcc	Wire OptimizationLoop to project metrics and add metrics optimize. Add from_metrics_store factory, OptimizerStore persistence, metrics optimize CLI, consolidate duplicate optimization agent, and add integration tests.	2026-06-16 01:41:26 +02:00
tegwick	97b7eb8cba	Add metrics CLI for project-scoped agent performance records. Implement record, show, list, and export commands; document session-close protocol template; extend cheat sheet and agency-framework docs; add CLI tests.	2026-06-16 01:38:42 +02:00
tegwick	5cd3da3166	Implement MetricsStore for project-scoped agent metrics. Add ADR-004 storage layer with append-only executions, summary regeneration, idempotency keys, and retention pruning. Wire memory init to scaffold .kaizen/metrics/ by default and add unit tests.	2026-06-16 01:35:27 +02:00
tegwick	bd74d7d122	Document measurement loop plan and ecosystem integration strategy. Persist INTENT and ecosystem assessments in history/, add ADR-004 for project metrics with Helix Forge correlation, and register WP-0003 and WP-0004 workplans with State Hub. Update SCOPE, README, and agency-framework docs to reflect the two-layer measurement model.	2026-06-16 01:34:13 +02:00
tegwick	71ef5f4734	Added project documentation in wiki and established INTENT.md	2026-06-16 00:58:43 +02:00
tegwick	95b729cc53	feat(agents): add Provided Capabilities section to scope-analyst template Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-25 00:06:56 +01:00
Bernd Worsch	0a228826fb	feat(agents): add optimization meta-agent and ignore backup dirs Add agents/agent-optimization.md — the Kaizen Optimizer meta-agent for analyzing and improving agent performance. Also update .gitignore to suppress agents_backup_*/ directories produced by optimization scripts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 00:31:45 +00:00