Document measurement loop plan and ecosystem integration strategy.
Persist INTENT and ecosystem assessments in history/, add ADR-004 for project metrics with Helix Forge correlation, and register WP-0003 and WP-0004 workplans with State Hub. Update SCOPE, README, and agency-framework docs to reflect the two-layer measurement model.
12
README.md
@@ -89,6 +89,18 @@ kaizen-agentic memory show project-management
|
||||
|
||||
See [docs/agency-framework.md](docs/agency-framework.md) for the full model.
|
||||
|
||||
## Orientation
|
||||
|
||||
Read in this order for strategic context:
|
||||
|
||||
1. [INTENT.md](INTENT.md) — purpose, boundaries, design principles
2. [wiki/KaizenAgenticMission.md](wiki/KaizenAgenticMission.md) — product narrative
3. [wiki/EcosystemIntegration.md](wiki/EcosystemIntegration.md) — ecosystem composition
4. [SCOPE.md](SCOPE.md) — repository boundaries and current state
5. [history/](history/) — persisted assessments and gap analyses
|
||||
|
||||
Active workplans: [WP-0003](workplans/kaizen-agentic-WP-0003-measurement-loop.md) (measurement loop), [WP-0004](workplans/kaizen-agentic-WP-0004-ecosystem-integration.md) (ecosystem integration).
|
||||
|
||||
## Features
|
||||
|
||||
- **18 Specialized Agents**: Project management, testing, code quality, infrastructure, meta
|
||||
|
||||
82
SCOPE.md
@@ -3,33 +3,35 @@
|
||||
> This file helps you quickly understand what this repository is about,
|
||||
> when it is relevant, and when it is not.
|
||||
> It is intentionally lightweight and may be incomplete.
|
||||
> For strategic purpose and boundaries, see `INTENT.md`.
|
||||
|
||||
---
|
||||
|
||||
## One-liner
|
||||
|
||||
AI agency framework: specialized agent personas (markdown instruction sets), project-scoped memory, and CLI tooling for deploying informed agents into Claude Code sessions.
|
||||
KaizenAgentic: a digital talent agency framework — agent personas, project memory, measurable improvement loops, and CLI tooling for deploying continuously refining AI coding agents into Claude Code sessions.
|
||||
|
||||
---
|
||||
|
||||
## Core Idea
|
||||
|
||||
Kaizen-agentic packages recurring development workflows (TDD, refactoring, project management, infrastructure health) as named agent personas you invoke in Claude Code. The agency layer adds **project-scoped memory** (`.kaizen/agents/<name>/memory.md`) so agents accumulate knowledge across sessions, plus a **Coach** meta-agent that synthesises cross-agent context for new deployments. The kaizen loop — measure, analyse, refine — is embodied in agent definitions and an `OptimizationLoop` Python pattern, even though runtime execution remains Claude's responsibility.
|
||||
This repo is the canonical home for the **KaizenAgentic** operating model (`INTENT.md`, `wiki/`). It packages recurring development workflows as named agent personas invoked in Claude Code. The **agency layer** adds project-scoped memory (`.kaizen/agents/<name>/memory.md`) and a **Coach** meta-agent for cross-agent orientation. The **kaizen loop** — measure, analyse, refine — is defined in `wiki/` and partially implemented: `OptimizationLoop` exists in Python, but per-execution metrics collection and optimizer integration are in progress (WP-0003). Runtime execution remains Claude Code's responsibility.
|
||||
|
||||
---
|
||||
|
||||
## In Scope
|
||||
|
||||
- **21 agent definitions** (`agents/agent-*.md`) — markdown persona instruction sets with YAML frontmatter
|
||||
- **Strategic framing**: `INTENT.md` (purpose, boundaries, design principles) and `wiki/` (mission, agent template, guidance model, brand/pricing)
|
||||
- **21 agent definitions** (`agents/agent-*.md`) — markdown persona instruction sets with YAML frontmatter (reference fleet; see `INTENT.md` boundaries)
|
||||
- **Agent categories**: project-management, development-process, code-quality, infrastructure, testing, documentation, meta
|
||||
- **Agency framework**: project memory convention, session-start/close protocols, Coach meta-agent (`agent-coach.md`)
|
||||
- **Protocol runbooks** (`agents/protocols/<agent>/<slug>.md`) — procedural checklists distinct from agent prompts (sys-medic k3s assessment is the first example)
|
||||
- **CLI tooling** (`kaizen-agentic`): `init`, `install`, `update`, `remove`, `list`, `status`, `validate`, `templates`, `detect`, `migrate`, `extensions`, `memory` (show/init/brief/clear), `protocols` (list/show)
|
||||
- **Project templates** (python-basic, python-web, python-cli, python-data, comprehensive) — agent bundles defined in registry code, not separate template directories
|
||||
- **Agency framework**: project memory convention (ADR-002), session-start/close protocols, Coach meta-agent (`agent-coach.md`)
|
||||
- **Protocol runbooks** (`agents/protocols/<agent>/<slug>.md`) — procedural checklists distinct from agent prompts
|
||||
- **CLI tooling** (`kaizen-agentic`): `init`, `install`, `update`, `remove`, `list`, `status`, `validate`, `templates`, `detect`, `migrate`, `extensions`, `memory` (show/init/brief/clear), `protocols` (list/show); `metrics` commands planned in WP-0003
|
||||
- **Project templates** (python-basic, python-web, python-cli, python-data, comprehensive) — agent bundles in registry code
|
||||
- **Python framework** (`src/kaizen_agentic/`): `Agent`/`AgentConfig`, `AgentRegistry`, `AgentInstaller`, `OptimizationLoop`/`PerformanceMetrics`, detection/migration/extensions
|
||||
- **Packaged agent data** (`src/kaizen_agentic/data/agents/`) — 17 agents bundled for pip installs (lags `agents/` by 4 agents; see Notes)
|
||||
- **Custodian MCP integration** (owned by `the-custodian`): `list_kaizen_agents()` and `get_kaizen_agent()` resolve this repo via `host_paths`
|
||||
- **ADRs and workplans** documenting memory, protocols, and workplan conventions
|
||||
- **Packaged agent data** (`src/kaizen_agentic/data/agents/`) — 17 agents bundled for pip installs (lags `agents/` by 4; see Notes)
|
||||
- **Custodian MCP integration** (owned by `the-custodian`): `list_kaizen_agents()` and `get_kaizen_agent()`
|
||||
- **ADRs and workplans** for memory, protocols, workplan, and metrics conventions
|
||||
|
||||
---
|
||||
|
||||
@@ -39,25 +41,26 @@ Kaizen-agentic packages recurring development workflows (TDD, refactoring, proje
|
||||
- LLM orchestration, scheduling, or multi-agent debate systems
|
||||
- Project-specific implementation (agents guide work; they do not build the target software)
|
||||
- Custodian State Hub, MCP server code, or cross-domain governance (consumed, not owned)
|
||||
- Full KaizenGuidance codemod pipeline (vision in `wiki/KaizenGuidance.md`; not yet implemented)
|
||||
- PyPI publication pipeline (v1.0.2 released locally; public PyPI distribution still pending)
|
||||
|
||||
---
|
||||
|
||||
## Relevant When
|
||||
|
||||
- Understanding **why** KaizenAgentic exists and what it must not become (`INTENT.md`)
|
||||
- Exploring the conceptual model: agent template, optimizer, guidance, composable capabilities (`wiki/`)
|
||||
- Starting a guided development workflow (TDD, refactoring, testing, requirements, scope analysis)
|
||||
- Deploying agents into a project with persistent cross-session memory
|
||||
- Briefing a newly deployed agent using accumulated project knowledge (Coach / `memory brief`)
|
||||
- Scaffolding a new project with consistent structure and agent bundles
|
||||
- Looking up available agent personas (CLI, MCP, or `agents/` directory)
|
||||
- Contributing or refining an agent persona or protocol runbook
|
||||
- Deploying agents with persistent cross-session memory or Coach-mediated orientation
|
||||
- Scaffolding projects with agent bundles; looking up personas via CLI or Custodian MCP
|
||||
- Contributing agent personas, protocol runbooks, or improvement-loop conventions
|
||||
|
||||
---
|
||||
|
||||
## Not Relevant When
|
||||
|
||||
- Ad-hoc scripting with no need for structured agent guidance
|
||||
- Non-Claude-Code development environments
|
||||
- Non-Claude-Code development environments (primary target; patterns may transfer)
|
||||
- Need for runtime orchestration, task scheduling, or autonomous agent execution
|
||||
- Repository capability profiling or SCOPE.md generation at scale (see `repo-scoping`)
|
||||
|
||||
@@ -66,43 +69,55 @@ Kaizen-agentic packages recurring development workflows (TDD, refactoring, proje
|
||||
## Current State
|
||||
|
||||
- Status: experimental → stabilizing (v1.0.2; agency framework shipped in WP-0002)
|
||||
- Implementation: substantial — 21 agents, full CLI, agency memory + protocols tested e2e; optimization loop exists but is not exercised in production workflows
|
||||
- Strategic layer: `INTENT.md` and `wiki/` established; orientation docs not yet fully linked
|
||||
- Implementation: substantial — 21 agents, full CLI, agency memory + protocols tested e2e; **measurement loop not closed** (no `.kaizen/metrics/`, optimizer unwired)
|
||||
- Stability: CLI stable (Click workaround in place); agency framework validated by e2e tests
|
||||
- Usage: internal dev projects and Custodian MCP hub-wide; packaged wheel missing 4 newest agents
|
||||
- Active work: WP-0001 (community engagement / v1.1.0) — CI, telemetry, cross-platform fixes not started
|
||||
- Active work: **WP-0003** (measurement loop); **WP-0004** (ecosystem integration); WP-0001 (community engagement / v1.1.0) pending
|
||||
|
||||
---
|
||||
|
||||
## How It Fits
|
||||
|
||||
- Upstream dependencies: Claude Code (agent invocation), kaizen continuous-improvement philosophy
|
||||
- Downstream consumers: Custodian State Hub (MCP agent discovery); domain repos that install agents and maintain `.kaizen/` memory
|
||||
- Often used with: `the-custodian` (MCP integration), `markitect_project` (project-management patterns), `activity-core` (scaffolding references)
|
||||
- Downstream consumers: Custodian State Hub (MCP agent discovery); domain repos that install agents and maintain `.kaizen/` state
|
||||
- Often used with: `the-custodian` (MCP integration), `markitect_project` (project-management patterns), `activity-core` (scaffolding references), `repo-scoping` (SCOPE.md generation)
|
||||
|
||||
---
|
||||
|
||||
## Terminology
|
||||
|
||||
- Preferred terms: agent, agent persona, agency, project memory, protocol runbook, Coach
|
||||
- Also known as: "kaizen agents", "the agent library"
|
||||
- Potentially confusing terms: "Agent" here is a persona/instruction set, not a running process; "agency" means memory + coaching, not autonomous orchestration
|
||||
- Preferred terms: KaizenAgentic (product), agent, agent persona, agency, project memory, protocol runbook, Coach, kaizen loop
|
||||
- Also known as: "kaizen agents", "kaizen-agentic" (repo/package slug), "the agent library"
|
||||
- Potentially confusing terms: "Agent" is a persona/instruction set, not a running process; "agency" means memory + coaching, not autonomous orchestration; repo slug `kaizen-agentic` vs product name `KaizenAgentic`
|
||||
|
||||
---
|
||||
|
||||
## Related / Overlapping Repositories
|
||||
|
||||
- `the-custodian` — hosts MCP tools that load agents; integration code lives there, not here
|
||||
- `repo-scoping` — generates/refreshes SCOPE.md from approved characteristics; owns scope analysis at scale
|
||||
- `repo-scoping` — generates/refreshes SCOPE.md from approved characteristics
|
||||
- `markitect_project` — references kaizen-agentic as a capability submodule
|
||||
- `sys-medic` (source repo) — origin of sys-medic agent; canonical copy now lives in `agents/agent-sys-medic.md`
|
||||
- `sys-medic` (source repo) — origin of sys-medic agent; canonical copy in `agents/agent-sys-medic.md`
|
||||
|
||||
---
|
||||
|
||||
## Getting Oriented
|
||||
|
||||
- Start with: `README.md` (quick start, agency overview), `docs/agency-framework.md` (memory + coach + protocols)
|
||||
- Key files / directories: `agents/` (persona definitions), `agents/protocols/` (runbooks), `src/kaizen_agentic/` (Python framework), `workplans/` (active roadmap)
|
||||
- Entry points: `kaizen-agentic --help`; via MCP: `get_kaizen_agent("scope-analyst")`; docs: `docs/GETTING_STARTED.md`, `docs/AGENT_DISTRIBUTION.md`
|
||||
Read in this order for full context:
|
||||
|
||||
1. `INTENT.md` — stable purpose, boundaries, design principles
|
||||
2. `wiki/KaizenAgenticMission.md` — product narrative and key components
|
||||
3. `wiki/EcosystemIntegration.md` — how KaizenAgentic composes with adjacent repos
|
||||
4. `wiki/KaizenAgentTemplate.md` — intended agent specification format
|
||||
5. `README.md` — quick start and agency overview
|
||||
6. `docs/agency-framework.md` — memory, coach, protocols, metrics (ADR-004)
|
||||
7. `history/` — persisted assessments and gap analyses
|
||||
8. `workplans/` — active implementation roadmap
|
||||
|
||||
Key directories: `wiki/` (conceptual model), `agents/` (personas), `agents/protocols/` (runbooks), `src/kaizen_agentic/` (Python framework), `docs/adr/` (conventions)
|
||||
|
||||
Entry points: `kaizen-agentic --help`; MCP: `get_kaizen_agent("scope-analyst")`; docs: `docs/GETTING_STARTED.md`, `docs/AGENT_DISTRIBUTION.md`
|
||||
|
||||
---
|
||||
|
||||
@@ -136,10 +151,17 @@ description: Single source of truth for agent definitions consumed by the Custod
|
||||
keywords: [mcp, custodian, discovery, agent-library]
|
||||
```
|
||||
|
||||
```capability
type: process
title: KaizenAgentic conceptual model and agent specification standards
description: Strategic framing, design principles, agent template, optimizer spec, and improvement philosophy via INTENT.md and wiki/.
keywords: [kaizen, intent, template, optimization, digital-talent-agency]
```
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
- `agents/` (21 files) is the development source of truth; `src/kaizen_agentic/data/agents/` (17 files) is what pip installs ship — coach, sys-medic, scope-analyst, and optimization are not yet bundled
|
||||
- `INTENT.md` is not present in this repo (see gap analysis against derived intent)
|
||||
- `agent-optimization.md` and `agent-agent-optimization.md` both exist; naming overlap may confuse discovery
|
||||
- `agent-optimization.md` and `agent-agent-optimization.md` both exist; consolidation planned in WP-0003
|
||||
- Agent definitions use minimal frontmatter today; full `wiki/KaizenAgentTemplate.md` conformance is a maturity target, not current reality
|
||||
190
docs/adr/ADR-004-project-metrics-convention.md
Normal file
@@ -0,0 +1,190 @@
|
||||
---
id: ADR-004
title: Project Metrics Convention
status: accepted
date: "2026-06-16"
---
|
||||
|
||||
# ADR-004 — Project Metrics Convention
|
||||
|
||||
## Status
|
||||
|
||||
Accepted
|
||||
|
||||
## Context
|
||||
|
||||
`INTENT.md` requires agents to be measurable, versioned, and optimizable. The
agency framework (ADR-002) provides **qualitative** project memory; the kaizen
loop needs **quantitative** per-execution records.

`wiki/AgentKaizenOptimizer.md` specifies `.kaizen/metrics/` storage.
`OptimizationLoop` in `src/kaizen_agentic/optimization.py` exists but has no
data source.

Separately, `agentic-resources` (Helix Forge) captures **fleet-level** session
metrics from coding agent transcripts. Project metrics and fleet metrics serve
different scopes and must correlate without duplicating ingestion logic.
|
||||
|
||||
## Decision
|
||||
|
||||
Each agent deployed into a project may accumulate **project-scoped execution
metrics**. Records are append-only JSONL with rolling summaries. The optimizer
reads these files to produce evidence-based recommendations.
|
||||
|
||||
### File locations
|
||||
|
||||
Per-agent executions:
|
||||
|
||||
```
<project-root>/.kaizen/metrics/<agent-name>/
  executions.jsonl    # append-only per-execution records
  summary.json        # rolling aggregates (regenerated on write)
```
|
||||
|
||||
Optimizer outputs:
|
||||
|
||||
```
<project-root>/.kaizen/metrics/optimizer/
  analysis.json           # last analysis run + input fingerprint
  recommendations.jsonl   # append-only recommendation history
```
|
||||
|
||||
The `.kaizen/metrics/` tree lives alongside `.kaizen/agents/` under the same
project-level state directory (ADR-002).
|
||||
|
||||
### Execution record schema (minimum viable)
|
||||
|
||||
```json
{
  "timestamp": "2026-06-16T12:00:00Z",
  "agent": "tdd-workflow",
  "session_id": "optional-uuid-or-hash",
  "execution_time_s": 0.0,
  "success": true,
  "quality_score": 0.0,
  "primary_metric": {
    "name": "test_pass_rate",
    "value": 1.0,
    "target": 1.0
  },
  "metadata": {}
}
```
|
||||
|
||||
Required fields: `timestamp`, `agent`, `success`.
|
||||
Recommended fields: `execution_time_s`, `quality_score`, `primary_metric`.
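
The record shape above can be exercised with a small sketch. This is not the planned `MetricsStore` from WP-0003; the helper names `build_record`, `validate_record`, and `append_record` are hypothetical, and the only rules enforced are the three required fields listed above.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Required fields per the schema above; everything else is optional.
REQUIRED_FIELDS = ("timestamp", "agent", "success")


def build_record(agent, success, **optional):
    """Return an execution record with the required fields filled in."""
    record = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "agent": agent,
        "success": bool(success),
    }
    record.update(optional)  # e.g. execution_time_s, quality_score
    return record


def validate_record(record):
    """Raise ValueError if a required field is missing or mistyped."""
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if not isinstance(record["success"], bool):
        raise ValueError("success must be a boolean")


def append_record(project_root, record):
    """Append one JSON line to .kaizen/metrics/<agent>/executions.jsonl."""
    validate_record(record)
    path = (Path(project_root) / ".kaizen" / "metrics"
            / record["agent"] / "executions.jsonl")
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" keeps the file append-only: existing lines are never rewritten.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
    return path
```

Writing one JSON object per line keeps appends cheap and lets later tools stream the file without loading it whole.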
|
||||
|
||||
### Summary schema
|
||||
|
||||
`summary.json` is derived — never hand-edited. Regenerated on each append:
|
||||
|
||||
```json
{
  "agent": "tdd-workflow",
  "execution_count": 12,
  "success_rate": 0.917,
  "avg_quality_score": 0.82,
  "avg_execution_time_s": 45.3,
  "last_execution": "2026-06-16T12:00:00Z",
  "trend": {
    "success_rate": "stable",
    "quality_score": "up"
  }
}
```
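
As an illustration of how the aggregates could be derived, the sketch below recomputes a summary from parsed execution records. The function names are hypothetical. The `trend` block is left out on purpose, because this ADR does not specify how trends are calculated.

```python
import json
from pathlib import Path


def summarize(records):
    """Compute rolling aggregates from a list of execution record dicts."""
    if not records:
        return {}

    def mean(values):
        # Ignore records that omit an optional field.
        present = [v for v in values if v is not None]
        return round(sum(present) / len(present), 3) if present else None

    return {
        "agent": records[-1]["agent"],
        "execution_count": len(records),
        "success_rate": mean([1.0 if r["success"] else 0.0 for r in records]),
        "avg_quality_score": mean([r.get("quality_score") for r in records]),
        "avg_execution_time_s": mean([r.get("execution_time_s") for r in records]),
        # Same-format UTC ISO 8601 strings sort chronologically as text.
        "last_execution": max(r["timestamp"] for r in records),
    }


def regenerate_summary(agent_dir):
    """Rebuild summary.json from executions.jsonl in the given agent directory."""
    agent_dir = Path(agent_dir)
    text = (agent_dir / "executions.jsonl").read_text(encoding="utf-8")
    records = [json.loads(line) for line in text.splitlines() if line.strip()]
    summary = summarize(records)
    (agent_dir / "summary.json").write_text(
        json.dumps(summary, indent=2) + "\n", encoding="utf-8"
    )
    return summary
```

Because the summary is always recomputed from the full record file, it cannot drift from `executions.jsonl`, which is consistent with the rule that it is never hand-edited.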
|
||||
|
||||
### Retention
|
||||
|
||||
Default retention: **180 days** (per `wiki/AgentKaizenOptimizer.md`).
Pruning removes aged lines from `executions.jsonl` and regenerates `summary.json`.
Project-level override via `.kaizen/metrics/config.json` is reserved for a
future iteration.
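
A minimal sketch of the pruning rule, assuming timestamps use the `YYYY-MM-DDTHH:MM:SSZ` form shown in the execution record example. The names `parse_ts` and `prune` are hypothetical, and treating a record exactly at the cutoff as retained is an assumption, since the ADR does not say whether the boundary is inclusive.

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 180  # default retention stated above


def parse_ts(value):
    """Parse a UTC timestamp such as 2026-06-16T12:00:00Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def prune(records, now, retention_days=RETENTION_DAYS):
    """Return only the records whose timestamp is within the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [r for r in records if parse_ts(r["timestamp"]) >= cutoff]
```

A caller would rewrite `executions.jsonl` with the retained records and then regenerate `summary.json`, matching the two steps described above.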
|
||||
|
||||
### Session-close protocol
|
||||
|
||||
Memory-enabled agents with declared metrics should append one execution record
at session close:

```bash
kaizen-agentic metrics record <agent> --success --time <seconds> --quality <0-1>
```
|
||||
|
||||
Or pipe a full JSON record via `--json` / stdin.
|
||||
|
||||
### CLI interface
|
||||
|
||||
```
kaizen-agentic metrics record <agent>      # Append execution record
kaizen-agentic metrics show <agent>        # Summary + recent executions
kaizen-agentic metrics list                # Agents with metrics in project
kaizen-agentic metrics export <agent>      # Dump executions.jsonl
kaizen-agentic metrics optimize [agent]    # Run OptimizationLoop (WP-0003 Part 3)
```
|
||||
|
||||
`kaizen-agentic memory init <agent>` scaffolds metrics directories by default
(`--no-metrics` to opt out).
|
||||
|
||||
### Helix Forge correlation
|
||||
|
||||
Kaizen-agentic **project metrics** and agentic-resources **fleet metrics**
operate at different layers:

| Layer | Scope | Owner | Typical storage |
|-------|-------|-------|-----------------|
| Project | Per-agent persona in one repo | kaizen-agentic | `.kaizen/metrics/` |
| Fleet | Cross-repo coding sessions | agentic-resources | Helix Forge digest store + `measure/baselines.jsonl` |
|
||||
|
||||
**Correlation fields** — optional on project execution records, populated when
the session is also captured by Helix Forge:

```json
{
  "helix_session_uid": "claude:<native-session-uuid>",
  "repo": "kaizen-agentic",
  "flavor": "claude",
  "tokens": 12500,
  "infra_overhead_share": 0.12
}
```
|
||||
|
||||
Mapping from Helix Forge `session_metrics()` (agentic-resources):
|
||||
|
||||
| Helix field | ADR-004 field |
|-------------|---------------|
| `digest.outcome == "success"` | `success` |
| `digest.cost.wall_clock_s` | `execution_time_s` |
| `tokens` (input + output) | `tokens` in metadata / top-level |
| `infra_overhead_share` | `metadata.infra_overhead_share` |
| `Session.session_uid` | `helix_session_uid` |
| `Session.repo` | `repo` |
| `Session.flavor` | `flavor` |
|
||||
|
||||
Kaizen-agentic does **not** ingest Claude/Codex/Grok JSONL transcripts.
Correlation is **link-by-reference**: project metrics may cite a Helix session
UID; fleet analytics remain owned by agentic-resources.
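
The field mapping can be sketched as a pure function over already-computed Helix Forge output, with no transcript parsing. The input shapes are assumptions made for illustration: `session` and `metrics` are plain dicts whose keys mirror the field names in the mapping table, which may not match the real agentic-resources objects. The function name `to_project_record` is hypothetical, and placing `tokens` at the top level is one reading of the table.

```python
def to_project_record(agent, timestamp, session, metrics):
    """Map assumed Helix Forge dicts onto an ADR-004 execution record."""
    digest = metrics.get("digest", {})
    record = {
        "timestamp": timestamp,
        "agent": agent,
        "success": digest.get("outcome") == "success",
        # Link-by-reference: only the session UID is carried over.
        "helix_session_uid": session.get("session_uid"),
        "repo": session.get("repo"),
        "flavor": session.get("flavor"),
        "metadata": {},
    }
    wall_clock = digest.get("cost", {}).get("wall_clock_s")
    if wall_clock is not None:
        record["execution_time_s"] = wall_clock
    if "tokens" in metrics:
        record["tokens"] = metrics["tokens"]  # assumed top-level placement
    if "infra_overhead_share" in metrics:
        record["metadata"]["infra_overhead_share"] = metrics["infra_overhead_share"]
    return record
```

Keeping the mapping free of I/O means the fleet data stays in agentic-resources and only a reference plus a few scalar fields land in the project record.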
|
||||
|
||||
WP-0004 defines the integration contract and optional sync tooling.
|
||||
|
||||
### Coach and memory integration
|
||||
|
||||
`kaizen-agentic memory brief <agent>` includes a `## Performance Summary`
section when `summary.json` exists (WP-0003 Part 4). Qualitative memory
(ADR-002) and quantitative metrics (this ADR) are complementary views of the
same agent's project history.
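
One way such a section might be rendered from a parsed `summary.json` is sketched below. The bullet layout and the function name `render_performance_summary` are assumptions; only the `## Performance Summary` heading and the summary field names come from this ADR.

```python
def render_performance_summary(summary):
    """Render a summary.json dict as a markdown section (layout is illustrative)."""
    lines = [
        "## Performance Summary",
        "",
        f"- Executions: {summary['execution_count']}",
        f"- Success rate: {summary['success_rate']:.1%}",
    ]
    # Optional aggregates are skipped when no record supplied the field.
    if summary.get("avg_quality_score") is not None:
        lines.append(f"- Average quality score: {summary['avg_quality_score']:.2f}")
    if summary.get("avg_execution_time_s") is not None:
        lines.append(f"- Average execution time: {summary['avg_execution_time_s']:.1f} s")
    if summary.get("last_execution"):
        lines.append(f"- Last execution: {summary['last_execution']}")
    return "\n".join(lines) + "\n"
```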
|
||||
|
||||
## Consequences
|
||||
|
||||
- Agents can be measured per project without a central telemetry platform.
- `OptimizationLoop` has a defined data source for recommendations.
- Fleet session analytics stay in agentic-resources; no duplicate ingestion.
- `.kaizen/metrics/` should default to `.gitignore` (same policy as memory).
- WP-0003 implements `MetricsStore` and CLI against this convention.
- WP-0004 wires ecosystem services (activity-core, artifact-store, Helix Forge).
|
||||
|
||||
## Related Documents
|
||||
|
||||
- [ADR-002: Project Memory Convention](ADR-002-project-memory-convention.md)
- [wiki/EcosystemIntegration.md](../../wiki/EcosystemIntegration.md)
- [agentic-resources session schema](https://github.com/coulomb/agentic-resources) — `session_memory/core/schema.py`
- [KAIZEN-WP-0003](../../workplans/kaizen-agentic-WP-0003-measurement-loop.md)
- [KAIZEN-WP-0004](../../workplans/kaizen-agentic-WP-0004-ecosystem-integration.md)
|
||||
@@ -234,8 +234,56 @@ All agents that do session-bound project work have `memory: enabled` in their fr
|
||||
|
||||
---
|
||||
|
||||
## Project Metrics
|
||||
|
||||
Project-scoped **quantitative** metrics complement qualitative memory (ADR-002).
Per-execution records live under `.kaizen/metrics/<agent>/` and feed the
kaizen optimizer loop.
|
||||
|
||||
### Location
|
||||
|
||||
```
<project-root>/.kaizen/metrics/<agent-name>/
  executions.jsonl
  summary.json

<project-root>/.kaizen/metrics/optimizer/
  analysis.json
  recommendations.jsonl
```
|
||||
|
||||
### CLI (WP-0003)
|
||||
|
||||
```
kaizen-agentic metrics record <agent>      # Append execution record at session close
kaizen-agentic metrics show <agent>        # Summary + recent executions
kaizen-agentic metrics list                # Agents with metrics in project
kaizen-agentic metrics export <agent>      # Dump executions.jsonl
kaizen-agentic metrics optimize [agent]    # Run optimizer on project metrics
```
|
||||
|
||||
`memory brief` includes a `## Performance Summary` when metrics exist (WP-0003
Part 4).
|
||||
|
||||
### Fleet correlation
|
||||
|
||||
Project metrics correlate with **Helix Forge** fleet session metrics in
`agentic-resources` via optional `helix_session_uid` (ADR-004). See
[wiki/EcosystemIntegration.md](../wiki/EcosystemIntegration.md).
|
||||
|
||||
### Evidence retention
|
||||
|
||||
Optimizer outputs may be published to `artifact-store` (WP-0004 Part 3).
|
||||
|
||||
---
|
||||
|
||||
## Related Documents
|
||||
|
||||
- [ADR-001: Workplan Convention](../workplans/kaizen-agentic-WP-0001-community-engagement.md) — how work items are structured
|
||||
- [ADR-002: Project Memory Convention](../workplans/kaizen-agentic-WP-0002-agency-framework.md) — memory file location, structure, and lifecycle
|
||||
- [WP-0002: Agency Framework](../workplans/kaizen-agentic-WP-0002-agency-framework.md) — full implementation workplan
|
||||
- [ADR-001: Workplan Convention](adr/ADR-001-workplan-convention.md)
|
||||
- [ADR-002: Project Memory Convention](adr/ADR-002-project-memory-convention.md)
|
||||
- [ADR-003: Protocols Artifact Convention](adr/ADR-003-protocols-artifact-convention.md)
|
||||
- [ADR-004: Project Metrics Convention](adr/ADR-004-project-metrics-convention.md)
|
||||
- [wiki/EcosystemIntegration.md](../wiki/EcosystemIntegration.md) — two-layer measurement model
|
||||
- [WP-0002: Agency Framework](../workplans/kaizen-agentic-WP-0002-agency-framework.md)
|
||||
- [WP-0003: Measurement Loop](../workplans/kaizen-agentic-WP-0003-measurement-loop.md)
|
||||
- [WP-0004: Ecosystem Integration](../workplans/kaizen-agentic-WP-0004-ecosystem-integration.md)
|
||||
|
||||
142
history/2026-06-16-ecosystem-assessment.md
Normal file
@@ -0,0 +1,142 @@
|
||||
# KaizenAgentic Ecosystem Assessment
|
||||
|
||||
**Date:** 2026-06-16
|
||||
**Compared repos:** info-tech-canon, agentic-resources, activity-core, llm-connect, identity-canon, phase-memory, artifact-store, domain-tree, kontextual-engine, tele-mcp
|
||||
**Against:** `INTENT.md`, `wiki/`, WP-0003 measurement loop plan
|
||||
|
||||
---
|
||||
|
||||
## Strategic Insight
|
||||
|
||||
INTENT's vision is **distributed across the ecosystem**, not missing from a single repo:
|
||||
|
||||
| INTENT promise | Primary owner |
|----------------|---------------|
| Agent definitions + deployment | kaizen-agentic |
| Project memory + Coach | kaizen-agentic |
| Per-agent metrics + optimizer | kaizen-agentic (WP-0003) |
| Session capture + fleet metrics | agentic-resources (Helix Forge) |
| Scheduled improvement triggers | activity-core |
| Evidence retention | artifact-store |
| Rich memory graphs | phase-memory (future) |
| Guidance as knowledge | kontextual-engine + info-tech-canon |
| Semantic vocabulary | info-tech-canon, identity-canon |
| Org placement | domain-tree |
| Runtime telemetry MCP | tele-mcp (unassessed — not cloned) |
|
||||
|
||||
KaizenAgentic matures by **stabilizing conventions and composing adjacent services**, consistent with INTENT boundaries.
|
||||
|
||||
---
|
||||
|
||||
## Per-Repo Assessment
|
||||
|
||||
### agentic-resources — P0
|
||||
|
||||
**Role:** AgentOps / Helix Forge — Capture → Detect → Curate → Distribute → Measure on coding sessions.
|
||||
|
||||
**Use:** Fleet-level session metrics (`session_memory/measure/`), JSONL baselines, cross-agent adapters (Claude/Codex/Grok). Complements project-scoped `.kaizen/metrics/`.
|
||||
|
||||
**Action:** ADR-004 correlation fields; WP-0004 integration; do not re-implement session ingestion here.
|
||||
|
||||
### activity-core — P1
|
||||
|
||||
**Role:** Event bridge — cron/NATS → task emission.
|
||||
|
||||
**Use:** Scheduled `metrics optimize`, retention hygiene, metrics scaffold validation after agent install.
|
||||
|
||||
**Action:** WP-0004 ActivityDefinitions after WP-0003 Part 2.
|
||||
|
||||
### artifact-store — P1
|
||||
|
||||
**Role:** Artifact registry + retention gateway.
|
||||
|
||||
**Use:** Persist optimizer `analysis.json`, recommendations, e2e evidence packages.
|
||||
|
||||
**Action:** WP-0004 pilot registration with `raw-evidence` retention class.
|
||||
|
||||
### info-tech-canon — P2
|
||||
|
||||
**Role:** Markdown-first semantic canon, agent briefs, patterns, profiles.
|
||||
|
||||
**Use:** Map KaizenAgentTemplate → canon profiles; publish per-agent briefs; validation rules for `kaizen-agentic validate`.
|
||||
|
||||
**Action:** WP-0004 Part 4 (later phase).
|
||||
|
||||
### phase-memory — P2
|
||||
|
||||
**Role:** Profile-driven memory graphs (ephemeral → rigid).
|
||||
|
||||
**Use:** Upgrade path from flat `.kaizen/agents/*/memory.md`.
|
||||
|
||||
**Action:** Future WP after WP-0004; no WP-0003 blocker.
|
||||
|
||||
### kontextual-engine — P2
|
||||
|
||||
**Role:** Knowledge operations engine.
|
||||
|
||||
**Use:** Ingest `wiki/` and `agents/` as knowledge assets; KaizenGuidance catalog runtime.
|
||||
|
||||
**Action:** WP-0004 Part 4 (guidance pilot).
|
||||
|
||||
### llm-connect — P3
|
||||
|
||||
**Role:** Provider-neutral LLM adapter.
|
||||
|
||||
**Use:** Automated Coach/optimizer narration when LLM synthesis moves beyond CLI context assembly.
|
||||
|
||||
**Action:** Reference pattern; adopt when WP-0003+ adds LLM-powered recommendations.
|
||||
|
||||
### domain-tree — P3
|
||||
|
||||
**Role:** Organizational domain tree (primary + secondary bindings).
|
||||
|
||||
**Use:** Register kaizen-agentic and agent categories in org structure.
|
||||
|
||||
**Action:** When capability catalog matures.
|
||||
|
||||
### identity-canon — P3
|
||||
|
||||
**Role:** Identity/agent terminology research.
|
||||
|
||||
**Use:** Distinguish agent persona vs instance vs session actor for "digital talent agency" framing.
|
||||
|
||||
**Action:** Glossary alignment in wiki.
|
||||
|
||||
### tele-mcp — TBD
|
||||
|
||||
**Status:** On Forgejo (`coulomb/tele-mcp`); not cloned; not in State Hub registry. Name suggests telemetry MCP.
|
||||
|
||||
**Action:** Clone and assess before integration; candidate for WP-0001 T04 telemetry adapter.
|
||||
|
||||
---
|
||||
|
||||
## Two-Layer Measurement Model
|
||||
|
||||
| Layer | Scope | Owner | Storage |
|-------|-------|-------|---------|
| **Fleet** | Cross-repo session outcomes | agentic-resources | Helix Forge store + `measure/baselines.jsonl` |
| **Project** | Per-agent persona performance in one repo | kaizen-agentic | `.kaizen/metrics/<agent>/executions.jsonl` |
|
||||
|
||||
Correlation via shared fields defined in ADR-004 (`helix_session_uid`, `repo`, `success`, `tokens`, `execution_time_s`).
|
||||
|
||||
See `wiki/EcosystemIntegration.md` for integration contracts.
|
||||
|
||||
---
|
||||
|
||||
## Priority Matrix
|
||||
|
||||
| Priority | Repo | WP |
|----------|------|-----|
| P0 | agentic-resources | WP-0004 Part 1 |
| P1 | activity-core | WP-0004 Part 2 |
| P1 | artifact-store | WP-0004 Part 3 |
| P2 | info-tech-canon, kontextual-engine, phase-memory | WP-0004 Part 4 / future |
| P3 | llm-connect, domain-tree, identity-canon | Adopt as needed |
| TBD | tele-mcp | Assess when cloned |
|
||||
|
||||
---
|
||||
|
||||
## Follow-Up Workplans
|
||||
|
||||
- **KAIZEN-WP-0003** — measurement loop (active, State Hub registered)
|
||||
- **KAIZEN-WP-0004** — ecosystem integration (active, depends on WP-0003 Part 3)
|
||||
87
history/2026-06-16-intent-gap-analysis.md
Normal file
@@ -0,0 +1,87 @@
|
||||
# KaizenAgentic Intent Gap Analysis
|
||||
|
||||
**Date:** 2026-06-16
|
||||
**Scope:** `INTENT.md`, `wiki/`, codebase (`agents/`, `src/kaizen_agentic/`, `docs/`, workplans)
|
||||
**Author:** kaizen-agentic session assessment
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
Kaizen-agentic is in a **two-layer state**: the strategic/conceptual layer (`INTENT.md`, `wiki/`) is well-developed; the operational layer (agents, CLI, agency framework) is substantial but implements a **deployment and memory** product more than a **measurable continuous-improvement engine**.
|
||||
|
||||
The largest gap: the **measurement → optimization → specification refinement loop** described in INTENT is largely unbuilt. Addressed by **KAIZEN-WP-0003** (registered 2026-06-16).
|
||||
|
||||
---
|
||||
|
||||
## Alignment
|
||||
|
||||
| INTENT asset | Status |
|--------------|--------|
| Mission and conceptual model | `wiki/` established |
| KaizenAgent definition template | `wiki/KaizenAgentTemplate.md` — not enforced in agents |
| Meta-optimizer concept | `wiki/AgentKaizenOptimizer.md` + `agent-optimization.md` — no data pipeline |
| Idempotent/measurable principles | Documented; not in agent implementations |
| Codebase improvement guidance | `wiki/KaizenGuidance.md` — vision only |
| Prompts/experiments/mantras | `wiki/KaizenPrompting.md` — not operationalized |
| Product/pricing/brand | `wiki/` complete |
| Agency memory + Coach | WP-0002 shipped |
| CLI deployment | Functional (21 agents) |
|
||||
|
||||
---
|
||||
|
||||
## Critical Gaps

### 1. Kaizen loop not closed

INTENT requires evidence-based refinement with before/after deltas. In practice, `OptimizationLoop` exists but is unwired, there is no `.kaizen/metrics/`, and WP-0001 telemetry has not been started.

### 2. Agent template not enforced

Agents use minimal YAML frontmatter; `wiki/KaizenAgentTemplate.md` (metrics, idempotency, testing, evolution) is reference only.

### 3. KaizenGuidance unbuilt

No guide catalog, manifests, codemods, or Parse→Measure pipeline.

### 4. Coach vs optimizer not integrated

Qualitative memory (Coach) and quantitative optimization (optimizer) are separate paths.

### 5. Agent implementation boundary undeclared

INTENT says the repo should not own all concrete agent implementations; 21 agents live here as a reference fleet, an interim state that needs an explicit policy.

---

## Design Principles Scorecard

| Principle | Status |
|-----------|--------|
| Continuous Improvement | Partial (memory; no automated refinement) |
| Measurable by Default | Gap |
| Idempotent Operations | Gap |
| Evidence over Intuition | Gap |
| Separation of Concerns | Partial |
| Composable Capabilities | Gap |
| Human-Readable + Machine-Executable | Gap (guidance) |
| Rollback-Ready Evolution | Partial |
| Compounding Value | Partial (memory only) |

---

## Remediation Sequence

1. **WP-0003** — metrics convention, CLI, optimizer wiring, Coach bridge (active)
2. **WP-0004** — ecosystem integration (agentic-resources, activity-core, artifact-store)
3. Future — KaizenGuidance catalog, phase-memory upgrade, full template conformance

---

## Related Artifacts

- `SCOPE.md` — updated 2026-06-16
- `workplans/kaizen-agentic-WP-0003-measurement-loop.md`
- `history/2026-06-16-ecosystem-assessment.md`
- `wiki/EcosystemIntegration.md`
- `docs/adr/ADR-004-project-metrics-convention.md`
11
history/README.md
Normal file
@@ -0,0 +1,11 @@
# History

Persisted assessments, gap analyses, and ecosystem reviews for KaizenAgentic.

| Date | Document | Summary |
|------|----------|---------|
| 2026-06-16 | [2026-06-16-intent-gap-analysis.md](2026-06-16-intent-gap-analysis.md) | INTENT.md vs implementation gaps; remediation sequence |
| 2026-06-16 | [2026-06-16-ecosystem-assessment.md](2026-06-16-ecosystem-assessment.md) | Cross-repo comparison (10 ecosystem repos) |

These files are point-in-time records. Living conventions live in `INTENT.md`,
`SCOPE.md`, `wiki/`, and `docs/adr/`.
162
wiki/EcosystemIntegration.md
Normal file
@@ -0,0 +1,162 @@
# Ecosystem Integration

*How KaizenAgentic composes with adjacent repositories*

KaizenAgentic (`INTENT.md`) defines a **meta-improvement layer** for coding
agents. No single repository implements the full vision. This document describes
the **two-layer measurement model** and integration contracts with ecosystem
repos.

---

## Two-Layer Measurement Model

| Layer | Question answered | Owner | Storage |
|-------|-------------------|-------|---------|
| **Project** | How is this *agent persona* performing in *this repo*? | kaizen-agentic | `.kaizen/metrics/<agent>/` |
| **Fleet** | How are coding sessions performing *across repos*? | agentic-resources | Helix Forge digest store + baselines |

```
Coding session (Claude / Codex / Grok)
        │
        ├──────────────────────────────────────┐
        ▼                                      ▼
agentic-resources                        kaizen-agentic
(Helix Forge)                            (session close)
Capture → Digest → Fleet metrics         metrics record → executions.jsonl
        │                                      │
        └──────── helix_session_uid ───────────┘
                    (optional link)
```

### When to use which layer

- **Project metrics** — optimizer recommendations, Coach briefs, per-agent
  kaizen loop in one codebase (ADR-004).
- **Fleet metrics** — cross-repo friction analysis, pattern distribution,
  weekly retro, tooling decisions (Helix Forge PRD).

Kaizen-agentic does not re-implement session JSONL ingestion. It may **cite**
Helix session UIDs on project execution records for correlation.

---

## Integration Partners

### agentic-resources (P0)

**Helix Forge** — session capture, fleet aggregates, baselines, weekly retro.

| KaizenAgentic | Helix Forge |
|---------------|-------------|
| `.kaizen/metrics/<agent>/executions.jsonl` | Digest store + `measure/baselines.jsonl` |
| Per-agent persona outcomes | Per-session cross-repo outcomes |
| `kaizen-agentic metrics optimize` | `session_memory/measure/` aggregates |

**Correlation fields** (ADR-004): `helix_session_uid`, `repo`, `flavor`,
`tokens`, `infra_overhead_share`.

**Workplan:** KAIZEN-WP-0004 Part 1.
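
The optional `helix_session_uid` link could be used roughly as follows. This is a minimal sketch, assuming project records are plain dictionaries and that fleet digests are available as a mapping keyed by the same UID. The `correlate` helper and the digest shape are hypothetical and are not taken from ADR-004 or Helix Forge.

```python
from typing import Iterable, Optional


def correlate(records: Iterable[dict], digests: dict) -> list:
    """Pair each project record with its fleet digest, if one is known.

    `digests` maps helix_session_uid -> digest dict (hypothetical shape).
    Records without a UID, or with an unknown UID, are paired with None.
    """
    pairs = []
    for record in records:
        uid: Optional[str] = record.get("helix_session_uid")
        pairs.append((record, digests.get(uid) if uid else None))
    return pairs


records = [
    {"agent": "tdd-workflow", "success": True, "helix_session_uid": "abc"},
    {"agent": "tdd-workflow", "success": False},  # no link recorded
]
digests = {"abc": {"repo": "kaizen-agentic", "tokens": 1200}}
pairs = correlate(records, digests)
```

Because the link is optional, a record without a UID still passes through; it simply has no fleet-side counterpart.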

### activity-core (P1)

**Event bridge** — scheduled and event-driven task creation.

Example ActivityDefinitions (after metrics CLI ships):

- Weekly: run `kaizen-agentic metrics optimize` on repos with `.kaizen/`
- On low success_rate threshold: create review task in issue-core
- Post agent install: verify metrics scaffold exists

**Workplan:** KAIZEN-WP-0004 Part 2.

### artifact-store (P1)

**Evidence retention** — durable registry for generated outputs.

Register after optimizer runs:

- `optimizer/analysis.json`
- `recommendations.jsonl` snapshots
- E2e pilot evidence packages

Retention class: `raw-evidence` (180d default, aligned with ADR-004).

**Workplan:** KAIZEN-WP-0004 Part 3.

### info-tech-canon (P2)

**Semantic canon** — agent briefs, patterns, profiles, validation.

- Map `KaizenAgentTemplate.md` → InfoTechCanon profile format
- Publish compact agent briefs per persona
- Extend `kaizen-agentic validate` with canon conformance checks

**Workplan:** KAIZEN-WP-0004 Part 4.

### phase-memory (P2, future)

**Memory graphs** — upgrade from flat `memory.md` to phased memory profiles.

- Fluid memory → project session paths
- Stabilized memory → accumulated findings with provenance
- Context packages for Coach brief compilation

No WP-0003 blocker; plan after ecosystem integration baseline.

### kontextual-engine (P2)

**Knowledge operations** — ingest `wiki/` and agent definitions as governed
assets; runtime for KaizenGuidance catalog when built.

### llm-connect (P3)

**LLM abstraction** — use when Coach/optimizer synthesis becomes automated
beyond CLI context assembly. Token metrics align with wiki pricing tiers.

### domain-tree (P3)

Register kaizen-agentic and agent categories with primary/secondary domain
bindings when capability catalog matures.

### identity-canon (P3)

Terminology for agent persona vs deployed instance vs session actor —
supports "digital talent agency" framing without overloading "user".

### tele-mcp (TBD)

Listed on Forgejo; not cloned locally. Candidate telemetry MCP adapter for
WP-0001 T04. Assess before depending on it.

---

## Boundary Rules

1. **kaizen-agentic owns** agent definitions, `.kaizen/` conventions, CLI,
   Coach/optimizer personas, and product framing (`INTENT.md`, `wiki/`).
2. **kaizen-agentic does not own** session transcript ingestion, task
   scheduling, artifact bytes, knowledge graph runtime, or LLM providers.
3. **Integrate by contract** — ADRs, shared correlation fields, ActivityDefinitions,
   artifact registration APIs — not by merging repos.
4. **Evidence compounds** — fleet baselines inform tooling; project metrics
   inform agent specs; artifact-store preserves both for audit.

---

## Reading Order

1. `INTENT.md` — purpose and boundaries
2. `wiki/EcosystemIntegration.md` — this document
3. `docs/adr/ADR-004-project-metrics-convention.md` — project metrics schema
4. `history/2026-06-16-ecosystem-assessment.md` — full repo comparison
5. `workplans/kaizen-agentic-WP-0004-ecosystem-integration.md` — implementation plan

---

## Related Assessments

Persisted in `history/`:

- `2026-06-16-intent-gap-analysis.md`
- `2026-06-16-ecosystem-assessment.md`
289
workplans/kaizen-agentic-WP-0003-measurement-loop.md
Normal file
@@ -0,0 +1,289 @@
---
id: KAIZEN-WP-0003
type: workplan
title: "Measurement Loop: Metrics Convention, Collection, and Optimizer Integration"
domain: custodian
repo: kaizen-agentic
status: active
owner: kaizen-agentic
topic_slug: custodian
state_hub_workstream_id: 36252a45-f360-4496-bf77-17b5dfb02767
created: "2026-06-16"
updated: "2026-06-17"
---

# KAIZEN-WP-0003 — Measurement Loop: Metrics Convention, Collection, and Optimizer Integration

**Status:** active
**Owner:** kaizen-agentic
**Repo:** kaizen-agentic
**Target version:** 1.1.0 (partial; remainder in WP-0001)

## Goal

Close the kaizen feedback loop defined in `INTENT.md` and `wiki/AgentKaizenOptimizer.md`:
agents produce **measurable, per-execution performance records** stored in project-scoped
`.kaizen/metrics/`, the existing `OptimizationLoop` reads that data and generates
evidence-based recommendations, and the Coach/optimizer meta-agents share a single
improvement path.

This workplan addresses the P0 gap from the INTENT gap analysis: strategic vision
(memory + qualitative learning) exists; **quantitative measurement → refinement**
does not.

---

## Background

| Layer | State |
|-------|-------|
| `INTENT.md` | Requires measurable-by-default agents and evidence-based refinement |
| `wiki/KaizenAgentTemplate.md` | Defines `metrics`, `idempotency`, `optimization` sections per agent |
| `wiki/AgentKaizenOptimizer.md` | Specifies `.kaizen/metrics/` storage and optimizer behaviour |
| `src/kaizen_agentic/optimization.py` | `OptimizationLoop` + `PerformanceMetrics` implemented, unit-tested, unwired |
| Agency framework (WP-0002) | `.kaizen/agents/<name>/memory.md` + Coach brief — qualitative only |
| WP-0001 T04 | Telemetry — overlaps; WP-0003 defines the convention; WP-0001 can adopt it |

---

## Part 1 — Metrics Convention and Storage

Define the project-scoped metrics artifact alongside the existing memory convention
(ADR-002).

### Location convention

```
<project-root>/.kaizen/metrics/<agent-name>/
  executions.jsonl    # append-only per-execution records
  summary.json        # rolling aggregates (regenerated on write)
```

Optimizer-specific aggregates (per `wiki/AgentKaizenOptimizer.md`):

```
<project-root>/.kaizen/metrics/optimizer/
  analysis.json           # last run output + fingerprint
  recommendations.jsonl   # append-only recommendation history
```

### Execution record schema (minimum viable)

```json
{
  "timestamp": "ISO-8601",
  "agent": "tdd-workflow",
  "session_id": "optional-uuid-or-hash",
  "execution_time_s": 0.0,
  "success": true,
  "quality_score": 0.0,
  "primary_metric": { "name": "...", "value": 0.0, "target": 0.0 },
  "metadata": {}
}
```
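
A record of this shape could be appended to `executions.jsonl` along the lines below. This is a minimal sketch rather than the planned `MetricsStore`: the helper names `append_execution` and `read_executions` are hypothetical, and only the directory layout and field names come from the convention above.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_execution(project_root: Path, agent: str, record: dict) -> Path:
    """Append one execution record as a single JSON line (hypothetical helper)."""
    metrics_dir = project_root / ".kaizen" / "metrics" / agent
    metrics_dir.mkdir(parents=True, exist_ok=True)
    path = metrics_dir / "executions.jsonl"
    # Fill in the two fields the helper can derive; caller values win on conflict.
    full = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        **record,
    }
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(full) + "\n")
    return path


def read_executions(path: Path) -> list:
    """Read every record back, skipping blank lines."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    target = append_execution(
        Path(tmp), "tdd-workflow",
        {"success": True, "quality_score": 0.9, "execution_time_s": 12.5},
    )
    append_execution(
        Path(tmp), "tdd-workflow",
        {"success": False, "quality_score": 0.4, "execution_time_s": 40.0},
    )
    loaded = read_executions(target)
```

Opening the file in append mode keeps earlier lines untouched, which matches the append-only intent of the convention.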

### Tasks

- [x] T01 — Write ADR-004: project metrics convention (location, schema, lifecycle, retention, Helix Forge correlation)
- [ ] T02 — Implement `MetricsStore` in `src/kaizen_agentic/metrics.py` (append, read, summarise, prune by retention)
- [ ] T03 — Add `memory init` hook to scaffold `.kaizen/metrics/<agent>/` alongside memory (optional flag `--no-metrics`)
- [ ] T04 — Unit tests for `MetricsStore` (append idempotency key, summary regeneration, retention prune)

### Definition of done

- ADR-004 accepted and referenced from `docs/agency-framework.md`
- `MetricsStore` passes unit tests
- `kaizen-agentic memory init <agent>` creates metrics scaffold by default
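
The "summarise" and "prune by retention" behaviours named in T02 might look like the sketch below. It assumes ISO-8601 timestamps with an explicit UTC offset and uses the 180-day default mentioned in this workplan; the function names and the summary field names are illustrative, not the ADR-004 schema.

```python
from datetime import datetime, timedelta, timezone


def prune(records: list, now: datetime, retention_days: int = 180) -> list:
    """Keep only records whose timestamp falls inside the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [r for r in records if datetime.fromisoformat(r["timestamp"]) >= cutoff]


def summarise(records: list) -> dict:
    """Rolling aggregates; the field names here are illustrative only."""
    count = len(records)
    if count == 0:
        return {"count": 0, "success_rate": None, "avg_quality": None}
    successes = sum(1 for r in records if r["success"])
    quality = sum(r["quality_score"] for r in records) / count
    return {"count": count, "success_rate": successes / count, "avg_quality": quality}


now = datetime(2026, 6, 16, tzinfo=timezone.utc)
records = [
    {"timestamp": "2025-01-01T00:00:00+00:00", "success": False, "quality_score": 0.2},
    {"timestamp": "2026-06-01T00:00:00+00:00", "success": True, "quality_score": 0.8},
    {"timestamp": "2026-06-10T00:00:00+00:00", "success": False, "quality_score": 0.6},
]
kept = prune(records, now)      # the 2025 record is older than 180 days
summary = summarise(kept)       # aggregates over the two remaining records
```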

---

## Part 2 — Metrics CLI

Expose metrics collection and inspection without requiring Python imports in agent
sessions.

### Commands

```
kaizen-agentic metrics record <agent>   # Append one execution record (stdin JSON or flags)
kaizen-agentic metrics show <agent>     # Print summary + recent executions
kaizen-agentic metrics list             # List agents with metrics in current project
kaizen-agentic metrics export <agent>   # Dump executions.jsonl to stdout
```

### Options (record)

- `--target / -t` — project root (default: cwd)
- `--success / --failure` — boolean outcome shorthand
- `--time` — execution time in seconds
- `--quality` — quality score 0.0–1.0
- `--json` — full record on stdin
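
The `record` options above could be parsed roughly as follows. This sketch uses the standard library's `argparse` purely as a stand-in so that it runs without dependencies; the workplan's test task mentions `click.testing`, so the real command group is presumably click-based. The names `build_parser`, `to_record`, and `unit_interval` are hypothetical, and `--json` (full record on stdin) is left out.

```python
import argparse


def unit_interval(text: str) -> float:
    """Parse a quality score and reject values outside 0.0-1.0."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError("quality must be between 0.0 and 1.0")
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="metrics-record-sketch")
    parser.add_argument("agent")
    parser.add_argument("--target", "-t", default=".", help="project root")
    # --success and --failure write to the same destination and exclude each other.
    outcome = parser.add_mutually_exclusive_group()
    outcome.add_argument("--success", dest="success", action="store_true", default=None)
    outcome.add_argument("--failure", dest="success", action="store_false", default=None)
    parser.add_argument("--time", type=float, dest="execution_time_s")
    parser.add_argument("--quality", type=unit_interval, dest="quality_score")
    return parser


def to_record(args: argparse.Namespace) -> dict:
    """Build a partial execution record, omitting options that were not given."""
    fields = {
        "agent": args.agent,
        "success": args.success,
        "execution_time_s": args.execution_time_s,
        "quality_score": args.quality_score,
    }
    return {key: value for key, value in fields.items() if value is not None}


args = build_parser().parse_args(
    ["tdd-workflow", "--failure", "--time", "42.5", "--quality", "0.6"]
)
record = to_record(args)
```

Leaving unset options out of the record, rather than writing `null`, keeps a flag-built record distinguishable from one where the outcome was explicitly reported.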

### Tasks

- [ ] T05 — Implement `metrics` CLI command group (record, show, list, export)
- [ ] T06 — Integrate `metrics record` into session-close protocol template for pilot agents
- [ ] T07 — CLI tests for metrics commands (click.testing, temp project dir)
- [ ] T08 — Update `docs/CLI_CHEAT_SHEET.md` and `docs/agency-framework.md` with metrics section

### Definition of done

- All four metrics commands work against a test project with `.kaizen/metrics/`
- Session-close template documents the `metrics record` one-liner for pilot agents
- CLI cheat sheet updated

---

## Part 3 — Wire OptimizationLoop to Project Metrics

Connect the existing Python optimization infrastructure to real project data.

### Tasks

- [ ] T09 — Add `OptimizationLoop.from_metrics_store(store)` factory that loads `PerformanceMetrics` from executions
- [ ] T10 — Implement `kaizen-agentic metrics optimize [agent]` — run analysis, print recommendations, write `optimizer/analysis.json`
- [ ] T11 — Consolidate `agent-optimization.md` and `agent-agent-optimization.md` into single canonical `optimization` agent; update registry
- [ ] T12 — Update `agent-optimization.md` session protocol to invoke `metrics optimize` and reference ADR-004
- [ ] T13 — Unit + integration tests: synthetic executions → recommendations → non-empty output

### Definition of done

- `kaizen-agentic metrics optimize` produces recommendations when ≥10 execution records exist (per wiki minimum sample size)
- Single canonical optimization meta-agent in registry
- Tests cover insufficient-data and sufficient-data paths
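
The insufficient-data and sufficient-data paths could be separated by a simple gate like the one sketched here. It does not reproduce the existing `OptimizationLoop`; the `analyse` function and its return shape are hypothetical. The constants mirror the figures quoted in this workplan (10 records minimum, 30 s execution, 0.8 success rate).

```python
from statistics import mean

MIN_SAMPLES = 10         # minimum sample size from the definition of done
MAX_EXECUTION_S = 30.0   # starting-point threshold quoted in the workplan notes
MIN_SUCCESS_RATE = 0.8   # starting-point threshold quoted in the workplan notes


def analyse(records: list) -> dict:
    """Return simple threshold-based recommendations, or an insufficient-data marker."""
    if len(records) < MIN_SAMPLES:
        return {"status": "insufficient-data", "recommendations": []}
    recommendations = []
    success_rate = sum(1 for r in records if r["success"]) / len(records)
    if success_rate < MIN_SUCCESS_RATE:
        recommendations.append(
            f"success rate {success_rate:.2f} is below {MIN_SUCCESS_RATE:.2f}"
        )
    average_time = mean(r["execution_time_s"] for r in records)
    if average_time > MAX_EXECUTION_S:
        recommendations.append(
            f"average execution time {average_time:.1f}s exceeds {MAX_EXECUTION_S:.1f}s"
        )
    return {"status": "ok", "recommendations": recommendations}


# Three records: below the minimum sample size, so no analysis is attempted.
sparse = analyse([{"success": True, "execution_time_s": 5.0}] * 3)
# Ten records, six successes, all slow: both thresholds are crossed.
slow_and_flaky = analyse(
    [{"success": i < 6, "execution_time_s": 40.0} for i in range(10)]
)
```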

---

## Part 4 — Bridge Coach, Memory, and Metrics

Unify qualitative memory and quantitative metrics in the orientation path.

### Tasks

- [ ] T14 — Extend `memory brief` to include metrics summary for target agent (recent success rate, avg quality, trend arrow)
- [ ] T15 — Extend `agent-coach.md` to reference metrics context in synthesis instructions
- [ ] T16 — E2e test: populate memory + metrics for two agents → `memory brief` includes both qualitative and quantitative sections

### Definition of done

- `memory brief tdd-workflow` output includes a `## Performance Summary` block when metrics exist
- E2e test passes
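
One way the `## Performance Summary` block could be rendered is sketched below. The window size, the tolerance, and the exact wording of the lines are assumptions made for illustration; only the heading and the three ingredients (recent success rate, average quality, trend arrow) come from the tasks above.

```python
def success_rate(rows: list) -> float:
    """Share of rows marked successful."""
    return sum(1 for row in rows if row["success"]) / len(rows)


def trend_arrow(earlier: float, recent: float, tolerance: float = 0.05) -> str:
    """Up, down, or flat, ignoring changes smaller than the tolerance."""
    if recent - earlier > tolerance:
        return "↑"
    if earlier - recent > tolerance:
        return "↓"
    return "→"


def performance_summary(records: list, window: int = 5) -> str:
    """Render the block, or an empty string when no metrics exist."""
    if not records:
        return ""
    recent = records[-window:]
    earlier = records[:-window] or recent  # too little history: compare with itself
    recent_rate = success_rate(recent)
    arrow = trend_arrow(success_rate(earlier), recent_rate)
    avg_quality = sum(row["quality_score"] for row in recent) / len(recent)
    return "\n".join([
        "## Performance Summary",
        f"- Recent success rate: {recent_rate:.0%} {arrow}",
        f"- Average quality: {avg_quality:.2f}",
    ])


# Five failures followed by five successes: the recent window has improved.
history = (
    [{"success": False, "quality_score": 0.5}] * 5
    + [{"success": True, "quality_score": 0.5}] * 5
)
block = performance_summary(history)
```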

---

## Part 5 — Pilot Agent and Template Conformance

Prove the loop end-to-end on one agent before fleet-wide rollout.

**Pilot agent:** `tdd-workflow` (high usage, clear success criteria in existing prompt)

### Tasks

- [ ] T17 — Add `metrics` section to `agent-tdd-workflow.md` frontmatter (primary: test-pass rate; secondary: cycle time)
- [ ] T18 — Add session-close step: invoke `kaizen-agentic metrics record tdd-workflow` with session outcome
- [ ] T19 — Document pilot in `wiki/AboutKaizenAgents.md` as reference implementation
- [ ] T20 — E2e test: two simulated tdd-workflow sessions → metrics accumulate → optimize produces recommendation

### Definition of done

- tdd-workflow is the documented reference for metrics-enabled agents
- Full loop demonstrated in e2e test: record → show → optimize → brief

---

## Part 6 — Packaging and Orientation

Close distribution and documentation gaps surfaced in gap analysis.

### Tasks

- [ ] T21 — Sync missing 4 agents into `src/kaizen_agentic/data/agents/` (coach, sys-medic, scope-analyst, optimization)
- [ ] T22 — Update `README.md` Getting Oriented to link `INTENT.md` and `wiki/` (SCOPE.md already updated)
- [ ] T23 — Update `.claude/rules/architecture.md` agent table (21 agents, meta category, sys-medic, coach)
- [ ] T24 — CHANGELOG.md entry for metrics convention and CLI

### Definition of done

- `pip install` / packaged data includes all 21 agents
- README orientation path matches SCOPE.md
- architecture.md agent count accurate

---

## Sequencing

```
Part 1 (T01–T04) ──→ Part 2 (T05–T08) ──→ Part 3 (T09–T13)
                                                   │
                     Part 4 (T14–T16) ←────────────┘
                             │
                     Part 5 (T17–T20) ──→ Part 6 (T21–T24)
```

Parts 1–2 are blocking. Part 3 depends on storage + CLI. Parts 4–5 can overlap
once Part 3 factory exists. Part 6 can run in parallel except T21 (needs final
agent consolidation from T11).

Estimated effort: 4–6 sessions.

---

## Out of Scope (this workplan)

- Full `wiki/KaizenAgentTemplate.md` conformance for all 21 agents (future workplan)
- KaizenGuidance codemod pipeline (`wiki/KaizenGuidance.md`)
- Scheduled/automated optimizer runs (cron, activity-core integration) — convention only
- WP-0001 CI/CD, PyPI publication, cross-platform testing
- ML-based pattern detection (pandas/sklearn in wiki spec) — simple statistics first

---

## Success Criteria

A reader of `INTENT.md` can point to this repo and say:

1. Agents **can** record measurable per-execution outcomes in a standard location.
2. The optimization loop **does** read real project data and produce recommendations.
3. Coach orientation **includes** performance context, not only qualitative memory.
4. At least one agent (tdd-workflow) demonstrates the full measure → analyse → orient cycle.

---

## State Hub Task IDs

| Code | UUID |
|------|------|
| T01 | 4e7b0fd2-38c0-46aa-84a7-bb18366b8c7c |
| T02 | eeaa99c7-d7a7-403b-a013-364cba45a663 |
| T03 | 247c097f-de89-4383-930c-35ee66de9b36 |
| T04 | 3aa14026-6ee3-4384-b409-11300c1302f0 |
| T05 | 6b505d29-7d2e-44a2-a4b7-1fe82884390c |
| T06 | 84f2a357-f2dd-4fc7-96b6-a4e80d5467a7 |
| T07 | 8e9ee64b-b7c4-4dff-ac6e-988fd47ef95d |
| T08 | 4c41e0db-d5d8-4a1b-b346-06ad004edf4a |
| T09 | 0b374439-6eca-4754-8e15-2a7eece0cd27 |
| T10 | db87a09b-0252-495c-a771-a43b4b98f820 |
| T11 | 73cb7d73-6fc6-42a9-97aa-d33cdf9ee363 |
| T12 | c127eca7-7394-42db-ba5e-721aef0ccb76 |
| T13 | f208dc9f-cdf7-47e3-9c03-09097e46eee9 |
| T14 | d01f969c-bbb1-4eca-a4f1-d79d5c867b35 |
| T15 | 67f791a4-fced-4986-a331-7eb4ea47fe6e |
| T16 | 1fb89b54-8bd2-40bf-9a71-04693cb9f695 |
| T17 | 1d471a7a-9a98-4805-903e-b4a2b8153717 |
| T18 | abb387f1-86ce-4b9b-a516-2d4efb6aca4c |
| T19 | 67fbc26e-a57d-4133-96e6-3d2cdbd10dc0 |
| T20 | fbdd7c8b-e122-48d9-8c8f-de9f82d025e3 |
| T21 | 9662bcec-34fe-451b-b61f-5d11b9574576 |
| T22 | 422aae43-5697-4a00-86e9-1569baf09422 |
| T23 | ba6b3411-d330-4a58-8cd0-62b4fbef8c5f |
| T24 | 748be9f3-f6ac-4f26-a844-6330268935b6 |

**Hub workstream:** `kaizen-wp-0003-measurement-loop` (`36252a45-f360-4496-bf77-17b5dfb02767`)

---

## Notes

- Retention default: 180 days (per `wiki/AgentKaizenOptimizer.md`); override via project config in a later iteration
- WP-0001 T04 (telemetry) should consume ADR-004 schema rather than inventing a parallel format
- `OptimizationLoop` threshold constants (30s execution, 0.8 success rate) are starting points; expose in config later
190
workplans/kaizen-agentic-WP-0004-ecosystem-integration.md
Normal file
@@ -0,0 +1,190 @@
---
id: KAIZEN-WP-0004
type: workplan
title: "Ecosystem Integration: Helix Forge, activity-core, and artifact-store"
domain: custodian
repo: kaizen-agentic
status: active
owner: kaizen-agentic
topic_slug: custodian
state_hub_workstream_id: 76be7294-e201-4074-91c0-6421992470fe
created: "2026-06-16"
updated: "2026-06-17"
---

# KAIZEN-WP-0004 — Ecosystem Integration: Helix Forge, activity-core, and artifact-store

**Status:** active
**Owner:** kaizen-agentic
**Repo:** kaizen-agentic
**Depends on:** KAIZEN-WP-0003 Part 3 (metrics CLI + `metrics optimize` operational)

## Goal

Compose KaizenAgentic with adjacent ecosystem repos so INTENT's measurement and
improvement vision spans **project** and **fleet** layers without duplicating
capabilities or violating repo boundaries.

Primary integrations: **agentic-resources** (Helix Forge), **activity-core**
(scheduled triggers), **artifact-store** (evidence retention). Secondary
integrations (info-tech-canon, kontextual-engine) are Part 4 stretch goals.

Reference: `wiki/EcosystemIntegration.md`, `history/2026-06-16-ecosystem-assessment.md`

---

## Part 1 — Helix Forge Correlation (agentic-resources)

Wire project metrics (ADR-004) to fleet session metrics without re-implementing
session ingestion.

### Tasks

- [ ] T01 — Document correlation contract in `agentic-resources` (cross-repo PR or shared doc link from both repos)
- [ ] T02 — Add optional `helix_session_uid` population to `metrics record` when env `HELIX_SESSION_UID` is set
- [ ] T03 — Add `kaizen-agentic metrics correlate` — lookup Helix digest summary by UID (read-only adapter stub if Helix API not ready)
- [ ] T04 — Integration test: synthetic project record with `helix_session_uid` round-trips through show/brief
- [ ] T05 — Update `wiki/EcosystemIntegration.md` with worked correlation example

### Definition of done

- Project execution records can carry Helix correlation fields per ADR-004
- Documentation is bidirectional (kaizen-agentic + agentic-resources reference each other)
- No session JSONL ingestion code in kaizen-agentic
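
The optional population described in T02 could be as small as the sketch below. The environment variable name `HELIX_SESSION_UID` and the field name `helix_session_uid` come from the task list; the helper `with_helix_uid` and its injectable `environ` argument are hypothetical conveniences for testing.

```python
import os
from typing import Mapping, Optional


def with_helix_uid(record: dict, environ: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the record, adding helix_session_uid when the env var is set."""
    env = os.environ if environ is None else environ
    uid = env.get("HELIX_SESSION_UID", "").strip()
    enriched = dict(record)  # never mutate the caller's record
    if uid:
        enriched["helix_session_uid"] = uid
    return enriched


base = {"agent": "tdd-workflow", "success": True}
linked = with_helix_uid(base, {"HELIX_SESSION_UID": "uid-123"})
unlinked = with_helix_uid(base, {})
```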

---

## Part 2 — activity-core Triggers

Define ActivityDefinitions for recurring kaizen operations.

### Tasks

- [ ] T06 — Draft ActivityDefinition: weekly `metrics optimize` on repos with `.kaizen/metrics/`
- [ ] T07 — Draft ActivityDefinition: post-install metrics scaffold validation (`memory init` check)
- [ ] T08 — Draft ActivityDefinition: success_rate below 0.8 → issue-core review task
- [ ] T09 — Document ActivityDefinition paths and activation contract in `docs/INTEGRATION_PATTERNS.md`
- [ ] T10 — Smoke test: manual activation against a test repo with populated metrics

### Definition of done

- Three ActivityDefinition markdown files committed (location per activity-core convention)
- kaizen-agentic docs describe how activity-core triggers map to CLI commands
- No scheduling code in kaizen-agentic

---

## Part 3 — artifact-store Evidence Retention

Persist optimizer outputs as registered artifact packages.

### Tasks

- [ ] T11 — Define artifact package manifest for optimizer run (`analysis.json` + `recommendations.jsonl`)
- [ ] T12 — Add `kaizen-agentic metrics publish` — register optimizer output with artifact-store API (configurable endpoint)
- [ ] T13 — Map retention class `raw-evidence` (180d) in publish manifest metadata
- [ ] T14 — Integration test with artifact-store local backend (skip if service unavailable; mark `@pytest.mark.integration`)
- [ ] T15 — Document publish workflow in `docs/agency-framework.md` metrics section

### Definition of done

- Optimizer outputs can be registered as artifact packages when artifact-store is reachable
- Retention metadata matches ADR-004 default
- Publish is optional — local-only workflows still work without artifact-store
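
A package manifest for one optimizer run (T11, T13) might be assembled as sketched here. The manifest keys, the `build_manifest` name, and the use of SHA-256 checksums are assumptions for illustration; the artifact-store API itself is not shown, and only the two file names and the `raw-evidence` / 180-day retention values come from this workplan.

```python
import hashlib
import tempfile
from pathlib import Path


def build_manifest(
    optimizer_dir: Path,
    retention_class: str = "raw-evidence",
    retention_days: int = 180,
) -> dict:
    """Describe the optimizer outputs present in a directory (hypothetical shape)."""
    files = []
    for name in ("analysis.json", "recommendations.jsonl"):
        path = optimizer_dir / name
        if path.exists():  # a run may not have produced both files yet
            files.append({
                "name": name,
                "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
                "bytes": path.stat().st_size,
            })
    return {
        "package": "optimizer-run",
        "retention_class": retention_class,
        "retention_days": retention_days,
        "files": files,
    }


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "analysis.json").write_text("{}", encoding="utf-8")
    manifest = build_manifest(root)
```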

---

## Part 4 — Canon and Knowledge (stretch)

Secondary integrations for template conformance and knowledge asset lifecycle.

### Tasks

- [ ] T16 — Map `wiki/KaizenAgentTemplate.md` sections to info-tech-canon profile outline (design doc only)
- [ ] T17 — Draft one InfoTechCanon-style agent brief for `tdd-workflow` pilot
- [ ] T18 — Spike: kontextual-engine ingestion manifest for `wiki/` directory (design note, no runtime dependency)
- [ ] T19 — Update `history/2026-06-16-ecosystem-assessment.md` with Part 4 outcomes

### Definition of done

- Design artifacts committed; no hard dependency on info-tech-canon or kontextual-engine services
- tdd-workflow brief serves as reference for fleet-wide brief rollout (future WP)

---

## Sequencing

```
WP-0003 Part 3 complete
        │
        ▼
Part 1 (T01–T05) ──→ Part 2 (T06–T10)
        │                      │
        └──────────┬───────────┘
                   ▼
           Part 3 (T11–T15)
                   │
                   ▼
           Part 4 (T16–T19) [stretch]
```

Part 1 can start once `metrics record` and `metrics optimize` exist.
Parts 2–3 can overlap. Part 4 is non-blocking.

Estimated effort: 3–5 sessions after WP-0003 Part 3.

---

## Out of Scope

- Cloning or implementing tele-mcp (assess separately)
- phase-memory graph migration (future WP)
- Full KaizenGuidance codemod pipeline
- Owning activity-core, artifact-store, or agentic-resources code

---

## Success Criteria

1. Two-layer measurement model is documented, implemented at correlation layer,
   and operable without repo merges.
2. Recurring kaizen checks can be triggered via activity-core without custom cron.
3. Optimizer evidence can be preserved in artifact-store when configured.
4. Canon/knowledge integration has a clear design path for later work.

---

## State Hub Task IDs

| Code | UUID |
|------|------|
| T01 | f365d19e-9619-4453-bebf-f1fd596b1bd1 |
| T02 | e7f47683-5957-49db-bcbd-3aa47f44a073 |
| T03 | 6ef8ba99-7d0c-44f4-835d-7a66e9d55984 |
| T04 | 9875422c-a54b-40f1-a444-6b485a9e57d6 |
| T05 | 0dc33d13-0e0b-4336-a7ad-371fc533b823 |
| T06 | dbaa5f46-f66a-4a74-b4a0-97978e47d1c3 |
| T07 | 161a264a-8f70-4e37-a854-bd5a76a0e54b |
| T08 | 3b58ad38-839c-436a-8d97-ef5a8f9beefe |
| T09 | a004b60f-4e8f-4881-b088-229ac9ab242f |
| T10 | 84866bf1-5830-470d-87a5-9786222332c2 |
| T11 | 033a19db-fbd2-411f-9d2e-779d210400d4 |
| T12 | 54517f2b-23e3-433b-a483-c59227625dbc |
| T13 | 3b378789-a761-4472-b072-a346541be239 |
| T14 | a3566713-db58-4519-b9c4-5003421c1f1e |
| T15 | 5d8255aa-fd7a-4fe6-bce2-3a176f954c7f |
| T16 | 852c9cbf-0b0c-4f23-8594-905ca280c268 |
| T17 | 62e05097-9033-401d-bbe0-d5d773da50fe |
| T18 | cd6962c7-aaed-4d7d-81de-37c0e3ed715e |
| T19 | 2c1f66f5-e6ab-4e19-88ca-818acb15a706 |

**Hub workstream:** `kaizen-wp-0004-ecosystem-integration` (`76be7294-e201-4074-91c0-6421992470fe`)

---

## Notes

- ADR-004 Helix Forge correlation section is the authoritative field mapping
- WP-0001 T04 (telemetry) should evaluate tele-mcp as adapter candidate
- activity-core ActivityDefinitions live in activity-core repo per ACT-ADR-002/003;
  kaizen-agentic commits reference copies or links under `docs/integrations/`