WP-0003 Part 5: tdd-workflow metrics pilot
Add metrics frontmatter and session-close recording to tdd-workflow, document the reference implementation in wiki/AboutKaizenAgents.md, and add an e2e test covering record → show → optimize → brief.
@@ -1,24 +1,76 @@
AboutKaizenAgents
# About Kaizen Agents

*Basic concepts of Kaizen Agents*
Basic concepts of Kaizen Agents.

All Kaizen Agents follow the KaizenAgentTemplateDefinition
All Kaizen Agents follow the [KaizenAgentTemplate](KaizenAgentTemplate.md) definition.
That template provides a comprehensive structure for defining Kaizen Agent subagents.

This template provides a comprehensive structure for defining KaizenAgent subagents.
Key sections:

The key sections are:
- **Specification** — declarative outcomes rather than implementation steps
- **Idempotency design** — detect and handle already-completed work
- **Metrics** — measurable success criteria from day one
- **Testing** — scenarios that feed the optimization loop
- **Evolution tracking** — improvement history and performance trends

Specification: Focuses on declarative outcomes rather than implementation steps, making agents more maintainable and testable.
The template enforces separation of concerns, testability, and measurability while
keeping agent definitions consistent across the fleet.

Idempotency Design: Forces you to think upfront about how the agent will detect and handle already-completed work.
---

Metrics: Ensures every agent has measurable success criteria from day one.
## Metrics-enabled pilot: `tdd-workflow`

Testing: Built-in test scenarios that can be automated as part of the optimization loop.
`tdd-workflow` is the reference implementation for project-scoped metrics (WP-0003).
Use it as a template when adding metrics to other agents.

Evolution Tracking: Maintains a history of improvements and provides hooks for the KaizenAgent to analyze performance trends.
### What is measured

The template enforces our design principles - separation of concerns, testability, and measurability - while providing enough structure to ensure consistency across different coding subagents.
| Metric | Role | How |
|--------|------|-----|
| `test_pass_rate` | Primary | Passing tests ÷ total tests at PUBLISH (target: 1.0) |
| `cycle_time_s` | Secondary | Session duration (`execution_time_s` in ADR-004) |

Definitions live in the agent frontmatter (`agents/agent-tdd-workflow.md`).
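
As a rough illustration of the two metrics in the table, the sketch below computes them from raw test counts and session timestamps. The function names, and the choice to report 0.0 when no tests ran, are assumptions made for this example and are not taken from the agent spec.

```python
def compute_pass_rate(passed: int, total: int) -> float:
    """Passing tests divided by total tests.

    Returns 0.0 when no tests ran (an assumed convention, not from the spec).
    """
    if passed < 0 or total < 0 or passed > total:
        raise ValueError("expected 0 <= passed <= total")
    return passed / total if total else 0.0


def compute_cycle_time_s(started_at: float, finished_at: float) -> float:
    """Session duration in seconds, given two epoch timestamps."""
    if finished_at < started_at:
        raise ValueError("finished_at must not precede started_at")
    return finished_at - started_at
```

A session that publishes with every test passing yields a pass rate of 1.0, which matches the target stated for the primary metric.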

xxx
### Where data lives

```
<project>/.kaizen/metrics/tdd-workflow/
  executions.jsonl    # append-only per-session records
  summary.json        # rolling aggregates (auto-generated)
```
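
To make the relationship between the two files concrete, here is a minimal sketch of an append-only log with a summary recomputed from it. Only the file names and the `execution_time_s` field come from this page; the `success` field and every key in the summary are hypothetical, and the real `summary.json` schema is whatever ADR-004 and the CLI define.

```python
import json
from pathlib import Path


def append_execution(metrics_dir: Path, record: dict) -> None:
    """Append one session record to executions.jsonl as a single JSON line."""
    metrics_dir.mkdir(parents=True, exist_ok=True)
    with (metrics_dir / "executions.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def rebuild_summary(metrics_dir: Path) -> dict:
    """Recompute aggregates from the full log and write summary.json.

    The summary keys below are hypothetical, chosen for illustration only.
    """
    text = (metrics_dir / "executions.jsonl").read_text(encoding="utf-8")
    records = [json.loads(line) for line in text.splitlines() if line.strip()]
    count = len(records)
    successes = sum(1 for r in records if r["success"])
    total_time = sum(r["execution_time_s"] for r in records)
    summary = {
        "count": count,
        "success_rate": successes / count if count else 0.0,
        "avg_execution_time_s": total_time / count if count else 0.0,
    }
    (metrics_dir / "summary.json").write_text(
        json.dumps(summary, indent=2), encoding="utf-8"
    )
    return summary
```

Because the summary is derived entirely from the log, it can be deleted and regenerated at any time, which is one reason to keep the per-session file append-only.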
Scaffolded by `kaizen-agentic memory init tdd-workflow` alongside
`.kaizen/agents/tdd-workflow/memory.md`.

### Session-close loop

At the end of each TDD8 session:

1. Update qualitative memory (`## Session Log`, findings, watch points).
2. Record quantitative outcome:
```bash
kaizen-agentic metrics record tdd-workflow --success --time <seconds> --quality <0.0-1.0>
```
Or pass a full ADR-004 record with `primary_metric` via `--json` (see agent spec).
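
If the recording step is scripted, the command line can be assembled and validated before it is run. The helper below is a hypothetical sketch: `build_record_command` is not part of `kaizen-agentic`, it uses only the flags shown above, and it covers only the successful-session form because this page does not show how a failed session is recorded.

```python
import shlex


def build_record_command(agent: str, seconds: int, quality: float) -> list[str]:
    """Build the argv for a successful-session `metrics record` call."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality must be between 0.0 and 1.0")
    return [
        "kaizen-agentic", "metrics", "record", agent,
        "--success",
        "--time", str(seconds),
        "--quality", f"{quality:.2f}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_record_command("tdd-workflow", 742, 0.95)))
```

Returning a list rather than a string means the result can be passed to `subprocess.run` without shell quoting concerns.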

### Analysis and orientation

| Command | Purpose |
|---------|---------|
| `kaizen-agentic metrics show tdd-workflow` | Summary + recent executions |
| `kaizen-agentic metrics optimize tdd-workflow` | Evidence-based recommendations (≥10 records) |
| `kaizen-agentic memory brief tdd-workflow` | Qualitative memory + `## Performance Summary` |

Fleet-level session analytics remain in **agentic-resources** (Helix Forge); project
metrics stay in `.kaizen/metrics/` per [ADR-004](../docs/adr/ADR-004-project-metrics-convention.md)
and [EcosystemIntegration](EcosystemIntegration.md).
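
The table notes that `metrics optimize` needs at least 10 records. A script can check that precondition against the project-local log before calling the command, as in the sketch below. The helper names are hypothetical, and counting non-empty lines is an assumption about how records are tallied; the CLI may apply its own rules.

```python
from pathlib import Path

# Threshold taken from the "≥10 records" note for `metrics optimize`.
MIN_RECORDS_FOR_OPTIMIZE = 10


def count_executions(log_path: Path) -> int:
    """Count non-empty lines in executions.jsonl; 0 if the file is missing."""
    if not log_path.exists():
        return 0
    with log_path.open(encoding="utf-8") as fh:
        return sum(1 for line in fh if line.strip())


def ready_for_optimize(project_root: Path, agent: str = "tdd-workflow") -> bool:
    """True once the project-local log holds enough records."""
    log_path = project_root / ".kaizen" / "metrics" / agent / "executions.jsonl"
    return count_executions(log_path) >= MIN_RECORDS_FOR_OPTIMIZE
```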

### Adopting metrics on another agent

1. Add a `metrics:` block to frontmatter (primary + secondary + collection).
2. Copy the session-close `metrics record` step from `agent-tdd-workflow.md`.
3. Run `kaizen-agentic memory init <agent>` to scaffold storage.
4. Verify with `metrics show` after one session.