feat(agency): complete WP-0002 Part 3 — E2E tests, docs, sys-medic cross-refs, bugfix

T25: add tests/test_e2e_agency_framework.py — 16 E2E tests covering the full
memory lifecycle (init, show, brief, clear) and protocol list/show commands.

T26: replace agency-framework.md protocols placeholder with full documentation —
location convention, frontmatter schema, CLI reference, sys-medic memory
extensions, and protocols table.

T27: add Related Documents footer to agent-sys-medic.md linking to the k3s
protocol runbook, ADR-002, ADR-003, and agency-framework.md.

Fix: rename the CLI command function list() → list_agents() so it no longer
shadows Python's built-in list() at module scope. The command is still
registered as `list` via @cli.command("list"). The shadow caused memory_brief()
to invoke the agent-list command instead of building a list from dict keys, so
every `memory brief` invocation printed the agent list.
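
A minimal sketch of the failure mode, not the actual kaizen-agentic code. The
names build_cli_namespace, agent_names_buggy, and agent_names_fixed are
hypothetical, and a plain function stands in for the Click command object:

```python
import builtins


def build_cli_namespace():
    """Simulate a scope in which a command function is named `list`."""

    def list(*_args):  # hypothetical old command; shadows the built-in here
        return "sys-medic\ncoach"  # stands in for the printed agent list

    def agent_names_buggy(memories):
        # Meant to build a list of keys; resolves to the command above instead.
        return list(memories.keys())

    def agent_names_fixed(memories):
        # Explicitly uses the built-in. The commit gets the same effect by
        # renaming the command function so the name `list` is free again.
        return builtins.list(memories.keys())

    return agent_names_buggy, agent_names_fixed


buggy, fixed = build_cli_namespace()
print(buggy({"sys-medic": "...", "coach": "..."}))
print(fixed({"sys-medic": "...", "coach": "..."}))
```

The first call returns the stand-in agent listing rather than the dict keys,
which mirrors the symptom described above.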

All 27 WP-0002 tasks complete. Test suite: 51 passed, 1 skipped.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 00:27:39 +00:00
parent 53dfd55916
commit 07c4a70907
6 changed files with 406 additions and 23 deletions


@@ -7,3 +7,56 @@
@.claude/rules/stack-and-commands.md
@.claude/rules/architecture.md
@.claude/rules/repo-boundary.md
## Installed Agents
This project includes the following specialized agents:
### Testing
- **tdd-workflow**: Expert guidance for the TDD8 workflow methodology, specializing in the comprehensive ISSUE-TEST-RED-GREEN-REFACTOR-DOCUMENT-REFINE-PUBLISH cycle with sophisticated sidequest management and proper test organization.
Use these agents by referencing them in your Claude Code interactions.
### Documentation
- **claude-documentation**: Specialized assistant for Claude and Claude Code documentation, features, and best practices
### Meta
- **coach**: Coaching meta-agent that reads all agent memories in a project and synthesises cross-agent briefs and new-agent orientations
### Code Quality
- **code-refactoring**: Analyze code structure and quality, identify improvement opportunities, and provide actionable refactoring guidance. Use PROACTIVELY for code quality assessment and improvement.
- **datamodel-optimization**: Specialized agent that systematically analyzes, optimizes, and enhances dataclasses, models, and data structures within a codebase. Provides comprehensive datamodel improvements including convenience methods, interface consistency, code reduction, and test alignment.
- **optimization**: Meta-agent that analyzes and optimizes other Claude Code subagents based on their performance data, usage patterns, and effectiveness metrics. Use PROACTIVELY for agent ecosystem improvement.
- **tooling-optimization**: Meta-agent that analyzes and optimizes repository tooling usage to improve development efficiency
### Project Management
- **keepaChangelog**: Specialized assistant for maintaining CHANGELOG.md files following Keep a Changelog format
- **keepaContributingfile**: Specialized assistant for maintaining CONTRIBUTING.md files following Keep a Contributing-File V0.0.1 format within the Kaizen Agentic framework
- **keepaTodofile**: Specialized assistant for maintaining TODO.md files following Keep a Todofile V0.0.1 format
### Development Process
- **priority-evaluation**: Specialized assistant to help evaluate and establish priorities for issues and tasks.
- **releaseManager**: Manages software releases, version control, and publication workflows for Python packages
- **requirements-engineering**: Specialized agent designed to prevent interface compatibility issues and mock object mismatches by ensuring solid foundation planning before implementation. Based on lessons learned from Issue
- **scope-analyst**: Analyze a repository and produce/improve SCOPE.md for rapid orientation
- **wisdom-encouragement**: Provides encouraging wisdom and guidance for complex implementation tasks and challenging technical work
### Infrastructure
- **setupRepository**: Specialized assistant for setting up new Python repositories following PythonVibes best practices
- **sys-medic**: Linux/Kubernetes node health assessment agent — diagnoses process, memory, CPU, disk, network, and kubelet issues with safe, prioritized, evidence-driven guidance
### Testing
- **tdd-workflow**: Expert guidance for the TDD8 workflow methodology, specializing in the comprehensive ISSUE-TEST-RED-GREEN-REFACTOR-DOCUMENT-REFINE-PUBLISH cycle with sophisticated sidequest management and proper test organization.
- **test-maintenance**: Specialized agent for analyzing and fixing failing tests in the project
- **testing-efficiency**: Specialized agent designed to optimize TDD8 workflow test execution, resolve pytest reliability issues, and enhance overall testing efficiency for red-green iterations. Focuses on smart test selection, parallel execution, and agent integration patterns.
Use these agents by referencing them in your Claude Code interactions.


@@ -355,3 +355,12 @@ sys-medic's memory file (`.kaizen/agents/sys-medic/memory.md`) extends the base
```
These sections are maintained by the session-close protocol above.
---
# Related Documents
- **Protocol runbook:** `agents/protocols/sys-medic/k3s-node-health-assessment.md`
- **Memory convention:** `docs/adr/ADR-002-project-memory-convention.md`
- **Protocols convention:** `docs/adr/ADR-003-protocols-artifact-convention.md`
- **Agency framework:** `docs/agency-framework.md`


@@ -146,17 +146,76 @@ kaizen-agentic memory show project-management
 ---
-## Protocols (Part 3 — coming in WP-0002 Part 3)
-A future extension adds **protocol runbooks** — structured, human-readable procedural checklists that agents can reference during structured assessments:
+## Protocol Runbooks
+Agents can reference **protocol runbooks** — structured, human-readable procedural checklists for structured assessments or remediation work. Protocols are distinct from agent prompts:
- **Agent prompts** (`agents/agent-*.md`) shape AI behaviour
- **Protocols** (`agents/protocols/<agent>/<slug>.md`) are procedural documents for humans and agents to execute
### Location Convention
 ```
-agents/protocols/<agent-name>/<slug>.md
+agents/protocols/
+  <agent-name>/
+    <slug>.md ← one file per protocol
 ```
-The sys-medic k3s health assessment protocol is the first planned example. CLI commands `kaizen-agentic protocols list` and `protocols show` will expose them.
-See [WP-0002](../workplans/kaizen-agentic-WP-0002-agency-framework.md) Part 3 for the full specification.
+### Protocol Frontmatter
+Each protocol file has a YAML frontmatter block:
```yaml
---
agent: <agent-name>
slug: <slug>
title: <human-readable title>
version: 1.0.0
last_updated: "<ISO date>"
---
```
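As a rough illustration of how a tool might read this block, the sketch below parses simple `key: value` frontmatter with the standard library only. The function name `parse_frontmatter`, the `REQUIRED_FIELDS` tuple, and the sample title and date are assumptions for illustration; the real CLI may well use a YAML parser instead.

```python
REQUIRED_FIELDS = ("agent", "slug", "title", "version", "last_updated")


def parse_frontmatter(text: str) -> dict[str, str]:
    """Return the key/value pairs between the leading '---' delimiters."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter block at the top of the file
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    return {}  # opening delimiter without a closing one


# Illustrative document; the title and date are made up for this sketch.
example = """---
agent: sys-medic
slug: k3s-node-health-assessment
title: k3s Node Health Assessment
version: 1.0.0
last_updated: "2026-03-18"
---
# Procedure body goes here
"""

fields = parse_frontmatter(example)
missing = [name for name in REQUIRED_FIELDS if name not in fields]
print(fields["slug"], missing)
```

This handles only flat scalar fields, which is all the schema above requires; nested YAML would need a real parser.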
### Referencing Protocols from Agents
Agents with `memory: enabled` check for relevant protocols at session start and reference them in their session-start protocol block. For example, sys-medic's session-start protocol instructs:
> *"If a structured assessment is requested, check for `agents/protocols/sys-medic/k3s-node-health-assessment.md` and use it as your procedure."*
### CLI Reference
```bash
kaizen-agentic protocols list # List all protocols
kaizen-agentic protocols list sys-medic # Filter by agent
kaizen-agentic protocols show sys-medic k3s-node-health-assessment
```
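The listing and filtering behaviour follows directly from the location convention. A minimal sketch of that discovery step is shown here, assuming protocols are found purely by globbing `<agent>/<slug>.md`; `find_protocols` is a hypothetical helper, not the CLI's actual implementation.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_protocols(root: Path, agent: Optional[str] = None) -> list[tuple[str, str]]:
    """Return sorted (agent, slug) pairs for files at root/<agent>/<slug>.md."""
    pattern = f"{agent}/*.md" if agent else "*/*.md"
    return sorted((path.parent.name, path.stem) for path in root.glob(pattern))


# Build a throwaway tree that mirrors the convention, then query it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "agents" / "protocols"
    (root / "sys-medic").mkdir(parents=True)
    (root / "sys-medic" / "k3s-node-health-assessment.md").write_text("stub")
    (root / "README.md").write_text("stub")  # top-level files are not protocols

    all_found = find_protocols(root)
    unknown = find_protocols(root, "nonexistent-agent")

print(all_found)
print(unknown)
```

Because the pattern requires exactly one directory level, a `README.md` directly under `agents/protocols/` is skipped, and an unknown agent simply yields an empty result rather than an error.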
### sys-medic Memory and Protocols Integration
sys-medic extends the base memory template with three additional sections for operational continuity across sessions:
```markdown
## Node Profiles
<!-- Per-node operational baseline established over sessions -->
<!-- hostname | typical load | known quirks | last assessment date -->
## Recurring Findings
<!-- Issues seen more than once: pattern · first seen · frequency -->
## Cleared Issues
<!-- Issues that were resolved: what was done · when · outcome -->
```
These sections are maintained automatically by the sys-medic session-close protocol.
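One way to sanity-check a memory file against these extensions is sketched below. `missing_sections` and `SYS_MEDIC_SECTIONS` are hypothetical names used only for this example, and the check is a plain heading match rather than anything the framework is known to run.

```python
SYS_MEDIC_SECTIONS = ("## Node Profiles", "## Recurring Findings", "## Cleared Issues")


def missing_sections(memory_text, required=SYS_MEDIC_SECTIONS):
    """Return the required headings that do not appear on a line of their own."""
    headings = {line.strip() for line in memory_text.splitlines()}
    return [heading for heading in required if heading not in headings]


sample = "## Node Profiles\ntegpi-01 | load avg ~0.6 at idle\n## Session Log\n"
print(missing_sections(sample))
```

For the sample above, the two headings that are absent are reported in the order they are declared.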
The **k3s Node Health Assessment** (`agents/protocols/sys-medic/k3s-node-health-assessment.md`) is the first protocol runbook — a step-by-step procedure covering OS baseline, process hygiene, memory, CPU, disk, network, Kubernetes node state, and k3s runtime health.
### Available Protocols
| Agent | Protocol | Description |
|-------|----------|-------------|
| sys-medic | [k3s-node-health-assessment](../agents/protocols/sys-medic/k3s-node-health-assessment.md) | Structured k3s node health check |
See [ADR-003: Protocols Artifact Convention](adr/ADR-003-protocols-artifact-convention.md) for the full specification.
---


@@ -118,14 +118,14 @@ def cli():
     pass


-@cli.command()
+@cli.command("list")
 @click.option(
     "--category",
     type=click.Choice([c.value for c in AgentCategory]),
     help="Filter by category",
 )
 @click.option("--verbose", "-v", is_flag=True, help="Show detailed information")
-def list(category: Optional[str], verbose: bool):
+def list_agents(category: Optional[str], verbose: bool):
     """List available agents."""
     registry = _get_registry()


@@ -0,0 +1,262 @@
"""
End-to-end tests for the agency framework: memory lifecycle and coach orientation.
Tests the full workflow:
1. memory init — scaffold a memory file in a test project
2. Populate memory with realistic content (simulating sessions)
3. memory show — verify content is readable
4. memory brief — verify orientation brief includes own memory and cross-agent context
5. protocols list / show — verify protocol discovery works
6. memory clear — verify wipe works
"""
import textwrap
from pathlib import Path
import pytest
from click.testing import CliRunner
from kaizen_agentic.cli import cli
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _sys_medic_memory() -> str:
    """Realistic sys-medic memory after two simulated sessions."""
    return textwrap.dedent("""\
        ---
        agent: sys-medic
        project: test-cluster
        last_updated: 2026-03-18
        session_count: 2
        ---
        ## Project Context
        k3s single-node cluster on an ARM64 host (tegpi-01).
        No external load balancer. Traefik ingress. Longhorn storage.
        ## Accumulated Findings
        - kubelet log rotation was disabled; logs grew to 2.1 GB
        - containerd image GC threshold was set too high (98%)
        ## What Worked
        - `journalctl --vacuum-size=500M` recovered ~1.8 GB without restart
        - Lowering GC threshold to 80% in containerd config resolved disk pressure
        ## Watch Points
        - inotify watch limit hits ceiling under heavy Longhorn load
        - node has only 4 GB RAM; memory pressure risk during backup windows
        ## Open Threads
        - Check whether kube-system namespace daemonsets have resource limits set
        ## Node Profiles
        tegpi-01 | load avg ~0.6 at idle | inotify-limited under load | 2026-03-18
        ## Recurring Findings
        - kubelet log growth · first seen 2026-03-10 · 2 occurrences
        ## Cleared Issues
        - containerd GC disk pressure · adjusted config 2026-03-18 · resolved
        ## Session Log
        2026-03-10 · tegpi-01 initial assessment · found log bloat + GC issue · recommendations documented
        2026-03-18 · tegpi-01 follow-up · verified GC fix; inotify limit noted · watch
    """)


def _project_management_memory() -> str:
    """Minimal project-management agent memory."""
    return textwrap.dedent("""\
        ---
        agent: project-management
        project: test-cluster
        last_updated: 2026-03-15
        session_count: 1
        ---
        ## Project Context
        Operational runbook project for the k3s home cluster.
        ## Accumulated Findings
        - Infra tasks are better tracked in Gitea issues than in TODO files
        ## Session Log
        2026-03-15 · initial planning session · task structure agreed
    """)
# ---------------------------------------------------------------------------
# Fixtures
# ---------------------------------------------------------------------------
@pytest.fixture
def project(tmp_path):
    """A temporary 'project' directory with a name."""
    p = tmp_path / "test-cluster"
    p.mkdir()
    return p
# ---------------------------------------------------------------------------
# Tests
# ---------------------------------------------------------------------------
class TestMemoryInit:
    def test_init_creates_file(self, project):
        runner = CliRunner()
        result = runner.invoke(cli, ["memory", "init", "sys-medic", "--target", str(project)])
        assert result.exit_code == 0, result.output
        assert "Initialized memory" in result.output
        memory_file = project / ".kaizen" / "agents" / "sys-medic" / "memory.md"
        assert memory_file.exists()

    def test_init_file_content_has_required_sections(self, project):
        runner = CliRunner()
        runner.invoke(cli, ["memory", "init", "sys-medic", "--target", str(project)])
        memory_file = project / ".kaizen" / "agents" / "sys-medic" / "memory.md"
        content = memory_file.read_text()
        assert "agent: sys-medic" in content
        assert "project: test-cluster" in content
        assert "session_count: 0" in content
        assert "## Project Context" in content
        assert "## Accumulated Findings" in content
        assert "## What Worked" in content
        assert "## Watch Points" in content
        assert "## Open Threads" in content
        assert "## Session Log" in content

    def test_init_idempotent(self, project):
        runner = CliRunner()
        runner.invoke(cli, ["memory", "init", "sys-medic", "--target", str(project)])
        result = runner.invoke(cli, ["memory", "init", "sys-medic", "--target", str(project)])
        assert result.exit_code == 0
        assert "already exists" in result.output


class TestMemoryShow:
    def test_show_returns_content(self, project):
        memory_file = project / ".kaizen" / "agents" / "sys-medic" / "memory.md"
        memory_file.parent.mkdir(parents=True, exist_ok=True)
        memory_file.write_text(_sys_medic_memory())
        runner = CliRunner()
        result = runner.invoke(cli, ["memory", "show", "sys-medic", "--target", str(project)])
        assert result.exit_code == 0
        assert "Node Profiles" in result.output
        assert "tegpi-01" in result.output

    def test_show_missing_prints_guidance(self, project):
        runner = CliRunner()
        result = runner.invoke(cli, ["memory", "show", "sys-medic", "--target", str(project)])
        assert result.exit_code == 0
        assert "No memory found" in result.output
        assert "memory init" in result.output


class TestMemoryBrief:
    def _populate(self, project):
        """Write both agent memories into the project."""
        sm_dir = project / ".kaizen" / "agents" / "sys-medic"
        sm_dir.mkdir(parents=True, exist_ok=True)
        (sm_dir / "memory.md").write_text(_sys_medic_memory())
        pm_dir = project / ".kaizen" / "agents" / "project-management"
        pm_dir.mkdir(parents=True, exist_ok=True)
        (pm_dir / "memory.md").write_text(_project_management_memory())

    def test_brief_includes_own_memory(self, project):
        self._populate(project)
        runner = CliRunner()
        result = runner.invoke(cli, ["memory", "brief", "sys-medic", "--target", str(project)])
        assert result.exit_code == 0
        assert "Orientation Brief for: sys-medic" in result.output
        assert "Your Memory" in result.output
        assert "tegpi-01" in result.output  # content from sys-medic memory

    def test_brief_includes_cross_agent_context(self, project):
        self._populate(project)
        runner = CliRunner()
        result = runner.invoke(cli, ["memory", "brief", "sys-medic", "--target", str(project)])
        assert result.exit_code == 0
        assert "Context From Other Agents" in result.output
        assert "project-management" in result.output

    def test_brief_coach_tip_present(self, project):
        self._populate(project)
        runner = CliRunner()
        result = runner.invoke(cli, ["memory", "brief", "sys-medic", "--target", str(project)])
        assert result.exit_code == 0
        assert "agent-coach" in result.output

    def test_brief_no_memory_gives_guidance(self, project):
        runner = CliRunner()
        result = runner.invoke(cli, ["memory", "brief", "sys-medic", "--target", str(project)])
        assert result.exit_code == 0
        assert "No agent memory files found" in result.output

    def test_brief_raw_flag_skips_header(self, project):
        self._populate(project)
        runner = CliRunner()
        result = runner.invoke(cli, ["memory", "brief", "sys-medic", "--target", str(project), "--raw"])
        assert result.exit_code == 0
        assert "=== sys-medic ===" in result.output
        # Raw mode should not include the orientation header
        assert "Orientation Brief for:" not in result.output


class TestMemoryClear:
    def test_clear_removes_file(self, project):
        memory_file = project / ".kaizen" / "agents" / "sys-medic" / "memory.md"
        memory_file.parent.mkdir(parents=True, exist_ok=True)
        memory_file.write_text(_sys_medic_memory())
        runner = CliRunner()
        result = runner.invoke(
            cli, ["memory", "clear", "sys-medic", "--target", str(project)], input="y\n"
        )
        assert result.exit_code == 0
        assert not memory_file.exists()

    def test_clear_missing_is_graceful(self, project):
        runner = CliRunner()
        result = runner.invoke(
            cli, ["memory", "clear", "sys-medic", "--target", str(project)], input="y\n"
        )
        assert result.exit_code == 0
        assert "nothing to clear" in result.output


class TestProtocolsCommand:
    def test_protocols_list_finds_sys_medic(self):
        """Protocols list against the real agents dir should include sys-medic k3s protocol."""
        runner = CliRunner()
        result = runner.invoke(cli, ["protocols", "list"])
        assert result.exit_code == 0
        assert "sys-medic" in result.output
        # The original .replace("-", "-") was a no-op, so it is dropped here.
        assert "k3s-node-health-assessment" in result.output

    def test_protocols_list_filtered_by_agent(self):
        runner = CliRunner()
        result = runner.invoke(cli, ["protocols", "list", "sys-medic"])
        assert result.exit_code == 0
        assert "k3s" in result.output.lower()

    def test_protocols_show_outputs_content(self):
        runner = CliRunner()
        result = runner.invoke(cli, ["protocols", "show", "sys-medic", "k3s-node-health-assessment"])
        assert result.exit_code == 0
        # Protocol should contain key structural sections
        assert "k3s" in result.output.lower()
        assert "Prerequisites" in result.output or "Scope" in result.output

    def test_protocols_list_unknown_agent_no_crash(self):
        runner = CliRunner()
        result = runner.invoke(cli, ["protocols", "list", "nonexistent-agent"])
        assert result.exit_code == 0
        assert "No protocols found" in result.output


@@ -151,14 +151,14 @@ kaizen-agentic memory clear <agent> # Wipe memory (with confirmation)
 `memory: enabled|disabled` field (default: enabled)
 **Coaching meta-agent**
-- [ ] T12 — Write `agents/agent-coach.md` definition
-- [ ] T13 — Wire `kaizen-agentic memory brief <agent>` to invoke coach logic
-- [ ] T14 — Add coach to agent registry and validate
+- [x] T12 — Write `agents/agent-coach.md` definition
+- [x] T13 — Wire `kaizen-agentic memory brief <agent>` to invoke coach logic
+- [x] T14 — Add coach to agent registry and validate
 **Documentation**
-- [ ] T15 — Write `docs/agency-framework.md` explaining the memory model, coach
+- [x] T15 — Write `docs/agency-framework.md` explaining the memory model, coach
 agent, and deployment lifecycle
-- [ ] T16 — Update README to reflect the agency positioning
+- [x] T16 — Update README to reflect the agency positioning
 ### Definition of done
@@ -211,30 +211,30 @@ sys-medic's memory file gains an additional section beyond the base template:
 ### Tasks
 **Protocols convention**
-- [ ] T17 — Write ADR: protocols artifact convention (location, structure, lifecycle)
-- [ ] T18 — Create `agents/protocols/` directory with `README.md` explaining the
+- [x] T17 — Write ADR: protocols artifact convention (location, structure, lifecycle)
+- [x] T18 — Create `agents/protocols/` directory with `README.md` explaining the
 convention
-- [ ] T19 — Move/adapt `sys-medic` k3s health assessment protocol into
+- [x] T19 — Move/adapt `sys-medic` k3s health assessment protocol into
 `agents/protocols/sys-medic/k3s-node-health-assessment.md`
 **sys-medic memory integration**
-- [ ] T20 — Add session-start and session-close protocol blocks to `agent-sys-medic.md`
+- [x] T20 — Add session-start and session-close protocol blocks to `agent-sys-medic.md`
 (extending the base protocol from Part 2 with the node-profile extensions)
-- [ ] T21 — Add `## Node Profiles`, `## Recurring Findings`, `## Cleared Issues`
+- [x] T21 — Add `## Node Profiles`, `## Recurring Findings`, `## Cleared Issues`
 extensions to sys-medic memory template
-- [ ] T22 — Update sys-medic prompt to reference its protocol runbook when performing
+- [x] T22 — Update sys-medic prompt to reference its protocol runbook when performing
 structured assessments ("use the k3s protocol if available")
 **CLI integration**
-- [ ] T23 — Add `kaizen-agentic protocols list [agent]` and
+- [x] T23 — Add `kaizen-agentic protocols list [agent]` and
 `kaizen-agentic protocols show <agent> <slug>` commands
-- [ ] T24 — Add protocol scaffolding to `kaizen-agentic memory init sys-medic`
+- [x] T24 — Add protocol scaffolding to `kaizen-agentic memory init sys-medic`
 **Validation and documentation**
-- [ ] T25 — End-to-end test: deploy sys-medic into a test project, run two simulated
+- [x] T25 — End-to-end test: deploy sys-medic into a test project, run two simulated
 sessions, verify memory accumulates and coach produces a useful brief
-- [ ] T26 — Update `docs/agency-framework.md` with protocols section
-- [ ] T27 — Update sys-medic agent doc with memory and protocol references
+- [x] T26 — Update `docs/agency-framework.md` with protocols section
+- [x] T27 — Update sys-medic agent doc with memory and protocol references
 ### Definition of done