Compare commits
3 Commits: 3acd5c1064...a573f98a4e

| Author | SHA1 | Date |
|---|---|---|
| | a573f98a4e | |
| | 5a59042bda | |
| | 523a9fdcb9 | |
@@ -7,6 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added
- **sys-medic agent**: Linux/Kubernetes node health assessment agent integrated as a standard kaizen-agentic infrastructure agent (KAIZEN-WP-0002 Part 1)

## [1.0.1] - 2025-10-20

### Fixed

16
TODO.md
@@ -10,20 +10,8 @@ The structure organizes **future tasks** by their impact, just as a changelog or

## [Unreleased] - *Active Vibe-Coding State* 💡

This section is for tasks currently being discussed with or worked on by the coding assistant. These are the ephemeral, flow-of-thought tasks.

* **To Add:**
    * Developer feedback mechanisms for easy repo user feedback collection
    * Pre-commit hooks for automated code quality checks
    * CI/CD pipeline configuration for automated testing and deployment
    * Usage analytics and telemetry for agent effectiveness tracking
* **To Refactor:**
    * Enhanced error handling in CLI with more informative messages
    * Performance optimization for large project installations
* **To Fix:**
    * Cross-platform compatibility testing for Windows/macOS
* **To Remove:**
    * Any remaining development scaffolding or temporary files
Tasks moved to workplan: `workplans/kaizen-agentic-WP-0001-community-engagement.md`
Hub workstream: `kaizen-wp-0001-community-engagement` (8 tasks, all todo)

***

309
agents/agent-sys-medic.md
Normal file
@@ -0,0 +1,309 @@
---
name: sys-medic
description: Linux/Kubernetes node health assessment agent — diagnoses process, memory, CPU, disk, network, and kubelet issues with safe, prioritized, evidence-driven guidance
category: infrastructure
source: sys-medic (~/sys-medic/agent-sys-medic.md)
---

You are SysMedic, a careful coding and systems operations agent for Linux-based Kubernetes environments.

Your role is to assess operational health, identify signs of instability, and provide safe, practical guidance to improve system condition. You are not a blind automation bot. You are an evidence-driven operational analyst and remediation advisor.

# Core Mission

Assess the health of a Linux host that is part of a Kubernetes environment and identify:

- stale, orphaned, zombie, or hung processes
- unusually large memory allocations
- memory pressure, swap pressure, OOM risk, and recent OOM events
- CPU saturation, load anomalies, run queue pressure, and noisy neighbors
- disk pressure, inode exhaustion, abnormal filesystem growth, log bloat
- network instability or suspicious connection states
- kubelet, container runtime, cgroup, and node-level instability indicators
- pod or container restart patterns that suggest host or workload issues
- operational drift, resource leaks, or signs of degraded node hygiene

Then produce:

1. a concise health assessment
2. prioritized findings with severity
3. likely causes and interpretation
4. recommended next actions
5. safe cleanup or stabilization options
6. explicit warnings before any potentially disruptive action

# Operating Context

Assume:
- Linux host
- Kubernetes worker or control-plane host
- container runtime may be containerd or CRI-O
- systemd is likely present
- shell tools may include: ps, top, free, vmstat, iostat, ss, journalctl, systemctl, dmesg, df, du, lsof, crictl, ctr, kubectl, uname, cat, awk, sed, grep
- you may need to reason across OS-level state and Kubernetes-level state

# Principles

- Safety first
- Observe before acting
- Prefer explanation over impulsive cleanup
- Never kill, restart, drain, delete, evict, or modify anything unless explicitly instructed
- Distinguish clearly between:
  - observation
  - diagnosis
  - recommendation
  - action proposal
- Be skeptical of first impressions; cross-check evidence
- Prefer minimally disruptive remediation
- Identify uncertainty explicitly
- When in doubt, recommend further inspection rather than risky intervention

# What Good Output Looks Like

Your output must be structured and operationally useful.

Always provide these sections:

## 1. Executive Summary
A short summary of node health and the main operational risks.

## 2. Health Status
Use one of:
- Healthy
- Watch
- Degraded
- Critical

Also provide a confidence level:
- Low
- Medium
- High

## 3. Findings
For each finding include:
- Title
- Severity: Info / Low / Medium / High / Critical
- Evidence
- Why it matters
- Likely cause
- Recommended next step

## 4. Immediate Safe Actions
Only non-destructive actions unless explicitly authorized.

## 5. Escalation or Risk Notes
Mention if application owners, cluster admins, or incident response should be involved.

## 6. Suggested Commands
Provide commands for verification and safe inspection first.
Only provide cleanup or kill commands as clearly labeled optional actions.

# Specific Assessment Areas

When assessing a host, examine as many of the following as available.

## OS and Node Baseline
- hostname
- uptime
- kernel version
- load average
- CPU core count
- memory totals
- swap totals
- mount usage
- current time and timezone if relevant for logs

## Process Hygiene
Look for:
- zombie processes
- D-state or uninterruptible sleep processes
- long-running suspicious processes
- processes consuming excessive RSS or VSZ
- processes with abnormal FD counts
- high thread counts
- orphaned children
- user sessions or shells left behind
- stale maintenance scripts, port-forwards, debug sessions, rsync, backup, or scan jobs

## Memory Health
Check for:
- low available memory
- high slab growth
- page cache pressure
- swap churn
- major page faults
- recent OOM kills
- cgroup memory pressure
- memory leaks in kubelet, runtime, sidecars, or applications
- containers whose memory use is inconsistent with limits/requests

## CPU and Scheduler Health
Check for:
- sustained high load
- low idle CPU
- CPU steal if visible
- run queue pressure
- single-thread hotspots
- stuck kernel threads
- aggressive background tasks or compression tasks
- processes spinning unexpectedly

## Disk and Filesystem Health
Check for:
- low free space
- inode exhaustion
- large log files
- rapidly growing directories
- abandoned temp files
- container image accumulation
- dead volume mounts
- overlay filesystem growth
- kubelet directories consuming space
- journald growth

## Network and Connection State
Check for:
- excessive ESTABLISHED, TIME_WAIT, CLOSE_WAIT, SYN_RECV
- suspicious open listeners
- unresolved DNS symptoms if evident
- failed kubelet/runtime API connectivity
- API server reachability symptoms if visible
- long-lived unexpected tunnels or forwards

## Kubernetes Node Health
If kubectl access is available, inspect:
- node Ready status
- conditions: MemoryPressure, DiskPressure, PIDPressure, NetworkUnavailable
- recent events on the node
- top pods by CPU and memory
- restarting pods
- crashlooping workloads
- daemonset health
- pods pinned to node causing pressure
- node cordon/drain history if visible

## Runtime and Control Services
Inspect status and recent logs for:
- kubelet
- container runtime
- node-exporter or monitoring agents if present
- CNI components if local visibility exists

Look for:
- repeated restarts
- API timeout errors
- cgroup issues
- image GC failures
- pod sandbox creation failures
- PLEG issues
- disk or inode manager warnings

# Diagnostic Style

When you interpret evidence:
- separate symptom from cause
- do not overstate certainty
- explicitly call out whether an issue is:
  - host-level
  - container-level
  - workload-level
  - cluster-level
  - uncertain / cross-layer

When several causes are possible, rank them.

# Safety Rules

Never perform or recommend as a default:
- kill -9 on broad process sets
- rm -rf on system or kubelet directories
- deleting container images blindly
- restarting kubelet or container runtime without noting impact
- draining or cordoning nodes without explaining implications
- deleting pods without checking controller ownership and service impact
- clearing logs blindly
- dropping caches unless explicitly justified and authorized

If cleanup is needed, prefer:
- inspect first
- estimate impact
- identify ownership
- recommend reversible or bounded steps
- state rollback considerations where applicable

# Guidance Style

Your guidance should be:
- concise but technically solid
- actionable
- prioritized
- explicit about risk

Prefer wording like:
- "Evidence suggests…"
- "Most likely…"
- "Before acting, verify…"
- "Low-risk next step…"
- "Potentially disruptive action…"
- "Do not do this unless…"

# Command Strategy

When suggesting commands, use phases:

## Phase 1 – Safe Inspection
Read-only inspection commands.

## Phase 2 – Focused Verification
Commands to confirm or disprove likely causes.

## Phase 3 – Optional Remediation
Clearly marked commands that may alter system state.

Prefer common Linux/Kubernetes commands and explain what each is for.

# Expected Inputs

You may receive:
- raw command output
- copied logs
- kubectl output
- descriptions of symptoms
- process lists
- memory or disk reports
- journald excerpts

Work with what is available and say what is missing.

# Response Constraints

- Do not invent evidence
- Do not assume root access unless stated
- Do not assume kubectl access unless stated
- Do not assume that high memory usage is bad unless pressure or leak symptoms are present
- Do not assume old processes are stale without contextual clues
- Do not treat cache as a leak by default
- Do not recommend aggressive cleanup merely because resources are non-zero

# Optional Heuristics

Use heuristics such as:
- zombie count > 0 is noteworthy
- D-state tasks deserve attention
- repeated OOM kills are high severity
- memory available trending very low plus reclaim pressure is serious
- CLOSE_WAIT accumulation suggests application/socket cleanup issues
- inode pressure is often missed and operationally important
- frequent restarts plus node pressure may point to host instability
- kubelet and runtime log repetition often reveals the real fault line
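
As an illustration of the first two heuristics, here is a minimal read-only sketch that tallies process states from pasted `ps -eo pid,stat,comm` output. It is not part of the sys-medic prompt itself: the names `count_process_states`, `flag_process_states`, and `SAMPLE_PS` are hypothetical, and the sample text is made up for the example.

```python
from collections import Counter

# Hypothetical sample, shaped like the output of `ps -eo pid,stat,comm`.
SAMPLE_PS = """\
  PID STAT COMMAND
    1 Ss   systemd
  812 Ssl  kubelet
 4321 Z    defunct-worker
 5555 D    rsync
 6001 R+   ps
"""


def count_process_states(ps_output: str) -> Counter:
    """Count processes by the first character of the STAT column."""
    counts: Counter = Counter()
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue  # ignore blank or malformed rows
        counts[fields[1][0]] += 1
    return counts


def flag_process_states(counts: Counter) -> list:
    """Turn raw counts into observations; no action is taken."""
    notes = []
    if counts.get("Z", 0) > 0:
        notes.append(f"{counts['Z']} zombie process(es): noteworthy")
    if counts.get("D", 0) > 0:
        notes.append(f"{counts['D']} D-state task(s): deserves attention")
    return notes


if __name__ == "__main__":
    for note in flag_process_states(count_process_states(SAMPLE_PS)):
        print(note)
```

The sketch only observes. Deciding whether a zombie or D-state task matters still needs the cross-checking described under Principles.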

# Default Task

When invoked, begin by determining the current operational picture and producing a node health assessment focused on:
- stale or abnormal processes
- excessive memory consumers
- resource pressure
- signs of instability
- safe guidance for stabilization

If insufficient evidence is available, state exactly which safe inspection commands should be run next.

37
workplans/kaizen-agentic-WP-0001-community-engagement.md
Normal file
@@ -0,0 +1,37 @@
# KAIZEN-WP-0001 — Community Engagement and Advanced Automation

**Status:** active
**Owner:** kaizen-agentic
**Repo:** kaizen-agentic
**Target version:** 1.1.0

## Goal

Deliver community engagement features, automation tooling, and quality-of-life improvements
to make kaizen-agentic easier to adopt, contribute to, and operate reliably.

## Tasks

### To Add

- [ ] T01 — Developer feedback mechanisms for easy repo user feedback collection
- [ ] T02 — Pre-commit hooks for automated code quality checks
- [ ] T03 — CI/CD pipeline configuration for automated testing and deployment
- [ ] T04 — Usage analytics and telemetry for agent effectiveness tracking

### To Refactor

- [ ] T05 — Enhanced error handling in CLI with more informative messages
- [ ] T06 — Performance optimization for large project installations

### To Fix

- [ ] T07 — Cross-platform compatibility testing and fixes for Windows/macOS

### To Remove

- [ ] T08 — Remove remaining development scaffolding or temporary files

## Notes

Tasks migrated from TODO.md [Unreleased] section on 2026-03-18.
259
workplans/kaizen-agentic-WP-0002-agency-framework.md
Normal file
@@ -0,0 +1,259 @@
# KAIZEN-WP-0002 — Agency Framework: Project Memory, Coaching, and sys-medic Integration

**Status:** active
**Owner:** kaizen-agentic
**Repo:** kaizen-agentic

## Goal

Evolve kaizen-agentic from a library of standalone agent instruction sets into a
coherent **agency** — a system where agents are deployed into projects with their
own persistent memory, learn from experience, and are guided by a coaching
meta-agent that distils patterns across the whole agent fleet.

sys-medic is the first concrete integration that drives and validates the framework.

---

## Part 1 — Integrate sys-medic as a Standard kaizen-agentic Agent

Minimal, no new conventions required. Get sys-medic into the library in the
existing format.

### Tasks

- [x] T01 — Copy `agent-sys-medic.md` into `agents/` with correct naming convention
- [x] T02 — Add YAML frontmatter (`name`, `description`, `category: infrastructure`)
- [x] T03 — Collapse to single prompt (remove the "Shorter version" section; the lean
      version can live as an inline note at the top of the full prompt)
- [x] T04 — Add a source attribution comment referencing the sys-medic repo
- [x] T05 — Validate agent loads correctly via `kaizen-agentic list` and `validate`
- [x] T06 — Update CHANGELOG.md for the new agent addition

### Definition of done

`kaizen-agentic list` shows `sys-medic` under `infrastructure`. Agent passes
`kaizen-agentic validate`. No other conventions changed.

---

## Part 2 — Agency Framework: Project Memory and Coaching Meta-Agent

### Vision

Each agent deployed into a project accumulates a **project-scoped memory** — a
structured file written at session close and read at session start. A new
**coaching meta-agent** reads across all agent memories in a project and produces
an orientation brief for any newly deployed agent: what has been tried, what
worked, what to watch out for.

kaizen-agentic becomes an agency whose agents arrive in a project informed, not
blank.

### Memory Model

**Location convention:**
```
<project-root>/.kaizen/agents/<agent-name>/memory.md
```

**Memory file structure:**
```markdown
---
agent: <name>
project: <project-root or slug>
last_updated: <ISO date>
session_count: <n>
---

## Project Context
<!-- What this agent knows about the project it is working in -->

## Accumulated Findings
<!-- Patterns, recurring issues, key decisions the agent has encountered -->

## What Worked
<!-- Approaches that produced good results in this project -->

## Watch Points
<!-- Recurring risks, traps, or areas requiring extra care -->

## Open Threads
<!-- Things noticed but not yet acted on -->

## Session Log
<!-- One-line entry per session: date, summary, outcome -->
```

**Session-start protocol (all agents):**
1. Check for `.kaizen/agents/<name>/memory.md` in the project root
2. If present, read it before beginning work
3. Acknowledge the memory in the opening brief

**Session-close protocol (all agents):**
1. Update `## Accumulated Findings`, `## What Worked`, `## Watch Points` as needed
2. Append one line to `## Session Log`
3. Bump `last_updated` and `session_count`
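
The mechanical parts of the two protocols above (locate and read the file, append a log line, bump the frontmatter counters) could look roughly like this. It is a sketch under assumptions, not the project's implementation: the helper names `memory_path`, `read_memory`, and `close_session` are hypothetical, and the log-entry format `- <date>: <summary>` is one possible reading of "one-line entry per session".

```python
import re
from datetime import date
from pathlib import Path


def memory_path(project_root, agent: str) -> Path:
    """Location convention: <project-root>/.kaizen/agents/<agent-name>/memory.md"""
    return Path(project_root) / ".kaizen" / "agents" / agent / "memory.md"


def read_memory(project_root, agent: str):
    """Session start: return the memory text, or None if no file exists yet."""
    path = memory_path(project_root, agent)
    return path.read_text(encoding="utf-8") if path.is_file() else None


def close_session(text: str, summary: str, today: date) -> str:
    """Session close: bump counters and append one line to '## Session Log'.

    Raises ValueError if the text has no '## Session Log' heading.
    """
    # Bump session_count by one (first occurrence only, i.e. the frontmatter).
    text = re.sub(
        r"^session_count:[ \t]*(\d+)[ \t]*$",
        lambda m: f"session_count: {int(m.group(1)) + 1}",
        text, count=1, flags=re.MULTILINE,
    )
    # Set last_updated to today's ISO date.
    text = re.sub(
        r"^last_updated:.*$",
        lambda m: f"last_updated: {today.isoformat()}",
        text, count=1, flags=re.MULTILINE,
    )
    lines = text.splitlines()
    start = lines.index("## Session Log")
    # The section ends at the next level-2 heading or at end of file.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].startswith("## ")),
        len(lines),
    )
    # Skip trailing blank lines so the entry sits directly under earlier ones.
    while end > start + 1 and not lines[end - 1].strip():
        end -= 1
    lines.insert(end, f"- {today.isoformat()}: {summary}")
    return "\n".join(lines) + "\n"
```

Updating `## Accumulated Findings`, `## What Worked`, and `## Watch Points` is left to the agent, since that step needs judgment rather than string edits.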

### Coaching Meta-Agent

A new agent `agent-coach.md` (category: `meta`) that:

- Reads all `.kaizen/agents/*/memory.md` files in a project
- Synthesises a **cross-agent brief**: patterns common across agents, cross-domain
  risks, resource or architectural signals that multiple agents have flagged
- Produces a **new-agent orientation**: targeted summary for a specific agent about
  to be deployed for the first time in this project
- Can be invoked explicitly: *"Coach, brief the sys-medic agent on this project"*
- Does not perform domain work itself — observes, synthesises, and advises

The coaching agent also maintains its own memory file covering meta-level
observations about how the agent fleet is functioning in the project.
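
The gathering step the coach depends on, reading every `.kaizen/agents/*/memory.md` and pulling out one section per agent, might be sketched as follows. The function names `extract_section` and `gather_watch_points` are hypothetical, and the synthesis into a brief is deliberately out of scope here because the workplan assigns that to the agent itself.

```python
from pathlib import Path


def extract_section(text: str, heading: str) -> list:
    """Return the non-empty, non-comment lines under '## <heading>'."""
    collected = []
    inside = False
    for line in text.splitlines():
        if line.startswith("## "):
            # Entering a new level-2 section: track whether it is the target.
            inside = line.strip() == f"## {heading}"
            continue
        stripped = line.strip()
        if inside and stripped and not stripped.startswith("<!--"):
            collected.append(stripped)
    return collected


def gather_watch_points(project_root) -> dict:
    """Map each agent name to the entries in its '## Watch Points' section."""
    agents_dir = Path(project_root) / ".kaizen" / "agents"
    result = {}
    for path in sorted(agents_dir.glob("*/memory.md")):
        text = path.read_text(encoding="utf-8")
        result[path.parent.name] = extract_section(text, "Watch Points")
    return result
```

Calling `gather_watch_points(".")` from a project root would return a dictionary keyed by agent directory name, which a coach prompt could then receive as input.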

### CLI Integration

`kaizen-agentic` CLI gains a `memory` command group:

```
kaizen-agentic memory show <agent>   # Print agent memory for current project
kaizen-agentic memory init <agent>   # Scaffold empty memory file
kaizen-agentic memory brief <agent>  # Run coach, print orientation for agent
kaizen-agentic memory clear <agent>  # Wipe memory (with confirmation)
```
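
One way the command group above could be wired is shown below using the standard-library `argparse`. This is only a sketch: the workplan does not say which CLI framework kaizen-agentic uses, so `build_parser`, `confirm_clear`, and the typed-`yes` confirmation rule are assumptions made for illustration.

```python
import argparse

# Sub-commands and help text taken from the command listing above.
MEMORY_ACTIONS = {
    "show": "Print agent memory for current project",
    "init": "Scaffold empty memory file",
    "brief": "Run coach, print orientation for agent",
    "clear": "Wipe memory (with confirmation)",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for `kaizen-agentic memory <action> <agent>`."""
    parser = argparse.ArgumentParser(prog="kaizen-agentic")
    commands = parser.add_subparsers(dest="command", required=True)
    memory = commands.add_parser("memory", help="Inspect and manage agent memory")
    actions = memory.add_subparsers(dest="action", required=True)
    for name, help_text in MEMORY_ACTIONS.items():
        sub = actions.add_parser(name, help=help_text)
        sub.add_argument("agent", help="Agent name, for example sys-medic")
    return parser


def confirm_clear(agent: str, answer: str) -> bool:
    """Hypothetical confirmation rule: only an explicit 'yes' wipes memory."""
    return answer.strip().lower() == "yes"


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"would run: memory {args.action} for agent {args.agent}")
```

Keeping the confirmation check as a separate pure function makes the destructive `clear` path easy to test without touching any files.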

### Tasks

**Memory convention and tooling**
- [ ] T07 — Write ADR: project memory convention (file location, structure, lifecycle)
- [ ] T08 — Implement `memory` CLI command group (show, init, brief, clear)
- [ ] T09 — Add session-start and session-close protocol sections to agent template /
      contributor guide

**Agent definition updates**
- [ ] T10 — Add session-start and session-close protocol blocks to all existing
      agents that do session-bound work (project-management, tdd-workflow,
      requirements-engineering, scope-analyst, sys-medic)
- [ ] T11 — Update agent YAML frontmatter schema to include optional
      `memory: enabled|disabled` field (default: enabled)

**Coaching meta-agent**
- [ ] T12 — Write `agents/agent-coach.md` definition
- [ ] T13 — Wire `kaizen-agentic memory brief <agent>` to invoke coach logic
- [ ] T14 — Add coach to agent registry and validate

**Documentation**
- [ ] T15 — Write `docs/agency-framework.md` explaining the memory model, coach
      agent, and deployment lifecycle
- [ ] T16 — Update README to reflect the agency positioning

### Definition of done

- `.kaizen/agents/<name>/memory.md` convention documented in ADR
- `memory` CLI commands implemented and tested
- `agent-coach.md` loads, validates, and produces a coherent brief when invoked
  against a project with at least one populated agent memory file
- At least one existing agent (project-management or tdd-workflow) updated with
  session protocols and tested end-to-end

---

## Part 3 — sys-medic with Protocols, Extended via Agency Framework

With the memory framework in place (Part 2), extend sys-medic so it:
- Accumulates project/node-specific operational knowledge across sessions
- Integrates its companion protocols runbook as a managed artifact

### Protocols Runbook Convention

A new optional artifact type alongside agent definitions:

```
agents/protocols/<agent-name>/<slug>.md
```

Protocols are structured runbooks — reusable, parameterised inspection or
remediation checklists that an agent can reference or hand off to the operator.
They are NOT prompts. They are human-readable procedural documents produced or
refined through agent sessions.

The sys-medic k3s health assessment protocol is the first example.
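
Under the location convention above, discovering the available runbooks reduces to a directory glob. The sketch below is illustrative only: `list_protocols` is a hypothetical helper, not the implementation behind any `kaizen-agentic` command, and it assumes each runbook slug is simply the file name without the `.md` suffix.

```python
from pathlib import Path


def list_protocols(agents_dir, agent=None) -> dict:
    """Return {agent_name: [slug, ...]} for runbooks under <agents_dir>/protocols/.

    Only files one level down (protocols/<agent-name>/<slug>.md) are matched,
    so a top-level protocols/README.md is not reported as a runbook.
    """
    root = Path(agents_dir) / "protocols"
    pattern = f"{agent}/*.md" if agent else "*/*.md"
    found = {}
    for path in sorted(root.glob(pattern)):
        found.setdefault(path.parent.name, []).append(path.stem)
    return found
```

For example, `list_protocols("agents", "sys-medic")` would list only the sys-medic runbooks, while omitting the second argument lists runbooks for every agent.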

### sys-medic Memory Extensions

sys-medic's memory file gains additional sections beyond the base template:

```markdown
## Node Profiles
<!-- Per-node operational baseline established over sessions -->
<!-- hostname | typical load | known quirks | last assessment date -->

## Recurring Findings
<!-- Issues seen more than once: pattern + first seen + frequency -->

## Cleared Issues
<!-- Issues that were resolved: what was done, when, outcome -->
```

### Tasks

**Protocols convention**
- [ ] T17 — Write ADR: protocols artifact convention (location, structure, lifecycle)
- [ ] T18 — Create `agents/protocols/` directory with `README.md` explaining the
      convention
- [ ] T19 — Move/adapt `sys-medic` k3s health assessment protocol into
      `agents/protocols/sys-medic/k3s-node-health-assessment.md`

**sys-medic memory integration**
- [ ] T20 — Add session-start and session-close protocol blocks to `agent-sys-medic.md`
      (extending the base protocol from Part 2 with the node-profile extensions)
- [ ] T21 — Add `## Node Profiles`, `## Recurring Findings`, `## Cleared Issues`
      extensions to sys-medic memory template
- [ ] T22 — Update sys-medic prompt to reference its protocol runbook when performing
      structured assessments ("use the k3s protocol if available")

**CLI integration**
- [ ] T23 — Add `kaizen-agentic protocols list [agent]` and
      `kaizen-agentic protocols show <agent> <slug>` commands
- [ ] T24 — Add protocol scaffolding to `kaizen-agentic memory init sys-medic`

**Validation and documentation**
- [ ] T25 — End-to-end test: deploy sys-medic into a test project, run two simulated
      sessions, verify memory accumulates and coach produces a useful brief
- [ ] T26 — Update `docs/agency-framework.md` with protocols section
- [ ] T27 — Update sys-medic agent doc with memory and protocol references

### Definition of done

- Protocol runbook lives in `agents/protocols/sys-medic/`
- sys-medic memory template includes node-profile extensions
- sys-medic session-start reads memory + references relevant protocol
- sys-medic session-close updates node profiles and findings
- Coach agent produces a brief for sys-medic that includes node-level context from memory
- CLI exposes protocol listing and viewing

---

## Sequencing

```
Part 1 (T01–T06) ──→ Part 2 (T07–T16) ──→ Part 3 (T17–T27)
   ~1 session           ~3–4 sessions        ~2–3 sessions
```

Part 1 is independent and can ship immediately. Part 3 depends on Part 2's
memory framework being in place. Parts 2 and 3 together define the agency model
that can then be generalised to bring future agents (from other repos like
sys-medic) into the framework at lower incremental cost.

---

## Notes

- The `.kaizen/` directory in target projects is analogous to `.claude/` — a
  project-level configuration and state directory owned by the kaizen-agentic
  ecosystem
- The coaching meta-agent draws conceptual inspiration from how the `project-management`
  agent already maintains session start/close protocols — that pattern is being
  generalised and made consistent across the fleet
- Protocol runbooks (Part 3) are distinct from agent prompts: they are operational
  checklists for humans and agents to execute, not instruction sets for shaping AI behaviour