tegwick d65bc701da feat(token-tracking): record AI token consumption per task (CUST-WP-0029)

Introduces end-to-end token consumption tracking so agent work is
visible as a cost/effort metric alongside tasks and workplans.

- Migration o2j3k4l5m6n7: token_events table with FK indexes on
  task_id, workstream_id, repo_id, created_at
- ORM model, Pydantic schemas (TokenEventCreate, TokenEventRead with
  computed tokens_total, TokenSummary)
- Router: POST /token-events/, GET /token-events/ (7 filters),
  GET /token-events/summary/ (task|workstream|repo|commit|release scope)
- MCP tools: record_token_event, get_token_summary (formatted table)
- update_task_status enriched with optional tokens_in/tokens_out
  passthrough — one call creates status update + token event
- Dashboard token-cost.md page: by-repo bar, by-workplan table,
  by-model bar, top-10 tasks by tokens
- ralph-workplan skill updated with token reporting guidance and
  per-task heuristics for estimating counts
- Tests: test_token_events.py + test_token_passthrough.py (182 pass)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-29 17:46:46 +02:00

State Hub MCP — Tool Reference Card

Quick reference for all tools and resources.

Design Boundary

The State Hub is a read model. It observes and visualises cross-domain state that originates in the projects themselves.

Two write operations are permanently sanctioned:

Use Case	Tools
Resolving Decisions	`resolve_decision()` — decisions are cross-cutting; resolution must propagate across all domains
Suggesting Next Steps	`get_next_steps()` (v0.2) — surface what is unblocked; the domain does the work

All other mutate tools are bootstrap-only: use them during First Session Protocol to give a freshly-registered project its initial workstream structure. Do not use them as a substitute for formal work definition inside the domain repo.

Query Tools (read-only, use freely)

Tool	Key Args	When to use
`get_domain_summary(domain_slug)`	`domain_slug`: e.g. `"railiance"`	Domain session start. Scoped snapshot: active workstreams, blocking decisions, last 5 events, repo SBOM status — ~10% of get_state_summary() token cost.
`get_state_summary()`	—	Cross-domain work / custodian sessions. Full snapshot: totals, all blocking decisions, all blocked tasks, all open workstreams, last 20 events. Large (~10k tokens).
`get_topic(slug)`	`slug`: e.g. `"markitect"`	Deep-dive on one topic + its workstreams + recent events.
`list_tasks(workstream_id, status?)`	`workstream_id`: UUID (required); `status?`: todo/in_progress/blocked/done/cancelled	List all tasks in a workstream. Use this to look up task UUIDs before calling `update_task_status`, or to verify which workplan tasks are already synced to the DB.
`list_blocked_tasks(workstream_id?)`	optional filter	Surface all impediments, optionally scoped to one workstream.
`list_pending_decisions(topic_id?)`	optional filter	Decisions holding up work, sorted by deadline.
`get_recent_progress(limit, since?)`	`limit` default 20; `since` ISO datetime	Reconstruct recent session history.

Sanctioned Write Tools

Tool	Key Args	Notes
`record_decision(title, ...)`	`decision_type`: made/pending; `topic_id?`; `workstream_id?`; `deadline?`	Financial/legal + pending → auto-escalated per constitution §4. At least one of topic_id/workstream_id required.
`resolve_decision(decision_id, rationale, decided_by)`	all required	Marks decision resolved, emits progress event, writes DECISIONS.md to project directory.
`add_progress_event(summary, ...)`	`event_type`: note/milestone/blocker/insight; `topic_id?`; `workstream_id?`; `task_id?`; `detail?`	Append-only log entry. Use at session end.

Bootstrap-Only Tools

Use during First Session Protocol to give a freshly-registered project its initial workstream structure. Do not use for ongoing project management — formal work structure belongs in the domain repo (workplans, requirements, milestones).

Tool	Key Args	Notes
`create_workstream(topic_id, title, ...)`	`slug?`; `owner?`; `description?`; `due_date?`	Creates workstream under a topic. Use `get_state_summary()` to find topic IDs.
`create_task(workstream_id, title, ...)`	`priority`: low/medium/high/critical; `assignee?`; `due_date?`	Creates task under a workstream.
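`update_task_status(task_id, status, ...)`	`status`: todo/in_progress/blocked/done/cancelled; `blocking_reason` required when blocked; `tokens_in?`; `tokens_out?`	Passing `tokens_in`/`tokens_out` creates the status update and a token event in one call.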
`update_workstream_status(workstream_id, status)`	`status`: active/blocked/completed/archived	Thin shortcut — use `update_workstream` for full field control.
`update_workstream(workstream_id, ...)`	`title?`; `description?`; `owner?`; `due_date?`; `repo_goal_id?`; `status?`	Patch any subset of workstream fields. Pass empty string for `repo_goal_id` to clear the link.
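
The status rule for `update_task_status` in the table above can be checked on the caller's side before the tool is invoked. The sketch below is an illustration only: `validate_task_status_update` is a hypothetical helper, not part of the State Hub MCP, and it assumes the server applies the same rule.

```python
from typing import Optional

# Status values listed for update_task_status in the table above.
TASK_STATUSES = {"todo", "in_progress", "blocked", "done", "cancelled"}


def validate_task_status_update(status: str, blocking_reason: Optional[str] = None) -> None:
    """Hypothetical pre-flight check; raises ValueError on an invalid combination."""
    if status not in TASK_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    # blocking_reason is required when the task moves to 'blocked'.
    if status == "blocked" and not (blocking_reason and blocking_reason.strip()):
        raise ValueError("blocking_reason is required when status is 'blocked'")
```

Calling the helper first lets an agent fail fast with a readable error instead of relying on the tool's own rejection.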

Human Interventions

Tasks that agents cannot complete themselves are flagged with needs_human=True. Use list_human_interventions() at session start to see Bernd's action items.

Tool	Key Args	Notes
`flag_for_human(task_id, note)`	`task_id`: UUID; `note`: action description (required)	Sets needs_human=True + intervention_note. Emits progress event.
`clear_human_flag(task_id)`	`task_id`: UUID	Clears flag after human completes the action. Emits progress event.
`list_human_interventions(workstream_id?)`	optional workstream filter	Returns all tasks with needs_human=True.

Token Consumption Tools

Record and query AI token usage at task/workstream/repo/commit/release granularity. Agents should call record_token_event (or pass tokens_in/tokens_out via update_task_status) at task completion.

Tool	Key Args	Notes
`record_token_event(tokens_in, tokens_out, ...)`	`task_id?`; `workstream_id?`; `repo_id?`; `model?`; `agent?`; `ref_type?`; `ref_id?`; `note?`; `session_id?`	POSTs to `/token-events/`. `workstream_id` auto-filled from task. Returns event id + running total.
`get_token_summary(scope, id)`	`scope`: task/workstream/repo/commit/release/session; `id`: UUID or ref string	Returns formatted table of tokens_in/out/total, event_count, by_model, by_agent.
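
To make the summary fields concrete, here is a minimal sketch of how such an aggregate could be computed from a list of token events. It is not the State Hub implementation. `summarize_token_events` and the dict-shaped events are assumptions for illustration, as is the choice to report total tokens per model and per agent in `by_model` and `by_agent`.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def summarize_token_events(events: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Aggregate events into the fields the summary table is described as showing."""
    tokens_in = tokens_out = event_count = 0
    by_model: Dict[str, int] = defaultdict(int)
    by_agent: Dict[str, int] = defaultdict(int)
    for event in events:
        total = event["tokens_in"] + event["tokens_out"]
        tokens_in += event["tokens_in"]
        tokens_out += event["tokens_out"]
        event_count += 1
        # model and agent are optional on an event; group missing values under "unknown".
        by_model[event.get("model") or "unknown"] += total
        by_agent[event.get("agent") or "unknown"] += total
    return {
        "tokens_in": tokens_in,
        "tokens_out": tokens_out,
        "tokens_total": tokens_in + tokens_out,
        "event_count": event_count,
        "by_model": dict(by_model),
        "by_agent": dict(by_agent),
    }
```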

Governance Tools

Tool	Key Args	When to use
`validate_repo_adr(repo_slug, domain_slug?)`	`repo_slug`: registered repo slug (e.g. `"the-custodian"`); `domain_slug?`: for orphan detection	Check a repo against ADR-001. Resolves the local path from the DB (uses this host's registered path). Detects missing workplans/ dir, invalid frontmatter, stale workstream ID references, and DB-only orphan workstreams. Always runs against the MCP server's copy — see Multi-Host section below.

Resources (URI-addressable, read-only)

URI	Returns
`state://summary`	Full StateSummary JSON
`state://topics`	Active topics list
`state://workstreams/{topic_slug}`	Workstreams for a topic (by slug)
`state://decisions/blocking`	All pending decisions
`state://tasks/blocked`	All blocked tasks

Domain Management Tools (v0.5)

Domains are now first-class DB entities. Use list_domains() to discover available slugs.

Tool	Key Args	Notes
`list_domains(status?)`	`status`: active/archived/all (default: active)	Discover all registered domains.
`create_domain(slug, name, description?)`	`slug`: lowercase_underscored; `name`: display name	Register a new project domain.
`rename_domain(slug, new_slug, new_name)`	all required	Renames domain and cascades to EP/TD string columns.
`archive_domain(slug)`	`slug`	Soft-delete; fails if active topics exist.
`list_domain_repos(domain_slug)`	`domain_slug`	List repos registered under a domain.
`register_repo(domain_slug, name, ...)`	`slug?`; `local_path?`; `remote_url?`	Register a git repo under a domain.
`update_repo_path(repo_slug, path, host?)`	`repo_slug`: e.g. `"marki-docx"`; `path`: absolute local path; `host`: defaults to current hostname	Register this machine's local path for a repo. Use when the same repo lives at different paths on different machines (e.g. `/home/worsch/…` vs `/home/tegwick/…`). The consistency checker prefers this over `local_path`.
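
The preference described for `update_repo_path` (a host-specific path wins over the repo's `local_path`) can be sketched as a small lookup. This is an illustration under assumptions: `resolve_repo_path`, the `host_paths` mapping, the host names, and the `/srv/repos/...` path are hypothetical, and the real consistency checker may resolve paths differently.

```python
import socket
from typing import Dict, Optional


def resolve_repo_path(
    local_path: Optional[str],
    host_paths: Dict[str, str],
    host: Optional[str] = None,
) -> Optional[str]:
    """Return the path registered for `host`, falling back to `local_path`."""
    # Like update_repo_path, default to the current hostname.
    host = host or socket.gethostname()
    return host_paths.get(host, local_path)


# Hypothetical registrations for one repo on two machines.
paths = {"host-a": "/home/tegwick/marki-docx"}
print(resolve_repo_path("/srv/repos/marki-docx", paths, host="host-a"))
print(resolve_repo_path("/srv/repos/marki-docx", paths, host="host-b"))
```

The first call prints the path registered for `host-a`; the second falls back to `local_path` because nothing is registered for `host-b`.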

Agent Inbox Tools

Inter-agent coordination via shared message board. Check inbox at session start; send messages to coordinate across Claude instances.

Agent names: use the repo slug (e.g. "marki-docx", "railiance") or "hub" for the custodian agent. Use "broadcast" as to_agent to send to all agents.

Tool	Key Args	When to use
`get_messages(to_agent?, from_agent?, unread_only?, limit?)`	`to_agent`: your agent name; `unread_only`: True recommended at session start	Check for pending coordination messages.
`send_message(from_agent, to_agent, subject, body, thread_id?)`	all except `thread_id` required	Send a coordination message to another agent (or broadcast).
`mark_message_read(message_id)`	`message_id`: UUID	Mark a message as read after acting on it.
`reply_to_message(message_id, from_agent, body)`	all required	Reply in-thread; marks original as read.
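
The addressing rules above (direct delivery by agent name, plus `"broadcast"` for everyone) can be pictured as a filter over the message board. The sketch is an assumption-laden illustration, not the server's logic: `messages_for` and the `read` field are hypothetical, and the source does not say whether `get_messages(to_agent=...)` also returns broadcast messages.

```python
from typing import Any, Dict, List


def messages_for(
    agent: str,
    messages: List[Dict[str, Any]],
    unread_only: bool = True,
) -> List[Dict[str, Any]]:
    """Return messages addressed to `agent` directly or via 'broadcast'."""
    return [
        m
        for m in messages
        if m["to_agent"] in (agent, "broadcast")
        # With unread_only set, skip messages already marked as read.
        and not (unread_only and m.get("read", False))
    ]
```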

Dashboard: http://localhost:3000/inbox

Kaizen Agents

Specialized agent personas from kaizen-agentic/agents/. Each agent is a markdown instruction set — load it and follow the instructions it contains.

Tool	Key Args	When to use
`list_kaizen_agents(category?)`	`category`: optional filter (testing/quality/process/infrastructure)	Discover all 17 available agent personas with name, description, category.
`get_kaizen_agent(name)`	`name`: e.g. `"tdd-workflow"`, `"code-refactoring"`	Load full agent instructions. Read and follow them.

Common agents:

Agent	Category	When to use
`tdd-workflow`	testing	Step-by-step TDD8 workflow for any feature
`code-refactoring`	quality	Code quality analysis and safe refactoring
`test-maintenance`	testing	Diagnose and fix failing tests
`requirements-engineering`	process	Prevent interface/mock mismatches upfront
`keepaTodofile`	process	Maintain TODO.md during work
`project-management`	process	Track status, determine next steps
`scope-analyst`	project-management	Analyze a repo and produce/improve SCOPE.md
`datamodel-optimization`	quality	Optimize dataclasses and data structures

Multi-Host & Remote Agent Usage

Three tools access the local filesystem on the MCP server machine:

Tool	Filesystem operation
`validate_repo_adr`	Runs `validate_repo_adr.py` against the server's repo checkout
`check_repo_consistency`	Runs `consistency_check.py` against the server's repo checkout
`ingest_sbom_tool`	Runs `ingest_sbom.py` against the server's lockfiles

Design boundary: these tools always execute on the machine where the MCP server runs (bnt-lap001), against the path registered for that host. A remote agent calling them gets results from the server's checkout — not from its own working copy.

Implications for remote agents (e.g. workers on COULOMBCORE)

- If your branch is ahead of the server's checkout, results reflect the server's (older) copy. Sync first: push your branch and pull it on the server, or accept the gap.
- Pure-API tools (get_state_summary, create_task, add_progress_event, etc.) work correctly from any host because they query the DB, not the filesystem.

Running filesystem scripts locally from a remote host

# From COULOMBCORE (tunnel maps remote :18000 → bnt-lap001 :8000):
python scripts/consistency_check.py --repo the-custodian --api-base http://127.0.0.1:18000
python scripts/validate_repo_adr.py /home/tegwick/the-custodian --api-base http://127.0.0.1:18000

Registering a new host path

# Via MCP tool:
update_repo_path("marki-docx", "/home/tegwick/marki-docx")   # defaults to current hostname

# Via Makefile (on the machine where the path lives):
make register-path REPO=marki-docx PATH=/home/tegwick/marki-docx

# Via API directly:
curl -X POST http://127.0.0.1:8000/repos/marki-docx/paths/ \
  -H "Content-Type: application/json" \
  -d '{"host": "your-hostname", "path": "/home/you/marki-docx"}'
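
For agents that prefer Python over curl, the same request can be built with the standard library. This mirrors the curl call above and nothing more: `build_register_path_request` is a hypothetical helper, and the response format is not documented here, so the sketch stops at constructing the request.

```python
import json
import urllib.request


def build_register_path_request(
    api_base: str, repo_slug: str, host: str, path: str
) -> urllib.request.Request:
    """Build the POST shown in the curl example (endpoint and body copied from it)."""
    body = json.dumps({"host": host, "path": path}).encode("utf-8")
    return urllib.request.Request(
        url=f"{api_base}/repos/{repo_slug}/paths/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_register_path_request(
    "http://127.0.0.1:8000", "marki-docx", "your-hostname", "/home/you/marki-docx"
)
# Sending needs a running State Hub API, so it is left commented out:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```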

Domain Slugs

Run list_domains() to get the live list. Default 6: custodian · railiance · markitect · coulomb_social · personhood · foerster_capabilities

Common Patterns

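# Session start (cross-domain / custodian sessions):
get_state_summary()

# Session start (single-domain sessions, ~10% of the token cost):
get_domain_summary(domain_slug="railiance")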

# Decision resolved in the hub UI or via tool:
resolve_decision(decision_id="<uuid>", rationale="...", decided_by="Bernd")

# Session end:
add_progress_event(
    summary="...",
    event_type="note",          # or milestone / insight / blocker
    topic_id="<uuid>",
    workstream_id="<uuid>",     # optional
    detail={"key": "value"},    # optional
)

# First Session Protocol only — bootstrap a new project:
create_workstream(topic_id="<uuid>", title="My Workstream", owner="me")
create_task(workstream_id="<uuid>", title="Do the thing", priority="high")
