Compare commits
28 Commits
36c20f37d0
...
main
| Author | SHA1 | Date | |
|---|---|---|---|
| cd8339ecef | |||
| f8ab58edbe | |||
| 2b5e9743fe | |||
| 753c3d4fc6 | |||
| 94e84f0db9 | |||
| a765ccda21 | |||
| 4472fa6c7f | |||
| 526fa1e3bc | |||
| 86de18c247 | |||
| ca9d0d7030 | |||
| bc527ec09a | |||
| ce984482e2 | |||
| 9266f124e6 | |||
| 8740a66611 | |||
| b7e9edbb4b | |||
| 479fa95fdf | |||
| eb9b622499 | |||
| e3e5b8ecc1 | |||
| 9e8d73fa7d | |||
| d44a4cd3df | |||
| c0615c2d50 | |||
| 965508ec06 | |||
| f325f89dc9 | |||
| 36a5136bdf | |||
| b7e11461f4 | |||
| 3966814868 | |||
| f4610a46e3 | |||
| 0d95e6dbcf |
20
.claude/rules/agents.md
Normal file
@@ -0,0 +1,20 @@
## Kaizen Agents

Specialized agent personas available on demand via the state-hub MCP.

**Discover:** `list_kaizen_agents()` — returns all agents with name, description, category
**Load:** `get_kaizen_agent("tdd-workflow")` — returns full instructions; read and follow them

Common agents:

| Agent | Category | When to use |
|-------|----------|-------------|
| `tdd-workflow` | testing | Step-by-step TDD8 workflow for any feature |
| `code-refactoring` | quality | Code quality analysis and safe refactoring |
| `test-maintenance` | testing | Diagnose and fix failing tests |
| `requirements-engineering` | process | Prevent interface/mock mismatches upfront |
| `keepaTodofile` | process | Maintain TODO.md during work |
| `project-management` | process | Track status, determine next steps |
| `datamodel-optimization` | quality | Optimize dataclasses and data structures |

All 17 agents: call `list_kaizen_agents()` for the full list.
8
.claude/rules/architecture.md
Normal file
@@ -0,0 +1,8 @@
## Architecture

<!-- TODO: Describe the key design decisions and component structure.
     Key modules, data flows, external integrations, state machines, etc. -->

## Quick Reference

`~/state-hub/mcp_server/TOOLS.md` — MCP tool reference
50
.claude/rules/credential-routing.md
Normal file
@@ -0,0 +1,50 @@
# Credential and access routing

**Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
for inference. Run this check **before** requesting secrets, API keys, SSH access,
login tokens, or database passwords — in any repo, not only `ops-warden`.

ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
other credential need belongs to another subsystem. **Do not** message
`ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.

### Lookup (do this first)

```bash
warden route find "<describe your need>" --json
warden route show <catalog-id> --json
```

Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).

| Agent runtime | How to orient |
| --- | --- |
| **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=markitect-main` is for coordination, not secret vending |
| **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
| **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |

### Quick routing table

| I need… | Owner | ops-warden executes? |
| --- | --- | --- |
| SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
| API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
| Login / OIDC / MFA | key-cape / Keycloak | No — route only |
| Authorization decision | flex-auth | No — route only |
| activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
| SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |

### Anti-patterns (do not do these)

- `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
- Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
- Pasting secrets into Git, State Hub, workplans, logs, or chat

### Other capabilities (reuse-surface)

Non-credential capabilities are usually discovered through **reuse-surface** federation
(`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
every repo's agent instructions because it is high-frequency, high-risk, and easy to
get wrong.

**Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`
38
.claude/rules/first-session.md
Normal file
@@ -0,0 +1,38 @@
## First Session Protocol

Triggered when `get_domain_summary("communication")` shows **no workstreams**.
The project is registered but work has not yet been structured.

**Step 1 — Read, don't write**
- `~/the-custodian/canon/projects/communication/project_charter_v0.1.md` — purpose, scope
- `~/the-custodian/canon/projects/communication/roadmap_v0.1.md` — planned phases
- Scan repo root: README, directory structure, existing code or docs

**Step 2 — Survey in-progress work**
Look for TODOs, open branches, half-finished files. Note done vs. started but incomplete.

**Step 3 — Propose workstreams to Bernd**
Propose 1–3 workstreams — each a coherent strand, weeks to months, anchored to a
roadmap phase. **Wait for approval before creating.**

**Step 4 — Create workplan file first, then DB record (ADR-001)**
```
workplans/MARKITECT-WP-NNNN-<slug>.md   ← write this first
```
Then register in the hub:
```
create_workstream(topic_id="36c7421b-c537-4723-bf75-42a3ebc6a1dc", title="...", owner="...", description="...")
create_task(workstream_id="<id>", title="...", priority="high|medium|low")
```

**Step 5 — Record the setup**
```
add_progress_event(
    summary="First session: structured communication into N workstreams, M tasks",
    event_type="milestone",
    topic_id="36c7421b-c537-4723-bf75-42a3ebc6a1dc",
    detail={"workstreams": [...], "tasks_created": M}
)
```

<!-- Delete or archive this file once past first session -->
8
.claude/rules/repo-boundary.md
Normal file
@@ -0,0 +1,8 @@
## Repo boundary

This repo owns **Markitect Main** only. It does not own:

<!-- TODO: List what belongs in adjacent repos, e.g.:
- SSH key management → railiance-infra/
- State hub code → state-hub/
-->
5
.claude/rules/repo-identity.md
Normal file
@@ -0,0 +1,5 @@
**Purpose:** Markitect Main - (fill in purpose)

**Domain:** communication
**Repo slug:** markitect-main
**Topic ID:** 36c7421b-c537-4723-bf75-42a3ebc6a1dc
85
.claude/rules/session-protocol.md
Normal file
@@ -0,0 +1,85 @@
## Session Protocol

Dev Hub (State Hub API): http://127.0.0.1:8000
MCP server name in `~/.claude.json`: `dev-hub`

**Step 1 — Orient**

Read the offline-safe brief first — it works without a live hub connection:
```bash
cat .custodian-brief.md
```
Then call the MCP tool for richer cross-domain context when MCP tools are exposed:
```
get_domain_summary("communication")
```
If MCP tools are unavailable in the current agent session, use the REST API:
```bash
curl -s "http://127.0.0.1:8000/state/summary" | python3 -m json.tool
```
If the hub is offline: `cd ~/state-hub && make api`

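The orientation order above (brief first, hub second, brief as the fallback) can be sketched in Python. This is a minimal illustration and not part of the hub tooling: `fetch_summary` and `orient` are hypothetical helper names, and the only endpoint used is the `/state/summary` URL shown in the `curl` example.

```python
import json
import urllib.request
from pathlib import Path

HUB_URL = "http://127.0.0.1:8000"  # Dev Hub address from this protocol


def fetch_summary(base_url: str = HUB_URL, timeout: float = 3.0) -> dict:
    """GET /state/summary and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/state/summary", timeout=timeout) as resp:
        return json.load(resp)


def orient(brief_path: str = ".custodian-brief.md", fetch=fetch_summary) -> dict:
    """Read the offline brief first, then try the hub for richer context."""
    path = Path(brief_path)
    brief = path.read_text(encoding="utf-8") if path.exists() else None
    try:
        summary = fetch()
    except OSError:  # connection errors and timeouts are OSError subclasses
        summary = None  # hub offline: the brief stays the orientation source
    return {"brief": brief, "summary": summary}
```

Passing `fetch` as a parameter keeps the fallback path testable without a running hub.
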
**Step 2 — Check inbox**
With MCP tools:
```
get_messages(to_agent="markitect-main", unread_only=True)
```
Mark read with `mark_message_read(message_id)`. Reply or act on coordination
requests before proceeding.

Without MCP tools:
```bash
curl -s "http://127.0.0.1:8000/messages/?to_agent=markitect-main&unread_only=true" \
  | python3 -m json.tool
curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
  -H "Content-Type: application/json" -d '{}'
```

**Step 3 — Scan workplans**
```bash
ls workplans/
```
For each file with `status: ready`, `active`, or `blocked`, note pending
`wait`/`todo`/`progress` tasks.

**Step 4 — Present brief**

1. **Active workstreams** for `communication` — title, task counts, blocking decisions
2. **Pending tasks** from `workplans/` + any `[repo:markitect-main]` hub tasks
3. **Goal guidance** — if `goal_guidance` in summary:
   - `needs_workplan`: surface as top action — *"Repo goal '{title}' has no workplan yet"*
   - `alignment_warnings`: flag if active work is not aligned with current goal
4. **Suggested next action** — highest-priority open item
5. **SBOM status** — flag if `last_sbom_at` is unset for this repo

If no workstreams: follow First Session Protocol (`first-session.md`).

**During work:** `record_decision()` · `add_progress_event()` · `resolve_decision()`

> State Hub is a *read model*. Bootstrap tools (`create_workstream`, `create_task`)
> are First Session Protocol only. Work structure belongs in repo files (ADR-001).

**Session close:**
With MCP tools:
```
add_progress_event(summary="...", topic_id="36c7421b-c537-4723-bf75-42a3ebc6a1dc", workstream_id="<uuid>")
```
Without MCP tools:
```bash
curl -s -X POST http://127.0.0.1:8000/progress/ \
  -H "Content-Type: application/json" \
  -d '{"topic_id":"36c7421b-c537-4723-bf75-42a3ebc6a1dc","workstream_id":"<uuid>","event_type":"note","summary":"what changed","author":"codex"}'
```
If workplan files were modified, ensure the local copy is up to date first:
```bash
git -C <repo_path> pull --ff-only
cd ~/state-hub && make fix-consistency REPO=markitect-main
```
For repos where implementation runs on a remote machine (e.g. CoulombCore),
use the combined target which pulls before fixing:
```bash
cd ~/state-hub && make fix-consistency-remote REPO=markitect-main
```
**C-15** (DB task ahead of file) is normal in multi-machine workflows — writeback
will sync the file to match DB. **C-16** (repo behind remote) blocks all writes
until you pull — intentional to prevent clobbering remote progress.
16
.claude/rules/stack-and-commands.md
Normal file
@@ -0,0 +1,16 @@
## Stack

- **Language:** Python 3.12+ (monorepo) + JavaScript UI (testdrive-jsui)
- **Key deps:** uv/pip, pytest, npm; see `pyproject.toml`, `package.json`, `Makefile`

## Dev Commands

```bash
make setup
make test
make test-js
make test-all
make lint
make build
make help
```
40
.claude/rules/workplan-convention.md
Normal file
@@ -0,0 +1,40 @@
## Workplan Convention (ADR-001)

File location: `workplans/MARKITECT-WP-NNNN-<slug>.md`
ID prefix: `MARKITECT-WP-`

Work items originate as files in this repo **before** being registered in the hub.

Canonical workplan/workstream frontmatter statuses are:
`proposed`, `ready`, `active`, `blocked`, `backlog`, `finished`, `archived`.
Use `proposed` for a newly drafted plan, `ready` after review against current
repo state, and `finished` when implementation is complete. `stalled` and
`needs_review` are derived health labels, not stored statuses.

Closed workplans may be moved to `workplans/archived/` with a completion-date
prefix: `YYMMDD-MARKITECT-WP-NNNN-<slug>.md`. The frontmatter id remains
unchanged; the prefix is only for quick visual reference.

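As an illustration of the archive naming rule, the following sketch derives the archived path from a workplan filename and a completion date. `archived_path` is a hypothetical helper, and the workplan number and slug in the usage note are made up.

```python
from datetime import date


def archived_path(workplan_filename: str, completed: date) -> str:
    """Prefix the filename with the completion date as YYMMDD."""
    return f"workplans/archived/{completed:%y%m%d}-{workplan_filename}"
```

For example, `archived_path("MARKITECT-WP-0007-wiki-sync.md", date(2026, 6, 22))` yields `workplans/archived/260622-MARKITECT-WP-0007-wiki-sync.md`. Only the filename changes; the frontmatter `id` is left alone.
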
Small opportunistic tasks discovered during another session use **Ad Hoc Tasks**:
`workplans/ADHOC-YYYY-MM-DD.md`, workstream slug `adhoc-YYYY-MM-DD`, and task ids
`ADHOC-YYYY-MM-DD-T01`, `T02`, etc. Use adhocs only for low-risk work completed
directly. Promote anything requiring analysis, design, approval, dependencies, or
multiple planned phases into a normal workplan.

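The three Ad Hoc names all derive from one date, which a small helper makes explicit. This is a sketch only; `adhoc_names` is a hypothetical function, not something the hub or `fix-consistency` provides.

```python
from datetime import date


def adhoc_names(day: date, task_number: int) -> dict[str, str]:
    """Build the file path, workstream slug, and task id for an ad hoc task."""
    stamp = day.isoformat()  # YYYY-MM-DD
    return {
        "file": f"workplans/ADHOC-{stamp}.md",
        "workstream_slug": f"adhoc-{stamp}",
        "task_id": f"ADHOC-{stamp}-T{task_number:02d}",
    }
```
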
Ecosystem todos from other agents arrive as `[repo:markitect-main]` hub tasks —
visible at session start. Pick one up by creating the workplan file, then registering
the workstream.

Task blocks use this shape:

```task
id: MARKITECT-WP-NNNN-T01
status: wait | todo | progress | done | cancel
priority: high | medium | low
state_hub_task_id: "<uuid>"   # written by fix-consistency — do not edit
```

Status progression is `todo` → `progress` → `done`; use `wait` for waiting or
blocked work and `cancel` for stopped work.

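A rough sketch of how task blocks of this shape could be read from a workplan file. It is not the parser used by `fix-consistency`; it assumes one `key: value` pair per line and treats anything after ` #` as a comment. `parse_task_blocks` is a hypothetical name.

```python
import re

FENCE = "`" * 3  # built indirectly so this example can sit inside a fenced block
TASK_BLOCK = re.compile(rf"^{FENCE}task\n(.*?)^{FENCE}\s*$", re.DOTALL | re.MULTILINE)
VALID_STATUSES = {"wait", "todo", "progress", "done", "cancel"}


def parse_task_blocks(markdown: str) -> list[dict[str, str]]:
    """Extract every task block as a dict and reject unknown statuses."""
    tasks = []
    for match in TASK_BLOCK.finditer(markdown):
        fields = {}
        for line in match.group(1).splitlines():
            line = line.split(" #", 1)[0]  # drop trailing comments
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip().strip('"')
        if fields.get("status") not in VALID_STATUSES:
            raise ValueError(f"invalid task status in {fields.get('id')!r}")
        tasks.append(fields)
    return tasks
```

Filtering the result for `wait`, `todo`, and `progress` gives the pending tasks that the session-start scan asks for.
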
<!-- Ralph Loop rules and HEUREKA sequence: ~/.claude/CLAUDE.md — do not duplicate here -->
@@ -10,7 +10,7 @@ principles with strict separation of concerns.
 ## Directory Structure & Clean Architecture
 ```
-markitect_project/
+markitect-main/
 ├── domain/ # Business logic (innermost layer)
 ├── application/ # Use cases and workflows
 ├── infrastructure/ # External interfaces (database, file system)
18
.custodian-brief.md
Normal file
@@ -0,0 +1,18 @@
<!-- custodian-brief: generated by fix-consistency — do not edit manually -->
# Custodian Brief — markitect-main

**Domain:** communication
**Last synced:** 2026-06-22 21:32 UTC
**State Hub:** http://127.0.0.1:8000 *(adjust if running on a remote machine)*

## Active Workstreams

*(none — repo may need first-session setup)*

---
## MCP Orientation (when available)

If the state-hub MCP server is reachable, call:
`get_domain_summary("communication")`
This provides richer cross-domain context.
If the MCP call fails, use this file as your orientation source.
2
.gitignore
vendored
@@ -91,6 +91,8 @@ debug_*.py
 # Claude Code local settings (user-specific permissions)
 .claude/settings.local.json
+# Claude Code runtime session locks (per-session, not content)
+.claude/*.lock

 .aider*
2
.gitmodules
vendored
@@ -1,6 +1,6 @@
 [submodule "wiki"]
 path = wiki
-url = http://92.205.130.254:32166/coulomb/markitect_project.wiki.git
+url = http://92.205.130.254:32166/coulomb/markitect-main.wiki.git
 branch = main
 [submodule "capabilities/kaizen-agentic"]
 path = capabilities/kaizen-agentic
25
.repo-classification.yaml
Normal file
@@ -0,0 +1,25 @@
repo_classification:
  standard: Repo Classification Standard
  version: '1.0'
  classified_at: '2026-06-22'
  classified_by: human
  category: product
  domain: communication
  secondary_domains:
  - infotech
  - agents
  capability_tags:
  - knowledge
  - documentation
  - product-development
  - platform
  business_stake:
  - product
  - technology
  - execution
  business_mechanics:
  - intention
  - coordination
  - operation
  - adaptation
  notes: Markitect successor to archived markitect-project; human confirmed.
219
AGENTS.md
Normal file
@@ -0,0 +1,219 @@
# Markitect Main — Agent Instructions

## Repo Identity

**Purpose:** Markitect Main - (fill in purpose)

**Domain:** communication
**Repo slug:** markitect-main
**Topic ID:** `36c7421b-c537-4723-bf75-42a3ebc6a1dc`
**Workplan prefix:** `MARKITECT-WP-`

---

## State Hub Integration

The Custodian State Hub tracks work across all domains. Interact via HTTP REST —
there is no MCP server for Codex agents.

| Context | URL |
|---------|-----|
| Local workstation | `http://127.0.0.1:8000` |
| Remote via tunnel | `http://127.0.0.1:18000` |

### Orient at session start

```bash
# Offline brief — works without hub connection
cat .custodian-brief.md

# Active workstreams for this domain
curl -s "http://127.0.0.1:8000/workstreams/?topic_id=36c7421b-c537-4723-bf75-42a3ebc6a1dc&status=active" \
  | python3 -m json.tool

# Check inbox
curl -s "http://127.0.0.1:8000/messages/?to_agent=markitect-main&unread_only=true" \
  | python3 -m json.tool
```

Mark a message read:
```bash
curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
  -H "Content-Type: application/json" -d '{}'
```

### Log progress (required at session close)

```bash
curl -s -X POST http://127.0.0.1:8000/progress/ \
  -H "Content-Type: application/json" \
  -d '{
    "summary": "what was done",
    "event_type": "note",
    "author": "codex",
    "workstream_id": "<uuid>",
    "task_id": "<uuid>"
  }'
```

Omit `workstream_id` / `task_id` when not applicable.

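When the progress event is built by a script instead of typed by hand, the "omit when not applicable" rule is easy to apply mechanically. The sketch below only builds the JSON body for `POST /progress/`; `progress_payload` is a hypothetical helper, and the field names are taken from the `curl` example above.

```python
import json


def progress_payload(summary, event_type="note", author="codex",
                     workstream_id=None, task_id=None) -> str:
    """Return the JSON body, leaving out ids that do not apply."""
    payload = {
        "summary": summary,
        "event_type": event_type,
        "author": author,
        "workstream_id": workstream_id,
        "task_id": task_id,
    }
    # Drop unset fields instead of sending them as null
    return json.dumps({k: v for k, v in payload.items() if v is not None})
```

The returned string can be passed to `curl -d` or to any HTTP client as the request body.
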
### Update task status

```bash
curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
  -H "Content-Type: application/json" \
  -d '{"status": "progress"}'
# values: wait | todo | progress | done | cancel
```

### Flag a task for human review

```bash
curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
  -H "Content-Type: application/json" \
  -d '{"needs_human": true, "intervention_note": "reason"}'
```

---

## Session Protocol

**Start:**
1. `cat .custodian-brief.md` — domain goal and open workstreams (offline-safe)
2. Check inbox: `GET /messages/?to_agent=markitect-main&unread_only=true`; mark read
3. Scan workplans: `ls workplans/` — note `status: ready`, `active`, or `blocked` files and open tasks
4. Check human-needed tasks: `GET /tasks/?needs_human=true`

**During work:**
- Update task statuses in workplan files as tasks progress
- Record significant decisions via `POST /decisions/`

**Close:**
1. Update workplan file task statuses to reflect progress
2. Log: `POST /progress/` with a summary of what changed
3. Note for the custodian operator: after workplan file changes, run from
   `~/state-hub`:
   ```bash
   make fix-consistency REPO=markitect-main
   ```
   This syncs task status from files into the hub DB.

---

## Credential and access routing

**Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
for inference. Run this check **before** requesting secrets, API keys, SSH access,
login tokens, or database passwords — in any repo, not only `ops-warden`.

ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
other credential need belongs to another subsystem. **Do not** message
`ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.

### Lookup (do this first)

```bash
warden route find "<describe your need>" --json
warden route show <catalog-id> --json
```

Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).

| Agent runtime | How to orient |
| --- | --- |
| **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=markitect-main` is for coordination, not secret vending |
| **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
| **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |

### Quick routing table

| I need… | Owner | ops-warden executes? |
| --- | --- | --- |
| SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
| API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
| Login / OIDC / MFA | key-cape / Keycloak | No — route only |
| Authorization decision | flex-auth | No — route only |
| activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
| SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |

### Anti-patterns (do not do these)

- `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
- Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
- Pasting secrets into Git, State Hub, workplans, logs, or chat

### Other capabilities (reuse-surface)

Non-credential capabilities are usually discovered through **reuse-surface** federation
(`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
every repo's agent instructions because it is high-frequency, high-risk, and easy to
get wrong.

**Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`

<!-- REPO-AGENTS-EXTENSIONS -->
<!-- Append repo-specific agent instructions below this marker.
     The state-hub template sync preserves content after this line. -->

---

## Workplan Convention (ADR-001)

Work items originate as files in this repo — not in the hub. The hub is a
read/cache/index layer that rebuilds from files.

**File location:** `workplans/MARKITECT-WP-NNNN-<slug>.md`

**Archived location:** finished workplans may move to
`workplans/archived/YYMMDD-MARKITECT-WP-NNNN-<slug>.md`. The `YYMMDD` prefix is
the completion/archive date; the frontmatter `id` does not change.

**Ad Hoc Tasks:** small opportunistic fixes discovered during a session use
`workplans/ADHOC-YYYY-MM-DD.md` with task ids `ADHOC-YYYY-MM-DD-T01`, etc. Use
this only for low-risk work completed directly; create a normal workplan for
anything needing analysis, design, approval, dependencies, or multiple phases.

**Frontmatter:**

```yaml
---
id: MARKITECT-WP-NNNN
type: workplan
title: "..."
domain: communication
repo: markitect-main
status: proposed | ready | active | blocked | backlog | finished | archived
owner: codex
topic_slug: ...
created: "YYYY-MM-DD"
updated: "YYYY-MM-DD"
state_hub_workstream_id: "<uuid>"   # written by fix-consistency — do not edit
---
```

Use `proposed` for a new draft, `ready` after review against current repo
state, and `finished` after implementation. `stalled` and `needs_review` are
derived health labels, not frontmatter statuses.

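A small check for already-parsed frontmatter fields can catch the most likely mistake, which is writing a derived health label into `status`. This is a sketch under stated assumptions: `check_frontmatter` is a hypothetical helper, and treating the keys in `REQUIRED_KEYS` as mandatory is this example's choice, not a documented rule.

```python
ALLOWED_STATUSES = {
    "proposed", "ready", "active", "blocked", "backlog", "finished", "archived",
}
DERIVED_LABELS = {"stalled", "needs_review"}
# Assumption: which keys are mandatory is not specified; this list is illustrative.
REQUIRED_KEYS = ("id", "type", "title", "domain", "repo", "status", "owner")


def check_frontmatter(fields: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the fields pass."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in fields]
    status = fields.get("status")
    if status in DERIVED_LABELS:
        problems.append(f"{status!r} is a derived health label, not a stored status")
    elif status is not None and status not in ALLOWED_STATUSES:
        problems.append(f"unknown status: {status!r}")
    return problems
```
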
**Task block format** (one per `##` section):

```
## Task Title

` ` `task
id: MARKITECT-WP-NNNN-T01
status: wait | todo | progress | done | cancel
priority: high | medium | low
state_hub_task_id: "<uuid>"   # written by fix-consistency — do not edit
` ` `

Task description text.
```

Status progression: `todo` → `progress` → `done`; use `wait` for waiting/blocked work and `cancel` for stopped work.

To create a new workplan:
1. Write the file following the format above
2. Notify the custodian operator to run `make fix-consistency REPO=markitect-main`
   (or send a message to the hub agent via `POST /messages/`)
46
CLAUDE.md
@@ -1,34 +1,12 @@
-# Markitect — Claude Code Instructions
-
-## Custodian State Hub Integration
-
-This project is tracked as the **markitect** domain in the Custodian State Hub.
-Hub topic ID: `5571d954-0d30-4950-980d-7bcaaad8e3e2`
-
-### Session Protocol
-
-**At the start of every session:**
-Call `get_state_summary()` via the `state-hub` MCP tool to orient yourself.
-If the hub is not reachable, start it: `cd ~/the-custodian/state-hub && make api`
-
-**At the end of every session:**
-Call `add_progress_event()` with at minimum:
-- `topic_id`: `5571d954-0d30-4950-980d-7bcaaad8e3e2`
-- `summary`: what was accomplished or left in-flight
-- `event_type`: `note` for routine updates, `milestone` for completions, `blocker` for blockers
-
-### Available State-Hub MCP Tools
-
-- `get_state_summary()` — full cross-domain overview
-- `add_progress_event(summary, topic_id, event_type, detail)` — log progress
-- `create_workstream(topic_id, title, ...)` — create a new workstream
-- `create_task(workstream_id, title, ...)` — create a task under a workstream
-- `update_task_status(task_id, status)` — move task through lifecycle
-- `record_decision(title, decision_type, topic_id, ...)` — log decisions
-- `resolve_decision(decision_id, rationale, decided_by)` — close a decision
-
-### If the MCP Server is Not Available
-
-The state-hub MCP server (`state-hub`) is registered at user scope in `~/.claude.json`.
-It requires the API to be running at `http://127.0.0.1:8000`.
-Fallback: use `curl` directly against the REST API — see `/docs` at the hub URL.
+# Markitect Main — Claude Code Instructions
+
+@SCOPE.md
+@.claude/rules/repo-identity.md
+@.claude/rules/session-protocol.md
+@.claude/rules/first-session.md
+@.claude/rules/workplan-convention.md
+@.claude/rules/stack-and-commands.md
+@.claude/rules/architecture.md
+@.claude/rules/repo-boundary.md
+@.claude/rules/credential-routing.md
+@.claude/rules/agents.md
@@ -457,7 +457,7 @@ Sister projects can reuse these capabilities directly:
 Install capabilities via local file references:
 ```toml
 [project.dependencies]
-release-management = {path = "../markitect_project/capabilities/release-management"}
+release-management = {path = "../markitect-main/capabilities/release-management"}
 ```

 ### Shared Infrastructure
129
SCOPE.md
Normal file
129
SCOPE.md
Normal file
@@ -0,0 +1,129 @@
# SCOPE

> This file helps you quickly understand what this repository is about,
> when it is relevant, and when it is not.
> It is intentionally lightweight and may be incomplete.

---

## One-liner

Intelligent markdown engine and information management platform — treats documents as structured, queryable information spaces with schema validation, transclusion, LLM-driven evaluation, and infospace lifecycle management.

---

## Core Idea

MarkiTect turns fragmented knowledge (scattered docs, chats, notes) into structured, versioned, reusable artifacts. The core abstraction is an **infospace**: a curated collection of typed entities (concepts, mechanisms, observations) governed by a YAML config, validated against schemas, and evaluated for quality across five dimensions. The platform automates generation, validation, and transformation at scale, delegating domain-level judgment to LLMs while Python handles structure and evaluation.

---

## In Scope

- Parse, validate, and analyze markdown documents against schemas
- Generate schemas from example documents; enforce naming convention `{domain}-schema-v{major}.{minor}.md`
- Infospace lifecycle: create, populate, evaluate (per-entity + collection quality scores), compose, export
- Transclusion: embed content from one document into another, maintaining single source of truth
- LLM-driven prompt execution with dependency resolution and quality gates
- Relationship graph export (Mermaid, DOT) and analysis (networkx, FCA)
- Batch document processing; CLI (`markitect <command>`) and programmatic API
- Rendering: markdown → interactive HTML via plugin system (testdrive-jsui)
- Asset management (image embedding, resource handling)

---

## Out of Scope

- Visual/WYSIWYG editing (markdown-first, text-based workflows only)
- Real-time collaborative editing (git-based versioning instead)
- Financial transactions or external payment integration
- Making domain-level judgments in Python code (delegated to LLM via prompt templates)
- Storing secrets or credentials in plaintext
- Full GraphQL API (structure exists but not fully implemented)
- Vendor-specific integrations or lock-in

---

## Relevant When

- Managing large document sets (hundreds to thousands) needing consistent structure and validation
- Building or maintaining institutional knowledge bases, technical documentation, or canon releases
- Automating document generation from schemas or templates
- Tracking relationships and dependencies between knowledge artifacts
- Needing programmatic access to document structure (beyond file reading)
- Applying quality evaluation to a structured concept collection

---

## Not Relevant When

- Working with a handful of simple, unrelated documents
- Visual editor required
- Exclusively non-markdown source formats (PDF/Word need conversion first)
- No consistency, validation, or automation needed

---

## Current State

- Status: active (v0.13.0-dev, ~90 commits ahead of release)
- Implementation: substantial — core modules mature (CLI, parsing, schema management, prompt execution, infospace); infospace S3 close-out in progress; LLM adapter extracted to standalone `llm-connect` package
- Stability: stable core; plugin system and infospace tooling evolving; 200+ CHANGELOG entries since v0.6.0
- Usage: active personal development; examples with 988 entities and full evaluation pipeline

---

## How It Fits

- Upstream dependencies: `llm-connect` (LLM adapter library, extracted), `testdrive-jsui` (rendering plugin submodule), `markitect-utils` (utility library)
- Downstream consumers: Custodian — MarkiTect is the knowledge artifact platform in the canonical dependency order (Railiance → **Markitect** → Coulomb.social → Personhood/Foerster → Custodian)
- Often used with: the-custodian (state hub tracks markitect domain workstreams), kaizen-agentic (project-management agent for session workflow)

---

## Terminology

- Preferred terms: infospace, topic, discipline, entity, evaluation, viability, transclusion, schema, quality gates
- Also known as: "markitect", "the markdown engine"
- Potentially confusing terms: "topic" = the subject matter an infospace explains (not a chat thread); "discipline" = a reusable framework of concepts (itself a viable infospace); "infospace" ≠ filesystem directory (it's a curated conceptual collection with explicit quality thresholds)

---

## Related / Overlapping

- `llm-connect` — standalone LLM adapter extracted from MarkiTect (dependency)
- `the-custodian` — tracks markitect workstreams; custodian canon includes a markitect domain charter
- `marki-docx` — separate repo (on tegwick machine); relationship: docx export capability for MarkiTect artifacts

---

## Provided Capabilities

```capability
type: documentation
title: Structured document validation and schema management
description: Parse, validate, and enforce schemas on markdown documents — generate schemas from examples, validate entity collections, report naming convention compliance.
keywords: [markdown, schema, validation, document, structure, linting]
|
||||
```
|
||||
|
||||
```capability
|
||||
type: documentation
|
||||
title: Infospace lifecycle management
|
||||
description: Create, populate, evaluate (quality scores), compose, and export curated knowledge collections (infospaces) with transclusion and relationship graph analysis.
|
||||
keywords: [infospace, knowledge, curation, evaluation, transclusion, quality, graph]
|
||||
```
|
||||
|
||||
```capability
|
||||
type: data
|
||||
title: LLM-driven knowledge artifact generation
|
||||
description: Execute prompts with dependency resolution and quality gates to generate typed entities — concepts, mechanisms, observations — at scale from schemas and templates.
|
||||
keywords: [llm, generation, prompt, entity, artifact, knowledge, automation]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Getting Oriented
|
||||
|
||||
- Start with: `CLAUDE.md` (dev commands, LLM config, infospace lifecycle), `INTRODUCTION.md` (use cases, philosophy)
|
||||
- Key files / directories: `markitect/cli.py` (CLI entry point), `markitect/infospace/` (primary active area), `markitect/prompts/` (LLM execution), `roadmap/` (6 active planning tracks), `examples/infospace-with-history/` (988-entity reference implementation)
|
||||
- Entry points: `markitect --help`; `markitect infospace --help`; `pytest tests/unit/` (inner TDD loop)
|
||||
@@ -15,7 +15,7 @@ You are responsible for:
|
||||
|
||||
### Directory Structure
|
||||
```
|
||||
-markitect_project/
+markitect-main/
|
||||
├── Makefile # Main project Makefile
|
||||
├── scripts/
|
||||
│ └── capability_discovery.mk # Auto-discovery and delegation system
|
||||
|
||||
@@ -7,7 +7,7 @@ detachment:
|
||||
capability_name: issue-facade
|
||||
capability_family: issue-tracking
|
||||
integration_pattern: capabilities-directory
|
||||
-original_location: /home/worsch/markitect_project/capabilities/issue-facade
+original_location: /home/worsch/markitect-main/capabilities/issue-facade
|
||||
|
||||
capability_metadata:
|
||||
spec_file: CAPABILITY-issue-tracking.yaml
|
||||
@@ -17,23 +17,23 @@ capability_metadata:
|
||||
|
||||
integration_details:
|
||||
parent_project: capabilities
|
||||
-parent_path: /home/worsch/markitect_project/capabilities
+parent_path: /home/worsch/markitect-main/capabilities
|
||||
|
||||
re_integration_guide: |
|
||||
To re-integrate this capability using the new architecture:
|
||||
|
||||
# Option 1: Git submodule (recommended)
|
||||
-cd /home/worsch/markitect_project/capabilities
+cd /home/worsch/markitect-main/capabilities
|
||||
git submodule add <repo-url> _issue-facade
|
||||
pip install -e _issue-facade/
|
||||
|
||||
# Option 2: Clone directly
|
||||
-cd /home/worsch/markitect_project/capabilities
+cd /home/worsch/markitect-main/capabilities
|
||||
git clone <repo-url> _issue-facade
|
||||
pip install -e _issue-facade/
|
||||
|
||||
# Option 3: Copy into project
|
||||
-cd /home/worsch/markitect_project/capabilities
+cd /home/worsch/markitect-main/capabilities
|
||||
cp -r /path/to/issue-facade _issue-facade
|
||||
pip install -e _issue-facade/
|
||||
|
||||
|
||||
@@ -8,7 +8,7 @@ This test module validates outline mode schema generation improvements including
|
||||
- Content instruction integration
|
||||
- End-to-end workflow from example document to generated drafts
|
||||
|
||||
-Created for Issue #46: https://gitea.coulomb.social/coulomb/markitect_project/issues/46
+Created for Issue #46: https://gitea.coulomb.social/coulomb/markitect-main/issues/46
|
||||
"""
|
||||
|
||||
import pytest
|
||||
|
||||
@@ -209,7 +209,7 @@ tests/
|
||||
## 🎯 Detailed File Structure After Migration
|
||||
|
||||
```
|
||||
-markitect_project/
+markitect-main/
|
||||
├── capabilities/
|
||||
│ └── release-management/
|
||||
│ ├── README.md ✅ CREATED
|
||||
|
||||
@@ -162,7 +162,7 @@ clean_before_build = true
|
||||
[tool.release-management.registries.gitea]
|
||||
url = "http://92.205.130.254:32166"
|
||||
owner = "coulomb"
|
||||
-repo = "markitect_project"
+repo = "markitect-main"
|
||||
auth_token_env = "GITEA_API_TOKEN"
|
||||
|
||||
[tool.release-management.registries.pypi]
|
||||
|
||||
@@ -141,7 +141,7 @@ make release-publish VERSION=0.8.0
|
||||
## Registry Information
|
||||
|
||||
- **Gitea URL**: http://92.205.130.254:32166
|
||||
-- **Repository**: coulomb/markitect_project
+- **Repository**: coulomb/markitect-main
|
||||
- **PyPI Registry URL**: http://92.205.130.254:32166/api/packages/coulomb/pypi
|
||||
- **Package List URL**: http://92.205.130.254:32166/api/v1/packages/coulomb
|
||||
|
||||
|
||||
@@ -8,7 +8,7 @@
|
||||
|
||||
```bash
|
||||
# ❌ WRONG - Don't edit capability files from main repo
|
||||
-cd /home/worsch/markitect_project/capabilities/testdrive-jsui
+cd /home/worsch/markitect-main/capabilities/testdrive-jsui
|
||||
vim src/testdrive_jsui/core.py # DON'T DO THIS!
|
||||
|
||||
# ✅ CORRECT - Use separate Claude instance/session
|
||||
@@ -29,7 +29,7 @@ cd /path/to/work/testdrive-jsui
|
||||
|
||||
| Session | Purpose | Location |
|
||||
|---------|---------|----------|
|
||||
-| **Main Repo** | Integration, configuration | `/home/worsch/markitect_project` |
+| **Main Repo** | Integration, configuration | `/home/worsch/markitect-main` |
|
||||
| **Capability** | Feature development, bugs | Separate clone or `capabilities/capability-name` |
|
||||
|
||||
**Why?** Prevents accidental cross-contamination and respects repository boundaries.
|
||||
@@ -40,7 +40,7 @@ cd /path/to/work/testdrive-jsui
|
||||
|
||||
```bash
|
||||
# After pushing changes to capability repo
|
||||
-cd /home/worsch/markitect_project
+cd /home/worsch/markitect-main
|
||||
git submodule update --remote capabilities/testdrive-jsui
|
||||
git add capabilities/testdrive-jsui
|
||||
git commit -m "chore: update testdrive-jsui to latest"
|
||||
@@ -50,7 +50,7 @@ git push
|
||||
### Add New Capability
|
||||
|
||||
```bash
|
||||
-cd /home/worsch/markitect_project
+cd /home/worsch/markitect-main
|
||||
|
||||
# Add as submodule
|
||||
git submodule add http://gitea/coulomb/new-capability.git capabilities/new-capability
|
||||
@@ -67,7 +67,7 @@ git commit -m "feat: add new-capability submodule"
|
||||
|
||||
```bash
|
||||
# Option 1: In submodule directory (careful!)
|
||||
-cd /home/worsch/markitect_project/capabilities/testdrive-jsui
+cd /home/worsch/markitect-main/capabilities/testdrive-jsui
|
||||
git checkout -b feature-branch
|
||||
# make changes
|
||||
git commit -m "feat: new feature"
|
||||
@@ -86,7 +86,7 @@ git push origin feature-branch
|
||||
### Check Capability Status
|
||||
|
||||
```bash
|
||||
-cd /home/worsch/markitect_project
+cd /home/worsch/markitect-main
|
||||
|
||||
# List all capabilities
|
||||
make capabilities-list
|
||||
|
||||
@@ -9,7 +9,7 @@ MarkiTect is a markdown processing toolkit with transclusion, schema validation,
|
||||
## Current Directory Structure
|
||||
|
||||
```
|
||||
-markitect_project/
+markitect-main/
|
||||
├── markitect/ # Main package
|
||||
│ ├── [34 root-level .py files] # Core functionality (see below)
|
||||
│ ├── assets/ # Asset discovery, management, caching (21 files)
|
||||
|
||||
@@ -8,7 +8,7 @@ MarkiTect uses a **capabilities-based architecture** where functionality is orga
|
||||
|
||||
### 1. **Separation of Concerns**
|
||||
|
||||
-**Critical Rule:** The main repository (`markitect_project`) **MUST NOT** directly modify capability code.
+**Critical Rule:** The main repository (`markitect-main`) **MUST NOT** directly modify capability code.
|
||||
|
||||
- ✅ **DO**: Use capabilities as dependencies
|
||||
- ✅ **DO**: Configure capabilities through documented interfaces
|
||||
@@ -28,7 +28,7 @@ MarkiTect uses a **capabilities-based architecture** where functionality is orga
|
||||
Capabilities are integrated as **git submodules**, not regular directories:
|
||||
|
||||
```
|
||||
-markitect_project/
+markitect-main/
|
||||
├── .gitmodules # Submodule configuration
|
||||
├── capabilities/
|
||||
│ ├── testdrive-jsui/ # Git submodule → separate repo
|
||||
@@ -80,8 +80,8 @@ engine.render_document(content, mode='edit', config=config)
|
||||
|
||||
#### Main Repository Session
|
||||
```bash
|
||||
-# In markitect_project/
-cd /home/worsch/markitect_project
+# In markitect-main/
+cd /home/worsch/markitect-main
|
||||
|
||||
# Main repo tasks:
|
||||
# - Integrate capabilities
|
||||
@@ -93,7 +93,7 @@ cd /home/worsch/markitect_project
|
||||
#### Capability Session
|
||||
```bash
|
||||
# In capability repository
|
||||
-cd /home/worsch/markitect_project/capabilities/testdrive-jsui
+cd /home/worsch/markitect-main/capabilities/testdrive-jsui
|
||||
|
||||
# OR clone separately
|
||||
git clone http://gitea/coulomb/testdrive-jsui.git
|
||||
@@ -122,7 +122,7 @@ cd testdrive-jsui
|
||||
|
||||
2. **Update main project** (different Claude instance)
|
||||
```bash
|
||||
-cd /home/worsch/markitect_project
+cd /home/worsch/markitect-main
|
||||
git submodule update --remote capabilities/testdrive-jsui
|
||||
git commit -m "chore: update testdrive-jsui submodule"
|
||||
```
|
||||
@@ -139,7 +139,7 @@ When a capability releases a new version:
|
||||
|
||||
```bash
|
||||
# In main repo
|
||||
-cd /home/worsch/markitect_project
+cd /home/worsch/markitect-main
|
||||
|
||||
# Update specific capability
|
||||
cd capabilities/testdrive-jsui
|
||||
@@ -160,7 +160,7 @@ git commit -am "chore: update all capabilities"
|
||||
# http://gitea/coulomb/new-capability
|
||||
|
||||
# 2. Add as submodule to main repo
|
||||
-cd /home/worsch/markitect_project
+cd /home/worsch/markitect-main
|
||||
git submodule add http://gitea/coulomb/new-capability.git capabilities/new-capability
|
||||
|
||||
# 3. Add dependency to pyproject.toml
|
||||
@@ -324,7 +324,7 @@ def test_testdrive_jsui_integration():
|
||||
1. **Create separate git repo**
|
||||
```bash
|
||||
cd /tmp
|
||||
-cp -r markitect_project/capabilities/capability-name capability-name
+cp -r markitect-main/capabilities/capability-name capability-name
|
||||
cd capability-name
|
||||
git init
|
||||
git add .
|
||||
@@ -335,7 +335,7 @@ def test_testdrive_jsui_integration():
|
||||
|
||||
2. **Remove from main repo**
|
||||
```bash
|
||||
-cd markitect_project
+cd markitect-main
|
||||
git rm -rf capabilities/capability-name
|
||||
git commit -m "chore: remove capability-name for submodule conversion"
|
||||
```
|
||||
|
||||
203
docs/composition-guide.md
Normal file
@@ -0,0 +1,203 @@
|
||||
# Infospace Composition Guide
|
||||
|
||||
One completed, viable infospace can be reused as a **discipline** for
|
||||
another infospace — a lens applied to a different topic. This guide
|
||||
explains how composition works and walks through the live
|
||||
`examples/supply-chain-vsm/` reference.
|
||||
|
||||
---
|
||||
|
||||
## What composition means
|
||||
|
||||
An **infospace** is a directory of typed entities governed by
|
||||
`infospace.yaml`. Its entities and relations describe a specific topic
|
||||
(for example, Adam Smith's *Wealth of Nations*).
|
||||
|
||||
A **discipline** is an infospace declared as a reusable analytical
|
||||
framework by another infospace. When infospace B binds infospace A as a
|
||||
discipline:
|
||||
|
||||
1. B's entities can reference A's entities in `## WoN Concept` (or
|
||||
equivalent) sections.
|
||||
2. Properties A has already computed on its entities — such as VSM system
|
||||
placement — become available to B by transitivity through the mapping.
|
||||
3. B can impose its own viability thresholds independently of A's. The two
|
||||
infospaces each pass or fail viability on their own terms.
|
||||
|
||||
The binding is declarative: a relative path in `infospace.yaml` plus a
|
||||
display name. No code. No import. The discipline is looked up on disk at
|
||||
the declared path when B's commands run.
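The lookup described above can be sketched roughly as follows. This is only an illustration: the function name `resolve_discipline` and the plain-dict stand-in for a parsed `disciplines:` entry are assumptions, and MarkiTect's actual loader is not shown in this guide. The path is collapsed lexically, without touching the filesystem.

```python
import os
from pathlib import Path


def resolve_discipline(infospace_dir: Path, discipline: dict) -> Path:
    """Resolve a discipline's relative path against the binding infospace."""
    # The declared path is taken to be relative to the directory that
    # holds infospace.yaml (an assumption for this sketch).
    joined = infospace_dir / discipline["path"]
    # normpath collapses ".." segments lexically, so no disk access is needed.
    return Path(os.path.normpath(joined))


# Hypothetical parsed form of one `disciplines:` entry.
binding = {"name": "Wealth of Nations", "path": "../infospace-with-history"}
target = resolve_discipline(Path("examples/supply-chain-vsm"), binding)
print(target.as_posix())
```

Because the path is relative, moving both directories together leaves the resolved sibling relationship unchanged.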
|
||||
|
||||
---
|
||||
|
||||
## The viability pre-condition
|
||||
|
||||
Binding a non-viable infospace as a discipline is a mistake: a framework
|
||||
that fails its own thresholds is not a stable reference frame. Before
|
||||
binding, confirm the candidate discipline is viable:
|
||||
|
||||
```bash
|
||||
cd examples/infospace-with-history
|
||||
markitect infospace viability
|
||||
```
|
||||
|
||||
```
|
||||
Metric Value Threshold Status
|
||||
---------------------------------------------------------------
|
||||
redundancy_ratio 0.0061 max=0.1 PASS
|
||||
coverage_ratio 0.6190 min=0.4 PASS
|
||||
coherence_components 0.0000 max=3 PASS
|
||||
consistency_cycles 0.0000 max=0 PASS
|
||||
granularity_entropy 2.6748 min=1.0 PASS
|
||||
per_entity_mean 3.9556 min=3.5 PASS
|
||||
|
||||
Viable: YES (6/6 thresholds met)
|
||||
```
|
||||
|
||||
If the discipline is not viable, fix it first (see
|
||||
`examples/infospace-with-history/docs/advanced-usage.md` §4 for triaging
|
||||
low scorers).
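To make the PASS/FAIL column concrete, here is a minimal sketch of how such a check could be evaluated. The values and thresholds are copied from the table above; the dict layout, the `passes` helper, and the inclusive comparisons are assumptions, not MarkiTect's implementation.

```python
# Metric values and thresholds copied from the viability table.
METRICS = {
    "redundancy_ratio": 0.0061,
    "coverage_ratio": 0.6190,
    "coherence_components": 0.0,
    "consistency_cycles": 0.0,
    "granularity_entropy": 2.6748,
    "per_entity_mean": 3.9556,
}
THRESHOLDS = {
    "redundancy_ratio": ("max", 0.1),
    "coverage_ratio": ("min", 0.4),
    "coherence_components": ("max", 3),
    "consistency_cycles": ("max", 0),
    "granularity_entropy": ("min", 1.0),
    "per_entity_mean": ("min", 3.5),
}


def passes(value: float, kind: str, limit: float) -> bool:
    """'max' is an upper bound, 'min' a lower bound (inclusive, by assumption)."""
    return value <= limit if kind == "max" else value >= limit


results = {name: passes(METRICS[name], *THRESHOLDS[name]) for name in THRESHOLDS}
viable = all(results.values())
met = sum(results.values())
print(f"Viable: {'YES' if viable else 'NO'} ({met}/{len(results)} thresholds met)")
```

A single failing metric is enough to make the infospace non-viable in this sketch, which matches the "6/6 thresholds met" wording of the summary line.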
|
||||
|
||||
---
|
||||
|
||||
## Example — how `supply-chain-vsm` binds WoN
|
||||
|
||||
The supply-chain infospace declares WoN as a discipline in its
|
||||
`infospace.yaml`:
|
||||
|
||||
```yaml
|
||||
topic:
|
||||
name: "Modern Supply Chain Management"
|
||||
domain: "Operations Management"
|
||||
sources: artifacts/sources/
|
||||
|
||||
disciplines:
|
||||
- name: "Wealth of Nations"
|
||||
path: ../infospace-with-history
|
||||
```
|
||||
|
||||
The binding is a **relative path**, so the two infospaces travel together
|
||||
(they can be moved as a pair without breaking the link).
|
||||
|
||||
Verify the binding resolves and the discipline is viable:
|
||||
|
||||
```bash
|
||||
cd examples/supply-chain-vsm
|
||||
markitect infospace disciplines
|
||||
```
|
||||
|
||||
```
|
||||
Name Entities Viable Path
|
||||
----------------------------------------------------------------------
|
||||
Wealth of Nations 988 YES ../infospace-with-history
|
||||
```
|
||||
|
||||
Each supply-chain entity then carries a `## WoN Concept` section
|
||||
mapping it to exactly one WoN entity. The consolidated mapping files
|
||||
(`output/mappings/*-mappings.md`) record the pairing, rationale, and a
|
||||
conceptual-continuity rating (Strong / Moderate / Weak):
|
||||
|
||||
| Supply Chain Entity | WoN Concept | Strength | VSM |
|
||||
|------------------------------|----------------------------------|----------|-------|
|
||||
| Demand Signal | Effectual Demand | Strong | S2 |
|
||||
| Vendor-Managed Inventory | Division of Labour | Strong | S1/S2 |
|
||||
| Just-in-Time Inventory | Circulating Capital | Strong | S1/S3 |
|
||||
| Bullwhip Effect | Natural Price as Central Price | Moderate | S2 |
|
||||
| Safety Stock | Accumulation of Stock | Moderate | S3 |
|
||||
|
||||
Because each WoN entity already has a VSM system placement (S1–S5), the
|
||||
supply-chain entities inherit a VSM position by transitivity through
|
||||
their mapping — without supply-chain-vsm needing its own VSM reference.
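The transitivity can be pictured as two lookups chained together. The sketch below uses the rows of the table above and assumes its VSM column reflects the placement recorded on the WoN entity; the dict names and the `inherited_vsm` helper are illustrative only.

```python
# WoN entity -> VSM placement, as recorded on the discipline side
# (values taken from the mapping table; layout is illustrative).
WON_VSM = {
    "Effectual Demand": "S2",
    "Division of Labour": "S1/S2",
    "Circulating Capital": "S1/S3",
    "Natural Price as Central Price": "S2",
    "Accumulation of Stock": "S3",
}

# Supply-chain entity -> the single WoN entity it maps to.
SC_TO_WON = {
    "Demand Signal": "Effectual Demand",
    "Vendor-Managed Inventory": "Division of Labour",
    "Just-in-Time Inventory": "Circulating Capital",
    "Bullwhip Effect": "Natural Price as Central Price",
    "Safety Stock": "Accumulation of Stock",
}


def inherited_vsm(entity: str) -> str:
    """Follow the mapping into the discipline and read the placement there."""
    return WON_VSM[SC_TO_WON[entity]]


print(inherited_vsm("Safety Stock"))
```

The supply-chain side stores only the mapping; the placement itself lives in one place, on the discipline.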
|
||||
|
||||
---
|
||||
|
||||
## Creating a new infospace that binds an existing one
|
||||
|
||||
Step-by-step, using WoN as the discipline for a hypothetical "Modern
|
||||
Monetary Policy" infospace:
|
||||
|
||||
### 1. Start from the target topic
|
||||
|
||||
```bash
|
||||
mkdir -p examples/monetary-policy/artifacts/sources
|
||||
cd examples/monetary-policy
|
||||
markitect infospace init
|
||||
```
|
||||
|
||||
### 2. Declare the discipline in `infospace.yaml`
|
||||
|
||||
```yaml
|
||||
topic:
|
||||
name: "Modern Monetary Policy"
|
||||
domain: "Macroeconomics"
|
||||
sources: artifacts/sources/
|
||||
|
||||
disciplines:
|
||||
- name: "Wealth of Nations"
|
||||
path: ../infospace-with-history
|
||||
```
|
||||
|
||||
Alternatively, bind imperatively after `init`:
|
||||
|
||||
```bash
|
||||
markitect infospace bind-discipline ../infospace-with-history --name "Wealth of Nations"
|
||||
```
|
||||
|
||||
### 3. Set your own viability thresholds
|
||||
|
||||
Copy the `viability:` block from a reference infospace and tune the
|
||||
numbers to the scale and maturity of your topic. A smaller infospace
|
||||
(50 entities, not 988) may need laxer `coverage_ratio` and stricter
|
||||
`redundancy_ratio`.
|
||||
|
||||
### 4. Verify the binding
|
||||
|
||||
```bash
|
||||
markitect infospace disciplines
|
||||
```
|
||||
|
||||
If `Viable` is `NO`, stop and fix the discipline before continuing.
|
||||
|
||||
### 5. Reference discipline entities in your own entities
|
||||
|
||||
For each entity in the new infospace, add a `## <Discipline> Concept`
|
||||
section that names the WoN entity the concept maps to, plus a rationale.
|
||||
The exact section heading is configured per schema — see
|
||||
`schemas/won-mapping-schema-v1.0.md` in `supply-chain-vsm` for the
|
||||
template used there.
|
||||
|
||||
### 6. Run checks and evaluate
|
||||
|
||||
```bash
|
||||
markitect infospace check
|
||||
markitect infospace evaluate --provider openrouter
|
||||
markitect infospace eval-summary --update-metrics
|
||||
markitect infospace viability
|
||||
```
|
||||
|
||||
The new infospace passes or fails viability independently of WoN.
|
||||
|
||||
---
|
||||
|
||||
## Why composition, not inclusion?
|
||||
|
||||
An alternative would be to copy WoN entities directly into the target
|
||||
infospace. Composition avoids that by design:
|
||||
|
||||
- **One source of truth** — if WoN is refined, every infospace that binds
|
||||
it picks up the improvement on the next run without a sync step.
|
||||
- **Separation of concerns** — each infospace owns its own schema,
|
||||
thresholds, and entity set. Changing the target topic cannot pollute
|
||||
the discipline.
|
||||
- **Bounded dependency** — the binding is a path, so the coupling is
|
||||
visible in one place (`infospace.yaml`) and easy to remove.
|
||||
|
||||
---
|
||||
|
||||
## See also
|
||||
|
||||
- `examples/supply-chain-vsm/README.md` — the full reference composition.
|
||||
- `examples/supply-chain-vsm/output/mappings/` — consolidated mapping
|
||||
files showing the rationale and strength rating for each pairing.
|
||||
- `examples/infospace-with-history/docs/advanced-usage.md` — patterns for
|
||||
maintaining the discipline once it is in use.
|
||||
141
docs/successor-gap-assessment.md
Normal file
@@ -0,0 +1,141 @@
|
||||
# markitect-main → Successor Repos: Gap Assessment
|
||||
|
||||
**Date:** 2026-05-23
|
||||
**Author:** Claude (custodian session)
|
||||
**Status:** Draft — awaiting Bernd's decisions on items A/B/C below
|
||||
|
||||
## Purpose
|
||||
|
||||
Bernd is retiring `markitect-main` and has transferred most functionality to
|
||||
sibling repos. This document identifies what was provided by `markitect-main`
|
||||
that is **not addressed** in those successors, and flags candidates that may
|
||||
not fit any successor's intent.
|
||||
|
||||
## Successor Ecosystem (5 repos, not 3)
|
||||
|
||||
| Repo | Role |
|
||||
|---|---|
|
||||
| `markitect-tool` | Markdown syntax layer + structured-document primitives; defines source-adapter and render-adapter contracts. CLI: `mkt`. |
|
||||
| `kontextual-engine` | Headless knowledge operations engine: artifacts, collections, persistence, relationships, workflow runs/manifests, query, quality/assessment, API. |
|
||||
| `infospace-bench` | Application layer — concrete infospaces, evaluation methodology, reference pilots. |
|
||||
| `markitect-filter` | Source-format ingestion adapters (`source.epub3`, `source.pdf`) implementing the markitect-tool source-adapter contract. |
|
||||
| `markitect-quarkdown` | Render/export adapter — implements the markitect-tool render-adapter contract via Quarkdown. |
|
||||
|
||||
## Method
|
||||
|
||||
Analysis is grounded in each successor's own assessment docs (recent, May 2026):
|
||||
|
||||
- `markitect-tool/docs/markitect-main-scope-assessment.md`
|
||||
- `kontextual-engine/docs/markitect-main-scope-assessment.md`
|
||||
- `kontextual-engine/docs/system-layer-extraction-inventory.md`
|
||||
- `kontextual-engine/docs/system-layer-migration-backlog.md`
|
||||
- `infospace-bench/docs/markitect-main-scope-assessment.md`
|
||||
- `infospace-bench/docs/legacy-infospace-feature-inventory.md`
|
||||
- `infospace-bench/docs/replacement-acceptance-matrix.md`
|
||||
|
||||
Cross-checked against actual `markitect-main` module sizing (Python LOC) and
|
||||
`__init__.py` docstrings.
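The assessment does not say how the LOC figures were produced. One plausible way to reproduce approximate numbers is sketched below; the function name `count_python_loc` and the counting rule (non-blank lines, comments included) are assumptions and may not match the figures in the tables.

```python
import tempfile
from pathlib import Path


def count_python_loc(root: Path) -> int:
    """Count non-blank lines in every .py file under root (comments included)."""
    total = 0
    for py_file in root.rglob("*.py"):
        text = py_file.read_text(encoding="utf-8", errors="replace")
        total += sum(1 for line in text.splitlines() if line.strip())
    return total


# Small self-check on a throwaway directory: 3 non-blank lines, 1 blank.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "demo.py").write_text("import os\n\n# comment\nprint(os.getcwd())\n")
    demo_loc = count_python_loc(Path(tmp))
print(demo_loc)
```

Pointing it at a module directory, for example `count_python_loc(Path("markitect/finance"))`, would give a figure to compare against the table.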
|
||||
|
||||
**Confidence:** These successor docs are authoritative on *intent*. They have
|
||||
**not** been line-verified to confirm every "reimplement"-classified item
|
||||
actually landed in the successor. Where verification matters, it's flagged.
|
||||
|
||||
---
|
||||
|
||||
## A. Doesn't fit any successor's intent — needs a new home or explicit retirement
|
||||
|
||||
These are explicitly pushed away by tool/engine/bench and are unrelated to
|
||||
filter/quarkdown.
|
||||
|
||||
| markitect-main area | LOC | What it is | Status |
|
||||
|---|---|---|---|
|
||||
| `markitect/finance/` | ~8,100 | Cost-tracking system: cost items, period allocation to issues, financial reports, audit trails | **Orphan.** markitect-main's own SCOPE.md lists "financial transactions" as out-of-scope. Belongs with issue/project-ops, not knowledge tooling. |
|
||||
| `issue_tracker/` + `_issue-tracking/` + `.issues/` | ~1,200 | Issue tracking (finance allocates costs to these issues) | **Orphan to the five** — but likely already superseded by the `issue-facade` capability / `use-issues` skill. **Verify before retiring.** |
|
||||
| `markitect/profile/` | ~1,600 | User-profile CRUD, multi-profile, DB-backed | **Orphan.** Unrelated to all five. (Distinct from quarkdown's *render* "profile".) |
|
||||
| `markitect/production/` | ~3,800 | Deployment-readiness validation, cross-platform checks, perf benchmarking | Engine keeps only "structured error/audit *ideas*". Deployment-validation bulk is orphan. |
|
||||
| `tools/`, `services/`, gitea/tddai glue | ~5,500 | Project-ops tooling | Out-of-scope everywhere. |
|
||||
| `markitect/legacy/` + `legacy_compat.py` | ~2,700 | Backward-compat shims | Retire by definition. |
|
||||
|
||||
## B. Rendering / asset / plugin layer — only *partially* covered, real residual gap
|
||||
|
||||
**This is the most consequential gap.** `SCOPE.md` lists "Rendering: markdown
|
||||
→ interactive HTML via plugin system (testdrive-jsui)" as an in-scope
|
||||
capability of markitect-main.
|
||||
|
||||
| Area | LOC | Covered? |
|
||||
|---|---|---|
|
||||
| `markitect/plugins/` (generic processor/formatter/validator/exporter plugin system) | ~8,000 | **No.** tool defines a render-adapter *contract* and an *extension* point, but the general plugin runtime isn't carried. |
|
||||
| `markitect/assets/` (content-addressable asset store, dedup, `.mdpkg` ZIP packaging, symlink handling) + `asset_registry.json` (277 KB) | ~6,000 | **No.** Bench says "leave behind unless a concrete export needs assets." |
|
||||
| Interactive-HTML / testdrive-jsui rendering, `static/`, `themes/`, `templates/document.html`, JS UI | — | **Partial only.** quarkdown covers a *Quarkdown* export path; the interactive-HTML / JS-UI path has no home. |
|
||||
|
||||
**Decision needed:** spin these into a dedicated render/asset repo (sibling to
|
||||
quarkdown), fold the asset store into one of the existing repos, or retire the
|
||||
interactive-HTML path.
|
||||
|
||||
## C. The other "Information Space" lineage — `markitect/spaces/` (~11,000 LOC)
|
||||
|
||||
**Distinct from `markitect/infospace/`** (which infospace-bench inherited).
|
||||
`spaces/` is an older/parallel abstraction with features bench did *not* take:
|
||||
|
||||
- event-driven change tracking & notifications
|
||||
- persistent transclusion context with cross-space references
|
||||
- bidirectional directory synchronization
|
||||
- HTML rendering of spaces with caching/themes
|
||||
|
||||
Engine takes generic persistence concepts and bench takes infospace semantics,
|
||||
but **these specific `spaces/` behaviors (bidirectional sync, event
|
||||
notifications, cross-space transclusion context) aren't mapped anywhere.**
|
||||
|
||||
Likely intended as dead/superseded — but 11k LOC warrants an explicit "retire
|
||||
vs salvage" call.
|
||||
|
||||
## D. Declined-by-design (confirm retirement, don't re-extract)
|
||||
|
||||
| Area | LOC | Disposition |
|
||||
|---|---|---|
|
||||
| `markitect/graphql/` | ~4,000 | Tool, engine, and bench each explicitly declined GraphQL ("evidence of API need, not a commitment"). |
|
||||
| `markitect/query_paradigms/` | ~3,500 | Engine/tool keep the *QueryResult envelope* concept but say "do not port the registry wholesale." |
|
||||
| `markitect/proxy/` | ~870 | Non-markdown→md proxy with checksum/freshness tracking. **Overlaps markitect-filter.** Freshness/staleness-tracking mechanism may be worth checking against bench's deferred "stale-mappings." |
|
||||
| `capabilities/` (top-level) | ~8,300 | Capability-packaging architecture; partially maps to tool (schema generation) but the packaging approach itself isn't carried. |
|
||||
|
||||
---
|
||||
|
||||
## What this means
|
||||
|
||||
The successors are, by their own assessments, **nearly complete for the
|
||||
in-scope core** (parsing/schema → tool; persistence/workflow → engine;
|
||||
infospace lifecycle → bench; ingestion → filter; one render path →
|
||||
quarkdown). The truly unaddressed functionality is almost entirely the stuff
|
||||
markitect-main accreted **beyond** its stated scope: finance, issue tracking,
|
||||
user profiles, production/deployment validation, the asset/plugin/interactive-HTML
|
||||
rendering stack, and the older `spaces/` abstraction.
|
||||
|
||||
## Decisions for Bernd
|
||||
|
||||
Three live decisions, not a long extraction backlog:
|
||||
|
||||
### Decision 1 — Render/asset stack (Section B)
|
||||
The one with genuine product value left.
|
||||
- **Option 1a:** new repo (sibling to quarkdown) for plugin runtime + asset store + interactive-HTML
|
||||
- **Option 1b:** fold the asset store into an existing repo (most likely markitect-tool, behind a flag); retire interactive-HTML
|
||||
- **Option 1c:** retire the interactive-HTML path entirely; trust quarkdown export as the single render story
|
||||
|
||||
### Decision 2 — `markitect/spaces/` (Section C)
|
||||
- **Option 2a:** salvage bidirectional-sync / event-tracking / cross-space transclusion into engine (engine has the persistence story to support it)
|
||||
- **Option 2b:** retire wholesale as superseded by infospace
|
||||
|
||||
### Decision 3 — Project-ops cluster (Section A: finance + issues + profile)
|
||||
- **Option 3a:** confirm `issue-facade` already replaces `issue_tracker/` + `finance/`; retire both
|
||||
- **Option 3b:** identify a home for any pieces worth keeping
|
||||
|
||||
---
|
||||
|
||||
## Suggested verification before deciding
|
||||
|
||||
If verification matters before committing:
|
||||
|
||||
- **For Decision 1:** grep the five repos for any render/asset adapter that already covers the HTML path beyond Quarkdown.
|
||||
- **For Decision 2:** check whether engine's `OperationRun` + collection model can express bidirectional-sync semantics, or whether new primitives would be needed.
|
||||
- **For Decision 3:** confirm whether `issue-facade` truly replaces `issue_tracker/` + `finance/` end-to-end.
|
||||
|
||||
Happy to do any of these focused passes when you're ready to decide.
|
||||
@@ -117,7 +117,7 @@ This graph enables:
|
||||
|
||||
```bash
|
||||
# Ensure MarkiTect is installed
|
||||
-cd /path/to/markitect_project
+cd /path/to/markitect-main
|
||||
pip install -e .
|
||||
```
|
||||
|
||||
|
||||
230
examples/infospace-with-history/docs/advanced-usage.md
Normal file
@@ -0,0 +1,230 @@
|
||||
# Advanced Usage — Wealth of Nations Infospace
|
||||
|
||||
Patterns for working with the WoN infospace (988 entities) after the initial
|
||||
pipeline run. Every command in this file has been run against the actual
|
||||
infospace at the time of writing (2026-04-21); output shapes are excerpted
|
||||
verbatim.
|
||||
|
||||
All commands assume `cwd = examples/infospace-with-history` and the
|
||||
`markitect-venv` Python environment.
|
||||
|
||||
---
|
||||
|
||||
## 1. Incremental evaluation — add entities after the initial run
|
||||
|
||||
`markitect infospace evaluate` writes one file per entity under
|
||||
`output/evaluations/<slug>.md`. It skips any entity whose evaluation file
|
||||
already exists, so re-running after adding a new entity processes only the
|
||||
new one.
|
||||
|
||||
```bash
|
||||
# Add a new entity file
|
||||
vim output/entities/new-concept.md
|
||||
|
||||
# Evaluate only the new entity (explicit)
|
||||
markitect infospace evaluate --entity new-concept --provider openrouter
|
||||
|
||||
# Or re-run the whole pass — existing 988 are skipped, only the new file hits the LLM
|
||||
markitect infospace evaluate --provider openrouter
|
||||
```
|
||||
|
||||
**How skip detection works.** Evaluation filenames are normalised to
underscores, with `_s_` marking where an apostrophe stood (the
`farmers-capital` entity has the evaluation file `farmer_s_capital.md`).
If a new entity's slug collides with an existing evaluation under this
normalisation, the new entity's evaluation is skipped.
|
||||
To be sure an entity was picked up, check:
|
||||
|
||||
```bash
|
||||
# Count entities vs evaluations
|
||||
ls output/entities/*.md | grep -Ev 'book-[0-9]+-(chapter-[0-9]+|introduction)-' | wc -l
|
||||
ls output/evaluations/*.md | wc -l
|
||||
```
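Matching counts do not show which entities collide. The sketch below looks for collisions under an assumed normalisation rule: lowercase the entity's display name and turn every run of non-alphanumeric characters into one underscore. That rule is inferred from the two filenames quoted in this file (`farmer_s_capital.md`, `accumulation_of_stock.md`) and is not confirmed against MarkiTect's code; both function names are hypothetical.

```python
import re
from collections import defaultdict


def evaluation_filename(entity_name: str) -> str:
    """Assumed rule: lowercase, each run of non-alphanumerics becomes '_'."""
    slug = re.sub(r"[^a-z0-9]+", "_", entity_name.lower()).strip("_")
    return f"{slug}.md"


def find_collisions(entity_names: list[str]) -> dict[str, list[str]]:
    """Group entity names that would share one evaluation file."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in entity_names:
        groups[evaluation_filename(name)].append(name)
    return {fname: names for fname, names in groups.items() if len(names) > 1}


# Illustrative names; the second one is made up to force a collision.
names = ["Farmer's Capital", "Farmer s Capital", "Accumulation of Stock"]
collisions = find_collisions(names)
print(evaluation_filename("Farmer's Capital"))
print(collisions)
```

Any name that appears in a collision group would be skipped by `evaluate` once the first member of the group has an evaluation file.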
|
||||
|
||||
---
|
||||
|
||||
## 2. Re-evaluating after guideline changes
|
||||
|
||||
`evaluate` has no `--force` flag; re-evaluation requires deleting the
|
||||
existing file first.
|
||||
|
||||
```bash
|
||||
# Re-evaluate a single entity after updating the evaluation rubric
|
||||
rm output/evaluations/accumulation_of_stock.md
|
||||
markitect infospace evaluate --entity accumulation-of-stock --provider openrouter
|
||||
|
||||
# Re-evaluate a whole chapter
|
||||
ls output/entities/book-1-chapter-06-entities.md # see which entities the chapter produced
|
||||
# Map chapter entities to eval filenames (apostrophe/underscore normalisation) and rm them
|
||||
```
|
||||
|
||||
After re-evaluating, refresh the aggregate:
|
||||
|
||||
```bash
|
||||
markitect infospace eval-summary --update-metrics
|
||||
```
|
||||
|
||||
This merges `per_entity_mean` into `output/metrics/metrics.yaml` so the next
|
||||
`markitect infospace viability` check reflects the new scores.
|
||||
|
||||
---
|
||||
|
||||
## 3. Interpreting per-entity score distributions
|
||||
|
||||
`eval-summary` shows the mean for each of the five evaluation dimensions
|
||||
plus the overall range:
|
||||
|
||||
```
|
||||
$ markitect infospace eval-summary
|
||||
Evaluation summary — 985 entities evaluated
|
||||
|
||||
Dimension Mean
|
||||
--------------------------------------
|
||||
overall 3.956
|
||||
definition_precision 3.620
|
||||
domain_placement 4.559
|
||||
explanatory_value 3.936
|
||||
source_grounding 4.358
|
||||
vsm_relevance 3.305
|
||||
|
||||
Range: 1.00 – 4.80
|
||||
```
|
||||
|
||||
Interpretation:
|
||||
- `overall` above the 3.5 viability threshold → the collection passes
|
||||
`per_entity_mean`.
|
||||
- The lowest dimension (`vsm_relevance` = 3.305) is the weakest signal. If
|
||||
the collection is meant to be VSM-grounded, this is the dimension most
|
||||
worth improving (via sharper entity definitions or schema changes).
|
||||
- A wide range (1.00 – 4.80) tells you there are outliers at both ends —
|
||||
worth triaging (see pattern 4).
|
||||
|
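The reading above can be written down as a small check. This is only a sketch over the means printed by `eval-summary`: the `interpret` helper is hypothetical, and treating a mean exactly equal to the threshold as passing is an assumption, not confirmed behaviour of `markitect infospace viability`.

```python
from typing import Dict, Tuple

VIABILITY_THRESHOLD = 3.5  # threshold quoted in the interpretation above


def interpret(means: Dict[str, float],
              threshold: float = VIABILITY_THRESHOLD) -> Tuple[bool, str]:
    """Return (passes, weakest_dimension) for a dict of dimension means."""
    dims = {name: value for name, value in means.items() if name != "overall"}
    weakest = min(dims, key=dims.get)
    # Assumption: the comparison is inclusive.
    return means["overall"] >= threshold, weakest


# Means copied from the eval-summary output shown above.
means = {
    "overall": 3.956,
    "definition_precision": 3.620,
    "domain_placement": 4.559,
    "explanatory_value": 3.936,
    "source_grounding": 4.358,
    "vsm_relevance": 3.305,
}
passes, weakest = interpret(means)
print(passes, weakest)  # True vsm_relevance
```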
||||
---
|
||||
|
||||
## 4. Triaging low scorers
|
||||
|
||||
`markitect infospace entities --by-type` prints each entity's star score
|
||||
in-line:
|
||||
|
||||
```
|
||||
$ markitect infospace entities --by-type | head
|
||||
=== Element (315 entities) ===
|
||||
active_and_productive_stock Accumulation S1 ★4.6
|
||||
advanced_state_of_society General Theory S5
|
||||
agio_of_bank_money Exchange S2 ★4.8
|
||||
```
|
||||
|
||||
Entities with no `★` have no evaluation yet. To list the lowest-scoring
|
||||
entities across the whole collection:
|
||||
|
||||
```bash
|
||||
# Extract overall_score from every evaluation file and sort ascending
|
||||
for f in output/evaluations/*.md; do
|
||||
score=$(awk '/^overall_score:/ {print $2; exit}' "$f")
|
||||
printf "%s\t%s\n" "$score" "$(basename "$f" .md)"
|
||||
done | sort -n | head -20
|
||||
```
|
||||
|
||||
The 20 lowest scorers are the natural triage list — inspect their
|
||||
`output/entities/<slug>.md` and evaluation rationales to decide whether to
|
||||
refine the entity, merge it with a better-formed neighbour, or drop it.
|
||||
|
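For scripting the triage list, the shell loop above translates directly to Python. The sketch below assumes only what the evaluation files in this infospace show: a top-level `overall_score:` line in the frontmatter. The function names are illustrative and not part of markitect.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def read_overall_score(path: Path) -> Optional[float]:
    """Return the first `overall_score:` value in the file, or None."""
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.startswith("overall_score:"):
            try:
                return float(line.split(":", 1)[1].strip())
            except ValueError:
                return None
    return None


def lowest_scorers(evals_dir: Path, limit: int = 20) -> List[Tuple[float, str]]:
    """Return up to `limit` (score, slug) pairs, lowest score first."""
    scored = []
    for path in evals_dir.glob("*.md"):
        score = read_overall_score(path)
        if score is not None:  # skip files without a parsable score
            scored.append((score, path.stem))
    return sorted(scored)[:limit]
```

Unlike the `awk` version, files without a parsable score are dropped rather than sorted to the top with an empty value.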
||||
---
|
||||
|
||||
## 5. Reading and acting on collection-check output
|
||||
|
||||
`markitect infospace check` runs five concerns (C1–C5). Use `--concern` to
|
||||
focus on one and `--json` for machine-readable output:
|
||||
|
||||
```bash
|
||||
# Redundancy — which pairs of entities are suspiciously similar?
|
||||
markitect infospace check --concern redundancy --json
|
||||
```
|
||||
|
||||
```json
|
||||
{
|
||||
"redundancy": {
|
||||
"concern": "C1",
|
||||
"redundancy_ratio": 0.0061,
|
||||
"similar_pairs": [
|
||||
{"entity_a": "bank_economic_contribution_metrics",
|
||||
"entity_b": "bank_economic_development_metrics",
|
||||
"similarity": 1.0, "method": "word_overlap"},
|
||||
{"entity_a": "economic_system_objectives",
|
||||
"entity_b": "economic_system_purpose",
|
||||
"similarity": 0.9394, "method": "word_overlap"}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Acting on this:
|
||||
- **Similarity = 1.0** is almost certainly a duplicate — pick one slug and
|
||||
merge or delete the other.
|
||||
- **0.85–0.99** usually means two entities genuinely cover the same idea
|
||||
with slight phrasing differences. Merging is the cleanest fix.
|
||||
- **< 0.85** usually represents legitimate adjacent concepts — leave as-is
|
||||
unless the definition rubric says otherwise.
|
||||
|
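The three bands above can be applied mechanically to the `--json` output. The sketch below is illustrative: `triage_pairs` and the bucket names are made up for this example, and the input is the JSON excerpt shown earlier rather than a live call to the CLI.

```python
import json
from typing import Dict, List, Tuple

Pair = Tuple[str, str, float]


def triage_pairs(report: dict) -> Dict[str, List[Pair]]:
    """Sort similar pairs into the three action bands described above."""
    buckets: Dict[str, List[Pair]] = {
        "duplicate": [], "merge_candidate": [], "adjacent": [],
    }
    for pair in report["redundancy"]["similar_pairs"]:
        sim = pair["similarity"]
        if sim >= 1.0:
            key = "duplicate"
        elif sim >= 0.85:
            key = "merge_candidate"
        else:
            key = "adjacent"
        buckets[key].append((pair["entity_a"], pair["entity_b"], sim))
    return buckets


# Excerpt of the check output shown above.
report = json.loads("""
{"redundancy": {"concern": "C1", "redundancy_ratio": 0.0061,
  "similar_pairs": [
    {"entity_a": "bank_economic_contribution_metrics",
     "entity_b": "bank_economic_development_metrics",
     "similarity": 1.0, "method": "word_overlap"},
    {"entity_a": "economic_system_objectives",
     "entity_b": "economic_system_purpose",
     "similarity": 0.9394, "method": "word_overlap"}]}}
""")
for band, pairs in triage_pairs(report).items():
    print(band, len(pairs))
```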
||||
For coverage and coherence, the pattern is the same: the `--json` output
|
||||
surfaces the specific entities / missing links / disconnected components
|
||||
you need to look at, rather than a bare ratio.
|
||||
|
||||
---
|
||||
|
||||
## 6. Systematic processing of long texts
|
||||
|
||||
For long source material (books, multi-chapter specifications, corpora), the
|
||||
pipeline can produce a clean chapter-by-chapter git history on its own if
|
||||
you let it. The pattern:
|
||||
|
||||
```bash
|
||||
# Process all sources in canonical order, eval and classify per chapter,
|
||||
# snapshot metrics after each chapter.
|
||||
markitect infospace process --all \
|
||||
--provider openrouter \
|
||||
--eval-after-source \
|
||||
--classify-after-source \
|
||||
--check-after-each
|
||||
```
|
||||
|
||||
What you get:
|
||||
|
||||
- **One commit per source file**, not per batch run. The commit message body
|
||||
lists counts by bucket (`entities: +23`, `evaluations: +23`,
|
||||
`classifications: +23`) derived from the actual staged diff, so `git log`
|
||||
reads like the story of the infospace growing.
|
||||
- **Chapter-atomic commits.** `--eval-after-source` and
|
||||
`--classify-after-source` evaluate and classify *only the new entities*
|
||||
from the just-processed source before the commit lands, so each commit is
|
||||
a self-contained chapter snapshot.
|
||||
- **Metrics-per-chapter trail.** `--check-after-each` appends a snapshot to
|
||||
`output/metrics/history.yaml` after every chapter, so `markitect infospace
|
||||
history` later shows the metric trajectory rather than just start/end.
|
||||
|
||||
**Cost tradeoff.** `--eval-after-source` pays LLM latency per chapter rather
|
||||
than amortising it across one bulk batch. It's worth it when you care about
|
||||
the git history or want early quality signal, not when you're bulk-backfilling
|
||||
a known-good corpus.
|
||||
|
||||
**Triage during the run.** While processing, use `markitect infospace
|
||||
chapters` in another shell to see per-source entity/eval/classify counts and
|
||||
mean scores — handy for spotting chapters that under-extracted or evaluated
|
||||
poorly.
|
||||
|
||||
```
|
||||
$ markitect infospace chapters
|
||||
source entities evaluated classified mean_score
|
||||
------------------- -------- --------- ---------- ----------
|
||||
book-1-chapter-01 96 96 79 4.22
|
||||
book-1-chapter-02 16 16 10 4.06
|
||||
…
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## See also
|
||||
|
||||
- `METRICS-METHODOLOGY.md` — how each metric is computed.
|
||||
- `docs/composition-guide.md` — using this infospace as a discipline for a
|
||||
different domain.
|
||||
- `docs/performance-notes.md` — observed timings and provider choices.
|
||||
106
examples/infospace-with-history/docs/performance-notes.md
Normal file
@@ -0,0 +1,106 @@
|
||||
# Performance Notes — Wealth of Nations Infospace
|
||||
|
||||
Observed timings, file sizes, and provider choices from the 988-entity WoN
|
||||
example. These are **operational notes**, not a benchmark — numbers come
|
||||
from the actual S3.3 evaluation run (2026-02-23) rather than a controlled
|
||||
experiment.
|
||||
|
||||
---
|
||||
|
||||
## Evaluation batch duration
|
||||
|
||||
The initial evaluation pass produced 985 `output/evaluations/*.md` files:
|
||||
|
||||
- First `evaluated_at`: `2026-02-23T00:11:52`
|
||||
- Last `evaluated_at`: `2026-02-23T06:39:45`
|
||||
- **Total wall time: ~6h 28m**
|
||||
- **Effective throughput: ~2.5 entities/min** (~152 entities/hour)
|
||||
|
||||
Extracted from evaluation frontmatter:
|
||||
```bash
|
||||
grep -h '^evaluated_at:' output/evaluations/*.md | sort | sed -n '1p;$p'
|
||||
```
|
||||
|
||||
Caveats:
|
||||
- This was against OpenRouter's free tier, which applies implicit
|
||||
rate-limiting and occasional retries.
|
||||
- Throughput is not constant — gaps between bursts show up as plateaus
|
||||
when you plot the timestamps.
|
||||
- The batch was not fully parallelised; a tuned concurrent client could
|
||||
likely raise this throughput 2–4× on a paid OpenRouter tier.
|
||||
|
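The wall time and throughput figures above follow from the two timestamps and the file count. As a worked check, using only values quoted in this section:

```python
from datetime import datetime

# First and last `evaluated_at` values quoted above.
first = datetime.fromisoformat("2026-02-23T00:11:52")
last = datetime.fromisoformat("2026-02-23T06:39:45")
evaluated = 985  # evaluation files produced by the initial pass

elapsed_min = (last - first).total_seconds() / 60
per_min = evaluated / elapsed_min

print(f"{elapsed_min:.0f} min, {per_min:.2f}/min, {per_min * 60:.0f}/hour")
# 388 min, 2.54/min, 152/hour
```

This measures the span between the first and last completed evaluation, so it slightly understates the run time if the first request took long to return.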
||||
---
|
||||
|
||||
## Tokens per entity (estimate)
|
||||
|
||||
Direct token counts are not logged in the evaluation files, but the
|
||||
inputs and outputs are on disk:
|
||||
|
||||
- **Input per request**: evaluation schema (~3.7 KB) + entity file
|
||||
(~0.7 KB median) + fixed system prompt ≈ **~1500–2500 tokens in**
|
||||
- **Output per request**: structured evaluation with 5 dimensions and
|
||||
rationales, median eval file 3.6 KB ≈ **~600–800 tokens out**
|
||||
- **Round-trip total**: **~2000–3000 tokens per entity**
|
||||
- **Batch total estimate**: 985 entities × ~2500 tokens ≈ **~2.5M tokens**
|
||||
for the full pass
|
||||
|
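The estimate can be reproduced with a back-of-the-envelope calculation. The sketch below assumes roughly four bytes per token, a common rule of thumb for English text rather than a measured value for any particular model. The system prompt size is not given above, so it is left out of the input figure.

```python
BYTES_PER_TOKEN = 4  # assumption: rough rule of thumb, not measured


def approx_tokens(n_bytes: float) -> float:
    """Convert a byte count to an approximate token count."""
    return n_bytes / BYTES_PER_TOKEN


schema_kb = 3.7   # evaluation schema size quoted above
entity_kb = 0.7   # median entity file size quoted above

# Schema plus entity only; the fixed system prompt comes on top of this.
input_tokens = approx_tokens((schema_kb + entity_kb) * 1024)

# Batch total using the midpoint of the 2000-3000 round-trip range.
batch_tokens = 985 * 2500

print(f"~{input_tokens:.0f} input tokens before the system prompt")
print(f"~{batch_tokens / 1e6:.2f}M tokens for the full pass")
```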
||||
The constant per-entity input means the cheapest way to reduce spend on a
|
||||
re-run is to narrow the targeted entities (`--entity <slug>` or
|
||||
`--chapter <n>`), not to shorten the schema.
|
||||
|
||||
---
|
||||
|
||||
## Embedding cache and collection checks
|
||||
|
||||
`markitect infospace check --concern redundancy` supports two similarity
|
||||
backends (see `markitect/infospace/checks/redundancy.py`):
|
||||
|
||||
- **`word_overlap`** — the default, used when no embeddings are provided.
|
||||
Pure-Python set intersection over tokenised entity text. **No LLM calls,
|
||||
no cache needed.** This is what the current WoN check runs.
|
||||
- **`embedding`** — active when a pre-computed `{slug: vector}` mapping is
|
||||
passed in. No persistent on-disk embedding cache exists today; the
|
||||
caller is responsible for computing and supplying the vectors.
|
||||
|
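To illustrate why the `word_overlap` backend needs no LLM calls, here is a minimal sketch of a set-based similarity. The exact formula in `redundancy.py` is not shown in this document, so Jaccard similarity (intersection over union) and the simple regex tokeniser are assumptions; the function names are hypothetical.

```python
import re
from itertools import combinations
from typing import Dict, List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lower-case word set; a stand-in for the real tokeniser."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def word_overlap(a: str, b: str) -> float:
    """Jaccard similarity |A & B| / |A | B| (assumed formula)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def similar_pairs(texts: Dict[str, str],
                  threshold: float = 0.85) -> List[Tuple[str, str, float]]:
    """Return (slug_a, slug_b, similarity) for pairs at or above threshold."""
    pairs = []
    for (sa, ta), (sb, tb) in combinations(sorted(texts.items()), 2):
        sim = word_overlap(ta, tb)
        if sim >= threshold:
            pairs.append((sa, sb, round(sim, 4)))
    return pairs
```

Every pair is compared, so the cost grows quadratically with the entity count, but each comparison is a cheap set operation with no network round trip.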
||||
Implication: the 988-entity `check` runs in seconds because it's all
|
||||
word-overlap. Switching to embedding similarity would add an embedding
|
||||
API pass (another ~988 requests) which is currently a manual step
|
||||
outside the CLI.
|
||||
|
||||
---
|
||||
|
||||
## Provider choice — recommendation
|
||||
|
||||
For the WoN dataset specifically (text-heavy entities, 5-dimension
|
||||
rubric):
|
||||
|
||||
| Scale | Recommended provider | Rationale |
|
||||
|-----------------------|----------------------------------|-----------|
|
||||
| < 50 entities | `gemini/gemini-2.5-flash` | Fast default; free tier is generous enough; consistent with `markitect llm-check` out of the box. |
|
||||
| 50 – 1000 entities | `openrouter` with a `:free` model (e.g. `arcee-ai/trinity-large-preview:free`) | What the S3.3 batch used; gets through 988 entities in one overnight run without cost. |
|
||||
| > 1000 entities | `openrouter` with a paid small-context model, or `openai` | Free-tier rate limits start to dominate wall time; paying for higher concurrency is cheaper than calendar time. |
|
||||
|
||||
All providers are accepted by `markitect infospace evaluate --provider`.
|
||||
The evaluation schema doesn't assume any provider-specific features.
|
||||
|
||||
Note on provider mixing: if part of a collection is evaluated under one
|
||||
provider/model and the rest under another, `per_entity_mean` can drift
|
||||
slightly (different models calibrate scores differently). For the
|
||||
viability threshold of 3.5 the drift is usually negligible, but for
|
||||
fine-grained outlier analysis prefer a single provider per batch.
|
||||
|
||||
---
|
||||
|
||||
## What is *not* measured here
|
||||
|
||||
- **End-to-end pipeline time** (entity extraction from raw chapters,
|
||||
classification, relation graph) — only the evaluation phase is timed.
|
||||
- **Memory footprint** — the full in-memory state for 988 entities is
|
||||
small (< 200 MB observed), but not systematically measured.
|
||||
- **Failure/retry rates** — the 985 vs 988 gap is three entities the
|
||||
original run missed (plus one added later); no structured retry log
|
||||
was kept.
|
||||
|
||||
Expanding any of these into a proper benchmark is **out of scope** for
|
||||
the WoN example and should live alongside a synthetic corpus that can be
|
||||
regenerated deterministically.
|
||||
@@ -0,0 +1,28 @@
|
||||
---
|
||||
entity_slug: advanced_state_of_society
|
||||
evaluator: gemini-2.5-flash
|
||||
evaluated_at: '2026-04-21T21:32:17.135192'
|
||||
overall_score: 4.5
|
||||
scores:
|
||||
- name: definition_precision
|
||||
value: 4.0
|
||||
max_value: 5.0
|
||||
rationale: The definition is precise, listing key characteristics like accumulated
|
||||
stock and private property. It clearly distinguishes the concept by contrasting
|
||||
it with earlier economic conditions.
|
||||
- name: source_grounding
|
||||
value: 5.0
|
||||
max_value: 5.0
|
||||
rationale: This entity is deeply grounded in Smith's work, particularly in Book
|
||||
I
|
||||
---
|
||||
|
||||
# Evaluation: Advanced State Of Society
|
||||
|
||||
## definition_precision — 4.0 / 5.0
|
||||
|
||||
The definition is precise, listing key characteristics like accumulated stock and private property. It clearly distinguishes the concept by contrasting it with earlier economic conditions.
|
||||
|
||||
## source_grounding — 5.0 / 5.0
|
||||
|
||||
This entity is deeply grounded in Smith's work, particularly in Book I
|
||||
@@ -0,0 +1,61 @@
|
||||
---
|
||||
entity_slug: bank_notes
|
||||
evaluator: null
|
||||
evaluated_at: '2026-04-21T21:33:16.736926'
|
||||
overall_score: 4.4
|
||||
scores:
|
||||
- name: definition_precision
|
||||
value: 5.0
|
||||
max_value: 5.0
|
||||
rationale: The definition is precise, clearly distinguishing bank notes by their
|
||||
issuer, form, and key characteristics (payable on demand, confidence-based). It
|
||||
avoids circularity and captures a distinct concept.
|
||||
- name: source_grounding
|
||||
value: 5.0
|
||||
max_value: 5.0
|
||||
rationale: The entity is excellently grounded in "The Wealth of Nations," specifically
|
||||
Book II, Chapter 2, where Smith extensively discusses bank notes' role in economizing
|
||||
precious metals and their reliance on public confidence.
|
||||
- name: domain_placement
|
||||
value: 4.0
|
||||
max_value: 5.0
|
||||
rationale: '"Exchange" is an appropriate domain as bank notes primarily function
|
||||
as a medium for facilitating transactions. While "Money" or "Finance" could also
|
||||
fit, "Exchange" accurately reflects their operational role in the economy.'
|
||||
- name: vsm_relevance
|
||||
value: 3.0
|
||||
max_value: 5.0
|
||||
rationale: Bank notes are a critical *medium* or *tool* that enables the primary
|
||||
operations (S1) of an economy (i.e., exchange of goods and services). However,
|
||||
they are not a VSM system or management function themselves, making their direct
|
||||
mapping somewhat abstract.
|
||||
- name: explanatory_value
|
||||
value: 5.0
|
||||
max_value: 5.0
|
||||
rationale: This entity offers significant explanatory power by detailing how paper
|
||||
money functions, its reliance on confidence, and its role in reducing the need
|
||||
for precious metals, thereby illuminating a key mechanism in Smith's economic
|
||||
theory.
|
||||
---
|
||||
|
||||
# Evaluation: Bank Notes
|
||||
|
||||
## definition_precision — 5.0 / 5.0
|
||||
|
||||
The definition is precise, clearly distinguishing bank notes by their issuer, form, and key characteristics (payable on demand, confidence-based). It avoids circularity and captures a distinct concept.
|
||||
|
||||
## source_grounding — 5.0 / 5.0
|
||||
|
||||
The entity is excellently grounded in "The Wealth of Nations," specifically Book II, Chapter 2, where Smith extensively discusses bank notes' role in economizing precious metals and their reliance on public confidence.
|
||||
|
||||
## domain_placement — 4.0 / 5.0
|
||||
|
||||
"Exchange" is an appropriate domain as bank notes primarily function as a medium for facilitating transactions. While "Money" or "Finance" could also fit, "Exchange" accurately reflects their operational role in the economy.
|
||||
|
||||
## vsm_relevance — 3.0 / 5.0
|
||||
|
||||
Bank notes are a critical *medium* or *tool* that enables the primary operations (S1) of an economy (i.e., exchange of goods and services). However, they are not a VSM system or management function themselves, making their direct mapping somewhat abstract.
|
||||
|
||||
## explanatory_value — 5.0 / 5.0
|
||||
|
||||
This entity offers significant explanatory power by detailing how paper money functions, its reliance on confidence, and its role in reducing the need for precious metals, thereby illuminating a key mechanism in Smith's economic theory.
|
||||
@@ -0,0 +1,60 @@
|
||||
---
|
||||
entity_slug: bank_systemic_risk_management
|
||||
evaluator: gemini-2.5-flash-lite
|
||||
evaluated_at: '2026-04-21T21:49:35.222637'
|
||||
overall_score: 4.0
|
||||
scores:
|
||||
- name: definition_precision
|
||||
value: 4.0
|
||||
max_value: 5.0
|
||||
rationale: The definition is precise and clearly outlines the purpose of bank systemic
|
||||
risk management. It avoids being an overly broad umbrella term.
|
||||
- name: source_grounding
|
||||
value: 3.0
|
||||
max_value: 5.0
|
||||
rationale: While the concept of managing risks to the banking system is present
|
||||
in Book II, Chapter 2, the explicit framing of "systemic risk management" as a
|
||||
distinct entity with specific practices might be a slight abstraction beyond Smith's
|
||||
direct terminology.
|
||||
- name: domain_placement
|
||||
value: 5.0
|
||||
max_value: 5.0
|
||||
rationale: The "Regulation" domain is highly appropriate. Managing systemic risk
|
||||
is fundamentally a regulatory concern aimed at ensuring the stability of the financial
|
||||
system.
|
||||
- name: vsm_relevance
|
||||
value: 4.0
|
||||
max_value: 5.0
|
||||
rationale: This entity strongly maps to VSM System 3 (Internal Regulation/Audit)
|
||||
as it involves monitoring and controlling internal operations to prevent systemic
|
||||
failures. It also has elements of System 5 (Policy) in setting overall stability
|
||||
goals.
|
||||
- name: explanatory_value
|
||||
value: 4.0
|
||||
max_value: 5.0
|
||||
rationale: The entity provides good explanatory value by highlighting a crucial
|
||||
mechanism for maintaining financial stability. It explains *how* the banking system
|
||||
can be protected from cascading failures.
|
||||
---
|
||||
|
||||
# Evaluation: Bank Systemic Risk Management
|
||||
|
||||
## definition_precision — 4.0 / 5.0
|
||||
|
||||
The definition is precise and clearly outlines the purpose of bank systemic risk management. It avoids being an overly broad umbrella term.
|
||||
|
||||
## source_grounding — 3.0 / 5.0
|
||||
|
||||
While the concept of managing risks to the banking system is present in Book II, Chapter 2, the explicit framing of "systemic risk management" as a distinct entity with specific practices might be a slight abstraction beyond Smith's direct terminology.
|
||||
|
||||
## domain_placement — 5.0 / 5.0
|
||||
|
||||
The "Regulation" domain is highly appropriate. Managing systemic risk is fundamentally a regulatory concern aimed at ensuring the stability of the financial system.
|
||||
|
||||
## vsm_relevance — 4.0 / 5.0
|
||||
|
||||
This entity strongly maps to VSM System 3 (Internal Regulation/Audit) as it involves monitoring and controlling internal operations to prevent systemic failures. It also has elements of System 5 (Policy) in setting overall stability goals.
|
||||
|
||||
## explanatory_value — 4.0 / 5.0
|
||||
|
||||
The entity provides good explanatory value by highlighting a crucial mechanism for maintaining financial stability. It explains *how* the banking system can be protected from cascading failures.
|
||||
@@ -3,7 +3,7 @@ consistency_cycles: 0.0
|
||||
coverage_ratio: 0.619048
|
||||
granularity_entropy: 2.674752
|
||||
modularity: 0.0
|
||||
-per_entity_mean: 3.955635
|
||||
+per_entity_mean: 3.95668
|
||||
redundancy_ratio: 0.006073
|
||||
type_distribution:
|
||||
Element: 315
|
||||
|
||||
@@ -240,8 +240,14 @@ def llm_catalog(output_format):
|
||||
)
|
||||
def llm_check(provider, model):
|
||||
"""Send a minimal prompt to verify a provider is reachable and responding."""
|
||||
import os
|
||||
|
||||
from markitect.llm import create_adapter
|
||||
from markitect.llm.exceptions import LLMConfigurationError, LLMError
|
||||
from markitect.llm.exceptions import (
|
||||
LLMAPIError,
|
||||
LLMConfigurationError,
|
||||
LLMError,
|
||||
)
|
||||
from markitect.prompts.execution.models import RunConfig
|
||||
|
||||
resolved = resolve_llm(cli_provider=provider, cli_model=model)
|
||||
@@ -252,6 +258,17 @@ def llm_check(provider, model):
|
||||
f" model from: {resolved.model_source}"
|
||||
)
|
||||
|
||||
# Advisory: OPENROUTER_API_KEY is set but this call won't use it. Common
|
||||
# source of "works for me, fails for agents" when the env var holds a
|
||||
# stale key that overrides a clean config entry.
|
||||
if resolved.provider != "openrouter" and os.environ.get("OPENROUTER_API_KEY"):
|
||||
click.echo(
|
||||
" note: OPENROUTER_API_KEY is set but won't be used for this "
|
||||
"provider. If OpenRouter calls fail elsewhere with 401, the env "
|
||||
"var may be stale — unset or update it.",
|
||||
err=True,
|
||||
)
|
||||
|
||||
try:
|
||||
adapter = create_adapter(
|
||||
provider=resolved.provider,
|
||||
@@ -273,6 +290,19 @@ def llm_check(provider, model):
|
||||
except LLMError as exc:
|
||||
elapsed = time.monotonic() - start
|
||||
click.echo(f"ERROR \u2014 LLM error after {elapsed:.1f}s: {exc}", err=True)
|
||||
# Targeted hint: 401 on openrouter almost always means a stale key.
|
||||
if (
|
||||
resolved.provider == "openrouter"
|
||||
and isinstance(exc, LLMAPIError)
|
||||
and exc.status_code == 401
|
||||
):
|
||||
click.echo(
|
||||
" hint: OpenRouter returned 401 (unauthorized). Check whether "
|
||||
"OPENROUTER_API_KEY is stale (`unset OPENROUTER_API_KEY` to "
|
||||
"fall back to the key in ~/.config/markitect/config.toml, or "
|
||||
"update the env var).",
|
||||
err=True,
|
||||
)
|
||||
sys.exit(1)
|
||||
except Exception as exc:
|
||||
elapsed = time.monotonic() - start
|
||||
|
||||
@@ -7,8 +7,9 @@ inspecting, and evaluating infospaces.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import re
|
||||
from pathlib import Path
|
||||
from typing import Optional
|
||||
from typing import Dict, Optional
|
||||
|
||||
import click
|
||||
|
||||
@@ -228,6 +229,227 @@ def _entities_by_type(cfg, root: "Path", entity_list: list) -> None:
|
||||
click.echo(f"\nTotal: {total} entities")
|
||||
|
||||
|
||||
# ── chapters (per-source triage view) ────────────────────────────────
|
||||
|
||||
|
||||
@infospace_commands.command()
|
||||
@click.option("--config", "config_path", default=None, help="Path to infospace.yaml.")
|
||||
@click.option(
|
||||
"--format", "output_format",
|
||||
type=click.Choice(["text", "json"]),
|
||||
default="text",
|
||||
help="Output format.",
|
||||
)
|
||||
def chapters(config_path: Optional[str], output_format: str):
|
||||
"""List source files in canonical order with per-source stats.
|
||||
|
||||
For each source file in the sources directory, reports entity count,
|
||||
mean per-entity score (if evaluated), classification coverage, and
|
||||
processing status. Useful for triaging long-text infospaces.
|
||||
"""
|
||||
cfg, cfg_path = _load_config_or_exit(config_path)
|
||||
root = cfg_path.parent
|
||||
|
||||
sources_dir = root / cfg.topic.sources if cfg.topic.sources else root
|
||||
if not sources_dir.is_dir():
|
||||
click.echo(f"No sources directory at {sources_dir}.", err=True)
|
||||
raise SystemExit(1)
|
||||
|
||||
source_files = sorted(sources_dir.glob("*.md"))
|
||||
if not source_files:
|
||||
click.echo(f"No source files in {sources_dir}.", err=True)
|
||||
raise SystemExit(1)
|
||||
|
||||
entities_dir = root / cfg.entities_dir
|
||||
entity_list = (
|
||||
parse_entity_directory(entities_dir) if entities_dir.is_dir() else []
|
||||
)
|
||||
|
||||
# Build a source_id → [entities] map using the source_chapter field.
|
||||
# Matching is lenient: entities with a source_chapter substring-equal
|
||||
# to a normalized form of the source stem count as belonging to it.
|
||||
def _chapter_keys(source_id: str) -> list:
|
||||
"""Return strings an entity's source_chapter might contain."""
|
||||
keys = [source_id, source_id.replace("-", " ")]
|
||||
m = re.match(r"book-(\d+)-chapter-(\d+)", source_id)
|
||||
if m:
|
||||
book, chap = m.group(1), m.group(2)
|
||||
roman = {"1": "I", "2": "II", "3": "III", "4": "IV", "5": "V"}
|
||||
if book in roman:
|
||||
keys.append(f"Book {roman[book]}, Chapter {int(chap)}")
|
||||
keys.append(f"Book {roman[book]} Chapter {int(chap)}")
|
||||
return keys
|
||||
|
||||
# Precompute evaluation scores and classification slugs once.
|
||||
evals_dir = root / cfg.evaluations_dir
|
||||
cls_dir = root / cfg.classifications_dir
|
||||
eval_scores: Dict[str, float] = {}
|
||||
if evals_dir.is_dir():
|
||||
from markitect.infospace.evaluation_io import read_entity_evaluation
|
||||
for ev_path in evals_dir.glob("*.md"):
|
||||
try:
|
||||
ev = read_entity_evaluation(ev_path)
|
||||
if ev.overall_score is not None:
|
||||
eval_scores[ev_path.stem] = ev.overall_score
|
||||
except Exception:
|
||||
continue
|
||||
classified_slugs = (
|
||||
{p.stem for p in cls_dir.glob("*.md")} if cls_dir.is_dir() else set()
|
||||
)
|
||||
|
||||
rows = []
|
||||
for source_file in source_files:
|
||||
source_id = source_file.stem
|
||||
keys = _chapter_keys(source_id)
|
||||
matched = [
|
||||
e for e in entity_list
|
||||
if any(k.lower() in (e.source_chapter or "").lower() for k in keys)
|
||||
]
|
||||
slugs = {e.slug for e in matched}
|
||||
evaluated = slugs & set(eval_scores)
|
||||
classified = slugs & classified_slugs
|
||||
mean = (
|
||||
sum(eval_scores[s] for s in evaluated) / len(evaluated)
|
||||
if evaluated else None
|
||||
)
|
||||
rows.append({
|
||||
"source_id": source_id,
|
||||
"entities": len(matched),
|
||||
"evaluated": len(evaluated),
|
||||
"classified": len(classified),
|
||||
"mean_score": round(mean, 2) if mean is not None else None,
|
||||
})
|
||||
|
||||
if output_format == "json":
|
||||
import json
|
||||
click.echo(json.dumps(rows, indent=2))
|
||||
return
|
||||
|
||||
# Text: aligned table.
|
||||
headers = ("source", "entities", "evaluated", "classified", "mean_score")
|
||||
widths = [
|
||||
max(len(h), max((len(str(r[h.replace(' ', '_')])) if h != "source"
|
||||
else len(r["source_id"]))
|
||||
for r in rows)) if rows else len(h)
|
||||
for h in headers
|
||||
]
|
||||
fmt = " ".join(f"{{:<{w}}}" for w in widths)
|
||||
click.echo(fmt.format(*headers))
|
||||
click.echo(fmt.format(*("-" * w for w in widths)))
|
||||
for r in rows:
|
||||
click.echo(fmt.format(
|
||||
r["source_id"],
|
||||
r["entities"],
|
||||
r["evaluated"],
|
||||
r["classified"],
|
||||
"-" if r["mean_score"] is None else f"{r['mean_score']:.2f}",
|
||||
))
|
||||
totals = {
|
||||
"entities": sum(r["entities"] for r in rows),
|
||||
"evaluated": sum(r["evaluated"] for r in rows),
|
||||
"classified": sum(r["classified"] for r in rows),
|
||||
}
|
||||
click.echo(
|
||||
f"\n{len(rows)} source file(s); "
|
||||
f"{totals['entities']} entities, "
|
||||
f"{totals['evaluated']} evaluated, "
|
||||
f"{totals['classified']} classified."
|
||||
)
|
||||
|
||||
|
||||
# ── entity (single lookup) ───────────────────────────────────────────
|
||||
|
||||
|
||||
@infospace_commands.command()
|
||||
@click.argument("name")
|
||||
@click.option("--config", "config_path", default=None, help="Path to infospace.yaml.")
|
||||
def entity(name: str, config_path: Optional[str]):
|
||||
"""Look up one entity by name, tolerating case / hyphens / underscores.
|
||||
|
||||
Prints slug, source path, domain, chapter, word count, overall score,
|
||||
VSM system (if classified), and evaluation-file path.
|
||||
"""
|
||||
cfg, cfg_path = _load_config_or_exit(config_path)
|
||||
root = cfg_path.parent
|
||||
entities_dir = root / cfg.entities_dir
|
||||
|
||||
if not entities_dir.is_dir():
|
||||
click.echo("No entities directory found.", err=True)
|
||||
raise SystemExit(1)
|
||||
|
||||
entity_list = parse_entity_directory(entities_dir)
|
||||
if not entity_list:
|
||||
click.echo("No entities found.", err=True)
|
||||
raise SystemExit(1)
|
||||
|
||||
# Normalize: lowercase, underscores.
|
||||
def norm(s: str) -> str:
|
||||
return s.lower().replace("-", "_").replace(" ", "_")
|
||||
|
||||
target = norm(name)
|
||||
by_slug = {e.slug: e for e in entity_list}
|
||||
|
||||
match = by_slug.get(target)
|
||||
if match is None:
|
||||
# Substring fallback for partial input.
|
||||
candidates = [e for e in entity_list if target in norm(e.slug)]
|
||||
if len(candidates) == 1:
|
||||
match = candidates[0]
|
||||
elif len(candidates) > 1:
|
||||
click.echo(f"Ambiguous — '{name}' matches multiple entities:", err=True)
|
||||
for c in sorted(candidates, key=lambda e: e.slug)[:10]:
|
||||
click.echo(f" {c.slug}", err=True)
|
||||
if len(candidates) > 10:
|
||||
click.echo(f" … and {len(candidates) - 10} more", err=True)
|
||||
raise SystemExit(1)
|
||||
else:
|
||||
click.echo(f"No entity matching '{name}'.", err=True)
|
||||
near = sorted(
|
||||
e.slug for e in entity_list
|
||||
if target.split("_", 1)[0] in e.slug
|
||||
)[:5]
|
||||
if near:
|
||||
click.echo(f" Near matches: {', '.join(near)}", err=True)
|
||||
raise SystemExit(1)
|
||||
|
||||
# Load score + classification (best-effort).
|
||||
score: Optional[float] = None
|
||||
evaluator: Optional[str] = None
|
||||
eval_file = root / cfg.evaluations_dir / f"{match.slug}.md"
|
||||
if eval_file.is_file():
|
||||
try:
|
||||
from markitect.infospace.evaluation_io import read_entity_evaluation
|
||||
ev = read_entity_evaluation(eval_file)
|
||||
score = ev.overall_score
|
||||
evaluator = ev.evaluator
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
vsm: Optional[str] = None
|
||||
cls_file = root / cfg.classifications_dir / f"{match.slug}.md"
|
||||
if cls_file.is_file():
|
||||
try:
|
||||
from markitect.infospace.classification_io import read_entity_classification
|
||||
cls = read_entity_classification(cls_file)
|
||||
vsm = cls.vsm_system
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Output — one field per line so it's easy to grep or pipe.
|
||||
click.echo(f"slug: {match.slug}")
|
||||
click.echo(f"source_path: {match.source_path}")
|
||||
click.echo(f"domain: {match.domain or '-'}")
|
||||
click.echo(f"chapter: {match.source_chapter or '-'}")
|
||||
click.echo(f"word_count: {match.total_word_count}")
|
||||
click.echo(f"vsm_system: {vsm or '-'}")
|
||||
if score is not None:
|
||||
click.echo(f"overall_score: {score:.2f}")
|
||||
click.echo(f"evaluator: {evaluator or '-'}")
|
||||
click.echo(f"evaluation: {eval_file}")
|
||||
else:
|
||||
click.echo("evaluation: (not yet evaluated)")
|
||||
|
||||
|
||||
# ── evaluate ─────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
@@ -237,7 +459,14 @@ def _entities_by_type(cfg, root: "Path", entity_list: list) -> None:
|
||||
@click.option("--model", default=None, help="LLM model name.")
|
||||
@click.option("--entity", "entity_slug", default=None, help="Evaluate a single entity by slug.")
|
||||
@click.option("--chapter", default=None, help="Evaluate entities from a specific chapter.")
|
||||
def evaluate(config_path, provider, model, entity_slug, chapter):
|
||||
@click.option("--force", is_flag=True, default=False,
|
||||
help="Re-evaluate entities whose evaluation file already exists.")
|
||||
@click.option("--model-fallback", "model_fallback", default=None,
|
||||
help="If the primary model hits a rate limit (429), retry the "
|
||||
"failed entities once with this model. Useful on free tiers "
|
||||
"where models have separate quota buckets (e.g. "
|
||||
"gemini-2.5-flash → gemini-2.5-flash-lite).")
|
||||
def evaluate(config_path, provider, model, entity_slug, chapter, force, model_fallback):
|
||||
"""Evaluate entities using LLM-based quality assessment."""
|
||||
cfg, cfg_path = _load_config_or_exit(config_path)
|
||||
root = cfg_path.parent
|
||||
@@ -252,32 +481,44 @@ def evaluate(config_path, provider, model, entity_slug, chapter):
|
||||
click.echo("No entities to evaluate.")
|
||||
return
|
||||
|
||||
# Filter
|
||||
# Filter. Accept hyphenated input for --entity by normalizing to the
|
||||
# underscore slug format produced by parse_entity_directory.
|
||||
if entity_slug:
|
||||
entity_list = [e for e in entity_list if e.slug == entity_slug]
|
||||
if not entity_list:
|
||||
click.echo(f"Error: Entity '{entity_slug}' not found.", err=True)
|
||||
normalized = entity_slug.replace("-", "_")
|
||||
matches = [e for e in entity_list if e.slug == normalized]
|
||||
if not matches:
|
||||
# Build a short "did you mean…" list from entities sharing a stem.
|
||||
stem = normalized.split("_", 1)[0]
|
||||
near = sorted(e.slug for e in entity_list if e.slug.startswith(stem))[:5]
|
||||
msg = f"Error: Entity '{entity_slug}' not found."
|
||||
if near:
|
||||
msg += f" Did you mean: {', '.join(near)} ?"
|
||||
click.echo(msg, err=True)
|
||||
raise SystemExit(1)
|
||||
entity_list = matches
|
||||
elif chapter:
|
||||
entity_list = [e for e in entity_list if chapter in e.source_chapter]
|
||||
if not entity_list:
|
||||
click.echo(f"No entities found for chapter '{chapter}'.")
|
||||
return
|
||||
|
||||
# Skip entities that already have evaluation files (incremental resume)
|
||||
# Skip entities that already have evaluation files (incremental resume).
|
||||
# Applies uniformly to full-pass, --entity, and --chapter runs unless
|
||||
# --force is set.
|
||||
from markitect.infospace.evaluate import run_entity_evaluation
|
||||
output_dir = root / cfg.evaluations_dir
|
||||
if not entity_slug and not chapter and output_dir.is_dir():
|
||||
previous_digests = {
|
||||
p.stem: "" # non-empty sentinel → triggers skip in BatchEvaluator
|
||||
for p in output_dir.glob("*.md")
|
||||
}
|
||||
entity_list = [e for e in entity_list if e.slug not in previous_digests]
|
||||
if not force and output_dir.is_dir():
|
||||
existing = {p.stem for p in output_dir.glob("*.md")}
|
||||
before = len(entity_list)
|
||||
entity_list = [e for e in entity_list if e.slug not in existing]
|
||||
skipped = before - len(entity_list)
|
||||
if not entity_list:
|
||||
click.echo("All entities already evaluated. Nothing to do.")
|
||||
click.echo("All selected entities already evaluated. "
|
||||
"Re-run with --force to overwrite.")
|
||||
return
|
||||
if previous_digests:
|
||||
click.echo(f"Skipping {len(previous_digests)} already-evaluated entities.")
|
||||
if skipped:
|
||||
click.echo(f"Skipping {skipped} already-evaluated entities. "
|
||||
"Use --force to re-evaluate.")
|
||||
|
||||
# Create adapter
|
||||
from markitect.llm import create_adapter
|
||||
@@ -285,10 +526,14 @@ def evaluate(config_path, provider, model, entity_slug, chapter):
|
||||
adapter = create_adapter(provider, model=model)
|
||||
run_config = RunConfig(model_name=model, temperature=0.3, max_tokens=2000)
|
||||
|
||||
# Progress callback
|
||||
# Progress callback — surface error detail so agents don't have to
|
||||
# drop into Python to see whether an ERROR was 429, 503, or auth.
|
||||
def on_progress(done, total, result):
|
||||
status = result.status.upper()
|
||||
click.echo(f" [{done}/{total}] {result.key}: {status}")
|
||||
if status == "ERROR" and result.error:
|
||||
click.echo(f" [{done}/{total}] {result.key}: ERROR — {result.error}")
|
||||
else:
|
||||
click.echo(f" [{done}/{total}] {result.key}: {status}")
|
||||
|
||||
click.echo(f"Evaluating {len(entity_list)} entities via {provider}...")
|
||||
|
||||
@@ -301,6 +546,42 @@ def evaluate(config_path, provider, model, entity_slug, chapter):
|
||||
progress_callback=on_progress,
|
||||
)
|
||||
|
||||
# Model fallback: if any entities failed with a rate-limit-looking
|
||||
# error and the user opted in with --model-fallback, retry them once
|
||||
# with a fresh adapter on the fallback model. Different free-tier
|
||||
# models have separate quota buckets, so this often succeeds when
|
||||
# the primary is exhausted.
|
||||
if model_fallback and summary.failed > 0:
|
||||
rate_limited = [
|
||||
r for r in summary.results
|
||||
if r.status == "error"
|
||||
and r.error
|
||||
and ("429" in r.error or "rate" in r.error.lower())
|
||||
]
|
||||
if rate_limited:
|
||||
retry_slugs = {r.key for r in rate_limited}
|
||||
retry_entities = [e for e in entity_list if e.slug in retry_slugs]
|
||||
click.echo(
|
||||
f"\n{len(retry_entities)} rate-limited entities — "
|
||||
f"retrying with --model-fallback {model_fallback}..."
|
||||
)
|
||||
fb_adapter = create_adapter(provider, model=model_fallback)
|
||||
fb_run_config = RunConfig(
|
||||
model_name=model_fallback, temperature=0.3, max_tokens=2000
|
||||
)
|
||||
fb_summary = run_entity_evaluation(
|
||||
config=cfg,
|
||||
entities=retry_entities,
|
||||
adapter=fb_adapter,
|
||||
run_config=fb_run_config,
|
||||
output_dir=output_dir,
|
||||
progress_callback=on_progress,
|
||||
)
|
||||
summary.succeeded += fb_summary.succeeded
|
||||
summary.failed = (summary.failed - len(retry_entities)) + fb_summary.failed
|
||||
summary.total_prompt_tokens += fb_summary.total_prompt_tokens
|
||||
summary.total_completion_tokens += fb_summary.total_completion_tokens
|
||||
|
||||
click.echo(f"\nDone: {summary.succeeded} succeeded, {summary.failed} failed, {summary.skipped} skipped")
|
||||
if summary.total_tokens > 0:
|
||||
click.echo(f"Tokens used: {summary.total_tokens}")
|
||||
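The `--entity` lookup in the hunk above normalizes hyphens to underscores and, on a miss, suggests slugs that share the first stem. A minimal standalone sketch of that lookup, assuming slugs are plain strings rather than parsed entity objects; `find_entity_slug` and the sample slugs are hypothetical names used only for illustration:

```python
from typing import List, Tuple


def find_entity_slug(query: str, slugs: List[str], limit: int = 5) -> Tuple[List[str], List[str]]:
    """Return (matches, suggestions) for a user-supplied slug.

    Hypothetical helper: it mirrors the lookup in the hunk above but works
    on plain strings instead of parsed entity objects.
    """
    # Accept hyphenated input by mapping it to the underscore slug format.
    normalized = query.replace("-", "_")
    matches = [s for s in slugs if s == normalized]
    if matches:
        return matches, []
    # On a miss, suggest up to `limit` slugs that share the first stem.
    stem = normalized.split("_", 1)[0]
    suggestions = sorted(s for s in slugs if s.startswith(stem))[:limit]
    return [], suggestions


# Illustrative slugs, not taken from the repository.
slugs = ["division_of_labour", "division_of_stock", "market_price"]
print(find_entity_slug("division-of-labour", slugs))
print(find_entity_slug("division-of-labor", slugs))
```

Because the suggestion list is built from a prefix match on the first stem only, a typo in the first word of a slug yields no suggestions; a fuzzy matcher would be needed to cover that case.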
@@ -1015,6 +1296,18 @@ def disciplines(config_path: Optional[str]):
|
||||
help="Run collection checks (C1–C5) after each source file.",
|
||||
)
|
||||
@click.option("--no-commit", is_flag=True, help="Skip git commits.")
|
||||
@click.option(
|
||||
"--eval-after-source",
|
||||
is_flag=True,
|
||||
help="After each source's stages succeed, evaluate just the newly-"
|
||||
"added entities so the per-source commit is self-contained.",
|
||||
)
|
||||
@click.option(
|
||||
"--classify-after-source",
|
||||
is_flag=True,
|
||||
help="After each source's stages succeed, classify just the newly-"
|
||||
"added entities so the per-source commit is self-contained.",
|
||||
)
|
||||
def process(
|
||||
glob_pattern: Optional[str],
|
||||
process_all: bool,
|
||||
@@ -1023,6 +1316,8 @@ def process(
|
||||
model: Optional[str],
|
||||
check_after_each: bool,
|
||||
no_commit: bool,
|
||||
eval_after_source: bool,
|
||||
classify_after_source: bool,
|
||||
):
|
||||
"""Process source files through the pipeline defined in infospace.yaml.
|
||||
|
||||
@@ -1096,12 +1391,22 @@ def process(
|
||||
# Run pipeline
|
||||
from markitect.infospace.pipeline import SourcePipeline
|
||||
|
||||
if (eval_after_source or classify_after_source) and adapter is None:
|
||||
click.echo(
|
||||
"Error: --eval-after-source / --classify-after-source require "
|
||||
"--provider (they call the LLM).",
|
||||
err=True,
|
||||
)
|
||||
raise SystemExit(1)
|
||||
|
||||
pipeline = SourcePipeline(
|
||||
cfg, root,
|
||||
adapter=adapter,
|
||||
provider=provider or "",
|
||||
model=(model or _PROVIDER_DEFAULTS.get(provider or "", "")) if provider else "",
|
||||
no_commit=no_commit,
|
||||
eval_after_source=eval_after_source,
|
||||
classify_after_source=classify_after_source,
|
||||
)
|
||||
|
||||
total = len(source_files)
|
||||
|
||||
@@ -195,12 +195,23 @@ def run_entity_evaluation(
|
||||
"""
|
||||
topic = config.topic.name
|
||||
evaluations_path = output_dir or Path(config.evaluations_dir)
|
||||
evaluator_name = (run_config.model_name if run_config else "unknown")
|
||||
# Fall back from run_config.model_name (may be None if the CLI user did
|
||||
# not pass --model) to the adapter's resolved model, and only then to
|
||||
# "unknown". Keeps the evaluator field in the written frontmatter
|
||||
# informative for later audits.
|
||||
default_evaluator = (
|
||||
(run_config.model_name if run_config else None)
|
||||
or getattr(adapter, "_model", None)
|
||||
or "unknown"
|
||||
)
|
||||
|
||||
def _write_and_notify(done: int, total: int, result) -> None:
|
||||
# Write file immediately on success (incremental — run is resumable)
|
||||
if result.status == "success" and result.response is not None:
|
||||
scores = parse_evaluation_response(result.response.content, dimensions)
|
||||
# Prefer the model name the adapter actually echoed back — it
|
||||
# reflects post-resolution fallbacks (e.g. flash → flash-lite).
|
||||
evaluator_name = result.response.model or default_evaluator
|
||||
evaluation = EntityEvaluation(
|
||||
entity_slug=result.key,
|
||||
evaluator=evaluator_name,
|
||||
|
||||
@@ -81,17 +81,26 @@ def snapshot_from_checks(
|
||||
# ── Metrics file I/O ────────────────────────────────────────────────
|
||||
|
||||
|
||||
def write_metrics_file(metrics: Dict[str, float], path: Path) -> None:
|
||||
def write_metrics_file(metrics: Dict[str, Any], path: Path) -> None:
|
||||
"""Write the latest metrics to a simple YAML file.
|
||||
|
||||
This file is used by ``markitect infospace viability`` for quick
|
||||
threshold checking.
|
||||
threshold checking. Non-numeric values (e.g. ``type_distribution``)
|
||||
are passed through unchanged; floats are rounded to 6 dp; ints are
|
||||
preserved as ints so external consumers don't see ``29`` silently
|
||||
become ``29.0`` on every round-trip.
|
||||
"""
|
||||
def _normalize(v: Any) -> Any:
|
||||
if isinstance(v, bool):
|
||||
return v
|
||||
if isinstance(v, float):
|
||||
return round(v, 6)
|
||||
return v
|
||||
|
||||
path.parent.mkdir(parents=True, exist_ok=True)
|
||||
path.write_text(
|
||||
yaml.safe_dump(
|
||||
{k: round(v, 6) if isinstance(v, float) else v
|
||||
for k, v in sorted(metrics.items())},
|
||||
{k: _normalize(v) for k, v in sorted(metrics.items())},
|
||||
default_flow_style=False,
|
||||
sort_keys=True,
|
||||
),
|
||||
@@ -99,14 +108,20 @@ def write_metrics_file(metrics: Dict[str, float], path: Path) -> None:
|
||||
)
|
||||
|
||||
|
||||
def read_metrics_file(path: Path) -> Dict[str, float]:
|
||||
"""Read the latest metrics from a YAML file."""
|
||||
def read_metrics_file(path: Path) -> Dict[str, Any]:
|
||||
"""Read the latest metrics from a YAML file.
|
||||
|
||||
Returns all keys as written on disk, preserving types verbatim so a
|
||||
round-trip via :func:`write_metrics_file` does not silently drop
|
||||
structured values (e.g. ``type_distribution``) or flatten ints to
|
||||
floats.
|
||||
"""
|
||||
if not path.is_file():
|
||||
return {}
|
||||
raw = yaml.safe_load(path.read_text(encoding="utf-8"))
|
||||
if not isinstance(raw, dict):
|
||||
return {}
|
||||
return {k: float(v) for k, v in raw.items() if isinstance(v, (int, float))}
|
||||
return raw
|
||||
|
||||
|
||||
# ── History operations ───────────────────────────────────────────────
|
||||
|
||||
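The docstrings above describe a type-preserving round trip: floats are rounded to 6 dp, ints stay ints, and structured values pass through unchanged. A small sketch of that normalization step on its own, assuming an in-memory dict and no YAML layer; `normalize_metrics` and the sample values are hypothetical:

```python
from typing import Any, Dict


def normalize_metrics(metrics: Dict[str, Any]) -> Dict[str, Any]:
    """Round floats to 6 dp; leave bools, ints, and structured values untouched."""

    def _normalize(value: Any) -> Any:
        # bool is a subclass of int, so it is checked first to make the intent explicit.
        if isinstance(value, bool):
            return value
        if isinstance(value, float):
            return round(value, 6)
        return value

    return {key: _normalize(value) for key, value in sorted(metrics.items())}


# Illustrative values only.
sample = {
    "per_entity_mean": 3.9571234567,
    "entity_count": 29,
    "type_distribution": {"concept": 12, "actor": 17},
}
result = normalize_metrics(sample)
print(result)
```

The earlier reader coerced every numeric value with `float(v)` and dropped everything else, which is why `29` could come back as `29.0` and why `type_distribution` disappeared on a read-then-write cycle.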
@@ -62,6 +62,8 @@ class SourcePipeline:
|
||||
provider: str = "",
|
||||
model: str = "",
|
||||
no_commit: bool = False,
|
||||
eval_after_source: bool = False,
|
||||
classify_after_source: bool = False,
|
||||
) -> None:
|
||||
self.config = config
|
||||
self.root = root
|
||||
@@ -69,6 +71,8 @@ class SourcePipeline:
|
||||
self.provider = provider
|
||||
self.model = model
|
||||
self.no_commit = no_commit
|
||||
self.eval_after_source = eval_after_source
|
||||
self.classify_after_source = classify_after_source
|
||||
|
||||
# ── Public API ────────────────────────────────────────────────────
|
||||
|
||||
@@ -110,6 +114,12 @@ class SourcePipeline:
|
||||
stage_outputs: Dict[str, str] = {}
|
||||
stage_logs: List[Dict[str, Any]] = []
|
||||
|
||||
# Snapshot entity slugs before any stage runs so we can identify
|
||||
# which entities were newly produced by this source. Used to scope
|
||||
# --eval-after-source / --classify-after-source to only the new
|
||||
# entities.
|
||||
pre_entity_slugs = self._current_entity_slugs()
|
||||
|
||||
print(f"\nProcessing: {source_id}")
|
||||
print("=" * 60)
|
||||
|
||||
@@ -133,6 +143,14 @@ class SourcePipeline:
|
||||
|
||||
print(f"\n {source_id}: all stages complete.")
|
||||
self._write_processing_log(source_id, stage_logs, success=True)
|
||||
|
||||
# Per-source follow-ups: evaluate and/or classify just the new
|
||||
# entities this source produced, so the next commit contains a
|
||||
# fully-processed chapter.
|
||||
new_slugs = self._current_entity_slugs() - pre_entity_slugs
|
||||
if new_slugs and (self.eval_after_source or self.classify_after_source):
|
||||
self._run_per_source_followups(new_slugs)
|
||||
|
||||
if not self.no_commit:
|
||||
self._git_commit(source_id)
|
||||
|
||||
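The per-source follow-ups rely on a before/after snapshot of entity file stems to decide which entities are new. A self-contained sketch of that snapshot-and-diff idea, assuming entities are `*.md` files in a single directory; `entity_slugs` is a hypothetical stand-in for the `_current_entity_slugs` method, and the file names are made up:

```python
import tempfile
from pathlib import Path
from typing import Set


def entity_slugs(entities_dir: Path) -> Set[str]:
    """Return the stems of all `*.md` files in `entities_dir` (empty if missing)."""
    if not entities_dir.is_dir():
        return set()
    return {p.stem for p in entities_dir.glob("*.md")}


with tempfile.TemporaryDirectory() as tmp:
    entities_dir = Path(tmp) / "entities"
    entities_dir.mkdir()
    (entities_dir / "alpha.md").write_text("# Alpha\n", encoding="utf-8")

    # Snapshot before the (simulated) pipeline stages run.
    before = entity_slugs(entities_dir)
    (entities_dir / "beta.md").write_text("# Beta\n", encoding="utf-8")
    (entities_dir / "gamma.md").write_text("# Gamma\n", encoding="utf-8")
    # The set difference keeps only what this run added.
    new_slugs = entity_slugs(entities_dir) - before

print(sorted(new_slugs))
```

A set difference on stems only detects files that did not exist before; an entity that a stage rewrites in place would not be counted as new under this approach.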
@@ -636,7 +654,13 @@ class SourcePipeline:
|
||||
# ── Git Integration ───────────────────────────────────────────────
|
||||
|
||||
def _git_commit(self, source_id: str) -> None:
|
||||
"""Stage all output changes and commit them for *source_id*."""
|
||||
"""Stage all output changes and commit them for *source_id*.
|
||||
|
||||
The commit message body summarises what actually changed — counts
|
||||
of entities / evaluations / classifications / analyses added — so
|
||||
``git log`` reads like the chapter-by-chapter story of the
|
||||
infospace growing, not a wall of identical messages.
|
||||
"""
|
||||
output_dir = self.root / "output"
|
||||
try:
|
||||
subprocess.run(
|
||||
@@ -645,11 +669,11 @@ class SourcePipeline:
|
||||
check=True,
|
||||
capture_output=True,
|
||||
)
|
||||
body = self._compose_commit_body(source_id)
|
||||
result = subprocess.run(
|
||||
[
|
||||
"git", "commit", "-m",
|
||||
f"infospace: process {source_id}\n\n"
|
||||
f"Extract entities, map to VSM, and synthesize analysis.",
|
||||
f"infospace: process {source_id}\n\n{body}",
|
||||
],
|
||||
cwd=str(self.root),
|
||||
capture_output=True,
|
||||
@@ -666,3 +690,146 @@ class SourcePipeline:
|
||||
except subprocess.CalledProcessError as e:
|
||||
stderr = e.stderr.decode() if isinstance(e.stderr, bytes) else (e.stderr or "")
|
||||
print(f" Warning: Git error: {stderr.strip()}")
|
||||
|
||||
# ── Per-source helpers ────────────────────────────────────────────
|
||||
|
||||
def _current_entity_slugs(self) -> set:
|
||||
"""Return the set of entity file stems currently on disk."""
|
||||
entities_dir = self.root / self.config.entities_dir
|
||||
if not entities_dir.is_dir():
|
||||
return set()
|
||||
return {p.stem for p in entities_dir.glob("*.md")}
|
||||
|
||||
def _run_per_source_followups(self, new_slugs: set) -> None:
|
||||
"""Run per-source evaluation and/or classification on *new_slugs*.
|
||||
|
||||
Called after a source's pipeline stages succeed, before the git
|
||||
commit, so each chapter's commit contains the full set of
|
||||
artefacts derived from it.
|
||||
"""
|
||||
from markitect.infospace.entity_parser import parse_entity_directory
|
||||
|
||||
entities_dir = self.root / self.config.entities_dir
|
||||
all_entities = parse_entity_directory(entities_dir)
|
||||
new_entities = [e for e in all_entities if e.slug in new_slugs]
|
||||
if not new_entities:
|
||||
return
|
||||
|
||||
if self.adapter is None:
|
||||
print(
|
||||
" Skipping per-source eval/classify: no LLM adapter "
|
||||
"configured (run with --provider)."
|
||||
)
|
||||
return
|
||||
|
||||
from markitect.prompts.execution.models import RunConfig
|
||||
|
||||
run_config = RunConfig(
|
||||
model_name=self.model or None, temperature=0.3, max_tokens=2000
|
||||
)
|
||||
|
||||
if self.eval_after_source:
|
||||
from markitect.infospace.evaluate import run_entity_evaluation
|
||||
|
||||
print(f" Evaluating {len(new_entities)} new entity/entities…")
|
||||
try:
|
||||
run_entity_evaluation(
|
||||
config=self.config,
|
||||
entities=new_entities,
|
||||
adapter=self.adapter,
|
||||
run_config=run_config,
|
||||
output_dir=self.root / self.config.evaluations_dir,
|
||||
)
|
||||
except Exception as exc:
|
||||
print(f" Warning: per-source evaluation failed: {exc}")
|
||||
|
||||
if self.classify_after_source:
|
||||
from markitect.infospace.classifier import run_entity_classification
|
||||
|
||||
print(f" Classifying {len(new_entities)} new entity/entities…")
|
||||
try:
|
||||
run_entity_classification(
|
||||
config=self.config,
|
||||
entities=new_entities,
|
||||
adapter=self.adapter,
|
||||
run_config=run_config,
|
||||
output_dir=self.root / self.config.classifications_dir,
|
||||
)
|
||||
except Exception as exc:
|
||||
print(f" Warning: per-source classification failed: {exc}")
|
||||
|
||||
def _compose_commit_body(self, source_id: str) -> str:
|
||||
"""Summarise staged output changes into a commit-message body.
|
||||
|
||||
Counts added files per output subdirectory (entities, evaluations,
|
||||
classifications, analyses, mappings…) and produces one line per
|
||||
bucket that actually saw additions. Modified/deleted files are
|
||||
noted separately for auditability.
|
||||
"""
|
||||
default = "Extract entities, map to VSM, and synthesize analysis."
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["git", "diff", "--cached", "--name-status", "--", "output"],
|
||||
cwd=str(self.root),
|
||||
check=True,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
)
|
||||
except subprocess.CalledProcessError:
|
||||
return default
|
||||
|
||||
added_by_bucket: Dict[str, int] = {}
|
||||
modified = 0
|
||||
deleted = 0
|
||||
for line in result.stdout.splitlines():
|
||||
parts = line.split("\t")
|
||||
if len(parts) < 2:
|
||||
continue
|
||||
status = parts[0]
|
||||
path = parts[-1]
|
||||
if status.startswith("A"):
|
||||
bucket = self._bucket_for(path)
|
||||
if bucket:
|
||||
added_by_bucket[bucket] = added_by_bucket.get(bucket, 0) + 1
|
||||
elif status.startswith("M"):
|
||||
modified += 1
|
||||
elif status.startswith("D"):
|
||||
deleted += 1
|
||||
|
||||
if not added_by_bucket and not modified and not deleted:
|
||||
return default
|
||||
|
||||
# Emit buckets in a deterministic, reader-friendly order.
|
||||
order = ["entities", "mappings", "analyses", "evaluations",
|
||||
"classifications", "metrics", "logs", "other"]
|
||||
lines: List[str] = []
|
||||
for bucket in order:
|
||||
n = added_by_bucket.get(bucket, 0)
|
||||
if n:
|
||||
lines.append(f"- {bucket}: +{n}")
|
||||
if modified:
|
||||
lines.append(f"- modified: {modified}")
|
||||
if deleted:
|
||||
lines.append(f"- deleted: {deleted}")
|
||||
return "\n".join(lines) if lines else default
|
||||
|
||||
def _bucket_for(self, path: str) -> Optional[str]:
|
||||
"""Map an ``output/...`` path to a commit-summary bucket name."""
|
||||
# Use configured directory basenames where possible so non-default
|
||||
# layouts still bucket correctly.
|
||||
buckets = {
|
||||
Path(self.config.entities_dir).name: "entities",
|
||||
Path(self.config.evaluations_dir).name: "evaluations",
|
||||
Path(self.config.classifications_dir).name: "classifications",
|
||||
}
|
||||
parts = Path(path).parts
|
||||
if len(parts) < 2 or parts[0] != "output":
|
||||
return None
|
||||
sub = parts[1]
|
||||
if sub in buckets:
|
||||
return buckets[sub]
|
||||
# Heuristic fallback for common additional output subdirectories.
|
||||
known = {"mappings", "analyses", "metrics", "logs"}
|
||||
if sub in known:
|
||||
return sub
|
||||
return "other"
|
||||
|
||||
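The commit body is derived from the tab-separated output of `git diff --cached --name-status`. The sketch below shows the parsing and counting step in isolation, assuming the diff text is already available as a string. `summarize_name_status` is a hypothetical function; unlike the method above, it buckets by the directory directly under `output/` without consulting the configuration and emits buckets alphabetically rather than in a fixed order:

```python
from pathlib import PurePosixPath
from typing import Dict, List


def summarize_name_status(diff_output: str) -> str:
    """Summarise `git diff --cached --name-status` text into bullet lines."""
    added: Dict[str, int] = {}
    modified = 0
    deleted = 0
    for line in diff_output.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:
            continue
        # For renames the last field is the new path; other statuses have one path.
        status, path = parts[0], parts[-1]
        if status.startswith("A"):
            path_parts = PurePosixPath(path).parts
            if len(path_parts) >= 2 and path_parts[0] == "output":
                bucket = path_parts[1]
                added[bucket] = added.get(bucket, 0) + 1
        elif status.startswith("M"):
            modified += 1
        elif status.startswith("D"):
            deleted += 1

    lines: List[str] = [f"- {bucket}: +{count}" for bucket, count in sorted(added.items())]
    if modified:
        lines.append(f"- modified: {modified}")
    if deleted:
        lines.append(f"- deleted: {deleted}")
    return "\n".join(lines)


# Illustrative input, shaped like real name-status output.
sample = (
    "A\toutput/entities/alpha.md\n"
    "A\toutput/entities/beta.md\n"
    "A\toutput/evaluations/alpha.md\n"
    "M\toutput/metrics/latest.yaml\n"
)
print(summarize_name_status(sample))
```

Keeping the parser a pure function of the diff text makes it testable without a git repository, which is harder to do for a method that shells out directly.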
@@ -131,6 +131,12 @@ def build_state(
|
||||
This is a convenience function that assembles the state object
|
||||
and optionally runs viability checks if *metrics* are provided.
|
||||
"""
|
||||
if not isinstance(config, InfospaceConfig):
|
||||
raise TypeError(
|
||||
f"build_state(config=...) expects an InfospaceConfig instance, "
|
||||
f"got {type(config).__name__}. If you have a path, load the "
|
||||
f"config first with load_infospace_config(path)."
|
||||
)
|
||||
state = InfospaceState(
|
||||
config=config,
|
||||
entities=entities or [],
|
||||
|
||||
@@ -9,7 +9,11 @@ from markitect.llm.adapter import LLMAdapter
|
||||
from markitect.llm.models import RunConfig, LLMResponse
|
||||
from markitect.llm.config import resolve_api_key, find_project_root
|
||||
from markitect.llm._http import post_json
|
||||
from markitect.llm.exceptions import LLMConfigurationError
|
||||
from markitect.llm.exceptions import (
|
||||
LLMConfigurationError,
|
||||
LLMAPIError,
|
||||
LLMRateLimitError,
|
||||
)
|
||||
|
||||
_DEFAULT_MODEL = "gemini-2.5-flash"
|
||||
_API_BASE = "https://generativelanguage.googleapis.com/v1beta"
|
||||
@@ -26,10 +30,12 @@ class GeminiAdapter(LLMAdapter):
|
||||
model: Optional[str] = None,
|
||||
api_key: Optional[str] = None,
|
||||
system_prompt: Optional[str] = None,
|
||||
max_retries: int = 3,
|
||||
**_kwargs: Any,
|
||||
):
|
||||
self._model = model or _DEFAULT_MODEL
|
||||
self._system_prompt = system_prompt
|
||||
self._max_retries = max_retries
|
||||
|
||||
root = find_project_root()
|
||||
key_file_paths = [root / "apikey-geminifree.txt"] if root else []
|
||||
@@ -77,7 +83,7 @@ class GeminiAdapter(LLMAdapter):
|
||||
url = f"{_API_BASE}/models/{model}:generateContent?key={self._api_key}"
|
||||
|
||||
start = time.time()
|
||||
data = post_json(url, payload, timeout=config.timeout_seconds)
|
||||
data = self._post_with_retries(url, payload, timeout=config.timeout_seconds)
|
||||
latency = time.time() - start
|
||||
|
||||
# Parse Gemini response
|
||||
@@ -113,3 +119,27 @@ class GeminiAdapter(LLMAdapter):
        if not (0.0 <= config.temperature <= 2.0):
            return False
        return True

    # ── Internals ───────────────────────────────────────────────────

    def _post_with_retries(
        self,
        url: str,
        payload: Dict[str, Any],
        timeout: int,
    ) -> Dict[str, Any]:
        last_exc: Optional[Exception] = None
        for attempt in range(self._max_retries + 1):
            try:
                return post_json(url, payload, timeout=timeout)
            except LLMRateLimitError as exc:
                last_exc = exc
                if attempt < self._max_retries:
                    time.sleep(2 ** attempt)
            except LLMAPIError as exc:
                if exc.status_code in (502, 503, 504) and attempt < self._max_retries:
                    last_exc = exc
                    time.sleep(2 ** attempt)
                else:
                    raise
        raise last_exc  # type: ignore[misc]
|
||||
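The retry loop above sleeps `2 ** attempt` seconds between attempts and re-raises the last error once the retries are exhausted. A generic sketch of the same exponential-backoff pattern, assuming a single retryable exception type; `TransientError`, `call_with_retries`, and the injectable `sleep` parameter are hypothetical and not part of the adapter:

```python
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (e.g. HTTP 429 or 503)."""


def call_with_retries(
    func: Callable[[], T],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on TransientError with delays of 1, 2, 4, ... seconds."""
    last_exc: Optional[Exception] = None
    for attempt in range(max_retries + 1):
        try:
            return func()
        except TransientError as exc:
            last_exc = exc
            # Back off only if another attempt remains.
            if attempt < max_retries:
                sleep(2 ** attempt)
    assert last_exc is not None  # reached only when every attempt failed
    raise last_exc


# Usage: a fake call that fails twice, then succeeds. Delays are recorded
# instead of slept so the example runs instantly.
calls = {"n": 0}
delays: List[float] = []


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("rate limited")
    return "ok"


result = call_with_retries(flaky, max_retries=3, sleep=delays.append)
print(result, delays)
```

Passing the sleep function as a parameter is a testing convenience in this sketch; it lets a test assert on the backoff schedule without waiting for real delays.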
4
package-lock.json
generated
@@ -1,11 +1,11 @@
|
||||
{
|
||||
"name": "markitect_project",
|
||||
"name": "markitect-main",
|
||||
"version": "1.0.0",
|
||||
"lockfileVersion": 3,
|
||||
"requires": true,
|
||||
"packages": {
|
||||
"": {
|
||||
"name": "markitect_project",
|
||||
"name": "markitect-main",
|
||||
"version": "1.0.0",
|
||||
"license": "ISC",
|
||||
"dependencies": {
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
{
|
||||
"name": "markitect_project",
|
||||
"name": "markitect-main",
|
||||
"version": "1.0.0",
|
||||
"description": "",
|
||||
"main": "index.js",
|
||||
@@ -14,7 +14,7 @@
|
||||
},
|
||||
"repository": {
|
||||
"type": "git",
|
||||
"url": "http://92.205.130.254:32166/coulomb/markitect_project"
|
||||
"url": "http://92.205.130.254:32166/coulomb/markitect-main"
|
||||
},
|
||||
"keywords": [],
|
||||
"author": "",
|
||||
|
||||
12
registry/README.md
Normal file
@@ -0,0 +1,12 @@
# Capability Registry

Markdown-first capability index for federation and reuse planning.

## Authoring

1. Copy a capability entry template (see reuse-surface `templates/capability-entry.template.md`).
2. Add the row to `indexes/capabilities.yaml`.
3. Run `reuse-surface validate` from a checkout with the CLI installed.
4. Merge to `main` and verify publish with `reuse-surface establish --publish-check`.

Federation contract: reuse-surface `docs/RegistryFederation.md`.
0
registry/capabilities/.gitkeep
Normal file
4
registry/indexes/capabilities.yaml
Normal file
@@ -0,0 +1,4 @@
version: 1
updated: '2026-06-16'
domain: helix_forge
capabilities: []
@@ -10,6 +10,14 @@ and formally closes the roadmap.
**Parent roadmap:** `roadmap/infospace-tooling/PLAN.md`
**Example location:** `examples/infospace-with-history/`

**Status: CLOSED (2026-04-22).** All acceptance criteria except the cosmetic
per-chapter history (C.7) are met. Final metrics: 988 entities, 988 evaluations,
6/6 viability thresholds PASS (`per_entity_mean = 3.957`). Tooling work that
came out of this close-out landed as commits `c0615c2d` (gemini retry,
unified skip-existing, non-destructive metrics I/O) and `d44a4cd3`
(`infospace entity` lookup, `evaluate --model-fallback`, `llm-check`
stale-key advisory, `build_state` type guard).

### State at workstream open (2026-02-26)

| Item | Status |
@@ -22,6 +30,28 @@ and formally closes the roadmap.
| 3 missing evaluations | ⏳ Outstanding |
| 4 follow-up items (commit b055c8d7) | ⏳ Outstanding |

### State at workstream close (2026-04-22)

| Task | Status |
|------|--------|
| C.1 Complete 3 missing entity evaluations | ✅ Done (commit f325f89d) |
| C.2 Run eval-summary and verify viability | ✅ Done — 6/6 PASS |
| C.3 Refresh metrics report (988 entities) | ✅ Done — snapshot `090bb961` |
| C.4 Document advanced usage patterns | ✅ Done — `examples/infospace-with-history/docs/advanced-usage.md` |
| C.5 Composition-examples documentation | ✅ Done — `docs/composition-guide.md` |
| C.6 Performance benchmarking note | ✅ Done — `examples/infospace-with-history/docs/performance-notes.md` |
| C.7 Clean per-chapter git history | ⏭️ Deferred indefinitely — see note below |
| C.8 Formally close S3 roadmap | ✅ This commit |

**C.7 disposition.** The task assumed a pre-existing `clean-example-history`
branch with chapters 1–8 already committed; that branch no longer exists in
the repo. The task is explicitly cosmetic ("does not change output files"),
and the output files themselves are canonical. Reconstructing a 35-commit
per-chapter history from scratch would be archaeological rather than useful.
Closing as "won't do" unless a specific archival need surfaces. If revisited,
entities can be grouped by their `## Source Chapter` markdown section to
reconstruct chapter membership.

---

## Tasks
@@ -1,5 +1,31 @@
# Viable Infospace Tooling — Roadmap

## Status: CLOSED (2026-04-22)

All three stages complete.

| Stage | Status | Notes |
|-------|--------|-------|
| Stage 1 — Platform additions (S1.1–S1.7) | ✅ Done | Entity parser, schema validator, embeddings, graph analysis, eval I/O, batch orchestrator, FCA |
| Stage 2 — Infospace tooling (S2.1–S2.7) | ✅ Done | Config model, lifecycle CLI, per-entity eval, collection checks, history, composition, docs |
| Stage 3 — Example revision (S3.1–S3.5) | ✅ Done (except cosmetic S3.2) | See `roadmap/infospace-s3-closeout/PLAN.md` |

**Final validation (Wealth of Nations / VSM example, 988 entities):**
- 988 per-entity evaluations landed
- Collection checks pass 6/6 viability thresholds (`per_entity_mean = 3.957`
  against threshold 3.5; `redundancy_ratio = 0.006`; `coverage_ratio = 0.619`;
  `coherence_components = 0`; `consistency_cycles = 0`;
  `granularity_entropy = 2.675`)
- Composition demonstrated via `examples/supply-chain-vsm/`
- S3.2 (clean per-chapter git history) deferred as cosmetic-only; rationale
  in the close-out plan

See `roadmap/infospace-s3-closeout/PLAN.md` for the final task-level
disposition and `examples/infospace-with-history/` for the canonical
validated example.

---

## Vision

An **infospace** is a structured, evaluable, composable collection of
@@ -39,7 +39,7 @@ Confirm the main Markitect application still works correctly with the current
|
||||
capability code before publishing.
|
||||
|
||||
```bash
|
||||
cd /home/worsch/markitect_project
|
||||
cd /home/worsch/markitect-main
|
||||
make testdrive-jsui-test-all # 84 tests must pass
|
||||
# Manually verify view and edit modes in the running Markitect app
|
||||
```
|
||||
|
||||
@@ -30,7 +30,7 @@ class TestActualRoundtripBehavior:
|
||||
cmd = ["python", "-m", "markitect.cli"] + args
|
||||
result = subprocess.run(
|
||||
cmd,
|
||||
cwd="/home/worsch/markitect_project",
|
||||
cwd="/home/worsch/markitect-main",
|
||||
capture_output=True,
|
||||
text=True
|
||||
)
|
||||
|
||||
@@ -5,7 +5,7 @@ This test implements the requirements for initializing a SQLite database
|
||||
and storing markdown files with front matter parsing.
|
||||
|
||||
Issue #1: Initialize Database and Store Example Markdown File
|
||||
https://gitea.coulomb.social/coulomb/markitect_project/issues/1
|
||||
https://gitea.coulomb.social/coulomb/markitect-main/issues/1
|
||||
"""
|
||||
|
||||
import pytest
|
||||
|
||||
@@ -33,7 +33,7 @@ class TestRoundtripBase:
|
||||
cmd,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
cwd="/home/worsch/markitect_project"
|
||||
cwd="/home/worsch/markitect-main"
|
||||
)
|
||||
|
||||
def validate_basic_structure_preservation(self, original: str, reconstructed: str) -> Dict[str, Any]:
|
||||
|
||||
@@ -223,3 +223,129 @@ class TestViabilityCommand:
|
||||
)
|
||||
assert result.exit_code == 0
|
||||
assert "No viability thresholds" in result.output


# ── chapters (per-source triage view) ────────────────────────────────


class TestChaptersCommand:
    @pytest.fixture
    def chapters_dir(self, tmp_path):
        """Infospace with 2 source files and matching entities."""
        config_yaml = """\
topic:
  name: "WoN"
  domain: "Economics"
  sources: artifacts/sources
"""
        (tmp_path / "infospace.yaml").write_text(config_yaml)

        sources = tmp_path / "artifacts" / "sources"
        sources.mkdir(parents=True)
        (sources / "book-1-chapter-01.md").write_text("# Chapter 1\n\nText.\n")
        (sources / "book-1-chapter-02.md").write_text("# Chapter 2\n\nText.\n")

        entities = tmp_path / "output" / "entities"
        entities.mkdir(parents=True)
        (entities / "alpha.md").write_text(
            "# Alpha\n\n## Definition\n\nX.\n\n"
            "## Source Chapter\n\nBook I, Chapter 1\n"
        )
        (entities / "beta.md").write_text(
            "# Beta\n\n## Definition\n\nY.\n\n"
            "## Source Chapter\n\nBook I, Chapter 2\n"
        )
        (entities / "gamma.md").write_text(
            "# Gamma\n\n## Definition\n\nZ.\n\n"
            "## Source Chapter\n\nBook I, Chapter 2\n"
        )
        return tmp_path

    def test_lists_sources_with_counts(self, runner, chapters_dir):
        result = runner.invoke(
            infospace_commands,
            ["chapters", "--config", str(chapters_dir / "infospace.yaml")],
        )
        assert result.exit_code == 0
        assert "book-1-chapter-01" in result.output
        assert "book-1-chapter-02" in result.output
        # ch 1 -> 1 entity, ch 2 -> 2 entities
        assert "2 source file(s); 3 entities" in result.output

    def test_json_format(self, runner, chapters_dir):
        result = runner.invoke(
            infospace_commands,
            ["chapters", "--config", str(chapters_dir / "infospace.yaml"),
             "--format", "json"],
        )
        assert result.exit_code == 0
        import json
        rows = json.loads(result.output)
        by_id = {r["source_id"]: r for r in rows}
        assert by_id["book-1-chapter-01"]["entities"] == 1
        assert by_id["book-1-chapter-02"]["entities"] == 2

    def test_no_sources_dir(self, runner, tmp_path):
        (tmp_path / "infospace.yaml").write_text(
            "topic:\n name: X\n sources: missing\n"
        )
        result = runner.invoke(
            infospace_commands,
            ["chapters", "--config", str(tmp_path / "infospace.yaml")],
        )
        assert result.exit_code == 1


# ── process: eval-after-source / classify-after-source flags ─────────


class TestProcessAfterSourceFlags:
    def test_flags_registered_in_help(self, runner):
        result = runner.invoke(infospace_commands, ["process", "--help"])
        assert result.exit_code == 0
        assert "--eval-after-source" in result.output
        assert "--classify-after-source" in result.output

    def test_flags_require_provider(self, runner, tmp_path):
        (tmp_path / "infospace.yaml").write_text(
            "topic:\n name: X\n sources: sources\n"
            "pipeline:\n stages:\n - template: extract-entities\n"
        )
        sources = tmp_path / "sources"
        sources.mkdir()
        (sources / "s1.md").write_text("source")
        result = runner.invoke(
            infospace_commands,
            ["process", "--all",
             "--config", str(tmp_path / "infospace.yaml"),
             "--eval-after-source"],
        )
        assert result.exit_code == 1
        assert "require --provider" in result.output


# ── pipeline: commit body composition ────────────────────────────────


class TestCommitBodyComposition:
    def test_bucket_for(self, tmp_path):
        from markitect.infospace.config import InfospaceConfig, TopicConfig
        from markitect.infospace.pipeline import SourcePipeline
        cfg = InfospaceConfig(topic=TopicConfig(name="T", domain="D"))
        p = SourcePipeline(cfg, tmp_path)
        assert p._bucket_for("output/entities/x.md") == "entities"
        assert p._bucket_for("output/evaluations/x.md") == "evaluations"
        assert p._bucket_for("output/classifications/x.md") == "classifications"
        assert p._bucket_for("output/mappings/x.md") == "mappings"
        assert p._bucket_for("output/notes/x.md") == "other"
        assert p._bucket_for("README.md") is None  # not under output/

    def test_compose_body_uses_default_on_no_diff(self, tmp_path):
        """When git diff fails or returns empty, fall back to the default blurb."""
        from markitect.infospace.config import InfospaceConfig, TopicConfig
        from markitect.infospace.pipeline import SourcePipeline
        cfg = InfospaceConfig(topic=TopicConfig(name="T", domain="D"))
        # Not a git repo, so `git diff --cached` will raise CalledProcessError.
        p = SourcePipeline(cfg, tmp_path)
        body = p._compose_commit_body("some-source")
        assert "Extract entities" in body

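Reviewer sketch: the assertions in `test_bucket_for` pin down a path-to-bucket mapping (four named buckets under `output/`, `"other"` for anything else under `output/`, and `None` outside it). The standalone function below is one minimal way to satisfy those assertions. The name `bucket_for` and the `KNOWN_BUCKETS` set are hypothetical; the actual `SourcePipeline._bucket_for` is not shown in this diff and may be implemented differently.

```python
from pathlib import PurePosixPath
from typing import Optional

# Buckets named in the test; anything else under output/ maps to "other".
KNOWN_BUCKETS = {"entities", "evaluations", "classifications", "mappings"}


def bucket_for(path: str) -> Optional[str]:
    """Hypothetical stand-in for SourcePipeline._bucket_for."""
    parts = PurePosixPath(path).parts
    if len(parts) < 2 or parts[0] != "output":
        return None  # not under output/
    return parts[1] if parts[1] in KNOWN_BUCKETS else "other"
```

Returning `None` rather than `"other"` for paths outside `output/` lets a caller skip unrelated files instead of counting them.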
@@ -124,6 +124,33 @@ class TestMetricsFileIO:
        path.write_text("just a string", encoding="utf-8")
        assert read_metrics_file(path) == {}

    def test_round_trip_preserves_structured_values(self, tmp_path):
        """Non-numeric values like type_distribution must survive a round-trip.

        Regression: eval-summary --update-metrics used to drop any key
        whose value wasn't a bare number, silently erasing type_distribution
        from the file on every run.
        """
        path = tmp_path / "metrics.yaml"
        metrics = {
            "per_entity_mean": 3.9567,
            "vsm_type_matrix_cells": 29,
            "type_distribution": {
                "Element": 315,
                "Institution": 122,
                "Principle": 102,
            },
        }
        write_metrics_file(metrics, path)
        loaded = read_metrics_file(path)
        assert loaded["type_distribution"] == {
            "Element": 315, "Institution": 122, "Principle": 102,
        }
        # And the int stayed an int on disk, not 29.0.
        raw = path.read_text(encoding="utf-8")
        assert "vsm_type_matrix_cells: 29\n" in raw
        assert "vsm_type_matrix_cells: 29.0" not in raw


# ── record_check_results ────────────────────────────────────────────

82
tests/unit/llm/test_gemini.py
Normal file
@@ -0,0 +1,82 @@
"""Tests for markitect.llm.gemini — retry behavior + happy path."""

from unittest import mock

import pytest

from markitect.llm.gemini import GeminiAdapter
from markitect.llm.exceptions import LLMAPIError, LLMRateLimitError
from markitect.prompts.execution.models import RunConfig, LLMResponse


def _api_response(text="hello", model="gemini-2.5-flash"):
    return {
        "candidates": [
            {
                "content": {"parts": [{"text": text}], "role": "model"},
                "finishReason": "STOP",
            }
        ],
        "modelVersion": model,
        "usageMetadata": {
            "promptTokenCount": 3,
            "candidatesTokenCount": 2,
            "totalTokenCount": 5,
        },
    }


class TestGeminiAdapter:
    def _adapter(self, **kwargs):
        defaults = {"api_key": "AIza-test"}
        defaults.update(kwargs)
        return GeminiAdapter(**defaults)

    @mock.patch("markitect.llm.gemini.post_json")
    def test_success(self, mock_post):
        mock_post.return_value = _api_response("generated")
        adapter = self._adapter()
        resp = adapter.execute_prompt("hi", RunConfig())
        assert isinstance(resp, LLMResponse)
        assert resp.content == "generated"
        assert resp.metadata["provider"] == "gemini"

    @mock.patch("markitect.llm.gemini.post_json")
    @mock.patch("markitect.llm.gemini.time.sleep")
    def test_retry_on_429(self, mock_sleep, mock_post):
        mock_post.side_effect = [
            LLMRateLimitError("rate limited", status_code=429),
            _api_response("recovered"),
        ]
        adapter = self._adapter(max_retries=2)
        resp = adapter.execute_prompt("hi", RunConfig())
        assert resp.content == "recovered"
        assert mock_sleep.call_count == 1

    @mock.patch("markitect.llm.gemini.post_json")
    @mock.patch("markitect.llm.gemini.time.sleep")
    def test_retry_on_503(self, mock_sleep, mock_post):
        mock_post.side_effect = [
            LLMAPIError("unavailable", status_code=503),
            _api_response("back"),
        ]
        adapter = self._adapter(max_retries=2)
        resp = adapter.execute_prompt("hi", RunConfig())
        assert resp.content == "back"

    @mock.patch("markitect.llm.gemini.post_json")
    def test_no_retry_on_400(self, mock_post):
        mock_post.side_effect = LLMAPIError("bad request", status_code=400)
        adapter = self._adapter(max_retries=2)
        with pytest.raises(LLMAPIError) as exc_info:
            adapter.execute_prompt("hi", RunConfig())
        assert exc_info.value.status_code == 400

    @mock.patch("markitect.llm.gemini.post_json")
    @mock.patch("markitect.llm.gemini.time.sleep")
    def test_exhausted_retries_raises(self, mock_sleep, mock_post):
        mock_post.side_effect = LLMRateLimitError("rate limited", status_code=429)
        adapter = self._adapter(max_retries=1)
        with pytest.raises(LLMRateLimitError):
            adapter.execute_prompt("hi", RunConfig())
        assert mock_sleep.call_count == 1  # 1 retry before giving up

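Reviewer sketch: the tests above imply a retry contract (429 and 503 are retried, 400 is raised immediately, and `max_retries=1` means exactly one sleep before the error propagates). The loop below is one minimal way to meet that contract. Everything in it is an assumption for illustration: `call_with_retry`, `APIError`, `RETRYABLE_STATUS`, `base_delay`, and the exponential backoff schedule are hypothetical names and choices, not the code in `markitect.llm.gemini`, which this diff does not show.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: only the status codes the tests exercise are treated as retryable.
RETRYABLE_STATUS = {429, 503}


class APIError(Exception):
    """Hypothetical stand-in for LLMAPIError / LLMRateLimitError."""

    def __init__(self, message: str, status_code: int):
        super().__init__(message)
        self.status_code = status_code


def call_with_retry(
    call: Callable[[], T],
    max_retries: int = 2,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying up to `max_retries` times on retryable errors."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except APIError as exc:
            out_of_retries = attempt == max_retries
            if exc.status_code not in RETRYABLE_STATUS or out_of_retries:
                raise  # non-retryable error, or retries exhausted
            sleep(base_delay * 2 ** attempt)  # backoff schedule is assumed
    raise AssertionError("unreachable when max_retries >= 0")
```

Passing `sleep` as a parameter is a design choice of this sketch: a test can inject a recorder instead of patching `time.sleep`, which is what the tests above do with `mock.patch`.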
67
workplans/MARKITECT-WP-0001-statehub-bootstrap.md
Normal file
@@ -0,0 +1,67 @@
---
id: MARKITECT-WP-0001
type: workplan
title: "Bootstrap State Hub integration"
domain: communication
repo: markitect-main
status: finished
owner: codex
topic_slug: communication
created: "2026-06-22"
updated: "2026-06-22"
state_hub_workstream_id: "dfc40b03-fe8e-49fe-b8d4-86eb1fe26b4a"
---

# Bootstrap State Hub integration

Knowledge artifact management and markdown engine platform.

## Review Generated Integration Files

```task
id: MARKITECT-WP-0001-T01
status: done
priority: high
state_hub_task_id: "7455a381-a93d-4220-8f80-3b6ccf953cff"

```

Result 2026-06-22: SCOPE.md and INTRODUCTION.md reviewed; AGENTS.md confirmed.

Review `INTENT.md`, `SCOPE.md`, `AGENTS.md`, and `.custodian-brief.md`.
Replace generated placeholders with repo-specific facts where needed.

## Verify Local Developer Workflow

```task
id: MARKITECT-WP-0001-T02
status: done
priority: high
state_hub_task_id: "7e34bdab-aa49-49ca-b28a-b254725dd8db"

```

Result 2026-06-22: Documented make-based Python/JS workflow.

Identify the repo's install, test, lint, build, and run commands. Add or refine
those commands in the agent instructions so future coding sessions can verify
changes confidently.

## Seed First Real Workplan

```task
id: MARKITECT-WP-0001-T03
status: done
priority: medium
state_hub_task_id: "35a64da7-dda9-4315-901d-88c6827432d9"

```

Result 2026-06-22: MARKITECT-WP-0002 already exists (TestDrive npm publication).

Create the first implementation workplan for the repository's most important
next change. After workplan file updates, run from `~/state-hub`:

```bash
make fix-consistency REPO=markitect-main
```
28
workplans/MARKITECT-WP-0002-testdrive-jsui-publication.md
Normal file
@@ -0,0 +1,28 @@
---
id: MARKITECT-WP-0002
type: workplan
title: "TestDrive-JSUI — npm Publication"
domain: communication
repo: markitect-main
status: backlog
owner: codex
topic_slug: communication
created: "2026-06-22"
updated: "2026-06-22"
state_hub_workstream_id: "e203d487-01f1-494a-b14d-a436241a4c01"
---

# TestDrive-JSUI — npm Publication

Backlog workstream for publishing the TestDrive JSUI package to npm.

## Publication Readiness

```task
id: MARKITECT-WP-0002-T01
status: todo
priority: medium
state_hub_task_id: "88b3c206-4d45-4bb3-bbb3-47443cdf2123"
```

Define package scope, versioning, and publication checklist for TestDrive-JSUI.