28 Commits

Author SHA1 Message Date
cd8339ecef Complete State Hub bootstrap workplans (WP-0001)
Some checks failed
Test Suite / security-scan (push) Has been cancelled
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
- Review integration files; fill SCOPE where templated
- Document dev workflow in stack-and-commands.md
- Seed WP-0002 implementation workplan; mark bootstrap finished
- Hub sync via fix-consistency
2026-06-22 23:35:13 +02:00
f8ab58edbe chore(consistency): sync task status from DB [auto]
Updated by fix-consistency on 2026-06-22:
  - update .custodian-brief.md for markitect-main
2026-06-22 23:32:31 +02:00
2b5e9743fe Add State Hub bootstrap workplan and agent integration files
Seed workplans/ with bootstrap workplan to satisfy ADR-001 C-01.
Includes regenerated dev-hub session-protocol and agent instruction files.
2026-06-22 21:44:38 +02:00
753c3d4fc6 chore(consistency): sync task status from DB [auto]
Updated by fix-consistency on 2026-06-22:
  - update .custodian-brief.md for markitect-main
2026-06-22 21:42:25 +02:00
94e84f0db9 chore(consistency): sync task status from DB [auto]
Updated by fix-consistency on 2026-06-22:
  - update .custodian-brief.md for markitect-project
2026-06-22 21:40:39 +02:00
a765ccda21 chore(consistency): sync task status from DB [auto]
Updated by fix-consistency on 2026-06-22:
  - update .custodian-brief.md for markitect-main
2026-06-22 21:40:31 +02:00
4472fa6c7f chore(consistency): sync task status from DB [auto]
Updated by fix-consistency on 2026-06-22:
  - update .custodian-brief.md for markitect-main
2026-06-22 18:02:31 +02:00
526fa1e3bc Human-review .repo-classification.yaml (CUST-WP-0050 follow-up)
2026-06-22 17:56:16 +02:00
86de18c247 Add .repo-classification.yaml (CUST-WP-0050 T11 agent first-pass)
2026-06-22 17:47:38 +02:00
ca9d0d7030 Add credential routing instructions for all agent runtimes
Propagate shared credential-routing section (Codex, Claude, Grok, llm-connect)
from state-hub template via scripts/propagate_credential_routing.py.
2026-06-18 22:48:38 +02:00
bc527ec09a Add capability registry scaffold (REUSE-WP-0014-T05 B03)
2026-06-16 01:54:12 +02:00
ce984482e2 assessment of forgotten functionality
2026-05-23 06:44:38 +02:00
9266f124e6 Refresh agent instruction files
2026-05-18 16:55:45 +02:00
8740a66611 chore(consistency): sync task status from DB [auto]
Updated by fix-consistency on 2026-05-03:
  - update .custodian-brief.md for markitect-project
2026-05-03 19:31:36 +02:00
b7e9edbb4b chore(consistency): sync task status from DB [auto]
Updated by fix-consistency on 2026-05-01:
  - update .custodian-brief.md for markitect-project
2026-05-01 23:07:28 +02:00
479fa95fdf Scope update from repo-scoping refactor
2026-05-01 12:27:17 +02:00
eb9b622499 chore: gitignore Claude Code session lock files
`.claude/scheduled_tasks.lock` is per-session runtime state (holds the
owning session id and pid for the ScheduleWakeup queue); it shouldn't
be committed. Widened the pattern to `.claude/*.lock` so future lock
kinds are covered too.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 21:50:20 +02:00
e3e5b8ecc1 feat(infospace): systematic long-text processing — rich commit bodies, per-source eval/classify, chapters view
Three coordinated changes that let the pipeline produce a clean
chapter-by-chapter git history on long texts without archaeology after
the fact.

1. Richer commit messages. `SourcePipeline._git_commit` now diffs the
   staged changes, buckets added files by output subdirectory (entities,
   evaluations, classifications, mappings, analyses, metrics, logs), and
   includes counts in the commit body. So `git log` reads "entities:
   +23, evaluations: +23" per chapter instead of the same generic blurb
   on every commit. Zero behaviour change when no output changed; falls
   back to the original message if the diff query fails.

2. --eval-after-source / --classify-after-source on `infospace process`.
   After a source's stages succeed, the pipeline identifies which entity
   files are *new* (set diff of entity slugs before vs after), loads
   their EntityMeta, and runs per-entity evaluation and/or
   classification scoped to just those slugs before the per-source git
   commit lands. Result: each chapter's commit is self-contained —
   extraction + evaluation + classification in one atomic unit. Gated
   behind explicit flags because the cost is real (LLM latency per
   chapter rather than amortised across one bulk batch).

3. `markitect infospace chapters` subcommand. Lists source files in
   canonical order with entity count, evaluated count, classified
   count, and mean per-entity score per source. Text or JSON output.
   Natural triage surface for long-text infospaces — spot chapters that
   under-extracted or evaluated poorly.

Also: `docs/advanced-usage.md` gets a new "Systematic processing of
long texts" section with the recommended flag combo and the tradeoff
note on cost.

11 new unit tests cover the chapters command (text/json/no-sources),
the process flag wiring (help + provider requirement), and the
commit-body bucket logic. Full infospace+llm unit suite (315 tests)
green; 3 pre-existing infospace failures unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 08:24:26 +02:00
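
The commit-body bucket logic described in e3e5b8ecc1 could be sketched roughly as follows. This is a minimal illustration, not the actual `SourcePipeline._git_commit` code: the function name `summarize_added_files`, the `fallback` text, and the example paths are all hypothetical, and the list of added files is assumed to come from something like `git diff --cached --name-only --diff-filter=A`.

```python
from collections import Counter
from pathlib import PurePosixPath

# Output subdirectories named in the commit message.
BUCKETS = ("entities", "evaluations", "classifications", "mappings",
           "analyses", "metrics", "logs")


def summarize_added_files(added_paths, fallback="Process source"):
    """Build a commit body line such as 'entities: +23, evaluations: +23'.

    `added_paths` is assumed to be the list of newly staged files.
    `fallback` is a hypothetical stand-in for the original generic message.
    """
    counts = Counter()
    for path in added_paths:
        # Count a file under the first bucket name found among its directories.
        for part in PurePosixPath(path).parts[:-1]:
            if part in BUCKETS:
                counts[part] += 1
                break
    if not counts:
        return fallback  # no output changed: keep the original message
    return ", ".join(f"{name}: +{counts[name]}" for name in BUCKETS if counts[name])


print(summarize_added_files([
    "output/entities/bank_notes.md",
    "output/entities/advanced_state_of_society.md",
    "output/evaluations/bank_notes.md",
]))  # prints: entities: +2, evaluations: +1
```

Iterating over `BUCKETS` rather than the counter keeps the order of the summary stable from one commit to the next, which makes `git log` output easier to scan.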
9e8d73fa7d docs(roadmap): close out infospace tooling S3 and parent roadmap
All three stages of the infospace tooling roadmap are complete. The Wealth
of Nations / VSM example passes 6/6 viability thresholds on 988 entities,
and composition is demonstrated via the supply-chain-vsm example.

- Parent roadmap (roadmap/infospace-tooling/PLAN.md): header now shows the
  closed status with final validation metrics.
- S3 close-out plan (roadmap/infospace-s3-closeout/PLAN.md): records the
  final task dispositions. C.1–C.6 and C.8 done; C.7 (clean per-chapter
  git history) is deferred indefinitely — the task was cosmetic, its
  prerequisite branch no longer exists, and reconstructing 35 archival
  commits would not change any output files. Rationale documented inline.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 07:08:43 +02:00
d44a4cd3df feat(infospace,llm): agent ergonomics — entity lookup, model fallback, better errors
- `markitect infospace entity <name>`: single-entity lookup tolerating
  hyphens/underscores/case, with substring matching, ambiguity listing,
  and near-match hints. Prints slug, source path, domain, chapter, word
  count, VSM system, overall score, evaluator, and evaluation file path.
- `markitect infospace evaluate --model-fallback <model>`: if any
  entities fail with a rate-limit error, retry just those with a fresh
  adapter on the fallback model (different free-tier models have
  separate quota buckets).
- `markitect llm-check`: advisory when `OPENROUTER_API_KEY` is set but
  not used by the resolved provider; targeted hint when OpenRouter
  returns 401 (almost always a stale env key).
- `build_state`: raises `TypeError` with actionable message if passed a
  path instead of an `InfospaceConfig` — prior failure mode was a
  confusing `AttributeError` deep in the stack.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 01:07:25 +02:00
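
The tolerant lookup described for `markitect infospace entity <name>` in d44a4cd3df might look something like the sketch below. The function names are hypothetical, and the use of `difflib.get_close_matches` for near-match hints is an assumption; the commit does not say how hints are computed.

```python
import difflib


def normalize_slug(name):
    """Lower-case the name and map hyphens to underscores."""
    return name.strip().lower().replace("-", "_")


def find_entity(name, slugs):
    """Return (matches, hints).

    An exact match wins; otherwise substring matches are listed (more than
    one entry means the name is ambiguous); otherwise near-match hints are
    offered. Hypothetical helper, not the project's implementation.
    """
    wanted = normalize_slug(name)
    if wanted in slugs:
        return [wanted], []
    matches = sorted(s for s in slugs if wanted in s)
    if matches:
        return matches, []
    return [], difflib.get_close_matches(wanted, slugs, n=3, cutoff=0.6)


slugs = ["advanced_state_of_society", "bank_notes", "bank_systemic_risk_management"]
print(find_entity("Bank-Notes", slugs))  # exact match after normalization
print(find_entity("bank", slugs))        # ambiguous: two substring matches
print(find_entity("bnak_notes", slugs))  # typo: falls through to hints
```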
c0615c2d50 feat(infospace,llm): stabilize free-tier eval workflow
Five improvements that eliminate most of the agent-in-the-loop friction
observed while closing out the 988-entity WoN evaluation (C.1):

1. Gemini adapter now retries on 429 + 5xx with exponential backoff
   (same pattern already used by OpenRouter/OpenAI). Removes the need
   for shell-level retry wrappers when hitting free-tier rate limits.

2. evaluate CLI prints the underlying error ("ERROR — HTTP 503 …")
   instead of a bare "ERROR", so agents don't have to drop into Python
   to diagnose transient failures.

3. --entity/--chapter now respect existing evaluation files by default
   (previously only the full-collection pass did). New --force flag
   opts into re-evaluation. Stops silently burning free-tier quota on
   re-runs of the same slug.

4. --entity accepts hyphenated slugs (matching entity filenames) and
   normalizes them to the underscore form used on disk. On a miss the
   CLI suggests near matches instead of a bare "not found".

5. eval-summary --update-metrics is no longer destructive:
   read_metrics_file/write_metrics_file preserve structured values
   (type_distribution) and don't flatten ints to floats. Fixes a
   silent data loss observed on every run.

Bonus: the evaluator field in written evaluation frontmatter now
falls back from run_config.model_name to the adapter's resolved model
(or the model echoed back in the API response), so rows no longer
show `evaluator: null` when --model is omitted.

Tests: new tests/unit/llm/test_gemini.py covers retry behavior;
tests/unit/infospace/test_history.py gains a round-trip test that
pins the type_distribution / int-preservation invariants.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 00:51:00 +02:00
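
Item 1 of c0615c2d50 (retry on 429 and 5xx with exponential backoff) follows a common pattern that can be sketched as below. This is not the Gemini adapter's code: `TransientHTTPError`, `call_with_backoff`, the attempt limit, the base delay, and the jitter range are all hypothetical choices made for illustration.

```python
import random
import time


class TransientHTTPError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(status):
    """429 (rate limit) and 5xx (server error) are worth retrying."""
    return status == 429 or 500 <= status < 600


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `send()` and retry retryable failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return send()
        except TransientHTTPError as err:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_retryable(err.status):
                raise
            # The delay doubles each attempt; a little jitter spreads out
            # clients that were rate-limited at the same moment.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```

Passing `sleep` as a parameter keeps the retry loop testable without real waiting, which is one way a unit test of retry behavior could be written.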
965508ec06 chore(consistency): sync task status from DB [auto]
Updated by fix-consistency on 2026-04-22:
  - update .custodian-brief.md for markitect-project
2026-04-22 00:28:46 +02:00
f325f89dc9 feat(infospace): evaluate 3 missing WoN entities (C.1)
Fills the 988 entity / 985 evaluation gap in the Wealth of Nations
infospace. Entities advanced_state_of_society, bank_notes, and
bank_systemic_risk_management had no evaluation files; runs through
Gemini (2.5-flash / 2.5-flash-lite for the last one, which hit the
free-tier RPM limit) bring the eval count to 988.

per_entity_mean nudged from 3.955635 to 3.95668; viability still
6/6 PASS.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 23:52:04 +02:00
36a5136bdf docs(infospace): add advanced-usage, composition guide, and performance notes (C.4/C.5/C.6)
Closes out three docs tasks from roadmap/infospace-s3-closeout/PLAN.md:

- examples/infospace-with-history/docs/advanced-usage.md (C.4) — 5 worked
  patterns covering incremental eval, re-eval workflow (no --force flag
  exists; documents the rm-then-re-run pattern instead), interpreting the
  eval-summary distribution, triaging low scorers via an awk pipeline
  over overall_score (since `entities --sort-by score` does not exist),
  and acting on check --json output.
- docs/composition-guide.md (C.5) — walks through how supply-chain-vsm
  binds WoN as a discipline, then a step-by-step for creating a new
  infospace that binds an existing one. Includes live output from
  `markitect infospace disciplines`.
- examples/infospace-with-history/docs/performance-notes.md (C.6) — cites
  the 6h 28m wall time of the 985-entity S3.3 batch, ~2.5 ent/min rate,
  ~2000–3000 tokens/entity estimate, word_overlap vs embedding backend
  for redundancy checks, and a provider-by-scale recommendation table.

All commands in these docs were run against the live infospace at
commit time.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 07:02:46 +02:00
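
As a quick sanity check, the throughput quoted in 36a5136bdf can be reproduced from the two figures given there (985 entities, 6h 28m wall time). The helper `estimated_minutes` is a hypothetical addition that assumes throughput stays constant across batches.

```python
# Figures quoted in the commit message for the S3.3 batch.
entities = 985
wall_minutes = 6 * 60 + 28  # 6h 28m = 388 minutes

rate = entities / wall_minutes
print(f"{rate:.2f} entities/min")  # prints: 2.54 entities/min


def estimated_minutes(entity_count, per_minute=rate):
    """Project wall time for another batch at the same throughput."""
    return entity_count / per_minute
```

The result is consistent with the "~2.5 ent/min" figure in the commit message.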
b7e11461f4 chore: rename markitect_project to markitect-main across project
Finishes the in-progress rename so docs, configs, tests, and capability
manifests all reference the current repo name consistently. Fixes two
tests (test_roundtrip_consolidated.py, test_issue_140_roundtrip_simplified.py)
whose hardcoded cwd paths would have broken under the renamed directory.

Archival content under history/, reports/, and roadmap/eat-the-frog/, plus
derived artifacts (.venv_old/, node_modules/, asset_registry.json) are
intentionally left untouched.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 01:57:35 +02:00
3966814868 updated SCOPE file
2026-03-25 00:11:46 +01:00
f4610a46e3 docs: add SCOPE.md for rapid orientation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 23:11:42 +01:00
0d95e6dbcf docs(claude): expand CLAUDE.md with commands and architecture
Replaces the stub (State Hub integration only) with full dev commands,
module architecture overview, LLM config resolution chain, infospace
conventions, and active roadmap pointers. Removes CLAUDE.custodian.md
(superseded by the expanded CLAUDE.md).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-04 23:28:03 +01:00
59 changed files with 2542 additions and 106 deletions

20
.claude/rules/agents.md Normal file

@@ -0,0 +1,20 @@
## Kaizen Agents
Specialized agent personas available on demand via the state-hub MCP.
**Discover:** `list_kaizen_agents()` — returns all agents with name, description, category
**Load:** `get_kaizen_agent("tdd-workflow")` — returns full instructions; read and follow them
Common agents:
| Agent | Category | When to use |
|-------|----------|-------------|
| `tdd-workflow` | testing | Step-by-step TDD8 workflow for any feature |
| `code-refactoring` | quality | Code quality analysis and safe refactoring |
| `test-maintenance` | testing | Diagnose and fix failing tests |
| `requirements-engineering` | process | Prevent interface/mock mismatches upfront |
| `keepaTodofile` | process | Maintain TODO.md during work |
| `project-management` | process | Track status, determine next steps |
| `datamodel-optimization` | quality | Optimize dataclasses and data structures |
All 17 agents: call `list_kaizen_agents()` for the full list.

@@ -0,0 +1,8 @@
## Architecture
<!-- TODO: Describe the key design decisions and component structure.
Key modules, data flows, external integrations, state machines, etc. -->
## Quick Reference
`~/state-hub/mcp_server/TOOLS.md` — MCP tool reference

@@ -0,0 +1,50 @@
# Credential and access routing
**Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
for inference. Run this check **before** requesting secrets, API keys, SSH access,
login tokens, or database passwords — in any repo, not only `ops-warden`.
ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
other credential need belongs to another subsystem. **Do not** message
`ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.
### Lookup (do this first)
```bash
warden route find "<describe your need>" --json
warden route show <catalog-id> --json
```
Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).
| Agent runtime | How to orient |
| --- | --- |
| **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=markitect-main` is for coordination, not secret vending |
| **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
| **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |
### Quick routing table
| I need… | Owner | ops-warden executes? |
| --- | --- | --- |
| SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** (`warden sign`) |
| API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
| Login / OIDC / MFA | key-cape / Keycloak | No — route only |
| Authorization decision | flex-auth | No — route only |
| activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
| SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |
### Anti-patterns (do not do these)
- `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
- Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
- Pasting secrets into Git, State Hub, workplans, logs, or chat
### Other capabilities (reuse-surface)
Non-credential capabilities are usually discovered through **reuse-surface** federation
(`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
every repo's agent instructions because it is high-frequency, high-risk, and easy to
get wrong.
**Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`

@@ -0,0 +1,38 @@
## First Session Protocol
Triggered when `get_domain_summary("communication")` shows **no workstreams**.
The project is registered but work has not yet been structured.
**Step 1 — Read, don't write**
- `~/the-custodian/canon/projects/communication/project_charter_v0.1.md` — purpose, scope
- `~/the-custodian/canon/projects/communication/roadmap_v0.1.md` — planned phases
- Scan repo root: README, directory structure, existing code or docs
**Step 2 — Survey in-progress work**
Look for TODOs, open branches, half-finished files. Note done vs. started but incomplete.
**Step 3 — Propose workstreams to Bernd**
Propose 1–3 workstreams — each a coherent strand, weeks to months, anchored to a
roadmap phase. **Wait for approval before creating.**
**Step 4 — Create workplan file first, then DB record (ADR-001)**
```
workplans/MARKITECT-WP-NNNN-<slug>.md ← write this first
```
Then register in the hub:
```
create_workstream(topic_id="36c7421b-c537-4723-bf75-42a3ebc6a1dc", title="...", owner="...", description="...")
create_task(workstream_id="<id>", title="...", priority="high|medium|low")
```
**Step 5 — Record the setup**
```
add_progress_event(
summary="First session: structured communication into N workstreams, M tasks",
event_type="milestone",
topic_id="36c7421b-c537-4723-bf75-42a3ebc6a1dc",
detail={"workstreams": [...], "tasks_created": M}
)
```
<!-- Delete or archive this file once past first session -->

@@ -0,0 +1,8 @@
## Repo boundary
This repo owns **Markitect Main** only. It does not own:
<!-- TODO: List what belongs in adjacent repos, e.g.:
- SSH key management → railiance-infra/
- State hub code → state-hub/
-->

@@ -0,0 +1,5 @@
**Purpose:** Markitect Main - (fill in purpose)
**Domain:** communication
**Repo slug:** markitect-main
**Topic ID:** 36c7421b-c537-4723-bf75-42a3ebc6a1dc

@@ -0,0 +1,85 @@
## Session Protocol
Dev Hub (State Hub API): http://127.0.0.1:8000
MCP server name in `~/.claude.json`: `dev-hub`
**Step 1 — Orient**
Read the offline-safe brief first — it works without a live hub connection:
```bash
cat .custodian-brief.md
```
Then call the MCP tool for richer cross-domain context when MCP tools are exposed:
```
get_domain_summary("communication")
```
If MCP tools are unavailable in the current agent session, use the REST API:
```bash
curl -s "http://127.0.0.1:8000/state/summary" | python3 -m json.tool
```
If the hub is offline: `cd ~/state-hub && make api`
**Step 2 — Check inbox**
With MCP tools:
```
get_messages(to_agent="markitect-main", unread_only=True)
```
Mark read with `mark_message_read(message_id)`. Reply or act on coordination
requests before proceeding.
Without MCP tools:
```bash
curl -s "http://127.0.0.1:8000/messages/?to_agent=markitect-main&unread_only=true" \
| python3 -m json.tool
curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
-H "Content-Type: application/json" -d '{}'
```
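For agents that script the REST fallback in Python rather than shell, the two calls above can be prepared as in this sketch. The helper names are hypothetical; the endpoints and query parameters are taken from the `curl` commands, and nothing is sent until the caller passes the result to `urllib.request.urlopen`.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

HUB = "http://127.0.0.1:8000"  # Dev Hub address given at the top of this protocol


def inbox_url(agent, unread_only=True, base=HUB):
    """URL for listing an agent's messages."""
    query = urlencode({"to_agent": agent, "unread_only": str(unread_only).lower()})
    return f"{base}/messages/?{query}"


def mark_read_request(message_id, base=HUB):
    """PATCH request that marks one message as read (built, not sent)."""
    return Request(
        f"{base}/messages/{message_id}/read",
        data=json.dumps({}).encode(),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


print(inbox_url("markitect-main"))
```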
**Step 3 — Scan workplans**
```bash
ls workplans/
```
For each file with `status: ready`, `active`, or `blocked`, note pending
`wait`/`todo`/`progress` tasks.
**Step 4 — Present brief**
1. **Active workstreams** for `communication` — title, task counts, blocking decisions
2. **Pending tasks** from `workplans/` + any `[repo:markitect-main]` hub tasks
3. **Goal guidance** — if `goal_guidance` in summary:
- `needs_workplan`: surface as top action — *"Repo goal '{title}' has no workplan yet"*
- `alignment_warnings`: flag if active work is not aligned with current goal
4. **Suggested next action** — highest-priority open item
5. **SBOM status** — flag if `last_sbom_at` is unset for this repo
If no workstreams: follow First Session Protocol (`first-session.md`).
**During work:** `record_decision()` · `add_progress_event()` · `resolve_decision()`
> State Hub is a *read model*. Bootstrap tools (`create_workstream`, `create_task`)
> are First Session Protocol only. Work structure belongs in repo files (ADR-001).
**Session close:**
With MCP tools:
```
add_progress_event(summary="...", topic_id="36c7421b-c537-4723-bf75-42a3ebc6a1dc", workstream_id="<uuid>")
```
Without MCP tools:
```bash
curl -s -X POST http://127.0.0.1:8000/progress/ \
-H "Content-Type: application/json" \
-d '{"topic_id":"36c7421b-c537-4723-bf75-42a3ebc6a1dc","workstream_id":"<uuid>","event_type":"note","summary":"what changed","author":"codex"}'
```
If workplan files were modified, ensure the local copy is up to date first:
```bash
git -C <repo_path> pull --ff-only
cd ~/state-hub && make fix-consistency REPO=markitect-main
```
For repos where implementation runs on a remote machine (e.g. CoulombCore),
use the combined target which pulls before fixing:
```bash
cd ~/state-hub && make fix-consistency-remote REPO=markitect-main
```
**C-15** (DB task ahead of file) is normal in multi-machine workflows — writeback
will sync the file to match DB. **C-16** (repo behind remote) blocks all writes
until you pull — intentional to prevent clobbering remote progress.

@@ -0,0 +1,16 @@
## Stack
- **Language:** Python 3.12+ (monorepo) + JavaScript UI (testdrive-jsui)
- **Key deps:** uv/pip, pytest, npm; see `pyproject.toml`, `package.json`, `Makefile`
## Dev Commands
```bash
make setup
make test
make test-js
make test-all
make lint
make build
make help
```

@@ -0,0 +1,40 @@
## Workplan Convention (ADR-001)
File location: `workplans/MARKITECT-WP-NNNN-<slug>.md`
ID prefix: `MARKITECT-WP-`
Work items originate as files in this repo **before** being registered in the hub.
Canonical workplan/workstream frontmatter statuses are:
`proposed`, `ready`, `active`, `blocked`, `backlog`, `finished`, `archived`.
Use `proposed` for a newly drafted plan, `ready` after review against current
repo state, and `finished` when implementation is complete. `stalled` and
`needs_review` are derived health labels, not stored statuses.
Closed workplans may be moved to `workplans/archived/` with a completion-date
prefix: `YYMMDD-MARKITECT-WP-NNNN-<slug>.md`. The frontmatter id remains
unchanged; the prefix is only for quick visual reference.
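A minimal sketch of the archive naming rule, assuming the completion date is known as a `datetime.date`; the helper name and the example slug `bootstrap` are hypothetical:

```python
from datetime import date


def archived_name(workplan_filename, completed):
    """Prefix a workplan filename with its completion date as YYMMDD."""
    return f"{completed:%y%m%d}-{workplan_filename}"


print(archived_name("MARKITECT-WP-0001-bootstrap.md", date(2026, 6, 22)))
# prints: 260622-MARKITECT-WP-0001-bootstrap.md
```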
Small opportunistic tasks discovered during another session use **Ad Hoc Tasks**:
`workplans/ADHOC-YYYY-MM-DD.md`, workstream slug `adhoc-YYYY-MM-DD`, and task ids
`ADHOC-YYYY-MM-DD-T01`, `T02`, etc. Use adhocs only for low-risk work completed
directly. Promote anything requiring analysis, design, approval, dependencies, or
multiple planned phases into a normal workplan.
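The naming rules above can be expressed as two small helpers. This is an illustrative sketch only; `archived_filename` and `adhoc_names` are hypothetical names, not part of any state-hub tooling.

```python
from datetime import date


def archived_filename(workplan_filename: str, completed: date) -> str:
    """Prefix a workplan filename with its completion date as YYMMDD."""
    return f"{completed.strftime('%y%m%d')}-{workplan_filename}"


def adhoc_names(day: date, task_count: int) -> dict:
    """Build the Ad Hoc file path, workstream slug, and task ids for one day."""
    stamp = day.isoformat()  # YYYY-MM-DD
    return {
        "file": f"workplans/ADHOC-{stamp}.md",
        "workstream_slug": f"adhoc-{stamp}",
        "task_ids": [f"ADHOC-{stamp}-T{n:02d}" for n in range(1, task_count + 1)],
    }
```

Only the filename gains the date prefix; the frontmatter id inside the file is left untouched.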
Ecosystem todos from other agents arrive as `[repo:markitect-main]` hub tasks —
visible at session start. Pick one up by creating the workplan file, then registering
the workstream.
Task blocks use this shape:
```task
id: MARKITECT-WP-NNNN-T01
status: wait | todo | progress | done | cancel
priority: high | medium | low
state_hub_task_id: "<uuid>" # written by fix-consistency — do not edit
```
Status progression is `todo` → `progress` → `done`; use `wait` for waiting or
blocked work and `cancel` for stopped work.
<!-- Ralph Loop rules and HEUREKA sequence: ~/.claude/CLAUDE.md — do not duplicate here -->
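To check task blocks before handing a workplan to `fix-consistency`, a small parser along these lines may help. It is a sketch under assumptions: the function names are hypothetical, the real sync tool may parse blocks differently, and everything after a `#` on a line is treated as a comment.

```python
import re

VALID_STATUSES = {"wait", "todo", "progress", "done", "cancel"}
VALID_PRIORITIES = {"high", "medium", "low"}

FENCE = "`" * 3  # built at runtime so this example does not contain a literal fence
TASK_BLOCK = re.compile(re.escape(FENCE) + r"task\n(.*?)" + re.escape(FENCE), re.DOTALL)


def parse_task_blocks(markdown: str) -> list[dict]:
    """Extract key/value fields from every task block in a workplan file."""
    tasks = []
    for body in TASK_BLOCK.findall(markdown):
        fields = {}
        for line in body.splitlines():
            line = line.split("#", 1)[0].strip()  # drop trailing comments
            if ":" in line:
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip().strip('"')
        tasks.append(fields)
    return tasks


def validate_task(task: dict) -> list[str]:
    """Return a list of problems; an empty list means the block looks well-formed."""
    problems = []
    if not task.get("id"):
        problems.append("missing id")
    if task.get("status") not in VALID_STATUSES:
        problems.append(f"invalid status: {task.get('status')!r}")
    if task.get("priority") not in VALID_PRIORITIES:
        problems.append(f"invalid priority: {task.get('priority')!r}")
    return problems
```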


@@ -10,7 +10,7 @@ principles with strict separation of concerns.
## Directory Structure & Clean Architecture ## Directory Structure & Clean Architecture
``` ```
markitect_project/ markitect-main/
├── domain/ # Business logic (innermost layer) ├── domain/ # Business logic (innermost layer)
├── application/ # Use cases and workflows ├── application/ # Use cases and workflows
├── infrastructure/ # External interfaces (database, file system) ├── infrastructure/ # External interfaces (database, file system)

18
.custodian-brief.md Normal file

@@ -0,0 +1,18 @@
<!-- custodian-brief: generated by fix-consistency — do not edit manually -->
# Custodian Brief — markitect-main
**Domain:** communication
**Last synced:** 2026-06-22 21:32 UTC
**State Hub:** http://127.0.0.1:8000 *(adjust if running on a remote machine)*
## Active Workstreams
*(none — repo may need first-session setup)*
---
## MCP Orientation (when available)
If the state-hub MCP server is reachable, call:
`get_domain_summary("communication")`
This provides richer cross-domain context.
If the MCP call fails, use this file as your orientation source.

2
.gitignore vendored

@@ -91,6 +91,8 @@ debug_*.py
# Claude Code local settings (user-specific permissions) # Claude Code local settings (user-specific permissions)
.claude/settings.local.json .claude/settings.local.json
# Claude Code runtime session locks (per-session, not content)
.claude/*.lock
.aider* .aider*

2
.gitmodules vendored

@@ -1,6 +1,6 @@
[submodule "wiki"] [submodule "wiki"]
path = wiki path = wiki
url = http://92.205.130.254:32166/coulomb/markitect_project.wiki.git url = http://92.205.130.254:32166/coulomb/markitect-main.wiki.git
branch = main branch = main
[submodule "capabilities/kaizen-agentic"] [submodule "capabilities/kaizen-agentic"]
path = capabilities/kaizen-agentic path = capabilities/kaizen-agentic

25
.repo-classification.yaml Normal file

@@ -0,0 +1,25 @@
repo_classification:
standard: Repo Classification Standard
version: '1.0'
classified_at: '2026-06-22'
classified_by: human
category: product
domain: communication
secondary_domains:
- infotech
- agents
capability_tags:
- knowledge
- documentation
- product-development
- platform
business_stake:
- product
- technology
- execution
business_mechanics:
- intention
- coordination
- operation
- adaptation
notes: Markitect successor to archived markitect-project; human confirmed.

219
AGENTS.md Normal file

@@ -0,0 +1,219 @@
# Markitect Main — Agent Instructions
## Repo Identity
**Purpose:** Markitect Main - (fill in purpose)
**Domain:** communication
**Repo slug:** markitect-main
**Topic ID:** `36c7421b-c537-4723-bf75-42a3ebc6a1dc`
**Workplan prefix:** `MARKITECT-WP-`
---
## State Hub Integration
The Custodian State Hub tracks work across all domains. Interact via HTTP REST —
there is no MCP server for Codex agents.
| Context | URL |
|---------|-----|
| Local workstation | `http://127.0.0.1:8000` |
| Remote via tunnel | `http://127.0.0.1:18000` |
### Orient at session start
```bash
# Offline brief — works without hub connection
cat .custodian-brief.md
# Active workstreams for this domain
curl -s "http://127.0.0.1:8000/workstreams/?topic_id=36c7421b-c537-4723-bf75-42a3ebc6a1dc&status=active" \
| python3 -m json.tool
# Check inbox
curl -s "http://127.0.0.1:8000/messages/?to_agent=markitect-main&unread_only=true" \
| python3 -m json.tool
```
Mark a message read:
```bash
curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
-H "Content-Type: application/json" -d '{}'
```
### Log progress (required at session close)
```bash
curl -s -X POST http://127.0.0.1:8000/progress/ \
-H "Content-Type: application/json" \
-d '{
"summary": "what was done",
"event_type": "note",
"author": "codex",
"workstream_id": "<uuid>",
"task_id": "<uuid>"
}'
```
Omit `workstream_id` / `task_id` when not applicable.
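For agents that prefer Python over `curl`, the same call can be sketched with the standard library. This is an assumed equivalent of the request above, not an official client; `build_progress_request` is a hypothetical helper. It leaves the optional ids out of the payload entirely when they are not given.

```python
import json
import urllib.request

HUB_URL = "http://127.0.0.1:8000"  # local workstation URL; adjust for a tunnel


def build_progress_request(summary, event_type="note", author="codex",
                           workstream_id=None, task_id=None):
    """Build (but do not send) the POST /progress/ request."""
    payload = {"summary": summary, "event_type": event_type, "author": author}
    # Omit optional ids rather than sending null values.
    if workstream_id:
        payload["workstream_id"] = workstream_id
    if task_id:
        payload["task_id"] = task_id
    return urllib.request.Request(
        f"{HUB_URL}/progress/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it.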
### Update task status
```bash
curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
-H "Content-Type: application/json" \
-d '{"status": "progress"}'
# values: wait | todo | progress | done | cancel
```
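A client-side guard can reject an invalid status before the request reaches the hub. The sketch below assumes the five values listed in the comment above are the complete set; `build_status_patch` is a hypothetical helper, and the hub may perform its own validation as well.

```python
import json
import urllib.request

TASK_STATUSES = ("wait", "todo", "progress", "done", "cancel")


def build_status_patch(task_id: str, status: str,
                       hub_url: str = "http://127.0.0.1:8000") -> urllib.request.Request:
    """Build (but do not send) the PATCH /tasks/<task_id> request."""
    if status not in TASK_STATUSES:
        raise ValueError(f"unknown task status: {status!r}")
    return urllib.request.Request(
        f"{hub_url}/tasks/{task_id}",
        data=json.dumps({"status": status}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )
```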
### Flag a task for human review
```bash
curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
-H "Content-Type: application/json" \
-d '{"needs_human": true, "intervention_note": "reason"}'
```
---
## Session Protocol
**Start:**
1. `cat .custodian-brief.md` — domain goal and open workstreams (offline-safe)
2. Check inbox: `GET /messages/?to_agent=markitect-main&unread_only=true`; mark read
3. Scan workplans: `ls workplans/` — note `status: ready`, `active`, or `blocked` files and open tasks
4. Check human-needed tasks: `GET /tasks/?needs_human=true`
**During work:**
- Update task statuses in workplan files as tasks progress
- Record significant decisions via `POST /decisions/`
**Close:**
1. Update workplan file task statuses to reflect progress
2. Log: `POST /progress/` with a summary of what changed
3. Note for the custodian operator: after workplan file changes, run from
`~/state-hub`:
```bash
make fix-consistency REPO=markitect-main
```
This syncs task status from files into the hub DB.
---
## Credential and access routing
**Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
for inference. Run this check **before** requesting secrets, API keys, SSH access,
login tokens, or database passwords — in any repo, not only `ops-warden`.
ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
other credential need belongs to another subsystem. **Do not** message
`ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.
### Lookup (do this first)
```bash
warden route find "<describe your need>" --json
warden route show <catalog-id> --json
```
Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).
| Agent runtime | How to orient |
| --- | --- |
| **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=markitect-main` is for coordination, not secret vending |
| **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
| **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |
### Quick routing table
| I need… | Owner | ops-warden executes? |
| --- | --- | --- |
| SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
| API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
| Login / OIDC / MFA | key-cape / Keycloak | No — route only |
| Authorization decision | flex-auth | No — route only |
| activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
| SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |
### Anti-patterns (do not do these)
- `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
- Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
- Pasting secrets into Git, State Hub, workplans, logs, or chat
### Other capabilities (reuse-surface)
Non-credential capabilities are usually discovered through **reuse-surface** federation
(`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
every repo's agent instructions because it is high-frequency, high-risk, and easy to
get wrong.
**Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`
<!-- REPO-AGENTS-EXTENSIONS -->
<!-- Append repo-specific agent instructions below this marker.
The state-hub template sync preserves content after this line. -->
---
## Workplan Convention (ADR-001)
Work items originate as files in this repo — not in the hub. The hub is a
read/cache/index layer that rebuilds from files.
**File location:** `workplans/MARKITECT-WP-NNNN-<slug>.md`
**Archived location:** finished workplans may move to
`workplans/archived/YYMMDD-MARKITECT-WP-NNNN-<slug>.md`. The `YYMMDD` prefix is
the completion/archive date; the frontmatter `id` does not change.
**Ad Hoc Tasks:** small opportunistic fixes discovered during a session use
`workplans/ADHOC-YYYY-MM-DD.md` with task ids `ADHOC-YYYY-MM-DD-T01`, etc. Use
this only for low-risk work completed directly; create a normal workplan for
anything needing analysis, design, approval, dependencies, or multiple phases.
**Frontmatter:**
```yaml
---
id: MARKITECT-WP-NNNN
type: workplan
title: "..."
domain: communication
repo: markitect-main
status: proposed | ready | active | blocked | backlog | finished | archived
owner: codex
topic_slug: ...
created: "YYYY-MM-DD"
updated: "YYYY-MM-DD"
state_hub_workstream_id: "<uuid>" # written by fix-consistency — do not edit
---
```
Use `proposed` for a new draft, `ready` after review against current repo
state, and `finished` after implementation. `stalled` and `needs_review` are
derived health labels, not frontmatter statuses.
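The distinction between stored statuses and derived labels can be checked mechanically. A minimal sketch, assuming the two lists above are exhaustive; `check_frontmatter_status` is a hypothetical name:

```python
STORED_STATUSES = {"proposed", "ready", "active", "blocked",
                   "backlog", "finished", "archived"}
DERIVED_LABELS = {"stalled", "needs_review"}


def check_frontmatter_status(status: str) -> str:
    """Classify a frontmatter status value as 'ok', 'derived', or 'unknown'."""
    if status in STORED_STATUSES:
        return "ok"
    if status in DERIVED_LABELS:
        # Health labels are computed elsewhere and must not be written to the file.
        return "derived"
    return "unknown"
```

A result of `derived` or `unknown` means the value should not appear in the `status:` field.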
**Task block format** (one per `##` section):
```
## Task Title
` ` `task
id: MARKITECT-WP-NNNN-T01
status: wait | todo | progress | done | cancel
priority: high | medium | low
state_hub_task_id: "<uuid>" # written by fix-consistency — do not edit
` ` `
Task description text.
```
Status progression: `todo` → `progress` → `done`; use `wait` for waiting/blocked work and `cancel` for stopped work.
To create a new workplan:
1. Write the file following the format above
2. Notify the custodian operator to run `make fix-consistency REPO=markitect-main`
(or send a message to the hub agent via `POST /messages/`)


@@ -1,34 +1,12 @@
-# Markitect — Claude Code Instructions
-## Custodian State Hub Integration
-This project is tracked as the **markitect** domain in the Custodian State Hub.
-Hub topic ID: `5571d954-0d30-4950-980d-7bcaaad8e3e2`
-### Session Protocol
-**At the start of every session:**
-Call `get_state_summary()` via the `state-hub` MCP tool to orient yourself.
-If the hub is not reachable, start it: `cd ~/the-custodian/state-hub && make api`
-**At the end of every session:**
-Call `add_progress_event()` with at minimum:
-- `topic_id`: `5571d954-0d30-4950-980d-7bcaaad8e3e2`
-- `summary`: what was accomplished or left in-flight
-- `event_type`: `note` for routine updates, `milestone` for completions, `blocker` for blockers
-### Available State-Hub MCP Tools
-- `get_state_summary()` — full cross-domain overview
-- `add_progress_event(summary, topic_id, event_type, detail)` — log progress
-- `create_workstream(topic_id, title, ...)` — create a new workstream
-- `create_task(workstream_id, title, ...)` — create a task under a workstream
-- `update_task_status(task_id, status)` — move task through lifecycle
-- `record_decision(title, decision_type, topic_id, ...)` — log decisions
-- `resolve_decision(decision_id, rationale, decided_by)` — close a decision
-### If the MCP Server is Not Available
-The state-hub MCP server (`state-hub`) is registered at user scope in `~/.claude.json`.
-It requires the API to be running at `http://127.0.0.1:8000`.
-Fallback: use `curl` directly against the REST API — see `/docs` at the hub URL.
+# Markitect Main — Claude Code Instructions
+@SCOPE.md
+@.claude/rules/repo-identity.md
+@.claude/rules/session-protocol.md
+@.claude/rules/first-session.md
+@.claude/rules/workplan-convention.md
+@.claude/rules/stack-and-commands.md
+@.claude/rules/architecture.md
+@.claude/rules/repo-boundary.md
+@.claude/rules/credential-routing.md
+@.claude/rules/agents.md


@@ -457,7 +457,7 @@ Sister projects can reuse these capabilities directly:
Install capabilities via local file references: Install capabilities via local file references:
```toml ```toml
[project.dependencies] [project.dependencies]
release-management = {path = "../markitect_project/capabilities/release-management"} release-management = {path = "../markitect-main/capabilities/release-management"}
``` ```
### Shared Infrastructure ### Shared Infrastructure

129
SCOPE.md Normal file

@@ -0,0 +1,129 @@
# SCOPE
> This file helps you quickly understand what this repository is about,
> when it is relevant, and when it is not.
> It is intentionally lightweight and may be incomplete.
---
## One-liner
Intelligent markdown engine and information management platform — treats documents as structured, queryable information spaces with schema validation, transclusion, LLM-driven evaluation, and infospace lifecycle management.
---
## Core Idea
MarkiTect turns fragmented knowledge (scattered docs, chats, notes) into structured, versioned, reusable artifacts. The core abstraction is an **infospace**: a curated collection of typed entities (concepts, mechanisms, observations) governed by a YAML config, validated against schemas, and evaluated for quality across five dimensions. The platform automates generation, validation, and transformation at scale, delegating domain-level judgment to LLMs while Python handles structure and evaluation.
---
## In Scope
- Parse, validate, and analyze markdown documents against schemas
- Generate schemas from example documents; enforce naming convention `{domain}-schema-v{major}.{minor}.md`
- Infospace lifecycle: create, populate, evaluate (per-entity + collection quality scores), compose, export
- Transclusion: embed content from one document into another, maintaining single source of truth
- LLM-driven prompt execution with dependency resolution and quality gates
- Relationship graph export (Mermaid, DOT) and analysis (networkx, FCA)
- Batch document processing; CLI (`markitect <command>`) and programmatic API
- Rendering: markdown → interactive HTML via plugin system (testdrive-jsui)
- Asset management (image embedding, resource handling)
---
## Out of Scope
- Visual/WYSIWYG editing (markdown-first, text-based workflows only)
- Real-time collaborative editing (git-based versioning instead)
- Financial transactions or external payment integration
- Making domain-level judgments in Python code (delegated to LLM via prompt templates)
- Storing secrets or credentials in plaintext
- Full GraphQL API (structure exists but not fully implemented)
- Vendor-specific integrations or lock-in
---
## Relevant When
- Managing large document sets (hundreds to thousands) needing consistent structure and validation
- Building or maintaining institutional knowledge bases, technical documentation, or canon releases
- Automating document generation from schemas or templates
- Tracking relationships and dependencies between knowledge artifacts
- Needing programmatic access to document structure (beyond file reading)
- Applying quality evaluation to a structured concept collection
---
## Not Relevant When
- Working with a handful of simple, unrelated documents
- Visual editor required
- Exclusively non-markdown source formats (PDF/Word need conversion first)
- No consistency, validation, or automation needed
---
## Current State
- Status: active (v0.13.0-dev, ~90 commits ahead of release)
- Implementation: substantial — core modules mature (CLI, parsing, schema management, prompt execution, infospace); infospace S3 close-out in progress; LLM adapter extracted to standalone `llm-connect` package
- Stability: stable core; plugin system and infospace tooling evolving; 200+ CHANGELOG entries since v0.6.0
- Usage: active personal development; examples with 988 entities and full evaluation pipeline
---
## How It Fits
- Upstream dependencies: `llm-connect` (LLM adapter library, extracted), `testdrive-jsui` (rendering plugin submodule), `markitect-utils` (utility library)
- Downstream consumers: Custodian — MarkiTect is the knowledge artifact platform in the canonical dependency order (Railiance → **Markitect** → Coulomb.social → Personhood/Foerster → Custodian)
- Often used with: the-custodian (state hub tracks markitect domain workstreams), kaizen-agentic (project-management agent for session workflow)
---
## Terminology
- Preferred terms: infospace, topic, discipline, entity, evaluation, viability, transclusion, schema, quality gates
- Also known as: "markitect", "the markdown engine"
- Potentially confusing terms: "topic" = the subject matter an infospace explains (not a chat thread); "discipline" = a reusable framework of concepts (itself a viable infospace); "infospace" ≠ filesystem directory (it's a curated conceptual collection with explicit quality thresholds)
---
## Related / Overlapping
- `llm-connect` — standalone LLM adapter extracted from MarkiTect (dependency)
- `the-custodian` — tracks markitect workstreams; custodian canon includes a markitect domain charter
- `marki-docx` — separate repo (on tegwick machine); relationship: docx export capability for MarkiTect artifacts
---
## Provided Capabilities
```capability
type: documentation
title: Structured document validation and schema management
description: Parse, validate, and enforce schemas on markdown documents — generate schemas from examples, validate entity collections, report naming convention compliance.
keywords: [markdown, schema, validation, document, structure, linting]
```
```capability
type: documentation
title: Infospace lifecycle management
description: Create, populate, evaluate (quality scores), compose, and export curated knowledge collections (infospaces) with transclusion and relationship graph analysis.
keywords: [infospace, knowledge, curation, evaluation, transclusion, quality, graph]
```
```capability
type: data
title: LLM-driven knowledge artifact generation
description: Execute prompts with dependency resolution and quality gates to generate typed entities — concepts, mechanisms, observations — at scale from schemas and templates.
keywords: [llm, generation, prompt, entity, artifact, knowledge, automation]
```
---
## Getting Oriented
- Start with: `CLAUDE.md` (dev commands, LLM config, infospace lifecycle), `INTRODUCTION.md` (use cases, philosophy)
- Key files / directories: `markitect/cli.py` (CLI entry point), `markitect/infospace/` (primary active area), `markitect/prompts/` (LLM execution), `roadmap/` (6 active planning tracks), `examples/infospace-with-history/` (988-entity reference implementation)
- Entry points: `markitect --help`; `markitect infospace --help`; `pytest tests/unit/` (inner TDD loop)


@@ -15,7 +15,7 @@ You are responsible for:
### Directory Structure ### Directory Structure
``` ```
markitect_project/ markitect-main/
├── Makefile # Main project Makefile ├── Makefile # Main project Makefile
├── scripts/ ├── scripts/
│ └── capability_discovery.mk # Auto-discovery and delegation system │ └── capability_discovery.mk # Auto-discovery and delegation system


@@ -7,7 +7,7 @@ detachment:
capability_name: issue-facade capability_name: issue-facade
capability_family: issue-tracking capability_family: issue-tracking
integration_pattern: capabilities-directory integration_pattern: capabilities-directory
original_location: /home/worsch/markitect_project/capabilities/issue-facade original_location: /home/worsch/markitect-main/capabilities/issue-facade
capability_metadata: capability_metadata:
spec_file: CAPABILITY-issue-tracking.yaml spec_file: CAPABILITY-issue-tracking.yaml
@@ -17,23 +17,23 @@ capability_metadata:
integration_details: integration_details:
parent_project: capabilities parent_project: capabilities
parent_path: /home/worsch/markitect_project/capabilities parent_path: /home/worsch/markitect-main/capabilities
re_integration_guide: | re_integration_guide: |
To re-integrate this capability using the new architecture: To re-integrate this capability using the new architecture:
# Option 1: Git submodule (recommended) # Option 1: Git submodule (recommended)
cd /home/worsch/markitect_project/capabilities cd /home/worsch/markitect-main/capabilities
git submodule add <repo-url> _issue-facade git submodule add <repo-url> _issue-facade
pip install -e _issue-facade/ pip install -e _issue-facade/
# Option 2: Clone directly # Option 2: Clone directly
cd /home/worsch/markitect_project/capabilities cd /home/worsch/markitect-main/capabilities
git clone <repo-url> _issue-facade git clone <repo-url> _issue-facade
pip install -e _issue-facade/ pip install -e _issue-facade/
# Option 3: Copy into project # Option 3: Copy into project
cd /home/worsch/markitect_project/capabilities cd /home/worsch/markitect-main/capabilities
cp -r /path/to/issue-facade _issue-facade cp -r /path/to/issue-facade _issue-facade
pip install -e _issue-facade/ pip install -e _issue-facade/


@@ -8,7 +8,7 @@ This test module validates outline mode schema generation improvements including
- Content instruction integration - Content instruction integration
- End-to-end workflow from example document to generated drafts - End-to-end workflow from example document to generated drafts
Created for Issue #46: https://gitea.coulomb.social/coulomb/markitect_project/issues/46 Created for Issue #46: https://gitea.coulomb.social/coulomb/markitect-main/issues/46
""" """
import pytest import pytest


@@ -209,7 +209,7 @@ tests/
## 🎯 Detailed File Structure After Migration ## 🎯 Detailed File Structure After Migration
``` ```
markitect_project/ markitect-main/
├── capabilities/ ├── capabilities/
│ └── release-management/ │ └── release-management/
│ ├── README.md ✅ CREATED │ ├── README.md ✅ CREATED


@@ -162,7 +162,7 @@ clean_before_build = true
[tool.release-management.registries.gitea] [tool.release-management.registries.gitea]
url = "http://92.205.130.254:32166" url = "http://92.205.130.254:32166"
owner = "coulomb" owner = "coulomb"
repo = "markitect_project" repo = "markitect-main"
auth_token_env = "GITEA_API_TOKEN" auth_token_env = "GITEA_API_TOKEN"
[tool.release-management.registries.pypi] [tool.release-management.registries.pypi]


@@ -141,7 +141,7 @@ make release-publish VERSION=0.8.0
## Registry Information ## Registry Information
- **Gitea URL**: http://92.205.130.254:32166 - **Gitea URL**: http://92.205.130.254:32166
- **Repository**: coulomb/markitect_project - **Repository**: coulomb/markitect-main
- **PyPI Registry URL**: http://92.205.130.254:32166/api/packages/coulomb/pypi - **PyPI Registry URL**: http://92.205.130.254:32166/api/packages/coulomb/pypi
- **Package List URL**: http://92.205.130.254:32166/api/v1/packages/coulomb - **Package List URL**: http://92.205.130.254:32166/api/v1/packages/coulomb


@@ -8,7 +8,7 @@
```bash ```bash
# ❌ WRONG - Don't edit capability files from main repo # ❌ WRONG - Don't edit capability files from main repo
cd /home/worsch/markitect_project/capabilities/testdrive-jsui cd /home/worsch/markitect-main/capabilities/testdrive-jsui
vim src/testdrive_jsui/core.py # DON'T DO THIS! vim src/testdrive_jsui/core.py # DON'T DO THIS!
# ✅ CORRECT - Use separate Claude instance/session # ✅ CORRECT - Use separate Claude instance/session
@@ -29,7 +29,7 @@ cd /path/to/work/testdrive-jsui
| Session | Purpose | Location | | Session | Purpose | Location |
|---------|---------|----------| |---------|---------|----------|
| **Main Repo** | Integration, configuration | `/home/worsch/markitect_project` | | **Main Repo** | Integration, configuration | `/home/worsch/markitect-main` |
| **Capability** | Feature development, bugs | Separate clone or `capabilities/capability-name` | | **Capability** | Feature development, bugs | Separate clone or `capabilities/capability-name` |
**Why?** Prevents accidental cross-contamination and respects repository boundaries. **Why?** Prevents accidental cross-contamination and respects repository boundaries.
@@ -40,7 +40,7 @@ cd /path/to/work/testdrive-jsui
```bash ```bash
# After pushing changes to capability repo # After pushing changes to capability repo
cd /home/worsch/markitect_project cd /home/worsch/markitect-main
git submodule update --remote capabilities/testdrive-jsui git submodule update --remote capabilities/testdrive-jsui
git add capabilities/testdrive-jsui git add capabilities/testdrive-jsui
git commit -m "chore: update testdrive-jsui to latest" git commit -m "chore: update testdrive-jsui to latest"
@@ -50,7 +50,7 @@ git push
### Add New Capability ### Add New Capability
```bash ```bash
cd /home/worsch/markitect_project cd /home/worsch/markitect-main
# Add as submodule # Add as submodule
git submodule add http://gitea/coulomb/new-capability.git capabilities/new-capability git submodule add http://gitea/coulomb/new-capability.git capabilities/new-capability
@@ -67,7 +67,7 @@ git commit -m "feat: add new-capability submodule"
```bash ```bash
# Option 1: In submodule directory (careful!) # Option 1: In submodule directory (careful!)
cd /home/worsch/markitect_project/capabilities/testdrive-jsui cd /home/worsch/markitect-main/capabilities/testdrive-jsui
git checkout -b feature-branch git checkout -b feature-branch
# make changes # make changes
git commit -m "feat: new feature" git commit -m "feat: new feature"
@@ -86,7 +86,7 @@ git push origin feature-branch
### Check Capability Status ### Check Capability Status
```bash ```bash
cd /home/worsch/markitect_project cd /home/worsch/markitect-main
# List all capabilities # List all capabilities
make capabilities-list make capabilities-list


@@ -9,7 +9,7 @@ MarkiTect is a markdown processing toolkit with transclusion, schema validation,
## Current Directory Structure ## Current Directory Structure
``` ```
markitect_project/ markitect-main/
├── markitect/ # Main package ├── markitect/ # Main package
│ ├── [34 root-level .py files] # Core functionality (see below) │ ├── [34 root-level .py files] # Core functionality (see below)
│ ├── assets/ # Asset discovery, management, caching (21 files) │ ├── assets/ # Asset discovery, management, caching (21 files)


@@ -8,7 +8,7 @@ MarkiTect uses a **capabilities-based architecture** where functionality is orga
### 1. **Separation of Concerns** ### 1. **Separation of Concerns**
**Critical Rule:** The main repository (`markitect_project`) **MUST NOT** directly modify capability code. **Critical Rule:** The main repository (`markitect-main`) **MUST NOT** directly modify capability code.
-**DO**: Use capabilities as dependencies -**DO**: Use capabilities as dependencies
-**DO**: Configure capabilities through documented interfaces -**DO**: Configure capabilities through documented interfaces
@@ -28,7 +28,7 @@ MarkiTect uses a **capabilities-based architecture** where functionality is orga
Capabilities are integrated as **git submodules**, not regular directories: Capabilities are integrated as **git submodules**, not regular directories:
``` ```
markitect_project/ markitect-main/
├── .gitmodules # Submodule configuration ├── .gitmodules # Submodule configuration
├── capabilities/ ├── capabilities/
│ ├── testdrive-jsui/ # Git submodule → separate repo │ ├── testdrive-jsui/ # Git submodule → separate repo
@@ -80,8 +80,8 @@ engine.render_document(content, mode='edit', config=config)
#### Main Repository Session #### Main Repository Session
```bash ```bash
# In markitect_project/ # In markitect-main/
cd /home/worsch/markitect_project cd /home/worsch/markitect-main
# Main repo tasks: # Main repo tasks:
# - Integrate capabilities # - Integrate capabilities
@@ -93,7 +93,7 @@ cd /home/worsch/markitect_project
#### Capability Session #### Capability Session
```bash ```bash
# In capability repository # In capability repository
cd /home/worsch/markitect_project/capabilities/testdrive-jsui cd /home/worsch/markitect-main/capabilities/testdrive-jsui
# OR clone separately # OR clone separately
git clone http://gitea/coulomb/testdrive-jsui.git git clone http://gitea/coulomb/testdrive-jsui.git
@@ -122,7 +122,7 @@ cd testdrive-jsui
2. **Update main project** (different Claude instance) 2. **Update main project** (different Claude instance)
```bash ```bash
cd /home/worsch/markitect_project cd /home/worsch/markitect-main
git submodule update --remote capabilities/testdrive-jsui git submodule update --remote capabilities/testdrive-jsui
git commit -m "chore: update testdrive-jsui submodule" git commit -m "chore: update testdrive-jsui submodule"
``` ```
@@ -139,7 +139,7 @@ When a capability releases a new version:
```bash ```bash
# In main repo # In main repo
cd /home/worsch/markitect_project cd /home/worsch/markitect-main
# Update specific capability # Update specific capability
cd capabilities/testdrive-jsui cd capabilities/testdrive-jsui
@@ -160,7 +160,7 @@ git commit -am "chore: update all capabilities"
# http://gitea/coulomb/new-capability # http://gitea/coulomb/new-capability
# 2. Add as submodule to main repo # 2. Add as submodule to main repo
cd /home/worsch/markitect_project cd /home/worsch/markitect-main
git submodule add http://gitea/coulomb/new-capability.git capabilities/new-capability git submodule add http://gitea/coulomb/new-capability.git capabilities/new-capability
# 3. Add dependency to pyproject.toml # 3. Add dependency to pyproject.toml
@@ -324,7 +324,7 @@ def test_testdrive_jsui_integration():
1. **Create separate git repo** 1. **Create separate git repo**
```bash ```bash
cd /tmp cd /tmp
cp -r markitect_project/capabilities/capability-name capability-name cp -r markitect-main/capabilities/capability-name capability-name
cd capability-name cd capability-name
git init git init
git add . git add .
@@ -335,7 +335,7 @@ def test_testdrive_jsui_integration():
2. **Remove from main repo** 2. **Remove from main repo**
```bash ```bash
cd markitect_project cd markitect-main
git rm -rf capabilities/capability-name git rm -rf capabilities/capability-name
git commit -m "chore: remove capability-name for submodule conversion" git commit -m "chore: remove capability-name for submodule conversion"
``` ```

203
docs/composition-guide.md Normal file

@@ -0,0 +1,203 @@
# Infospace Composition Guide
One completed, viable infospace can be reused as a **discipline** for
another infospace — a lens applied to a different topic. This guide
explains how composition works and walks through the live
`examples/supply-chain-vsm/` reference.
---
## What composition means
An **infospace** is a directory of typed entities governed by
`infospace.yaml`. Its entities and relations describe a specific topic
(for example, Adam Smith's *Wealth of Nations*).
A **discipline** is an infospace declared as a reusable analytical
framework by another infospace. When infospace B binds infospace A as a
discipline:
1. B's entities can reference A's entities in `## WoN Concept` (or
equivalent) sections.
2. Properties A has already computed on its entities — such as VSM system
placement — become available to B by transitivity through the mapping.
3. B can impose its own viability thresholds independently of A's. The two
infospaces each pass or fail viability on their own terms.
The binding is declarative: a relative path in `infospace.yaml` plus a
display name. No code. No import. The discipline is looked up on disk at
the declared path when B's commands run.
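The on-disk lookup can be pictured roughly as follows. This is a sketch, not MarkiTect's actual resolver: `resolve_disciplines` is a hypothetical function, the config is assumed to be already parsed into a dict with a `disciplines` list of `name`/`path` entries, and a discipline counts as found when an `infospace.yaml` exists at the resolved path.

```python
from pathlib import Path


def resolve_disciplines(infospace_dir: str, config: dict) -> list[dict]:
    """Resolve each declared discipline path relative to the binding infospace."""
    base = Path(infospace_dir)
    resolved = []
    for entry in config.get("disciplines", []):
        target = (base / entry["path"]).resolve()
        resolved.append({
            "name": entry["name"],
            "path": target,
            # Assumption: the discipline is usable if its own config file exists.
            "found": (target / "infospace.yaml").is_file(),
        })
    return resolved
```

Because the path is resolved against B's directory rather than the current working directory, moving A and B together keeps the binding intact.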
---
## The viability pre-condition
Binding a non-viable infospace as a discipline is a mistake: a framework
that fails its own thresholds is not a stable reference frame. Before
binding, confirm the candidate discipline is viable:
```bash
cd examples/infospace-with-history
markitect infospace viability
```
```
Metric Value Threshold Status
---------------------------------------------------------------
redundancy_ratio 0.0061 max=0.1 PASS
coverage_ratio 0.6190 min=0.4 PASS
coherence_components 0.0000 max=3 PASS
consistency_cycles 0.0000 max=0 PASS
granularity_entropy 2.6748 min=1.0 PASS
per_entity_mean 3.9556 min=3.5 PASS
Viable: YES (6/6 thresholds met)
```
If the discipline is not viable, fix it first (see
`examples/infospace-with-history/docs/advanced-usage.md` §4 for triaging
low scorers).
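The threshold table above can be read as a set of `min`/`max` comparisons. The following is a minimal sketch of that logic using the values printed in the table; `check_viability` is a hypothetical name, and treating both bounds as inclusive is an assumption (the `consistency_cycles` row, 0 against `max=0`, suggests at least `max` is inclusive).

```python
# Values and thresholds copied from the viability table above.
METRICS = {
    "redundancy_ratio": 0.0061,
    "coverage_ratio": 0.6190,
    "coherence_components": 0.0,
    "consistency_cycles": 0.0,
    "granularity_entropy": 2.6748,
    "per_entity_mean": 3.9556,
}
THRESHOLDS = {
    "redundancy_ratio": ("max", 0.1),
    "coverage_ratio": ("min", 0.4),
    "coherence_components": ("max", 3),
    "consistency_cycles": ("max", 0),
    "granularity_entropy": ("min", 1.0),
    "per_entity_mean": ("min", 3.5),
}


def check_viability(metrics, thresholds):
    """Return {metric: passed} using inclusive min/max bounds (assumed)."""
    results = {}
    for name, (kind, limit) in thresholds.items():
        value = metrics[name]
        results[name] = value <= limit if kind == "max" else value >= limit
    return results


results = check_viability(METRICS, THRESHOLDS)
viable = all(results.values())
print(f"Viable: {'YES' if viable else 'NO'} ({sum(results.values())}/{len(results)} thresholds met)")
```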
---
## Example — how `supply-chain-vsm` binds WoN
The supply-chain infospace declares WoN as a discipline in its
`infospace.yaml`:
```yaml
topic:
name: "Modern Supply Chain Management"
domain: "Operations Management"
sources: artifacts/sources/
disciplines:
- name: "Wealth of Nations"
path: ../infospace-with-history
```
The binding is a **relative path**, so the two infospaces travel together
(they can be moved as a pair without breaking the link).
Verify the binding resolves and the discipline is viable:
```bash
cd examples/supply-chain-vsm
markitect infospace disciplines
```
```
Name Entities Viable Path
----------------------------------------------------------------------
Wealth of Nations 988 YES ../infospace-with-history
```
Each supply-chain entity then carries a `## WoN Concept` section
mapping it to exactly one WoN entity. The consolidated mapping files
(`output/mappings/*-mappings.md`) record the pairing, rationale, and a
conceptual-continuity rating (Strong / Moderate / Weak):
| Supply Chain Entity | WoN Concept | Strength | VSM |
|------------------------------|----------------------------------|----------|-------|
| Demand Signal | Effectual Demand | Strong | S2 |
| Vendor-Managed Inventory | Division of Labour | Strong | S1/S2 |
| Just-in-Time Inventory | Circulating Capital | Strong | S1/S3 |
| Bullwhip Effect | Natural Price as Central Price | Moderate | S2 |
| Safety Stock | Accumulation of Stock | Moderate | S3 |
Because each WoN entity already has a VSM system placement (S1–S5), the
supply-chain entities inherit a VSM position by transitivity through
their mapping — without supply-chain-vsm needing its own VSM reference.
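Inheritance by transitivity amounts to composing two lookups. The sketch below uses rows from the mapping table above and assumes, for illustration only, that the VSM column equals the placement already recorded on the mapped WoN entity; the dictionaries and `inherit_vsm` are hypothetical and do not reflect markitect's internal data model.

```python
# Supply-chain entity -> WoN concept (from the mapping table above).
MAPPING = {
    "Demand Signal": "Effectual Demand",
    "Just-in-Time Inventory": "Circulating Capital",
    "Safety Stock": "Accumulation of Stock",
}

# WoN concept -> VSM placement (assumed to match the table's VSM column).
WON_VSM = {
    "Effectual Demand": "S2",
    "Circulating Capital": "S1/S3",
    "Accumulation of Stock": "S3",
}


def inherit_vsm(mapping, discipline_vsm):
    """Give each mapped entity the VSM placement of its discipline concept.

    Entities whose concept has no placement map to None.
    """
    return {entity: discipline_vsm.get(concept) for entity, concept in mapping.items()}


inherited = inherit_vsm(MAPPING, WON_VSM)
```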
---
## Creating a new infospace that binds an existing one
Step-by-step, using WoN as the discipline for a hypothetical "Modern
Monetary Policy" infospace:
### 1. Start from the target topic
```bash
mkdir -p examples/monetary-policy/artifacts/sources
cd examples/monetary-policy
markitect infospace init
```
### 2. Declare the discipline in `infospace.yaml`
```yaml
topic:
name: "Modern Monetary Policy"
domain: "Macroeconomics"
sources: artifacts/sources/
disciplines:
- name: "Wealth of Nations"
path: ../infospace-with-history
```
Alternatively, bind imperatively after `init`:
```bash
markitect infospace bind-discipline ../infospace-with-history --name "Wealth of Nations"
```
### 3. Set your own viability thresholds
Copy the `viability:` block from a reference infospace and tune the
numbers to the scale and maturity of your topic. A smaller infospace
(50 entities, not 988) may need laxer `coverage_ratio` and stricter
`redundancy_ratio`.
### 4. Verify the binding
```bash
markitect infospace disciplines
```
If `Viable` is `NO`, stop and fix the discipline before continuing.
### 5. Reference discipline entities in your own entities
For each entity in the new infospace, add a `## <Discipline> Concept`
section that names the WoN entity the concept maps to, plus a rationale.
The exact section heading is configured per schema — see
`schemas/won-mapping-schema-v1.0.md` in `supply-chain-vsm` for the
template used there.
### 6. Run checks and evaluate
```bash
markitect infospace check
markitect infospace evaluate --provider openrouter
markitect infospace eval-summary --update-metrics
markitect infospace viability
```
The new infospace passes or fails viability independently of WoN.
---
## Why composition, not inclusion?
An alternative would be to copy WoN entities directly into the target
infospace. Composition avoids that by design:
- **One source of truth** — if WoN is refined, every infospace that binds
it picks up the improvement on the next run without a sync step.
- **Separation of concerns** — each infospace owns its own schema,
thresholds, and entity set. Changing the target topic cannot pollute
the discipline.
- **Bounded dependency** — the binding is a path, so the coupling is
visible in one place (`infospace.yaml`) and easy to remove.
---
## See also
- `examples/supply-chain-vsm/README.md` — the full reference composition.
- `examples/supply-chain-vsm/output/mappings/` — consolidated mapping
files showing the rationale and strength rating for each pairing.
- `examples/infospace-with-history/docs/advanced-usage.md` — patterns for
maintaining the discipline once it is in use.


@@ -0,0 +1,141 @@
# markitect-main → Successor Repos: Gap Assessment
**Date:** 2026-05-23
**Author:** Claude (custodian session)
**Status:** Draft — awaiting Bernd's decisions on items A/B/C below
## Purpose
Bernd is retiring `markitect-main` and has transferred most functionality to
sibling repos. This document identifies what was provided by `markitect-main`
that is **not addressed** in those successors, and flags candidates that may
not fit any successor's intent.
## Successor Ecosystem (5 repos, not 3)
| Repo | Role |
|---|---|
| `markitect-tool` | Markdown syntax layer + structured-document primitives; defines source-adapter and render-adapter contracts. CLI: `mkt`. |
| `kontextual-engine` | Headless knowledge operations engine: artifacts, collections, persistence, relationships, workflow runs/manifests, query, quality/assessment, API. |
| `infospace-bench` | Application layer — concrete infospaces, evaluation methodology, reference pilots. |
| `markitect-filter` | Source-format ingestion adapters (`source.epub3`, `source.pdf`) implementing the markitect-tool source-adapter contract. |
| `markitect-quarkdown` | Render/export adapter — implements the markitect-tool render-adapter contract via Quarkdown. |
## Method
Analysis is grounded in each successor's own assessment docs (recent, May 2026):
- `markitect-tool/docs/markitect-main-scope-assessment.md`
- `kontextual-engine/docs/markitect-main-scope-assessment.md`
- `kontextual-engine/docs/system-layer-extraction-inventory.md`
- `kontextual-engine/docs/system-layer-migration-backlog.md`
- `infospace-bench/docs/markitect-main-scope-assessment.md`
- `infospace-bench/docs/legacy-infospace-feature-inventory.md`
- `infospace-bench/docs/replacement-acceptance-matrix.md`
Cross-checked against actual `markitect-main` module sizing (Python LOC) and
`__init__.py` docstrings.
**Confidence:** These successor docs are authoritative on *intent*. They have
**not** been line-verified to confirm every "reimplement"-classified item
actually landed in the successor. Where verification matters, it's flagged.
---
## A. Doesn't fit any successor's intent — needs a new home or explicit retirement
These are explicitly pushed away by tool/engine/bench and are unrelated to
filter/quarkdown.
| markitect-main area | LOC | What it is | Status |
|---|---|---|---|
| `markitect/finance/` | ~8,100 | Cost-tracking system: cost items, period allocation to issues, financial reports, audit trails | **Orphan.** markitect-main's own SCOPE.md lists "financial transactions" as out-of-scope. Belongs with issue/project-ops, not knowledge tooling. |
| `issue_tracker/` + `_issue-tracking/` + `.issues/` | ~1,200 | Issue tracking (finance allocates costs to these issues) | **Orphan to the five** — but likely already superseded by the `issue-facade` capability / `use-issues` skill. **Verify before retiring.** |
| `markitect/profile/` | ~1,600 | User-profile CRUD, multi-profile, DB-backed | **Orphan.** Unrelated to all five. (Distinct from quarkdown's *render* "profile".) |
| `markitect/production/` | ~3,800 | Deployment-readiness validation, cross-platform checks, perf benchmarking | Engine keeps only "structured error/audit *ideas*". Deployment-validation bulk is orphan. |
| `tools/`, `services/`, gitea/tddai glue | ~5,500 | Project-ops tooling | Out-of-scope everywhere. |
| `markitect/legacy/` + `legacy_compat.py` | ~2,700 | Backward-compat shims | Retire by definition. |
## B. Rendering / asset / plugin layer — only *partially* covered, real residual gap
**This is the most consequential gap.** `SCOPE.md` lists "Rendering: markdown
→ interactive HTML via plugin system (testdrive-jsui)" as an in-scope
capability of markitect-main.
| Area | LOC | Covered? |
|---|---|---|
| `markitect/plugins/` (generic processor/formatter/validator/exporter plugin system) | ~8,000 | **No.** tool defines a render-adapter *contract* and an *extension* point, but the general plugin runtime isn't carried. |
| `markitect/assets/` (content-addressable asset store, dedup, `.mdpkg` ZIP packaging, symlink handling) + `asset_registry.json` (277 KB) | ~6,000 | **No.** Bench says "leave behind unless a concrete export needs assets." |
| Interactive-HTML / testdrive-jsui rendering, `static/`, `themes/`, `templates/document.html`, JS UI | — | **Partial only.** quarkdown covers a *Quarkdown* export path; the interactive-HTML / JS-UI path has no home. |
**Decision needed:** spin these into a dedicated render/asset repo (sibling to
quarkdown), fold the asset store into one of the existing repos, or retire the
interactive-HTML path.
## C. The other "Information Space" lineage — `markitect/spaces/` (~11,000 LOC)
**Distinct from `markitect/infospace/`** (which infospace-bench inherited).
`spaces/` is an older/parallel abstraction with features bench did *not* take:
- event-driven change tracking & notifications
- persistent transclusion context with cross-space references
- bidirectional directory synchronization
- HTML rendering of spaces with caching/themes
Engine takes generic persistence concepts and bench takes infospace semantics,
but **these specific `spaces/` behaviors (bidirectional sync, event
notifications, cross-space transclusion context) aren't mapped anywhere.**
Likely intended as dead/superseded — but 11k LOC warrants an explicit "retire
vs salvage" call.
## D. Declined-by-design (confirm retirement, don't re-extract)
| Area | LOC | Disposition |
|---|---|---|
| `markitect/graphql/` | ~4,000 | Tool, engine, and bench all explicitly declined GraphQL ("evidence of API need, not a commitment"). |
| `markitect/query_paradigms/` | ~3,500 | Engine/tool keep the *QueryResult envelope* concept but say "do not port the registry wholesale." |
| `markitect/proxy/` | ~870 | Non-markdown→md proxy with checksum/freshness tracking. **Overlaps markitect-filter.** Freshness/staleness-tracking mechanism may be worth checking against bench's deferred "stale-mappings." |
| `capabilities/` (top-level) | ~8,300 | Capability-packaging architecture; partially maps to tool (schema generation) but the packaging approach itself isn't carried. |
---
## What this means
The successors are, by their own assessments, **near complete for the
in-scope core** (parsing/schema → tool; persistence/workflow → engine;
infospace lifecycle → bench; ingestion → filter; one render path →
quarkdown). The truly unaddressed functionality is almost entirely the stuff
markitect-main accreted **beyond** its stated scope: finance, issue tracking,
user profiles, production/deployment validation, the asset/plugin/interactive-HTML
rendering stack, and the older `spaces/` abstraction.
## Decisions for Bernd
Three live decisions, not a long extraction backlog:
### Decision 1 — Render/asset stack (Section B)
The one with genuine product value left.
- **Option 1a:** new repo (sibling to quarkdown) for plugin runtime + asset store + interactive-HTML
- **Option 1b:** fold the asset store into an existing repo (most likely markitect-tool, behind a flag); retire interactive-HTML
- **Option 1c:** retire the interactive-HTML path entirely; trust quarkdown export as the single render story
### Decision 2 — `markitect/spaces/` (Section C)
- **Option 2a:** salvage bidirectional-sync / event-tracking / cross-space transclusion into engine (engine has the persistence story to support it)
- **Option 2b:** retire wholesale as superseded by infospace
### Decision 3 — Project-ops cluster (Section A: finance + issues + profile)
- **Option 3a:** confirm `issue-facade` already replaces `issue_tracker/` + `finance/`; retire both
- **Option 3b:** identify a home for any pieces worth keeping
---
## Suggested verification before deciding
If verification matters before committing:
- **For Decision 1:** grep the five repos for any render/asset adapter that already covers the HTML path beyond Quarkdown.
- **For Decision 2:** check whether engine's `OperationRun` + collection model can express bidirectional-sync semantics, or whether new primitives would be needed.
- **For Decision 3:** confirm whether `issue-facade` truly replaces `issue_tracker/` + `finance/` end-to-end.
Happy to do any of these focused passes when you're ready to decide.


@@ -117,7 +117,7 @@ This graph enables:
```bash ```bash
# Ensure MarkiTect is installed # Ensure MarkiTect is installed
cd /path/to/markitect_project cd /path/to/markitect-main
pip install -e . pip install -e .
``` ```


@@ -0,0 +1,230 @@
# Advanced Usage — Wealth of Nations Infospace
Patterns for working with the WoN infospace (988 entities) after the initial
pipeline run. Every command in this file has been run against the actual
infospace at the time of writing (2026-04-21); output shapes are excerpted
verbatim.
All commands assume `cwd = examples/infospace-with-history` and the
`markitect-venv` Python environment.
---
## 1. Incremental evaluation — add entities after the initial run
`markitect infospace evaluate` writes one file per entity under
`output/evaluations/<slug>.md`. It skips any entity whose evaluation file
already exists, so re-running after adding a new entity processes only the
new one.
```bash
# Add a new entity file
vim output/entities/new-concept.md
# Evaluate only the new entity (explicit)
markitect infospace evaluate --entity new-concept --provider openrouter
# Or re-run the whole pass — existing 988 are skipped, only the new file hits the LLM
markitect infospace evaluate --provider openrouter
```
**How skip detection works.** Evaluation slugs are normalised to underscores
with `_s_` preserving apostrophes (`farmers-capital` entity →
`farmer_s_capital.md` evaluation). If a new entity slug collides with an
existing evaluation under this normalisation, the eval will be skipped.
To be sure an entity was picked up, check:
```bash
# Count entities vs evaluations
ls output/entities/*.md | grep -Ev 'book-[0-9]+-(chapter-[0-9]+|introduction)-' | wc -l
ls output/evaluations/*.md | wc -l
```
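To see how two entity slugs could collide after normalisation, here is a small sketch. The normaliser `eval_slug` is an assumption: it simply turns hyphens and apostrophes into underscores, which reproduces the `accumulation-of-stock` → `accumulation_of_stock` pairing used in this guide but may not match markitect's actual rule in every case.

```python
import re
from collections import defaultdict


def eval_slug(entity_slug: str) -> str:
    """Hypothetical normaliser: hyphens and apostrophes become underscores."""
    return re.sub(r"[-']", "_", entity_slug.lower())


def find_collisions(entity_slugs):
    """Group entity slugs by normalised form and keep groups with more than one member."""
    groups = defaultdict(list)
    for slug in entity_slugs:
        groups[eval_slug(slug)].append(slug)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Running `find_collisions` over the entity slugs before an evaluation pass would flag any new entity whose evaluation would be silently skipped under this assumed rule.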
---
## 2. Re-evaluating after guideline changes
`evaluate` has no `--force` flag; re-evaluation requires deleting the
existing file first.
```bash
# Re-evaluate a single entity after updating the evaluation rubric
rm output/evaluations/accumulation_of_stock.md
markitect infospace evaluate --entity accumulation-of-stock --provider openrouter
# Re-evaluate a whole chapter
ls output/entities/book-1-chapter-06-entities.md # see which entities the chapter produced
# Map chapter entities to eval filenames (apostrophe/underscore normalisation) and rm them
```
After re-evaluating, refresh the aggregate:
```bash
markitect infospace eval-summary --update-metrics
```
This merges `per_entity_mean` into `output/metrics/metrics.yaml` so the next
`markitect infospace viability` check reflects the new scores.
---
## 3. Interpreting per-entity score distributions
`eval-summary` shows the mean for each of the five evaluation dimensions
plus the overall range:
```
$ markitect infospace eval-summary
Evaluation summary — 985 entities evaluated
Dimension Mean
--------------------------------------
overall 3.956
definition_precision 3.620
domain_placement 4.559
explanatory_value 3.936
source_grounding 4.358
vsm_relevance 3.305
Range: 1.00 – 4.80
```
Interpretation:
- `overall` above the 3.5 viability threshold → the collection passes
`per_entity_mean`.
- The lowest dimension (`vsm_relevance` = 3.305) is the weakest signal. If
the collection is meant to be VSM-grounded, this is the dimension most
worth improving (via sharper entity definitions or schema changes).
- A wide range (1.00–4.80) tells you there are outliers at both ends,
which are worth triaging (see pattern 4).
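The interpretation above can be reproduced with a few lines of arithmetic on the printed means. This sketch only uses the numbers from the summary output; the variable names are illustrative. In this particular output, the unweighted mean of the five dimension means happens to agree with the `overall` figure to three decimals, though that need not hold if some entities lack a dimension.

```python
# Dimension means copied from the eval-summary output above.
DIMENSION_MEANS = {
    "definition_precision": 3.620,
    "domain_placement": 4.559,
    "explanatory_value": 3.936,
    "source_grounding": 4.358,
    "vsm_relevance": 3.305,
}
VIABILITY_THRESHOLD = 3.5  # per_entity_mean minimum from the viability table

weakest = min(DIMENSION_MEANS, key=DIMENSION_MEANS.get)
mean_of_means = sum(DIMENSION_MEANS.values()) / len(DIMENSION_MEANS)
passes = mean_of_means >= VIABILITY_THRESHOLD

print(f"weakest dimension: {weakest} ({DIMENSION_MEANS[weakest]:.3f})")
print(f"mean of dimension means: {mean_of_means:.3f}, passes threshold: {passes}")
```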
---
## 4. Triaging low scorers
`markitect infospace entities --by-type` prints each entity's star score
in-line:
```
$ markitect infospace entities --by-type | head
=== Element (315 entities) ===
active_and_productive_stock Accumulation S1 ★4.6
advanced_state_of_society General Theory S5
agio_of_bank_money Exchange S2 ★4.8
```
Entities with no `★` have no evaluation yet. To list the lowest-scoring
entities across the whole collection:
```bash
# Extract overall_score from every evaluation file and sort ascending
for f in output/evaluations/*.md; do
score=$(awk '/^overall_score:/ {print $2; exit}' "$f")
printf "%s\t%s\n" "$score" "$(basename "$f" .md)"
done | sort -n | head -20
```
The 20 lowest scorers are the natural triage list — inspect their
`output/entities/<slug>.md` and evaluation rationales to decide whether to
refine the entity, merge it with a better-formed neighbour, or drop it.
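For use from a script rather than a shell, the same triage list can be built in Python. This is a sketch that assumes each evaluation file carries an `overall_score:` line in its frontmatter, as the evaluation files in this example do; `lowest_scorers` is a hypothetical helper, and files without a score are skipped.

```python
from pathlib import Path


def lowest_scorers(eval_dir, limit=20):
    """Return up to `limit` (score, slug) pairs, lowest score first."""
    scored = []
    for path in sorted(Path(eval_dir).glob("*.md")):
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.startswith("overall_score:"):
                # Take the first overall_score line only, like the awk `exit`.
                scored.append((float(line.split(":", 1)[1]), path.stem))
                break
    return sorted(scored)[:limit]
```

Calling `lowest_scorers("output/evaluations")` from the infospace root would give the same ordering as the shell loop for files that have a score.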
---
## 5. Reading and acting on collection-check output
`markitect infospace check` runs five concerns (C1–C5). Use `--concern` to
focus on one and `--json` for machine-readable output:
```bash
# Redundancy — which pairs of entities are suspiciously similar?
markitect infospace check --concern redundancy --json
```
```json
{
"redundancy": {
"concern": "C1",
"redundancy_ratio": 0.0061,
"similar_pairs": [
{"entity_a": "bank_economic_contribution_metrics",
"entity_b": "bank_economic_development_metrics",
"similarity": 1.0, "method": "word_overlap"},
{"entity_a": "economic_system_objectives",
"entity_b": "economic_system_purpose",
"similarity": 0.9394, "method": "word_overlap"}
]
}
}
```
Acting on this:
- **Similarity = 1.0** is almost certainly a duplicate — pick one slug and
merge or delete the other.
- **0.85–0.99** usually means two entities genuinely cover the same idea
with slight phrasing differences. Merging is the cleanest fix.
- **< 0.85** usually represents legitimate adjacent concepts — leave as-is
unless the definition rubric says otherwise.
For coverage and coherence, the pattern is the same: the `--json` output
surfaces the specific entities / missing links / disconnected components
you need to look at, rather than a bare ratio.
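The similarity bands for redundancy can be applied mechanically to the `--json` output. The sketch below parses the sample shown earlier in this section and sorts pairs into three buckets; `triage_pairs` and the bucket names are illustrative, and the cut-offs are the rule-of-thumb values from the list above rather than anything enforced by the CLI.

```python
import json

SAMPLE = """
{
  "redundancy": {
    "concern": "C1",
    "redundancy_ratio": 0.0061,
    "similar_pairs": [
      {"entity_a": "bank_economic_contribution_metrics",
       "entity_b": "bank_economic_development_metrics",
       "similarity": 1.0, "method": "word_overlap"},
      {"entity_a": "economic_system_objectives",
       "entity_b": "economic_system_purpose",
       "similarity": 0.9394, "method": "word_overlap"}
    ]
  }
}
"""


def triage_pairs(report):
    """Bucket similar pairs: 1.0 duplicate, 0.85 up to 1.0 merge candidate, else adjacent."""
    buckets = {"duplicate": [], "merge_candidate": [], "adjacent": []}
    for pair in report["redundancy"]["similar_pairs"]:
        names = (pair["entity_a"], pair["entity_b"])
        if pair["similarity"] >= 1.0:
            buckets["duplicate"].append(names)
        elif pair["similarity"] >= 0.85:
            buckets["merge_candidate"].append(names)
        else:
            buckets["adjacent"].append(names)
    return buckets


buckets = triage_pairs(json.loads(SAMPLE))
```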
---
## 6. Systematic processing of long texts
For long source material (books, multi-chapter specifications, corpora), the
pipeline can produce a clean chapter-by-chapter git history on its own if
you let it. The pattern:
```bash
# Process all sources in canonical order, eval and classify per chapter,
# snapshot metrics after each chapter.
markitect infospace process --all \
--provider openrouter \
--eval-after-source \
--classify-after-source \
--check-after-each
```
What you get:
- **One commit per source file**, not per batch run. The commit message body
lists counts by bucket (`entities: +23`, `evaluations: +23`,
`classifications: +23`) derived from the actual staged diff, so `git log`
reads like the story of the infospace growing.
- **Chapter-atomic commits.** `--eval-after-source` and
`--classify-after-source` evaluate and classify *only the new entities*
from the just-processed source before the commit lands, so each commit is
a self-contained chapter snapshot.
- **Metrics-per-chapter trail.** `--check-after-each` appends a snapshot to
`output/metrics/history.yaml` after every chapter, so `markitect infospace
history` later shows the metric trajectory rather than just start/end.
**Cost tradeoff.** `--eval-after-source` pays LLM latency per chapter rather
than amortising it across one bulk batch. It's worth it when you care about
the git history or want early quality signal, not when you're bulk-backfilling
a known-good corpus.
**Triage during the run.** While processing, use `markitect infospace
chapters` in another shell to see per-source entity/eval/classify counts and
mean scores — handy for spotting chapters that under-extracted or evaluated
poorly.
```
$ markitect infospace chapters
source entities evaluated classified mean_score
------------------- -------- --------- ---------- ----------
book-1-chapter-01 96 96 79 4.22
book-1-chapter-02 16 16 10 4.06
```
---
## See also
- `METRICS-METHODOLOGY.md` — how each metric is computed.
- `docs/composition-guide.md` — using this infospace as a discipline for a
different domain.
- `docs/performance-notes.md` — observed timings and provider choices.


@@ -0,0 +1,106 @@
# Performance Notes — Wealth of Nations Infospace
Observed timings, file sizes, and provider choices from the 988-entity WoN
example. These are **operational notes**, not a benchmark — numbers come
from the actual S3.3 evaluation run (2026-02-23) rather than a controlled
experiment.
---
## Evaluation batch duration
The initial evaluation pass produced 985 `output/evaluations/*.md` files:
- First `evaluated_at`: `2026-02-23T00:11:52`
- Last `evaluated_at`: `2026-02-23T06:39:45`
- **Total wall time: ~6h 28m**
- **Effective throughput: ~2.5 entities/min** (~152 entities/hour)
Extracted from evaluation frontmatter:
```bash
grep -h '^evaluated_at:' output/evaluations/*.md | sort | sed -n '1p;$p'
```
Caveats:
- This was against OpenRouter's free tier, which applies implicit
rate-limiting and occasional retries.
- Throughput is not constant — gaps between bursts show up as plateaus
when you plot the timestamps.
- The batch was not fully parallelised; a tuned concurrent client could
likely reach 2–4× this throughput on a paid OpenRouter tier.
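The wall time and throughput figures follow directly from the first and last `evaluated_at` timestamps. A short worked calculation, using only the values quoted in this section:

```python
from datetime import datetime

# First and last evaluated_at timestamps, and the evaluation count, from this section.
first = datetime.fromisoformat("2026-02-23T00:11:52")
last = datetime.fromisoformat("2026-02-23T06:39:45")
evaluated = 985

elapsed = last - first
minutes = elapsed.total_seconds() / 60
per_minute = evaluated / minutes
per_hour = per_minute * 60

print(f"wall time: {elapsed}")
print(f"throughput: {per_minute:.1f} entities/min ({per_hour:.0f} entities/hour)")
```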
---
## Tokens per entity (estimate)
Direct token counts are not logged in the evaluation files, but the
inputs and outputs are on disk:
- **Input per request**: evaluation schema (~3.7 KB) + entity file
(~0.7 KB median) + fixed system prompt ≈ **~1500–2500 tokens in**
- **Output per request**: structured evaluation with 5 dimensions and
rationales, median eval file 3.6 KB ≈ **~600–800 tokens out**
- **Round-trip total**: **~2000–3000 tokens per entity**
- **Batch total estimate**: 985 entities × ~2500 tokens ≈ **~2.5M tokens**
for the full pass
The constant per-entity input means the cheapest way to reduce spend on a
re-run is to narrow the targeted entities (`--entity <slug>` or
`--chapter <n>`), not to shorten the schema.
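The batch estimate is a straightforward multiplication of the entity count by the assumed round-trip tokens per entity. The sketch below shows the low, mid, and high cases; the per-entity figures are the rough estimates from this section, not measured token counts.

```python
EVALUATED = 985  # evaluation files produced by the initial pass

# Rough round-trip tokens per entity (estimates from this section).
PER_ENTITY = {"low": 2000, "mid": 2500, "high": 3000}

batch_totals = {label: EVALUATED * tokens for label, tokens in PER_ENTITY.items()}

for label, total in batch_totals.items():
    print(f"{label}: {total:,} tokens (~{total / 1_000_000:.1f}M)")
```

The mid case is the ~2.5M figure quoted above; the low and high cases bracket it at roughly 2.0M and 3.0M tokens.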
---
## Embedding cache and collection checks
`markitect infospace check --concern redundancy` supports two similarity
backends (see `markitect/infospace/checks/redundancy.py`):
- **`word_overlap`** — the default, used when no embeddings are provided.
Pure-Python set intersection over tokenised entity text. **No LLM calls,
no cache needed.** This is what the current WoN check runs.
- **`embedding`** — active when a pre-computed `{slug: vector}` mapping is
passed in. No persistent on-disk embedding cache exists today; the
caller is responsible for computing and supplying the vectors.
Implication: the 988-entity `check` runs in seconds because it's all
word-overlap. Switching to embedding similarity would add an embedding
API pass (another ~988 requests) which is currently a manual step
outside the CLI.
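To make the word-overlap idea concrete, here is a minimal sketch of a set-based similarity. It uses the Jaccard ratio (shared tokens over all distinct tokens), which is an assumption: the exact formula and tokenisation in `redundancy.py` are not shown here and may differ.

```python
import re


def tokens(text: str) -> set:
    """Lower-case the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def word_overlap(text_a: str, text_b: str) -> float:
    """Jaccard similarity of the two token sets (assumed formula); 0.0 if either is empty."""
    a, b = tokens(text_a), tokens(text_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Because this is pure set arithmetic with no network calls, comparing every pair in a collection of this size stays cheap, consistent with the check finishing in seconds.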
---
## Provider choice — recommendation
For the WoN dataset specifically (text-heavy entities, 5-dimension
rubric):
| Scale | Recommended provider | Rationale |
|-----------------------|----------------------------------|-----------|
| < 50 entities | `gemini/gemini-2.5-flash` | Fast default; free tier is generous enough; consistent with `markitect llm-check` out of the box. |
| 50–1000 entities | `openrouter` with a `:free` model (e.g. `arcee-ai/trinity-large-preview:free`) | What the S3.3 batch used; gets through 988 entities in one overnight run without cost. |
| > 1000 entities | `openrouter` with a paid small-context model, or `openai` | Free-tier rate limits start to dominate wall time; paying for higher concurrency is cheaper than calendar time. |
All providers are accepted by `markitect infospace evaluate --provider`.
The evaluation schema doesn't assume any provider-specific features.
Note on provider mixing: if part of a collection is evaluated under one
provider/model and the rest under another, `per_entity_mean` can drift
slightly (different models calibrate scores differently). For the
viability threshold of 3.5 the drift is usually negligible, but for
fine-grained outlier analysis prefer a single provider per batch.
---
## What is *not* measured here
- **End-to-end pipeline time** (entity extraction from raw chapters,
classification, relation graph) — only the evaluation phase is timed.
- **Memory footprint** — the full in-memory state for 988 entities is
small (< 200 MB observed), but not systematically measured.
- **Failure/retry rates** — the 985 vs 988 gap is three entities the
original run missed (plus one added later); no structured retry log
was kept.
Expanding any of these into a proper benchmark is **out of scope** for
the WoN example and should live alongside a synthetic corpus that can be
regenerated deterministically.


@@ -0,0 +1,28 @@
---
entity_slug: advanced_state_of_society
evaluator: gemini-2.5-flash
evaluated_at: '2026-04-21T21:32:17.135192'
overall_score: 4.5
scores:
- name: definition_precision
value: 4.0
max_value: 5.0
rationale: The definition is precise, listing key characteristics like accumulated
stock and private property. It clearly distinguishes the concept by contrasting
it with earlier economic conditions.
- name: source_grounding
value: 5.0
max_value: 5.0
rationale: This entity is deeply grounded in Smith's work, particularly in Book
I
---
# Evaluation: Advanced State Of Society
## definition_precision — 4.0 / 5.0
The definition is precise, listing key characteristics like accumulated stock and private property. It clearly distinguishes the concept by contrasting it with earlier economic conditions.
## source_grounding — 5.0 / 5.0
This entity is deeply grounded in Smith's work, particularly in Book I


@@ -0,0 +1,61 @@
---
entity_slug: bank_notes
evaluator: null
evaluated_at: '2026-04-21T21:33:16.736926'
overall_score: 4.4
scores:
- name: definition_precision
value: 5.0
max_value: 5.0
rationale: The definition is precise, clearly distinguishing bank notes by their
issuer, form, and key characteristics (payable on demand, confidence-based). It
avoids circularity and captures a distinct concept.
- name: source_grounding
value: 5.0
max_value: 5.0
rationale: The entity is excellently grounded in "The Wealth of Nations," specifically
Book II, Chapter 2, where Smith extensively discusses bank notes' role in economizing
precious metals and their reliance on public confidence.
- name: domain_placement
value: 4.0
max_value: 5.0
rationale: '"Exchange" is an appropriate domain as bank notes primarily function
as a medium for facilitating transactions. While "Money" or "Finance" could also
fit, "Exchange" accurately reflects their operational role in the economy.'
- name: vsm_relevance
value: 3.0
max_value: 5.0
rationale: Bank notes are a critical *medium* or *tool* that enables the primary
operations (S1) of an economy (i.e., exchange of goods and services). However,
they are not a VSM system or management function themselves, making their direct
mapping somewhat abstract.
- name: explanatory_value
value: 5.0
max_value: 5.0
rationale: This entity offers significant explanatory power by detailing how paper
money functions, its reliance on confidence, and its role in reducing the need
for precious metals, thereby illuminating a key mechanism in Smith's economic
theory.
---
# Evaluation: Bank Notes
## definition_precision — 5.0 / 5.0
The definition is precise, clearly distinguishing bank notes by their issuer, form, and key characteristics (payable on demand, confidence-based). It avoids circularity and captures a distinct concept.
## source_grounding — 5.0 / 5.0
The entity is excellently grounded in "The Wealth of Nations," specifically Book II, Chapter 2, where Smith extensively discusses bank notes' role in economizing precious metals and their reliance on public confidence.
## domain_placement — 4.0 / 5.0
"Exchange" is an appropriate domain as bank notes primarily function as a medium for facilitating transactions. While "Money" or "Finance" could also fit, "Exchange" accurately reflects their operational role in the economy.
## vsm_relevance — 3.0 / 5.0
Bank notes are a critical *medium* or *tool* that enables the primary operations (S1) of an economy (i.e., exchange of goods and services). However, they are not a VSM system or management function themselves, making their direct mapping somewhat abstract.
## explanatory_value — 5.0 / 5.0
This entity offers significant explanatory power by detailing how paper money functions, its reliance on confidence, and its role in reducing the need for precious metals, thereby illuminating a key mechanism in Smith's economic theory.


@@ -0,0 +1,60 @@
---
entity_slug: bank_systemic_risk_management
evaluator: gemini-2.5-flash-lite
evaluated_at: '2026-04-21T21:49:35.222637'
overall_score: 4.0
scores:
- name: definition_precision
value: 4.0
max_value: 5.0
rationale: The definition is precise and clearly outlines the purpose of bank systemic
risk management. It avoids being an overly broad umbrella term.
- name: source_grounding
value: 3.0
max_value: 5.0
rationale: While the concept of managing risks to the banking system is present
in Book II, Chapter 2, the explicit framing of "systemic risk management" as a
distinct entity with specific practices might be a slight abstraction beyond Smith's
direct terminology.
- name: domain_placement
value: 5.0
max_value: 5.0
rationale: The "Regulation" domain is highly appropriate. Managing systemic risk
is fundamentally a regulatory concern aimed at ensuring the stability of the financial
system.
- name: vsm_relevance
value: 4.0
max_value: 5.0
rationale: This entity strongly maps to VSM System 3 (Internal Regulation/Audit)
as it involves monitoring and controlling internal operations to prevent systemic
failures. It also has elements of System 5 (Policy) in setting overall stability
goals.
- name: explanatory_value
value: 4.0
max_value: 5.0
rationale: The entity provides good explanatory value by highlighting a crucial
mechanism for maintaining financial stability. It explains *how* the banking system
can be protected from cascading failures.
---
# Evaluation: Bank Systemic Risk Management
## definition_precision — 4.0 / 5.0
The definition is precise and clearly outlines the purpose of bank systemic risk management. It avoids being an overly broad umbrella term.
## source_grounding — 3.0 / 5.0
While the concept of managing risks to the banking system is present in Book II, Chapter 2, the explicit framing of "systemic risk management" as a distinct entity with specific practices might be a slight abstraction beyond Smith's direct terminology.
## domain_placement — 5.0 / 5.0
The "Regulation" domain is highly appropriate. Managing systemic risk is fundamentally a regulatory concern aimed at ensuring the stability of the financial system.
## vsm_relevance — 4.0 / 5.0
This entity strongly maps to VSM System 3 (Internal Regulation/Audit) as it involves monitoring and controlling internal operations to prevent systemic failures. It also has elements of System 5 (Policy) in setting overall stability goals.
## explanatory_value — 4.0 / 5.0
The entity provides good explanatory value by highlighting a crucial mechanism for maintaining financial stability. It explains *how* the banking system can be protected from cascading failures.


@@ -3,7 +3,7 @@ consistency_cycles: 0.0
coverage_ratio: 0.619048 coverage_ratio: 0.619048
granularity_entropy: 2.674752 granularity_entropy: 2.674752
modularity: 0.0 modularity: 0.0
per_entity_mean: 3.955635 per_entity_mean: 3.95668
redundancy_ratio: 0.006073 redundancy_ratio: 0.006073
type_distribution: type_distribution:
Element: 315 Element: 315


@@ -240,8 +240,14 @@ def llm_catalog(output_format):
) )
def llm_check(provider, model): def llm_check(provider, model):
"""Send a minimal prompt to verify a provider is reachable and responding.""" """Send a minimal prompt to verify a provider is reachable and responding."""
import os
from markitect.llm import create_adapter from markitect.llm import create_adapter
from markitect.llm.exceptions import LLMConfigurationError, LLMError from markitect.llm.exceptions import (
LLMAPIError,
LLMConfigurationError,
LLMError,
)
from markitect.prompts.execution.models import RunConfig from markitect.prompts.execution.models import RunConfig
resolved = resolve_llm(cli_provider=provider, cli_model=model) resolved = resolve_llm(cli_provider=provider, cli_model=model)
@@ -252,6 +258,17 @@ def llm_check(provider, model):
f" model from: {resolved.model_source}" f" model from: {resolved.model_source}"
) )
# Advisory: OPENROUTER_API_KEY is set but this call won't use it. Common
# source of "works for me, fails for agents" when the env var holds a
# stale key that overrides a clean config entry.
if resolved.provider != "openrouter" and os.environ.get("OPENROUTER_API_KEY"):
click.echo(
" note: OPENROUTER_API_KEY is set but won't be used for this "
"provider. If OpenRouter calls fail elsewhere with 401, the env "
"var may be stale — unset or update it.",
err=True,
)
try: try:
adapter = create_adapter( adapter = create_adapter(
provider=resolved.provider, provider=resolved.provider,
@@ -273,6 +290,19 @@ def llm_check(provider, model):
except LLMError as exc: except LLMError as exc:
elapsed = time.monotonic() - start elapsed = time.monotonic() - start
click.echo(f"ERROR \u2014 LLM error after {elapsed:.1f}s: {exc}", err=True) click.echo(f"ERROR \u2014 LLM error after {elapsed:.1f}s: {exc}", err=True)
# Targeted hint: 401 on openrouter almost always means a stale key.
if (
resolved.provider == "openrouter"
and isinstance(exc, LLMAPIError)
and exc.status_code == 401
):
click.echo(
" hint: OpenRouter returned 401 (unauthorized). Check whether "
"OPENROUTER_API_KEY is stale (`unset OPENROUTER_API_KEY` to "
"fall back to the key in ~/.config/markitect/config.toml, or "
"update the env var).",
err=True,
)
sys.exit(1) sys.exit(1)
except Exception as exc: except Exception as exc:
elapsed = time.monotonic() - start elapsed = time.monotonic() - start
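
The two messages added to `llm_check` can be read as two small predicates: one fires before the call when the OpenRouter env var is set but unused, the other fires after a 401 from OpenRouter. A minimal sketch, assuming hypothetical helper names (`unused_key_note`, `unauthorized_hint`) and shortened message text; neither helper exists in the diff:

```python
from typing import Mapping, Optional


def unused_key_note(provider: str, env: Mapping[str, str]) -> Optional[str]:
    """Advisory for a set-but-unused OPENROUTER_API_KEY (hypothetical helper)."""
    # An empty value counts as unset, matching the truthiness test in the diff.
    if provider != "openrouter" and env.get("OPENROUTER_API_KEY"):
        return "note: OPENROUTER_API_KEY is set but won't be used for this provider."
    return None


def unauthorized_hint(provider: str, status_code: Optional[int]) -> Optional[str]:
    """Hint for a 401 from OpenRouter (hypothetical helper)."""
    if provider == "openrouter" and status_code == 401:
        return "hint: OpenRouter returned 401; OPENROUTER_API_KEY may be stale."
    return None
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, is a choice made here only to keep the sketch easy to test.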


@@ -7,8 +7,9 @@ inspecting, and evaluating infospaces.
from __future__ import annotations from __future__ import annotations
import re
from pathlib import Path from pathlib import Path
from typing import Optional from typing import Dict, Optional
import click import click
@@ -228,6 +229,227 @@ def _entities_by_type(cfg, root: "Path", entity_list: list) -> None:
click.echo(f"\nTotal: {total} entities") click.echo(f"\nTotal: {total} entities")
# ── chapters (per-source triage view) ────────────────────────────────
@infospace_commands.command()
@click.option("--config", "config_path", default=None, help="Path to infospace.yaml.")
@click.option(
"--format", "output_format",
type=click.Choice(["text", "json"]),
default="text",
help="Output format.",
)
def chapters(config_path: Optional[str], output_format: str):
"""List source files in canonical order with per-source stats.
For each source file in the sources directory, reports entity count,
mean per-entity score (if evaluated), classification coverage, and
processing status. Useful for triaging long-text infospaces.
"""
cfg, cfg_path = _load_config_or_exit(config_path)
root = cfg_path.parent
sources_dir = root / cfg.topic.sources if cfg.topic.sources else root
if not sources_dir.is_dir():
click.echo(f"No sources directory at {sources_dir}.", err=True)
raise SystemExit(1)
source_files = sorted(sources_dir.glob("*.md"))
if not source_files:
click.echo(f"No source files in {sources_dir}.", err=True)
raise SystemExit(1)
entities_dir = root / cfg.entities_dir
entity_list = (
parse_entity_directory(entities_dir) if entities_dir.is_dir() else []
)
# Build a source_id → [entities] map using the source_chapter field.
# Matching is lenient: entities with a source_chapter substring-equal
# to a normalized form of the source stem count as belonging to it.
def _chapter_keys(source_id: str) -> list:
"""Return strings an entity's source_chapter might contain."""
keys = [source_id, source_id.replace("-", " ")]
m = re.match(r"book-(\d+)-chapter-(\d+)", source_id)
if m:
book, chap = m.group(1), m.group(2)
roman = {"1": "I", "2": "II", "3": "III", "4": "IV", "5": "V"}
if book in roman:
keys.append(f"Book {roman[book]}, Chapter {int(chap)}")
keys.append(f"Book {roman[book]} Chapter {int(chap)}")
return keys
# Precompute evaluation scores and classification slugs once.
evals_dir = root / cfg.evaluations_dir
cls_dir = root / cfg.classifications_dir
eval_scores: Dict[str, float] = {}
if evals_dir.is_dir():
from markitect.infospace.evaluation_io import read_entity_evaluation
for ev_path in evals_dir.glob("*.md"):
try:
ev = read_entity_evaluation(ev_path)
if ev.overall_score is not None:
eval_scores[ev_path.stem] = ev.overall_score
except Exception:
continue
classified_slugs = (
{p.stem for p in cls_dir.glob("*.md")} if cls_dir.is_dir() else set()
)
rows = []
for source_file in source_files:
source_id = source_file.stem
keys = _chapter_keys(source_id)
matched = [
e for e in entity_list
if any(k.lower() in (e.source_chapter or "").lower() for k in keys)
]
slugs = {e.slug for e in matched}
evaluated = slugs & set(eval_scores)
classified = slugs & classified_slugs
mean = (
sum(eval_scores[s] for s in evaluated) / len(evaluated)
if evaluated else None
)
rows.append({
"source_id": source_id,
"entities": len(matched),
"evaluated": len(evaluated),
"classified": len(classified),
"mean_score": round(mean, 2) if mean is not None else None,
})
if output_format == "json":
import json
click.echo(json.dumps(rows, indent=2))
return
# Text: aligned table.
headers = ("source", "entities", "evaluated", "classified", "mean_score")
widths = [
max(len(h), max((len(str(r[h.replace(' ', '_')])) if h != "source"
else len(r["source_id"]))
for r in rows)) if rows else len(h)
for h in headers
]
fmt = " ".join(f"{{:<{w}}}" for w in widths)
click.echo(fmt.format(*headers))
click.echo(fmt.format(*("-" * w for w in widths)))
for r in rows:
click.echo(fmt.format(
r["source_id"],
r["entities"],
r["evaluated"],
r["classified"],
"-" if r["mean_score"] is None else f"{r['mean_score']:.2f}",
))
totals = {
"entities": sum(r["entities"] for r in rows),
"evaluated": sum(r["evaluated"] for r in rows),
"classified": sum(r["classified"] for r in rows),
}
click.echo(
f"\n{len(rows)} source file(s); "
f"{totals['entities']} entities, "
f"{totals['evaluated']} evaluated, "
f"{totals['classified']} classified."
)
# ── entity (single lookup) ───────────────────────────────────────────
@infospace_commands.command()
@click.argument("name")
@click.option("--config", "config_path", default=None, help="Path to infospace.yaml.")
def entity(name: str, config_path: Optional[str]):
"""Look up one entity by name, tolerating case / hyphens / underscores.
Prints slug, source path, domain, chapter, word count, overall score,
VSM system (if classified), and evaluation-file path.
"""
cfg, cfg_path = _load_config_or_exit(config_path)
root = cfg_path.parent
entities_dir = root / cfg.entities_dir
if not entities_dir.is_dir():
click.echo("No entities directory found.", err=True)
raise SystemExit(1)
entity_list = parse_entity_directory(entities_dir)
if not entity_list:
click.echo("No entities found.", err=True)
raise SystemExit(1)
# Normalize: lowercase, underscores.
def norm(s: str) -> str:
return s.lower().replace("-", "_").replace(" ", "_")
target = norm(name)
by_slug = {e.slug: e for e in entity_list}
match = by_slug.get(target)
if match is None:
# Substring fallback for partial input.
candidates = [e for e in entity_list if target in norm(e.slug)]
if len(candidates) == 1:
match = candidates[0]
elif len(candidates) > 1:
click.echo(f"Ambiguous — '{name}' matches multiple entities:", err=True)
for c in sorted(candidates, key=lambda e: e.slug)[:10]:
click.echo(f" {c.slug}", err=True)
if len(candidates) > 10:
click.echo(f" … and {len(candidates) - 10} more", err=True)
raise SystemExit(1)
else:
click.echo(f"No entity matching '{name}'.", err=True)
near = sorted(
e.slug for e in entity_list
if target.split("_", 1)[0] in e.slug
)[:5]
if near:
click.echo(f" Near matches: {', '.join(near)}", err=True)
raise SystemExit(1)
# Load score + classification (best-effort).
score: Optional[float] = None
evaluator: Optional[str] = None
eval_file = root / cfg.evaluations_dir / f"{match.slug}.md"
if eval_file.is_file():
try:
from markitect.infospace.evaluation_io import read_entity_evaluation
ev = read_entity_evaluation(eval_file)
score = ev.overall_score
evaluator = ev.evaluator
except Exception:
pass
vsm: Optional[str] = None
cls_file = root / cfg.classifications_dir / f"{match.slug}.md"
if cls_file.is_file():
try:
from markitect.infospace.classification_io import read_entity_classification
cls = read_entity_classification(cls_file)
vsm = cls.vsm_system
except Exception:
pass
# Output — one field per line so it's easy to grep or pipe.
click.echo(f"slug: {match.slug}")
click.echo(f"source_path: {match.source_path}")
click.echo(f"domain: {match.domain or '-'}")
click.echo(f"chapter: {match.source_chapter or '-'}")
click.echo(f"word_count: {match.total_word_count}")
click.echo(f"vsm_system: {vsm or '-'}")
if score is not None:
click.echo(f"overall_score: {score:.2f}")
click.echo(f"evaluator: {evaluator or '-'}")
click.echo(f"evaluation: {eval_file}")
else:
click.echo("evaluation: (not yet evaluated)")
# ── evaluate ───────────────────────────────────────────────────────── # ── evaluate ─────────────────────────────────────────────────────────
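
The lenient chapter matching used by `chapters` can be tried in isolation. The sketch below copies the key derivation from `_chapter_keys` and wraps the substring test in a hypothetical `belongs_to` helper; the source id `book-2-chapter-02` is an assumed file stem, not one confirmed by the diff:

```python
import re
from typing import List, Optional

_ROMAN = {"1": "I", "2": "II", "3": "III", "4": "IV", "5": "V"}


def chapter_keys(source_id: str) -> List[str]:
    """Strings an entity's source_chapter might contain for this source."""
    keys = [source_id, source_id.replace("-", " ")]
    m = re.match(r"book-(\d+)-chapter-(\d+)", source_id)
    if m and m.group(1) in _ROMAN:
        roman, chap = _ROMAN[m.group(1)], int(m.group(2))
        keys.append(f"Book {roman}, Chapter {chap}")
        keys.append(f"Book {roman} Chapter {chap}")
    return keys


def belongs_to(source_chapter: Optional[str], source_id: str) -> bool:
    """Case-insensitive substring match (hypothetical helper)."""
    haystack = (source_chapter or "").lower()
    return any(k.lower() in haystack for k in chapter_keys(source_id))
```

Because the match is a plain substring test, a key such as `Book II, Chapter 2` also matches a `source_chapter` of `Book II, Chapter 21`, so the per-source counts may overlap for books with ten or more chapters.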
@@ -237,7 +459,14 @@ def _entities_by_type(cfg, root: "Path", entity_list: list) -> None:
@click.option("--model", default=None, help="LLM model name.") @click.option("--model", default=None, help="LLM model name.")
@click.option("--entity", "entity_slug", default=None, help="Evaluate a single entity by slug.") @click.option("--entity", "entity_slug", default=None, help="Evaluate a single entity by slug.")
@click.option("--chapter", default=None, help="Evaluate entities from a specific chapter.") @click.option("--chapter", default=None, help="Evaluate entities from a specific chapter.")
def evaluate(config_path, provider, model, entity_slug, chapter): @click.option("--force", is_flag=True, default=False,
help="Re-evaluate entities whose evaluation file already exists.")
@click.option("--model-fallback", "model_fallback", default=None,
help="If the primary model hits a rate limit (429), retry the "
"failed entities once with this model. Useful on free tiers "
"where models have separate quota buckets (e.g. "
"gemini-2.5-flash → gemini-2.5-flash-lite).")
def evaluate(config_path, provider, model, entity_slug, chapter, force, model_fallback):
"""Evaluate entities using LLM-based quality assessment.""" """Evaluate entities using LLM-based quality assessment."""
cfg, cfg_path = _load_config_or_exit(config_path) cfg, cfg_path = _load_config_or_exit(config_path)
root = cfg_path.parent root = cfg_path.parent
@@ -252,32 +481,44 @@ def evaluate(config_path, provider, model, entity_slug, chapter):
click.echo("No entities to evaluate.") click.echo("No entities to evaluate.")
return return
# Filter # Filter. Accept hyphenated input for --entity by normalizing to the
# underscore slug format produced by parse_entity_directory.
if entity_slug: if entity_slug:
entity_list = [e for e in entity_list if e.slug == entity_slug] normalized = entity_slug.replace("-", "_")
if not entity_list: matches = [e for e in entity_list if e.slug == normalized]
click.echo(f"Error: Entity '{entity_slug}' not found.", err=True) if not matches:
# Build a short "did you mean…" list from entities sharing a stem.
stem = normalized.split("_", 1)[0]
near = sorted(e.slug for e in entity_list if e.slug.startswith(stem))[:5]
msg = f"Error: Entity '{entity_slug}' not found."
if near:
msg += f" Did you mean: {', '.join(near)} ?"
click.echo(msg, err=True)
raise SystemExit(1) raise SystemExit(1)
entity_list = matches
elif chapter: elif chapter:
entity_list = [e for e in entity_list if chapter in e.source_chapter] entity_list = [e for e in entity_list if chapter in e.source_chapter]
if not entity_list: if not entity_list:
click.echo(f"No entities found for chapter '{chapter}'.") click.echo(f"No entities found for chapter '{chapter}'.")
return return
# Skip entities that already have evaluation files (incremental resume) # Skip entities that already have evaluation files (incremental resume).
# Applies uniformly to full-pass, --entity, and --chapter runs unless
# --force is set.
from markitect.infospace.evaluate import run_entity_evaluation from markitect.infospace.evaluate import run_entity_evaluation
output_dir = root / cfg.evaluations_dir output_dir = root / cfg.evaluations_dir
if not entity_slug and not chapter and output_dir.is_dir(): if not force and output_dir.is_dir():
previous_digests = { existing = {p.stem for p in output_dir.glob("*.md")}
p.stem: "" # non-empty sentinel → triggers skip in BatchEvaluator before = len(entity_list)
for p in output_dir.glob("*.md") entity_list = [e for e in entity_list if e.slug not in existing]
} skipped = before - len(entity_list)
entity_list = [e for e in entity_list if e.slug not in previous_digests]
if not entity_list: if not entity_list:
click.echo("All entities already evaluated. Nothing to do.") click.echo("All selected entities already evaluated. "
"Re-run with --force to overwrite.")
return return
if previous_digests: if skipped:
click.echo(f"Skipping {len(previous_digests)} already-evaluated entities.") click.echo(f"Skipping {skipped} already-evaluated entities. "
"Use --force to re-evaluate.")
# Create adapter # Create adapter
from markitect.llm import create_adapter from markitect.llm import create_adapter
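
The new resume behaviour reduces to two steps: normalize a hyphenated `--entity` value to the underscore slug format, then drop slugs that already have an evaluation file unless `--force` is set. A minimal sketch over plain strings, assuming hypothetical helper names (`normalize_slug`, `select_pending`):

```python
from typing import List, Set, Tuple


def normalize_slug(user_input: str) -> str:
    """Accept hyphenated input by mapping it to underscore slugs."""
    return user_input.replace("-", "_")


def select_pending(
    slugs: List[str], existing: Set[str], force: bool
) -> Tuple[List[str], int]:
    """Return (slugs still to evaluate, number skipped)."""
    if force:
        # --force re-evaluates everything that was selected.
        return list(slugs), 0
    pending = [s for s in slugs if s not in existing]
    return pending, len(slugs) - len(pending)
```

The skipped count is taken from the filtered selection, which is what lets the message stay accurate for `--entity` and `--chapter` runs as well as full passes.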
@@ -285,10 +526,14 @@ def evaluate(config_path, provider, model, entity_slug, chapter):
adapter = create_adapter(provider, model=model) adapter = create_adapter(provider, model=model)
run_config = RunConfig(model_name=model, temperature=0.3, max_tokens=2000) run_config = RunConfig(model_name=model, temperature=0.3, max_tokens=2000)
# Progress callback # Progress callback — surface error detail so agents don't have to
# drop into Python to see whether an ERROR was 429, 503, or auth.
def on_progress(done, total, result): def on_progress(done, total, result):
status = result.status.upper() status = result.status.upper()
click.echo(f" [{done}/{total}] {result.key}: {status}") if status == "ERROR" and result.error:
click.echo(f" [{done}/{total}] {result.key}: ERROR — {result.error}")
else:
click.echo(f" [{done}/{total}] {result.key}: {status}")
click.echo(f"Evaluating {len(entity_list)} entities via {provider}...") click.echo(f"Evaluating {len(entity_list)} entities via {provider}...")
@@ -301,6 +546,42 @@ def evaluate(config_path, provider, model, entity_slug, chapter):
progress_callback=on_progress, progress_callback=on_progress,
) )
# Model fallback: if any entities failed with a rate-limit-looking
# error and the user opted in with --model-fallback, retry them once
# with a fresh adapter on the fallback model. Different free-tier
# models have separate quota buckets, so this often succeeds when
# the primary is exhausted.
if model_fallback and summary.failed > 0:
rate_limited = [
r for r in summary.results
if r.status == "error"
and r.error
and ("429" in r.error or "rate" in r.error.lower())
]
if rate_limited:
retry_slugs = {r.key for r in rate_limited}
retry_entities = [e for e in entity_list if e.slug in retry_slugs]
click.echo(
f"\n{len(retry_entities)} rate-limited entities — "
f"retrying with --model-fallback {model_fallback}..."
)
fb_adapter = create_adapter(provider, model=model_fallback)
fb_run_config = RunConfig(
model_name=model_fallback, temperature=0.3, max_tokens=2000
)
fb_summary = run_entity_evaluation(
config=cfg,
entities=retry_entities,
adapter=fb_adapter,
run_config=fb_run_config,
output_dir=output_dir,
progress_callback=on_progress,
)
summary.succeeded += fb_summary.succeeded
summary.failed = (summary.failed - len(retry_entities)) + fb_summary.failed
summary.total_prompt_tokens += fb_summary.total_prompt_tokens
summary.total_completion_tokens += fb_summary.total_completion_tokens
click.echo(f"\nDone: {summary.succeeded} succeeded, {summary.failed} failed, {summary.skipped} skipped") click.echo(f"\nDone: {summary.succeeded} succeeded, {summary.failed} failed, {summary.skipped} skipped")
if summary.total_tokens > 0: if summary.total_tokens > 0:
click.echo(f"Tokens used: {summary.total_tokens}") click.echo(f"Tokens used: {summary.total_tokens}")
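
The fallback pass selects results whose error text looks like a rate limit and then folds the retry outcome back into the summary. The sketch below reproduces that selection and arithmetic on a hypothetical `Result` stand-in (the real result type is not shown in this diff):

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass
class Result:
    """Hypothetical stand-in for one batch result."""
    key: str
    status: str
    error: Optional[str] = None


def looks_rate_limited(r: Result) -> bool:
    """Same heuristic as the diff: '429' or 'rate' anywhere in the error."""
    return bool(
        r.status == "error"
        and r.error
        and ("429" in r.error or "rate" in r.error.lower())
    )


def retry_keys(results: Iterable[Result]) -> Set[str]:
    return {r.key for r in results if looks_rate_limited(r)}


def merged_failed(primary_failed: int, retried: int, fallback_failed: int) -> int:
    """Failures after the retry: those not retried plus those that failed again."""
    return (primary_failed - retried) + fallback_failed
```

One property of the substring heuristic worth knowing: `"rate"` also occurs inside `"generate"`, so an unrelated error that mentions `generateContent` would be retried too. Matching on `"rate limit"` or on an exception type would be narrower, though that is a suggestion and not something the diff does.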
@@ -1015,6 +1296,18 @@ def disciplines(config_path: Optional[str]):
help="Run collection checks (C1–C5) after each source file.", help="Run collection checks (C1–C5) after each source file.",
) )
@click.option("--no-commit", is_flag=True, help="Skip git commits.") @click.option("--no-commit", is_flag=True, help="Skip git commits.")
@click.option(
"--eval-after-source",
is_flag=True,
help="After each source's stages succeed, evaluate just the newly-"
"added entities so the per-source commit is self-contained.",
)
@click.option(
"--classify-after-source",
is_flag=True,
help="After each source's stages succeed, classify just the newly-"
"added entities so the per-source commit is self-contained.",
)
def process( def process(
glob_pattern: Optional[str], glob_pattern: Optional[str],
process_all: bool, process_all: bool,
@@ -1023,6 +1316,8 @@ def process(
model: Optional[str], model: Optional[str],
check_after_each: bool, check_after_each: bool,
no_commit: bool, no_commit: bool,
eval_after_source: bool,
classify_after_source: bool,
): ):
"""Process source files through the pipeline defined in infospace.yaml. """Process source files through the pipeline defined in infospace.yaml.
@@ -1096,12 +1391,22 @@ def process(
# Run pipeline # Run pipeline
from markitect.infospace.pipeline import SourcePipeline from markitect.infospace.pipeline import SourcePipeline
if (eval_after_source or classify_after_source) and adapter is None:
click.echo(
"Error: --eval-after-source / --classify-after-source require "
"--provider (they call the LLM).",
err=True,
)
raise SystemExit(1)
pipeline = SourcePipeline( pipeline = SourcePipeline(
cfg, root, cfg, root,
adapter=adapter, adapter=adapter,
provider=provider or "", provider=provider or "",
model=(model or _PROVIDER_DEFAULTS.get(provider or "", "")) if provider else "", model=(model or _PROVIDER_DEFAULTS.get(provider or "", "")) if provider else "",
no_commit=no_commit, no_commit=no_commit,
eval_after_source=eval_after_source,
classify_after_source=classify_after_source,
) )
total = len(source_files) total = len(source_files)


@@ -195,12 +195,23 @@ def run_entity_evaluation(
""" """
topic = config.topic.name topic = config.topic.name
evaluations_path = output_dir or Path(config.evaluations_dir) evaluations_path = output_dir or Path(config.evaluations_dir)
evaluator_name = (run_config.model_name if run_config else "unknown") # Fall back from run_config.model_name (may be None if the CLI user did
# not pass --model) to the adapter's resolved model, and only then to
# "unknown". Keeps the evaluator field in the written frontmatter
# informative for later audits.
default_evaluator = (
(run_config.model_name if run_config else None)
or getattr(adapter, "_model", None)
or "unknown"
)
def _write_and_notify(done: int, total: int, result) -> None: def _write_and_notify(done: int, total: int, result) -> None:
# Write file immediately on success (incremental — run is resumable) # Write file immediately on success (incremental — run is resumable)
if result.status == "success" and result.response is not None: if result.status == "success" and result.response is not None:
scores = parse_evaluation_response(result.response.content, dimensions) scores = parse_evaluation_response(result.response.content, dimensions)
# Prefer the model name the adapter actually echoed back — it
# reflects post-resolution fallbacks (e.g. flash → flash-lite).
evaluator_name = result.response.model or default_evaluator
evaluation = EntityEvaluation( evaluation = EntityEvaluation(
entity_slug=result.key, entity_slug=result.key,
evaluator=evaluator_name, evaluator=evaluator_name,
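
The evaluator name is now resolved through a chain of fallbacks: the model echoed in the response, then `run_config.model_name`, then the adapter's `_model` attribute, then `"unknown"`. A minimal sketch of that chain, using `SimpleNamespace` objects as assumed stand-ins for the real config, adapter, and response types:

```python
from types import SimpleNamespace
from typing import Any, Optional


def default_evaluator(run_config: Optional[Any], adapter: Any) -> str:
    """Fallback used when the response does not name a model."""
    return (
        (run_config.model_name if run_config else None)
        or getattr(adapter, "_model", None)
        or "unknown"
    )


def evaluator_for(response_model: Optional[str], fallback: str) -> str:
    """Prefer the model name echoed back in the response."""
    return response_model or fallback


# Example: no --model on the CLI, adapter resolved its own default.
cfg = SimpleNamespace(model_name=None)
adapter = SimpleNamespace(_model="gemini-2.5-flash")
name = evaluator_for(None, default_evaluator(cfg, adapter))
```

Since the chain uses `or`, an empty string is treated the same as `None` at every step.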


@@ -81,17 +81,26 @@ def snapshot_from_checks(
# ── Metrics file I/O ──────────────────────────────────────────────── # ── Metrics file I/O ────────────────────────────────────────────────
def write_metrics_file(metrics: Dict[str, float], path: Path) -> None: def write_metrics_file(metrics: Dict[str, Any], path: Path) -> None:
"""Write the latest metrics to a simple YAML file. """Write the latest metrics to a simple YAML file.
This file is used by ``markitect infospace viability`` for quick This file is used by ``markitect infospace viability`` for quick
threshold checking. threshold checking. Non-numeric values (e.g. ``type_distribution``)
are passed through unchanged; floats are rounded to 6 dp; ints are
preserved as ints so external consumers don't see ``29`` silently
become ``29.0`` on every round-trip.
""" """
def _normalize(v: Any) -> Any:
if isinstance(v, bool):
return v
if isinstance(v, float):
return round(v, 6)
return v
path.parent.mkdir(parents=True, exist_ok=True) path.parent.mkdir(parents=True, exist_ok=True)
path.write_text( path.write_text(
yaml.safe_dump( yaml.safe_dump(
{k: round(v, 6) if isinstance(v, float) else v {k: _normalize(v) for k, v in sorted(metrics.items())},
for k, v in sorted(metrics.items())},
default_flow_style=False, default_flow_style=False,
sort_keys=True, sort_keys=True,
), ),
@@ -99,14 +108,20 @@ def write_metrics_file(metrics: Dict[str, float], path: Path) -> None:
) )
def read_metrics_file(path: Path) -> Dict[str, float]: def read_metrics_file(path: Path) -> Dict[str, Any]:
"""Read the latest metrics from a YAML file.""" """Read the latest metrics from a YAML file.
Returns all keys as written on disk, preserving types verbatim so a
round-trip via :func:`write_metrics_file` does not silently drop
structured values (e.g. ``type_distribution``) or flatten ints to
floats.
"""
if not path.is_file(): if not path.is_file():
return {} return {}
raw = yaml.safe_load(path.read_text(encoding="utf-8")) raw = yaml.safe_load(path.read_text(encoding="utf-8"))
if not isinstance(raw, dict): if not isinstance(raw, dict):
return {} return {}
return {k: float(v) for k, v in raw.items() if isinstance(v, (int, float))} return raw
# ── History operations ─────────────────────────────────────────────── # ── History operations ───────────────────────────────────────────────
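
The effect of the type-preserving change can be seen without touching YAML at all. The sketch below compares the new `_normalize` rule with the old read-side filter on an in-memory dict; the metric names and values are illustrative, and `old_read_filter` is a hypothetical name for the comprehension that was removed:

```python
from typing import Any, Dict


def normalize(v: Any) -> Any:
    """New write-side rule: round floats, pass everything else through."""
    if isinstance(v, bool):
        return v
    if isinstance(v, float):
        return round(v, 6)
    return v


def normalize_metrics(metrics: Dict[str, Any]) -> Dict[str, Any]:
    return {k: normalize(v) for k, v in sorted(metrics.items())}


def old_read_filter(raw: Dict[str, Any]) -> Dict[str, float]:
    """Old read-side behaviour: keep numbers only and coerce them to float."""
    return {k: float(v) for k, v in raw.items() if isinstance(v, (int, float))}


metrics = {
    "per_entity_mean": 3.9566801234,
    "entity_count": 29,
    "type_distribution": {"Element": 315},
}
new = normalize_metrics(metrics)
old = old_read_filter(metrics)
```

With the old filter, `type_distribution` disappears and `entity_count` comes back as a float; with the new rule both survive unchanged and only the float is rounded.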


@@ -62,6 +62,8 @@ class SourcePipeline:
provider: str = "", provider: str = "",
model: str = "", model: str = "",
no_commit: bool = False, no_commit: bool = False,
eval_after_source: bool = False,
classify_after_source: bool = False,
) -> None: ) -> None:
self.config = config self.config = config
self.root = root self.root = root
@@ -69,6 +71,8 @@ class SourcePipeline:
self.provider = provider self.provider = provider
self.model = model self.model = model
self.no_commit = no_commit self.no_commit = no_commit
self.eval_after_source = eval_after_source
self.classify_after_source = classify_after_source
# ── Public API ──────────────────────────────────────────────────── # ── Public API ────────────────────────────────────────────────────
@@ -110,6 +114,12 @@ class SourcePipeline:
stage_outputs: Dict[str, str] = {} stage_outputs: Dict[str, str] = {}
stage_logs: List[Dict[str, Any]] = [] stage_logs: List[Dict[str, Any]] = []
# Snapshot entity slugs before any stage runs so we can identify
# which entities were newly produced by this source. Used to scope
# --eval-after-source / --classify-after-source to only the new
# entities.
pre_entity_slugs = self._current_entity_slugs()
print(f"\nProcessing: {source_id}") print(f"\nProcessing: {source_id}")
print("=" * 60) print("=" * 60)
@@ -133,6 +143,14 @@ class SourcePipeline:
print(f"\n {source_id}: all stages complete.") print(f"\n {source_id}: all stages complete.")
self._write_processing_log(source_id, stage_logs, success=True) self._write_processing_log(source_id, stage_logs, success=True)
# Per-source follow-ups: evaluate and/or classify just the new
# entities this source produced, so the next commit contains a
# fully-processed chapter.
new_slugs = self._current_entity_slugs() - pre_entity_slugs
if new_slugs and (self.eval_after_source or self.classify_after_source):
self._run_per_source_followups(new_slugs)
if not self.no_commit: if not self.no_commit:
self._git_commit(source_id) self._git_commit(source_id)
@@ -636,7 +654,13 @@ class SourcePipeline:
# ── Git Integration ─────────────────────────────────────────────── # ── Git Integration ───────────────────────────────────────────────
def _git_commit(self, source_id: str) -> None: def _git_commit(self, source_id: str) -> None:
"""Stage all output changes and commit them for *source_id*.""" """Stage all output changes and commit them for *source_id*.
The commit message body summarises what actually changed — counts
of entities / evaluations / classifications / analyses added — so
``git log`` reads like the chapter-by-chapter story of the
infospace growing, not a wall of identical messages.
"""
output_dir = self.root / "output" output_dir = self.root / "output"
try: try:
subprocess.run( subprocess.run(
@@ -645,11 +669,11 @@ class SourcePipeline:
check=True, check=True,
capture_output=True, capture_output=True,
) )
body = self._compose_commit_body(source_id)
result = subprocess.run( result = subprocess.run(
[ [
"git", "commit", "-m", "git", "commit", "-m",
f"infospace: process {source_id}\n\n" f"infospace: process {source_id}\n\n{body}",
f"Extract entities, map to VSM, and synthesize analysis.",
], ],
cwd=str(self.root), cwd=str(self.root),
capture_output=True, capture_output=True,
@@ -666,3 +690,146 @@ class SourcePipeline:
except subprocess.CalledProcessError as e: except subprocess.CalledProcessError as e:
stderr = e.stderr.decode() if isinstance(e.stderr, bytes) else (e.stderr or "") stderr = e.stderr.decode() if isinstance(e.stderr, bytes) else (e.stderr or "")
print(f" Warning: Git error: {stderr.strip()}") print(f" Warning: Git error: {stderr.strip()}")
# ── Per-source helpers ────────────────────────────────────────────
def _current_entity_slugs(self) -> set:
"""Return the set of entity file stems currently on disk."""
entities_dir = self.root / self.config.entities_dir
if not entities_dir.is_dir():
return set()
return {p.stem for p in entities_dir.glob("*.md")}
def _run_per_source_followups(self, new_slugs: set) -> None:
"""Run per-source evaluation and/or classification on *new_slugs*.
Called after a source's pipeline stages succeed, before the git
commit, so each chapter's commit contains the full set of
artefacts derived from it.
"""
from markitect.infospace.entity_parser import parse_entity_directory
entities_dir = self.root / self.config.entities_dir
all_entities = parse_entity_directory(entities_dir)
new_entities = [e for e in all_entities if e.slug in new_slugs]
if not new_entities:
return
if self.adapter is None:
print(
" Skipping per-source eval/classify: no LLM adapter "
"configured (run with --provider)."
)
return
from markitect.prompts.execution.models import RunConfig
run_config = RunConfig(
model_name=self.model or None, temperature=0.3, max_tokens=2000
)
if self.eval_after_source:
from markitect.infospace.evaluate import run_entity_evaluation
print(f" Evaluating {len(new_entities)} new entity/entities…")
try:
run_entity_evaluation(
config=self.config,
entities=new_entities,
adapter=self.adapter,
run_config=run_config,
output_dir=self.root / self.config.evaluations_dir,
)
except Exception as exc:
print(f" Warning: per-source evaluation failed: {exc}")
if self.classify_after_source:
from markitect.infospace.classifier import run_entity_classification
print(f" Classifying {len(new_entities)} new entity/entities…")
try:
run_entity_classification(
config=self.config,
entities=new_entities,
adapter=self.adapter,
run_config=run_config,
output_dir=self.root / self.config.classifications_dir,
)
except Exception as exc:
print(f" Warning: per-source classification failed: {exc}")
def _compose_commit_body(self, source_id: str) -> str:
"""Summarise staged output changes into a commit-message body.
Counts added files per output subdirectory (entities, evaluations,
classifications, analyses, mappings…) and produces one line per
bucket that actually saw additions. Modified/deleted files are
noted separately for auditability.
"""
default = "Extract entities, map to VSM, and synthesize analysis."
try:
result = subprocess.run(
["git", "diff", "--cached", "--name-status", "--", "output"],
cwd=str(self.root),
check=True,
capture_output=True,
text=True,
)
except subprocess.CalledProcessError:
return default
added_by_bucket: Dict[str, int] = {}
modified = 0
deleted = 0
for line in result.stdout.splitlines():
parts = line.split("\t")
if len(parts) < 2:
continue
status = parts[0]
path = parts[-1]
if status.startswith("A"):
bucket = self._bucket_for(path)
if bucket:
added_by_bucket[bucket] = added_by_bucket.get(bucket, 0) + 1
elif status.startswith("M"):
modified += 1
elif status.startswith("D"):
deleted += 1
if not added_by_bucket and not modified and not deleted:
return default
# Emit buckets in a deterministic, reader-friendly order.
order = ["entities", "mappings", "analyses", "evaluations",
"classifications", "metrics", "logs", "other"]
lines: List[str] = []
for bucket in order:
n = added_by_bucket.get(bucket, 0)
if n:
lines.append(f"- {bucket}: +{n}")
if modified:
lines.append(f"- modified: {modified}")
if deleted:
lines.append(f"- deleted: {deleted}")
return "\n".join(lines) if lines else default
def _bucket_for(self, path: str) -> Optional[str]:
"""Map an ``output/...`` path to a commit-summary bucket name."""
# Use configured directory basenames where possible so non-default
# layouts still bucket correctly.
buckets = {
Path(self.config.entities_dir).name: "entities",
Path(self.config.evaluations_dir).name: "evaluations",
Path(self.config.classifications_dir).name: "classifications",
}
parts = Path(path).parts
if len(parts) < 2 or parts[0] != "output":
return None
sub = parts[1]
if sub in buckets:
return buckets[sub]
# Heuristic fallback for common additional output subdirectories.
known = {"mappings", "analyses", "metrics", "logs"}
if sub in known:
return sub
return "other"
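
The commit-body logic is easiest to follow on a captured `git diff --cached --name-status` string. The sketch below re-implements `_compose_commit_body` and `_bucket_for` as free functions so they can run without a repository; the `configured` mapping of directory basenames is an assumption standing in for the values read from the infospace config:

```python
from pathlib import PurePosixPath
from typing import Dict, Optional

KNOWN = {"mappings", "analyses", "metrics", "logs"}
ORDER = ["entities", "mappings", "analyses", "evaluations",
         "classifications", "metrics", "logs", "other"]


def bucket_for(path: str, configured: Dict[str, str]) -> Optional[str]:
    """Map an output/... path to a summary bucket, or None if outside output/."""
    parts = PurePosixPath(path).parts
    if len(parts) < 2 or parts[0] != "output":
        return None
    sub = parts[1]
    if sub in configured:
        return configured[sub]
    return sub if sub in KNOWN else "other"


def compose_body(name_status: str, configured: Dict[str, str], default: str) -> str:
    """Summarise name-status lines: added files per bucket, then M and D counts."""
    added: Dict[str, int] = {}
    modified = deleted = 0
    for line in name_status.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:
            continue
        status, path = parts[0], parts[-1]
        if status.startswith("A"):
            bucket = bucket_for(path, configured)
            if bucket:
                added[bucket] = added.get(bucket, 0) + 1
        elif status.startswith("M"):
            modified += 1
        elif status.startswith("D"):
            deleted += 1
    lines = [f"- {b}: +{added[b]}" for b in ORDER if added.get(b)]
    if modified:
        lines.append(f"- modified: {modified}")
    if deleted:
        lines.append(f"- deleted: {deleted}")
    return "\n".join(lines) if lines else default
```

As in the diff, only statuses starting with `A`, `M`, or `D` are counted, so a rename line such as `R100` contributes nothing to the summary.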


@@ -131,6 +131,12 @@ def build_state(
This is a convenience function that assembles the state object This is a convenience function that assembles the state object
and optionally runs viability checks if *metrics* are provided. and optionally runs viability checks if *metrics* are provided.
""" """
if not isinstance(config, InfospaceConfig):
raise TypeError(
f"build_state(config=...) expects an InfospaceConfig instance, "
f"got {type(config).__name__}. If you have a path, load the "
f"config first with load_infospace_config(path)."
)
state = InfospaceState( state = InfospaceState(
config=config, config=config,
entities=entities or [], entities=entities or [],


@@ -9,7 +9,11 @@ from markitect.llm.adapter import LLMAdapter
 from markitect.llm.models import RunConfig, LLMResponse
 from markitect.llm.config import resolve_api_key, find_project_root
 from markitect.llm._http import post_json
-from markitect.llm.exceptions import LLMConfigurationError
+from markitect.llm.exceptions import (
+    LLMConfigurationError,
+    LLMAPIError,
+    LLMRateLimitError,
+)
 _DEFAULT_MODEL = "gemini-2.5-flash"
 _API_BASE = "https://generativelanguage.googleapis.com/v1beta"
@@ -26,10 +30,12 @@ class GeminiAdapter(LLMAdapter):
         model: Optional[str] = None,
         api_key: Optional[str] = None,
         system_prompt: Optional[str] = None,
+        max_retries: int = 3,
         **_kwargs: Any,
     ):
         self._model = model or _DEFAULT_MODEL
         self._system_prompt = system_prompt
+        self._max_retries = max_retries
         root = find_project_root()
         key_file_paths = [root / "apikey-geminifree.txt"] if root else []
@@ -77,7 +83,7 @@ class GeminiAdapter(LLMAdapter):
         url = f"{_API_BASE}/models/{model}:generateContent?key={self._api_key}"
         start = time.time()
-        data = post_json(url, payload, timeout=config.timeout_seconds)
+        data = self._post_with_retries(url, payload, timeout=config.timeout_seconds)
         latency = time.time() - start
         # Parse Gemini response
@@ -113,3 +119,27 @@ class GeminiAdapter(LLMAdapter):
         if not (0.0 <= config.temperature <= 2.0):
             return False
         return True
+    # ── Internals ───────────────────────────────────────────────────
+    def _post_with_retries(
+        self,
+        url: str,
+        payload: Dict[str, Any],
+        timeout: int,
+    ) -> Dict[str, Any]:
+        last_exc: Optional[Exception] = None
+        for attempt in range(self._max_retries + 1):
+            try:
+                return post_json(url, payload, timeout=timeout)
+            except LLMRateLimitError as exc:
+                last_exc = exc
+                if attempt < self._max_retries:
+                    time.sleep(2 ** attempt)
+            except LLMAPIError as exc:
+                if exc.status_code in (502, 503, 504) and attempt < self._max_retries:
+                    last_exc = exc
+                    time.sleep(2 ** attempt)
+                else:
+                    raise
+        raise last_exc  # type: ignore[misc]
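The retry policy in `_post_with_retries` (retry on rate limits and on 502/503/504, with delays of 1, 2, 4, ... seconds) can be sketched without the adapter. Everything named below is illustrative: `APIError` and `RateLimitError` are stand-ins for the project's exceptions, their subclass relationship is an assumption, and `call_with_retries` with its injectable `sleep` is a hypothetical helper, not the project's API. It assumes `max_retries >= 0`.

```python
import time
from typing import Any, Callable, Optional


class APIError(Exception):
    """Stand-in for LLMAPIError (assumed to carry a status_code)."""

    def __init__(self, message: str, status_code: int):
        super().__init__(message)
        self.status_code = status_code


class RateLimitError(APIError):
    """Stand-in for LLMRateLimitError (assumed subclass of the API error)."""


RETRYABLE_STATUS = (502, 503, 504)


def call_with_retries(
    send: Callable[[], Any],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send(); retry transient failures with exponential backoff."""
    last_exc: Optional[Exception] = None
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimitError as exc:
            last_exc = exc
        except APIError as exc:
            if exc.status_code not in RETRYABLE_STATUS:
                raise  # client errors such as 400 are never retried
            last_exc = exc
        if attempt < max_retries:
            sleep(2 ** attempt)  # 1s, 2s, 4s, ...
    assert last_exc is not None  # loop ran at least once and every try failed
    raise last_exc
```

Injecting `sleep` keeps tests fast, which plays the same role as patching `time.sleep` in the test module of this change set. With `max_retries=3`, the worst case waits 1 + 2 + 4 = 7 seconds before the last error is re-raised.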

4
package-lock.json generated

@@ -1,11 +1,11 @@
 {
-  "name": "markitect_project",
+  "name": "markitect-main",
   "version": "1.0.0",
   "lockfileVersion": 3,
   "requires": true,
   "packages": {
     "": {
-      "name": "markitect_project",
+      "name": "markitect-main",
       "version": "1.0.0",
       "license": "ISC",
       "dependencies": {

@@ -1,5 +1,5 @@
 {
-  "name": "markitect_project",
+  "name": "markitect-main",
   "version": "1.0.0",
   "description": "",
   "main": "index.js",
@@ -14,7 +14,7 @@
   },
   "repository": {
     "type": "git",
-    "url": "http://92.205.130.254:32166/coulomb/markitect_project"
+    "url": "http://92.205.130.254:32166/coulomb/markitect-main"
   },
   "keywords": [],
   "author": "",

12
registry/README.md Normal file

@@ -0,0 +1,12 @@
# Capability Registry
Markdown-first capability index for federation and reuse planning.
## Authoring
1. Copy a capability entry template (see reuse-surface `templates/capability-entry.template.md`).
2. Add the row to `indexes/capabilities.yaml`.
3. Run `reuse-surface validate` from a checkout with the CLI installed.
4. Merge to `main` and verify publish with `reuse-surface establish --publish-check`.
Federation contract: reuse-surface `docs/RegistryFederation.md`.

@@ -0,0 +1,4 @@
version: 1
updated: '2026-06-16'
domain: helix_forge
capabilities: []

@@ -10,6 +10,14 @@ and formally closes the roadmap.
 **Parent roadmap:** `roadmap/infospace-tooling/PLAN.md`
 **Example location:** `examples/infospace-with-history/`
+**Status: CLOSED (2026-04-22).** All acceptance criteria except the cosmetic
+per-chapter history (C.7) are met. Final metrics: 988 entities, 988 evaluations,
+6/6 viability thresholds PASS (`per_entity_mean = 3.957`). Tooling work that
+came out of this close-out landed as commits `c0615c2d` (gemini retry,
+unified skip-existing, non-destructive metrics I/O) and `d44a4cd3`
+(`infospace entity` lookup, `evaluate --model-fallback`, `llm-check`
+stale-key advisory, `build_state` type guard).
 ### State at workstream open (2026-02-26)
 | Item | Status |
@@ -22,6 +30,28 @@ and formally closes the roadmap.
 | 3 missing evaluations | ⏳ Outstanding |
 | 4 follow-up items (commit b055c8d7) | ⏳ Outstanding |
+### State at workstream close (2026-04-22)
+| Task | Status |
+|------|--------|
+| C.1 Complete 3 missing entity evaluations | ✅ Done (commit f325f89d) |
+| C.2 Run eval-summary and verify viability | ✅ Done — 6/6 PASS |
+| C.3 Refresh metrics report (988 entities) | ✅ Done — snapshot `090bb961` |
+| C.4 Document advanced usage patterns | ✅ Done — `examples/infospace-with-history/docs/advanced-usage.md` |
+| C.5 Composition-examples documentation | ✅ Done — `docs/composition-guide.md` |
+| C.6 Performance benchmarking note | ✅ Done — `examples/infospace-with-history/docs/performance-notes.md` |
+| C.7 Clean per-chapter git history | ⏭️ Deferred indefinitely — see note below |
+| C.8 Formally close S3 roadmap | ✅ This commit |
+**C.7 disposition.** The task assumed a pre-existing `clean-example-history`
+branch with chapters 1–8 already committed; that branch no longer exists in
+the repo. The task is explicitly cosmetic ("does not change output files"),
+and the output files themselves are canonical. Reconstructing a 35-commit
+per-chapter history from scratch would be archaeological rather than useful.
+Closing as "won't do" unless a specific archival need surfaces. If revisited,
+entities can be grouped by their `## Source Chapter` markdown section to
+reconstruct chapter membership.
 ---
 ## Tasks
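The C.7 disposition above mentions grouping entities by their `## Source Chapter` section. A minimal sketch of that grouping follows, assuming the first non-blank line under the heading names the chapter (the shape used by the test fixtures in this change set). The helper names `source_chapter` and `group_by_chapter` are hypothetical, and entities are passed as an in-memory mapping rather than read from `output/entities/`.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def source_chapter(markdown: str) -> Optional[str]:
    """Return the first non-blank line under '## Source Chapter', if any."""
    lines = markdown.splitlines()
    for i, line in enumerate(lines):
        if line.strip() != "## Source Chapter":
            continue
        for body in lines[i + 1:]:
            if body.startswith("#"):
                return None  # next heading reached: the section is empty
            if body.strip():
                return body.strip()
    return None


def group_by_chapter(entities: Dict[str, str]) -> Dict[str, List[str]]:
    """Group entity names by the chapter named in their markdown text."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, text in sorted(entities.items()):
        groups[source_chapter(text) or "(unknown)"].append(name)
    return dict(groups)


entities = {
    "alpha": "# Alpha\n\n## Definition\n\nX.\n\n## Source Chapter\n\nBook I, Chapter 1\n",
    "beta": "# Beta\n\n## Definition\n\nY.\n\n## Source Chapter\n\nBook I, Chapter 2\n",
    "gamma": "# Gamma\n\n## Definition\n\nZ.\n\n## Source Chapter\n\nBook I, Chapter 2\n",
}
print(group_by_chapter(entities))
```

Entities without the section land in an `(unknown)` group, so any gaps in chapter membership would be visible before attempting a per-chapter history.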

@@ -1,5 +1,31 @@
 # Viable Infospace Tooling — Roadmap
+## Status: CLOSED (2026-04-22)
+All three stages complete.
+| Stage | Status | Notes |
+|-------|--------|-------|
+| Stage 1 — Platform additions (S1.1–S1.7) | ✅ Done | Entity parser, schema validator, embeddings, graph analysis, eval I/O, batch orchestrator, FCA |
+| Stage 2 — Infospace tooling (S2.1–S2.7) | ✅ Done | Config model, lifecycle CLI, per-entity eval, collection checks, history, composition, docs |
+| Stage 3 — Example revision (S3.1–S3.5) | ✅ Done (except cosmetic S3.2) | See `roadmap/infospace-s3-closeout/PLAN.md` |
+**Final validation (Wealth of Nations / VSM example, 988 entities):**
+- 988 per-entity evaluations landed
+- Collection checks pass 6/6 viability thresholds (`per_entity_mean = 3.957`
+against threshold 3.5; `redundancy_ratio = 0.006`; `coverage_ratio = 0.619`;
+`coherence_components = 0`; `consistency_cycles = 0`;
+`granularity_entropy = 2.675`)
+- Composition demonstrated via `examples/supply-chain-vsm/`
+- S3.2 (clean per-chapter git history) deferred as cosmetic-only; rationale
+in the close-out plan
+See `roadmap/infospace-s3-closeout/PLAN.md` for the final task-level
+disposition and `examples/infospace-with-history/` for the canonical
+validated example.
+---
 ## Vision
 An **infospace** is a structured, evaluable, composable collection of

@@ -39,7 +39,7 @@ Confirm the main Markitect application still works correctly with the current
 capability code before publishing.
 ```bash
-cd /home/worsch/markitect_project
+cd /home/worsch/markitect-main
 make testdrive-jsui-test-all # 84 tests must pass
 # Manually verify view and edit modes in the running Markitect app
 ```

@@ -30,7 +30,7 @@ class TestActualRoundtripBehavior:
         cmd = ["python", "-m", "markitect.cli"] + args
         result = subprocess.run(
             cmd,
-            cwd="/home/worsch/markitect_project",
+            cwd="/home/worsch/markitect-main",
             capture_output=True,
             text=True
         )

@@ -5,7 +5,7 @@ This test implements the requirements for initializing a SQLite database
 and storing markdown files with front matter parsing.
 Issue #1: Initialize Database and Store Example Markdown File
-https://gitea.coulomb.social/coulomb/markitect_project/issues/1
+https://gitea.coulomb.social/coulomb/markitect-main/issues/1
 """
 import pytest

@@ -33,7 +33,7 @@ class TestRoundtripBase:
             cmd,
             capture_output=True,
             text=True,
-            cwd="/home/worsch/markitect_project"
+            cwd="/home/worsch/markitect-main"
         )
     def validate_basic_structure_preservation(self, original: str, reconstructed: str) -> Dict[str, Any]:

@@ -223,3 +223,129 @@ class TestViabilityCommand:
         )
         assert result.exit_code == 0
         assert "No viability thresholds" in result.output
+# ── chapters (per-source triage view) ────────────────────────────────
+class TestChaptersCommand:
+    @pytest.fixture
+    def chapters_dir(self, tmp_path):
+        """Infospace with 2 source files and matching entities."""
+        config_yaml = """\
+topic:
+  name: "WoN"
+  domain: "Economics"
+  sources: artifacts/sources
+"""
+        (tmp_path / "infospace.yaml").write_text(config_yaml)
+        sources = tmp_path / "artifacts" / "sources"
+        sources.mkdir(parents=True)
+        (sources / "book-1-chapter-01.md").write_text("# Chapter 1\n\nText.\n")
+        (sources / "book-1-chapter-02.md").write_text("# Chapter 2\n\nText.\n")
+        entities = tmp_path / "output" / "entities"
+        entities.mkdir(parents=True)
+        (entities / "alpha.md").write_text(
+            "# Alpha\n\n## Definition\n\nX.\n\n"
+            "## Source Chapter\n\nBook I, Chapter 1\n"
+        )
+        (entities / "beta.md").write_text(
+            "# Beta\n\n## Definition\n\nY.\n\n"
+            "## Source Chapter\n\nBook I, Chapter 2\n"
+        )
+        (entities / "gamma.md").write_text(
+            "# Gamma\n\n## Definition\n\nZ.\n\n"
+            "## Source Chapter\n\nBook I, Chapter 2\n"
+        )
+        return tmp_path
+    def test_lists_sources_with_counts(self, runner, chapters_dir):
+        result = runner.invoke(
+            infospace_commands,
+            ["chapters", "--config", str(chapters_dir / "infospace.yaml")],
+        )
+        assert result.exit_code == 0
+        assert "book-1-chapter-01" in result.output
+        assert "book-1-chapter-02" in result.output
+        # ch 1 -> 1 entity, ch 2 -> 2 entities
+        assert "2 source file(s); 3 entities" in result.output
+    def test_json_format(self, runner, chapters_dir):
+        result = runner.invoke(
+            infospace_commands,
+            ["chapters", "--config", str(chapters_dir / "infospace.yaml"),
+             "--format", "json"],
+        )
+        assert result.exit_code == 0
+        import json
+        rows = json.loads(result.output)
+        by_id = {r["source_id"]: r for r in rows}
+        assert by_id["book-1-chapter-01"]["entities"] == 1
+        assert by_id["book-1-chapter-02"]["entities"] == 2
+    def test_no_sources_dir(self, runner, tmp_path):
+        (tmp_path / "infospace.yaml").write_text(
+            "topic:\n name: X\n sources: missing\n"
+        )
+        result = runner.invoke(
+            infospace_commands,
+            ["chapters", "--config", str(tmp_path / "infospace.yaml")],
+        )
+        assert result.exit_code == 1
+# ── process: eval-after-source / classify-after-source flags ─────────
+class TestProcessAfterSourceFlags:
+    def test_flags_registered_in_help(self, runner):
+        result = runner.invoke(infospace_commands, ["process", "--help"])
+        assert result.exit_code == 0
+        assert "--eval-after-source" in result.output
+        assert "--classify-after-source" in result.output
+    def test_flags_require_provider(self, runner, tmp_path):
+        (tmp_path / "infospace.yaml").write_text(
+            "topic:\n name: X\n sources: sources\n"
+            "pipeline:\n stages:\n - template: extract-entities\n"
+        )
+        sources = tmp_path / "sources"
+        sources.mkdir()
+        (sources / "s1.md").write_text("source")
+        result = runner.invoke(
+            infospace_commands,
+            ["process", "--all",
+             "--config", str(tmp_path / "infospace.yaml"),
+             "--eval-after-source"],
+        )
+        assert result.exit_code == 1
+        assert "require --provider" in result.output
+# ── pipeline: commit body composition ────────────────────────────────
+class TestCommitBodyComposition:
+    def test_bucket_for(self, tmp_path):
+        from markitect.infospace.config import InfospaceConfig, TopicConfig
+        from markitect.infospace.pipeline import SourcePipeline
+        cfg = InfospaceConfig(topic=TopicConfig(name="T", domain="D"))
+        p = SourcePipeline(cfg, tmp_path)
+        assert p._bucket_for("output/entities/x.md") == "entities"
+        assert p._bucket_for("output/evaluations/x.md") == "evaluations"
+        assert p._bucket_for("output/classifications/x.md") == "classifications"
+        assert p._bucket_for("output/mappings/x.md") == "mappings"
+        assert p._bucket_for("output/notes/x.md") == "other"
+        assert p._bucket_for("README.md") is None  # not under output/
+    def test_compose_body_uses_default_on_no_diff(self, tmp_path):
+        """When git diff fails or returns empty, fall back to the default blurb."""
+        from markitect.infospace.config import InfospaceConfig, TopicConfig
+        from markitect.infospace.pipeline import SourcePipeline
+        cfg = InfospaceConfig(topic=TopicConfig(name="T", domain="D"))
+        # Not a git repo, so `git diff --cached` will raise CalledProcessError.
+        p = SourcePipeline(cfg, tmp_path)
+        body = p._compose_commit_body("some-source")
+        assert "Extract entities" in body

@@ -124,6 +124,33 @@ class TestMetricsFileIO:
         path.write_text("just a string", encoding="utf-8")
         assert read_metrics_file(path) == {}
+    def test_round_trip_preserves_structured_values(self, tmp_path):
+        """Non-numeric values like type_distribution must survive a round-trip.
+        Regression: eval-summary --update-metrics used to drop any key
+        whose value wasn't a bare number, silently erasing type_distribution
+        from the file on every run.
+        """
+        path = tmp_path / "metrics.yaml"
+        metrics = {
+            "per_entity_mean": 3.9567,
+            "vsm_type_matrix_cells": 29,
+            "type_distribution": {
+                "Element": 315,
+                "Institution": 122,
+                "Principle": 102,
+            },
+        }
+        write_metrics_file(metrics, path)
+        loaded = read_metrics_file(path)
+        assert loaded["type_distribution"] == {
+            "Element": 315, "Institution": 122, "Principle": 102,
+        }
+        # And the int stayed an int on disk, not 29.0.
+        raw = path.read_text(encoding="utf-8")
+        assert "vsm_type_matrix_cells: 29\n" in raw
+        assert "vsm_type_matrix_cells: 29.0" not in raw
 # ── record_check_results ────────────────────────────────────────────

@@ -0,0 +1,82 @@
"""Tests for markitect.llm.gemini — retry behavior + happy path."""
from unittest import mock
import pytest
from markitect.llm.gemini import GeminiAdapter
from markitect.llm.exceptions import LLMAPIError, LLMRateLimitError
from markitect.prompts.execution.models import RunConfig, LLMResponse
def _api_response(text="hello", model="gemini-2.5-flash"):
    return {
        "candidates": [
            {
                "content": {"parts": [{"text": text}], "role": "model"},
                "finishReason": "STOP",
            }
        ],
        "modelVersion": model,
        "usageMetadata": {
            "promptTokenCount": 3,
            "candidatesTokenCount": 2,
            "totalTokenCount": 5,
        },
    }
class TestGeminiAdapter:
    def _adapter(self, **kwargs):
        defaults = {"api_key": "AIza-test"}
        defaults.update(kwargs)
        return GeminiAdapter(**defaults)
    @mock.patch("markitect.llm.gemini.post_json")
    def test_success(self, mock_post):
        mock_post.return_value = _api_response("generated")
        adapter = self._adapter()
        resp = adapter.execute_prompt("hi", RunConfig())
        assert isinstance(resp, LLMResponse)
        assert resp.content == "generated"
        assert resp.metadata["provider"] == "gemini"
    @mock.patch("markitect.llm.gemini.post_json")
    @mock.patch("markitect.llm.gemini.time.sleep")
    def test_retry_on_429(self, mock_sleep, mock_post):
        mock_post.side_effect = [
            LLMRateLimitError("rate limited", status_code=429),
            _api_response("recovered"),
        ]
        adapter = self._adapter(max_retries=2)
        resp = adapter.execute_prompt("hi", RunConfig())
        assert resp.content == "recovered"
        assert mock_sleep.call_count == 1
    @mock.patch("markitect.llm.gemini.post_json")
    @mock.patch("markitect.llm.gemini.time.sleep")
    def test_retry_on_503(self, mock_sleep, mock_post):
        mock_post.side_effect = [
            LLMAPIError("unavailable", status_code=503),
            _api_response("back"),
        ]
        adapter = self._adapter(max_retries=2)
        resp = adapter.execute_prompt("hi", RunConfig())
        assert resp.content == "back"
    @mock.patch("markitect.llm.gemini.post_json")
    def test_no_retry_on_400(self, mock_post):
        mock_post.side_effect = LLMAPIError("bad request", status_code=400)
        adapter = self._adapter(max_retries=2)
        with pytest.raises(LLMAPIError) as exc_info:
            adapter.execute_prompt("hi", RunConfig())
        assert exc_info.value.status_code == 400
    @mock.patch("markitect.llm.gemini.post_json")
    @mock.patch("markitect.llm.gemini.time.sleep")
    def test_exhausted_retries_raises(self, mock_sleep, mock_post):
        mock_post.side_effect = LLMRateLimitError("rate limited", status_code=429)
        adapter = self._adapter(max_retries=1)
        with pytest.raises(LLMRateLimitError):
            adapter.execute_prompt("hi", RunConfig())
        assert mock_sleep.call_count == 1  # 1 retry before giving up

@@ -0,0 +1,67 @@
---
id: MARKITECT-WP-0001
type: workplan
title: "Bootstrap State Hub integration"
domain: communication
repo: markitect-main
status: finished
owner: codex
topic_slug: communication
created: "2026-06-22"
updated: "2026-06-22"
state_hub_workstream_id: "dfc40b03-fe8e-49fe-b8d4-86eb1fe26b4a"
---
# Bootstrap State Hub integration
Knowledge artifact management and markdown engine platform.
## Review Generated Integration Files
```task
id: MARKITECT-WP-0001-T01
status: done
priority: high
state_hub_task_id: "7455a381-a93d-4220-8f80-3b6ccf953cff"
```
Result 2026-06-22: SCOPE.md and INTRODUCTION.md reviewed; AGENTS.md confirmed.
Review `INTENT.md`, `SCOPE.md`, `AGENTS.md`, and `.custodian-brief.md`.
Replace generated placeholders with repo-specific facts where needed.
## Verify Local Developer Workflow
```task
id: MARKITECT-WP-0001-T02
status: done
priority: high
state_hub_task_id: "7e34bdab-aa49-49ca-b28a-b254725dd8db"
```
Result 2026-06-22: Documented make-based Python/JS workflow.
Identify the repo's install, test, lint, build, and run commands. Add or refine
those commands in the agent instructions so future coding sessions can verify
changes confidently.
## Seed First Real Workplan
```task
id: MARKITECT-WP-0001-T03
status: done
priority: medium
state_hub_task_id: "35a64da7-dda9-4315-901d-88c6827432d9"
```
Result 2026-06-22: MARKITECT-WP-0002 already exists (TestDrive npm publication).
Create the first implementation workplan for the repository's most important
next change. After workplan file updates, run from `~/state-hub`:
```bash
make fix-consistency REPO=markitect-main
```

@@ -0,0 +1,28 @@
---
id: MARKITECT-WP-0002
type: workplan
title: "TestDrive-JSUI — npm Publication"
domain: communication
repo: markitect-main
status: backlog
owner: codex
topic_slug: communication
created: "2026-06-22"
updated: "2026-06-22"
state_hub_workstream_id: "e203d487-01f1-494a-b14d-a436241a4c01"
---
# TestDrive-JSUI — npm Publication
Backlog workstream for publishing the TestDrive JSUI package to npm.
## Publication Readiness
```task
id: MARKITECT-WP-0002-T01
status: todo
priority: medium
state_hub_task_id: "88b3c206-4d45-4bb3-bbb3-47443cdf2123"
```
Define package scope, versioning, and publication checklist for TestDrive-JSUI.