Fills the 988 entity / 985 evaluation gap in the Wealth of Nations
infospace. The entities advanced_state_of_society, bank_notes, and
bank_systemic_risk_management had no evaluation files. Evaluating them
with Gemini 2.5-flash (2.5-flash-lite for the last one, which hit the
free-tier RPM limit) brings the eval count to 988.
per_entity_mean moves from 3.955635 to 3.95668; viability remains
6/6 PASS.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Closes out three docs tasks from roadmap/infospace-s3-closeout/PLAN.md:
- examples/infospace-with-history/docs/advanced-usage.md (C.4) — 5 worked
patterns covering incremental eval, re-eval workflow (no --force flag
exists; documents the rm-then-re-run pattern instead), interpreting the
eval-summary distribution, triaging low scorers via an awk pipeline
over overall_score (since `entities --sort-by score` does not exist),
and acting on check --json output.
- docs/composition-guide.md (C.5) — walks through how supply-chain-vsm
binds WoN as a discipline, then a step-by-step for creating a new
infospace that binds an existing one. Includes live output from
`markitect infospace disciplines`.
- examples/infospace-with-history/docs/performance-notes.md (C.6) — cites
the 6h 28m wall time of the 985-entity S3.3 batch, ~2.5 ent/min rate,
~2000–3000 tokens/entity estimate, word_overlap vs embedding backend
for redundancy checks, and a provider-by-scale recommendation table.
All commands in these docs were run against the live infospace at
commit time.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Finishes the in-progress rename so docs, configs, tests, and capability
manifests all reference the current repo name consistently. Fixes two
tests (test_roundtrip_consolidated.py, test_issue_140_roundtrip_simplified.py)
whose hardcoded cwd paths would have broken under the renamed directory.
Archival content under history/, reports/, and roadmap/eat-the-frog/, plus
derived artifacts (.venv_old/, node_modules/, asset_registry.json) are
intentionally left untouched.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replaces the stub (State Hub integration only) with full dev commands,
module architecture overview, LLM config resolution chain, infospace
conventions, and active roadmap pointers. Removes CLAUDE.custodian.md
(superseded by the expanded CLAUDE.md).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Stage 1 — Decouple:
- Move RunConfig + LLMResponse to markitect/llm/models.py (canonical)
- Move LLMAdapter + Mock/ErrorLLMAdapter to markitect/llm/adapter.py
- markitect/prompts/execution/models.py and llm_adapter.py become re-export shims
- All 4 adapters + factory.py updated to import from markitect.llm.*
- Parameterize app_name in toml_config.py (resolve_llm, get_default_layers,
get_preference_layers): paths and env var now derived from app_name arg
- Add tests/test_llm_isolation.py: 7 isolation + backward-compat tests
Stage 2 — Extract:
- Standalone llm-connect package created at ~/llm-connect/
- All 18 llm files copied; markitect.* imports replaced with llm_connect.*
- LLMError base inlined in llm_connect/exceptions.py (no markitect dep)
- llm-connect installed into markitect-venv; declared in pyproject.toml
Smoke test: markitect llm-check succeeds (live Gemini API call).
Backward compat: markitect.prompts.execution.{models,llm_adapter} still work.
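The app_name parameterization in Stage 1 can be pictured roughly as below. This is a sketch only: the env var suffix, the ~/.config layout, the llm.toml file name, and all three function names are hypothetical and are not taken from toml_config.py.

```python
import os
from pathlib import Path


def config_env_var(app_name):
    # Hypothetical naming scheme: "llm-connect" -> "LLM_CONNECT_CONFIG".
    return app_name.upper().replace("-", "_") + "_CONFIG"


def default_config_path(app_name, home=None):
    # Hypothetical layout: <home>/.config/<app_name>/llm.toml
    base = Path(home) if home is not None else Path.home()
    return base / ".config" / app_name / "llm.toml"


def resolve_config_path(app_name, environ=None, home=None):
    """Prefer the app-specific env var, else fall back to the default path."""
    environ = os.environ if environ is None else environ
    override = environ.get(config_env_var(app_name))
    if override:
        return Path(override)
    return default_config_path(app_name, home=home)
```

Because nothing in the sketch hardcodes "markitect", the same functions would serve a standalone package by passing a different app_name.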
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3-stage plan: decouple (RunConfig/LLMResponse move + app name
parameterization) → extract to standalone package → adopt in first
consumer. Registered as workstream in Custodian State Hub.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Registers markitect as a tracked domain in the Custodian State Hub.
Includes topic ID, session start/end protocol, and MCP tool reference.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a closing remark (23 Feb 2026) summarising the final state of the
infospace: 988 entities, 985 evaluations, 823 L2 classifications, 15 L3
relations, viability 6/6 PASS.
New open tasks 20–23:
20. Complete L2 classification batch (165 entities blocked on credits)
21. Run classify-links for 58 Relation-type entities
22. Refresh stale metrics-report.md narrative
23. Smoke-test the graph command end-to-end
Also committed: history.py fix — write_metrics_file now preserves
non-float metric values (type_distribution dict) instead of crashing
on round().
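A minimal sketch of the kind of guard this fix implies, assuming the metrics are held in a plain dict. The helper name round_metrics, the ndigits default, and the sample values are illustrative and not the actual write_metrics_file code.

```python
def round_metrics(metrics, ndigits=4):
    """Round float metric values; pass every other value through unchanged."""
    rounded = {}
    for name, value in metrics.items():
        if isinstance(value, float):
            rounded[name] = round(value, ndigits)
        else:
            # Dicts such as type_distribution (and ints, strings) are kept
            # as-is; calling round() on a dict would raise TypeError.
            rounded[name] = value
    return rounded


# Illustrative values only.
sample = {
    "per_entity_mean": 3.95668123,
    "type_distribution": {"element": 10, "relation": 3},
}
cleaned = round_metrics(sample)
```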
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New graph_export.py module supporting the `markitect infospace graph`
command added in the previous commit.
- build_entity_graph(): constructs node/edge graph from L2 classifications
and L3 relation triplets, with feedback loop detection via networkx
- apply_filters(): subgraph filters by entity type, VSM system, ego
neighbourhood, feedback-loops-only, and classified-only
- to_mermaid(): Mermaid flowchart export
  - Uses "-- label -->" syntax for ordinary edges (robust with parentheses)
    and "== label ==>" thick arrows for feedback loop edges
  - markdown_fence=True wraps output in ```mermaid block (VS Code / GitHub)
  - color_by="type" or "vsm" with distinct palettes for each
- to_dot(): Graphviz DOT export with fillcolor per type/VSM system
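The Mermaid edge conventions above can be sketched as follows. This is an illustration under assumptions, not the graph_export.py implementation: the function name to_mermaid_sketch, the node and edge data shapes, and the "flowchart TD" direction are all hypothetical, and colouring is left out.

```python
def to_mermaid_sketch(nodes, edges, feedback_edges=frozenset(), markdown_fence=False):
    """Render a small graph as Mermaid flowchart text.

    nodes: dict of node id -> display label
    edges: iterable of (source, label, target) tuples
    feedback_edges: set of (source, target) pairs drawn with thick arrows
    """
    lines = ["flowchart TD"]
    for node_id, label in nodes.items():
        lines.append(f'    {node_id}["{label}"]')
    for source, label, target in edges:
        if (source, target) in feedback_edges:
            lines.append(f"    {source} == {label} ==> {target}")
        else:
            lines.append(f"    {source} -- {label} --> {target}")
    body = "\n".join(lines)
    if markdown_fence:
        fence = "`" * 3  # built at runtime to keep this example fence-safe
        return f"{fence}mermaid\n{body}\n{fence}"
    return body


# Illustrative two-node loop.
demo = to_mermaid_sketch(
    {"a": "Wages", "b": "Demand"},
    [("a", "raises", "b"), ("b", "raises", "a")],
    feedback_edges={("b", "a")},
)
```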
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
INFRA-TASKS #5 — process_chapters.py now skips writing *-prompt.md files
when the corresponding output file already exists on disk. DB-only rebuilds
no longer dirty the working tree with unchanged prompt content.
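The skip described for INFRA-TASKS #5 amounts to a check like the one below. This is a sketch, assuming one prompt file per output file; write_prompt_if_needed and the chapter-01 file names are hypothetical and do not come from process_chapters.py.

```python
import tempfile
from pathlib import Path


def write_prompt_if_needed(prompt_path, output_path, prompt_text):
    """Write the prompt file unless its output already exists on disk.

    Returns True when the prompt file was written.
    """
    if Path(output_path).exists():
        return False
    Path(prompt_path).write_text(prompt_text, encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    prompt_file = Path(tmp) / "chapter-01-prompt.md"
    output_file = Path(tmp) / "chapter-01-output.md"
    # No output yet, so the prompt is written.
    first = write_prompt_if_needed(prompt_file, output_file, "prompt text")
    # Simulate a finished run, then a later DB-only rebuild.
    output_file.write_text("entities", encoding="utf-8")
    prompt_file.unlink()
    second = write_prompt_if_needed(prompt_file, output_file, "prompt text")
    prompt_rewritten = prompt_file.exists()
```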
INFRA-TASKS #8 — Added '## Quality Metrics' section to the entity and VSM
mapping schemas, defining the five evaluation dimensions (Definition Precision,
Source Grounding, Domain Placement, VSM Relevance, Explanatory Value) with
1–5 rubrics used by the evaluate-entity template.
Also updated INFRA-TASKS.md to reflect current resolution status for tasks
4–19 across S2 and S3.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
markitect helper <QUESTION> now works as a short alias for
markitect llm-helper, per the original plan specification.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix evaluate dimensions to match template file:
definition_precision, source_grounding, domain_placement,
vsm_relevance, explanatory_value (was domain_relevance,
discipline_alignment, conceptual_clarity)
- Add VSM background context to evaluation prompt so LLM can
score vsm_relevance without macro injection
- Fix model_name bug: was sending literal "default" to API (HTTP 400)
- Refactor run_entity_evaluation to write files incrementally via
callback rather than all at once after the batch — long runs are
now resumable if interrupted
- Add incremental skip in CLI: entities with existing eval files
are skipped automatically on re-run (acts as resume)
- Add eval-summary command: reads all eval files, shows per-dimension
means, optionally writes per_entity_mean to metrics.yaml
- Fix record_check_results to merge rather than overwrite metrics.yaml
so per_entity_mean survives subsequent check runs
- Add per_entity_mean viability threshold (min: 3.5) to infospace.yaml
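The merge-instead-of-overwrite behaviour and the new threshold can be sketched with plain dicts. This is illustrative only: merge_metrics and meets_threshold are hypothetical names, YAML reading and writing is left out, and the metric values are made up.

```python
def merge_metrics(existing, new_results):
    """Return existing metrics updated with new results, keeping other keys."""
    merged = dict(existing)
    merged.update(new_results)
    return merged


def meets_threshold(metrics, name, minimum):
    """True when the metric is present and at or above its minimum."""
    value = metrics.get(name)
    return value is not None and value >= minimum


# Illustrative values: per_entity_mean came from an earlier eval-summary run,
# the other metrics come from a later check run.
on_disk = {"per_entity_mean": 3.9, "coverage_ratio": 0.55}
check_results = {"coverage_ratio": 0.6, "redundancy": 0.01}

merged = merge_metrics(on_disk, check_results)
viable = meets_threshold(merged, "per_entity_mean", 3.5)
```

With a plain overwrite, merged would lose per_entity_mean and the threshold check would fail for lack of a value.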
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Demonstrates infospace composition: the Wealth of Nations infospace is
used as a discipline, applying Smith's economic framework as a lens to
analyse modern supply chain management concepts.
New example: examples/supply-chain-vsm/
- infospace.yaml binding WoN as discipline (../infospace-with-history)
- 3 source documents: coordination mechanisms, capital & inventory,
market structure (~400 words each, original content)
- supply-chain-entity-schema-v1.0.md with WoN Concept required section
- won-mapping-schema-v1.0.md with Conceptual Continuity rating
- artifacts/won-reference/core-entities.md — 12 curated WoN entities
for injection as discipline context
- 8 hand-crafted entity files demonstrating LLM output format
- 3 mapping files with full rationale and VSM inheritance chains
- Viable: YES (5/5 thresholds)
Key mappings demonstrated:
Demand Signal → Effectual Demand (Strong, S2)
Vendor-Managed Inventory → Division of Labour (Strong, S1/S2)
Just-in-Time Inventory → Circulating Capital (Strong, S1/S3)
Bullwhip Effect → Natural Price (Moderate, S2)
Platform Intermediary → Merchant Capital (Strong, S2/S4)
Monopsony Power → Combination of Masters (Strong, S3*)
Platform fix: entity_parser.py now recognises ## Supply Chain Domain
as a domain alias for ## Economic Domain, enabling composed infospaces
to use their own domain section name.
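One way to picture the domain alias is a heading matcher that accepts either section name. This is a sketch, not the entity_parser.py code: DOMAIN_HEADINGS, extract_domain, and the sample entity text are assumptions.

```python
import re

# Either heading introduces the entity's domain section.
DOMAIN_HEADINGS = ("Economic Domain", "Supply Chain Domain")

_DOMAIN_RE = re.compile(
    r"^##[ \t]+(?:" + "|".join(re.escape(h) for h in DOMAIN_HEADINGS) + r")[ \t]*$",
    re.MULTILINE,
)
_NEXT_H2_RE = re.compile(r"^##\s", re.MULTILINE)


def extract_domain(markdown_text):
    """Return the text under the first recognised domain heading, or None."""
    match = _DOMAIN_RE.search(markdown_text)
    if match is None:
        return None
    rest = markdown_text[match.end():]
    # The section body runs until the next level-2 heading, if any.
    next_heading = _NEXT_H2_RE.search(rest)
    body = rest[: next_heading.start()] if next_heading else rest
    return body.strip() or None


# Illustrative entity text.
entity = "# Demand Signal\n\n## Supply Chain Domain\nCoordination\n\n## Definition\nText.\n"
domain = extract_domain(entity)
```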
Tutorial §13 rewritten with real commands, real output, and the full
mapping table from the demo.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds LAYERED-DEVELOPMENT.md documenting the concept for evolving a flat
entity collection into a structured systemic model through four layers:
L0 Source text → L1 Raw entities (current) → L2 Typed entities
→ L3 Relation graph → L4 Minimal systemic model
Covers: the element/relation/principle/institution type taxonomy,
VSM as a structural coordinate system, the type × VSM coverage matrix,
triplet extraction with a controlled predicate vocabulary, feedback loop
detection, and the distillation hypothesis for finding the generative
core of a corpus.
Extends TUTORIAL.md with sections 17–23:
17. Observing entity heterogeneity
18. The four-layer model overview
19. Layer 2 — classifying entities (schema, pipeline stage, metrics)
20. Layer 3 — extracting the relation graph (triplets, feedback loops)
21. Layer 4 — the minimal systemic model (core-model.md output)
22. Planned CLI commands for layers 2–4
23. Layers 2–4 as composed infospaces
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add `.*-raw\.md$` to `_DEFAULT_EXCLUDE_PATTERNS` in entity_parser.py to
prevent per-chapter raw LLM output files from being parsed as entities.
This eliminates 33 malformed domain values where delimiter text was
bleeding into the Economic Domain field.
- Lower coverage_ratio threshold from 0.50 → 0.40 in infospace.yaml to
reflect realistic multi-book corpus expectations (documented rationale
in METRICS-METHODOLOGY.md).
Post-fix metrics: 988 entities, 0 malformed, coverage_ratio=0.619 (pass).
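The effect of the new exclude pattern can be checked in isolation. In this sketch the pattern string is the one from this commit, while is_excluded, the use of re.match, and the chapter file name are assumptions about how the patterns are applied.

```python
import re

# Pattern from this commit; the list name here is illustrative.
EXCLUDE_PATTERNS = [re.compile(r".*-raw\.md$")]


def is_excluded(filename):
    """True when the file name matches any exclude pattern."""
    return any(pattern.match(filename) for pattern in EXCLUDE_PATTERNS)


raw_skipped = is_excluded("book1-chapter01-raw.md")  # hypothetical raw output file
entity_kept = not is_excluded("bank_notes.md")
```

The escaped dot and the trailing anchor keep the match narrow: a name such as raw.md or notes-raw.md.bak is not excluded.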
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- coverage.py: rewrite module docstring to explain what the metric actually
computes (domain × chapter cross-tabulation, not VSM system coverage),
what it does not capture (entity connectivity → C3), and when the
threshold is appropriate
- CoverageReport: add domain_densities, density_std, cross_cutting_ratio
for distribution-level insight beyond the aggregate ratio
- check_coverage: compute per-domain density and cross-cutting ratio
- METRICS-METHODOLOGY.md: correct C2 section to match implementation,
document the distribution-based interpretation, add implementation status
table distinguishing what is wired vs planned
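A rough sketch of a domain × chapter cross-tabulation follows. The formulas are assumptions made for illustration (coverage as the share of non-empty cells, density as the share of chapters a domain appears in, population standard deviation) and may differ from coverage.py. cross_cutting_ratio is omitted because its definition is not stated here.

```python
from collections import Counter
from statistics import pstdev


def coverage_sketch(entities, domains, chapters):
    """Cross-tabulate entities by (domain, chapter) and summarise the grid."""
    cells = Counter((e["domain"], e["chapter"]) for e in entities)
    filled = sum(1 for d in domains for c in chapters if cells[(d, c)] > 0)
    # Assumed definition: share of non-empty cells in the grid.
    coverage_ratio = filled / (len(domains) * len(chapters))
    # Assumed definition: share of chapters in which each domain appears.
    domain_densities = {
        d: sum(1 for c in chapters if cells[(d, c)] > 0) / len(chapters)
        for d in domains
    }
    density_std = pstdev(domain_densities.values())
    return coverage_ratio, domain_densities, density_std


# Illustrative data: two domains, two chapters, three entities.
entities = [
    {"domain": "labour", "chapter": 1},
    {"domain": "labour", "chapter": 2},
    {"domain": "capital", "chapter": 1},
]
ratio, densities, spread = coverage_sketch(entities, ["labour", "capital"], [1, 2])
```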
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1021 entities extracted across all five books of The Wealth of Nations.
Final metrics: coverage=0.4424, granularity=2.9533, redundancy=0.0059.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Free-tier APIs intermittently return invalid JSON or empty responses.
_call_llm now retries on any exception, up to 3 times with a 5s back-off,
instead of failing immediately on non-rate-limit errors.
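The retry behaviour can be sketched as a small wrapper. This is not the _call_llm code: call_with_retry is a hypothetical helper, reading "up to 3 times" as three attempts in total is an assumption, and the sleep function is injectable so the example runs without waiting.

```python
import time


def call_with_retry(func, max_attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call func(), retrying on any exception with a fixed delay in between."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:  # any failure is retried, not only rate limits
            last_error = exc
            if attempt < max_attempts:
                sleep(delay_seconds)
    raise last_error


# Demo: fail twice with a simulated bad response, then succeed.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ValueError("invalid JSON")
    return "ok"


delays = []
result = call_with_retry(flaky, sleep=delays.append)
```

Recording delays instead of sleeping makes the back-off schedule easy to inspect in a test.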
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>