Commit Graph

637 Commits

Author SHA1 Message Date
eb9b622499 chore: gitignore Claude Code session lock files
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
`.claude/scheduled_tasks.lock` is per-session runtime state (holds the
owning session id and pid for the ScheduleWakeup queue); it shouldn't
be committed. Widened the pattern to `.claude/*.lock` so future lock
kinds are covered too.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 21:50:20 +02:00
e3e5b8ecc1 feat(infospace): systematic long-text processing — rich commit bodies, per-source eval/classify, chapters view
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
Three coordinated changes that let the pipeline produce a clean
chapter-by-chapter git history on long texts without archaeology after
the fact.

1. Richer commit messages. `SourcePipeline._git_commit` now diffs the
   staged changes, buckets added files by output subdirectory (entities,
   evaluations, classifications, mappings, analyses, metrics, logs), and
   includes counts in the commit body. So `git log` reads "entities:
   +23, evaluations: +23" per chapter instead of the same generic blurb
   on every commit. Zero behaviour change when no output changed; falls
   back to the original message if the diff query fails.

2. --eval-after-source / --classify-after-source on `infospace process`.
   After a source's stages succeed, the pipeline identifies which entity
   files are *new* (set diff of entity slugs before vs after), loads
   their EntityMeta, and runs per-entity evaluation and/or
   classification scoped to just those slugs before the per-source git
   commit lands. Result: each chapter's commit is self-contained —
   extraction + evaluation + classification in one atomic unit. Gated
   behind explicit flags because the cost is real (LLM latency per
   chapter rather than amortised across one bulk batch).

3. `markitect infospace chapters` subcommand. Lists source files in
   canonical order with entity count, evaluated count, classified
   count, and mean per-entity score per source. Text or JSON output.
   Natural triage surface for long-text infospaces — spot chapters that
   under-extracted or evaluated poorly.

Also: `docs/advanced-usage.md` gets a new "Systematic processing of
long texts" section with the recommended flag combo and the tradeoff
note on cost.

11 new unit tests cover the chapters command (text/json/no-sources),
the process flag wiring (help + provider requirement), and the
commit-body bucket logic. Full infospace+llm unit suite (315 tests)
green; 3 pre-existing infospace failures unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 08:24:26 +02:00
9e8d73fa7d docs(roadmap): close out infospace tooling S3 and parent roadmap
All three stages of the infospace tooling roadmap are complete. The Wealth
of Nations / VSM example passes 6/6 viability thresholds on 988 entities,
and composition is demonstrated via the supply-chain-vsm example.

- Parent roadmap (roadmap/infospace-tooling/PLAN.md): header now shows the
  closed status with final validation metrics.
- S3 close-out plan (roadmap/infospace-s3-closeout/PLAN.md): records the
  final task dispositions. C.1–C.6 and C.8 done; C.7 (clean per-chapter
  git history) is deferred indefinitely — the task was cosmetic, its
  prerequisite branch no longer exists, and reconstructing 35 archival
  commits would not change any output files. Rationale documented inline.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 07:08:43 +02:00
d44a4cd3df feat(infospace,llm): agent ergonomics — entity lookup, model fallback, better errors
- `markitect infospace entity <name>`: single-entity lookup tolerating
  hyphens/underscores/case, with substring matching, ambiguity listing,
  and near-match hints. Prints slug, source path, domain, chapter, word
  count, VSM system, overall score, evaluator, and evaluation file path.
- `markitect infospace evaluate --model-fallback <model>`: if any
  entities fail with a rate-limit error, retry just those with a fresh
  adapter on the fallback model (different free-tier models have
  separate quota buckets).
- `markitect llm-check`: advisory when `OPENROUTER_API_KEY` is set but
  not used by the resolved provider; targeted hint when OpenRouter
  returns 401 (almost always a stale env key).
- `build_state`: raises `TypeError` with actionable message if passed a
  path instead of an `InfospaceConfig` — prior failure mode was a
  confusing `AttributeError` deep in the stack.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 01:07:25 +02:00
c0615c2d50 feat(infospace,llm): stabilize free-tier eval workflow
Some checks failed
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
Five improvements that eliminate most of the agent-in-the-loop friction
observed while closing out the 988-entity WoN evaluation (C.1):

1. Gemini adapter now retries on 429 + 5xx with exponential backoff
   (same pattern already used by OpenRouter/OpenAI). Removes the need
   for shell-level retry wrappers when hitting free-tier rate limits.

2. evaluate CLI prints the underlying error ("ERROR — HTTP 503 …")
   instead of a bare "ERROR", so agents don't have to drop into Python
   to diagnose transient failures.

3. --entity/--chapter now respect existing evaluation files by default
   (previously only the full-collection pass did). New --force flag
   opts into re-evaluation. Stops silently burning free-tier quota on
   re-runs of the same slug.

4. --entity accepts hyphenated slugs (matching entity filenames) and
   normalizes them to the underscore form used on disk. On a miss the
   CLI suggests near matches instead of a bare "not found".

5. eval-summary --update-metrics is no longer destructive:
   read_metrics_file/write_metrics_file preserve structured values
   (type_distribution) and don't flatten ints to floats. Fixes a
   silent data loss observed on every run.

Bonus: the evaluator field in written evaluation frontmatter now
falls back from run_config.model_name to the adapter's resolved model
(or the model echoed back in the API response), so rows no longer
show `evaluator: null` when --model is omitted.

Tests: new tests/unit/llm/test_gemini.py covers retry behavior;
tests/unit/infospace/test_history.py gains a round-trip test that
pins the type_distribution / int-preservation invariants.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 00:51:00 +02:00
965508ec06 chore(consistency): sync task status from DB [auto]
Updated by fix-consistency on 2026-04-22:
  - update .custodian-brief.md for markitect-project
2026-04-22 00:28:46 +02:00
f325f89dc9 feat(infospace): evaluate 3 missing WoN entities (C.1)
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
Fills the 988 entity / 985 evaluation gap in the Wealth of Nations
infospace. Entities advanced_state_of_society, bank_notes, and
bank_systemic_risk_management had no evaluation files; runs through
Gemini (2.5-flash / 2.5-flash-lite for the last one, which hit the
free-tier RPM limit) bring the eval count to 988.

per_entity_mean nudged from 3.955635 to 3.95668; viability still
6/6 PASS.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 23:52:04 +02:00
36a5136bdf docs(infospace): add advanced-usage, composition guide, and performance notes (C.4/C.5/C.6)
Closes out three docs tasks from roadmap/infospace-s3-closeout/PLAN.md:

- examples/infospace-with-history/docs/advanced-usage.md (C.4) — 5 worked
  patterns covering incremental eval, re-eval workflow (no --force flag
  exists; documents the rm-then-re-run pattern instead), interpreting the
  eval-summary distribution, triaging low scorers via an awk pipeline
  over overall_score (since `entities --sort-by score` does not exist),
  and acting on check --json output.
- docs/composition-guide.md (C.5) — walks through how supply-chain-vsm
  binds WoN as a discipline, then a step-by-step for creating a new
  infospace that binds an existing one. Includes live output from
  `markitect infospace disciplines`.
- examples/infospace-with-history/docs/performance-notes.md (C.6) — cites
  the 6h 28m wall time of the 985-entity S3.3 batch, ~2.5 ent/min rate,
  ~2000–3000 tokens/entity estimate, word_overlap vs embedding backend
  for redundancy checks, and a provider-by-scale recommendation table.

All commands in these docs were run against the live infospace at
commit time.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 07:02:46 +02:00
b7e11461f4 chore: rename markitect_project to markitect-main across project
Finishes the in-progress rename so docs, configs, tests, and capability
manifests all reference the current repo name consistently. Fixes two
tests (test_roundtrip_consolidated.py, test_issue_140_roundtrip_simplified.py)
whose hardcoded cwd paths would have broken under the renamed directory.

Archival content under history/, reports/, and roadmap/eat-the-frog/, plus
derived artifacts (.venv_old/, node_modules/, asset_registry.json) are
intentionally left untouched.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 01:57:35 +02:00
3966814868 updated SCOPE file
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
2026-03-25 00:11:46 +01:00
f4610a46e3 docs: add SCOPE.md for rapid orientation
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 23:11:42 +01:00
0d95e6dbcf docs(claude): expand CLAUDE.md with commands and architecture
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
Replaces the stub (State Hub integration only) with full dev commands,
module architecture overview, LLM config resolution chain, infospace
conventions, and active roadmap pointers. Removes CLAUDE.custodian.md
(superseded by the expanded CLAUDE.md).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-04 23:28:03 +01:00
36c20f37d0 feat(llm): extract adapter layer for standalone llm-connect package (S1+S2)
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
Stage 1 — Decouple:
- Move RunConfig + LLMResponse to markitect/llm/models.py (canonical)
- Move LLMAdapter + Mock/ErrorLLMAdapter to markitect/llm/adapter.py
- markitect/prompts/execution/models.py and llm_adapter.py become re-export shims
- All 4 adapters + factory.py updated to import from markitect.llm.*
- Parameterize app_name in toml_config.py (resolve_llm, get_default_layers,
  get_preference_layers): paths and env var now derived from app_name arg
- Add tests/test_llm_isolation.py: 7 isolation + backward-compat tests

Stage 2 — Extract:
- Standalone llm-connect package created at ~/llm-connect/
- All 18 llm files copied; markitect.* imports replaced with llm_connect.*
- LLMError base inlined in llm_connect/exceptions.py (no markitect dep)
- llm-connect installed into markitect-venv; declared in pyproject.toml

Smoke test: markitect llm-check succeeds (live Gemini API call).
Backward compat: markitect.prompts.execution.{models,llm_adapter} still work.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 08:04:50 +01:00
72b87fd82e docs(roadmap): add workplans for infospace S3 close-out and JSUI publication
infospace-s3-closeout: 8 tasks (C.1-C.8) covering 3 missing evals,
viability sign-off, docs (advanced usage, composition, perf), deferred
git history cleanup, and formal roadmap closure.

testdrive-jsui-publication: 9 tasks (P.1-P.9) covering repo structure
decision, Markitect integration gate, pack/dry-run, npm publish, CDN
verify, fresh install test, GitHub release, and badges.

Both registered as workstreams in Custodian State Hub.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-26 00:44:05 +01:00
eaf4a955af docs(roadmap): add workplan for extracting llm module as shared library
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
3-stage plan: decouple (RunConfig/LLMResponse move + app name
parameterization) → extract to standalone package → adopt in first
consumer. Registered as workstream in Custodian State Hub.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 21:51:54 +01:00
e9dc9a8517 docs(custodian): add session protocol CLAUDE.md for State Hub integration
Registers markitect as a tracked domain in the Custodian State Hub.
Includes topic ID, session start/end protocol, and MCP tool reference.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 21:42:41 +01:00
b055c8d7bb docs(example): close out INFRA-TASKS with summary and 4 follow-up items
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
Adds a closing remark (23 Feb 2026) summarising the final state of the
infospace: 988 entities, 985 evaluations, 823 L2 classifications, 15 L3
relations, viability 6/6 PASS.

New open tasks 20–23:
  20. Complete L2 classification batch (165 entities blocked on credits)
  21. Run classify-links for 58 Relation-type entities
  22. Refresh stale metrics-report.md narrative
  23. Smoke-test the graph command end-to-end

Also committed: history.py fix — write_metrics_file now preserves
non-float metric values (type_distribution dict) instead of crashing
on round().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 13:45:58 +01:00
ef3d47779e feat(infospace): add entity-relation graph export (Mermaid + DOT)
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
New graph_export.py module supporting the `markitect infospace graph`
command added in the previous commit.

- build_entity_graph(): constructs node/edge graph from L2 classifications
  and L3 relation triplets, with feedback loop detection via networkx
- apply_filters(): subgraph filters by entity type, VSM system, ego
  neighbourhood, feedback-loops-only, and classified-only
- to_mermaid(): Mermaid flowchart export
  - Uses "-- label -->" syntax for all edges (robust with parentheses);
    "== label ==>" thick arrows for feedback loop edges
  - markdown_fence=True wraps output in ```mermaid block (VS Code / GitHub)
  - color_by="type" or "vsm" with distinct palettes for each
- to_dot(): Graphviz DOT export with fillcolor per type/VSM system

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 13:14:25 +01:00
d1f57272a4 feat(example): add L2 classifications for 823/988 WoN entities (S3.4)
Batch classification via OpenRouter (claude-sonnet-4). 165 entities
remain unclassified due to credit exhaustion; incremental skip means
a follow-up run will complete them automatically.

Type × VSM matrix (823 entities):
                  S1   S2   S3  S3*   S4   S5
  Element         86   75   58   21   43   32  (315 total, 38%)
  Process         39   42   37   17   67   24  (226 total, 28%)
  Institution      4   12   30   24    .   52  (122 total, 15%)
  Principle        3    7   15    2   43   32  (102 total, 12%)
  Relation         2   14    5    5   22   10   (58 total,  7%)
  Matrix fill: 29/30 cells (Institution/S4 empty — expected)

Metrics updated: type_entropy=2.0936, vsm_type_matrix_cells=29

Also:
- BatchEvaluator gains delay_seconds param for rate-limited providers
- classify CLI gains --rpm option (--rpm 10 for Gemini free tier)
- history.write_metrics_file now handles non-float metric values
  (type_distribution is a dict, was crashing round())
- run_entity_classification forwards delay_seconds to BatchEvaluator
- classify-links and graph commands added by user (entities --by-type,
  graph --format mermaid/dot, classify-links for Relation enrichment)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 12:49:11 +01:00
a9ca0adfcf feat(example): add per-entity LLM evaluations for 985 WoN entities (S3.3)
Batch evaluation of all 988 entities via OpenRouter. 984 succeeded on
first pass; 3 failed (network errors). eval-summary --update-metrics
written with per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
  redundancy_ratio   0.0061  (max 0.10)
  coverage_ratio     0.6190  (min 0.40)
  coherence_comps    0.0000  (max 3)
  consistency_cycles 0.0000  (max 0)
  granularity_entropy 2.6748 (min 1.0)
  per_entity_mean    3.9556  (min 3.5)

Dimension breakdown (mean across 985 entities):
  definition_precision  3.62
  source_grounding      4.36
  domain_placement      4.56
  vsm_relevance         3.31
  explanatory_value     3.94

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:36:46 +01:00
81a4c8796a feat(infospace): add L2 entity classification with type × VSM matrix (S2.9)
Implements the L2 typed-entities layer — each entity is assigned an
Entity Type (Element, Process, Relation, Principle, Institution) and a
VSM System (S1–S5) by an LLM, with one-sentence rationales for each.

New modules:
- markitect/infospace/classification.py — EntityClassification dataclass
  + ENTITY_TYPES / VSM_SYSTEMS controlled vocabularies
- markitect/infospace/classification_io.py — write/read classification
  files (YAML frontmatter + markdown body, mirrors evaluation_io)
- markitect/infospace/classifier.py — build_classification_prompt(),
  parse_classification_response(), run_entity_classification(); batch
  runner writes files incrementally (same resumable pattern as evaluate)

CLI: markitect infospace classify [--entity SLUG] [--provider P] [--model M]
  - Incremental skip: checks output/classifications/ for existing files
  - Defaults to openrouter provider; 2000 max_tokens (Gemini 2.5 Flash
    uses ~787 thinking tokens, so 800 was too low)

CLI: markitect infospace classify-summary [--update-metrics]
  - Entity type counts + VSM system counts with percentages
  - 5 × 6 type × VSM matrix (spots structural blind spots at a glance)
  - --update-metrics writes type_distribution, type_entropy,
    vsm_type_matrix_cells to metrics.yaml

Config: InfospaceConfig gains classifications_dir (default output/classifications)
Schema: schemas/typed-entity-schema-v1.0.md — type/VSM vocabulary tables,
  rationale format rules, validation rules, metrics enabled at L2
infospace.yaml: schemas.typed_entity references typed-entity-schema-v1.0.md

Seed classifications (3): division_of_labour (Process/S1),
  natural_price_as_central_price (Principle/S2),
  invisible_hand_mechanism (Principle/S4)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:35:58 +01:00
2d45425b25 feat(infospace): add L3 relation graph with VSM-aware triplets (S2.8)
Implements the L3 relation graph layer — a directed graph of (Subject,
Predicate, Object) triplets annotated with VSM channel codes and feedback
roles. Triplets are authored as markdown files under output/relations/,
parsed into RelationMeta dataclasses, and analysed with networkx.

New modules:
- markitect/infospace/relation_models.py — RelationMeta dataclass +
  RELATION_TYPES controlled vocabulary (15 relation classes → VSM codes)
- markitect/infospace/relation_parser.py — parse_relation_file() and
  parse_relations_directory()

New schema: examples/infospace-with-history/schemas/relation-schema-v1.0.md
  — file naming convention, required sections, controlled vocabulary table

15 seed relation files covering the three core WoN feedback loops:
  - Capital Accumulation loop (positive reinforcement, S1/S3)
  - Market Price Balancing loop (negative feedback, S2/S3)
  - Market Extent mutual dependency (S1/S2)
  Plus structural relations: wages regulation, rent residual, price
  decomposition, invisible hand coordination

CLI: markitect infospace relations [--entity SLUG] [--vsm FILTER]
     [--loops] [--stats]
  - Builds directed graph from parsed files
  - Detects feedback loops via nx.simple_cycles()
  - 6 loops found from 15 seed relations (3 intended + 3 emergent)
  - --stats aggregates by VSM system code (strips parentheticals)

Config: InfospaceConfig gains relations_dir (default output/relations)
infospace.yaml: schemas.relation references relation-schema-v1.0.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 06:04:28 +01:00
fa27572f43 fix(example): skip prompt writes when output exists, add quality rubrics
INFRA-TASKS #5 — process_chapters.py now skips writing *-prompt.md files
when the corresponding output file already exists on disk. DB-only rebuilds
no longer dirty the working tree with unchanged prompt content.

INFRA-TASKS #8 — Added '## Quality Metrics' section to the entity and VSM
mapping schemas, defining the five evaluation dimensions (Definition Precision,
Source Grounding, Domain Placement, VSM Relevance, Explanatory Value) with
1–5 rubrics used by the evaluate-entity template.

Also updated INFRA-TASKS.md to reflect current resolution status for tasks
4–19 across S2 and S3.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 06:04:09 +01:00
dfab3d598b feat(cli): add 'helper' alias for markitect helper command
markitect helper <QUESTION> now works as a short alias for
markitect llm-helper, per the original plan specification.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 05:40:11 +01:00
34ed7a6fab docs(tutorial): update §8-9 for eval-summary command and 6/6 viability
- Add eval-summary command documentation with dimension descriptions
- Document resumable evaluate (incremental skip on re-run)
- Fix --entity slug example to use underscores (not hyphens)
- Update viability output to show per_entity_mean as 6th threshold
- Add workflow note: check → eval-summary --update-metrics → viability

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 05:33:11 +01:00
7f1eecbdb2 feat(infospace): add eval-summary command and improve evaluate pipeline (S3.3)
- Fix evaluate dimensions to match template file:
  definition_precision, source_grounding, domain_placement,
  vsm_relevance, explanatory_value (was domain_relevance,
  discipline_alignment, conceptual_clarity)
- Add VSM background context to evaluation prompt so LLM can
  score vsm_relevance without macro injection
- Fix model_name bug: was sending literal "default" to API (HTTP 400)
- Refactor run_entity_evaluation to write files incrementally via
  callback rather than all at once after the batch — long runs are
  now resumable if interrupted
- Add incremental skip in CLI: entities with existing eval files
  are skipped automatically on re-run (acts as resume)
- Add eval-summary command: reads all eval files, shows per-dimension
  means, optionally writes per_entity_mean to metrics.yaml
- Fix record_check_results to merge rather than overwrite metrics.yaml
  so per_entity_mean survives subsequent check runs
- Add per_entity_mean viability threshold (min: 3.5) to infospace.yaml

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 01:26:45 +01:00
574bb11db6 feat(example): add supply-chain-vsm composition demo (S3.5)
Demonstrates infospace composition: the Wealth of Nations infospace is
used as a discipline, applying Smith's economic framework as a lens to
analyse modern supply chain management concepts.

New example: examples/supply-chain-vsm/
- infospace.yaml binding WoN as discipline (../infospace-with-history)
- 3 source documents: coordination mechanisms, capital & inventory,
  market structure (~400 words each, original content)
- supply-chain-entity-schema-v1.0.md with WoN Concept required section
- won-mapping-schema-v1.0.md with Conceptual Continuity rating
- artifacts/won-reference/core-entities.md — 12 curated WoN entities
  for injection as discipline context
- 8 hand-crafted entity files demonstrating LLM output format
- 3 mapping files with full rationale and VSM inheritance chains
- Viable: YES (5/5 thresholds)

Key mappings demonstrated:
  Demand Signal          → Effectual Demand        (Strong, S2)
  Vendor-Managed Inventory → Division of Labour    (Strong, S1/S2)
  Just-in-Time Inventory → Circulating Capital     (Strong, S1/S3)
  Bullwhip Effect        → Natural Price           (Moderate, S2)
  Platform Intermediary  → Merchant Capital        (Strong, S2/S4)
  Monopsony Power        → Combination of Masters  (Strong, S3*)

Platform fix: entity_parser.py now recognises ## Supply Chain Domain
as a domain alias for ## Economic Domain, enabling composed infospaces
to use their own domain section name.

Tutorial §13 rewritten with real commands, real output, and the full
mapping table from the demo.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 00:08:51 +01:00
8f00fa2018 docs(tutorial): update all commands to use markitect infospace CLI (S3.4)
Replace all process_chapters.py references throughout the tutorial with
the correct markitect infospace subcommands:

- §2  Project layout: remove process_chapters.py, add LAYERED-DEVELOPMENT.md
- §7  Processing: --chapter → process "glob", --book N → "book-N-*.md",
      --list → status/entities, --archive-entity → documented manual step
- §8  Check: remove incorrect --provider flag; note checks are deterministic
- §9  Viability: real output from full 988-entity corpus (Viable: YES)
- §10 History: real snapshot table; add --metric flag example
- §10 Git tracking: remove process_chapters.py from commit example
- §11 Cost: update openrouter/free example command
- §12 Completion: rewrite with actual observed metric progression table
- §14 Quality loop: update all commands; add archive-entity manual procedure
- §15 Artifact DB: --all without --provider = dry-run (no LLM calls)
- §16 Adapting: update step 6 and 7 to new CLI

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 23:31:38 +01:00
c861520ccd docs(example): add layered development concept and extend tutorial
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
Adds LAYERED-DEVELOPMENT.md documenting the concept for evolving a flat
entity collection into a structured systemic model through four layers:

  L0 Source text → L1 Raw entities (current) → L2 Typed entities
  → L3 Relation graph → L4 Minimal systemic model

Covers: the element/relation/principle/institution type taxonomy,
VSM as a structural coordinate system, the type × VSM coverage matrix,
triplet extraction with a controlled predicate vocabulary, feedback loop
detection, and the distillation hypothesis for finding the generative
core of a corpus.

Extends TUTORIAL.md with sections 17–23:
  17. Observing entity heterogeneity
  18. The four-layer model overview
  19. Layer 2 — classifying entities (schema, pipeline stage, metrics)
  20. Layer 3 — extracting the relation graph (triplets, feedback loops)
  21. Layer 4 — the minimal systemic model (core-model.md output)
  22. Planned CLI commands for layers 2–4
  23. Layers 2–4 as composed infospaces

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 10:43:32 +01:00
9c32ad1837 fix(infospace): exclude raw LLM output from entity parsing; lower coverage threshold
- Add `.*-raw\.md$` to `_DEFAULT_EXCLUDE_PATTERNS` in entity_parser.py to
  prevent per-chapter raw LLM output files from being parsed as entities.
  This eliminates 33 malformed domain values where delimiter text was
  bleeding into the Economic Domain field.
- Lower coverage_ratio threshold from 0.50 → 0.40 in infospace.yaml to
  reflect realistic multi-book corpus expectations (documented rationale
  in METRICS-METHODOLOGY.md).

Post-fix metrics: 988 entities, 0 malformed, coverage_ratio=0.619 (pass).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 09:28:20 +01:00
7c38f9b427 merge(reprocess-v2): complete pipeline rewrite and full corpus processing
Merges the reprocess-v2 branch into main, covering:

Infrastructure changes:
- markitect infospace process — new CLI command for batch source processing
- SourcePipeline — @{macro} substitution, skip-if-exists, git commit per source
- PipelineStage config extended with name, output_dir, output_macro,
  split_entities, macros, max_tokens fields
- Per-stage max_tokens (extract=8k, map-to-vsm=10k, synthesize=4k)
- LLM provenance comment in each new entity file
- output/processing-log.yaml with per-source token/cost/duration/retry stats
- Retry on all LLM errors (not just rate limits) with 5s back-off
- C2 coverage: add domain_densities, density_std, cross_cutting_ratio

Example (infospace-with-history):
- All 35 chapters processed: 1021 entities across Books 1–5
- Per-chapter git commits showing metric evolution from 0 → final state
- Final metrics: coverage=0.44, granularity=2.95, redundancy=0.006
- METRICS-METHODOLOGY.md C2 section corrected and expanded

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 00:11:39 +01:00
dfe56a4f9b docs(metrics): clarify C2 coverage — domain×chapter matrix, not domain×VSM
- coverage.py: rewrite module docstring to explain what the metric actually
  computes (domain × chapter cross-tabulation, not VSM system coverage),
  what it does not capture (entity connectivity → C3), and when the
  threshold is appropriate
- CoverageReport: add domain_densities, density_std, cross_cutting_ratio
  for distribution-level insight beyond the aggregate ratio
- check_coverage: compute per-domain density and cross-cutting ratio
- METRICS-METHODOLOGY.md: correct C2 section to match implementation,
  document the distribution-based interpretation, add implementation status
  table distinguishing what is wired vs planned

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 00:08:46 +01:00
0f54f094e4 chore(example): final metrics snapshot — all 35 chapters processed
1021 entities extracted across all Books 1-5 of The Wealth of Nations.
Final metrics: coverage=0.4424, granularity=2.9533, redundancy=0.0059.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 22:54:54 +01:00
4a15a50337 infospace: process book-5-chapter-03
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 22:54:40 +01:00
92dfe367c7 infospace: process book-5-chapter-02
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 22:46:32 +01:00
23c397e46a infospace: process book-5-chapter-01
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 22:36:06 +01:00
e695ddfbbd infospace: process book-4-chapter-09
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 22:32:07 +01:00
5245dbbfc8 infospace: process book-4-chapter-08
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 22:25:52 +01:00
4319d2a32b infospace: process book-4-chapter-07
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 22:14:18 +01:00
efdaa884c8 infospace: process book-4-chapter-06
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 22:01:44 +01:00
2804de3d24 infospace: process book-4-chapter-05
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 21:47:52 +01:00
3e96ac7b8d infospace: process book-4-chapter-04
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 21:36:17 +01:00
a687e508f3 infospace: process book-4-chapter-03
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 21:31:40 +01:00
da9c5fce80 infospace: process book-4-chapter-02
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 21:19:39 +01:00
cd87ebfdc0 infospace: process book-4-chapter-01
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 21:13:08 +01:00
666f78d1ba infospace: process book-4-introduction
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 21:02:00 +01:00
579e02989b infospace: process book-3-chapter-04
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 20:46:20 +01:00
8401c69ff2 infospace: process book-3-chapter-03
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 20:40:35 +01:00
1b9a31665c fix(pipeline): retry on all LLM errors (not just rate limits)
Free-tier APIs intermittently return invalid JSON or empty responses.
Now any exception in _call_llm retries up to 3 times with a 5s back-off,
rather than failing immediately on non-rate-limit errors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 20:32:23 +01:00
06e904ccf5 infospace: process book-3-chapter-02
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 20:30:22 +01:00