Commit Graph

94 Commits

Author SHA1 Message Date
e3e5b8ecc1 feat(infospace): systematic long-text processing — rich commit bodies, per-source eval/classify, chapters view
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
Three coordinated changes that let the pipeline produce a clean
chapter-by-chapter git history on long texts without archaeology after
the fact.

1. Richer commit messages. `SourcePipeline._git_commit` now diffs the
   staged changes, buckets added files by output subdirectory (entities,
   evaluations, classifications, mappings, analyses, metrics, logs), and
   includes counts in the commit body. So `git log` reads "entities:
   +23, evaluations: +23" per chapter instead of the same generic blurb
   on every commit. Zero behaviour change when no output changed; falls
   back to the original message if the diff query fails.

2. --eval-after-source / --classify-after-source on `infospace process`.
   After a source's stages succeed, the pipeline identifies which entity
   files are *new* (set diff of entity slugs before vs after), loads
   their EntityMeta, and runs per-entity evaluation and/or
   classification scoped to just those slugs before the per-source git
   commit lands. Result: each chapter's commit is self-contained —
   extraction + evaluation + classification in one atomic unit. Gated
   behind explicit flags because the cost is real (LLM latency per
   chapter rather than amortised across one bulk batch).

3. `markitect infospace chapters` subcommand. Lists source files in
   canonical order with entity count, evaluated count, classified
   count, and mean per-entity score per source. Text or JSON output.
   Natural triage surface for long-text infospaces — spot chapters that
   under-extracted or evaluated poorly.
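
The bucketing described in item 1 can be sketched roughly as follows. This is an illustration only, not the actual `SourcePipeline._git_commit` code: the helper name `summarize_added_files`, the assumed `output/<bucket>/<file>` layout, and the exact body format are all assumptions.

```python
from collections import Counter
from pathlib import PurePosixPath

# Bucket names taken from the commit message above.
BUCKETS = ("entities", "evaluations", "classifications",
           "mappings", "analyses", "metrics", "logs")


def summarize_added_files(added_paths, output_root="output"):
    """Return one 'bucket: +N' string per non-empty bucket (hypothetical helper)."""
    counts = Counter()
    for raw in added_paths:
        parts = PurePosixPath(raw).parts
        # Assumed layout: <output_root>/<bucket>/<file>; anything else is ignored.
        if len(parts) >= 3 and parts[0] == output_root and parts[1] in BUCKETS:
            counts[parts[1]] += 1
    return [f"{name}: +{counts[name]}" for name in BUCKETS if counts[name]]


# Illustrative list of files added in one per-source commit.
added = [
    "output/entities/division_of_labour.md",
    "output/entities/bank_notes.md",
    "output/evaluations/division_of_labour.md",
    "README.md",
]
body_lines = summarize_added_files(added)
print(", ".join(body_lines))
```

An empty result would correspond to the "no output changed" case, where the original generic message is kept.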

Also: `docs/advanced-usage.md` gets a new "Systematic processing of
long texts" section with the recommended flag combo and the tradeoff
note on cost.

11 new unit tests cover the chapters command (text/json/no-sources),
the process flag wiring (help + provider requirement), and the
commit-body bucket logic. Full infospace+llm unit suite (315 tests)
green; 3 pre-existing infospace failures unchanged.
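
For reference, the "set diff of entity slugs before vs after" step from item 2 might look like the sketch below. It assumes that a slug is simply the stem of a `*.md` file in the entities directory; the helper name `entity_slugs` is hypothetical.

```python
import tempfile
from pathlib import Path


def entity_slugs(entities_dir):
    """Collect slugs, assumed here to be the stems of *.md files."""
    return {p.stem for p in Path(entities_dir).glob("*.md")}


with tempfile.TemporaryDirectory() as tmp:
    entities_dir = Path(tmp)
    (entities_dir / "division_of_labour.md").write_text("# existing entity\n")
    before = entity_slugs(entities_dir)

    # Stand-in for running one source's pipeline stages.
    (entities_dir / "bank_notes.md").write_text("# new entity\n")
    after = entity_slugs(entities_dir)

    # Only these slugs would be evaluated or classified before the commit.
    new_slugs = sorted(after - before)
    print(new_slugs)
```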

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 08:24:26 +02:00
f325f89dc9 feat(infospace): evaluate 3 missing WoN entities (C.1)
Fills the 988 entity / 985 evaluation gap in the Wealth of Nations
infospace. Entities advanced_state_of_society, bank_notes, and
bank_systemic_risk_management had no evaluation files; running them
through Gemini (2.5-flash, with 2.5-flash-lite for the last one after it
hit the free-tier RPM limit) brings the eval count to 988.

per_entity_mean nudged from 3.955635 to 3.95668; viability still
6/6 PASS.
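
As a sanity check on these figures, the two reported means imply the average score of the three new evaluations. A small sketch of that arithmetic follows; the function name is illustrative, and both inputs are rounded values from this message, so the result is approximate.

```python
def implied_new_mean(old_mean, old_n, new_mean, new_n):
    """Mean of the newly added items implied by two running means."""
    added = new_n - old_n
    if added <= 0:
        raise ValueError("new_n must exceed old_n")
    # Total score after minus total score before, spread over the new items.
    return (new_mean * new_n - old_mean * old_n) / added


implied = implied_new_mean(3.955635, 985, 3.95668, 988)
print(round(implied, 2))
```

The result comes out at roughly 4.3, which is consistent with the overall mean moving up slightly.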

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 23:52:04 +02:00
36a5136bdf docs(infospace): add advanced-usage, composition guide, and performance notes (C.4/C.5/C.6)
Closes out three docs tasks from roadmap/infospace-s3-closeout/PLAN.md:

- examples/infospace-with-history/docs/advanced-usage.md (C.4) — 5 worked
  patterns covering incremental eval, re-eval workflow (no --force flag
  exists; documents the rm-then-re-run pattern instead), interpreting the
  eval-summary distribution, triaging low scorers via an awk pipeline
  over overall_score (since `entities --sort-by score` does not exist),
  and acting on check --json output.
- docs/composition-guide.md (C.5) — walks through how supply-chain-vsm
  binds WoN as a discipline, then a step-by-step for creating a new
  infospace that binds an existing one. Includes live output from
  `markitect infospace disciplines`.
- examples/infospace-with-history/docs/performance-notes.md (C.6) — cites
  the 6h 28m wall time of the 985-entity S3.3 batch, ~2.5 ent/min rate,
  ~2000–3000 tokens/entity estimate, word_overlap vs embedding backend
  for redundancy checks, and a provider-by-scale recommendation table.

All commands in these docs were run against the live infospace at
commit time.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 07:02:46 +02:00
b7e11461f4 chore: rename markitect_project to markitect-main across project
Finishes the in-progress rename so docs, configs, tests, and capability
manifests all reference the current repo name consistently. Fixes two
tests (test_roundtrip_consolidated.py, test_issue_140_roundtrip_simplified.py)
whose hardcoded cwd paths would have broken under the renamed directory.

Archival content under history/, reports/, and roadmap/eat-the-frog/, plus
derived artifacts (.venv_old/, node_modules/, asset_registry.json) are
intentionally left untouched.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 01:57:35 +02:00
b055c8d7bb docs(example): close out INFRA-TASKS with summary and 4 follow-up items
Adds a closing remark (23 Feb 2026) summarising the final state of the
infospace: 988 entities, 985 evaluations, 823 L2 classifications, 15 L3
relations, viability 6/6 PASS.

New open tasks 20–23:
  20. Complete L2 classification batch (165 entities blocked on credits)
  21. Run classify-links for 58 Relation-type entities
  22. Refresh stale metrics-report.md narrative
  23. Smoke-test the graph command end-to-end

Also committed: history.py fix — write_metrics_file now preserves
non-float metric values (type_distribution dict) instead of crashing
on round().
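
A minimal sketch of the kind of guard this fix describes, assuming metrics are held in a plain dict. The helper name `round_metric_values` is hypothetical and is not the real `write_metrics_file`.

```python
def round_metric_values(metrics, ndigits=4):
    """Round float metrics; pass every other value through unchanged."""
    cleaned = {}
    for key, value in metrics.items():
        if isinstance(value, float):
            cleaned[key] = round(value, ndigits)
        else:
            # Calling round() on a dict raises TypeError, so dicts
            # (and ints, strings, lists) are kept as they are.
            cleaned[key] = value
    return cleaned


metrics = {
    "per_entity_mean": 3.955635,
    "type_distribution": {"Element": 315, "Process": 226},
}
cleaned = round_metric_values(metrics)
print(cleaned)
```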

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 13:45:58 +01:00
d1f57272a4 feat(example): add L2 classifications for 823/988 WoN entities (S3.4)
Batch classification via OpenRouter (claude-sonnet-4). 165 entities
remain unclassified due to credit exhaustion; incremental skip means
a follow-up run will complete them automatically.

Type × VSM matrix (823 entities):
                  S1   S2   S3  S3*   S4   S5
  Element         86   75   58   21   43   32  (315 total, 38%)
  Process         39   42   37   17   67   24  (226 total, 28%)
  Institution      4   12   30   24    .   52  (122 total, 15%)
  Principle        3    7   15    2   43   32  (102 total, 12%)
  Relation         2   14    5    5   22   10   (58 total,  7%)
  Matrix fill: 29/30 cells (Institution/S4 empty — expected)

Metrics updated: type_entropy=2.0936, vsm_type_matrix_cells=29
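
The totals and metrics above can be re-derived from the matrix. The sketch below assumes `type_entropy` is the base-2 Shannon entropy of the entity-type distribution and that `vsm_type_matrix_cells` counts non-empty cells; both definitions are assumptions, although the numbers agree with the reported values.

```python
import math

VSM = ["S1", "S2", "S3", "S3*", "S4", "S5"]
# Counts copied from the matrix above; the "." cell is recorded as 0.
MATRIX = {
    "Element":     [86, 75, 58, 21, 43, 32],
    "Process":     [39, 42, 37, 17, 67, 24],
    "Institution": [4, 12, 30, 24, 0, 52],
    "Principle":   [3, 7, 15, 2, 43, 32],
    "Relation":    [2, 14, 5, 5, 22, 10],
}

type_totals = {name: sum(row) for name, row in MATRIX.items()}
total = sum(type_totals.values())
filled_cells = sum(1 for row in MATRIX.values() for n in row if n > 0)

# Shannon entropy (base 2) over the five row totals.
type_entropy = -sum(
    (n / total) * math.log2(n / total) for n in type_totals.values() if n
)

print(total, filled_cells, round(type_entropy, 3))
```

This reproduces the 823 entities and 29 filled cells, and the entropy agrees with the reported 2.0936 to about three decimal places.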

Also:
- BatchEvaluator gains delay_seconds param for rate-limited providers
- classify CLI gains --rpm option (--rpm 10 for Gemini free tier)
- history.write_metrics_file now handles non-float metric values
  (type_distribution is a dict, was crashing round())
- run_entity_classification forwards delay_seconds to BatchEvaluator
- classify-links and graph commands added by user (entities --by-type,
  graph --format mermaid/dot, classify-links for Relation enrichment)
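
One plausible way an `--rpm` value could translate into a per-request delay is sketched below. How the real CLI maps `--rpm` to `delay_seconds` is not stated in this message, so the `60 / rpm` rule and both helper names are assumptions.

```python
import time


def delay_for_rpm(rpm):
    """Seconds to wait between requests to stay at or below `rpm` per minute."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60.0 / rpm


def run_batch(items, call, delay_seconds=0.0, sleep=time.sleep):
    """Call `call` on each item, pausing between consecutive requests."""
    results = []
    for index, item in enumerate(items):
        if index and delay_seconds:
            sleep(delay_seconds)  # no pause before the first request
        results.append(call(item))
    return results


# Record the waits instead of sleeping so the example runs instantly.
waits = []
out = run_batch(["a", "b", "c"], str.upper, delay_for_rpm(10), sleep=waits.append)
print(out, waits)
```

With `--rpm 10` this rule gives a 6-second gap between calls.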

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 12:49:11 +01:00
a9ca0adfcf feat(example): add per-entity LLM evaluations for 985 WoN entities (S3.3)
Batch evaluation of all 988 entities via OpenRouter. 984 succeeded on
first pass; 3 failed (network errors). eval-summary --update-metrics
written with per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
  redundancy_ratio     0.0061  (max 0.10)
  coverage_ratio       0.6190  (min 0.40)
  coherence_comps      0.0000  (max 3)
  consistency_cycles   0.0000  (max 0)
  granularity_entropy  2.6748  (min 1.0)
  per_entity_mean      3.9556  (min 3.5)
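
The pass/fail logic behind the dashboard can be sketched as a simple bounds check. The values and bounds are copied from the dashboard above; treating the bounds as inclusive is an assumption, and the structure is illustrative rather than the real viability code.

```python
# (metric, observed value, bound kind, bound) from the dashboard above.
CHECKS = [
    ("redundancy_ratio", 0.0061, "max", 0.10),
    ("coverage_ratio", 0.6190, "min", 0.40),
    ("coherence_comps", 0.0, "max", 3),
    ("consistency_cycles", 0.0, "max", 0),
    ("granularity_entropy", 2.6748, "min", 1.0),
    ("per_entity_mean", 3.9556, "min", 3.5),
]


def passes(value, kind, bound):
    """Inclusive comparison: 'max' means value <= bound, 'min' means value >= bound."""
    return value <= bound if kind == "max" else value >= bound


results = {name: passes(value, kind, bound) for name, value, kind, bound in CHECKS}
passed = sum(results.values())
print(f"{passed}/{len(results)} PASS")
```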

Dimension breakdown (mean across 985 entities):
  definition_precision  3.62
  source_grounding      4.36
  domain_placement      4.56
  vsm_relevance         3.31
  explanatory_value     3.94

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:36:46 +01:00
81a4c8796a feat(infospace): add L2 entity classification with type × VSM matrix (S2.9)
Implements the L2 typed-entities layer — each entity is assigned an
Entity Type (Element, Process, Relation, Principle, Institution) and a
VSM System (S1–S5) by an LLM, with one-sentence rationales for each.

New modules:
- markitect/infospace/classification.py — EntityClassification dataclass
  + ENTITY_TYPES / VSM_SYSTEMS controlled vocabularies
- markitect/infospace/classification_io.py — write/read classification
  files (YAML frontmatter + markdown body, mirrors evaluation_io)
- markitect/infospace/classifier.py — build_classification_prompt(),
  parse_classification_response(), run_entity_classification(); batch
  runner writes files incrementally (same resumable pattern as evaluate)
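
To make the controlled vocabularies concrete, here is a hedged sketch of a validated classification record. The type names come from this message; including `S3*` in `VSM_SYSTEMS` is inferred from the 5 × 6 matrix, and the field names are hypothetical, not the real `EntityClassification` definition.

```python
from dataclasses import dataclass

ENTITY_TYPES = ("Element", "Process", "Relation", "Principle", "Institution")
# Six systems assumed, matching a 5 x 6 type x VSM matrix.
VSM_SYSTEMS = ("S1", "S2", "S3", "S3*", "S4", "S5")


@dataclass(frozen=True)
class EntityClassification:
    slug: str
    entity_type: str
    vsm_system: str
    type_rationale: str = ""  # one-sentence rationale (field name assumed)
    vsm_rationale: str = ""   # one-sentence rationale (field name assumed)

    def __post_init__(self):
        # Reject anything outside the controlled vocabularies.
        if self.entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {self.entity_type!r}")
        if self.vsm_system not in VSM_SYSTEMS:
            raise ValueError(f"unknown VSM system: {self.vsm_system!r}")


seed = EntityClassification("division_of_labour", "Process", "S1")
print(seed.entity_type, seed.vsm_system)
```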

CLI: markitect infospace classify [--entity SLUG] [--provider P] [--model M]
  - Incremental skip: checks output/classifications/ for existing files
  - Defaults to openrouter provider; 2000 max_tokens (Gemini 2.5 Flash
    uses ~787 thinking tokens, so 800 was too low)

CLI: markitect infospace classify-summary [--update-metrics]
  - Entity type counts + VSM system counts with percentages
  - 5 × 6 type × VSM matrix (spots structural blind spots at a glance)
  - --update-metrics writes type_distribution, type_entropy,
    vsm_type_matrix_cells to metrics.yaml

Config: InfospaceConfig gains classifications_dir (default output/classifications)
Schema: schemas/typed-entity-schema-v1.0.md — type/VSM vocabulary tables,
  rationale format rules, validation rules, metrics enabled at L2
infospace.yaml: schemas.typed_entity references typed-entity-schema-v1.0.md

Seed classifications (3): division_of_labour (Process/S1),
  natural_price_as_central_price (Principle/S2),
  invisible_hand_mechanism (Principle/S4)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:35:58 +01:00
2d45425b25 feat(infospace): add L3 relation graph with VSM-aware triplets (S2.8)
Implements the L3 relation graph layer — a directed graph of (Subject,
Predicate, Object) triplets annotated with VSM channel codes and feedback
roles. Triplets are authored as markdown files under output/relations/,
parsed into RelationMeta dataclasses, and analysed with networkx.

New modules:
- markitect/infospace/relation_models.py — RelationMeta dataclass +
  RELATION_TYPES controlled vocabulary (15 relation classes → VSM codes)
- markitect/infospace/relation_parser.py — parse_relation_file() and
  parse_relations_directory()

New schema: examples/infospace-with-history/schemas/relation-schema-v1.0.md
  — file naming convention, required sections, controlled vocabulary table

15 seed relation files covering the three core WoN feedback loops:
  - Capital Accumulation loop (positive reinforcement, S1/S3)
  - Market Price Balancing loop (negative feedback, S2/S3)
  - Market Extent mutual dependency (S1/S2)
  Plus structural relations: wages regulation, rent residual, price
  decomposition, invisible hand coordination

CLI: markitect infospace relations [--entity SLUG] [--vsm FILTER]
     [--loops] [--stats]
  - Builds directed graph from parsed files
  - Detects feedback loops via nx.simple_cycles()
  - 6 loops found from 15 seed relations (3 intended + 3 emergent)
  - --stats aggregates by VSM system code (strips parentheticals)
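
Feedback-loop detection amounts to enumerating simple cycles in the directed graph. The commit uses `nx.simple_cycles()`; the sketch below is a dependency-free stand-in suitable only for small graphs, and the edge names are invented for illustration (they are not the seed relation files).

```python
def simple_cycles(edges):
    """Enumerate each simple cycle of a small directed graph exactly once."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    order = {node: i for i, node in enumerate(sorted(graph))}
    cycles = []

    def walk(start, node, path, on_path):
        for nxt in sorted(graph[node]):
            if nxt == start:
                cycles.append(path[:])  # closed a loop back to the root
            elif order[nxt] > order[start] and nxt not in on_path:
                # Only visit nodes ordered after the root, so every cycle
                # is reported from its smallest node and never twice.
                path.append(nxt)
                on_path.add(nxt)
                walk(start, nxt, path, on_path)
                on_path.discard(nxt)
                path.pop()

    for start in sorted(graph):
        walk(start, start, [start], {start})
    return cycles


edges = [
    ("capital", "division_of_labour"),
    ("division_of_labour", "productivity"),
    ("productivity", "capital"),
    ("market_price", "supply"),
    ("supply", "market_price"),
    ("productivity", "market_price"),
]
loops = simple_cycles(edges)
print(loops)
```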

Config: InfospaceConfig gains relations_dir (default output/relations)
infospace.yaml: schemas.relation references relation-schema-v1.0.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 06:04:28 +01:00
fa27572f43 fix(example): skip prompt writes when output exists, add quality rubrics
INFRA-TASKS #5 — process_chapters.py now skips writing *-prompt.md files
when the corresponding output file already exists on disk. DB-only rebuilds
no longer dirty the working tree with unchanged prompt content.

INFRA-TASKS #8 — Added '## Quality Metrics' section to the entity and VSM
mapping schemas, defining the five evaluation dimensions (Definition Precision,
Source Grounding, Domain Placement, VSM Relevance, Explanatory Value) with
1–5 rubrics used by the evaluate-entity template.

Also updated INFRA-TASKS.md to reflect current resolution status for tasks
4–19 across S2 and S3.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 06:04:09 +01:00
34ed7a6fab docs(tutorial): update §8-9 for eval-summary command and 6/6 viability
- Add eval-summary command documentation with dimension descriptions
- Document resumable evaluate (incremental skip on re-run)
- Fix --entity slug example to use underscores (not hyphens)
- Update viability output to show per_entity_mean as 6th threshold
- Add workflow note: check → eval-summary --update-metrics → viability

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 05:33:11 +01:00
7f1eecbdb2 feat(infospace): add eval-summary command and improve evaluate pipeline (S3.3)
- Fix evaluate dimensions to match template file:
  definition_precision, source_grounding, domain_placement,
  vsm_relevance, explanatory_value (was domain_relevance,
  discipline_alignment, conceptual_clarity)
- Add VSM background context to evaluation prompt so LLM can
  score vsm_relevance without macro injection
- Fix model_name bug: was sending literal "default" to API (HTTP 400)
- Refactor run_entity_evaluation to write files incrementally via
  callback rather than all at once after the batch — long runs are
  now resumable if interrupted
- Add incremental skip in CLI: entities with existing eval files
  are skipped automatically on re-run (acts as resume)
- Add eval-summary command: reads all eval files, shows per-dimension
  means, optionally writes per_entity_mean to metrics.yaml
- Fix record_check_results to merge rather than overwrite metrics.yaml
  so per_entity_mean survives subsequent check runs
- Add per_entity_mean viability threshold (min: 3.5) to infospace.yaml
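
The resumable behaviour described above (write each result as soon as it exists, skip entities that already have a file) can be sketched as follows. The `<slug>.md` file naming and the `run_resumable` helper are assumptions, not the real `run_entity_evaluation` signature.

```python
import tempfile
from pathlib import Path


def run_resumable(slugs, evaluate, out_dir):
    """Evaluate slugs one by one, writing each result immediately."""
    out_dir = Path(out_dir)
    written = []
    for slug in slugs:
        target = out_dir / f"{slug}.md"
        if target.exists():
            continue  # incremental skip: already evaluated on an earlier run
        target.write_text(evaluate(slug))
        written.append(slug)
    return written


calls = []


def flaky(slug):
    """Stand-in evaluator that fails once on slug 'c'."""
    calls.append(slug)
    if slug == "c" and calls.count("c") == 1:
        raise ConnectionError("simulated network error")
    return f"# evaluation for {slug}\n"


with tempfile.TemporaryDirectory() as tmp:
    try:
        run_resumable(["a", "b", "c"], flaky, tmp)
    except ConnectionError:
        pass  # interrupted; "a" and "b" are already on disk
    resumed = run_resumable(["a", "b", "c"], flaky, tmp)
    print(resumed, calls)
```

On the second run only the missing entity is evaluated, which is what makes a plain re-run act as a resume.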

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 01:26:45 +01:00
574bb11db6 feat(example): add supply-chain-vsm composition demo (S3.5)
Demonstrates infospace composition: the Wealth of Nations infospace is
used as a discipline, applying Smith's economic framework as a lens to
analyse modern supply chain management concepts.

New example: examples/supply-chain-vsm/
- infospace.yaml binding WoN as discipline (../infospace-with-history)
- 3 source documents: coordination mechanisms, capital & inventory,
  market structure (~400 words each, original content)
- supply-chain-entity-schema-v1.0.md with WoN Concept required section
- won-mapping-schema-v1.0.md with Conceptual Continuity rating
- artifacts/won-reference/core-entities.md — 12 curated WoN entities
  for injection as discipline context
- 8 hand-crafted entity files demonstrating LLM output format
- 3 mapping files with full rationale and VSM inheritance chains
- Viable: YES (5/5 thresholds)

Key mappings demonstrated:
  Demand Signal          → Effectual Demand        (Strong, S2)
  Vendor-Managed Inventory → Division of Labour    (Strong, S1/S2)
  Just-in-Time Inventory → Circulating Capital     (Strong, S1/S3)
  Bullwhip Effect        → Natural Price           (Moderate, S2)
  Platform Intermediary  → Merchant Capital        (Strong, S2/S4)
  Monopsony Power        → Combination of Masters  (Strong, S3*)

Platform fix: entity_parser.py now recognises ## Supply Chain Domain
as a domain alias for ## Economic Domain, enabling composed infospaces
to use their own domain section name.
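
A sketch of how a parser might accept either heading as the domain section. The two heading names and the `## WoN Concept` section come from this message; the regex, the `extract_domain` helper, and the sample entity text are assumptions and do not reflect the actual `entity_parser.py` code.

```python
import re

# Accepted domain headings; the second one is the alias added by this fix.
DOMAIN_HEADINGS = ("Economic Domain", "Supply Chain Domain")

_DOMAIN_RE = re.compile(
    r"^##\s+(?:" + "|".join(re.escape(h) for h in DOMAIN_HEADINGS) + r")\s*\n"
    r"(.*?)(?=^##\s|\Z)",
    re.MULTILINE | re.DOTALL,
)


def extract_domain(markdown):
    """Return the body of the first recognised domain section, or None."""
    match = _DOMAIN_RE.search(markdown)
    return match.group(1).strip() if match else None


# Illustrative entity file for a composed infospace.
entity = (
    "# Demand Signal\n\n"
    "## Supply Chain Domain\n"
    "Coordination\n\n"
    "## WoN Concept\n"
    "Effectual Demand\n"
)
domain = extract_domain(entity)
print(domain)
```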

Tutorial §13 rewritten with real commands, real output, and the full
mapping table from the demo.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 00:08:51 +01:00
8f00fa2018 docs(tutorial): update all commands to use markitect infospace CLI (S3.4)
Replace all process_chapters.py references throughout the tutorial with
the correct markitect infospace subcommands:

- §2  Project layout: remove process_chapters.py, add LAYERED-DEVELOPMENT.md
- §7  Processing: --chapter → process "glob", --book N → "book-N-*.md",
      --list → status/entities, --archive-entity → documented manual step
- §8  Check: remove incorrect --provider flag; note checks are deterministic
- §9  Viability: real output from full 988-entity corpus (Viable: YES)
- §10 History: real snapshot table; add --metric flag example
- §10 Git tracking: remove process_chapters.py from commit example
- §11 Cost: update openrouter/free example command
- §12 Completion: rewrite with actual observed metric progression table
- §14 Quality loop: update all commands; add archive-entity manual procedure
- §15 Artifact DB: --all without --provider = dry-run (no LLM calls)
- §16 Adapting: update step 6 and 7 to new CLI

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 23:31:38 +01:00
c861520ccd docs(example): add layered development concept and extend tutorial
Adds LAYERED-DEVELOPMENT.md documenting the concept for evolving a flat
entity collection into a structured systemic model through four layers:

  L0 Source text → L1 Raw entities (current) → L2 Typed entities
  → L3 Relation graph → L4 Minimal systemic model

Covers: the element/relation/principle/institution type taxonomy,
VSM as a structural coordinate system, the type × VSM coverage matrix,
triplet extraction with a controlled predicate vocabulary, feedback loop
detection, and the distillation hypothesis for finding the generative
core of a corpus.

Extends TUTORIAL.md with sections 17–23:
  17. Observing entity heterogeneity
  18. The four-layer model overview
  19. Layer 2 — classifying entities (schema, pipeline stage, metrics)
  20. Layer 3 — extracting the relation graph (triplets, feedback loops)
  21. Layer 4 — the minimal systemic model (core-model.md output)
  22. Planned CLI commands for layers 2–4
  23. Layers 2–4 as composed infospaces

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 10:43:32 +01:00
9c32ad1837 fix(infospace): exclude raw LLM output from entity parsing; lower coverage threshold
- Add `.*-raw\.md$` to `_DEFAULT_EXCLUDE_PATTERNS` in entity_parser.py to
  prevent per-chapter raw LLM output files from being parsed as entities.
  This eliminates 33 malformed domain values where delimiter text was
  bleeding into the Economic Domain field.
- Lower coverage_ratio threshold from 0.50 → 0.40 in infospace.yaml to
  reflect realistic multi-book corpus expectations (documented rationale
  in METRICS-METHODOLOGY.md).
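
The effect of the new exclude pattern can be illustrated with a small filter. Only the `.*-raw\.md$` pattern comes from this commit; the other entries of `_DEFAULT_EXCLUDE_PATTERNS` are not shown here, and the helper name and sample file names are assumptions.

```python
import re

# Single pattern taken from the commit; the real default list is longer.
EXCLUDE_PATTERNS = [re.compile(r".*-raw\.md$")]


def is_entity_file(filename):
    """True for markdown files that no exclude pattern matches."""
    if not filename.endswith(".md"):
        return False
    return not any(p.match(filename) for p in EXCLUDE_PATTERNS)


names = ["division_of_labour.md", "book-1-chapter-05-raw.md", "raw_produce.md"]
kept = [n for n in names if is_entity_file(n)]
print(kept)
```

Note that the pattern only excludes names ending in `-raw.md`, so an entity whose slug merely starts with `raw` is still parsed.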

Post-fix metrics: 988 entities, 0 malformed, coverage_ratio=0.619 (pass).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 09:28:20 +01:00
7c38f9b427 merge(reprocess-v2): complete pipeline rewrite and full corpus processing
Merges the reprocess-v2 branch into main, covering:

Infrastructure changes:
- markitect infospace process — new CLI command for batch source processing
- SourcePipeline — @{macro} substitution, skip-if-exists, git commit per source
- PipelineStage config extended with name, output_dir, output_macro,
  split_entities, macros, max_tokens fields
- Per-stage max_tokens (extract=8k, map-to-vsm=10k, synthesize=4k)
- LLM provenance comment in each new entity file
- output/processing-log.yaml with per-source token/cost/duration/retry stats
- Retry on all LLM errors (not just rate limits) with 5s back-off
- C2 coverage: add domain_densities, density_std, cross_cutting_ratio
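
The retry policy in the list above (retry on any LLM error, fixed 5s back-off) might look like the sketch below. The attempt limit is an assumption, since this message does not state one, and `call_with_retry` is a hypothetical helper.

```python
import time


def call_with_retry(fn, max_attempts=3, backoff_seconds=5.0, sleep=time.sleep):
    """Call fn(); on any exception wait a fixed back-off and try again."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:  # every error is retried, not only rate limits
            last_error = exc
            if attempt < max_attempts:
                sleep(backoff_seconds)
    raise last_error


attempts = {"n": 0}


def flaky():
    """Stand-in LLM call that fails twice, then succeeds."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("simulated provider error")
    return "ok"


# Record the waits instead of sleeping so the example runs instantly.
sleeps = []
result = call_with_retry(flaky, sleep=sleeps.append)
print(result, sleeps)
```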

Example (infospace-with-history):
- All 35 chapters processed: 1021 entities across Books 1–5
- Per-chapter git commits showing metric evolution from 0 → final state
- Final metrics: coverage=0.44, granularity=2.95, redundancy=0.006
- METRICS-METHODOLOGY.md C2 section corrected and expanded

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 00:11:39 +01:00
dfe56a4f9b docs(metrics): clarify C2 coverage — domain×chapter matrix, not domain×VSM
- coverage.py: rewrite module docstring to explain what the metric actually
  computes (domain × chapter cross-tabulation, not VSM system coverage),
  what it does not capture (entity connectivity → C3), and when the
  threshold is appropriate
- CoverageReport: add domain_densities, density_std, cross_cutting_ratio
  for distribution-level insight beyond the aggregate ratio
- check_coverage: compute per-domain density and cross-cutting ratio
- METRICS-METHODOLOGY.md: correct C2 section to match implementation,
  document the distribution-based interpretation, add implementation status
  table distinguishing what is wired vs planned

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 00:08:46 +01:00
0f54f094e4 chore(example): final metrics snapshot — all 35 chapters processed
1021 entities extracted across all Books 1-5 of The Wealth of Nations.
Final metrics: coverage=0.4424, granularity=2.9533, redundancy=0.0059.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 22:54:54 +01:00
4a15a50337 infospace: process book-5-chapter-03
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 22:54:40 +01:00
92dfe367c7 infospace: process book-5-chapter-02
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 22:46:32 +01:00
23c397e46a infospace: process book-5-chapter-01
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 22:36:06 +01:00
e695ddfbbd infospace: process book-4-chapter-09
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 22:32:07 +01:00
5245dbbfc8 infospace: process book-4-chapter-08
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 22:25:52 +01:00
4319d2a32b infospace: process book-4-chapter-07
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 22:14:18 +01:00
efdaa884c8 infospace: process book-4-chapter-06
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 22:01:44 +01:00
2804de3d24 infospace: process book-4-chapter-05
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 21:47:52 +01:00
3e96ac7b8d infospace: process book-4-chapter-04
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 21:36:17 +01:00
a687e508f3 infospace: process book-4-chapter-03
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 21:31:40 +01:00
da9c5fce80 infospace: process book-4-chapter-02
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 21:19:39 +01:00
cd87ebfdc0 infospace: process book-4-chapter-01
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 21:13:08 +01:00
666f78d1ba infospace: process book-4-introduction
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 21:02:00 +01:00
579e02989b infospace: process book-3-chapter-04
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 20:46:20 +01:00
8401c69ff2 infospace: process book-3-chapter-03
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 20:40:35 +01:00
06e904ccf5 infospace: process book-3-chapter-02
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 20:30:22 +01:00
59d42b1665 infospace: process book-3-chapter-01
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 20:18:15 +01:00
8c11e13fef infospace: process book-2-chapter-05
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 20:03:11 +01:00
ac4e508aff infospace: process book-2-chapter-04
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 19:57:59 +01:00
8e1943afdb infospace: process book-2-chapter-03
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 19:50:53 +01:00
05711e541d infospace: process book-2-chapter-02
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 19:43:19 +01:00
8cb9ee6f6e infospace: process book-2-chapter-01
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 19:26:57 +01:00
db129fde6b infospace: process book-1-chapter-11
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 19:19:20 +01:00
6d9ec4e34b infospace: process book-1-chapter-10
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 18:59:36 +01:00
679f482e49 config(example): increase extract-entities max_tokens to 8000
Chapters with many pre-existing entities were still truncating at 6000 tokens
because the LLM needs space to output the full list of candidates even when
most are skipped.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 18:48:33 +01:00
368571905a infospace: process book-1-chapter-09
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 15:58:08 +01:00
9c95912d68 infospace: process book-1-chapter-08
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 15:47:12 +01:00
0828581269 infospace: process book-1-chapter-07
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 15:40:24 +01:00
283abac378 infospace: process book-1-chapter-06
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 15:29:59 +01:00
90ca14dd85 config(example): increase max_tokens for map-to-vsm (10k) and synthesize (4k)
map-to-vsm was consistently truncating at 6000 tokens; synthesize-analysis
sometimes truncated at 3000 for chapters with many entities.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 15:21:04 +01:00
098b781f92 infospace: process book-1-chapter-05
Extract entities, map to VSM, and synthesize analysis.
2026-02-19 15:20:35 +01:00