tegwick 9e8d73fa7d docs(roadmap): close out infospace tooling S3 and parent roadmap

All three stages of the infospace tooling roadmap are complete. The Wealth
of Nations / VSM example passes 6/6 viability thresholds on 988 entities,
and composition is demonstrated via the supply-chain-vsm example.

- Parent roadmap (roadmap/infospace-tooling/PLAN.md): header now shows the
  closed status with final validation metrics.
- S3 close-out plan (roadmap/infospace-s3-closeout/PLAN.md): records the
  final task dispositions. C.1–C.6 and C.8 done; C.7 (clean per-chapter
  git history) is deferred indefinitely — the task was cosmetic, its
  prerequisite branch no longer exists, and reconstructing 35 archival
  commits would not change any output files. Rationale documented inline.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-04-22 07:08:43 +02:00

Viable Infospace Tooling — Roadmap

Status: CLOSED (2026-04-22)

All three stages complete.

| Stage | Status | Notes |
|---|---|---|
| Stage 1 — Platform additions (S1.1–S1.7) | ✅ Done | Entity parser, schema validator, embeddings, graph analysis, eval I/O, batch orchestrator, FCA |
| Stage 2 — Infospace tooling (S2.1–S2.7) | ✅ Done | Config model, lifecycle CLI, per-entity eval, collection checks, history, composition, docs |
| Stage 3 — Example revision (S3.1–S3.5) | ✅ Done (except cosmetic S3.2) | See `roadmap/infospace-s3-closeout/PLAN.md` |

Final validation (Wealth of Nations / VSM example, 988 entities):

988 per-entity evaluations landed
Collection checks pass 6/6 viability thresholds (per_entity_mean = 3.957 against threshold 3.5; redundancy_ratio = 0.006; coverage_ratio = 0.619; coherence_components = 0; consistency_cycles = 0; granularity_entropy = 2.675)
Composition demonstrated via examples/supply-chain-vsm/
S3.2 (clean per-chapter git history) deferred as cosmetic-only; rationale in the close-out plan

See roadmap/infospace-s3-closeout/PLAN.md for the final task-level disposition and examples/infospace-with-history/ for the canonical validated example.

Vision

An infospace is a structured, evaluable, composable collection of concepts that explains a topic through the lens of one or more disciplines. Infospaces are the unit of knowledge work in MarkiTect.

This roadmap organises the work needed to move from the current ad-hoc example (infospace-with-history) to a general-purpose platform for creating, evaluating, maintaining, and composing infospaces.

Terminology

These terms establish the vocabulary for infospace tooling. They generalise from the Wealth of Nations / VSM example but are not specific to it.

Infospace

A curated, self-describing collection of entities (concepts, mechanisms, observations) that together explain a topic. An infospace has:

A topic — the subject matter being explained (e.g. "The Wealth of Nations", "cellular biology", "Kubernetes networking")
One or more disciplines — external frameworks applied as lenses (e.g. "Viable System Model", "category theory")
Entities — the atomic units of knowledge, each with a definition, provenance, and quality scores
Schemas — structural templates that define what a well-formed entity, mapping, or analysis looks like
Evaluations — per-entity and collection-level quality assessments
Metrics — quantitative indicators of completeness, coherence, consistency, and granularity balance

An infospace is viable when it meets threshold scores across its defined metrics — it is fit for purpose as an explanatory tool.

Topic

The subject matter an infospace is built to explain. A topic sits within a domain (broader field of knowledge) but is more specific:

Domain: Economics → Topic: The Wealth of Nations
Domain: Systems Theory → Topic: Viable System Model
Domain: Computer Science → Topic: Distributed consensus protocols

A topic provides the source material — the texts, data, or observations from which entities are extracted.

Discipline

A reusable framework of concepts applied as a lens to explore a topic. A discipline is itself an infospace — one that has been evaluated as viable and packaged for reuse.

In our example, the VSM is the discipline: a set of concepts (S1-S5, recursion, variety, viability) from systems theory, applied to the economic concepts in Smith's work.

Key property: Disciplines compose. An infospace built with one discipline can itself become a discipline for another infospace. The Wealth of Nations infospace, viewed through VSM, could become a discipline applied to a modern supply chain analysis.

Entity

The atomic unit of an infospace. An entity has:

Identity: a unique slug and human-readable title
Definition: a precise, non-circular explanation
Provenance: the source chapter, passage, and extraction context
Domain placement: which area of the topic it belongs to
Discipline mapping: how it connects to the applied discipline (e.g. which VSM system)
Quality scores: per-entity LLM-evaluated metrics
Lifecycle state: active, archived (with reason), or draft

Evaluation

A structured assessment of quality, applied at two levels:

Per-entity evaluation: scores an individual entity against quality rubrics defined in its schema (definition precision, source grounding, discipline relevance, etc.)
Collection evaluation: scores the entity set as a whole against five concerns: redundancy, coverage, coherence, consistency, and granularity balance

Evaluations are always performed by delegated LLM calls through MarkiTect's LLM integration — never by the coding agent working on infrastructure. This separation ensures that domain-level judgment stays in the problem space, not the tooling space.

Viability

An infospace is viable when:

Its entities individually meet quality thresholds (per-entity eval)
Its collection metrics are within acceptable ranges
It can answer its defined competency questions — the canonical queries the infospace is meant to support
It has been evaluated recently enough that metrics reflect current content

Viability is not binary — it is a profile of scores that the user sets thresholds for based on their needs.

Architecture: Three Layers

┌──────────────────────────────────────────────────┐
│  Layer 3: Infospace Instances                    │
│  Specific infospaces built by users              │
│  (Wealth of Nations + VSM, supply chain + ...)   │
│  Works IN an infospace                           │
├──────────────────────────────────────────────────┤
│  Layer 2: Infospace Tooling                      │
│  Terminology, primitives, composition model      │
│  CLI: infospace create/evaluate/compose/...      │
│  Works WITH infospaces                           │
├──────────────────────────────────────────────────┤
│  Layer 1: MarkiTect Platform                     │
│  Artifacts, prompts, LLM, spaces, graph, embed   │
│  Provides FOR infospaces                         │
└──────────────────────────────────────────────────┘

Boundary condition: LLM delegation

All LLM-based evaluation (entity scoring, pairwise judgments, coverage analysis) is delegated to MarkiTect's LLM integration module. The coding agent that works on infrastructure never makes domain-level judgments itself. This keeps a clean separation:

Coding agent → writes Python, templates, schemas, tests
MarkiTect LLM → evaluates entities, judges redundancy, assesses coverage, checks consistency

The infospace tooling (Layer 2) orchestrates these LLM calls through prompt templates and the prompt execution engine, not through ad-hoc prompting.

Stage 1: MarkiTect Platform Additions

Infrastructure that must exist before infospace tooling can be built. These are general-purpose platform capabilities, not infospace-specific.

S1.1 — Entity metadata parser

Add a deterministic markdown parser that extracts structured metadata from entity files: H1 title, sections present, word counts, domain, source chapter. Returns a dataclass usable by all downstream metrics.

Maps to: INFRA-TASKS #13, #10
Location: markitect/prompts/quality/ or new markitect/analysis/
Depends on: Nothing (can start immediately)
Deliverable: parse_entity_metadata(path) -> EntityMeta function with tests
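A minimal sketch of what the S1.1 parser could look like, assuming entity files use a single `#` title and `##` section headings. The fields of `EntityMeta` and the helper `parse_entity_text` are assumptions for illustration; the roadmap names only `parse_entity_metadata(path) -> EntityMeta`. Domain and source-chapter extraction are left out because the roadmap does not say how they are encoded.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from pathlib import Path

H1 = re.compile(r"^#\s+(.+?)\s*$")
H2 = re.compile(r"^##\s+(.+?)\s*$")


@dataclass
class EntityMeta:
    """Hypothetical field set; the roadmap does not list the fields."""
    title: str | None = None
    sections: list[str] = field(default_factory=list)
    word_counts: dict[str, int] = field(default_factory=dict)


def parse_entity_text(text: str) -> EntityMeta:
    """Extract the H1 title, the H2 sections, and a word count per section."""
    meta = EntityMeta()
    current = None  # name of the section whose body is being counted
    for line in text.splitlines():
        h1 = H1.match(line)
        h2 = H2.match(line)
        if h1 and meta.title is None:
            meta.title = h1.group(1)
        elif h2:
            current = h2.group(1)
            meta.sections.append(current)
            meta.word_counts[current] = 0
        elif current is not None:
            meta.word_counts[current] += len(line.split())
    return meta


def parse_entity_metadata(path: Path) -> EntityMeta:
    """File-based entry point matching the deliverable's signature."""
    return parse_entity_text(Path(path).read_text(encoding="utf-8"))
```

The parser is purely line-based, so it stays deterministic, but it would miscount headings that appear inside fenced code blocks; a real implementation would likely need to track fences.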

S1.2 — Schema compliance validator

Deterministic validation of entity/mapping files against their schemas: section presence, word count ranges, heading format, enum values. No LLM needed.

Maps to: INFRA-TASKS #10
Location: markitect/prompts/quality/validator.py (extend existing)
Depends on: S1.1
Deliverable: validate_document(path, schema) -> ValidationResult with tests
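To illustrate the deterministic checks described for S1.2, here is a small sketch covering two of them: section presence and word-count ranges. It works on already-parsed word counts rather than a file path, and the rule format (`section -> (min_words, max_words)`), the `validate_sections` name, and the `ValidationResult` fields are all assumptions; the existing validator's real interface is not shown in this roadmap.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ValidationResult:
    """Hypothetical result type: a list of human-readable errors."""
    errors: list[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.errors


def validate_sections(
    word_counts: dict[str, int],
    rules: dict[str, tuple[int, int]],
) -> ValidationResult:
    """Check that each required section exists and its length is in range."""
    result = ValidationResult()
    for section, (low, high) in rules.items():
        if section not in word_counts:
            result.errors.append(f"missing section: {section}")
        elif not low <= word_counts[section] <= high:
            result.errors.append(
                f"{section}: {word_counts[section]} words, expected {low}-{high}"
            )
    return result
```

Heading-format and enum-value checks would follow the same pattern: one rule, one appended error string, no LLM call.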

S1.3 — Embedding adapter

Add embedding support to markitect/llm/. Needs:

EmbeddingAdapter interface: embed(texts: list[str]) -> list[list[float]]
OpenRouterEmbeddingAdapter implementation (or OpenAI embedding endpoint)
Caching layer: store embeddings keyed by {slug: content_digest} so unchanged entities skip re-embedding
Cosine similarity utility: similarity_matrix(embeddings) -> np.ndarray

Maps to: INFRA-TASKS #14 (prerequisite)
Location: markitect/llm/embeddings.py
Depends on: Nothing (can start immediately)
Deliverable: Embedding adapter + cache + similarity computation, with tests
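The caching and similarity pieces of S1.3 can be sketched without any network call. In the sketch below, `CachedEmbedder` is a hypothetical wrapper around any function with the `embed(texts) -> list[list[float]]` shape, and it re-embeds an entity only when the SHA-256 digest of its content changes. `similarity_matrix` is written in pure Python and returns nested lists so the example stays dependency-free; the roadmap specifies an `np.ndarray` return type, which a real implementation would presumably use.

```python
from __future__ import annotations

import hashlib
import math


class CachedEmbedder:
    """Hypothetical cache: slug -> (content digest, embedding vector)."""

    def __init__(self, embed):
        self._embed = embed  # callable: list[str] -> list[list[float]]
        self._cache = {}

    def embed(self, entities: dict[str, str]) -> dict[str, list[float]]:
        digests = {
            slug: hashlib.sha256(text.encode("utf-8")).hexdigest()
            for slug, text in entities.items()
        }
        # Only entities that are new or whose content digest changed are sent.
        stale = [
            slug for slug, d in digests.items()
            if self._cache.get(slug, (None, None))[0] != d
        ]
        if stale:
            vectors = self._embed([entities[slug] for slug in stale])
            for slug, vector in zip(stale, vectors):
                self._cache[slug] = (digests[slug], vector)
        return {slug: self._cache[slug][1] for slug in entities}


def similarity_matrix(vectors: list[list[float]]) -> list[list[float]]:
    """Pairwise cosine similarity; zero vectors get similarity 0.0."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    return [
        [
            sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0
            for v, nv in zip(vectors, norms)
        ]
        for u, nu in zip(vectors, norms)
    ]
```

Keying the cache on a content digest rather than a modification time means a file that is touched but not edited does not trigger a new embedding call.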

S1.4 — Graph analysis utilities

The existing DependencyGraph supports basic traversal and cycle detection. Collection-level metrics need richer analysis:

Connected components
Betweenness centrality
Community detection (Louvain or label propagation)
Modularity score
Degree distribution
Cohesion/coupling computation

Decide: extend DependencyGraph or add a lightweight wrapper that converts to networkx (adding it as an optional dependency).

Maps to: INFRA-TASKS #16 (prerequisite)
Location: markitect/prompts/dependencies/analysis.py or new markitect/analysis/graph.py
Depends on: Nothing (can start immediately)
Deliverable: Graph analysis functions with tests
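Two of the listed analyses, connected components and degree distribution, are simple enough to sketch in pure Python, which may help when weighing the "extend DependencyGraph or wrap networkx" decision. The functions below take a plain node list and undirected edge list; they do not use the real `DependencyGraph` API, which is not shown in this roadmap, and the function names are illustrative.

```python
from collections import Counter, defaultdict, deque


def connected_components(nodes, edges):
    """Return a list of node sets, one per undirected connected component."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search from each unvisited node.
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


def degree_distribution(nodes, edges):
    """Map each degree to the number of nodes that have it."""
    degree = {node: 0 for node in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return Counter(degree.values())
```

Betweenness centrality, community detection, and modularity are considerably more involved, which is an argument for the optional networkx wrapper mentioned above.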

S1.5 — Structured evaluation output

Define a standard format for evaluation results: YAML front-matter + markdown body. Add utilities for:

Writing evaluation results (per-entity, per-pair, collection-level)
Reading/parsing evaluation results back into dataclasses
Appending timestamped snapshots to a history file
Diffing two snapshots

Maps to: INFRA-TASKS #11, #12
Location: markitect/prompts/quality/ or markitect/analysis/
Depends on: S1.1
Deliverable: EvaluationResult model + read/write utilities with tests
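The snapshot and diff utilities could be as small as the following sketch. It keeps the history as an in-memory list of dictionaries and leaves out the YAML front-matter serialisation described above; `append_snapshot`, `diff_snapshots`, and the `taken_at` key are hypothetical names.

```python
from datetime import datetime, timezone


def append_snapshot(history, metrics, taken_at=None):
    """Append a timestamped copy of the metrics to the history list."""
    stamp = (taken_at or datetime.now(timezone.utc)).isoformat()
    history.append({"taken_at": stamp, "metrics": dict(metrics)})
    return history


def diff_snapshots(old, new):
    """Return {metric: (old_value, new_value)} for every metric that changed.

    A metric present in only one snapshot shows up with None on the other side.
    """
    changed = {}
    for key in sorted(set(old) | set(new)):
        if old.get(key) != new.get(key):
            changed[key] = (old.get(key), new.get(key))
    return changed
```

With two snapshots holding invented values, for illustration only:

```python
history = []
append_snapshot(history, {"coverage_ratio": 0.50, "consistency_cycles": 2})
append_snapshot(history, {"coverage_ratio": 0.62, "consistency_cycles": 0})
print(diff_snapshots(history[0]["metrics"], history[1]["metrics"]))
```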

S1.6 — Batch LLM evaluation orchestrator

A pipeline component that runs an evaluation prompt template against a batch of entities (or entity pairs), collecting structured results. Must handle:

Rate limiting and retry (reuse existing adapter logic)
Progress reporting
Incremental evaluation (skip entities whose content hasn't changed since last eval)
Result aggregation

This is the mechanism by which infospace tooling delegates LLM work to the platform.

Maps to: INFRA-TASKS #9 (prerequisite)
Location: markitect/prompts/execution/batch.py
Depends on: S1.5
Deliverable: BatchEvaluator class with tests
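A rough sketch of the batch loop follows, showing retry, progress reporting, and result aggregation. It is written as a plain function rather than the `BatchEvaluator` class named in the deliverable, and the retry loop is a stand-in: the roadmap says rate limiting and retry should reuse existing adapter logic, which is not reproduced here. Incremental skipping is omitted. `evaluate_batch` and its parameters are hypothetical, and `evaluate` is any callable that returns a numeric score or raises.

```python
import time


def evaluate_batch(entities, evaluate, max_attempts=3, base_delay=0.0, progress=None):
    """Evaluate each entity, retrying failures with exponential backoff.

    entities: dict of slug -> entity text
    evaluate: callable(slug, text) -> float score (assumed; may raise)
    progress: optional callable(done, total) invoked after each entity
    """
    scores, failures = {}, {}
    total = len(entities)
    for done, (slug, text) in enumerate(entities.items(), start=1):
        for attempt in range(1, max_attempts + 1):
            try:
                scores[slug] = evaluate(slug, text)
                break
            except Exception as exc:  # stand-in for adapter-specific errors
                if attempt == max_attempts:
                    failures[slug] = str(exc)
                else:
                    time.sleep(base_delay * 2 ** (attempt - 1))
        if progress is not None:
            progress(done, total)
    mean = sum(scores.values()) / len(scores) if scores else None
    return {"scores": scores, "failures": failures, "mean": mean}
```

Keeping failures in the report instead of aborting the run means one malformed LLM response does not discard the results for the rest of the batch.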

S1.7 — FCA computation

Formal Concept Analysis: build a formal context (entity × attribute matrix), compute the concept lattice, extract gap concepts. Either implement a minimal FCA algorithm or integrate a library.

Maps to: INFRA-TASKS #15 (prerequisite)
Location: markitect/analysis/fca.py
Depends on: S1.1
Deliverable: FormalContext, ConceptLattice, find_gap_concepts() with tests
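For the "minimal FCA algorithm" option, the sketch below enumerates all formal concepts of a small context by closing the set of object intents under intersection. This is a simple textbook approach that is fine for small contexts but not optimised for large ones. The context is represented as a plain dict rather than the `FormalContext` class named in the deliverable, `concept_lattice` is a hypothetical name, and gap-concept extraction is not shown because the roadmap does not define it.

```python
from __future__ import annotations


def concept_lattice(context: dict[str, set[str]]) -> list[tuple[frozenset, frozenset]]:
    """Return every formal concept (extent, intent) of an entity x attribute context.

    context maps each entity slug to the set of attributes it has.
    """
    all_attributes = frozenset().union(*context.values())
    # Intents are exactly the intersections of entity attribute sets,
    # plus the full attribute set (the intent of the empty extent).
    intents = {all_attributes}
    for attributes in context.values():
        row = frozenset(attributes)
        intents |= {row & intent for intent in intents}
    concepts = []
    for intent in intents:
        extent = frozenset(
            entity for entity, attrs in context.items() if intent <= attrs
        )
        concepts.append((extent, intent))
    # Sort for a stable order: small extents first.
    return sorted(concepts, key=lambda c: (len(c[0]), sorted(c[1])))
```

A usage example with made-up attribute assignments:

```python
context = {
    "division-of-labour": {"S1", "production"},
    "market-price": {"S2", "exchange"},
    "natural-price": {"S2", "exchange", "production"},
}
for extent, intent in concept_lattice(context):
    print(sorted(extent), sorted(intent))
```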

Summary: Stage 1 dependency graph

S1.1 Entity metadata parser ──┬── S1.2 Schema validator
                               ├── S1.5 Eval output format ── S1.6 Batch evaluator
                               └── S1.7 FCA computation

S1.3 Embedding adapter ──────── (independent)
S1.4 Graph analysis ─────────── (independent)

S1.1, S1.3, and S1.4 can proceed in parallel. S1.6 (batch evaluator) is the final piece needed before Stage 2 can begin.
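The parallelism claim can be checked mechanically. As a small sketch, the snippet below encodes the Stage 1 dependencies from the graph above and uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+) to group tasks into waves that can run concurrently. The `DEPENDS_ON` table and `parallel_waves` helper are illustrative names, not part of MarkiTect.

```python
from graphlib import TopologicalSorter

# Task -> set of tasks it depends on, copied from the Stage 1 graph.
DEPENDS_ON = {
    "S1.1": set(),
    "S1.3": set(),
    "S1.4": set(),
    "S1.2": {"S1.1"},
    "S1.5": {"S1.1"},
    "S1.7": {"S1.1"},
    "S1.6": {"S1.5"},
}


def parallel_waves(depends_on):
    """Group tasks into successive waves; tasks in one wave are independent."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


print(parallel_waves(DEPENDS_ON))
# [['S1.1', 'S1.3', 'S1.4'], ['S1.2', 'S1.5', 'S1.7'], ['S1.6']]
```

The three waves line up with Phases A, B, and C in the Implementation Order section.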

Stage 2: Infospace Tooling

The user-facing layer that provides documented primitives for working with infospaces. Built on top of Stage 1 infrastructure and the existing markitect/spaces/ module.

S2.1 — Infospace model and configuration

Define the Infospace as a first-class concept that extends the existing InformationSpace with:

Topic declaration: name, domain, source material reference
Discipline bindings: which external infospaces are applied as lenses
Schema registry: which schemas govern entity structure
Competency questions: what the infospace should be able to answer
Viability thresholds: minimum acceptable metric scores
Evaluation state: latest per-entity and collection scores

Configuration format: an infospace.yaml (or a section in the existing config) that declares all of the above.

Location: new markitect/infospace/ package
Depends on: S1.1, S1.5, existing markitect/spaces/
Deliverable: InfospaceConfig, InfospaceState models + loader
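One possible shape for the configuration model is sketched below, assuming the YAML file has already been parsed into a dictionary (YAML loading itself is left out to keep the example standard-library only). The field names mirror the `infospace.yaml` example in S3.1; the `Topic` class, the `from_dict` method, and the validation rules are assumptions, and the sketch does not extend the real `InformationSpace` class.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    domain: str
    sources: str


@dataclass
class InfospaceConfig:
    """Hypothetical config model; field names follow the S3.1 example."""
    topic: Topic
    disciplines: list[dict] = field(default_factory=list)
    schemas: dict[str, str] = field(default_factory=dict)
    viability: dict[str, dict[str, float]] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, raw: dict) -> "InfospaceConfig":
        if "topic" not in raw:
            raise ValueError("infospace config needs a 'topic' section")
        viability = dict(raw.get("viability", {}))
        for metric, bounds in viability.items():
            # Each threshold may only declare a lower and/or upper bound.
            unknown = set(bounds) - {"min", "max"}
            if unknown:
                raise ValueError(f"{metric}: unknown threshold keys {sorted(unknown)}")
        return cls(
            topic=Topic(**raw["topic"]),
            disciplines=list(raw.get("disciplines", [])),
            schemas=dict(raw.get("schemas", {})),
            viability=viability,
        )
```

Rejecting unknown threshold keys at load time catches typos such as `mix:` before they silently disable a viability check.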

S2.2 — Infospace lifecycle commands

CLI commands for the core lifecycle:

# Initialise a new infospace
markitect infospace init --topic "Wealth of Nations" \
  --domain "Economics" \
  --discipline vsm-framework

# Show infospace status (entity count, eval state, viability)
markitect infospace status

# List entities with quality summary
markitect infospace entities [--sort-by score|domain|chapter]

# Show viability dashboard
markitect infospace viability

These commands read the infospace.yaml config and present information from the metadata index and evaluation results.

Location: markitect/infospace/cli.py integrated into main CLI
Depends on: S2.1
Deliverable: CLI commands with help text and tests

S2.3 — Per-entity evaluation primitives

Prompt templates and CLI commands for evaluating individual entities:

# Evaluate all entities
markitect infospace evaluate --provider openrouter

# Evaluate entities from a specific chapter
markitect infospace evaluate --chapter book-1-chapter-05 --provider openrouter

# Re-evaluate a single entity
markitect infospace evaluate --entity division-of-labour --provider openrouter

Uses the batch evaluator (S1.6) to run the evaluate-entity prompt template (defined in the infospace's schema directory) against entities. Writes structured results to output/evaluations/.

Maps to: INFRA-TASKS #8, #9
Location: markitect/infospace/evaluation.py
Depends on: S1.6, S2.1
Deliverable: Per-entity evaluation pipeline + CLI + prompt template

S2.4 — Collection-level checks

CLI commands for each of the five collection concerns:

# Run all collection checks
markitect infospace check --provider openrouter

# Run specific checks
markitect infospace check redundancy --provider openrouter
markitect infospace check coverage --provider openrouter
markitect infospace check coherence --provider openrouter
markitect infospace check consistency --provider openrouter
markitect infospace check granularity --provider openrouter

Each check uses Stage 1 infrastructure (embeddings, graph analysis, FCA) and delegates LLM judgment to the platform. Results written to output/metrics/ as per-concern reports + unified metrics.yaml.

Maps to: INFRA-TASKS #14-19
Location: markitect/infospace/checks/ (one module per concern)
Depends on: S1.3, S1.4, S1.6, S1.7, S2.1
Deliverable: Five check modules + unified orchestrator + CLI

S2.5 — Metrics history and viability tracking

Track metrics over time. After each evaluation or check run, append a timestamped snapshot to metrics-history.yaml. Provide commands to review trends:

# Show metrics history
markitect infospace history

# Compare two snapshots
markitect infospace history diff 2026-02-18 2026-03-01

# Check viability against thresholds
markitect infospace viability

Viability is assessed by comparing current metrics to the thresholds declared in infospace.yaml. Each metric gets a simple pass/fail result, reported alongside its actual value.

Maps to: INFRA-TASKS #12
Location: markitect/infospace/history.py
Depends on: S2.4, S1.5
Deliverable: History tracking + viability assessment + CLI
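The threshold comparison can be sketched in a few lines. The example below uses the thresholds from the S3.1 `infospace.yaml` example and the final validation metrics reported in the status header of this roadmap; the `assess_viability` function and its return shape are assumptions, not the actual implementation.

```python
# Thresholds as declared in the S3.1 infospace.yaml example.
THRESHOLDS = {
    "redundancy_ratio": {"max": 0.05},
    "coverage_ratio": {"min": 0.60},
    "coherence_components": {"max": 1},
    "consistency_cycles": {"max": 0},
    "granularity_entropy": {"min": 1.0},
    "per_entity_mean": {"min": 3.5},
}

# Final validation metrics quoted in the status header.
METRICS = {
    "per_entity_mean": 3.957,
    "redundancy_ratio": 0.006,
    "coverage_ratio": 0.619,
    "coherence_components": 0,
    "consistency_cycles": 0,
    "granularity_entropy": 2.675,
}


def assess_viability(metrics, thresholds):
    """Return {metric: (passed, actual_value)}; a missing metric fails."""
    report = {}
    for name, bounds in thresholds.items():
        value = metrics.get(name)
        passed = value is not None
        if passed and "min" in bounds:
            passed = value >= bounds["min"]
        if passed and "max" in bounds:
            passed = value <= bounds["max"]
        report[name] = (passed, value)
    return report


report = assess_viability(METRICS, THRESHOLDS)
passed = sum(ok for ok, _ in report.values())
print(f"{passed}/{len(report)} thresholds met")
# 6/6 thresholds met
```

Treating a missing metric as a failure is a design choice made here so that a check that never ran cannot be mistaken for one that passed.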

S2.6 — Infospace composition model

The mechanism by which one infospace is applied as a discipline to another. Builds on markitect/spaces/composability/:

Discipline binding: declare that infospace A uses infospace B as a discipline. B's entities become available as mapping targets.
Cross-infospace references: entity in A maps to concept in B using the same mapping schema and evaluation pipeline.
Discipline viability requirement: B must be viable (meets its own thresholds) before it can be used as a discipline for A.
Cascading evaluation: when B's entities change, A's mappings that reference them are flagged for re-evaluation.

# Bind a discipline to the current infospace
markitect infospace bind-discipline ./path/to/vsm-infospace

# List bound disciplines and their viability
markitect infospace disciplines

# Check for stale mappings after discipline update
markitect infospace check stale-mappings

Location: markitect/infospace/composition.py
Depends on: S2.1, existing markitect/spaces/composability/
Deliverable: Composition model + CLI + documentation
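Cascading evaluation could be implemented by recording, with each mapping, a digest of the discipline entity as it was when the mapping was evaluated. The sketch below flags a mapping as stale when that digest no longer matches the discipline's current content, or when the target has disappeared. The mapping record layout (`entity`, `target`, `target_digest`) and the function names are hypothetical; the real composition module may track staleness differently.

```python
from __future__ import annotations

import hashlib


def digest(text: str) -> str:
    """Content digest used to detect changes in a discipline entity."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_stale_mappings(
    mappings: list[dict],
    discipline_entities: dict[str, str],
) -> list[str]:
    """Return slugs of entities in A whose mapping target in B has changed.

    mappings: records with 'entity', 'target', and 'target_digest' keys
    discipline_entities: current slug -> text of the bound discipline B
    """
    stale = []
    for mapping in mappings:
        current = discipline_entities.get(mapping["target"])
        if current is None or digest(current) != mapping["target_digest"]:
            stale.append(mapping["entity"])
    return stale
```

Only the flagged entities then need to go back through the evaluation pipeline, which keeps a discipline update from forcing a full re-evaluation of the dependent infospace.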

S2.7 — Documentation: Infospace Primitives Reference

A reference document explaining all primitives, their purpose, and how they compose. This is the user-facing documentation for the infospace tooling layer — the equivalent of a framework guide.

Location: docs/infospace-primitives.md or in-CLI help
Depends on: S2.1-S2.6
Deliverable: Reference documentation

Summary: Stage 2 dependency graph

S2.1 Model & config ──┬── S2.2 Lifecycle CLI
                       ├── S2.3 Per-entity evaluation
                       ├── S2.4 Collection checks ── S2.5 History & viability
                       └── S2.6 Composition model

S2.7 Documentation (depends on all above)

Stage 3: Example Revision

Revisit the Wealth of Nations / VSM example using the new tooling. The example becomes both a tutorial and a validation of the tooling.

S3.1 — Migrate example to infospace configuration

Replace the ad-hoc process_chapters.py setup with a declarative infospace.yaml:

topic:
  name: "The Wealth of Nations"
  domain: "Classical Economics"
  sources: artifacts/sources/

disciplines:
  - name: "Viable System Model"
    path: artifacts/vsm-reference/

schemas:
  entity: schemas/economic-entity-schema-v1.0.md
  mapping: schemas/vsm-mapping-schema-v1.0.md
  analysis: schemas/chapter-analysis-schema-v1.0.md

competency_questions: schemas/competency-questions.md

viability:
  redundancy_ratio: { max: 0.05 }
  coverage_ratio: { min: 0.60 }
  coherence_components: { max: 1 }
  consistency_cycles: { max: 0 }
  granularity_entropy: { min: 1.0 }
  per_entity_mean: { min: 3.5 }

pipeline:
  stages:
    - template: extract-entities
      spaces: [sources, guidelines, vsm-reference, entities]
    - template: map-to-vsm
      spaces: [entities, vsm-reference, guidelines]
    - template: synthesize-analysis
      spaces: [sources, entities, mappings, vsm-reference]
  post_batch:
    - template: assess-metrics
      spaces: [analyses, vsm-reference]

Depends on: S2.1
Deliverable: infospace.yaml + migration of process_chapters.py to use infospace tooling APIs

S3.2 — Clean per-chapter git history

Re-run all processed chapters (and remaining ones) with per-chapter commits on a clean branch, then replace the current tangled history.

Maps to: INFRA-TASKS #4, #7
Depends on: S3.1
Deliverable: Clean branch with one commit per chapter

S3.3 — Full evaluation run

Run all per-entity evaluations and collection checks on the completed infospace. Establish baseline metrics. Demonstrate the viability dashboard.

Maps to: INFRA-TASKS #6
Depends on: S2.3, S2.4, S2.5, S3.2
Deliverable: Complete evaluation results + viability report

S3.4 — Rewrite tutorial

Update TUTORIAL.md to use infospace tooling commands instead of raw process_chapters.py invocations. The tutorial should walk through:

Initialising an infospace (markitect infospace init)
Defining schemas and competency questions
Processing chapters (pipeline execution)
Evaluating entities (markitect infospace evaluate)
Running collection checks (markitect infospace check)
Reviewing viability (markitect infospace viability)
Iterating: refining guidelines, re-processing, re-evaluating
Using the infospace as a discipline for a new project

Depends on: S3.1-S3.3
Deliverable: Revised TUTORIAL.md

S3.5 — Demonstrate composition

Create a minimal second infospace (e.g. a modern supply chain case study or a different economic text) that binds the Wealth of Nations infospace as a discipline. Demonstrates the composition model from S2.6.

Depends on: S2.6, S3.3
Deliverable: Second example infospace + composition tutorial section

Task Mapping

Cross-reference between INFRA-TASKS numbers and roadmap stages:

| INFRA-TASK | Description | Stage |
|---|---|---|
| 1-3 | Infra fixes (resolved) | — |
| 4 | Per-chapter git history | S3.2 |
| 5 | Prompt file side-effects | S1.6 (batch eval avoids this) |
| 6 | Stale metrics | S3.3 |
| 7 | Remaining 28 chapters | S3.2 |
| 8 | Per-concept quality metrics in schema | S2.3 |
| 9 | Evaluate-entity prompt template | S2.3 |
| 10 | Deterministic schema compliance | S1.2 |
| 11 | Structured metrics output | S1.5 |
| 12 | Metrics-over-time tracking | S2.5 |
| 13 | Entity metadata index | S1.1 |
| 14 | Redundancy detection (C1) | S2.4 |
| 15 | Coverage completeness (C2) | S2.4 |
| 16 | Structural coherence (C3) | S2.4 |
| 17 | Definitional consistency (C4) | S2.4 |
| 18 | Granularity balance (C5) | S2.4 |
| 19 | Unified collection evaluation | S2.4 |

Implementation Order

Recommended sequence, accounting for dependencies and value delivery:

Phase A — Foundation (Stage 1, parallelisable)

1. S1.1 Entity metadata parser
2. S1.3 Embedding adapter
3. S1.4 Graph analysis utilities

Phase B — Validation & Output (Stage 1)

4. S1.2 Schema compliance validator (needs S1.1)
5. S1.5 Structured evaluation output (needs S1.1)
6. S1.7 FCA computation (needs S1.1)

Phase C — Orchestration (Stage 1 → Stage 2 bridge)

7. S1.6 Batch LLM evaluation orchestrator (needs S1.5)

Phase D — Infospace Core (Stage 2)

8. S2.1 Infospace model and configuration
9. S2.2 Lifecycle commands
10. S2.3 Per-entity evaluation primitives (needs S1.6, S2.1)

Phase E — Collection Intelligence (Stage 2)

11. S2.4 Collection-level checks (needs S1.3, S1.4, S1.7, S2.1)
12. S2.5 Metrics history and viability tracking

Phase F — Composition (Stage 2)

13. S2.6 Infospace composition model
14. S2.7 Documentation

Phase G — Example (Stage 3)

15. S3.1 Migrate example to infospace config
16. S3.2 Clean per-chapter history
17. S3.3 Full evaluation run
18. S3.4 Rewrite tutorial
19. S3.5 Demonstrate composition
