markitect-main/roadmap/infospace-tooling/PLAN.md
tegwick 9e8d73fa7d docs(roadmap): close out infospace tooling S3 and parent roadmap
All three stages of the infospace tooling roadmap are complete. The Wealth
of Nations / VSM example passes 6/6 viability thresholds on 988 entities,
and composition is demonstrated via the supply-chain-vsm example.

- Parent roadmap (roadmap/infospace-tooling/PLAN.md): header now shows the
  closed status with final validation metrics.
- S3 close-out plan (roadmap/infospace-s3-closeout/PLAN.md): records the
  final task dispositions. C.1–C.6 and C.8 done; C.7 (clean per-chapter
  git history) is deferred indefinitely — the task was cosmetic, its
  prerequisite branch no longer exists, and reconstructing 35 archival
  commits would not change any output files. Rationale documented inline.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-22 07:08:43 +02:00

# Viable Infospace Tooling — Roadmap
## Status: CLOSED (2026-04-22)
All three stages complete.
| Stage | Status | Notes |
|-------|--------|-------|
| Stage 1 — Platform additions (S1.1–S1.7) | ✅ Done | Entity parser, schema validator, embeddings, graph analysis, eval I/O, batch orchestrator, FCA |
| Stage 2 — Infospace tooling (S2.1–S2.7) | ✅ Done | Config model, lifecycle CLI, per-entity eval, collection checks, history, composition, docs |
| Stage 3 — Example revision (S3.1–S3.5) | ✅ Done (except cosmetic S3.2) | See `roadmap/infospace-s3-closeout/PLAN.md` |
**Final validation (Wealth of Nations / VSM example, 988 entities):**
- 988 per-entity evaluations landed
- Collection checks pass 6/6 viability thresholds (`per_entity_mean = 3.957`
against threshold 3.5; `redundancy_ratio = 0.006`; `coverage_ratio = 0.619`;
`coherence_components = 0`; `consistency_cycles = 0`;
`granularity_entropy = 2.675`)
- Composition demonstrated via `examples/supply-chain-vsm/`
- S3.2 (clean per-chapter git history) deferred as cosmetic-only; rationale
in the close-out plan
See `roadmap/infospace-s3-closeout/PLAN.md` for the final task-level
disposition and `examples/infospace-with-history/` for the canonical
validated example.
---
## Vision
An **infospace** is a structured, evaluable, composable collection of
concepts that explains a **topic** through the lens of one or more
**disciplines**. Infospaces are the unit of knowledge work in MarkiTect.
This roadmap organises the work needed to move from the current
ad-hoc example (`infospace-with-history`) to a general-purpose platform
for creating, evaluating, maintaining, and composing infospaces.
---
## Terminology
These terms establish the vocabulary for infospace tooling. They
generalise from the Wealth of Nations / VSM example but are not
specific to it.
### Infospace
A curated, self-describing collection of **entities** (concepts,
mechanisms, observations) that together explain a **topic**. An
infospace has:
- A **topic** — the subject matter being explained (e.g. "The Wealth
of Nations", "cellular biology", "Kubernetes networking")
- One or more **disciplines** — external frameworks applied as lenses
(e.g. "Viable System Model", "category theory")
- **Entities** — the atomic units of knowledge, each with a definition,
provenance, and quality scores
- **Schemas** — structural templates that define what a well-formed
entity, mapping, or analysis looks like
- **Evaluations** — per-entity and collection-level quality assessments
- **Metrics** — quantitative indicators of completeness, coherence,
consistency, and granularity balance
An infospace is **viable** when it meets threshold scores across its
defined metrics — it is fit for purpose as an explanatory tool.
### Topic
The subject matter an infospace is built to explain. A topic sits
within a **domain** (broader field of knowledge) but is more specific:
- Domain: Economics → Topic: The Wealth of Nations
- Domain: Systems Theory → Topic: Viable System Model
- Domain: Computer Science → Topic: Distributed consensus protocols
A topic provides the **source material** — the texts, data, or
observations from which entities are extracted.
### Discipline
A reusable framework of concepts applied as a lens to explore a topic.
A discipline is itself an infospace — one that has been evaluated as
viable and packaged for reuse.
In our example, the VSM is the discipline: a set of concepts (S1-S5,
recursion, variety, viability) from systems theory, applied to the
economic concepts in Smith's work.
**Key property:** Disciplines compose. An infospace built with one
discipline can itself become a discipline for another infospace. The
Wealth of Nations infospace, viewed through VSM, could become a
discipline applied to a modern supply chain analysis.
### Entity
The atomic unit of an infospace. An entity has:
- **Identity**: a unique slug and human-readable title
- **Definition**: a precise, non-circular explanation
- **Provenance**: the source chapter, passage, and extraction context
- **Domain placement**: which area of the topic it belongs to
- **Discipline mapping**: how it connects to the applied discipline
(e.g. which VSM system)
- **Quality scores**: per-entity LLM-evaluated metrics
- **Lifecycle state**: active, archived (with reason), or draft
### Evaluation
A structured assessment of quality, applied at two levels:
- **Per-entity evaluation**: scores an individual entity against
quality rubrics defined in its schema (definition precision, source
grounding, discipline relevance, etc.)
- **Collection evaluation**: scores the entity set as a whole against
five concerns: redundancy, coverage, coherence, consistency, and
granularity balance
Evaluations are always performed by **delegated LLM calls** through
MarkiTect's LLM integration — never by the coding agent working on
infrastructure. This separation ensures that domain-level judgment
stays in the problem space, not the tooling space.
### Viability
An infospace is viable when:
1. Its entities individually meet quality thresholds (per-entity eval)
2. Its collection metrics are within acceptable ranges
3. It can answer its defined **competency questions** — the canonical
queries the infospace is meant to support
4. It has been evaluated recently enough that metrics reflect current
content
Viability is not binary — it is a profile of scores that the user
sets thresholds for based on their needs.
---
## Architecture: Three Layers
```
┌──────────────────────────────────────────────────┐
│  Layer 3: Infospace Instances                    │
│    Specific infospaces built by users            │
│    (Wealth of Nations + VSM, supply chain + ...) │
│    Works IN an infospace                         │
├──────────────────────────────────────────────────┤
│  Layer 2: Infospace Tooling                      │
│    Terminology, primitives, composition model    │
│    CLI: infospace create/evaluate/compose/...    │
│    Works WITH infospaces                         │
├──────────────────────────────────────────────────┤
│  Layer 1: MarkiTect Platform                     │
│    Artifacts, prompts, LLM, spaces, graph, embed │
│    Provides FOR infospaces                       │
└──────────────────────────────────────────────────┘
```
### Boundary condition: LLM delegation
All LLM-based evaluation (entity scoring, pairwise judgments, coverage
analysis) is delegated to MarkiTect's LLM integration module. The coding
agent that works on infrastructure never makes domain-level judgments
itself. This keeps a clean separation:
- **Coding agent** → writes Python, templates, schemas, tests
- **MarkiTect LLM** → evaluates entities, judges redundancy, assesses
coverage, checks consistency
The infospace tooling (Layer 2) orchestrates these LLM calls through
prompt templates and the prompt execution engine, not through ad-hoc
prompting.
---
## Stage 1: MarkiTect Platform Additions
Infrastructure that must exist before infospace tooling can be built.
These are general-purpose platform capabilities, not infospace-specific.
### S1.1 — Entity metadata parser
Add a deterministic markdown parser that extracts structured metadata
from entity files: H1 title, sections present, word counts, domain,
source chapter. Returns a dataclass usable by all downstream metrics.
**Maps to:** INFRA-TASKS #13, #10
**Location:** `markitect/prompts/quality/` or new `markitect/analysis/`
**Depends on:** Nothing — can start immediately
**Deliverable:** `parse_entity_metadata(path) -> EntityMeta` function
with tests
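A minimal sketch of what the parser could look like, assuming entity files use a single H1 for the title and `##` headings for sections. The `EntityMeta` fields and the `parse_entity_text` helper are illustrative, not the shipped API; domain and source-chapter extraction is left out because the roadmap does not say how those values are stored in the file.

```python
import re
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Optional, Union


@dataclass
class EntityMeta:
    """Hypothetical result type; field names are illustrative."""
    title: Optional[str] = None
    sections: List[str] = field(default_factory=list)
    word_counts: Dict[str, int] = field(default_factory=dict)


def parse_entity_text(text: str) -> EntityMeta:
    """Extract the H1 title, the H2 section names, and words per section."""
    meta = EntityMeta()
    current = None  # name of the section currently being counted
    for line in text.splitlines():
        h1 = re.match(r"^#\s+(.*)$", line)
        h2 = re.match(r"^##\s+(.*)$", line)
        if h1 and meta.title is None:
            meta.title = h1.group(1).strip()
        elif h2:
            current = h2.group(1).strip()
            meta.sections.append(current)
            meta.word_counts[current] = 0
        elif current is not None:
            meta.word_counts[current] += len(line.split())
    return meta


def parse_entity_metadata(path: Union[str, Path]) -> EntityMeta:
    """Path-based wrapper matching the deliverable's signature."""
    return parse_entity_text(Path(path).read_text(encoding="utf-8"))
```

Splitting the text-level parser from the path-level wrapper keeps the function deterministic and easy to test without touching the filesystem.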
### S1.2 — Schema compliance validator
Deterministic validation of entity/mapping files against their schemas:
section presence, word count ranges, heading format, enum values. No
LLM needed.
**Maps to:** INFRA-TASKS #10
**Location:** `markitect/prompts/quality/validator.py` (extend existing)
**Depends on:** S1.1
**Deliverable:** `validate_document(path, schema) -> ValidationResult`
with tests
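The section-presence and word-count checks might be sketched as below. This assumes the schema has already been reduced to a mapping of required section names to `(min_words, max_words)` ranges; that shape, `validate_sections`, and the `ValidationResult` fields are hypothetical, and heading-format and enum checks are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ValidationResult:
    """Hypothetical result type: a list of human-readable problems."""
    errors: List[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.errors


def validate_sections(
    word_counts: Dict[str, int],
    schema: Dict[str, Tuple[int, int]],
) -> ValidationResult:
    """Check that each required section exists and is within its word range."""
    result = ValidationResult()
    for section, (lo, hi) in schema.items():
        if section not in word_counts:
            result.errors.append(f"missing section: {section}")
        elif not lo <= word_counts[section] <= hi:
            result.errors.append(
                f"{section}: {word_counts[section]} words, expected {lo}-{hi}"
            )
    return result
```

Because the input is just per-section word counts, the check can consume the output of the S1.1 parser directly and needs no LLM call.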
### S1.3 — Embedding adapter
Add embedding support to `markitect/llm/`. Needs:
- `EmbeddingAdapter` interface: `embed(texts: list[str]) -> list[list[float]]`
- `OpenRouterEmbeddingAdapter` implementation (or OpenAI embedding endpoint)
- Caching layer: store embeddings keyed by `{slug: content_digest}` so
unchanged entities skip re-embedding
- Cosine similarity utility: `similarity_matrix(embeddings) -> np.ndarray`
**Maps to:** INFRA-TASKS #14 (prerequisite)
**Location:** `markitect/llm/embeddings.py`
**Depends on:** Nothing — can start immediately
**Deliverable:** Embedding adapter + cache + similarity computation, with
tests
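The caching and similarity pieces can be illustrated with a dependency-free sketch. It assumes an in-memory store and a caller-supplied `embed` function; `EmbeddingCache`, `get_many`, and `content_digest` are hypothetical names, and plain Python lists stand in for the `np.ndarray` return type named above.

```python
import hashlib
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def content_digest(text: str) -> str:
    """Stable digest used to detect whether an entity's content changed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class EmbeddingCache:
    """Hypothetical in-memory cache keyed by slug, invalidated by digest."""

    def __init__(self, embed: Callable[[List[str]], List[Vector]]):
        self._embed = embed
        self._store: Dict[str, Tuple[str, Vector]] = {}

    def get_many(self, entities: Dict[str, str]) -> Dict[str, Vector]:
        """Return a vector per slug, embedding only new or changed texts."""
        stale = [
            slug for slug, text in entities.items()
            if self._store.get(slug, ("", []))[0] != content_digest(text)
        ]
        if stale:
            vectors = self._embed([entities[s] for s in stale])
            for slug, vec in zip(stale, vectors):
                self._store[slug] = (content_digest(entities[slug]), vec)
        return {slug: self._store[slug][1] for slug in entities}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_matrix(embeddings: List[Vector]) -> List[List[float]]:
    """Pairwise cosine similarities as a square list of lists."""
    return [[cosine(a, b) for b in embeddings] for a in embeddings]
```

Batching all stale texts into one `embed` call keeps the number of provider requests low when only a few entities have changed.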
### S1.4 — Graph analysis utilities
The existing `DependencyGraph` supports basic traversal and cycle
detection. Collection-level metrics need richer analysis:
- Connected components
- Betweenness centrality
- Community detection (Louvain or label propagation)
- Modularity score
- Degree distribution
- Cohesion/coupling computation
Decide: extend `DependencyGraph` or add a lightweight wrapper that
converts to networkx (adding it as an optional dependency).
**Maps to:** INFRA-TASKS #16 (prerequisite)
**Location:** `markitect/prompts/dependencies/analysis.py` or new
`markitect/analysis/graph.py`
**Depends on:** Nothing — can start immediately
**Deliverable:** Graph analysis functions with tests
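As an illustration of the first item, connected components can be computed with a breadth-first search and no external dependency. The sketch assumes the graph is treated as undirected and supplied as an edge list; `connected_components` and its parameters are hypothetical and do not reflect the `DependencyGraph` API.

```python
from collections import defaultdict, deque


def connected_components(edges, nodes=()):
    """Return the components of an undirected graph as sorted lists of nodes."""
    adjacency = defaultdict(set)
    for node in nodes:
        adjacency[node]  # touch the key so isolated nodes are registered
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        queue, component = deque([start]), []
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in adjacency[node] - seen:
                seen.add(neighbour)
                queue.append(neighbour)
        components.append(sorted(component))
    return components
```

The richer analyses in the list (betweenness, community detection, modularity) are where a networkx wrapper would likely pay off; this sketch only covers the simplest case.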
### S1.5 — Structured evaluation output
Define a standard format for evaluation results: YAML front-matter +
markdown body. Add utilities for:
- Writing evaluation results (per-entity, per-pair, collection-level)
- Reading/parsing evaluation results back into dataclasses
- Appending timestamped snapshots to a history file
- Diffing two snapshots
**Maps to:** INFRA-TASKS #11, #12
**Location:** `markitect/prompts/quality/` or `markitect/analysis/`
**Depends on:** S1.1
**Deliverable:** `EvaluationResult` model + read/write utilities with
tests
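Snapshot diffing could work roughly as follows. The `MetricsSnapshot` shape and `diff_snapshots` are hypothetical, the metric values in the usage note are invented for illustration, and the YAML front-matter serialisation is not shown.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class MetricsSnapshot:
    """Hypothetical snapshot record: a timestamp label plus metric values."""
    taken_at: str
    metrics: Dict[str, float]


def diff_snapshots(
    old: MetricsSnapshot, new: MetricsSnapshot
) -> Dict[str, Tuple[Optional[float], Optional[float], Optional[float]]]:
    """Map each changed metric to (old, new, delta).

    Delta is None when the metric is missing from one of the snapshots.
    """
    changes = {}
    for name in sorted(set(old.metrics) | set(new.metrics)):
        before, after = old.metrics.get(name), new.metrics.get(name)
        if before == after:
            continue  # unchanged metrics are left out of the diff
        if before is None or after is None:
            delta = None
        else:
            delta = round(after - before, 6)
        changes[name] = (before, after, delta)
    return changes
```

For example, diffing a snapshot with `coverage_ratio: 0.5` against a later one with `coverage_ratio: 0.625` would report `(0.5, 0.625, 0.125)` for that metric and omit any metric whose value did not change.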
### S1.6 — Batch LLM evaluation orchestrator
A pipeline component that runs an evaluation prompt template against a
batch of entities (or entity pairs), collecting structured results.
Must handle:
- Rate limiting and retry (reuse existing adapter logic)
- Progress reporting
- Incremental evaluation (skip entities whose content hasn't changed
since last eval)
- Result aggregation
This is the mechanism by which infospace tooling delegates LLM work
to the platform.
**Maps to:** INFRA-TASKS #9 (prerequisite)
**Location:** `markitect/prompts/execution/batch.py`
**Depends on:** S1.5
**Deliverable:** `BatchEvaluator` class with tests
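The incremental-evaluation requirement is the least obvious of the four, so here is a sketch of one way to implement it, assuming results are stored alongside a content digest. The function name, the `{"digest", "result"}` record shape, and the `evaluate` callback (a stand-in for the delegated LLM call) are all hypothetical; rate limiting, retry, and progress reporting are omitted.

```python
import hashlib
from typing import Callable, Dict


def evaluate_incrementally(
    entities: Dict[str, str],
    previous: Dict[str, dict],
    evaluate: Callable[[str, str], dict],
) -> Dict[str, dict]:
    """Re-evaluate only entities whose content digest changed.

    `previous` maps slug -> {"digest": ..., "result": ...} from the last run.
    `evaluate(slug, text)` stands in for the delegated LLM call.
    """
    current = {}
    for slug, text in entities.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        cached = previous.get(slug)
        if cached and cached["digest"] == digest:
            current[slug] = cached  # unchanged: reuse the stored result
        else:
            current[slug] = {"digest": digest, "result": evaluate(slug, text)}
    return current
```

Entities that were removed since the last run simply drop out of the returned mapping, so the result always mirrors the current entity set.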
### S1.7 — FCA computation
Formal Concept Analysis: build a formal context (entity × attribute
matrix), compute the concept lattice, extract gap concepts. Either
implement a minimal FCA algorithm or integrate a library.
**Maps to:** INFRA-TASKS #15 (prerequisite)
**Location:** `markitect/analysis/fca.py`
**Depends on:** S1.1
**Deliverable:** `FormalContext`, `ConceptLattice`, `find_gap_concepts()`
with tests
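For small contexts, a minimal FCA algorithm fits in a few lines. The sketch below enumerates every formal concept by closing the set of object attribute sets under intersection. It assumes the context is given as a mapping from entity slug to attribute set; `formal_concepts` is a hypothetical name, and gap-concept extraction is not shown because the roadmap does not define it.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def formal_concepts(
    context: Dict[str, Set[str]]
) -> List[Tuple[FrozenSet[str], FrozenSet[str]]]:
    """Return all (extent, intent) pairs of a formal context.

    Intents are exactly the intersections of object attribute sets, plus the
    full attribute set (the intersection over no objects at all).
    """
    all_attributes = frozenset().union(*context.values())
    intents = {all_attributes}
    for attributes in context.values():
        # Intersect the new object's attributes with every intent found so far.
        intents |= {frozenset(attributes) & intent for intent in intents}

    concepts = []
    for intent in intents:
        extent = frozenset(
            obj for obj, attrs in context.items() if intent <= attrs
        )
        concepts.append((extent, intent))
    # Order from the smallest extent to the largest for stable output.
    return sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0])))
```

The number of concepts can grow exponentially with the number of attributes, so at the scale of hundreds of entities an established library or a pruned attribute set may be the safer choice.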
### Summary: Stage 1 dependency graph
```
S1.1 Entity metadata parser ──┬── S1.2 Schema validator
                              ├── S1.5 Eval output format ── S1.6 Batch evaluator
                              └── S1.7 FCA computation
S1.3 Embedding adapter ──────── (independent)
S1.4 Graph analysis ─────────── (independent)
```
S1.1, S1.3, and S1.4 can proceed in parallel. S1.6 (batch evaluator) is
the final piece needed before Stage 2 can begin.
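The parallelism can be checked mechanically. The sketch below transcribes the "Depends on" fields of S1.1 to S1.7 and groups the tasks into waves with the standard library's `graphlib.TopologicalSorter` (Python 3.9+); `STAGE_1` and `parallel_waves` are illustrative names.

```python
from graphlib import TopologicalSorter

# Task -> prerequisites, transcribed from the Stage 1 "Depends on" fields.
STAGE_1 = {
    "S1.1": set(),
    "S1.2": {"S1.1"},
    "S1.3": set(),
    "S1.4": set(),
    "S1.5": {"S1.1"},
    "S1.6": {"S1.5"},
    "S1.7": {"S1.1"},
}


def parallel_waves(dependencies):
    """Group tasks into waves; each wave's prerequisites are all finished."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

Calling `parallel_waves(STAGE_1)` yields three waves: S1.1, S1.3, S1.4; then S1.2, S1.5, S1.7; then S1.6. These line up with Phases A to C under Implementation Order.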
---
## Stage 2: Infospace Tooling
The user-facing layer that provides documented primitives for working
with infospaces. Built on top of Stage 1 infrastructure and the existing
`markitect/spaces/` module.
### S2.1 — Infospace model and configuration
Define the `Infospace` as a first-class concept that extends the existing
`InformationSpace` with:
- **Topic declaration**: name, domain, source material reference
- **Discipline bindings**: which external infospaces are applied as lenses
- **Schema registry**: which schemas govern entity structure
- **Competency questions**: what the infospace should be able to answer
- **Viability thresholds**: minimum acceptable metric scores
- **Evaluation state**: latest per-entity and collection scores
Configuration format: an `infospace.yaml` (or a section in the existing
config) that declares all of the above.
**Location:** new `markitect/infospace/` package
**Depends on:** S1.1, S1.5, existing `markitect/spaces/`
**Deliverable:** `InfospaceConfig`, `InfospaceState` models + loader
### S2.2 — Infospace lifecycle commands
CLI commands for the core lifecycle:
```bash
# Initialise a new infospace
markitect infospace init --topic "Wealth of Nations" \
--domain "Economics" \
--discipline vsm-framework
# Show infospace status (entity count, eval state, viability)
markitect infospace status
# List entities with quality summary
markitect infospace entities [--sort-by score|domain|chapter]
# Show viability dashboard
markitect infospace viability
```
These commands read the `infospace.yaml` config and present information
from the metadata index and evaluation results.
**Location:** `markitect/infospace/cli.py` integrated into main CLI
**Depends on:** S2.1
**Deliverable:** CLI commands with help text and tests
### S2.3 — Per-entity evaluation primitives
Prompt templates and CLI commands for evaluating individual entities:
```bash
# Evaluate all entities
markitect infospace evaluate --provider openrouter
# Evaluate entities from a specific chapter
markitect infospace evaluate --chapter book-1-chapter-05 --provider openrouter
# Re-evaluate a single entity
markitect infospace evaluate --entity division-of-labour --provider openrouter
```
Uses the batch evaluator (S1.6) to run the evaluate-entity prompt
template (defined in the infospace's schema directory) against entities.
Writes structured results to `output/evaluations/`.
**Maps to:** INFRA-TASKS #8, #9
**Location:** `markitect/infospace/evaluation.py`
**Depends on:** S1.6, S2.1
**Deliverable:** Per-entity evaluation pipeline + CLI + prompt template
### S2.4 — Collection-level checks
CLI commands for each of the five collection concerns:
```bash
# Run all collection checks
markitect infospace check --provider openrouter
# Run specific checks
markitect infospace check redundancy --provider openrouter
markitect infospace check coverage --provider openrouter
markitect infospace check coherence --provider openrouter
markitect infospace check consistency --provider openrouter
markitect infospace check granularity --provider openrouter
```
Each check uses Stage 1 infrastructure (embeddings, graph analysis, FCA)
and delegates LLM judgment to the platform. Results written to
`output/metrics/` as per-concern reports + unified `metrics.yaml`.
**Maps to:** INFRA-TASKS #14-19
**Location:** `markitect/infospace/checks/` (one module per concern)
**Depends on:** S1.3, S1.4, S1.6, S1.7, S2.1
**Deliverable:** Five check modules + unified orchestrator + CLI
### S2.5 — Metrics history and viability tracking
Track metrics over time. After each evaluation or check run, append a
timestamped snapshot to `metrics-history.yaml`. Provide commands to
review trends:
```bash
# Show metrics history
markitect infospace history
# Compare two snapshots
markitect infospace history diff 2026-02-18 2026-03-01
# Check viability against thresholds
markitect infospace viability
```
Viability is assessed by comparing current metrics to the thresholds
declared in `infospace.yaml`. Each metric is reported as a simple
pass/fail alongside its actual value.
**Maps to:** INFRA-TASKS #12
**Location:** `markitect/infospace/history.py`
**Depends on:** S2.4, S1.5
**Deliverable:** History tracking + viability assessment + CLI
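The pass/fail comparison can be sketched with the numbers already in this roadmap: the thresholds are those from the S3.1 example configuration and the metric values are the final validation figures from the status header. `assess_viability` is a hypothetical name, and treating `min` and `max` as inclusive bounds is an assumption.

```python
# Thresholds as declared under `viability:` in the S3.1 example config.
THRESHOLDS = {
    "redundancy_ratio": {"max": 0.05},
    "coverage_ratio": {"min": 0.60},
    "coherence_components": {"max": 1},
    "consistency_cycles": {"max": 0},
    "granularity_entropy": {"min": 1.0},
    "per_entity_mean": {"min": 3.5},
}

# Final validation metrics reported in the status header.
FINAL_METRICS = {
    "redundancy_ratio": 0.006,
    "coverage_ratio": 0.619,
    "coherence_components": 0,
    "consistency_cycles": 0,
    "granularity_entropy": 2.675,
    "per_entity_mean": 3.957,
}


def assess_viability(metrics, thresholds):
    """Return {metric: (passed, value)}; a metric with no value fails."""
    report = {}
    for name, bounds in thresholds.items():
        value = metrics.get(name)
        passed = value is not None
        if passed and "min" in bounds:
            passed = value >= bounds["min"]  # assumed inclusive
        if passed and "max" in bounds:
            passed = value <= bounds["max"]  # assumed inclusive
        report[name] = (passed, value)
    return report
```

With these inputs all six metrics pass, matching the 6/6 result recorded at close-out.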
### S2.6 — Infospace composition model
The mechanism by which one infospace is applied as a discipline to
another. Builds on `markitect/spaces/composability/`:
- **Discipline binding**: declare that infospace A uses infospace B as a
discipline. B's entities become available as mapping targets.
- **Cross-infospace references**: entity in A maps to concept in B using
the same mapping schema and evaluation pipeline.
- **Discipline viability requirement**: B must be viable (meets its own
thresholds) before it can be used as a discipline for A.
- **Cascading evaluation**: when B's entities change, A's mappings that
reference them are flagged for re-evaluation.
```bash
# Bind a discipline to the current infospace
markitect infospace bind-discipline ./path/to/vsm-infospace
# List bound disciplines and their viability
markitect infospace disciplines
# Check for stale mappings after discipline update
markitect infospace check stale-mappings
```
**Location:** `markitect/infospace/composition.py`
**Depends on:** S2.1, existing `markitect/spaces/composability/`
**Deliverable:** Composition model + CLI + documentation
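Cascading evaluation could be implemented by recording a digest of the discipline entity each mapping was evaluated against, then comparing on the next run. The sketch below assumes that approach; `find_stale_mappings`, the mapping record shape, and the slugs in the usage note are hypothetical.

```python
import hashlib


def digest(text: str) -> str:
    """Content digest used to detect changes in a discipline entity."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_stale_mappings(mappings, discipline_entities):
    """Return slugs of mappings whose discipline target changed or vanished.

    `mappings` maps a mapping slug in infospace A to a record
    {"target": slug_in_B, "target_digest": ...}, where the digest was
    recorded when the mapping was last evaluated.
    `discipline_entities` maps slug -> current text in discipline B.
    """
    stale = []
    for slug, mapping in sorted(mappings.items()):
        text = discipline_entities.get(mapping["target"])
        if text is None or digest(text) != mapping["target_digest"]:
            stale.append(slug)
    return stale
```

A mapping whose target still exists with the same content is left alone, so a discipline update that touches only a few entities (say, a reworded `s2-coordination`) triggers re-evaluation for just the mappings that point at them.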
### S2.7 — Documentation: Infospace Primitives Reference
A reference document explaining all primitives, their purpose, and how
they compose. This is the user-facing documentation for the infospace
tooling layer — the equivalent of a framework guide.
**Location:** `docs/infospace-primitives.md` or in-CLI help
**Depends on:** S2.1-S2.6
**Deliverable:** Reference documentation
### Summary: Stage 2 dependency graph
```
S2.1 Model & config ──┬── S2.2 Lifecycle CLI
                      ├── S2.3 Per-entity evaluation
                      ├── S2.4 Collection checks ── S2.5 History & viability
                      └── S2.6 Composition model
S2.7 Documentation (depends on all above)
```
---
## Stage 3: Example Revision
Revisit the Wealth of Nations / VSM example using the new tooling.
The example becomes both a tutorial and a validation of the tooling.
### S3.1 — Migrate example to infospace configuration
Replace the ad-hoc `process_chapters.py` setup with a declarative
`infospace.yaml`:
```yaml
topic:
  name: "The Wealth of Nations"
  domain: "Classical Economics"
  sources: artifacts/sources/
disciplines:
  - name: "Viable System Model"
    path: artifacts/vsm-reference/
schemas:
  entity: schemas/economic-entity-schema-v1.0.md
  mapping: schemas/vsm-mapping-schema-v1.0.md
  analysis: schemas/chapter-analysis-schema-v1.0.md
competency_questions: schemas/competency-questions.md
viability:
  redundancy_ratio: { max: 0.05 }
  coverage_ratio: { min: 0.60 }
  coherence_components: { max: 1 }
  consistency_cycles: { max: 0 }
  granularity_entropy: { min: 1.0 }
  per_entity_mean: { min: 3.5 }
pipeline:
  stages:
    - template: extract-entities
      spaces: [sources, guidelines, vsm-reference, entities]
    - template: map-to-vsm
      spaces: [entities, vsm-reference, guidelines]
    - template: synthesize-analysis
      spaces: [sources, entities, mappings, vsm-reference]
  post_batch:
    - template: assess-metrics
      spaces: [analyses, vsm-reference]
```
**Depends on:** S2.1
**Deliverable:** `infospace.yaml` + migration of `process_chapters.py` to
use infospace tooling APIs
### S3.2 — Clean per-chapter git history
Re-run all processed chapters (and remaining ones) with per-chapter
commits on a clean branch, then replace the current tangled history.
**Maps to:** INFRA-TASKS #4, #7
**Depends on:** S3.1
**Deliverable:** Clean branch with one commit per chapter
### S3.3 — Full evaluation run
Run all per-entity evaluations and collection checks on the completed
infospace. Establish baseline metrics. Demonstrate the viability
dashboard.
**Maps to:** INFRA-TASKS #6
**Depends on:** S2.3, S2.4, S2.5, S3.2
**Deliverable:** Complete evaluation results + viability report
### S3.4 — Rewrite tutorial
Update `TUTORIAL.md` to use infospace tooling commands instead of
raw `process_chapters.py` invocations. The tutorial should walk
through:
1. Initialising an infospace (`markitect infospace init`)
2. Defining schemas and competency questions
3. Processing chapters (pipeline execution)
4. Evaluating entities (`markitect infospace evaluate`)
5. Running collection checks (`markitect infospace check`)
6. Reviewing viability (`markitect infospace viability`)
7. Iterating: refining guidelines, re-processing, re-evaluating
8. Using the infospace as a discipline for a new project
**Depends on:** S3.1-S3.3
**Deliverable:** Revised `TUTORIAL.md`
### S3.5 — Demonstrate composition
Create a minimal second infospace (e.g. a modern supply chain case
study or a different economic text) that binds the Wealth of Nations
infospace as a discipline. Demonstrates the composition model from S2.6.
**Depends on:** S2.6, S3.3
**Deliverable:** Second example infospace + composition tutorial section
---
## Task Mapping
Cross-reference between INFRA-TASKS numbers and roadmap stages:
| INFRA-TASK | Description | Stage |
|------------|-------------|-------|
| 1-3 | Infra fixes (resolved) | — |
| 4 | Per-chapter git history | S3.2 |
| 5 | Prompt file side-effects | S1.6 (batch eval avoids this) |
| 6 | Stale metrics | S3.3 |
| 7 | Remaining 28 chapters | S3.2 |
| 8 | Per-concept quality metrics in schema | S2.3 |
| 9 | Evaluate-entity prompt template | S2.3 |
| 10 | Deterministic schema compliance | S1.2 |
| 11 | Structured metrics output | S1.5 |
| 12 | Metrics-over-time tracking | S2.5 |
| 13 | Entity metadata index | S1.1 |
| 14 | Redundancy detection (C1) | S2.4 |
| 15 | Coverage completeness (C2) | S2.4 |
| 16 | Structural coherence (C3) | S2.4 |
| 17 | Definitional consistency (C4) | S2.4 |
| 18 | Granularity balance (C5) | S2.4 |
| 19 | Unified collection evaluation | S2.4 |
---
## Implementation Order
Recommended sequence, accounting for dependencies and value delivery:
**Phase A — Foundation (Stage 1, parallelisable)**
1. S1.1 Entity metadata parser
2. S1.3 Embedding adapter
3. S1.4 Graph analysis utilities
**Phase B — Validation & Output (Stage 1)**
4. S1.2 Schema compliance validator (needs S1.1)
5. S1.5 Structured evaluation output (needs S1.1)
6. S1.7 FCA computation (needs S1.1)
**Phase C — Orchestration (Stage 1 → Stage 2 bridge)**
7. S1.6 Batch LLM evaluation orchestrator (needs S1.5)
**Phase D — Infospace Core (Stage 2)**
8. S2.1 Infospace model and configuration
9. S2.2 Lifecycle commands
10. S2.3 Per-entity evaluation primitives (needs S1.6, S2.1)
**Phase E — Collection Intelligence (Stage 2)**
11. S2.4 Collection-level checks (needs S1.3, S1.4, S1.7, S2.1)
12. S2.5 Metrics history and viability tracking
**Phase F — Composition (Stage 2)**
13. S2.6 Infospace composition model
14. S2.7 Documentation
**Phase G — Example (Stage 3)**
15. S3.1 Migrate example to infospace config
16. S3.2 Clean per-chapter history
17. S3.3 Full evaluation run
18. S3.4 Rewrite tutorial
19. S3.5 Demonstrate composition