docs: add infospace primitives reference (S2.7)
Reference document covering all infospace tooling primitives: config, entity metadata, schema validation, per-entity evaluation, collection checks, metrics history, viability, composition, and CLI commands.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
344
docs/infospace-primitives.md
Normal file
@@ -0,0 +1,344 @@
# Infospace Primitives Reference

This document describes the primitives provided by the `markitect/infospace/`
package for creating, evaluating, maintaining, and composing infospaces.

---

## Core Concepts

An **infospace** is a structured, evaluable, composable collection of
entities that explains a **topic** through the lens of one or more
**disciplines**.

| Term | Meaning |
|------|---------|
| **Topic** | The subject matter being explained |
| **Discipline** | A reusable framework of concepts applied as an analytical lens |
| **Entity** | The atomic unit of knowledge — slug, definition, provenance, domain |
| **Evaluation** | Per-entity or collection-level quality assessment |
| **Viability** | Whether an infospace meets its threshold scores |

---

## Configuration (`infospace.yaml`)

Every infospace is declared via an `infospace.yaml` file. The configuration
model is defined in `markitect/infospace/config.py`.

### Minimal example

```yaml
topic:
  name: "The Wealth of Nations"
  domain: "Classical Economics"
  sources: artifacts/sources/

disciplines:
  - name: "Viable System Model"
    path: artifacts/vsm-reference/

schemas:
  entity: schemas/economic-entity-schema-v1.0.md

viability:
  coverage_ratio: { min: 0.60 }
  redundancy_ratio: { max: 0.05 }
  per_entity_mean: { min: 3.5 }
```

### Key models

- **`TopicConfig`** — `name`, `domain`, `sources`
- **`DisciplineBinding`** — `name`, `path` (to another infospace directory)
- **`SchemaRegistry`** — `entity`, `mapping`, `analysis` schema paths
- **`ViabilityThreshold`** — `metric`, `min`, `max` bounds
- **`PipelineConfig`** — Ordered list of `PipelineStage` entries
- **`InfospaceConfig`** — Top-level config combining all of the above

### Default directories

| Setting | Default |
|---------|---------|
| `entities_dir` | `output/entities` |
| `evaluations_dir` | `output/evaluations` |
| `metrics_dir` | `output/metrics` |

---

## Entity Metadata

Entities are parsed from markdown files by `markitect/infospace/entity_parser.py`.

**`EntityMeta`** fields: `slug`, `title`, `definition`, `domain`,
`source_chapter`, `context`, `original_wording`, `modern_interpretation`,
`definition_word_count`, `total_word_count`, `section_slugs`.

```python
from pathlib import Path

from markitect.infospace import parse_entity_directory

entities = parse_entity_directory(Path("output/entities"))
```
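As a rough illustration of how the fields listed above might be populated, the sketch below models a subset of `EntityMeta` as a dataclass and derives it from one markdown string. The `EntityMetaSketch` class, the `slugify` and `parse_entity_text` helpers, and the assumed file layout (an H1 title followed by H2 sections such as `## Definition`) are hypothetical and are not taken from `entity_parser.py`.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EntityMetaSketch:
    """Hypothetical subset of the EntityMeta fields listed above."""
    slug: str
    title: str
    definition: str
    definition_word_count: int
    total_word_count: int
    section_slugs: List[str] = field(default_factory=list)


def slugify(text: str) -> str:
    """Lowercase the text and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", text.lower()))


def parse_entity_text(markdown: str) -> EntityMetaSketch:
    """Parse one entity, assuming an H1 title and H2 sections (assumed layout)."""
    title = ""
    sections: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in markdown.splitlines():
        if line.startswith("# "):
            title = line[2:].strip()
        elif line.startswith("## "):
            current = slugify(line[3:])
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    # Collapse the definition section into one whitespace-normalised string
    definition = " ".join(" ".join(sections.get("definition", [])).split())
    # Count words in section bodies only (headings are excluded)
    total_words = sum(len(" ".join(lines).split()) for lines in sections.values())
    return EntityMetaSketch(
        slug=slugify(title),
        title=title,
        definition=definition,
        definition_word_count=len(definition.split()),
        total_word_count=total_words,
        section_slugs=list(sections),
    )


sample = """# Division of Labour

## Definition
The separation of a work process into distinct tasks.

## Context
Introduced in Book I.
"""
meta = parse_entity_text(sample)
print(meta.slug, meta.definition_word_count, meta.section_slugs)
```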

---

## Schema Validation

Deterministic validation of entity files against structural schemas.

```python
from markitect.infospace import validate_entity, ECONOMIC_ENTITY_SCHEMA

# entity_meta is the EntityMeta of one parsed entity
result = validate_entity(entity_meta, schema=ECONOMIC_ENTITY_SCHEMA)
print(result.summary())
```

The validator checks section presence, word-count ranges, heading format, and enum values.
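The four kinds of check can be pictured with a small, self-contained sketch. The schema layout (`SCHEMA_SKETCH`), the `validate_sketch` function, and the example bounds and enum values are all hypothetical; the real `validate_entity` and `ECONOMIC_ENTITY_SCHEMA` may be structured quite differently.

```python
import re
from typing import Any, Dict, List

# Hypothetical schema: required sections, word-count bounds, heading pattern, enum
SCHEMA_SKETCH: Dict[str, Any] = {
    "required_sections": ["definition", "context"],
    "definition_words": (5, 120),
    "heading_pattern": re.compile(r"^[A-Z][^\n]*$"),
    "domain_enum": {"Classical Economics", "Political Economy"},
}


def validate_sketch(entity: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of violations; an empty list means the entity is valid."""
    errors: List[str] = []
    # 1. Section presence
    for section in schema["required_sections"]:
        if section not in entity["section_slugs"]:
            errors.append(f"missing section: {section}")
    # 2. Word-count range (inclusive bounds)
    low, high = schema["definition_words"]
    count = entity["definition_word_count"]
    if not low <= count <= high:
        errors.append(f"definition has {count} words, expected {low}-{high}")
    # 3. Heading format
    if not schema["heading_pattern"].match(entity["title"]):
        errors.append(f"bad heading format: {entity['title']!r}")
    # 4. Enum values
    if entity["domain"] not in schema["domain_enum"]:
        errors.append(f"unknown domain: {entity['domain']}")
    return errors


good = {
    "title": "Division of Labour",
    "domain": "Classical Economics",
    "definition_word_count": 9,
    "section_slugs": ["definition", "context"],
}
bad = {
    "title": "division of labour",
    "domain": "Astrology",
    "definition_word_count": 2,
    "section_slugs": ["definition"],
}
print(validate_sketch(good, SCHEMA_SKETCH))
print(validate_sketch(bad, SCHEMA_SKETCH))
```

Because every rule is a plain comparison, the same input always yields the same result, which is the sense in which this kind of validation is deterministic.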

---

## Per-entity Evaluation

LLM-based quality assessment of individual entities. Defined in
`markitect/infospace/evaluate.py`.

```bash
# Evaluate all entities
markitect infospace evaluate --provider openrouter

# Single entity
markitect infospace evaluate --entity division-of-labour --provider openrouter
```

### Pipeline functions

- `build_evaluation_prompt(entity, topic, dimensions)` — build the LLM prompt
- `parse_evaluation_response(text, dimensions)` — parse LLM output to `ScoreEntry` list
- `run_entity_evaluation(config, entities, adapter, ...)` — full batch pipeline

Results are written to `output/evaluations/` as YAML frontmatter + markdown.

---

## Collection-level Checks

Five concerns are assessed at the collection level. Each has a dedicated
module in `markitect/infospace/checks/`.

| Concern | Module | Key metric |
|---------|--------|------------|
| **C1 — Redundancy** | `redundancy.py` | `redundancy_ratio` |
| **C2 — Coverage** | `coverage.py` | `coverage_ratio` |
| **C3 — Coherence** | `coherence.py` | `coherence_components`, `modularity` |
| **C4 — Consistency** | `consistency.py` | `consistency_cycles` |
| **C5 — Granularity** | `granularity.py` | `granularity_entropy` |
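This reference does not define how each metric is computed. Purely to give a feel for what a collection-level metric could look like, the sketch below computes a redundancy ratio as the fraction of entities that have at least one near-duplicate by cosine similarity. The definition, the `0.95` threshold, and the names `cosine` and `redundancy_ratio_sketch` are assumptions for illustration, not the behaviour of `redundancy.py`.

```python
import math
from itertools import combinations
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def redundancy_ratio_sketch(
    embeddings: Dict[str, List[float]], threshold: float = 0.95
) -> float:
    """Fraction of entities with at least one near-duplicate (assumed definition)."""
    if not embeddings:
        return 0.0
    flagged = set()
    # Compare every unordered pair of entities once
    for (slug_a, vec_a), (slug_b, vec_b) in combinations(embeddings.items(), 2):
        if cosine(vec_a, vec_b) >= threshold:
            flagged.update((slug_a, slug_b))
    return len(flagged) / len(embeddings)


# Toy two-dimensional embeddings keyed by entity slug
toy_embeddings = {
    "division-of-labour": [1.0, 0.0],
    "specialisation": [0.99, 0.05],
    "invisible-hand": [0.0, 1.0],
    "natural-price": [0.6, 0.8],
}
print(redundancy_ratio_sketch(toy_embeddings))
```

In this toy data only the first two entities are close enough to count as near-duplicates, so two of the four entities are flagged.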

### Orchestrator

```python
from markitect.infospace.checks import run_all_checks

# emb and g are the embeddings and graph inputs, built beforehand
report = run_all_checks(entities, embeddings=emb, graph=g)
metrics = report.metrics()  # Dict[str, float]
```

### CLI

```bash
# Run all checks
markitect infospace check

# Run specific concerns
markitect infospace check --concern redundancy --concern coverage

# JSON output
markitect infospace check --json
```

After each check run, metrics are automatically recorded to history.

---

## Metrics History

Timestamped snapshots track metrics over time. Defined in
`markitect/infospace/history.py`.

```bash
# Show history
markitect infospace history

# Trend for a single metric
markitect infospace history --metric coverage_ratio

# Compare two snapshots
markitect infospace history-diff 2026-02-01 2026-03-01
```

### Key functions

- `snapshot_from_checks(report, entity_count)` — create snapshot from check results
- `record_check_results(report, config, root, entity_count)` — save metrics + append to history
- `get_history(config, root)` — read full history
- `metric_trend(history, metric_name)` — extract single metric across time
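The idea behind the trend and diff operations can be sketched with plain dictionaries. The snapshot shape used here (a `timestamp` string plus a `metrics` mapping) and the function names `metric_trend_sketch` and `diff_snapshots_sketch` are assumptions; the actual types in `history.py` may differ.

```python
from typing import Any, Dict, List, Optional, Tuple

# Each snapshot is modelled as a plain dict (assumed shape, for illustration only)
history: List[Dict[str, Any]] = [
    {"timestamp": "2026-02-01T09:00:00",
     "metrics": {"coverage_ratio": 0.62, "redundancy_ratio": 0.08}},
    {"timestamp": "2026-02-15T09:00:00",
     "metrics": {"coverage_ratio": 0.70}},
    {"timestamp": "2026-03-01T09:00:00",
     "metrics": {"coverage_ratio": 0.75, "redundancy_ratio": 0.04}},
]


def metric_trend_sketch(
    snapshots: List[Dict[str, Any]], metric_name: str
) -> List[Tuple[str, float]]:
    """Return (timestamp, value) pairs for snapshots that recorded the metric."""
    return [
        (snap["timestamp"], snap["metrics"][metric_name])
        for snap in snapshots
        if metric_name in snap["metrics"]
    ]


def diff_snapshots_sketch(
    old: Dict[str, float], new: Dict[str, float]
) -> Dict[str, Optional[float]]:
    """Change per metric (new minus old); None if either side lacks the metric."""
    changes: Dict[str, Optional[float]] = {}
    for name in sorted(set(old) | set(new)):
        if name in old and name in new:
            changes[name] = round(new[name] - old[name], 6)
        else:
            changes[name] = None
    return changes


print(metric_trend_sketch(history, "coverage_ratio"))
print(diff_snapshots_sketch(history[0]["metrics"], history[-1]["metrics"]))
```

Skipping snapshots that lack a metric, rather than filling in a default, keeps a trend from showing values that were never measured.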

---

## Viability

Viability is assessed by comparing current metrics to thresholds declared
in `infospace.yaml`.

```bash
markitect infospace viability
```

### Threshold model

```yaml
viability:
  coverage_ratio: { min: 0.60 }     # must be >= 0.60
  redundancy_ratio: { max: 0.05 }   # must be <= 0.05
  consistency_cycles: { max: 0 }    # must be exactly 0
```

Each threshold has `min` and/or `max` bounds, both inclusive. A metric passes
if it falls within its bounds. An infospace is viable when all thresholds pass.
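The pass rule described above can be sketched in a few lines. `ThresholdSketch` and `is_viable_sketch` are hypothetical names mirroring the `metric`, `min`, and `max` fields of `ViabilityThreshold`; treating a metric with no recorded value as a failure is an assumption, since the source does not say how missing metrics are handled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ThresholdSketch:
    """Hypothetical stand-in for ViabilityThreshold (metric, min, max)."""
    metric: str
    min: Optional[float] = None
    max: Optional[float] = None

    def passes(self, value: float) -> bool:
        """True if value lies within the inclusive bounds that are set."""
        if self.min is not None and value < self.min:
            return False
        if self.max is not None and value > self.max:
            return False
        return True


def is_viable_sketch(
    thresholds: List[ThresholdSketch], metrics: Dict[str, float]
) -> bool:
    """Viable only if every threshold has a recorded value within bounds.

    Assumption: a threshold whose metric is missing counts as a failure.
    """
    return all(
        t.metric in metrics and t.passes(metrics[t.metric]) for t in thresholds
    )


thresholds = [
    ThresholdSketch("coverage_ratio", min=0.60),
    ThresholdSketch("redundancy_ratio", max=0.05),
    ThresholdSketch("consistency_cycles", max=0),
]
current = {"coverage_ratio": 0.75, "redundancy_ratio": 0.04, "consistency_cycles": 0}
print(is_viable_sketch(thresholds, current))
```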

---

## Composition

One infospace can use another as a discipline. The composition model is
defined in `markitect/infospace/composition.py`.

### Binding a discipline

```bash
markitect infospace bind-discipline ./path/to/vsm-infospace --name "Viable System Model"
```

This adds a `DisciplineBinding` to `infospace.yaml` and validates the
discipline exists and has an `infospace.yaml`.

### Checking discipline status

```bash
markitect infospace disciplines
```

Shows: name, entity count, viability status, path.

### Viability requirement

A discipline must meet its own viability thresholds to be considered
reliable. The `check_discipline_status()` function loads the discipline's
metrics and runs its own threshold checks.

### Stale mapping detection

```bash
markitect infospace stale-mappings
```

Compares local mapping references against the discipline's current entity
set. If a referenced discipline entity has been removed, the mapping is
flagged as stale.
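At its core this comparison is a membership test against the discipline's current slugs. The sketch below assumes mapping references are held as a dict from a local entity slug to the discipline slugs it points at; that shape and the name `find_stale_sketch` are illustrative and do not reflect the real signature of `find_stale_mappings`.

```python
from typing import Dict, List, Set


def find_stale_sketch(
    mapping_references: Dict[str, List[str]], discipline_slugs: Set[str]
) -> Dict[str, List[str]]:
    """Map each local entity to the referenced discipline slugs that no longer exist."""
    stale: Dict[str, List[str]] = {}
    for local_slug, referenced in mapping_references.items():
        missing = [slug for slug in referenced if slug not in discipline_slugs]
        if missing:
            stale[local_slug] = missing
    return stale


# Hypothetical data: local entities mapped onto entities of a bound discipline
references = {
    "division-of-labour": ["system-1-operations", "variety"],
    "invisible-hand": ["system-2-coordination"],
}
# "variety" has since been removed from the discipline
current_discipline = {"system-1-operations", "system-2-coordination"}
print(find_stale_sketch(references, current_discipline))
```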

### Key functions

- `resolve_discipline_path(binding, root)` — resolve to absolute path
- `load_discipline_config(binding, root)` — load discipline's `infospace.yaml`
- `check_discipline_status(binding, root)` — full status with viability
- `get_discipline_entities(binding, root)` — entity list from discipline
- `find_stale_mappings(config, root, mapping_references)` — detect stale refs
- `bind_discipline(config, name, path, root)` — add binding to config

---

## Evaluation Output Format

Evaluation results use YAML frontmatter + markdown body. Defined in
`markitect/infospace/evaluation.py` and `evaluation_io.py`.

### Per-entity evaluation file

```markdown
---
entity_slug: division-of-labour
evaluator: openrouter/default
evaluated_at: '2026-02-19T10:30:00'
overall_score: 4.1667
scores:
  - name: definition_precision
    value: 4.5
    max_value: 5.0
  ...
---

# Evaluation: Division Of Labour

## definition_precision — 4.5 / 5.0

The definition clearly captures the core concept...
```
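A file in this layout can be separated into its two parts with the standard library alone. The helper below, `split_frontmatter`, is a hypothetical sketch and is not the reader implemented in `evaluation_io.py`; it only locates the `---` delimiters and leaves YAML parsing to a separate step.

```python
from typing import Tuple


def split_frontmatter(text: str) -> Tuple[str, str]:
    """Return (frontmatter, body); frontmatter is '' if the text has none."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:]).lstrip("\n")
            return frontmatter, body
    # No closing delimiter found: treat the whole text as body
    return "", text


document = """---
entity_slug: division-of-labour
overall_score: 4.1667
---

# Evaluation: Division Of Labour

The definition clearly captures the core concept...
"""
frontmatter, body = split_frontmatter(document)
print(frontmatter)
print(body.splitlines()[0])
```

The returned frontmatter string could then be handed to a YAML parser, for example PyYAML's `yaml.safe_load`, to obtain the scores as structured data.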

### Snapshot

```yaml
snapshot_id: abc12345
created_at: '2026-02-19T10:30:00+00:00'
schema_name: default
entity_count: 85
entity_evaluations: [...]
collection_metrics:
  - name: coverage_ratio
    value: 0.75
    concern: C2
```

---

## State

Runtime state is computed from entities, evaluations, and metrics.
Defined in `markitect/infospace/state.py`.

```python
from markitect.infospace import build_state

state = build_state(config, entities=entities, metrics=metrics)
state.is_viable          # True if all thresholds pass
state.viability_results  # List[ViabilityResult]
state.summary()          # Dict for display
```

---

## CLI Command Summary

All commands are under `markitect infospace`:

| Command | Purpose |
|---------|---------|
| `init` | Create a new `infospace.yaml` |
| `status` | Show entity count, domains, evaluation state |
| `entities` | List entities with metadata |
| `evaluate` | Run per-entity LLM evaluation |
| `check` | Run collection-level quality checks (C1-C5) |
| `viability` | Show viability dashboard |
| `history` | Show metrics history |
| `history-diff` | Compare two snapshots by date |
| `bind-discipline` | Bind an external infospace as a discipline |
| `disciplines` | List bound disciplines and viability |
| `stale-mappings` | Detect stale cross-infospace references |

---

## Platform Dependencies

The infospace tooling builds on these platform modules:

| Module | Used for |
|--------|----------|
| `markitect/llm/` | Embedding adapters, LLM evaluation |
| `markitect/analysis/graph.py` | Graph analysis (networkx wrapper) |
| `markitect/analysis/fca.py` | Formal Concept Analysis |
| `markitect/prompts/execution/batch.py` | Batch LLM evaluation |
| `markitect/prompts/dependencies/models.py` | DependencyGraph |