markitect-main/roadmap/infospace-tooling/viable-information-spaces.md
tegwick b5e994b014 docs: preliminary introduction to Viable Information Spaces
Conceptual overview of infospaces as structured, evaluable, composable
knowledge collections. Establishes the vocabulary (topic, discipline,
entity, viability), the build cycle (extract, map, evaluate, refine),
the five collection quality concerns, and the composition model
(hierarchical, networked, swarm).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 23:54:53 +01:00

Viable Information Spaces

A preliminary introduction to the concepts, structure, and purpose of viable information spaces as a framework for structured knowledge work.


What is an Information Space?

An information space is a curated collection of concepts — each precisely defined, grounded in source material, and connected to the others — that together explain a topic. It is not a database, not a knowledge graph in the technical sense, and not a document collection. It is closer to what a domain expert carries in their head: a working vocabulary of ideas, their relationships, and the judgment to know which idea applies where.

The difference is that an information space makes this vocabulary explicit, evaluable, and composable. Every concept has a written definition. Every relationship can be traced. The quality of the whole collection can be measured and improved over time.

We use the term infospace as shorthand.


Why "Viable"?

The word comes from Stafford Beer's Viable System Model, but the idea generalises beyond it. A viable system is one that can maintain a separate existence — it is complete enough to function, coherent enough to hold together, and adaptive enough to improve when circumstances change.

A viable infospace has analogous properties, each a matter of degree:

  • Complete enough — it covers the topic well enough to answer the questions it was built to answer. Not every detail, but every concept that matters.
  • Coherent enough — its concepts connect into an explanatory web, not a disconnected list. You can trace how one idea leads to another.
  • Consistent enough — concepts don't contradict each other. Terms are used the same way throughout. Definitions don't go in circles.
  • Balanced enough — concepts operate at comparable levels of abstraction. The infospace doesn't mix foundational theories with trivial observations without acknowledging the difference.
  • Non-redundant enough — each concept earns its place. Two concepts that mean the same thing should be one concept.

None of these are absolute. "Enough" is defined by the purpose. An infospace built for teaching needs different coverage than one built for research. Viability is a profile of scores against thresholds that the user sets.
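
One way to read "a profile of scores against thresholds" is as a per-concern comparison. The sketch below assumes each concern yields a score between 0 and 1 where higher is better; the function names and the numbers are hypothetical, not part of any existing tooling.

```python
def viability_profile(scores: dict[str, float],
                      thresholds: dict[str, float]) -> dict[str, bool]:
    """For each concern with a threshold, report whether the score meets it."""
    # A concern without a score is treated as 0.0 (an assumption of this sketch).
    return {concern: scores.get(concern, 0.0) >= minimum
            for concern, minimum in thresholds.items()}


def is_viable(scores: dict[str, float], thresholds: dict[str, float]) -> bool:
    """Viable only if every thresholded concern passes."""
    return all(viability_profile(scores, thresholds).values())


# Illustrative numbers only: thresholds a teaching-oriented user might set.
teaching_thresholds = {"coverage": 0.6, "coherence": 0.8, "consistency": 0.9}
scores = {"coverage": 0.7, "coherence": 0.75, "consistency": 0.95}
profile = viability_profile(scores, teaching_thresholds)
```

The profile, rather than a single pass/fail flag, is what tells the user which concern to work on next.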


The Anatomy of an Infospace

Topic

Every infospace is built to explain something specific. The topic is the subject matter: a text, a system, a body of knowledge, a problem domain. In our first example, the topic is Adam Smith's The Wealth of Nations — the economic ideas contained in that specific work.

A topic sits within a broader domain (economics, biology, software engineering) but is more focused. The domain provides context; the topic provides the source material from which concepts are extracted.

Entities

The atomic units of an infospace are its entities — the individual concepts, mechanisms, and observations that constitute its vocabulary. Each entity has:

  • A name and unique identifier
  • A definition — precise, non-circular, distinguishable from neighbouring concepts
  • Provenance — where it came from (which chapter, passage, or data source)
  • A domain placement — which area of the topic it belongs to
  • Quality scores — how well it is defined, grounded, and connected

Entities are stored as individual files, one concept per file. This makes them independently addressable, diffable, and composable.
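
As a rough picture of the fields listed above, an entity could be modelled as a small record. This is a sketch only: the field names, the identifier format, and the example values are assumptions for illustration, not the stored file format.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    identifier: str   # unique and stable; the slug format here is hypothetical
    name: str
    definition: str   # precise, non-circular
    provenance: str   # chapter, passage, or data source
    domain: str       # area of the topic the entity belongs to
    quality_scores: dict[str, float] = field(default_factory=dict)

    def filename(self) -> str:
        """One concept per file: derive the file name from the identifier."""
        return f"{self.identifier}.md"


# Illustrative values, not copied from the actual infospace.
division_of_labour = Entity(
    identifier="division-of-labour",
    name="Division of Labour",
    definition="The splitting of production into specialised tasks.",
    provenance="Book I, Chapter 1",
    domain="production",
)
```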

Schemas

Schemas define what a well-formed entity looks like: which sections it must have, what validation rules apply, what quality metrics are evaluated. A schema is not code — it is a markdown document that both humans and LLMs read as instructions.

Schemas serve two purposes:

  1. Structural — they tell the extraction pipeline what to produce (required sections, word count ranges, heading formats)
  2. Evaluative — they define quality rubrics against which each entity is scored (definition precision, source grounding, explanatory value)

By changing a schema, you change what the infospace considers "good" without changing any infrastructure.
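
The schema itself stays a markdown document, but its structural rules can still be checked mechanically. A minimal sketch, assuming entity sections are second-level markdown headings and that the schema's rules have already been read into a list of required sections and a word-count range; `check_structure` is a hypothetical helper.

```python
import re


def check_structure(markdown: str,
                    required_sections: list[str],
                    word_range: tuple[int, int]) -> list[str]:
    """Return a list of problems; an empty list means structurally well-formed."""
    # Collect the titles of second-level headings ("## Title").
    headings = {m.group(1).strip()
                for m in re.finditer(r"^##\s+(.+)$", markdown, re.MULTILINE)}
    problems = [f"missing section: {name}"
                for name in required_sections if name not in headings]

    # Count words in the body, ignoring all heading lines.
    body = re.sub(r"^#.*$", "", markdown, flags=re.MULTILINE)
    words = len(body.split())
    low, high = word_range
    if not low <= words <= high:
        problems.append(f"word count {words} outside {low}-{high}")
    return problems


doc = (
    "# Division of Labour\n\n"
    "## Definition\n\nSplitting work into specialised tasks.\n\n"
    "## Provenance\n\nBook I, Chapter 1.\n"
)
issues = check_structure(doc, ["Definition", "Provenance", "Mapping"], (5, 200))
```

The evaluative half of a schema (rubrics) is not covered here, since it relies on LLM judgment rather than on pattern matching.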

Disciplines

An infospace doesn't just catalogue what's in the source material; it looks at the source through a lens. We call this lens a discipline: a structured framework of concepts from another domain, applied to illuminate the topic at hand.

In our example, the discipline is Stafford Beer's Viable System Model — a set of concepts from systems theory (System 1 through System 5, recursion, variety, viability) applied to the economic ideas in Smith's work. The VSM provides the analytical structure; Smith provides the raw material.

The key insight: a discipline is itself an infospace. The VSM concepts (S1-S5, recursion, variety, algedonic signals) form their own curated, evaluable collection of ideas. To use the VSM as a discipline, it must first be a viable infospace in its own right — its concepts must be well-defined, coherent, and complete.

This leads to a recursive property: infospaces can be built on top of other infospaces. The Wealth of Nations infospace, viewed through the VSM lens, could itself become a discipline applied to analyse a modern supply chain. Each layer adds structure without losing the detail beneath it.


How Infospaces Are Built

Building an infospace is an incremental process with four repeating phases:

1. Extract

Source material is processed one unit at a time (a chapter, a document, a dataset). For each unit, an LLM extracts entities according to the schemas and guidelines. Entities that already exist are recognised and skipped — the infospace grows by accumulation, not duplication.
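
The "accumulation, not duplication" rule can be sketched as a merge step. The sketch assumes existing entities are recognised by exact identifier match, which is a simplification: in practice recognition may be fuzzier and left to the LLM. `merge_extracted` is a hypothetical name.

```python
def merge_extracted(existing: dict[str, dict], extracted: list[dict]) -> list[str]:
    """Add newly extracted entities to `existing`, skipping known identifiers.

    Returns the identifiers that were actually added.
    """
    added = []
    for entity in extracted:
        key = entity["id"]
        if key in existing:
            continue  # already in the infospace: skip rather than duplicate
        existing[key] = entity
        added.append(key)
    return added


infospace = {"division-of-labour": {"id": "division-of-labour"}}
new_ids = merge_extracted(
    infospace,
    [{"id": "division-of-labour"}, {"id": "market-price"}, {"id": "market-price"}],
)
```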

2. Map

Extracted entities are mapped to the discipline. In our example, each economic concept is mapped to a VSM system with a strength rating and rationale. This is where the discipline lens does its work: it forces the question "what role does this concept play in the larger system?"
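
A mapping can be pictured as a small validated record. In this sketch the three-step strength scale, the field names, and the example mapping are all assumptions; the source only says that each mapping carries a strength rating and a rationale.

```python
from dataclasses import dataclass

STRENGTHS = ("weak", "moderate", "strong")  # hypothetical rating scale


@dataclass(frozen=True)
class Mapping:
    entity_id: str
    discipline_concept: str  # e.g. one of the VSM systems, "S1" to "S5"
    strength: str
    rationale: str

    def __post_init__(self):
        if self.strength not in STRENGTHS:
            raise ValueError(f"unknown strength: {self.strength!r}")
        if not self.rationale.strip():
            raise ValueError("a mapping needs a rationale")


# Illustrative only; not a mapping taken from the example infospace.
example = Mapping("division-of-labour", "S1", "strong",
                  "Placeholder rationale for illustration.")
```

Requiring a non-empty rationale keeps the mapping answerable to the question the discipline lens is meant to force.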

3. Evaluate

After extraction and mapping, the infospace is evaluated at two levels:

  • Per-entity: each concept is scored against quality rubrics. Is the definition precise? Is it grounded in the source? Does it connect meaningfully to the discipline?
  • Collection-level: the set of concepts is assessed for redundancy, coverage, coherence, consistency, and granularity balance.

Evaluation produces structured, machine-readable scores — not prose narratives. These scores are tracked over time.

4. Refine

Evaluation reveals what needs improvement. Redundant concepts are merged or archived. Coverage gaps are addressed by re-extracting with improved guidelines. Inconsistencies are resolved by clarifying definitions. Guidelines and schemas are updated. The cycle repeats.

This loop — extract, map, evaluate, refine — is the heartbeat of a viable infospace. Each iteration makes the infospace more viable: more complete, more coherent, more consistent.


How Infospaces Are Evaluated

Quality is assessed through two complementary mechanisms:

LLM Evaluation

A language model reads an entity (or a pair of entities) and judges it against defined rubrics. This captures qualitative aspects that can't be computed mechanically: Is this definition actually precise? Does this mapping rationale make sense? Are these two concepts really different?

LLM evaluation is always delegated — it runs through prompt templates and the platform's LLM integration, never through the human or agent working on infrastructure. This separation keeps domain judgment in the problem space.

Deterministic Aggregation

Structured scores from LLM evaluation, plus metrics computed directly from files (section counts, word lengths, graph properties, similarity matrices), are aggregated into collection-level indicators. These are numbers that can be tracked, diffed, and plotted:

  • Redundancy ratio — what fraction of concepts substantially overlap
  • Coverage ratio — what fraction of the domain-discipline matrix is populated
  • Graph density — how connected the concept web is
  • Cycle count — how many circular definition chains exist
  • Granularity entropy — how balanced the abstraction levels are

These indicators, compared against user-defined thresholds, determine whether the infospace is viable for its intended purpose.
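
Of the indicators above, graph density is the simplest to compute directly from files. A minimal sketch, assuming links between concepts are undirected and that density is the number of distinct links divided by the number of possible pairs; `graph_density` is a hypothetical helper.

```python
def graph_density(nodes: set[str], links: list[tuple[str, str]]) -> float:
    """Density of an undirected simple graph: 2E / (N * (N - 1))."""
    # Deduplicate links regardless of direction and ignore self-links.
    edges = {frozenset(pair) for pair in links if pair[0] != pair[1]}
    n = len(nodes)
    if n < 2:
        return 0.0
    return 2 * len(edges) / (n * (n - 1))


concepts = {"a", "b", "c", "d"}
density = graph_density(concepts, [("a", "b"), ("b", "a"), ("b", "c"), ("c", "c")])
```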


Five Concerns of Collection Quality

Individual concept quality (is this definition good?) is necessary but not sufficient. An infospace made of individually excellent concepts can still fail as a collection. Five concerns capture what can go wrong:

Redundancy

Do two concepts mean the same thing? Overlap wastes the reader's attention and creates ambiguity about which concept to use. Redundancy is detected through embedding similarity (are the definitions close in meaning?) confirmed by LLM judgment (are they genuinely the same concept, or merely related?).
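
The first half of that check, embedding similarity, can be sketched with plain cosine similarity. The sketch assumes definition embeddings are already available as lists of floats (the embedding model is not specified here), and the 0.9 threshold and the ratio definition are assumptions. Pairs it flags are only candidates; the LLM judgment step still decides whether they are truly the same concept.

```python
import math
from itertools import combinations


def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def redundancy_candidates(embeddings: dict[str, list[float]],
                          threshold: float = 0.9) -> list[tuple[str, str]]:
    """Pairs of entity ids whose definition embeddings are very close."""
    return [(a, b) for a, b in combinations(sorted(embeddings), 2)
            if cosine(embeddings[a], embeddings[b]) >= threshold]


def redundancy_ratio(embeddings: dict[str, list[float]],
                     threshold: float = 0.9) -> float:
    """Fraction of entities that appear in at least one candidate pair."""
    flagged = {name for pair in redundancy_candidates(embeddings, threshold)
               for name in pair}
    return len(flagged) / len(embeddings) if embeddings else 0.0


# Toy two-dimensional vectors standing in for real embeddings.
toy = {"x": [1.0, 0.0], "y": [1.0, 0.1], "z": [0.0, 1.0]}
pairs = redundancy_candidates(toy)
```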

Coverage

Does the concept set cover the domain? Are there areas of the topic that have no corresponding concepts? Coverage is assessed structurally (which cells in the domain-discipline matrix are empty?) and functionally (can the infospace answer the questions it was built to answer?).
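
The structural side of coverage reduces to counting populated cells. A minimal sketch, assuming each mapping can be reduced to a (domain, discipline concept) pair; the domain names below are made up for illustration.

```python
def coverage(mapped_cells: list[tuple[str, str]],
             domains: list[str],
             discipline_concepts: list[str]) -> tuple[float, list[tuple[str, str]]]:
    """Return the coverage ratio and the list of empty matrix cells."""
    populated = set(mapped_cells)
    cells = [(d, c) for d in domains for c in discipline_concepts]
    empty = [cell for cell in cells if cell not in populated]
    ratio = 1 - len(empty) / len(cells) if cells else 0.0
    return ratio, empty


ratio, gaps = coverage(
    [("trade", "S1"), ("labour", "S1"), ("trade", "S1")],
    domains=["trade", "labour"],
    discipline_concepts=["S1", "S2"],
)
```

The list of empty cells is as useful as the ratio, because it tells the refine phase where to re-extract. Functional coverage (competency questions) needs a separate check.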

Coherence

Do the concepts form a connected web of explanations, or a fragmented list of isolated ideas? Coherence is measured through graph analysis: connected components (is everything reachable?), modularity (are there meaningful clusters?), and bridge concepts (which ideas connect different areas?).
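
The "is everything reachable?" question is a connected-components computation. A minimal sketch using breadth-first search over undirected links, with no graph library assumed; modularity and bridge detection are left out.

```python
from collections import defaultdict, deque


def connected_components(nodes: set[str],
                         links: list[tuple[str, str]]) -> list[set[str]]:
    """Group concepts into sets that are mutually reachable via links."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, components = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return components


parts = connected_components({"a", "b", "c", "d"}, [("a", "b"), ("b", "c")])
```

A single component means the web is connected; every extra component is an island of concepts that nothing else refers to.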

Consistency

Are concepts defined in terms of each other without contradiction? Are there circular definition chains? Do definitions use terms that should be concepts but aren't? Consistency is checked through dependency graph analysis (cycles, undefined terms) and LLM pairwise judgment (do related definitions contradict each other?).

Granularity Balance

Are concepts at comparable levels of abstraction? An infospace that mixes broad theoretical principles with narrow observations — without acknowledging the difference — confuses more than it explains. Balance is assessed by classifying each concept's abstraction level and measuring the distribution.
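
Measuring the distribution can be done with Shannon entropy, normalised so that 0 means every concept sits at one level and 1 means the levels are evenly populated. The three level names below are an assumption of this sketch, as is the choice of normalisation.

```python
import math
from collections import Counter

LEVELS = ("principle", "mechanism", "observation")  # hypothetical classification


def granularity_entropy(levels: list[str], all_levels=LEVELS) -> float:
    """Normalised Shannon entropy of the abstraction-level distribution."""
    if not levels or len(all_levels) < 2:
        return 0.0
    total = len(levels)
    entropy = sum((count / total) * math.log2(total / count)
                  for count in Counter(levels).values())
    return entropy / math.log2(len(all_levels))


even = granularity_entropy(["principle", "mechanism", "observation"] * 2)
skewed = granularity_entropy(["mechanism"] * 5)
```

Whether a high value is desirable depends on the purpose: some infospaces may deliberately concentrate on one level, in which case the threshold should be set accordingly.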


Infospaces as Organisms

The biological metaphor is deliberate. A viable organism maintains its identity while exchanging material with its environment. It has internal coherence (its parts work together), boundary integrity (it is distinguishable from its surroundings), and adaptive capacity (it responds to change).

Infospaces exhibit the same properties:

  • Internal coherence — concepts connect and support each other
  • Boundary — the topic and discipline define what belongs and what doesn't
  • Adaptation — evaluation and refinement allow the infospace to improve

And like organisms, infospaces don't exist in isolation.

Hierarchical Composition

One infospace can serve as a discipline for another. The VSM infospace provides the lens for the Wealth of Nations infospace, which could provide the lens for a supply chain infospace. Each layer adds structure and interpretive power. This is analogous to biological organisation: cells compose into tissues, tissues into organs, organs into organisms.

For this to work, the lower-level infospace must be viable — you can't build reliable analysis on a shaky foundation. A discipline that is incomplete or inconsistent will produce unreliable mappings.
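
The layering and its viability precondition can be sketched as a guarded binding. Here the `viable` flag is a stand-in for a full threshold check, and the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Infospace:
    topic: str
    viable: bool = False  # stand-in for comparing indicators to thresholds
    discipline: Optional["Infospace"] = None

    def bind_discipline(self, other: "Infospace") -> None:
        """Use another infospace as the lens, but only if it is viable."""
        if not other.viable:
            raise ValueError(f"{other.topic!r} is not viable; cannot use it as a discipline")
        self.discipline = other

    def lens_chain(self) -> list[str]:
        """Topics of the disciplines beneath this infospace, nearest first."""
        chain, current = [], self.discipline
        while current is not None:
            chain.append(current.topic)
            current = current.discipline
        return chain


vsm = Infospace("Viable System Model", viable=True)
smith = Infospace("The Wealth of Nations", viable=True)
smith.bind_discipline(vsm)
supply_chain = Infospace("modern supply chain")
supply_chain.bind_discipline(smith)
```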

Network Composition

Infospaces can also relate laterally. Two infospaces at the same level might share concepts, reference each other's entities, or provide complementary views of overlapping domains. An infospace built on The Wealth of Nations and one built on Marx's Capital might share economic entities while differing in their analytical discipline.

This networked structure mirrors how knowledge actually works: fields overlap, vocabularies are shared and contested, and understanding grows by connecting islands of well-organised thought.

Swarm Behaviour

When many infospaces exist and interact, emergent properties appear. Common entities across many infospaces become well-tested through repeated evaluation in different contexts. Concepts that survive across multiple disciplines are more likely to be fundamental. Gaps visible from one perspective may be filled by insights from another.

This is speculative territory for now, but the tooling should be designed with it in mind: infospaces as first-class, composable, addressable units of knowledge.


The Role of Tooling

An infospace is a living artefact that requires ongoing maintenance. The tooling must support every phase of the lifecycle:

Creating an infospace

Declaring a topic, binding disciplines, defining schemas and competency questions, setting viability thresholds. This should be a single configuration step, not a programming exercise.
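
To make "a single configuration step" concrete, the declaration could be as small as one record. Everything in this sketch is hypothetical: the field names, the schema path, the competency question, and the threshold value are placeholders, not the format of any existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class InfospaceConfig:
    topic: str
    disciplines: list[str]
    schemas: list[str]
    competency_questions: list[str] = field(default_factory=list)
    thresholds: dict[str, float] = field(default_factory=dict)

    def problems(self) -> list[str]:
        """List what is still missing before the infospace can be populated."""
        issues = []
        if not self.disciplines:
            issues.append("no discipline bound")
        if not self.schemas:
            issues.append("no schema defined")
        if not self.thresholds:
            issues.append("no viability thresholds set")
        return issues


config = InfospaceConfig(
    topic="The Wealth of Nations",
    disciplines=["viable-system-model"],   # hypothetical identifier
    schemas=["schemas/entity.md"],         # hypothetical path
    competency_questions=["Which concepts explain how prices are set?"],
    thresholds={"coverage_ratio": 0.7},    # illustrative value
)
```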

Populating an infospace

Processing source material through the extract-map pipeline, one unit at a time. Progress is tracked. Each addition is committed to version history.

Evaluating an infospace

Running per-entity and collection-level checks. Producing structured, machine-readable scores. Comparing against viability thresholds. Identifying specific issues (this entity is redundant, this domain gap needs filling, these definitions contradict).

Refining an infospace

Acting on evaluation results: archiving redundant entities, re-extracting with improved guidelines, updating schemas, re-evaluating. Every change is traceable.

Composing infospaces

Binding one infospace as a discipline for another. Checking that the discipline is viable. Propagating changes when the discipline's concepts are updated.

Monitoring an infospace

Tracking metrics over time. Seeing how coverage, coherence, and consistency evolve as content is added. Detecting regressions when a re-extraction reduces quality.
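
Regression detection amounts to comparing two metric snapshots. A minimal sketch, assuming snapshots are flat name-to-number maps and that the caller says which metrics are better when lower; the metric names follow the indicators listed earlier, and the numbers are illustrative.

```python
LOWER_IS_BETTER = frozenset({"redundancy_ratio", "cycle_count"})


def regressions(previous: dict[str, float],
                current: dict[str, float],
                tolerance: float = 0.0,
                lower_is_better: frozenset = LOWER_IS_BETTER) -> dict[str, tuple]:
    """Return metrics that got worse by more than `tolerance`, as (old, new)."""
    worse = {}
    for name, old in previous.items():
        if name not in current:
            continue  # metric not measured this time: nothing to compare
        new = current[name]
        change = old - new if name in lower_is_better else new - old
        if change < -tolerance:
            worse[name] = (old, new)
    return worse


before = {"coverage_ratio": 0.8, "cycle_count": 2, "graph_density": 0.3}
after = {"coverage_ratio": 0.7, "cycle_count": 1, "graph_density": 0.3}
flagged = regressions(before, after)
```

A non-zero tolerance avoids flagging small fluctuations that come from LLM scoring noise rather than from a real loss of quality.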

The tooling should present these operations as simple, well-documented commands — not as infrastructure details. The user thinks in terms of "evaluate my infospace" and "check for redundancy", not in terms of embedding vectors and graph algorithms.


Where We Are

We have built the first example infospace: 85 economic entities from Adam Smith's The Wealth of Nations, mapped to Beer's Viable System Model, with schemas, prompt templates, and a chapter-by-chapter pipeline.

This example has taught us what works (incremental extraction, deduplication, flat canonical entity sets, transclusion views) and what's missing (per-entity evaluation, collection-level checks, a composition model, clean tooling commands).

The work ahead is to generalise from this example: build the platform capabilities needed, create the tooling layer that makes infospace operations accessible, and then revisit the example as both a validation and a tutorial.

The goal is that anyone with a body of source material and an analytical framework can create a viable infospace — and that infospaces, once built, become reusable intellectual tools for future work.