# MarkiTect Main Scope Assessment

Date: 2026-05-03

## Purpose

This assessment compares `/home/worsch/markitect-main` against the
`infospace-bench` intent, PRD, and FRS. It identifies which ideas should be
migrated, reimplemented, referenced, or deliberately left behind.

## Summary

`markitect-main` already contains substantial infospace work, but it is bundled
with responsibilities that this repo explicitly excludes: markdown parsing,
schema tooling, rendering, asset management, GraphQL experiments, prompt
runtime infrastructure, and broad platform behavior.

`infospace-bench` should therefore become a focused successor for the
application-layer infospace lifecycle. The right move is selective
reimplementation and migration of concepts, examples, tests, and documentation,
not a package copy.

## Source Areas Reviewed

| MarkiTect area | Assessment | Successor action |
| --- | --- | --- |
| `markitect/infospace/` | Closest match. Contains entity metadata, config, evaluation, collection checks, history, composition, relation parsing, graph export, and CLI glue. | Reimplement as the first functional baseline, adapting its boundaries to the lower-layer dependencies. |
| `examples/infospace-with-history/` | Strong reference experiment with `infospace.yaml`, source corpus, pipeline stages, viability thresholds, outputs, and metrics methodology. | Migrate as a reference pilot after pruning generated bulk and documenting provenance. |
| `docs/infospace-primitives.md` | Useful conceptual reference for topic, discipline, entity, evaluation, viability, checks, history, and composition. | Rewrite into this repo's docs with updated architecture terms. |
| `markitect/prompts/` | Provides generation workflow, dependency resolution, quality gates, traceability, and batch execution. | Reimplement only workflow-facing contracts; runtime/provider mechanics belong outside or behind adapters. |
| `markitect/analysis/graph.py` and relationship exports | Relevant for inspection and visualization. | Keep graph/relationship inspection as an application feature; rely on lower layers for reusable primitives where possible. |
| `markitect/schema/` and legacy schema modules | Useful as input constraints, but lower-level schema tooling is out of scope. | Reference or depend on schema tooling; do not copy wholesale. |
| `markitect/packaging/transclusion/` and `markitect/spaces/transclusion/` | Useful conceptually for composed artifacts, but implementation is syntax/tooling heavy. | Treat as dependency or later integration point, not first migration. |
| `markitect/assets/`, rendering plugins, static assets | Mostly outside current PRD/FRS except exported artifacts. | Leave behind unless a concrete infospace export needs assets. |
| GraphQL/server experiments | Platform/API layer not validated for this repo. | Defer. Start with CLI/service contracts only after the domain model is stable. |

## In-Scope Concepts To Carry Forward

- `infospace.yaml` as a declarative project artifact
- Topic, discipline binding, entity, relation, evaluation, and viability
- Entity parsing into structured metadata
- Collection checks: redundancy, coverage, coherence, consistency, granularity
- Metrics history and snapshot diffs
- Viability thresholds as explicit acceptance criteria
- Relationship graph export and inspection
- Reference corpus workflow: sources -> analyses -> entities -> mappings ->
  evaluations -> reports
- Pipeline definitions that are inspectable and reproducible
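
As a rough illustration of two of the concepts above, viability thresholds as acceptance criteria and snapshot diffs over metrics history, the following minimal sketch may help. The names `Threshold`, `check_viability`, and `diff_snapshots` are hypothetical and not taken from `markitect-main`; the sketch also assumes that every metric is a float where higher is better, which the real checks may not guarantee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Hypothetical acceptance criterion: the metric must be >= minimum."""

    metric: str
    minimum: float


def check_viability(
    metrics: dict[str, float], thresholds: list[Threshold]
) -> dict[str, bool]:
    """Return pass/fail per threshold; a missing metric counts as a failure."""
    return {
        t.metric: metrics.get(t.metric, float("-inf")) >= t.minimum
        for t in thresholds
    }


def diff_snapshots(old: dict[str, float], new: dict[str, float]) -> dict[str, float]:
    """Return the change for each metric present in both snapshots."""
    return {name: new[name] - old[name] for name in old.keys() & new.keys()}
```

Treating a missing metric as a failure keeps the acceptance criteria explicit: an infospace cannot pass by simply omitting an evaluation.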

## Reimplementation Boundaries

The successor should reimplement interfaces around these concepts, but with
cleaner responsibility boundaries:

- Markdown parsing should come from `markitect-tool` or a thin adapter.
- Persistence and long-running workflow orchestration should come from
  `kontextual-engine`.
- LLM calls should use `llm-connect` or an equivalent provider adapter.
- This repo should own concrete configuration, project layout, reports,
  fixtures, workflow definitions, and application-level glue.
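
One way to express these boundaries is with structural interfaces that the application-level glue depends on, so the concrete parser and provider can be swapped. This is only a sketch: `MarkdownParser`, `LLMProvider`, `EntityExtractor`, and their methods are hypothetical names and do not reflect the actual APIs of `markitect-tool` or `llm-connect`.

```python
from typing import Protocol


class MarkdownParser(Protocol):
    """Assumed shape of a thin adapter over the markdown tooling."""

    def parse(self, text: str) -> dict: ...


class LLMProvider(Protocol):
    """Assumed shape of a provider adapter for LLM calls."""

    def complete(self, prompt: str) -> str: ...


class EntityExtractor:
    """Application-level glue owned by this repo; depends only on the protocols."""

    def __init__(self, parser: MarkdownParser, llm: LLMProvider) -> None:
        self._parser = parser
        self._llm = llm

    def summarize(self, text: str) -> dict:
        # Parsing is delegated to the adapter; this class never touches syntax.
        parsed = self._parser.parse(text)
        title = parsed.get("title", "")
        # The provider call is likewise hidden behind the adapter.
        summary = self._llm.complete(f"Summarize: {title}")
        return {"title": title, "summary": summary}
```

Because the protocols are structural, tests can pass in small fakes without importing any lower-layer package.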

## Initial Architecture Recommendation

Start with a file-backed baseline before adding a service:

```text
infospaces/<slug>/infospace.yaml
infospaces/<slug>/artifacts/
infospaces/<slug>/output/
workflows/
reports/
docs/
```
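
A minimal sketch of scaffolding this layout, assuming a hypothetical `scaffold` helper. The contents written to `infospace.yaml` are a placeholder only, since the real schema is not defined in this assessment.

```python
from pathlib import Path


def scaffold(root: Path, slug: str) -> Path:
    """Create the file-backed layout for one infospace and return its directory."""
    space = root / "infospaces" / slug
    directories = (
        space / "artifacts",
        space / "output",
        root / "workflows",
        root / "reports",
        root / "docs",
    )
    for directory in directories:
        directory.mkdir(parents=True, exist_ok=True)

    config = space / "infospace.yaml"
    if not config.exists():
        # Placeholder content (assumption); never overwrite an existing config.
        config.write_text(f"# placeholder config\nslug: {slug}\n", encoding="utf-8")
    return space
```

Calling `scaffold(Path("."), "demo")` twice is safe: existing directories and an existing `infospace.yaml` are left untouched.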

Then add a small CLI around the FRS lifecycle:

- `create`
- `load` / `inspect`
- `add-artifact`
- `evaluate`
- `check`
- `export`
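
The command surface above could be wired up with the standard library's `argparse`, roughly as follows. The program name `infospace-bench` and the single positional `slug` argument are assumptions for illustration; `load` and `inspect` are registered as separate commands here, although they may turn out to be aliases.

```python
import argparse

# Subcommands taken from the lifecycle list above.
COMMANDS = ("create", "load", "inspect", "add-artifact", "evaluate", "check", "export")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per lifecycle step."""
    parser = argparse.ArgumentParser(prog="infospace-bench")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name in COMMANDS:
        command = subparsers.add_parser(name)
        # Hypothetical argument: the infospace to operate on.
        command.add_argument("slug")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"{args.command} -> {args.slug}")
```

Keeping the parser in its own function makes the command contract testable before any command has real behavior.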

Only introduce a service/API when there is a concrete need for long-running
state, external consumers, or integration with `kontextual-engine`.

## Migration Priority

1. Documentation and repo shape
2. Infospace config and artifact model
3. Evaluation output and metrics history
4. Collection checks and relationship inspection
5. Reference infospace pilot
6. Workflow execution and AI-assisted generation
7. Service/API layer

## Non-Migration List

Do not migrate these directly into `infospace-bench`:

- General markdown parser/serializer modules
- Schema generator/validator libraries as reusable tooling
- Asset management subsystem
- Rendering/plugin infrastructure
- GraphQL platform experiments
- Database and orchestration infrastructure
- Legacy compatibility layers

## Open Questions

- Should `infospace-bench` depend directly on local sibling repos during early
  development, or use published/package boundaries from the start?
- Which reference infospace should be the first maintained pilot: the existing
  Wealth of Nations/VSM experiment, or a smaller purpose-built corpus?
- Should State Hub track each infospace as its own workstream once the repo
  model exists?