---
id: IB-WP-0013
type: workplan
title: "Wealth VSM Generation Pipeline Parity"
domain: markitect
repo: infospace-bench
status: completed
owner: markitect
topic_slug: markitect
created: "2026-05-14"
updated: "2026-05-14"
state_hub_workstream_slug: "ib-wp-0013-wealth-vsm-generation-pipeline-parity"
state_hub_workstream_id: "74dc579e-9b03-4a00-b739-84b1007cfb94"
---

# IB-WP-0013 - Wealth VSM Generation Pipeline Parity

## Goal

Make `infospace-bench` capable of regenerating the Adam Smith `Wealth of Nations` / VSM infospace through explicit, auditable workflows.

This should replace the old `markitect-project` generation path without copying its hidden provider calls, implicit output conventions, or monolithic `process` command shape.

## Intent

The legacy implementation could run a chapter corpus through:

- entity extraction
- VSM mapping
- chapter-level analysis synthesis
- entity evaluation
- classification and relation enrichment
- collection metrics

The successor should express those stages as declared infospace workflows with deterministic planning, fake-adapter tests, explicit assisted-generation requests, stable manifest registration, and clear provenance.

## Non-Goals

- Recreate the old `process_chapters.py` script as-is.
- Hide provider-specific LLM calls behind a generic command.
- Require a live provider or network access for default tests.
- Commit the full regenerated Wealth/VSM output before a one-chapter pilot is proven.
- Move durable runtime, retrieval, or audit responsibilities into `infospace-bench`; those remain `kontextual-engine` concerns.
## Tasks

### T01 - Legacy pipeline decomposition and corpus map

```task
id: IB-WP-0013-T01
status: done
priority: high
state_hub_task_id: "2c558d1e-290f-4e0e-abe6-37302cc31ac4"
```

- Map legacy `examples/infospace-with-history/process_chapters.py`
- Inventory old templates: `extract-entities`, `map-to-vsm`, `synthesize-analysis`, `evaluate-entity`, and `assess-metrics`
- Inventory source corpus, guidelines, VSM reference artifacts, generated outputs, processing logs, and metrics files
- Record what must be migrated, reframed, delegated, deferred, or retired
- Pick the first one-chapter golden target, preferably Book I Chapter III so it aligns with the current pruned legacy slice

### T02 - Assisted generation adapter and CLI boundary

```task
id: IB-WP-0013-T02
status: done
priority: high
state_hub_task_id: "70beb49c-49a3-49f4-9b3a-a4c5bdb88485"
```

- Extend workflow execution so assisted stages can be executed through an explicit adapter selected by the caller
- Keep dry-run planning as the default safe path
- Add a deterministic fake adapter for tests
- Persist assisted requests, provider metadata, generated outputs, and run records
- Expose CLI/API behavior without embedding provider-specific code in core workflow logic

### T03 - Entity bundle splitting and manifest registration

```task
id: IB-WP-0013-T03
status: done
priority: high
state_hub_task_id: "4a340077-f0ab-40fe-a0bc-0fa94a325774"
```

- Parse generated chapter-level entity bundles into individual entity artifacts
- Normalize stable artifact IDs and filenames
- Register each artifact in `artifacts/index.yaml`
- Preserve source chapter, workflow, stage, provider, and input provenance
- Make reruns idempotent: unchanged artifacts should not duplicate manifest entries
- Add tests for malformed bundles, duplicate entities, and manifest updates

### T04 - VSM mapping analysis and evaluation workflows

```task
id: IB-WP-0013-T04
status: done
priority: high
state_hub_task_id: "62696191-d6fa-4d34-bf18-97f390a31b61"
```

- Recreate `map-to-vsm` as an explicit assisted workflow
- Recreate `synthesize-analysis` as an explicit assisted workflow
- Recreate entity evaluation as an explicit assisted workflow that writes successor `artifact_id` evaluation files
- Ensure generated mappings and relations can be parsed by current semantic models or clearly identify required model extensions
- Connect generated evaluations to metrics/history and viability checks

### T05 - Wealth VSM pilot scale-up acceptance

```task
id: IB-WP-0013-T05
status: done
priority: medium
state_hub_task_id: "fe8dd175-9630-4fe1-99aa-2f3e58172a52"
```

- Prove one-chapter regeneration end to end with deterministic tests
- Add a committed pilot report comparing regenerated successor output with the legacy generated output shape
- Add docs for running a live provider-backed generation outside the default test suite
- Document cost, rate-limit, resume, and reproducibility guidance
- Define the acceptance path for scaling from one chapter to the full corpus

## Acceptance

- A user can inspect, plan, and run the Wealth/VSM generation workflow over a one-chapter pilot without using the old `markitect-project` process script
- Default tests use fake adapters and are deterministic
- Generated entities are split into stable files and registered in the manifest
- Evaluation outputs use successor `artifact_id` semantics and feed metrics history
- The workflow clearly distinguishes deterministic template stages from assisted provider-backed stages
- Remaining full-corpus risks are documented before any large generation run

## Relationship To IB-WP-0014

This workplan can start on the current local-folder backend. It should avoid hard-coding storage assumptions where reasonable, but it is not blocked by the backend abstraction workplan.

## Implementation

- Added `docs/wealth-vsm-generation-pipeline.md` with the legacy pipeline decomposition, one-chapter pilot path, live-provider guidance, and full corpus scale-up sequence.
- Added `infospaces/wealth-vsm-generation-pilot/` with Book I Chapter III, explicit extraction, mapping/analysis, and evaluation workflows, deterministic fixture responses, contracts, and a pilot report.
- Added `FixtureAssistedGenerationAdapter` and CLI `workflow run --fixture-responses` support so assisted stages are explicit and deterministic by default.
- Added entity bundle parsing/splitting with idempotent manifest registration.
- Added evaluation output handling so generated evaluation files feed `infospace-bench check` metrics/history.
- Added `tests/test_wealth_vsm_generation.py`.

## Verification

- `python3 -m pytest tests/test_wealth_vsm_generation.py`
- `python3 -m pytest`
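
## Illustrative Sketch

The idempotent manifest registration described in T03 could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `infospace-bench`: `ArtifactEntry`, `slugify`, `register`, the `entity-` ID prefix, and the `artifacts/entities/` path are all hypothetical, and a plain dict stands in for `artifacts/index.yaml`.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactEntry:
    # Hypothetical manifest record; the real index schema may differ.
    artifact_id: str
    path: str
    sha256: str
    source_chapter: str


def slugify(name: str) -> str:
    """Normalize an entity name into a stable, filename-safe slug."""
    cleaned = "".join(ch.lower() if ch.isalnum() else "-" for ch in name.strip())
    return "-".join(part for part in cleaned.split("-") if part)


def register(manifest: dict, name: str, body: str, source_chapter: str) -> bool:
    """Insert or update one entry; return True only if the manifest changed."""
    artifact_id = f"entity-{slugify(name)}"  # assumed ID convention
    entry = ArtifactEntry(
        artifact_id=artifact_id,
        path=f"artifacts/entities/{artifact_id}.md",  # assumed layout
        sha256=hashlib.sha256(body.encode("utf-8")).hexdigest(),
        source_chapter=source_chapter,
    )
    if manifest.get(artifact_id) == entry:
        return False  # unchanged rerun: nothing is duplicated or rewritten
    manifest[artifact_id] = entry  # keyed by ID, so an update replaces in place
    return True


manifest: dict = {}
first = register(manifest, "Division of Labour", "text", "book-1-chapter-3")
rerun = register(manifest, "Division of Labour", "text", "book-1-chapter-3")
```

Keying the manifest by normalized ID means a rerun with identical content is a no-op, while changed content replaces the existing entry instead of appending a second one.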