infospace pipeline for wealth of nations example
@@ -87,3 +87,12 @@ infospace-bench workflow plan infospaces/bootstrap-pilot bootstrap-readiness
infospace-bench workflow run infospaces/bootstrap-pilot bootstrap-readiness
```

Run the Wealth/VSM one-chapter generation pilot with deterministic assisted
fixtures:

```bash
infospace-bench workflow run infospaces/wealth-vsm-generation-pilot wealth-vsm-extract-entities --fixture-responses infospaces/wealth-vsm-generation-pilot/workflows/fixtures/wealth-vsm-fake-responses.yaml
infospace-bench workflow run infospaces/wealth-vsm-generation-pilot wealth-vsm-map-and-analyze --fixture-responses infospaces/wealth-vsm-generation-pilot/workflows/fixtures/wealth-vsm-fake-responses.yaml
infospace-bench workflow run infospaces/wealth-vsm-generation-pilot wealth-vsm-evaluate-entities --fixture-responses infospaces/wealth-vsm-generation-pilot/workflows/fixtures/wealth-vsm-fake-responses.yaml
infospace-bench check infospaces/wealth-vsm-generation-pilot
```

@@ -30,6 +30,7 @@ considered a replacement for each in-scope legacy infospace behavior from
| Persist durable assets | Optional engine-backed repository adapter | Dry-run sync tests and integration design | `IB-WP-0010` | boundary done |
| Run a legacy-derived pilot | Pruned `infospace-with-history` migration | Pilot corpus, migration report, parity comparison | `IB-WP-0011` | done |
| Provide command migration path | Legacy command parity guide | Command table, examples, migration guide, decision record, acceptance tests | `IB-WP-0012` | done |
| Regenerate Wealth/VSM pilot | Explicit assisted workflows and deterministic fixtures | One-chapter generation tests, bundle splitting, evaluation metrics, scale-up docs | `IB-WP-0013` | done |

## Replacement Gates

76
docs/wealth-vsm-generation-pipeline.md
Normal file
@@ -0,0 +1,76 @@
# Wealth VSM Generation Pipeline

Date: 2026-05-14

## Purpose

This document defines how `infospace-bench` regenerates the Adam Smith
`Wealth of Nations` / VSM infospace through explicit workflows.

The successor path is workflow-first. It does not reuse the legacy
`process_chapters.py` entrypoint, hide provider calls in a broad command, or
write generated files outside the artifact manifest.

## Legacy pipeline decomposition

The old Wealth/VSM experiment in `markitect-main` processed source chapters
through these conceptual stages:

| Legacy stage | Successor workflow shape | Notes |
| --- | --- | --- |
| `extract-entities` | `wealth-vsm-extract-entities` assisted stage plus `split_entities` stage | Assisted output is a chapter entity bundle; bench splits and registers stable entity artifacts. |
| `map-to-vsm` | `wealth-vsm-map-and-analyze` assisted relation stage | Relation artifacts use the successor relation parser and manifest IDs. |
| `synthesize-analysis` | `wealth-vsm-map-and-analyze` assisted analysis stage | Analysis remains a generated artifact with source provenance. |
| `evaluate-entity` | `wealth-vsm-evaluate-entities` assisted stage | Evaluation files use successor `artifact_id` frontmatter. |
| `assess-metrics` | `infospace-bench check` | Deterministic checks merge generated evaluations into metrics and history. |

The first golden target is Book I Chapter III because it grounds the existing
`wealth-vsm-legacy-slice` pilot and exercises the market-extent relation.

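As a quick reference, the stage mapping in the table can be sketched as a small lookup helper. This is only an illustration: `successor_for` is a hypothetical function name, not part of `infospace-bench`, and the values are copied from the table. Note that `assess-metrics` maps to a command rather than a workflow name.

```bash
#!/usr/bin/env bash
# Sketch: look up what replaces a legacy stage.
# successor_for is a hypothetical helper; the mapping mirrors the table.
successor_for() {
  case "$1" in
    extract-entities)    echo "wealth-vsm-extract-entities" ;;
    map-to-vsm)          echo "wealth-vsm-map-and-analyze" ;;
    synthesize-analysis) echo "wealth-vsm-map-and-analyze" ;;
    evaluate-entity)     echo "wealth-vsm-evaluate-entities" ;;
    # Not a workflow: metrics are produced by the deterministic check command.
    assess-metrics)      echo "infospace-bench check" ;;
    *)
      echo "unknown legacy stage: $1" >&2
      return 1
      ;;
  esac
}
```

Two legacy stages (`map-to-vsm` and `synthesize-analysis`) resolve to the same successor workflow, which is why the helper returns a name rather than assuming a one-to-one mapping.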
## One-chapter pilot

`infospaces/wealth-vsm-generation-pilot/` contains:

- one source excerpt: `book-1-chapter-03.md`
- explicit workflow declarations for extraction, VSM mapping/analysis, and
  entity evaluation
- deterministic fixture responses for tests
- markdown contracts for generated entity and relation artifacts
- a pilot report comparing the successor workflow shape with the legacy
  process script

Default tests use fixture responses so they do not require network access,
provider credentials, or live model output.

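The fixture-backed pilot can be driven from a small wrapper script. The sketch below reuses the `infospace-bench workflow run ... --fixture-responses` and `infospace-bench check` invocations from the README. The `DRY_RUN` variable and the `run` / `run_pilot` helpers are assumptions of this sketch, not CLI features; by default the script only prints the commands it would execute.

```bash
#!/usr/bin/env bash
set -euo pipefail

PILOT="infospaces/wealth-vsm-generation-pilot"
FIXTURES="$PILOT/workflows/fixtures/wealth-vsm-fake-responses.yaml"
WORKFLOWS=(
  wealth-vsm-extract-entities
  wealth-vsm-map-and-analyze
  wealth-vsm-evaluate-entities
)

# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Run the three assisted workflows in order, then the deterministic check.
# With DRY_RUN=0, set -e stops the script at the first failing command.
run_pilot() {
  local workflow
  for workflow in "${WORKFLOWS[@]}"; do
    run infospace-bench workflow run "$PILOT" "$workflow" \
      --fixture-responses "$FIXTURES"
  done
  run infospace-bench check "$PILOT"
}

run_pilot
```

Invoking the script with `DRY_RUN=0` would execute the commands for real, assuming `infospace-bench` is on `PATH` and the script is started from the repository root.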
## Live provider-backed generation

Any live provider-backed generation should use the same workflow declarations and
the same assisted request records. Provider adapters must be selected
explicitly by the caller and should record provider metadata in workflow run
records and artifact provenance.

Live runs should document:

- provider and model
- prompt/template version
- source corpus selection
- retry and rate-limit settings
- expected cost range
- resume strategy
- generated artifact review status

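One way to keep live-run notes honest is a small completeness check over a plain `key: value` notes file. The sketch below is an assumption throughout: the notes file format, the key names in `REQUIRED_FIELDS`, and the `missing_run_note_fields` helper are hypothetical and not defined by `infospace-bench`. The keys follow the list above, with "provider and model" split into two fields.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical key names, one per documented item.
REQUIRED_FIELDS=(
  provider
  model
  prompt_version
  source_selection
  retry_and_rate_limit
  expected_cost_range
  resume_strategy
  review_status
)

# Print every required field that is absent or has an empty value.
# Returns 1 if anything is missing, 0 if the notes are complete.
missing_run_note_fields() {
  local notes_file="$1" field missing=0
  for field in "${REQUIRED_FIELDS[@]}"; do
    # Present means: a line starting with "<field>:" and a non-blank value.
    if ! grep -Eq "^${field}:[[:space:]]*[^[:space:]]" "$notes_file"; then
      echo "$field"
      missing=1
    fi
  done
  return "$missing"
}
```

A caller could gate a live run on this check, for example `missing_run_note_fields run-notes.txt || exit 1`, where `run-notes.txt` is again a hypothetical file name.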
## Full corpus scale-up

Scale-up should proceed only after the one-chapter pilot is green.

Recommended sequence:

1. Run Book I Chapter III with fixture responses.
2. Run Book I Chapter III with a live provider in a disposable copy.
3. Review generated entities, relations, evaluations, and metrics.
4. Add a small Book I batch with explicit cost and resume notes.
5. Only then run the full corpus.

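Step 2 calls for a disposable copy of the pilot infospace. A minimal sketch of creating one is shown below, assuming an infospace is a self-contained directory that can be copied as-is. `make_disposable_copy` is a hypothetical helper, not part of `infospace-bench`. How a live provider is selected is deliberately left out, because this document does not name the option for it.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: copy an infospace directory into a fresh temporary
# directory and print the path of the copy.
make_disposable_copy() {
  local source_dir="${1%/}" scratch  # drop a trailing slash, if any
  if [ ! -d "$source_dir" ]; then
    echo "not a directory: $source_dir" >&2
    return 1
  fi
  scratch="$(mktemp -d)"
  cp -R "$source_dir" "$scratch/"
  echo "$scratch/$(basename "$source_dir")"
}

# Usage sketch (commented out so the block has no side effects):
#   copy="$(make_disposable_copy infospaces/wealth-vsm-generation-pilot)"
#   infospace-bench check "$copy"
#   rm -rf "$(dirname "$copy")"   # discard the copy after review
```

Working in a temporary copy keeps live-generated artifacts out of the committed infospace until they have been reviewed in step 3.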
The full corpus should not be committed wholesale until it has a current scoped
use, deterministic acceptance coverage, and a migration report explaining what
was generated, reviewed, deferred, or retired.